[Twisted-Python] Surprises in twisted.web.woven
Tim Allen
screwtape at froup.com
Mon Aug 4 06:13:26 MDT 2003
On Monday, Aug 4, 2003, at 01:37 Australia/Sydney, Alex Levy wrote:
> On Sun, Aug 03, 2003 at 11:05:48PM +1000, Tim Allen wrote:
>> Finally, if I have a template that looks like this:
>>
>> <p>Have a look at
>> <a href="cat1.png">these</a>
>> <a href="cat2.png">three</a>
>> <a href="cat3.png">cats</a>.</p>
>>
>> then the output HTML looks like:
>>
>> <p>Have a look at <a href="cat1.png">these</a><a
>> href="cat2.png">three</a><a href="cat3.png">cats</a>.</p>
>
> This is another issue that I've wrangled with for some time. Feel free to
> make a lot of noise and hope somebody fixes it; I believe the problem lies
> in the XML parser itself. As Wayne so aptly put it, "We fear change."
As it turns out, I suspect the problem does not lie in the XML parser itself.
twisted.web.microdom.MicroDOMParser.shouldPreserveSpace() currently
looks like:
    def shouldPreserveSpace(self):
        for edx in xrange(len(self.elementstack)):
            el = self.elementstack[-edx]
            if el.tagName == 'pre' or el.getAttribute("xml:space", '') == 'preserve':
                return 1
        return 0
I dare say that a simple modification of this method would be enough,
without delving into the depths of the XML parser (which doesn't
discard pure-whitespace text elements at all).
Off the top of my head, I can think of three whitespace-handling modes:
* Preserve all whitespace (as used in the HTML <pre> tag and elements
  with xml:space='preserve').
* Collapse redundant whitespace (if not string.strip(): return ' '),
  which most closely matches how HTML user-agents handle whitespace.
* Strip whitespace (run .strip() over all text nodes; I believe this
  most closely matches how XML processors handle whitespace).
Is there any reason why white-space preserving should not always be on?
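
To make the three modes above concrete, here is a rough sketch of how each
would treat the text of a single text node. This is not microdom code: the
function name handle_whitespace and its mode argument are made up for
illustration, and the guard against empty strings in the collapse branch is
my own assumption.

```python
def handle_whitespace(text, mode):
    """Apply one whitespace-handling mode to the text of one text node.

    Hypothetical helper for illustration only; `mode` is one of
    "preserve", "collapse" or "strip".
    """
    if mode == "preserve":
        # Leave the text exactly as the parser delivered it.
        return text
    if mode == "collapse":
        # A text node made up only of whitespace becomes a single space;
        # anything else is left alone. (Skipping empty strings is an
        # assumption, not something the original one-liner does.)
        if text and not text.strip():
            return " "
        return text
    if mode == "strip":
        # Remove leading and trailing whitespace from every text node.
        return text.strip()
    raise ValueError("unknown whitespace mode: %r" % (mode,))
```

With the cat template quoted earlier, the newline between two <a> elements
is a whitespace-only text node. Under "collapse" it would come out as a
single space, so the links stay separated; under "strip" it would come out
empty, which gives the run-together output shown above.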