[Twisted-Python] Contributing?
James Y Knight
foom at fuhm.net
Thu Aug 26 10:35:48 MDT 2004
On Aug 22, 2004, at 3:28 PM, angryhicKclown at netscape.net wrote:
> I was looking over the page on twistedmatrix.com on contributing, and
> it referred me to here. Over at the mono project, they have a
> todo-list sort of thing, that idle hackers such as myself can work on.
> I was wondering what the best way (besides monetary...I am a poor
> student) to contribute to the Twisted project is?
Welllll, since you ask.. :)
Here's a relatively self-contained project that could use working on:
twisted.web.microdom and twisted.web.sux are supposed to implement an
XML/XHTML and HTML parser. They are pretty useless as an XML parser, given
their relative slowness and the existence of expat/Python XML libraries
which already do a very good job of being an XML parser. Microdom is
*almost* a useful HTML parser, but it's missing support for a lot of
HTML peculiarities that really need to be handled
("<tr><td>foo<tr><td>bar" for one, strange whitespace-collapsing rules
for another, and I'm sure there's more). Perl has a very good HTML
parser in HTML::TreeBuilder whose algorithms could be duplicated.
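
To make the "<tr><td>foo<tr><td>bar" case concrete, here is a rough
sketch of the kind of implied-end-tag handling I mean. It is not
microdom code and not a port of the Perl module; it sits on top of the
standard library's html.parser, and the rule table (IMPLIED_END), the
SCOPE_BOUNDARY set, and the Node and ImpliedEndTagBuilder names are all
made up for illustration. A real parser would need far more rules
(void elements, attributes, whitespace collapsing, and so on).

```python
from html.parser import HTMLParser

# Hypothetical, deliberately tiny rule table: opening the key tag
# implicitly closes any open element whose tag is in the value set.
IMPLIED_END = {
    "tr": {"tr", "td", "th"},
    "td": {"td", "th"},
    "th": {"td", "th"},
}

# An implied close never reaches past one of these open elements,
# so a nested table does not close cells of the outer table.
SCOPE_BOUNDARY = {"table"}


class Node:
    """Minimal tree node; attributes are dropped to keep the sketch short."""

    def __init__(self, tag):
        self.tag = tag
        self.children = []  # child Node objects or text strings

    def inner_html(self):
        return "".join(
            c if isinstance(c, str) else c.to_html() for c in self.children
        )

    def to_html(self):
        return "<%s>%s</%s>" % (self.tag, self.inner_html(), self.tag)


class ImpliedEndTagBuilder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.stack = [self.root]  # currently open elements, root first

    def handle_starttag(self, tag, attrs):
        closes = IMPLIED_END.get(tag, ())
        # Walk from the innermost open element outwards, remembering the
        # outermost one this start tag implicitly closes, and stop at
        # the nearest scope boundary.
        cut = None
        for i in range(len(self.stack) - 1, 0, -1):
            open_tag = self.stack[i].tag
            if open_tag in SCOPE_BOUNDARY:
                break
            if open_tag in closes:
                cut = i
        if cut is not None:
            del self.stack[cut:]
        node = Node(tag)
        self.stack[-1].children.append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        # Close the nearest matching open element; ignore stray end tags.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        self.stack[-1].children.append(data)


def parse(html):
    builder = ImpliedEndTagBuilder()
    builder.feed(html)
    builder.close()  # flush any buffered trailing text
    return builder.root


print(parse("<tr><td>foo<tr><td>bar").inner_html())
# prints: <tr><td>foo</td></tr><tr><td>bar</td></tr>
```

The point is only that the second <tr> ends up as a sibling of the
first instead of being nested inside the first <td>.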
This project isn't even very Twisted-specific (sux/microdom only have
very minor dependencies on the rest of Twisted) so it could conceivably
be made into a general-purpose Python module in its own right. There
are a variety of other Python HTML parsers, but from what I can tell,
they're even worse than microdom is. It'd be way cool to have a Python
HTML parser that actually works. Can't let Perl win! Any
victi...volunteers? ;0
James