[Twisted-Python] Unicode
Grant Baillie
grant at osafoundation.org
Mon Oct 3 19:01:15 MDT 2005
Well, I agree the message could be more brutal :).
What's the developer use case for "transparent exchange" of unicode
strings in a network framework? Every protocol and data format has
some different (sometimes goofy, and sometimes nonexistent) scheme
for encoding non-ASCII end-user strings. Since the internet only
understands bytes, it's almost certainly programmer error (omitting
to implement the protocol's encoding scheme) if you try to send a
unicode object over the wire.
I no more expect

    self.transport.write(u"Shoot me with a \u2022")

to work than

    self.transport.write(7)

inside my protocol code, for exactly the same reason in both cases.
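
To make that concrete, here is a minimal sketch in present-day Python 3 terms, where str plays the role of 2005's unicode type and bytes the role of str. StrictTransport and send_line are hypothetical names for illustration, not Twisted APIs, and UTF-8 with a CRLF terminator is an assumed protocol rule; a real protocol would dictate its own.

```python
class StrictTransport:
    """Hypothetical stand-in for a transport that only accepts bytes."""

    def __init__(self):
        self.sent = []

    def write(self, data):
        # Text and integers are rejected for the same reason:
        # neither one is a sequence of bytes.
        if not isinstance(data, bytes):
            raise TypeError("Data must be bytes, not %s" % type(data).__name__)
        self.sent.append(data)


def send_line(transport, text, encoding="utf-8"):
    """Encode text the way the (assumed) protocol specifies, then write it."""
    transport.write(text.encode(encoding) + b"\r\n")


transport = StrictTransport()
send_line(transport, "Shoot me with a \u2022")
```

The encoding decision lives in the protocol code, which is the only place that knows what the peer expects; the transport never has to guess.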
--Grant
Grant Baillie
Open Source Applications Foundation
http://www.osafoundation.org
PS: As an aside, I actually believe a "default encoding" (site-wide
or application-wide) scheme isn't so great either. It leads to
developers making assumptions about the global setting, and those
assumptions lead to different modules being incompatible.
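
A small sketch of how that incompatibility shows up, again in Python 3 terms. The two encodings stand in for two hypothetical modules that each assumed a different global default; the sample string is only an illustration.

```python
text = "caf\u00e9"

# Module A assumed the global default was UTF-8.
utf8_bytes = text.encode("utf-8")

# Module B assumed the global default was Latin-1.
latin1_bytes = text.encode("latin-1")

# Same text, different bytes on the wire.
bytes_differ = utf8_bytes != latin1_bytes

# Module A cannot read what module B wrote.
try:
    utf8_ok = latin1_bytes.decode("utf-8")
except UnicodeDecodeError:
    utf8_ok = None
```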
On Oct 3, 2005, at 17:19, Ken Kinder wrote:
> Perhaps like many developers, I came across this surprising bit of
> code inside a couple of Twisted's methods:
>
>     if isinstance(data, unicode): # no, really, I mean it
>         raise TypeError("Data must not be unicode")
>
> And of course, I simply removed those lines. But I'm sure if I submit
> that patch, a discussion similar to this one would develop, because
> it's unlikely that such code would have been accidentally included:
>
> http://twistedmatrix.com/pipermail/twisted-python/2005-April/010199.html
>
> The Python library will kindly cast unicode objects to strings when
> necessary, as is mentioned in the above thread. It *would* be fair to
> say that not implicitly deciding on an encoding type is "taking the
> high road" if the behavior of encoding weren't so uniformly explicit
> and consistent in Python and its standard library:
>
> http://www.python.org/peps/pep-0100.html
> http://docs.python.org/api/arg-parsing.html
> http://docs.python.org/api/stringObjects.html
>
> (There are more...)
>
> The purpose of Python's unicode type is transparent exchange of string
> objects, whether those string objects are of type str or type unicode.
> Pretending that isn't so and raising a TypeError is not helpful. I
> would urge you to AT LEAST provide a detailed explanation in that
> error, explaining the philosophical disagreement you have with
> Python's unicode-string conversion behavior, and to have a flag you
> can set to disable that check.