[Twisted-Python] Twisted protocol as Django data source

Thu May 28 20:38:01 MDT 2009

Thanks for all your replies.

Greg wrote:
> You might want to look into Orbited, there are a number of Django
> examples out there:
>
> http://www.orbited.org

This looks very cool.  http://preview.tinyurl.com/3suqth (CometDaily)  
makes it seem extremely easy to develop for. Gonna play with this when  
I get a chance.

Jean-Paul wrote:
> Another option is to have Django talk to a Twisted process via some RPC
> mechanism that won't require you to use Twisted in the Apache process.
> For example, XML-RPC.

I should have thought of this, as we're already using XML-RPC
extensively in the application. All I'd really need to do is write a
get_latest_data function and have Ajax poll Django, which in turn polls
the Twisted process over XML-RPC. I keep forgetting that the GIL isn't
anywhere near as much of a problem for this application now that it's
built on Twisted as it was back when it was done entirely in threads.
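
A rough sketch of that polling chain, for concreteness. To keep it
self-contained it uses the standard library's xmlrpc modules as a
stand-in for the Twisted side; in the real application the server half
would be a twisted.web.xmlrpc resource. The LATEST dict, the shape of
what get_latest_data returns, and the helper names start_server and
fetch_latest are all assumptions made up for illustration.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

# Hypothetical stand-in for the data the Twisted multiplexer holds in memory.
LATEST = {"sat-1": {"timestamp": 1243564681, "value": 42}}


def get_latest_data():
    """Return the newest sample per source (names and fields are made up)."""
    return LATEST


def start_server():
    """Serve get_latest_data on a free localhost port in a background thread."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(get_latest_data)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_latest(url):
    """What a Django view would call each time the Ajax poll hits it."""
    with ServerProxy(url) as proxy:
        return proxy.get_latest_data()


if __name__ == "__main__":
    server = start_server()
    host, port = server.server_address
    print(fetch_latest("http://%s:%d/" % (host, port)))
    server.shutdown()
    server.server_close()
```

The Django view would just return the result of fetch_latest as JSON,
so nothing Twisted-specific ever has to run inside the Apache process.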

And Esteve, there's a very good chance I'm going to start using  
Thrift. Thanks for pointing that out.

Alex wrote:
> I wrote up some thoughts on this here:
> http://clemesha.org/blog/2009/apr/23/Django-on-Twisted-using-latest-twisted-web-wsgi/
>
> basically it comes down to running Django off the very latest WSGI
> code (in the trunk still) found in twisted.web, which I've found to
> work very well.
>
> Also see here:
> http://blog.dreid.org/2009/03/twisted-django-it-wont-burn-down-your.html

WSGI is scary to me mostly because it's done in threads. Trying to  
find GIL-related bugs in the old code cost me a LOT of time. And while  
presumably this integration is done by people who are far better
coders than I, I can't shake the thread-safety worry.

Also, I almost never use someone else's svn head code if I can avoid  
it. This is partly because I don't consider myself a good enough  
debugger to file good reports against somebody's enormous source tree,  
but mostly because I want to be spending most of my time worrying
about my code rather than someone else's.

The other unfortunate constraint is that Apache in this instance is
non-optional, as the box we're using also houses mod_svn repositories.

Dave wrote:
> I don't think the database option is a hack. Have twisted write the live
> stream items into a ring-buffer SQL database table (eg. use records 1-100
> over and over again), including a timestamp for each entry. Then your django
> page can always retrieve the most current set of entries by SELECTing * from
> the table in descending order by timestamp, which django is good at, staying
> live with constant refreshing. There are no interprocess communication
> pipes, broken sockets and timeouts etc. to screw up, no flaky javascript to
> worry about, and the DBMS will handle the multiple simultaneous separate
> data sources correctly. The central twisted application only has to keep
> track of the buffer position. It's just not as much fun, though.

glyph at divmod wrote:
> Personally, I don't like using databases as a point of integration.
> Inevitably your Django app or your Twisted app will want to enforce
> constraints on the data and model things about the relationships between
> rows beyond what one can glean by inspecting the SQL schema.

Okay, so calling it a "hack" probably wasn't a good choice of words. I  
was actually leaning strongly towards using our Postgres installation  
for Dave's reasons. But the thought of using the database for  
integration bothers me for precisely glyph's reason, even though we  
already have the multiplexer sticking data into Postgres. That's  
pretty much the driving worry behind my initial post. I'm trying to
avoid Django's nice built-in database API like the plague because we
store a LOT of data. Our indices have to be optimized for INSERTs, so a
typical query on this data comes with a human-noticeable delay (I
haven't timed it, but on the order of a second). Doing that for 30-odd
satellites isn't going to happen fast enough to make the UI as smooth
as we would like.

If I followed Dave's suggestion to the letter (ring-buffer), it would  
work, but something in me strongly resists storing the same data in  
two places. Probably the fact that every time I've done that, I've had  
sync issues. I could write a trigger for the synchronization, but that  
would break sqlite support, which I'm trying to keep in since it was  
requested at a conference.
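
For reference, here is roughly what I understand the ring-buffer table
to look like, sketched against sqlite3 so it runs anywhere. The table
name live_buffer, its columns, and the function names are hypothetical,
and INSERT OR REPLACE is SQLite syntax, so the Postgres version would
need an UPDATE or something equivalent.

```python
import sqlite3

SLOTS = 100  # Dave's "records 1-100"


def make_buffer(conn):
    """Create the fixed-size buffer table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS live_buffer ("
        " slot INTEGER PRIMARY KEY,"
        " ts REAL NOT NULL,"
        " source TEXT NOT NULL,"
        " payload TEXT NOT NULL)"
    )


def write_item(conn, position, ts, source, payload):
    """Overwrite the slot for this position; return the next position."""
    slot = position % SLOTS + 1  # slots run 1..SLOTS, then wrap around
    conn.execute(
        "INSERT OR REPLACE INTO live_buffer (slot, ts, source, payload)"
        " VALUES (?, ?, ?, ?)",
        (slot, ts, source, payload),
    )
    conn.commit()
    return position + 1


def latest(conn, limit=10):
    """Newest entries first, which is the query the Django page would run."""
    return conn.execute(
        "SELECT ts, source, payload FROM live_buffer"
        " ORDER BY ts DESC LIMIT ?",
        (limit,),
    ).fetchall()
```

The writer only has to remember position between calls, which matches
Dave's point that the Twisted side just tracks the buffer position.
The sketch does nothing about keeping this table in sync with the main
data store, which is the part that worries me.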

Thank you all again for the advice, and I'm sorry if I misspelled  
anybody's name.

As I continue to work with this project, you'll doubtless hear from me  
again.

-Dan