[Twisted-Python] Re: Twisted and the Posh Module
Bob Ippolito
bob at redivi.com
Mon Mar 14 13:42:55 MST 2005
On Mar 14, 2005, at 3:24 PM, Ken Kinder wrote:
> Ed Suominen wrote:
>
>> I think having some sort of process pool management in Twisted is a
>> great idea, especially with multi-core CPUs emerging on the scene. I
>> have access to a dual-core Pentium Prescott CPU and it would be
>> great to have that available to keep both cores humming on a certain
>> CPU-intensive project I'm considering.
>>
>> However, I'm not sure the best way to go about it would be with the
>> posh module, pypar (see http://datamining.anu.edu.au/~ole/pypar/),
>> or just using Perspective Broker as an underlying message-passing
>> mechanism with UNIX sockets and/or TCP. One thought might be to have
>> a single master process start up and act as a PB server and process
>> pool manager. Subsidiary processes could then make authenticated PB
>> connections to the server to "volunteer" for work in the process
>> pool.
>> Note that pypar lets you easily find out how many CPUs you have
>> under kernel control, with pypar.size(). Thus, the main process
>> could start the process pool by spawning a subsidiary "volunteer"
>> process for each CPU core present.
>>
> My goal is somewhat similar in that I have multiple CPUs that aren't
> being used very much, but the other problem is that certain libraries
> I use (PIL) grab the GIL for long-running operations like image
> modifications. I can thread them off, but the library still holds
> the GIL for the duration of the operation, making the server
> unresponsive.
>
> I too had looked at using Perspective Broker to communicate with
> separate "worker" processes, and the only reason I'm not excited with
> that option is that the bandwidth between the master and worker
> processes involves a lot of large binary strings. Shared memory
> seemed more efficient than PB for transferring those strings.
>
> So for that reason, finding out how many CPUs I have isn't that
> important, because I'll still want more worker processes than I have
> CPUs.
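The per-core "volunteer" design Ed describes can be sketched with the standard library alone; note that `multiprocessing` and `os.cpu_count()` are stdlib stand-ins I'm substituting here for pypar's `size()` and a PB-managed pool, not part of either original proposal:

```python
import multiprocessing as mp
import os

def worker(task):
    # Hypothetical CPU-bound task standing in for real work
    # (e.g. an image operation): square a number.
    return task * task

def run_pool(tasks):
    # One "volunteer" process per CPU core, as the post suggests;
    # os.cpu_count() plays the role pypar.size() does in Ed's sketch.
    n = os.cpu_count() or 1
    with mp.Pool(processes=n) as pool:
        return pool.map(worker, tasks)

if __name__ == "__main__":
    print(run_pool([1, 2, 3, 4]))
```

Because each task runs in a separate process, a C extension holding the GIL in one worker doesn't block the others.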
Ideally we'd have a message-passing system that could use multiple
backends (e.g. shared memory, mmap, or sockets). Using sockets is
probably the better solution for now -- you're likely to do a lot of
copying anyway, because it's Python and PIL :)
With sockets you can scale right up to multiple computers; with shared
memory, you're stuck on a single box. The API that POSH exposes
(proxied non-blocking objects) can't scale well to multiple machines,
whereas a socket-based API could later be scaled down to use an
efficient shared-memory implementation.
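The socket-based message passing described above can be sketched with plain length-prefixed framing; this is a minimal stdlib illustration of the idea, not PB itself, and the echo server is a hypothetical stand-in for a real worker:

```python
import socket
import struct
import threading

def send_msg(sock, payload):
    # Prefix each binary message with its 4-byte big-endian length.
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def _recv_exact(sock, n):
    # recv() may return partial data; loop until n bytes arrive.
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed mid-message")
        buf += chunk
    return buf

def recv_msg(sock):
    (length,) = struct.unpack("!I", _recv_exact(sock, 4))
    return _recv_exact(sock, length)

def demo():
    # Worker stand-in: accept one connection and echo one frame back.
    server = socket.socket()
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    port = server.getsockname()[1]

    def echo():
        conn, _ = server.accept()
        send_msg(conn, recv_msg(conn))
        conn.close()

    t = threading.Thread(target=echo)
    t.start()

    client = socket.socket()
    client.connect(("127.0.0.1", port))
    send_msg(client, b"large binary blob")
    result = recv_msg(client)
    client.close()
    t.join()
    server.close()
    return result
```

The same framing works unchanged whether the peer is on the local box or another machine, which is the scaling property sockets buy you here.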
-bob