[Twisted-Python] When would you considering split a server application to some physical instance with different logic function?
Maarten ter Huurne
maarten at treewalker.org
Mon Dec 1 16:22:13 MST 2008
On Monday 01 December 2008, Peter Cai wrote:
> As far as I know, some networking device manufacturers use this model to
> implement their routers or switches,
> but I have never heard of any examples besides that.
The Postfix mail server uses an architecture of many small processes working
together:
http://www.postfix.org/OVERVIEW.html
One of the big motivations for this architecture is security: each process
runs with only the minimum privileges it needs to do its work. Just because
the SMTP listener has to bind to port 25 does not mean the entire mail
server has to run as root.
> As the Twisted book says, most networking applications use one of these 3
> modes:
>
> 1. handle each connection in a separate operating system process, in
> which case the operating system will take care
> of letting other processes run while one is waiting;
>
> 2. handle each connection in a separate thread, in which the threading
> framework takes care of letting other threads
> run while one is waiting; or
>
> 3. use non-blocking system calls to handle all connections in one thread.
> (like Twisted, libevent, or just select)
This describes how multiple connections can be handled (for example Apache
has several different approaches for this), which is a different issue from
spreading functionality over different processes (like Postfix does).
Where do you expect the bottleneck to be in your application? Does it have
to serve a large number of clients? Does it have to send or receive large
amounts of data? Does it have to do heavy computations? Does it have to get
data to or from external servers like a DB server or a web service?
> After considering for a while, I thought there are some faults in the
> multiple parts model:
>
> 1. Writing code to handle messages is much more tedious than just doing
> function calls
> 2. It's not very easy to make a pipeline work well.
It depends a lot on what you are trying to build and where and how you make
the splits between different parts of your application.
For example, a function call is simple when there is only one thread and the
called operation will not block. If it does block, you need to add a
callback mechanism like Twisted's Deferred. And if there are multiple
threads, you have to be very careful about which functions you are allowed
to call while your thread is holding one or more locks. So a function call
is not so simple anymore when it's part of a complex application...
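
A rough sketch of the "callback instead of return value" shape, in case it
helps. To keep it runnable without Twisted installed, it uses the standard
library's concurrent.futures.Future as a stand-in; this is not Twisted's
API (there, Deferred.addCallback plays the corresponding role and callbacks
run in the reactor thread). slow_lookup and lookup_async are made-up names.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_lookup(key):
    # Hypothetical blocking operation (think: a database query).
    time.sleep(0.05)
    return key.upper()

def lookup_async(executor, key, on_result):
    # Instead of returning the value, hand it to a callback once it is ready.
    future = executor.submit(slow_lookup, key)
    future.add_done_callback(lambda f: on_result(f.result()))
    return future

results = []
with ThreadPoolExecutor(max_workers=1) as executor:
    lookup_async(executor, "spam", results.append)
    # The caller is free to do other work here; nothing has blocked.
# Leaving the "with" block joins the worker, so the callback has run by now.
print(results)
```

Note that the call site changed shape: the caller passes in what should
happen with the result rather than receiving it, which is exactly the extra
complexity a plain function call does not have.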
There are libraries that make communication over a pipe more friendly, such
as Perspective Broker or Foolscap. This does not mean you can replace any
arbitrary function call by a remote method call, but it does mean you can
skip writing (de)serialization code again and again.
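
To illustrate what such a library saves you from, here is a minimal sketch
that hides the (de)serialization behind one helper. It is not how
Perspective Broker or Foolscap work internally; it just sends one JSON
message per line over a pipe to a child process. PipeProxy, WORKER_SRC and
the "add"/"upper" handlers are invented for the example.

```python
import json
import subprocess
import sys

# Source of the hypothetical worker process: it reads one JSON request per
# line on stdin and writes one JSON reply per line on stdout.
WORKER_SRC = r"""
import json, sys
HANDLERS = {"add": lambda a, b: a + b, "upper": lambda s: s.upper()}
for line in sys.stdin:
    request = json.loads(line)
    result = HANDLERS[request["method"]](*request["args"])
    sys.stdout.write(json.dumps({"result": result}) + "\n")
    sys.stdout.flush()
"""

class PipeProxy:
    """Keeps the JSON encoding and decoding in one place."""

    def __init__(self):
        self.proc = subprocess.Popen(
            [sys.executable, "-c", WORKER_SRC],
            stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
        )

    def call(self, method, *args):
        request = {"method": method, "args": args}
        self.proc.stdin.write(json.dumps(request) + "\n")
        self.proc.stdin.flush()
        # Blocks until the worker replies; a real server would not do this.
        return json.loads(self.proc.stdout.readline())["result"]

    def close(self):
        self.proc.stdin.close()  # EOF ends the worker's read loop
        self.proc.wait()
        self.proc.stdout.close()

proxy = PipeProxy()
total = proxy.call("add", 2, 3)
shout = proxy.call("upper", "postfix")
proxy.close()
print(total, shout)
```

The call sites look almost like method calls, but they still block on the
pipe and can only pass JSON-friendly values, which is why a remote call is
not a drop-in replacement for a local one.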
No matter which model you choose, dividing your application into
communicating blocks is a good idea: it is absolutely necessary when using
multiple processes, it avoids a lot of locking issues when using threads and
it makes your application easier to test in all three models.
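
As a small sketch of the testing point, assuming a block is simply a
function that reads messages from an inbox and writes them to an outbox
(parse_block, STOP and the message format are made up here): the same code
can be driven directly from a test or run in its own thread, and it shares
no state that would need a lock.

```python
import queue
import threading

STOP = object()  # sentinel telling a block to shut down

def parse_block(inbox, outbox):
    # One block: turns raw lines into (command, argument) messages.
    while (line := inbox.get()) is not STOP:
        command, _, argument = line.strip().partition(" ")
        outbox.put((command.upper(), argument))
    outbox.put(STOP)

# In a test, the block is driven directly through its queues...
raw, parsed = queue.Queue(), queue.Queue()
for item in ("helo example.org\n", "quit\n", STOP):
    raw.put(item)
parse_block(raw, parsed)
direct = [parsed.get() for _ in range(2)]

# ...while in a threaded setup the same function runs in its own thread.
raw2, parsed2 = queue.Queue(), queue.Queue()
worker = threading.Thread(target=parse_block, args=(raw2, parsed2))
worker.start()
raw2.put("helo example.org\n")
raw2.put(STOP)
threaded = parsed2.get()
worker.join()
print(direct, threaded)
```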
Bye,
Maarten