[Twisted-web] Cache-friendly Nevow pages
Mary Gardiner
mary-twisted at puzzling.org
Sat Aug 7 01:02:59 MDT 2004
For some reason (it might be spending most of my time sitting at the end
of a long thin piece of string connecting me to the rest of the
universe, but it might also have been watching people pound on my RSS
feeds every five minutes), I've been trying to write cache-friendly
Nevow resources.
This involves setting two HTTP headers, "Last-Modified" and "ETag". At
the moment I'm setting both of these headers using my data source (files
in the file system). However, this has left me with a bit of a quandary
about my docFactory templates.
When my templates change, so should my Last-Modified and ETag headers.
Otherwise clients using caches will see my old templates more or less
indefinitely, at least on pages I don't subsequently change, because
their conditional GET requests, complete with correct If-Modified-Since
and If-None-Match headers, will keep matching the stale values and so
tell the server never to send a fresh copy of the data.
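
To make the failure mode concrete, here is a minimal sketch of the check
a server performs on a conditional GET. It is not Nevow code: the
function name, the lower-cased header dict and the argument shapes are
assumptions made for illustration. If the ETag and Last-Modified values
never change when a template changes, this check keeps answering "not
modified".

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def _opaque(tag):
    # If-None-Match uses weak comparison, so ignore a leading "W/".
    return tag[2:] if tag.startswith("W/") else tag


def is_not_modified(request_headers, etag, last_modified):
    """Return True if a 304 Not Modified response would be appropriate.

    request_headers: dict mapping lower-cased header names to values
                     (a hypothetical shape, not a Nevow API).
    etag:            current entity tag including quotes, e.g. '"abc"'.
    last_modified:   timezone-aware datetime of the current page.
    """
    if_none_match = request_headers.get("if-none-match")
    if if_none_match is not None:
        # When present, If-None-Match takes precedence over
        # If-Modified-Since.
        candidates = [_opaque(t.strip()) for t in if_none_match.split(",")]
        return "*" in candidates or _opaque(etag) in candidates

    if_modified_since = request_headers.get("if-modified-since")
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            return False  # unparseable date: send the full response
        if since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        # HTTP dates have one-second resolution.
        return last_modified.replace(microsecond=0) <= since

    return False
```

A client that cached the page under '"v1"' gets True back for as long as
the server still computes '"v1"', whatever the template now looks like.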
So, I'm faced with the problem of dating my templates or otherwise
detecting when they change and I can't think of a good way.
Some thoughts:
1. use file timestamps on the template files
Pros: Fits OK with the way I deal with the rest of the website data
Cons: Reduces flexibility. I can't think of a good way to do this
with Stan templates. I also can't think of a good way to do this
without restarting the server when my templates change. (I do
currently do this, but would prefer not to.)
2. generate the ETag header based on a hash of page contents
Pros: As best I can tell, this is how the ETag header is really meant
to be generated: ideally it signals octet equality and should change
if, for example, Nevow for some reason starts pretty-printing output.
Cons: rend.Page.renderHTTP seems to make this really hard --
even if you set the buffered flag = True, rend.Page.afterRender
doesn't seem to have any way to access the result of the render.
(Correct me if I'm wrong.) Also, this doesn't help with the
Last-Modified date, which means I'm not helping HTTP/1.0 caches very
much, unless I store the date the hash changed somewhere.
3. store the templates in some kind of object store and date-stamp
them there.
Pros: This might well let me change templates without restarting the
server.
Cons: It imposes a maintenance burden whereby I have to update the
object database with new templates. I like to have a copy of my
website and templates on two different servers, and as best I can
tell, no object database is going to like being copied to a remote
server without me killing all associated processes on the remote
server first, so there's a deployment problem.
4. hash the template so that a changed template means a changed hash
Pros: This is probably nearly as good as hashing the page content,
accuracy-wise.
Cons: I don't have any idea how to hash a DocFactory object
effectively. Hashing the DocFactory still leaves me vulnerable to
changes in Nevow's rendering. Hashing the DocFactory won't tell me to
update Last-Modified unless I store the date that the DocFactory
changed somewhere.
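
For option 1, a minimal sketch of the timestamp approach, assuming the
templates are loaded from files on disk (so it does not cover Stan
templates written in Python). The helper names are hypothetical. Calling
os.stat on each request, rather than once at startup, is what would
avoid the server restart, at the cost of a stat call per file.

```python
import os
from email.utils import formatdate


def last_modified_for(paths):
    """Newest modification time (seconds since the epoch) among the
    data files and template files that make up one page."""
    return max(os.stat(path).st_mtime for path in paths)


def http_date(timestamp):
    """Format a timestamp the way Last-Modified expects it,
    e.g. 'Thu, 01 Jan 1970 00:00:00 GMT'."""
    return formatdate(timestamp, usegmt=True)
```

The page's Last-Modified would then be something like
http_date(last_modified_for([data_path, template_path])), where both
paths are placeholders for wherever the data and template live.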
Anyone got any thoughts or has anyone solved this problem before? Help
with implementing 2 (how do I get the page contents in order to hash
them) or 4 (how can I hash a DocFactory object) also appreciated.
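
One possible shape for the "store the date the hash changed somewhere"
part of 2 and 4, as a sketch only. It assumes the bytes to hash (the
rendered page, or the template source plus the data) are already in
hand, which is exactly the open question above. ValidatorStore and its
method are hypothetical names, and the in-memory dict would forget its
dates on restart unless persisted.

```python
import hashlib
import time


class ValidatorStore:
    """Remember, per page key, the current hash and when it first appeared."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._seen = {}  # key -> (digest, time the digest was first seen)

    def validators(self, key, *parts):
        """Return (etag, last_modified_timestamp) for the given byte strings."""
        h = hashlib.sha1()
        for part in parts:
            # Length-prefix each part so (b"ab", b"c") and (b"a", b"bc")
            # produce different digests.
            h.update(b"%d:" % len(part))
            h.update(part)
        digest = h.hexdigest()

        previous = self._seen.get(key)
        if previous is None or previous[0] != digest:
            # New or changed content: stamp it with the current time.
            previous = (digest, self._clock())
            self._seen[key] = previous
        return '"%s"' % digest, previous[1]
```

Passing the template source and the data as separate parts means a
change to either one yields a new ETag and a new Last-Modified time.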
-Mary