Hi Dieter, *,
Dieter Maurer wrote:
> The "FieldStorage" must be rebuilt, because it can contain
> file objects. These file objects must be reset as they may
> have been (partially) read in the previous request.
> This prevent reusing the previous "FieldStorage".
As you may have seen in my first patch proposal I actually tried to
address this problem, although halfbaken and buggy. So here's the
next trial. I hope I've got it straight now.
Why didn't I do something to reset the files, FiledStorage creates?
Because I thought it wouldn't be needed. If you look at
cgi.FieldStorage you will see, that accessing FieldStorage's 'value'
field, will always reset the file before reading it and HTTPRequest's
'BODY' field does the same.
Cause of your hint I looked again and realised that I was wrong.
HTTPRquest's 'BODYFILE' field fails to reset the file before accessing
it (as in OFS/Image.py the File class does). That could be easily
fixed. What's worse is that FiledStorage can possibly create
subinstances of itsellf which can possibly contain other fileobjects.
This is the case for multipart-type requests.
How to solve this?
I can see three alternative aproaches:
1) In HTTPRequest.retry loop through self._fs.list (if it exists) and
call seek(0) on every fileobject, you can find there. This can
only work if the maximum depth of the FiledStorage tree is
predictable, which in turn depends on the possible input. I don't
know enough about the possible HTTP requests to determine such a
limit or to even 'prove' its existance.
2) In HTTPRequest.retry recursively walk through the FieldStorage
trees in self._fs.list (if it exists). I wouldn't do that in
Python without being able to predict the (worst case) number of
function calls needed to do this recursion. And without refreshing
my knowledge of compiler techniques I wouldn't be able to roll the
recursion out to a loop.
3) This is a somewhat drastic (and possibly conceived as hackish?)
approach which OTOH is independent of the input and should perform
quite well. I overwrite cgi.FieldStorage, esp. its make_file
method, such that it adds all fileobjects, that it creates, to a
(possibly caller provided) list. Thus the caller
(i.e. HTTPRequest) has a list of all those fileobjects that need to
be reset if the FieldStorage should be reused (as in ZPublisher's
retries).
For the patch attached I chose the third approach. Please let me
know, what you think about it. If it is assessed to be ok I will
update the collctor item, I opened yesterday.
cheers,
andreas
--- lib.orig/python/ZPublisher/HTTPRequest.py 2004-08-18 16:37:18.000000000 +0200
+++ lib/python/ZPublisher/HTTPRequest.py 2004-08-19 17:42:31.000000000 +0200
@@ -60,6 +60,37 @@
class NestedLoopExit( Exception ):
pass
+FILELIST_ENV_ENTRY = 'request.filestorage.filelist'
+
+class ZFieldStorage(cgi.FieldStorage):
+ """\
+ Tame the FieldStorage.
+
+ This class overrides the stock implementation to add all file
+ objects, generated while parsing a request, to a list. This list
+ is given by the caller.
+
+ Unfortunately cgi.FieldStorage is not generic enough, to adapt the
+ list of constructor parameters, by just deriving from it. Thus
+ the list is specified within the environ dict, given to the
+ constructor. This is ugly and could be fixed by a simple patch
+ against cgi.FieldStorage.
+ """
+
+ def __init__(self, fp=None, headers=None, outerboundary="",
+ environ=os.environ, keep_blank_values=0, strict_parsing=0):
+ self._filelist = environ.get(FILELIST_ENV_ENTRY, [])
+ cgi.FieldStorage.__init__(self, fp, headers, outerboundary,
+ environ, keep_blank_values,
+ strict_parsing)
+
+ def make_file(self, binary=None):
+ import tempfile
+ res = tempfile.TemporaryFile("w+b")
+ self._filelist.append(res)
+ return res
+
+
class HTTPRequest(BaseRequest):
"""\
Model HTTP request data.
@@ -125,10 +156,15 @@
def retry(self):
self.retry_count=self.retry_count+1
- self.stdin.seek(0)
+ if not self._fs:
+ raise ValueError, "Retrying on partially initialized request is not supported. Call processInputs()."
+ for f in self._fs_filelist:
+ f.seek(0)
r=self.__class__(stdin=self.stdin,
environ=self._orig_env,
- response=self.response.retry()
+ response=self.response.retry(),
+ fs = self._fs,
+ flist = self._fs_filelist
)
r.retry_count=self.retry_count
return r
@@ -138,6 +174,8 @@
# removing tempfiles.
self.stdin = None
self._file = None
+ self._fs = None
+ self._fs_filelist = None
self.form.clear()
# we want to clear the lazy dict here because BaseRequests don't have
# one. Without this, there's the possibility of memory leaking
@@ -237,7 +275,7 @@
"""
return self._client_addr
- def __init__(self, stdin, environ, response, clean=0):
+ def __init__(self, stdin, environ, response, clean=0, fs=None, flist=None):
self._orig_env=environ
# Avoid the overhead of scrubbing the environment in the
# case of request cloning for traversal purposes. If the
@@ -261,7 +299,11 @@
self.steps=[]
self._steps=[]
self._lazies={}
-
+ self._fs = fs
+ if flist is None:
+ self._fs_filelist = []
+ else:
+ self._fs_filelist = flist
if environ.has_key('REMOTE_ADDR'):
self._client_addr = environ['REMOTE_ADDR']
@@ -382,23 +424,26 @@
taintedform=self.taintedform
meth=None
- fs=FieldStorage(fp=fp,environ=environ,keep_blank_values=1)
- if not hasattr(fs,'list') or fs.list is None:
+ if not self._fs:
+ fs_env = environ
+ fs_env[FILELIST_ENV_ENTRY] = self._fs_filelist
+ self._fs=ZFieldStorage(fp=fp,environ=fs_env,keep_blank_values=1)
+ if not hasattr(self._fs,'list') or self._fs.list is None:
# Hm, maybe it's an XML-RPC
- if (fs.headers.has_key('content-type') and
- fs.headers['content-type'] == 'text/xml' and
+ if (self._fs.headers.has_key('content-type') and
+ self._fs.headers['content-type'] == 'text/xml' and
method == 'POST'):
# Ye haaa, XML-RPC!
global xmlrpc
if xmlrpc is None: import xmlrpc
- meth, self.args = xmlrpc.parse_input(fs.value)
+ meth, self.args = xmlrpc.parse_input(self._fs.value)
response=xmlrpc.response(response)
other['RESPONSE']=self.response=response
self.maybe_webdav_client = 0
else:
- self._file=fs.file
+ self._file=self._fs.file
else:
- fslist=fs.list
+ fslist=self._fs.list
tuple_items={}
lt=type([])
CGI_name=isCGI_NAME
_______________________________________________
Zope-Dev maillist - [EMAIL PROTECTED]
http://mail.zope.org/mailman/listinfo/zope-dev
** No cross posts or HTML encoding! **
(Related lists -
http://mail.zope.org/mailman/listinfo/zope-announce
http://mail.zope.org/mailman/listinfo/zope )