Jim Gallacher
jpg at jgassociates.ca
Wed Jun 14 09:01:58 EDT 2006
sandip more wrote:
>
> ----- Original Message -----
> From: "Mike Looijmans" <nlv11281 at natlab.research.philips.com>
> To: "sandip more" <sandipm at talentica.com>
> Cc: <mod_python at modpython.org>
> Sent: Tuesday, June 13, 2006 7:27 PM
> Subject: Re: [mod_python] Re: Mike's psp upload
>
>> The default behaviour of FieldStorage is to place the uploaded file in
>> some temp location. By providing the callback, you can prevent the
>> file being stored twice (once on temp, once on the final location).
>> This allows uploading files of many gigabytes without consuming
>> diskspace or memory.
>>
>> Note that by doing:
>>     filedata = afile.file.read()
>> you read the entire file into system memory. If a user sends you a 1GB
>> file, your server is likely to "die" there. Use one of the
>> shutil.copyfile functions to copy the file to where you want, without
>> (potentially) consuming megabytes of memory.
>
> Thanks Mike for explaination, I got the point. but i didn't got the
> callback function's implementation..
> can you give me some link to it?

The following examples will be included in version 3.3.

Simple file control using the class constructor

This example uses the FieldStorage class constructor to create the file
object, allowing simple control. It is not advisable to add class
variables to this if serving multiple sites from Apache. In that case
use the factory method instead.

import os

class Storage(file):

    def __init__(self, advisory_filename):
        self.advisory_filename = advisory_filename
        self.delete_on_close = True
        self.already_deleted = False
        self.real_filename = '/someTempDir/thingy-unique-thingy'
        super(Storage, self).__init__(self.real_filename, 'w+b')

    def close(self):
        if self.already_deleted:
            return
        super(Storage, self).close()
        if self.delete_on_close:
            self.already_deleted = True
            os.remove(self.real_filename)

request_data = util.FieldStorage(request, keep_blank_values=True,
                                 file_callback=Storage)

Advanced file control using an object factory

Using an object factory can provide greater control over the
constructor parameters.

import os

class Storage(file):

    def __init__(self, directory, advisory_filename):
        self.advisory_filename = advisory_filename
        self.delete_on_close = True
        self.already_deleted = False
        self.real_filename = directory + '/thingy-unique-thingy'
        super(Storage, self).__init__(self.real_filename, 'w+b')

    def close(self):
        if self.already_deleted:
            return
        super(Storage, self).close()
        if self.delete_on_close:
            self.already_deleted = True
            os.remove(self.real_filename)

class StorageFactory:

    def __init__(self, directory):
        self.dir = directory

    def create(self, advisory_filename):
        return Storage(self.dir, advisory_filename)

file_factory = StorageFactory(someDirectory)

[...sometime later...]

request_data = util.FieldStorage(request, keep_blank_values=True,
                                 file_callback=file_factory.create)

> and for memory thing, I am not clear about role of apache in this.
> I think Apache should handle basic http request protocol.
> it should store data in some file on disc rather than in-memory and
> then should give mod_python handler a pointer to that file location.?

Except that is not the way it works. Apache accepts the connection and
dispatches the request to the appropriate handlers for each phase of the
request. It's up to the handlers to decide what to do in each phase.
There may be default modules that will handle a particular phase, but
you can also register other handlers for a given phase. That is what the
Python*Handler directives do.
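For instance, a content-phase handler is just a function in a Python
module named by the directive. A minimal sketch, assuming a hypothetical
module myhandler.py mapped with "SetHandler mod_python" and
"PythonHandler myhandler" in the Apache config:

from mod_python import apache, util

def handler(req):
    # mod_python calls this for the content-generation phase. It is up
    # to this code to consume (or ignore) the request body.
    form = util.FieldStorage(req, keep_blank_values=True)
    req.content_type = 'text/plain'
    req.write('received %d form fields\n' % len(form.list))
    return apache.OK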
Apache gives your handler a pointer to the incoming file stream and it
then becomes the responsibility of the handler to deal with it. If no
handlers consume the stream it gets discarded (I assume). This is the
nature of handlers.

> because anyway in the case of multiple requests, apache might fall short
> of memory
> and might crash?

Pretty much. This is why the handler should not hold a copy of a large
file in memory.
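If you let FieldStorage spool the upload to its temp location rather
than using a file_callback, something along the lines of Mike's shutil
suggestion moves the data without ever calling read() on the whole
thing. A rough sketch, where the field name 'userfile' and the
destination path are purely illustrative:

import shutil

def save_upload(form, dest_path):
    # 'userfile' is whatever your HTML form names the file field.
    item = form['userfile']
    dest = open(dest_path, 'wb')
    try:
        # copyfileobj copies in fixed-size chunks, so the upload is
        # never held in memory all at once.
        shutil.copyfileobj(item.file, dest)
    finally:
        dest.close()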
Jim