Mike Looijmans
nlv11281 at natlab.research.philips.com
Tue Oct 25 05:00:13 EDT 2005
I was looking at Barry's fix in util.py, but I had already done some work in the same direction, in order to upload huge files (to a TAPE streamer, go figure...).

http://www.modpython.org/pipermail/mod_python/2005-March/017773.html

My idea is that the "read_to_boundary" function is unnecessarily complex. The following code does basically the same thing, but it skips a lot of memcpy calls (based on the 3.1.4 code, but it should work for 3.2.x too):

    def read_to_boundary(self, req, boundary, file):
        # Line ending held back from the previous line; it is written
        # only once we know the next line is not the boundary.
        delim = ""
        line = req.readline(10240)
        while line and not line.startswith(boundary):
            odelim = delim
            if line[-2:] == "\r\n":
                delim = "\r\n"
                line = line[:-2]
            elif line[-1:] == "\n":
                delim = "\n"
                line = line[:-1]
            else:
                # Partial line (readline hit the size limit): nothing held back.
                delim = ""
            file.write(odelim + line)
            line = req.readline(10240)

Consider:

- If the last char of a chunk is a #13 (\r), then it is just sent to the file. The next readline will return the \n by itself. Since most callback handlers will just write to a disk file, they don't care about line ends. They must be prepared to receive partial lines anyway.

- line.startswith(boundary): now you may argue that it is only a boundary if it appears on a line by itself. Well, I say, the odds that your file contains a boundary string followed by a newline are not _significantly_ smaller than the odds without that one character.

I tested this implementation with various binary (100MB), DOS and UNIX text files, without problems. The uploaded files were bitwise equal.

I also implemented the callback by allowing subclasses to override make_file. That should go into another thread, I guess.

-- 
Mike Looijmans
Philips Natlab / Topic Automation
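
The held-back line ending in read_to_boundary can be tried outside mod_python with in-memory streams. What follows is only a minimal sketch, assuming Python 3 bytes and io.BytesIO standing in for the request object; the roundtrip helper, the chunk parameter and the boundary b"--XyZ" are made up for illustration and are not part of mod_python.

```python
import io


def read_to_boundary(stream, boundary, out, chunk=10240):
    """Copy data from stream to out until a line starts with boundary."""
    delim = b""  # line ending held back from the previous line
    line = stream.readline(chunk)
    while line and not line.startswith(boundary):
        odelim = delim
        if line[-2:] == b"\r\n":
            delim = b"\r\n"
            line = line[:-2]
        elif line[-1:] == b"\n":
            delim = b"\n"
            line = line[:-1]
        else:
            delim = b""  # partial line, nothing held back
        out.write(odelim + line)
        line = stream.readline(chunk)


def roundtrip(payload, boundary=b"--XyZ", chunk=10240):
    """Hypothetical helper: wrap payload like one multipart part, read it back."""
    body = payload + b"\r\n" + boundary + b"--\r\n"
    out = io.BytesIO()
    read_to_boundary(io.BytesIO(body), boundary, out, chunk)
    return out.getvalue()


if __name__ == "__main__":
    for sample in (b"a\r\nb\nc\r", bytes(range(256)) * 10):
        print(roundtrip(sample) == sample)
```

Two limits of this sketch: chunk has to be larger than the boundary, or the boundary line is read in pieces and never matches; and if a chunk happens to end exactly on the \r of the \r\n that precedes the boundary, that \r is written to the output.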