Alexis Marrero
amarrero at mitre.org
Sat Nov 5 11:14:45 EST 2005
All,

I don't think this is the right mailing list for this, but here it goes.
(Let me know if there is a developers list.)

The current mod_python 3.1 implementation of
mod_python.util.FieldStorage.read_to_boundary reads as follows:

    def read_to_boundary(self, req, boundary, file):
        delim = ""
        line = req.readline()
        sline = line.strip()
        last_bound = boundary + "--"
        while line and sline != boundary and sline != last_bound:
            odelim = delim
            if line[-2:] == "\r\n":
                delim = "\r\n"
                line = line[:-2]
            elif line[-1:] == "\n":
                delim = "\n"
                line = line[:-1]
            file.write(odelim + line)
            line = req.readline()
            sline = line.strip()

As we have discussed previously:

    http://www.modpython.org/pipermail/mod_python/2005-March/017754.html
    http://www.modpython.org/pipermail/mod_python/2005-March/017756.html
    http://www.modpython.org/pipermail/mod_python/2005-November/019460.html

this triggered a couple of changes in mod_python 3.2 Beta, which now reads
as follows:

    # Fixes memory error when upload large files such as 700+MB ISOs.
    readBlockSize = 65368
    ...
    def read_to_boundary(self, req, boundary, file):
        ...
        delim = ''
        lastCharCarried = False
        last_bound = boundary + '--'
        roughBoundaryLength = len(last_bound) + 128
        line = req.readline(readBlockSize)
        lineLength = len(line)
        if lineLength < roughBoundaryLength:
            sline = line.strip()
        else:
            sline = ''
        while lineLength > 0 and sline != boundary and sline != last_bound:
            if not lastCharCarried:
                file.write(delim)
                delim = ''
            else:
                lastCharCarried = False
            cutLength = 0
            if lineLength == readBlockSize:
                if line[-1:] == '\r':
                    delim = '\r'
                    cutLength = -1
                    lastCharCarried = True
            if line[-2:] == '\r\n':
                delim += '\r\n'
                cutLength = -2
            elif line[-1:] == '\n':
                delim += '\n'
                cutLength = -1
            if cutLength != 0:
                file.write(line[:cutLength])
            else:
                file.write(line)
            line = req.readline(readBlockSize)
            lineLength = len(line)
            if lineLength < roughBoundaryLength:
                sline = line.strip()
            else:
                sline = ''

This function has a mysterious bug in it. For some files which I can
disclose (one of them being the PDF of the Apple Pages User Manual in
Italian), the uploaded file on the server ends up with the same length but
a different sha512 (the only digest that I'm using). The problem is a '\r'
in the middle of a chunk of data that is much larger than readBlockSize.

Anyhow, I wrote a new function, which I believe is much simpler, and tested
it with thousands and thousands of different files; so far it seems to work
fine. It reads as follows:

    def read_to_boundary(self, req, boundary, file):
        ''' read from the request object line by line with a maximum size,
            until the new line starts with boundary
        '''
        # Line ending held back from the previous read; it is written only
        # once we know the next line is not the boundary.
        previous_delimiter = ''
        while 1:
            line = req.readline(1<<16)
            if line.startswith(boundary):
                break
            if line.endswith('\r\n'):
                file.write(previous_delimiter + line[:-2])
                previous_delimiter = '\r\n'
            elif line.endswith('\r') or line.endswith('\n'):
                file.write(previous_delimiter + line[:-1])
                previous_delimiter = line[-1:]
            else:
                # Block filled up without reaching a line ending.
                file.write(previous_delimiter + line)
                previous_delimiter = ''

Mod_python developers, let me know if you have any comments on it, and if
you test it and it fails, please also let me know.

/amn
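For anyone who wants to reproduce the "same length, different sha512" symptom without a large upload, here is a minimal standalone sketch. It is not mod_python code. It assumes Python 3 with bytes and uses io.BytesIO as a stand-in for the request object. The names read_32beta, read_proposed, upload and digest, the boundary b'--XyZ' and the tiny block size of 8 are all made up for illustration. read_32beta is a transcription of the 3.2 beta loop quoted in this message, with the indentation as I read it. read_proposed is the new function, plus an end-of-input guard that the posted version does not have.

```python
import hashlib
import io


def read_32beta(req, boundary, out, block=1 << 16):
    """Transcription of the quoted 3.2 beta loop (indentation assumed)."""
    delim = b''
    carried = False
    last_bound = boundary + b'--'
    rough = len(last_bound) + 128
    line = req.readline(block)
    sline = line.strip() if len(line) < rough else b''
    while line and sline != boundary and sline != last_bound:
        if not carried:
            out.write(delim)
            delim = b''
        else:
            carried = False
        cut = 0
        # A full block ending in '\r': hold the '\r' back for later.
        if len(line) == block and line[-1:] == b'\r':
            delim = b'\r'
            cut = -1
            carried = True
        if line[-2:] == b'\r\n':
            delim += b'\r\n'
            cut = -2
        elif line[-1:] == b'\n':
            delim += b'\n'
            cut = -1
        out.write(line[:cut] if cut else line)
        line = req.readline(block)
        sline = line.strip() if len(line) < rough else b''


def read_proposed(req, boundary, out, block=1 << 16):
    """The proposed loop, ported to bytes."""
    previous_delimiter = b''
    while True:
        line = req.readline(block)
        # `not line` is an added end-of-input guard (an assumption of this
        # sketch); the posted function only stops at the boundary.
        if not line or line.startswith(boundary):
            break
        if line.endswith(b'\r\n'):
            out.write(previous_delimiter + line[:-2])
            previous_delimiter = b'\r\n'
        elif line.endswith(b'\r') or line.endswith(b'\n'):
            out.write(previous_delimiter + line[:-1])
            previous_delimiter = line[-1:]
        else:
            out.write(previous_delimiter + line)
            previous_delimiter = b''


def upload(reader, payload, boundary=b'--XyZ', block=8):
    """Wrap payload in a fake multipart tail and run it through reader."""
    body = payload + b'\r\n' + boundary + b'--\r\n'
    out = io.BytesIO()
    reader(io.BytesIO(body), boundary, out, block)
    return out.getvalue()


def digest(data):
    return hashlib.sha512(data).hexdigest()


# The '\r' sits at the end of the first 8-byte block, followed by a long
# run with no line ending.
payload = b'ABCDEFG\rHIJKLMNOPQ'
old = upload(read_32beta, payload)
new = upload(read_proposed, payload)
print(len(old) == len(payload), digest(old) == digest(payload))
print(digest(new) == digest(payload))
```

With this payload the first line printed is "True False" and the second is "True". The transcribed 3.2 beta loop holds the '\r' back and writes it after the following block instead of before it, so the length is preserved but the bytes are reordered. The same harness can be pointed at other edge cases, for example a block that ends exactly on the '\r' of the final '\r\n' before the boundary.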