[mod_python] Large File Upload Issues

Kurt Nordstrom knordstrom at library.unt.edu
Mon Nov 19 16:35:39 EST 2007


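Well, it's been a while, but my boss has nudged me into posting the fix
we came up with for this issue, in the interests of closure and in case
somebody else is trying something similar.  The change from the version
quoted below is small: the loop calls conn.read() with no length
argument instead of conn.read(1), and the counter advances by len(c)
rather than by 1.  The working code looks like: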

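from mod_python import apache

import os

def handler(request):
    #request.content_type = "text/plain"
    content_length = int(request.headers_in["Content-Length"])
    outPath = "/home/vphill/test/upload_out.dat"
    # Binary mode, so the upload is written byte-for-byte.
    outFile = open(outPath, "wb")
    conn = request.connection
    i = 0       # bytes received so far
    buf = ""    # data read but not yet written to disk
    while i < content_length:
        # No length argument here; the earlier version called conn.read(1).
        c = conn.read()
        if c == "":
            # Nothing more to read: the sender stopped early.
            break
        buf = buf + c
        # Write out once about 100 KB has accumulated.
        if len(buf) >= 102400:
            outFile.write(buf)
            outFile.flush()
            buf = ""
        i += len(c)
    # Write whatever is left over after the loop.
    if len(buf):
        outFile.write(buf)
        outFile.flush()
    outFile.close()
    # Report the size actually on disk back to the client.
    talkBack = "Received %s bytes\n" % os.stat(outPath).st_size
    request.set_content_length(len(talkBack))
    request.write(talkBack)
    return apache.OK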

We've had success using it (with curl as the sender) to receive files exceeding 2 GB in size.
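
For anyone who wants to try the read loop without Apache in the way, here is a minimal sketch of the same idea in plain Python. It assumes any file-like object can stand in for request.connection. The names copy_stream and chunk_size are made up for illustration and are not part of mod_python. Unlike the handler above, it writes each chunk straight to the output instead of collecting it in a string buffer first.

```python
import io


def copy_stream(stream, out_file, content_length, chunk_size=102400):
    """Copy up to content_length bytes from stream to out_file.

    Hypothetical helper: 'stream' is any object with read(n), standing in
    for the connection object. Returns the number of bytes copied.
    """
    received = 0
    while received < content_length:
        # Never ask for more than what the sender promised.
        wanted = min(chunk_size, content_length - received)
        chunk = stream.read(wanted)
        if not chunk:
            # Sender stopped early; give up with what we have.
            break
        out_file.write(chunk)
        received += len(chunk)
    return received


if __name__ == "__main__":
    payload = b"x" * 250000          # pretend upload body
    source = io.BytesIO(payload)     # stand-in for the connection
    sink = io.BytesIO()              # stand-in for the output file
    copied = copy_stream(source, sink, len(payload))
    print("Received %d bytes" % copied)
```

Comparing the return value against Content-Length is a cheap way to notice a sender that disconnected partway through.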

So it just goes to show, if the programmer can't fix it, let the librarian look at it.

Hope this helps someone.



Graham Dumpleton wrote:
> Reading one byte at a time is exceedingly inefficient. Rather than
> reinvent the wheel, perhaps look at Tramline.
>
>   http://www.infrae.com/products/tramline
>
> Personally I don't believe that Python is a good way of doing this and
> for best performance it should be done as an Apache module in C code.
> At least though the Tramline approach is more efficient than your
> current approach.
>
> Graham
>
> On 19/10/2007, Kurt Nordstrom <knordstrom at library.unt.edu> wrote:
>   
>> We're working on a webapp that will need to receive, from its clients,
>> fairly large files (it's an archival storage system).  The logical way
>> to do this seems to be to use http PUT, with appropriate code on the
>> server side to store the file to the proper place.
>>
>> Borrowing and modifying some code posted back in '05 to this list by a
>> Jeremy Jones, I have been playing with this:
>>
>> from mod_python import apache
>>
>> import os
>>
>> def handler(request):
>>     #request.content_type = "text/plain"
>>     content_length = int(request.headers_in["Content-Length"])
>>     outPath = "/home/webtmp/upload_out.dat"
>>     outFile = open(outPath, "w")
>>     conn = request.connection
>>     i = 0
>>     buf = ""
>>     while i < content_length:
>>         c = conn.read(1)
>>         if c == "":
>>             break
>>         buf = buf + c
>>         if len(buf) >= 102400:
>>             outFile.write(buf)
>>             outFile.flush()
>>             buf = ""
>>         i += 1
>>     if len(buf):
>>         outFile.write(buf)
>>         outFile.flush()
>>     outFile.close()
>>     talkBack = "Received %s bytes\n" % os.stat(outPath).st_size
>>     request.set_content_length(len(talkBack))
>>     request.write(talkBack)
>>     return apache.OK
>>     


-- 
===
Kurt Nordstrom
Programmer
University of North Texas Libraries
Digital Projects Unit
(940) 891-6747


