[mod_python] requests are blocking when using req.sendfile

Graham Dumpleton graham.dumpleton at gmail.com
Sat Sep 27 19:34:35 EDT 2008


I haven't read this latest response, but have one guess all the same. I
will say first that working out what the problem is has been made more
difficult by you not offering up any source code for your handler to
show exactly what you are doing.

Anyway, one viable explanation for the problem is that you are using
mod_python session objects. Session objects are not normally unlocked
until the cleanup handler runs, which is after all response content has
been returned. Thus if sending a lot of data back, other requests from
the same user will be locked out, as will requests from other users
unfortunate enough to have their session keys fall into the same
session mutex lock bucket.
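
To illustrate the lock bucket point, here is a toy sketch of how a
fixed pool of locks ends up shared between unrelated sessions. The pool
size, the function names and the use of SHA-256 are all assumptions
made for the illustration; this is not how mod_python itself picks a
lock, only the general idea of hashing a session id onto a small set of
mutexes.

```python
import hashlib

NUM_LOCKS = 8  # hypothetical pool size, chosen only for illustration


def lock_bucket(session_id, num_locks=NUM_LOCKS):
    """Map a session id onto one of a fixed number of lock buckets."""
    digest = hashlib.sha256(session_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_locks


def colliding_sessions(session_ids, num_locks=NUM_LOCKS):
    """Group session ids by bucket and keep only buckets that are shared."""
    buckets = {}
    for sid in session_ids:
        buckets.setdefault(lock_bucket(sid, num_locks), []).append(sid)
    return dict((b, sids) for b, sids in buckets.items() if len(sids) > 1)
```

With more active sessions than locks, at least two sessions must land
in the same bucket, so one user's long download can hold up a
different user whose session happens to share the lock.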

If you are using sessions, just before you return from the handler, do
something like:

  req.session.save() # if necessary
  req.session.unlock()

I only say 'req.session' here as that is the recommended convention for
where you store it; you may well have it stored elsewhere.
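
As a rough sketch of how that could be wrapped up, assuming the session
really is stored as 'req.session': the helper name release_session and
its save flag are made up for this example, and only the save() and
unlock() calls come from the session object itself.

```python
def release_session(req, save=True):
    """Save (optionally) and unlock the session attached to req, if any.

    Returns True when a session was found and unlocked, False otherwise.
    """
    session = getattr(req, "session", None)
    if session is None:
        # No session was attached to this request, nothing to release.
        return False
    if save:
        session.save()
    session.unlock()
    return True
```

You would call release_session(req) once you are finished with the
session data and before the large response body starts going out, so
the lock is not held for the duration of the download.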

Graham

2008/9/28 Matt Barnicle <mattb at wageslavery.org>:
>> I ask again what I asked before and which you didn't answer.
>>
>> How many concurrent requests for these files are you receiving and
>> how long does it take to download a file?
>
> sorry for not answering that earlier.  there aren't all that many.
> we get one or two requests for various pages about every 30 seconds
> during normal traffic, sometimes that number goes up.  we only get
> requests for downloads occasionally.  a much smaller percentage of
> the requests are for the downloads.  they are various sizes.  the
> one that is getting the most requests right now is an 18 MB zip
> file.  it takes around 30 seconds to 2 minutes on a DSL connection
> to complete the download.  you can see for yourself here:
>
> http://giveback.net/download/music/3
>
> i kept the site hidden earlier, because i've posted some code from
> our infrastructure and thought i should keep who i'm working for
> hidden for that reason, but i think it may help toward getting the
> problem solved and i'm not so concerned anymore..  apologies if that
> has hindered the process in any way.
>
>> In respect of your original comment
>>
>>> it works..  but the problem is, requests start backing up whenever a
>>> download is being sent.  it looks like new requests are being sent
>>> to the apache process that is tied up sending the download file.  i
>>> don't understand why this is happening though.  i would think that
>>> while it's sending the file, no new requests would be queued in that
>>> process.
>>
>> If you are truly using prefork MPM, then new requests can't be sent to
>> the busy process as requests are effectively accepted by a process
>> only when it is ready to handle a request.
>>
>> Please verify which MPM is being used by running:
>>
>>   httpd -V
>
> # httpd -V
> Server version: Apache/2.2.3
> Server built:   Jan 15 2008 20:33:41
> Server's Module Magic Number: 20051115:3
> Server loaded:  APR 1.2.7, APR-Util 1.2.7
> Compiled using: APR 1.2.7, APR-Util 1.2.7
> Architecture:   64-bit
> Server MPM:     Prefork
>  threaded:     no
>  forked:     yes (variable process count)
> Server compiled with....
>  -D APACHE_MPM_DIR="server/mpm/prefork"
>  -D APR_HAS_SENDFILE
>  -D APR_HAS_MMAP
>  -D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)
>  -D APR_USE_SYSVSEM_SERIALIZE
>  -D APR_USE_PTHREAD_SERIALIZE
>  -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
>  -D APR_HAS_OTHER_CHILD
>  -D AP_HAVE_RELIABLE_PIPED_LOGS
>  -D DYNAMIC_MODULE_LIMIT=128
>
>> How many httpd processes are running at the time you are making
>> requests that appear to be blocked?
>
> it varies.  could be less than 10, could be between 10 and 40.  i've
> seen it happening with any number of processes, it doesn't seem to
> affect things.
>
>> What happens if you make requests at same time from a different web
>> browser or from another box, are they also stalled?
>
> it depends.  if i download in one browser, then surf around in
> another one, i may very well be able to click around in the other
> the entire time without any problems.  but i may also get the
> blocking issue.  and sometimes i will load the site fresh for the
> first time on some days, and it won't load at all for a few minutes,
> then it will.  and in these times, i will check /server-status and
> see a lot of requests in the WAIT state, and always when it's like
> this, there are one or more downloads happening.
>
>> Are you using any AJAX stuff on client side for handling the
>> download?
>>
>> Graham
>
> no.  there is a lot of AJAX on the site, but none of it related to
> downloads.
>
> i wonder if this could be related to something else, maybe the
> database pooling?  i can't really imagine why, but i do wonder now..
>  as you said above, new requests really shouldn't be sent to busy
> processes.  so i'm baffled.
>
> - m@
>
>> 2008/9/27 Matt Barnicle <mattb at wageslavery.org>:
>>> sorry maybe i wasn't being clear..  i did get your code suggestions
>>> working, so that i've replaced the req.sendfile code with the
>>> apache.DECLINED code.  and it is sending the file ok and i'm
>>> downloading it ok, but it doesn't fix the problem as i originally
>>> posted, that the downloading is causing apache to block me when i
>>> click on links while the download is happening.
>>>
>>> - m@
>>>
>>>> What does your handler actually do? The following works for me no
>>>> problems.
>>>>
>>>> from mod_python import apache
>>>>
>>>> def handler(req):
>>>>     req.filename = '/etc/services'
>>>>     req.finfo = apache.stat(req.filename, apache.APR_FINFO_MIN)
>>>>     return apache.DECLINED
>>>>
>>>> Does the file exist prior to calling apache.stat()?
>>>>
>>>> Apache even sets the output headers okay:
>>>>
>>>> HTTP/1.1 200 OK
>>>> Date: Fri, 26 Sep 2008 09:50:33 GMT
>>>> Server: Apache/2.2.8 (Unix) mod_python/3.3.2-dev-20080311 Python/2.5.1 mod_ssl/2.2.8 OpenSSL/0.9.7l DAV/2 mod_wsgi/3.0-TRUNK
>>>> Last-Modified: Sun, 23 Sep 2007 21:37:31 GMT
>>>> ETag: "535d-a5847-43ad44fac1cc0"
>>>> Accept-Ranges: bytes
>>>> Content-Length: 677959
>>>> Connection: close
>>>> Content-Type: text/plain
>>>>
>>>> Ie., ETag and length etc.
>>>>
>>>> Graham
>>>>
>>>> 2008/9/26 Matt Barnicle <mattb at wageslavery.org>:
>>>>> nope  :-(
>>>>>
>>>>>> One last thing to add then. If this doesn't work, I'll have to
>>>>>> just try it myself. :-)
>>>>>>
>>>>>>   req.handler = 'default-handler'
>>>>>>
>>>>>> Graham
>>>>>>
>>>>>> 2008/9/26 Matt Barnicle <mattb at wageslavery.org>:
>>>>>>> ok, that worked..  but unfortunately the blocking issue is still
>>>>>>> happening.  i've verified that the download controller is returning
>>>>>>> apache.DECLINED and the file downloads ok.
>>>>>>>
>>>>>>> - m@
>>>>>>>
>>>>>>>> Try adding:
>>>>>>>>
>>>>>>>>   req.path_info = ''
>>>>>>>>
>>>>>>>> Graham
>>>>>>>>
>>>>>>>> 2008/9/26 Matt Barnicle <mattb at wageslavery.org>:
>>>>>>>>>> 2008/9/25 Matt Barnicle <mattb at wageslavery.org>:
>>>>>>>>>>>> 2008/9/24 Matt Barnicle <mattb at wageslavery.org>:
>>>>>>>>>>>>>> Are you setting a content length on the response before calling
>>>>>>>>>>>>>> req.sendfile()?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> How many concurrent requests for these files are you receiving and
>>>>>>>>>>>>>> how long does it take to download a file?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Graham
>>>>>>>>>>>>>
>>>>>>>>>>>>> yes, i believe so..  now that you mention it, i haven't verified
>>>>>>>>>>>>> the file size from the code below, i will do that to be sure.  here
>>>>>>>>>>>>> is the code:
>>>>>>>>>>>>>
>>>>>>>>>>>>> file_path = self.conf.run.app_folder + 'public' + download.path
>>>>>>>>>>>>> file_stat = os.stat(file_path)
>>>>>>>>>>>>> file_size = str(file_stat.st_size)
>>>>>>>>>>>>>
>>>>>>>>>>>>> self.req.headers_out['Content-Length'] = file_size
>>>>>>>>>>>>> self.req.headers_out['Content-Disposition'] = 'attachment; filename=%s' % os.path.basename(file_path)
>>>>>>>>>>>>>
>>>>>>>>>>>>> bytes_sent = self.req.sendfile(file_path)
>>>>>>>>>>>>
>>>>>>>>>>>> BTW, now that you are using mod_python 3.3.1, instead of using
>>>>>>>>>>>> req.sendfile(), you could possibly delegate serving of the static file
>>>>>>>>>>>> back to Apache. From memory this can be done using:
>>>>>>>>>>>>
>>>>>>>>>>>>   req.filename = self.conf.run.app_folder + 'public' + download.path
>>>>>>>>>>>>   req.finfo = apache.stat(req.filename, apache.APR_FINFO_MIN)
>>>>>>>>>>>>   return apache.DECLINED
>>>>>>>>>>>>
>>>>>>>>>>>> By returning apache.DECLINED you say the mod_python handler will not
>>>>>>>>>>>> actually handle it, and by having updated req.filename and req.finfo
>>>>>>>>>>>> to the new file, when it falls through to default-handler it should
>>>>>>>>>>>> serve it as a static file.
>>>>>>>>>>>>
>>>>>>>>>>>> Doing it this way should bypass prior Apache access control checks.
>>>>>>>>>>>> Thus file doesn't need to be in Apache document tree or anywhere else
>>>>>>>>>>>> that is accessible.
>>>>>>>>>>>>
>>>>>>>>>>>> I believe doing it this way also has benefit that Apache will
>>>>>>>>>>>> automatically set various headers that you probably wouldn't be.
>>>>>>>>>>>>
>>>>>>>>>>>> Please let us know if this alternate method works.
>>>>>>>>>>>>
>>>>>>>>>>>> Graham
>>>>>>>>>>> i tried the following but it doesn't seem to do what i want:
>>>>>>>>>>>
>>>>>>>>>>> DocumentRoot /var/www/my/application/public
>>>>>>>>>>> <Directory /var/www/my/application/public>
>>>>>>>>>>>  AddHandler python-program .py
>>>>>>>>>>>  PythonHandler myhandler
>>>>>>>>>>> </Directory>
>>>>>>>>>>
>>>>>>>>>> Use SetHandler instead of AddHandler.
>>>>>>>>>>
>>>>>>>>>>   DocumentRoot /var/www/my/application/public
>>>>>>>>>>   <Directory /var/www/my/application/public>
>>>>>>>>>>    SetHandler python-program
>>>>>>>>>>    PythonHandler myhandler
>>>>>>>>>>   </Directory>
>>>>>>>>>>
>>>>>>>>>> Graham
>>>>>>>>>
>>>>>>>>> ok, getting closer..  now i get a 404 error in the browser saying
>>>>>>>>> 'The requested URL /download/file/3 was not found on this server.'
>>>>>>>>> and in the apache logs:
>>>>>>>>>
>>>>>>>>> File does not exist:
>>>>>>>>> /var/www/application/public/downloads/music/music.zip/file/3
>>>>>>>>>
>>>>>>>>> so it looks like it's taking the last part of the URI and appending
>>>>>>>>> it to the physical filename and looking for that...  the request URI
>>>>>>>>> is 'http://example.com/download/file/3'.  the 'download' controller
>>>>>>>>> is what manages the music downloads.
>>>>>>>>>
>>>>>>>>> - m@

