[mod_python] requests are blocking when using req.sendfile

Matt Barnicle mattb at wageslavery.org
Sat Sep 27 18:21:39 EDT 2008


> I ask again what I asked before and which you didn't answer.
>
> How many concurrent requests for these files are you receiving and
> how long does it take to download a file?

sorry for not answering that earlier.  there aren't all that many. 
we get one or two requests for various pages about every 30 seconds
during normal traffic, sometimes that number goes up.  we only get
requests for downloads occasionally.  a much smaller percentage of
the requests are for the downloads.  they are various sizes.  the
one that is getting the most requests right now is an 18 MB zip
file.  it takes around 30 seconds to 2 minutes on a DSL connection
to complete the download.  you can see for yourself here:

http://giveback.net/download/music/3

i kept the site hidden earlier, because i've posted some code from
our infrastructure and thought i should keep who i'm working for
hidden for that reason, but i think it may help toward getting the
problem solved and i'm not so concerned anymore..  apologies if that
has hindered the process in any way.

> In respect of your original comment
>
>> it works..  but the problem is, requests start backing up whenever a
>> download is being sent.  it looks like new requests are being sent
>> to the apache process that is tied up sending the download file.  i
>> don't understand why this is happening though.  i would think that
>> while it's sending the file, no new requests would be queued in that
>> process.
>
> If you are truly using prefork MPM, then new requests can't be sent to
> the busy process as requests are effectively accepted by a process
> only when it is ready to handle a request.
>
> Please verify which MPM is being used by running:
>
>   httpd -V

# httpd -V
Server version: Apache/2.2.3
Server built:   Jan 15 2008 20:33:41
Server's Module Magic Number: 20051115:3
Server loaded:  APR 1.2.7, APR-Util 1.2.7
Compiled using: APR 1.2.7, APR-Util 1.2.7
Architecture:   64-bit
Server MPM:     Prefork
  threaded:     no
  forked:     yes (variable process count)
Server compiled with....
 -D APACHE_MPM_DIR="server/mpm/prefork"
 -D APR_HAS_SENDFILE
 -D APR_HAS_MMAP
 -D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)
 -D APR_USE_SYSVSEM_SERIALIZE
 -D APR_USE_PTHREAD_SERIALIZE
 -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
 -D APR_HAS_OTHER_CHILD
 -D AP_HAVE_RELIABLE_PIPED_LOGS
 -D DYNAMIC_MODULE_LIMIT=128

> How many httpd processes are running at the time you are making
> requests that appear to be blocked?

it varies.  could be less than 10, could be between 10 and 40.  i've
seen it happening with any number of processes, it doesn't seem to
affect things.

> What happens if you make requests at same time from a different web
> browser or from another box, are they also stalled?

it depends.  if i download in one browser, then surf around in
another one, i may very well be able to click around in the other
the entire time without any problems.  but i may also get the
blocking issue.  and sometimes i will load the site fresh for the
first time on some days, and it won't load at all for a few minutes,
then it will.  and in these times, i will check /server-status and
see a lot of requests in the WAIT state, and always when it's like
this, there are one or more downloads happening.

> Are you using any AJAX stuff on client side for handling the
> download?
>
> Graham

no.  there is a lot of AJAX on the site, but none of it related to
downloads.

i wonder if this could be related to something else, maybe the
database pooling?  i can't really imagine why, but i do wonder now..
as you said above, new requests really shouldn't be sent to busy
processes.  so i'm baffled.
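
for anyone wanting to watch this over time rather than eyeballing /server-status, here is a minimal sketch of tallying worker states from the machine-readable output that mod_status serves at /server-status?auto.  it assumes mod_status is enabled and that the text has already been fetched; the function name tally_scoreboard is hypothetical and not part of any apache or mod_python api.  note that in the mod_status key, "_" means waiting for connection and "W" means sending reply.

```python
from collections import Counter

# Scoreboard key characters as documented by Apache mod_status.
SCOREBOARD_KEYS = {
    "_": "waiting for connection",
    "S": "starting up",
    "R": "reading request",
    "W": "sending reply",
    "K": "keepalive (read)",
    "D": "DNS lookup",
    "C": "closing connection",
    "L": "logging",
    "G": "gracefully finishing",
    "I": "idle cleanup of worker",
    ".": "open slot with no current process",
}


def tally_scoreboard(status_text):
    """Count worker states in the text of a /server-status?auto response.

    Hypothetical helper: returns a dict mapping a state description to the
    number of scoreboard slots in that state, or {} if no Scoreboard line
    is present in the text.
    """
    for line in status_text.splitlines():
        if line.startswith("Scoreboard:"):
            board = line.split(":", 1)[1].strip()
            counts = Counter(board)
            # Unknown characters are kept under their raw key.
            return dict((SCOREBOARD_KEYS.get(k, k), n) for k, n in counts.items())
    return {}
```

logging this tally every few seconds while a download is in progress would show whether the slots are really all busy sending replies or whether idle processes are available when the stall happens.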

- m@

> 2008/9/27 Matt Barnicle <mattb at wageslavery.org>:
>> sorry maybe i wasn't being clear..  i did get your code suggestions
>> working, so that i've replaced the req.sendfile code with the
>> apache.DECLINED code.  and it is sending the file ok and i'm
>> downloading it ok, but it doesn't fix the problem as i originally
>> posted, that the downloading is causing apache to block me when i
>> click on links while the download is happening.
>>
>> - m@
>>
>>> What does your handler actually do? The following works for me no
>>> problems.
>>>
>>> from mod_python import apache
>>>
>>> def handler(req):
>>>     req.filename = '/etc/services'
>>>     req.finfo = apache.stat(req.filename, apache.APR_FINFO_MIN)
>>>     return apache.DECLINED
>>>
>>> Does the file exist prior to calling apache.stat()?
>>>
>>> Apache even sets the output headers okay:
>>>
>>> HTTP/1.1 200 OK
>>> Date: Fri, 26 Sep 2008 09:50:33 GMT
>>> Server: Apache/2.2.8 (Unix) mod_python/3.3.2-dev-20080311 Python/2.5.1 mod_ssl/2.2.8 OpenSSL/0.9.7l DAV/2 mod_wsgi/3.0-TRUNK
>>> Last-Modified: Sun, 23 Sep 2007 21:37:31 GMT
>>> ETag: "535d-a5847-43ad44fac1cc0"
>>> Accept-Ranges: bytes
>>> Content-Length: 677959
>>> Connection: close
>>> Content-Type: text/plain
>>>
>>> I.e., ETag, length, etc.
>>>
>>> Graham
>>>
>>> 2008/9/26 Matt Barnicle <mattb at wageslavery.org>:
>>>> nope  :-(
>>>>
>>>>> One last thing to add then. If this doesn't work, I'll have to just
>>>>> try it myself. :-)
>>>>>
>>>>>   req.handler = 'default-handler'
>>>>>
>>>>> Graham
>>>>>
>>>>> 2008/9/26 Matt Barnicle <mattb at wageslavery.org>:
>>>>>> ok, that worked..  but unfortunately the blocking issue is still
>>>>>> happening.  i've verified that the download controller is returning
>>>>>> apache.DECLINED and the file downloads ok.
>>>>>>
>>>>>> - m@
>>>>>>
>>>>>>> Try adding:
>>>>>>>
>>>>>>>   req.path_info = ''
>>>>>>>
>>>>>>> Graham
>>>>>>>
>>>>>>> 2008/9/26 Matt Barnicle <mattb at wageslavery.org>:
>>>>>>>>> 2008/9/25 Matt Barnicle <mattb at wageslavery.org>:
>>>>>>>>>>> 2008/9/24 Matt Barnicle <mattb at wageslavery.org>:
>>>>>>>>>>>>> Are you setting a content length on the response before calling
>>>>>>>>>>>>> req.sendfile()?
>>>>>>>>>>>>>
>>>>>>>>>>>>> How many concurrent requests for these files are you receiving and
>>>>>>>>>>>>> how long does it take to download a file?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Graham
>>>>>>>>>>>>
>>>>>>>>>>>> yes, i believe so..  now that you mention it, i haven't verified
>>>>>>>>>>>> the file size from the code below, i will do that to be sure.  here
>>>>>>>>>>>> is the code:
>>>>>>>>>>>>
>>>>>>>>>>>> file_path = self.conf.run.app_folder + 'public' + download.path
>>>>>>>>>>>> file_stat = os.stat(file_path)
>>>>>>>>>>>> file_size = str(file_stat.st_size)
>>>>>>>>>>>>
>>>>>>>>>>>> self.req.headers_out['Content-Length'] = file_size
>>>>>>>>>>>> self.req.headers_out['Content-Disposition'] = 'attachment; filename=%s' % os.path.basename(file_path)
>>>>>>>>>>>>
>>>>>>>>>>>> bytes_sent = self.req.sendfile(file_path)
>>>>>>>>>>>
>>>>>>>>>>> BTW, now that you are using mod_python 3.3.1, instead of using
>>>>>>>>>>> req.sendfile(), you could possibly delegate serving of the static
>>>>>>>>>>> file back to Apache. From memory this can be done using:
>>>>>>>>>>>
>>>>>>>>>>>   req.filename = self.conf.run.app_folder + 'public' + download.path
>>>>>>>>>>>   req.finfo = apache.stat(req.filename, apache.APR_FINFO_MIN)
>>>>>>>>>>>   return apache.DECLINED
>>>>>>>>>>>
>>>>>>>>>>> By returning apache.DECLINED you say that the mod_python handler
>>>>>>>>>>> will not actually handle the request, and with req.filename and
>>>>>>>>>>> req.finfo updated to the new file, when it falls through to
>>>>>>>>>>> default-handler it should be served as a static file.
>>>>>>>>>>>
>>>>>>>>>>> Doing it this way should bypass prior Apache access control checks.
>>>>>>>>>>> Thus the file doesn't need to be in the Apache document tree or
>>>>>>>>>>> anywhere else that is accessible.
>>>>>>>>>>>
>>>>>>>>>>> I believe doing it this way also has the benefit that Apache will
>>>>>>>>>>> automatically set various headers that you probably wouldn't be
>>>>>>>>>>> setting yourself.
>>>>>>>>>>>
>>>>>>>>>>> Please let us know if this alternate method works.
>>>>>>>>>>>
>>>>>>>>>>> Graham
>>>>>>>>>> i tried the following but it doesn't seem to do what i want:
>>>>>>>>>>
>>>>>>>>>> DocumentRoot /var/www/my/application/public
>>>>>>>>>> <Directory /var/www/my/application/public>
>>>>>>>>>>  AddHandler python-program .py
>>>>>>>>>>  PythonHandler myhandler
>>>>>>>>>> </Directory>
>>>>>>>>>
>>>>>>>>> Use SetHandler instead of AddHandler.
>>>>>>>>>
>>>>>>>>>   DocumentRoot /var/www/my/application/public
>>>>>>>>>   <Directory /var/www/my/application/public>
>>>>>>>>>    SetHandler python-program
>>>>>>>>>    PythonHandler myhandler
>>>>>>>>>   </Directory>
>>>>>>>>>
>>>>>>>>> Graham
>>>>>>>>
>>>>>>>> ok, getting closer..  now i get a 404 error in the browser saying
>>>>>>>> 'The requested URL /download/file/3 was not found on this server.'
>>>>>>>> and in the apache logs:
>>>>>>>> and in the apache logs:
>>>>>>>>
>>>>>>>> File does not exist:
>>>>>>>> /var/www/application/public/downloads/music/music.zip/file/3
>>>>>>>>
>>>>>>>> so it looks like it's taking the last part of the URI and appending
>>>>>>>> it to the physical filename and looking for that...  the request URI
>>>>>>>> is 'http://example.com/download/file/3'.  the 'download' controller
>>>>>>>> is what manages the music downloads.
>>>>>>>>
>>>>>>>> - m@