[mod_python] urlopen permissions

Emlyn Jones emlynj at gmail.com
Wed Jan 11 06:44:22 EST 2006


You're right!
I thought I had solved that one by explicitly adding a user-agent header.
Turns out the urllib2 in Python 2.2 adds its own anyway (mod_python used 2.2,
my shell gives me 2.4). I've upgraded mod_python to Python 2.4 and it works.
Now I get a seg fault from xml.dom.minidom.parseString, which again only
happens via mod_python, not from the shell (although I guess that could just
be a fluke), grrr. More googling I guess!
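
For anyone hitting the same thing, here is a minimal sketch of checking which
User-Agent would go out with a request, without touching the network. It is
written against Python 3's urllib.request (the successor to urllib2), so it
shows current behaviour and is not a reproduction of what 2.2 did. The helper
names, the example URL and the agent string are all made up for illustration.

```python
import urllib.request


def build_request(url, user_agent):
    """Build (but do not send) a request carrying an explicit User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def effective_user_agent(request, opener):
    """Return the User-Agent an opener would send for this request.

    Hypothetical helper. It follows the Python 3 rule that an opener's
    default headers are applied only when the request does not already
    carry a header of the same name.
    """
    # Request normalises header names with str.capitalize(), so the
    # stored key is "User-agent" whatever case was passed in.
    if request.has_header("User-agent"):
        return request.get_header("User-agent")
    for name, value in opener.addheaders:
        if name.capitalize() == "User-agent":
            return value
    return None


opener = urllib.request.build_opener()

# Placeholder URL and agent string, assumptions for the example only.
explicit = build_request("http://www.example.com/", "my-script/0.1")
plain = urllib.request.Request("http://www.example.com/")

print(effective_user_agent(explicit, opener))  # the explicit value
print(effective_user_agent(plain, opener))     # the library default
```

If the two printed values differ between the shell and mod_python, the
interpreters in use are probably not the same version.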

Thanks for your help.
Emlyn.
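
A rough sketch of how the source of a 403 could be narrowed down, following
the diagnosis in the quoted message below: catch the HTTPError and look at its
status code and Server header. This again assumes Python 3's urllib.error, not
the urllib2 used in the thread. The helper name and the fake response are
invented for the example, and the Server header is optional, so treat it as a
hint only.

```python
import email.message
import io
import urllib.error


def describe_http_error(err):
    """Summarise an HTTPError: status code, reason and Server header."""
    server = err.headers.get("Server", "unknown")
    return "HTTP %d %s (Server: %s)" % (err.code, err.reason, server)


# Build a fake 403 by hand so the sketch runs without network access.
# The URL and the Server value are placeholders, not real observations.
headers = email.message.Message()
headers["Server"] = "example-remote-server"
fake_error = urllib.error.HTTPError(
    "http://www.example.com/", 403, "Forbidden", headers, io.BytesIO(b"")
)

summary = describe_http_error(fake_error)
print(summary)
```

In a real script the same helper would be called from an
"except urllib.error.HTTPError as err:" clause around the urlopen call.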

On 1/10/06, Colin Bean <ccbean at gmail.com> wrote:
>
> It looks like the 403 is being returned by Google, rather than your copy
> of Apache.  I believe that Google has some rules designed to block web
> crawlers / scrapers under some circumstances.  One possibility would be that
> running your script under apache changes something in the request headers
> that makes Google block the request.  That's just a guess, though (and my
> knowledge of Google's behavior is based on a project I did a couple of years
> ago; so take it with a grain of salt).  Have you tried this code on some
> other URLs?  What kind of results do you get then?
>
> hth,
> -Colin
>
>
>
> On 1/10/06, Emlyn Jones <emlynj at gmail.com> wrote:
>
> > Hello,
> > I'm not convinced that this is a specific mod_python problem but I'm not
> > sure exactly how to explain it in generic terms so hopefully someone here
> > can point me in the right direction.
> > I have a Python script which calls urllib2.urlopen to open a URL on a
> > remote server (Google). It works fine from the command line, but when I
> > run it from a mod_python.psp page I get:
> >
> > HTTPError: HTTP Error 403: Forbidden
> >
> > Clearly this is something to do with the permissions of the apache user
> > vs my shell user but what is the safest way to allow this script to run?
> >
> > Regards,
> > Emlyn.