[mod_python] Unable to import modules in subdirs

Graham Dumpleton grahamd at dscpl.com.au
Thu Oct 27 23:59:27 EDT 2005


Brandon N wrote ..
> I meant seeing as others had pointed out the concern that one shouldn't put
> .py files under htdocs/ or similar directories for fear that someone might
> find access to one's source files, wholly intact.

The reference to putting .py files under htdocs pertained more to any
shared set of modules used by your application. The idea is to keep the
code in a handler module file as minimal as possible, with callouts to
separate application modules which do the bulk of the work.

There are a number of reasons for doing this. The first is that if an
Apache configuration is stuffed up and .py files are exposed, you
aren't exposing the bulk of the code of your application, i.e. the
important stuff where you might hold things like database
login/password details or pathnames to other files which may contain
sensitive information.

The second reason is to avoid any problems with modules being loaded
both by the standard Python import mechanism and the mod_python module
import mechanism. Mixing the two can cause some issues and it is easier
to avoid the problem by never using "import" to import modules in the
document tree. Best way of doing that is to move shared modules
elsewhere. If you must import a module in the document tree from another
module in the document tree, use apache.import_module() instead.
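
To illustrate the kind of issue that arises when one file is loaded by
two separate mechanisms, here is a minimal sketch. It does not use
mod_python at all; it uses the standard importlib module as a stand-in,
and the file name "shared.py" and the helper load_from_path() are
made up for the example. The point is only that each load produces its
own module object with its own copy of any global data.

```python
import importlib.util
import os
import tempfile


def load_from_path(name, path):
    """Load the file at `path` as a new module object called `name`."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "shared.py")  # hypothetical shared module
        with open(path, "w") as f:
            f.write("counter = 0\n")

        # Two independent loads of one file, standing in for two import
        # mechanisms which each keep their own idea of what is loaded.
        first = load_from_path("shared_a", path)
        second = load_from_path("shared_b", path)

        # A change made through one module object is invisible to the other.
        first.counter += 1
        return first is second, first.counter, second.counter
```

Code which expects to share state through such a module ends up with
two copies of it, which is why it is simpler to keep shared modules out
of the document tree altogether.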

In terms of security of .py files in the document tree, the risk is
similar to that of using .php or .cgi files. In all these cases, if
someone screws up the Apache configuration, source code could be
exposed. This sort of issue will possibly occur more easily when
mod_python is configured from the main Apache configuration file. At
least if mod_python is configured from a .htaccess file in the document
tree, the file is adjacent to the source code and the association is
more easily seen. In the main Apache configuration, it is too easy for
someone to unknowingly remove/disable it, or to wipe it out when
upgrading Apache. With a .htaccess file it will keep working unless the
FileInfo override is disabled or use of .htaccess files is disabled. If
FileInfo is disabled, the result will from memory be a 500 error, so
still safe. Disable .htaccess files though and code can still be
exposed.

What I would personally be more worried about is where the user that
Apache runs as has some sort of write access to the document tree.
If it does, then .pyc and .pyo files can be left in the document tree
from when a module is loaded. If AddHandler is used to only map .py
files to mod_python, then the .pyc and .pyo files can be exposed and
downloadable. If someone had the right tools they could decompile
the bytecode and find out something about your source code, including
possibly sensitive details.
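
As a rough illustration of why exposed bytecode matters, the following
sketch shows that string constants survive compilation intact. The
source text, the file name "settings.py" and the credential in it are
made up for the example. The code object is round-tripped through
marshal, which is the serialisation used for the code stored in a
compiled file, and the strings are then read straight back out without
any decompiler.

```python
import marshal


def string_constants(code):
    """Recursively collect the string constants held by a code object."""
    found = []
    for const in code.co_consts:
        if isinstance(const, str):
            found.append(const)
        elif hasattr(const, "co_consts"):  # nested function or class body
            found.extend(string_constants(const))
    return found


# Hypothetical module source containing a hard-coded credential.
SOURCE = '''
def connect():
    password = "s3cret-example"
    return password
'''


def recover_strings(source):
    code = compile(source, "settings.py", "exec")
    # Serialise and restore the code object, as happens when compiled
    # code is written to disk and later read back.
    restored = marshal.loads(marshal.dumps(code))
    return string_constants(restored)
```

Calling recover_strings(SOURCE) returns a list which includes the
password string, so anything hard-coded in a module should be treated
as readable by whoever can download the compiled file.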

Even if the user Apache runs as doesn't have write access to the
document tree, I would always suggest the following be added to
the Apache configuration.

  <Files *.pyc>
  deny from all
  </Files>
  
  <Files *.pyo>
  deny from all
  </Files>

This will block access to the files if they are created by mod_python
where directories are writable, or if the files are inadvertently copied
there from another location.
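
As a complement to blocking access, one could also periodically check
whether any compiled files have ended up in the document tree. The
following is only a sketch of such a check using os.walk; the function
name find_bytecode_files() and the directory layout in demo() are
invented for the example.

```python
import os
import tempfile

BYTECODE_EXTENSIONS = (".pyc", ".pyo")


def find_bytecode_files(document_root):
    """Return sorted paths, relative to document_root, of compiled files."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(document_root):
        for name in filenames:
            if name.endswith(BYTECODE_EXTENSIONS):
                full = os.path.join(dirpath, name)
                matches.append(os.path.relpath(full, document_root))
    return sorted(matches)


def demo():
    # Build a throwaway tree containing one source file and two
    # compiled files, then report what the scan finds.
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "app"))
        for rel in ("index.py", "index.pyc", os.path.join("app", "util.pyo")):
            open(os.path.join(root, rel), "w").close()
        return find_bytecode_files(root)
```

In practice the function would be pointed at the real document root
rather than a temporary directory.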

As to keeping handler modules out of the document tree, thus eliminating
the danger they could be exposed, this is not really possible with
mod_python as it stands now. With mod_python 3.2 though, there is
potential for it to be done, although it means writing a special handler
which emulates the way that Apache maps URLs to files. The change that
has been made that makes this possible is that in 3.2, it is possible to
modify the value of req.path_info as well as req.filename. Thus a
handler could reevaluate a URL against a part of the filesystem which
isn't in the document tree and then execute a handler to service the
request against what was found.

As an example, in a new system I am working on, you can write
something like:

    import mod_python.publisher

    # 'handlers' is assumed to come from the new system mentioned above.
    handler = handlers.MapLocationToView(
        directory = '/tmp/htdocs',
        resource_extension = '.py',
        script_extension = '.py',
        handler = mod_python.publisher.handler,
    )

The MapLocationToView handler will map a URL to a .py file like Apache
does now when AddHandler is used and then trigger the standard
mod_python.publisher handler. The difference is that in this example,
the files all live outside of the document tree in '/tmp/htdocs'. The
Apache configuration itself knows nothing about that directory and its
contents can't be exposed in any way if the Apache configuration is
stuffed up.
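
To give a feel for the URL to file mapping such a handler has to
emulate, here is a standalone sketch. It is not the MapLocationToView
implementation and doesn't touch mod_python; the function name
map_url_to_file() and the directory layout in demo() are invented for
the example. It walks the URL segments against a directory, stops at
the first segment which matches a .py file and treats the remainder as
the extra path information.

```python
import os
import tempfile


def map_url_to_file(root, url_path, extension=".py"):
    """Split url_path into (filename, path_info) against files under root.

    Returns None when no file under root matches the URL.
    """
    segments = [s for s in url_path.split("/") if s]
    if any(s in (".", "..") for s in segments):
        return None  # refuse attempts to climb out of root
    current = root
    for index, segment in enumerate(segments):
        candidate = os.path.join(current, segment + extension)
        if os.path.isfile(candidate):
            # Whatever follows the matched file becomes path_info.
            rest = segments[index + 1:]
            path_info = "/" + "/".join(rest) if rest else ""
            return candidate, path_info
        current = os.path.join(current, segment)
        if not os.path.isdir(current):
            return None
    return None


def demo():
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "admin"))
        open(os.path.join(root, "admin", "users.py"), "w").close()
        hit = map_url_to_file(root, "/admin/users/list")
        miss = map_url_to_file(root, "/admin/../secret")
        return os.path.relpath(hit[0], root), hit[1], miss
```

In a real handler the two values would presumably be assigned to
req.filename and req.path_info before the request is passed on to the
handler that does the work.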

Graham

> Though
> > In order for
> > Apache to make this determination, the .py files must be in the public
> > directories that Apache is managing.
> made it clear for me.
> 
> Is that at all a security issue. Or rather, is there a standard method
> of referencing code outside of the public directories?
> 
> On 10/27/05, Graham Dumpleton <grahamd at dscpl.com.au> wrote:
> >
> > Brandon N wrote ..
> > > I've checked out Vampire, and it would seem to be exactly that which I
> > > desire (after only a few minutes of experimentation at least). Does one
> > > typically include their .py files with this setup in the public directory
> > > (with indexing and such disabled, naturally)? Or is there a way to reference
> > > files outside of the public system?
> >
> > What do you mean by files? Do you mean the .py files which contain the
> > handlers or other Python helper modules, static files etc?
> >
> > In terms of how most mod_python extensions work, eg, Vampire,
> > mod_python.publisher etc, they rely on the fact that Apache performs the
> > mapping of URL to a physical file in the filesystem. Ie., they work out
> > what to do based on what Apache has set req.filename to. In order for
> > Apache to make this determination, the .py files must be in the public
> > directories that Apache is managing. Note though that this doesn't
> > mean they have to be physically under the main Apache document
> > root as you can use the Alias directive or symlinks and the FollowSymLinks
> > directive to locate them in different places but still appear under the
> > public URL namespace.
> >
> > Anyway, if you can be clearer about what you mean, can possibly give
> > a better answer. :-)
> >
> > Graham
> >
> > > Thanks to the both of you with your help. It's cleared up a great deal for
> > > me.
> > >
> > > Cheers!
> > >
> > > On 10/27/05, Graham Dumpleton <grahamd at dscpl.com.au> wrote:
> > > >
> > > > Jorey Bump wrote ..
> > > > > Brandon N wrote:
> > > > > > > A) Is it requestHandler's job to determine which file was requested and
> > > > > > > respond accordingly (via the request's .filename?) with a switch
> > > > > > > construct or equivalent?
> > > > >
> > > > > > Yes and no. Apache's already passed the file to the handler based on its
> > > > > > extension, presence in a directory, or other criteria. The developer of
> > > > > > the handler gets to decide what the handler does with *whatever* is
> > > > > > passed to it. Some assume it will contain only valid Python code and
> > > > > > process it as such (mod_python.publisher, for example). Some might want
> > > > > > to process proprietary or other file formats using python (you might
> > > > > > make a handler to display Word files, for example), but remain agnostic
> > > > > > about the actual filename or extension. But there's no reason why your
> > > > > > handler can't branch according to the file extension (which is what
> > > > > > Graham's Vampire does, if I'm not mistaken).
> > > >
> > > > > If you are coming from a PHP background where each URL essentially
> > > > > maps to a distinct file, Vampire may well be a good starting point as it
> > > > > works in a similar way at its most basic level.
> > > >
> > > > Thus, where in PHP you might have:
> > > >
> > > > index.php # URL -> /index.php
> > > > search.php # URL -> /search.php
> > > >
> > > > > Vampire would similarly have separate files for each resource, although
> > > > > in Vampire it is the name of the handler within the file which dictates
> > > > > what extension the URL needs to have:
> > > >
> > > > index.py
> > > >
> > > > def handler(req): ... # /index
> > > > def handler_html(req): ... # /index.html
> > > > > def handler_php(req): ... # /index.php (Yes, pretend we are PHP when we aren't).
> > > >
> > > > search.py
> > > >
> > > > def handler(req): ... # /search
> > > > def handler_html(req): ... # /search.html
> > > >
> > > > > Thus, if you want to write your code in the form of basic handlers but a
> > > > > distinct handler for each resource, the basic dispatch mechanism of
> > > > > Vampire is going to allow you to get started quicker.
> > > >
> > > > > Another alternative as Jorey pointed out is mod_python.publisher,
> > > > > however it doesn't allow you to as easily dictate use of multiple
> > > > > different extension types used on URLs nor is it necessarily as easy to
> > > > > mix static files in the same directory.
> > > >
> > > > For a further basic introduction to Vampire, see:
> > > >
> > > > http://www.dscpl.com.au/projects/vampire/articles/vampire-001.html
> > > >
> > > > Graham
> > > >
> > > >
> > > > _______________________________________________
> > > > Mod_python mailing list
> > > > Mod_python at modpython.org
> > > > http://mailman.modpython.org/mailman/listinfo/mod_python
> > > >
> >

