[mod_python] Multiple loading of same module with winnt MPM

Graham Dumpleton graham.dumpleton at gmail.com
Thu May 10 20:46:14 EDT 2007


Your change narrows the chance of the problem happening but does not
eliminate it completely. Also, it circumvents the whole two level
locking scheme designed to avoid slow downs in a multithreaded system.
In other words, when a module has to be imported you block all other
threads being able to load modules which don't require an actual
import. Also, I suspect the change will also cause a deadlock when a
module being imported in turn tries to use apache.import_module() as
the locks aren't reentrant by the same thread.

More later.

Graham

On 10/05/07, Michael Sanders <m.r.sanders at gmail.com> wrote:
> Michael Sanders wrote:
> > Further to this, one of my colleagues has made some other changes to
> > 'importer.py' that appear to fix the problem (please find a diff attached).
> >
> > The modification is a change to the locking in the import_module()
> > method of the _ModuleCache class:
> >
> > The 'self._lock1' lock is now used to lock out the actual 'loading' code
> > as well as the check-for-cache code: _reload_required().
> >
> > See line 554 ish (Code up to line 769 indented)
>
> Apologies, I sent the whole file and not a diff. I've (hopefully)
> attached the diff this time!
>
> Thanks,
> Michael
>
> --- importer.py 2007-05-10 13:57:01.157138700 +0100
> +++ importer.py.new     2007-05-10 14:18:22.976417100 +0100
> @@ -551,218 +551,222 @@
>              # Import module or obtain it from cache as is
>              # appropriate.
>
> -            if load:
> -
> -                # Setup a new empty module to load the code for
> -                # the module into. Increment the instance count
> -                # and set the reload flag to force a reload if
> -                # the import fails.
> -
> -                cache.instance = cache.instance + 1
> -                cache.reload = 1
> -
> -                module = imp.new_module(label)
> -
> -                # If the module was previously loaded we need to
> -                # manage the transition to the new instance of
> -                # the module that is being loaded to replace it.
> -                # This entails calling the special clone method,
> -                # if provided within the existing module. Using
> -                # this method the existing module can then
> -                # selectively indicate what should be transfered
> -                # over to the next instance of the module,
> -                # including thread locks. If this process fails
> -                # the special purge method is called, if
> -                # provided, to indicate that the existing module
> -                # is being forcibly purged out of the system. In
> -                # that case any existing state will not be
> -                # transferred.
> -
> -                if cache.module != None:
> -                    if hasattr(cache.module, "__mp_clone__"):
> -                        try:
> -                            # Migrate any existing state data from
> -                            # existing module instance to new module
> -                            # instance.
> -
> -                            if log:
> -                                msg = "Cloning module '%s'" % file
> +            try:
> +                self._lock1.acquire()
> +                if load:
> +
> +                    # Setup a new empty module to load the code for
> +                    # the module into. Increment the instance count
> +                    # and set the reload flag to force a reload if
> +                    # the import fails.
> +
> +                    cache.instance = cache.instance + 1
> +                    cache.reload = 1
> +
> +                    module = imp.new_module(label)
> +
> +                    # If the module was previously loaded we need to
> +                    # manage the transition to the new instance of
> +                    # the module that is being loaded to replace it.
> +                    # This entails calling the special clone method,
> +                    # if provided within the existing module. Using
> +                    # this method the existing module can then
> +                    # selectively indicate what should be transfered
> +                    # over to the next instance of the module,
> +                    # including thread locks. If this process fails
> +                    # the special purge method is called, if
> +                    # provided, to indicate that the existing module
> +                    # is being forcibly purged out of the system. In
> +                    # that case any existing state will not be
> +                    # transferred.
> +
> +                    if cache.module != None:
> +                        if hasattr(cache.module, "__mp_clone__"):
> +                            try:
> +                                # Migrate any existing state data from
> +                                # existing module instance to new module
> +                                # instance.
> +
> +                                if log:
> +                                    msg = "Cloning module '%s'" % file
> +                                    self._log_notice(msg)
> +
> +                                cache.module.__mp_clone__(module)
> +
> +                            except:
> +                                # Forcibly purging module from system.
> +
> +                                self._log_exception()
> +
> +                                if log:
> +                                    msg = "Purging module '%s'" % file
> +                                    self._log_notice(msg)
> +
> +                                if hasattr(cache.module, "__mp_purge__"):
> +                                    try:
> +                                        cache.module.__mp_purge__()
> +                                    except:
> +                                        self._log_exception()
> +
> +                                cache.module = None
> +
> +                                # Setup a fresh new module yet again.
> +
> +                                module = imp.new_module(label)
> +
> +                        if log:
> +                            if cache.module == None:
> +                                msg = "Importing module '%s'" % file
>                                  self._log_notice(msg)
> -
> -                            cache.module.__mp_clone__(module)
> -
> -                        except:
> -                            # Forcibly purging module from system.
> -
> -                            self._log_exception()
> -
> -                            if log:
> -                                msg = "Purging module '%s'" % file
> +                            else:
> +                                msg = "Reimporting module '%s'" % file
>                                  self._log_notice(msg)
> -
> -                            if hasattr(cache.module, "__mp_purge__"):
> -                                try:
> -                                    cache.module.__mp_purge__()
> -                                except:
> -                                    self._log_exception()
> -
> -                            cache.module = None
> -
> -                            # Setup a fresh new module yet again.
> -
> -                            module = imp.new_module(label)
> -
> -                    if log:
> -                        if cache.module == None:
> +                    else:
> +                        if log:
>                              msg = "Importing module '%s'" % file
>                              self._log_notice(msg)
> -                        else:
> -                            msg = "Reimporting module '%s'" % file
> -                            self._log_notice(msg)
> -                else:
> -                    if log:
> -                        msg = "Importing module '%s'" % file
> -                        self._log_notice(msg)
> -
> -                # Must add to the module the path to the modules
> -                # file. This ensures that module looks like a
> -                # normal module and this path will also be used
> -                # in certain contexts when the import statement
> -                # is used within the module.
> -
> -                module.__file__ = file
> -
> -                # Setup a new instance object to store in the
> -                # module. This will refer back to the actual
> -                # cache entry and is used to record information
> -                # which is specific to this incarnation of the
> -                # module when reloading is occuring.
> -
> -                instance = _InstanceInfo(label, file, cache)
> -
> -                module.__mp_info__ = instance
> -
> -                # Cache any additional module search path which
> -                # should be used for this instance of the module
> -                # or package. The path shouldn't be able to be
> -                # changed during the lifetime of the module to
> -                # ensure that module imports are always done
> -                # against the same path.
> -
> -                if path is None:
> -                    path = []
> -
> -                module.__mp_path__ = list(path)
> -
> -                # Place a reference to the module within the
> -                # request specific cache of imported modules.
> -                # This makes module lookup more efficient when
> -                # the same module is imported more than once
> -                # within the context of a request. In the case
> -                # of a cyclical import, avoids a never ending
> -                # recursive loop.
> -
> -                if modules is not None:
> -                    modules[label] = module
> -
> -                # If this is a child import of some parent
> -                # module, add this module as a child of the
> -                # parent.
> -
> -                atime = time.time()
> +
> +                    # Must add to the module the path to the modules
> +                    # file. This ensures that module looks like a
> +                    # normal module and this path will also be used
> +                    # in certain contexts when the import statement
> +                    # is used within the module.
> +
> +                    module.__file__ = file
> +
> +                    # Setup a new instance object to store in the
> +                    # module. This will refer back to the actual
> +                    # cache entry and is used to record information
> +                    # which is specific to this incarnation of the
> +                    # module when reloading is occuring.
> +
> +                    instance = _InstanceInfo(label, file, cache)
> +
> +                    module.__mp_info__ = instance
> +
> +                    # Cache any additional module search path which
> +                    # should be used for this instance of the module
> +                    # or package. The path shouldn't be able to be
> +                    # changed during the lifetime of the module to
> +                    # ensure that module imports are always done
> +                    # against the same path.
> +
> +                    if path is None:
> +                        path = []
> +
> +                    module.__mp_path__ = list(path)
> +
> +                    # Place a reference to the module within the
> +                    # request specific cache of imported modules.
> +                    # This makes module lookup more efficient when
> +                    # the same module is imported more than once
> +                    # within the context of a request. In the case
> +                    # of a cyclical import, avoids a never ending
> +                    # recursive loop.
> +
> +                    if modules is not None:
> +                        modules[label] = module
> +
> +                    # If this is a child import of some parent
> +                    # module, add this module as a child of the
> +                    # parent.
> +
> +                    atime = time.time()
> +
> +                    if parent_info:
> +                        parent_info.children[label] = atime
> +
> +                    # Perform actual import of the module.
> +
> +                    try:
> +                        execfile(file, module.__dict__)
> +
> +                    except:
> +
> +                        # Importation of the module has failed for
> +                        # some reason. If this is the very first
> +                        # import of the module, need to discard the
> +                        # cache entry entirely else a subsequent
> +                        # attempt to load the module will wrongly
> +                        # think it was successfully loaded already.
> +
> +                        if cache.module is None:
> +                            del self._cache[label]
> +
> +                        raise
> +
> +                    # Update the cache and clear the reload flag.
> +
> +                    cache.module = module
> +                    cache.reload = 0
> +
> +                    # Need to also update the list of child modules
> +                    # stored in the cache entry with the actual
> +                    # children found during the import. A copy is
> +                    # made, meaning that any future imports
> +                    # performed within the context of the request
> +                    # handler don't result in the module later being
> +                    # reloaded if they change.
> +
> +                    cache.children = dict(module.__mp_info__.children)
> +
> +                    # Create link to embedded path at end of import.
> +
> +                    cache.path = module.__mp_path__
> +
> +                    # Increment the generation count of the global
> +                    # state of all modules. This is used in the
> +                    # dependency management scheme for reloading to
> +                    # determine if a module dependency has been
> +                    # reloaded since it was loaded.
> +
> +                    self._lock2.acquire()
> +                    self._generation = self._generation + 1
> +                    cache.generation = self._generation
> +                    self._lock2.release()
> +
> +                    # Update access time and reset access counts.
> +
> +                    cache.ctime = atime
> +                    cache.atime = atime
> +                    cache.direct = 1
> +                    cache.indirect = 0
>
> -                if parent_info:
> -                    parent_info.children[label] = atime
> -
> -                # Perform actual import of the module.
> -
> -                try:
> -                    execfile(file, module.__dict__)
> -
> -                except:
> -
> -                    # Importation of the module has failed for
> -                    # some reason. If this is the very first
> -                    # import of the module, need to discard the
> -                    # cache entry entirely else a subsequent
> -                    # attempt to load the module will wrongly
> -                    # think it was successfully loaded already.
> -
> -                    if cache.module is None:
> -                        del self._cache[label]
> -
> -                    raise
> -
> -                # Update the cache and clear the reload flag.
> -
> -                cache.module = module
> -                cache.reload = 0
> -
> -                # Need to also update the list of child modules
> -                # stored in the cache entry with the actual
> -                # children found during the import. A copy is
> -                # made, meaning that any future imports
> -                # performed within the context of the request
> -                # handler don't result in the module later being
> -                # reloaded if they change.
> -
> -                cache.children = dict(module.__mp_info__.children)
> -
> -                # Create link to embedded path at end of import.
> -
> -                cache.path = module.__mp_path__
> -
> -                # Increment the generation count of the global
> -                # state of all modules. This is used in the
> -                # dependency management scheme for reloading to
> -                # determine if a module dependency has been
> -                # reloaded since it was loaded.
> -
> -                self._lock2.acquire()
> -                self._generation = self._generation + 1
> -                cache.generation = self._generation
> -                self._lock2.release()
> -
> -                # Update access time and reset access counts.
> -
> -                cache.ctime = atime
> -                cache.atime = atime
> -                cache.direct = 1
> -                cache.indirect = 0
> -
> -            else:
> -
> -                # Update the cache.
> -
> -                module = cache.module
> -
> -                # Place a reference to the module within the
> -                # request specific cache of imported modules.
> -                # This makes module lookup more efficient when
> -                # the same module is imported more than once
> -                # within the context of a request. In the case
> -                # of a cyclical import, avoids a never ending
> -                # recursive loop.
> -
> -                if modules is not None:
> -                    modules[label] = module
> -
> -                # If this is a child import of some parent
> -                # module, add this module as a child of the
> -                # parent.
> -
> -                atime = time.time()
> -
> -                if parent_info:
> -                    parent_info.children[label] = atime
> -
> -                # Didn't need to reload the module so simply
> -                # increment access counts and last access time.
> -
> -                cache.atime = atime
> -                cache.direct = cache.direct + 1
> -
> -            return module
> +                else:
> +
> +                    # Update the cache.
> +
> +                    module = cache.module
> +
> +                    # Place a reference to the module within the
> +                    # request specific cache of imported modules.
> +                    # This makes module lookup more efficient when
> +                    # the same module is imported more than once
> +                    # within the context of a request. In the case
> +                    # of a cyclical import, avoids a never ending
> +                    # recursive loop.
> +
> +                    if modules is not None:
> +                        modules[label] = module
> +
> +                    # If this is a child import of some parent
> +                    # module, add this module as a child of the
> +                    # parent.
> +
> +                    atime = time.time()
> +
> +                    if parent_info:
> +                        parent_info.children[label] = atime
> +
> +                    # Didn't need to reload the module so simply
> +                    # increment access counts and last access time.
> +
> +                    cache.atime = atime
> +                    cache.direct = cache.direct + 1
> +
> +                return module
> +            finally:
> +                self._lock1.release() # always unlock
>
>          finally:
>              # Lock on cache object can now be released.
>
>


More information about the Mod_python mailing list