[mod_python] Reg Word Document Reading

Nicolas Lehuen nicolas at lehuen.com
Fri Feb 24 01:51:17 EST 2006


Oh - of course it's only a starting point, since using ActiveDocument
and Documents.Close is a big no-no in a server process where multiple
requests can be handled concurrently. You don't want one request to
access the text from another request's document. But there are other
ways to open, read and close one specific document; see the
documentation.
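
One such way, given here as a minimal sketch and not as something taken
from the page quoted below: keep the Document object that Documents.Open
returns, and read and close only that object. The helper name
read_doc_text and its "word" parameter are hypothetical; "word" is
assumed to be a Word.Application object created elsewhere.

```python
def read_doc_text(word, path):
    """Return the text of the Word document at `path`.

    `word` is assumed to be a Word.Application COM object, e.g. from
    win32com.client.Dispatch("Word.Application") (pywin32, Windows with
    Word installed). `read_doc_text` is a hypothetical helper name.
    """
    # Documents.Open returns the Document it has just opened, so the
    # shared ActiveDocument property is not needed.
    doc = word.Documents.Open(path)  # full path needed
    try:
        return doc.Content.Text
    finally:
        # Close this document only; 0 is wdDoNotSaveChanges.
        doc.Close(0)
```

This only removes the dependence on ActiveDocument and on closing every
open document. Whether a single Word instance should be shared between
concurrent requests at all is a separate question.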

Plus, you have to make sure that the Word COM objects are properly
released if you don't want to see Word processes lingering in
memory.
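
A rough sketch of that cleanup, under stated assumptions: the context
manager word_application and its "dispatch" parameter are hypothetical
names, and the default branch relies on the pywin32 modules (pythoncom,
win32com.client), so it only works on Windows with Word installed.

```python
from contextlib import contextmanager


@contextmanager
def word_application(dispatch=None):
    """Yield a Word.Application object and always call Quit on it.

    `word_application` and `dispatch` are hypothetical names. When
    `dispatch` is None, pywin32 is used to create the COM object.
    """
    pythoncom = None
    if dispatch is None:
        import pythoncom
        import win32com.client
        dispatch = win32com.client.Dispatch
        # COM must be initialised in each thread that uses it.
        pythoncom.CoInitialize()
    try:
        word = dispatch("Word.Application")
        try:
            yield word
        finally:
            # Ask Word to exit, even if the caller's code raised.
            word.Quit()
    finally:
        if pythoncom is not None:
            pythoncom.CoUninitialize()
```

It would be used as "with word_application() as word:", with all work on
documents done inside the block. One caveat: Dispatch may hand back a
Word instance that is already running, in which case Quit would close
that instance too.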

Regards,
Nicolas

2006/2/24, Nicolas Lehuen <nicolas at lehuen.com>:
> Hi,
>
> This has *nothing* to do whatsoever with mod_python. Nothing. Please
> ask this on comp.lang.python, where you should receive half a dozen
> responses in quite a short time.
>
> A pretty simple way to access comp.lang.python if you don't know how
> newsgroups work is to use Google Groups:
>
> http://groups.google.com/group/comp.lang.python
>
> After writing that, I shouldn't help you, as it will only make you
> keep asking irrelevant questions here, but hey, I'm only human, so
> I'll give you a few pointers.
>
>  .doc files are opaque, binary files, so you cannot read them as
> easily as plain text files. You have to use the Word API, which is
> exposed as COM objects, with the pywin32 modules.
>
> Now, behold the power of Google :
>
> http://www.google.com/search?q=word+COM+python
>
> A few clicks here and there take me to this page :
>
> http://wikipython.flibuste.net/moin.py/CodeWindows
>
> It's in French, but I guess I could have found this elsewhere:
>
> import win32com.client
>
> word = win32com.client.Dispatch("Word.Application")
> word.Documents.Open("D:/partage/python/wordtxt/test.doc") # full path needed
> text = word.ActiveDocument.content.text
> word.Documents.Close()
>
> text = text.encode('Latin-1')
>
> print text
>
> Regards,
> Nicolas
>
> P.S. since you've been asking quite a few questions on this list,
> having something like part of your real name would be nice - I cannot
> be brought to start a mail by "Hello, Python Eager !", which looks
> plain stupid.
>
> 2006/2/24, python eager <python_eager at yahoo.com>:
> > Hi,
> >   This is my code snippet. This code will not read the characters in
> > an MS Word document. I got the output below. If I use any *txt*
> > format file, it is displayed properly, but I am using doc format
> > instead, and I get an error while reading. Please help me solve
> > this problem.
> >
> >           req = self.request()
> >           inp = file('C:\\Python.doc','r')
> >           for line in inp:
> >              self.writeln(line)
> >           inp.close()
> >
> > *Output :*
> > *    ÐÏࡱ*
> >
> > regards!
> > Python eager


