On Thu, 2008-07-24 at 14:04 +0200, Fredrik Lundh wrote:
> Eric Chao wrote:
>
> > I've been trying to convert some text that has some odd coding to xml. I
> > am trying to use python to create a program that will process this text:
> >
> > <BN>GENESIS</BN>
> > <CN>CHAPTER 1</CN>
> > <SH>The Creation</SH>
> > <C>{{01:1}}1 <RA>In the beginning <RB>God <RC>created the heavens and
> > the earth.
> > <V>{{01:1}}2 The earth was <$FOr {a waste and
> > emptiness}>><N1><RA>formless and void, and <RB>darkness was over the
> > <V>{{01:1}}3 Then <RA>God said, ``Let there be light"; and there was light.
> >
> > to something like this:
> >
> > <book osisID="Gen">
> > <chapter sID="Gen.1"/>
> > <p><verse sID="Gen.1.1"/>In the beginning God created the heaven and the
> > earth.<verse eID="Gen.1.1"/></p>
> > <p><verse sID="Gen.1.2"/>And the earth was without form, and void; and
> > darkness was upon the face of the deep. And the Spirit of God moved upon
> > the face of the waters.<verse eID="Gen.1.2"/></p>
> > <p><verse sID="Gen.1.3"/>And God said, Let there be light: and there was
> > light.<verse eID="Gen.1.3"/></p>
> >
> > I am not very good with Python and I was hoping someone could offer some
> > advice on how to get started. I tried to write a program that produces
> > XML, but I think I need more of a find and replace type program. Thanks !
>
> that looks a rather daunting task even for an experienced Python
> programmer (especially mapping between different translations ;-).
>
> I'd concentrate on parsing the original file format first, before even
> thinking about how to write it out in XML.
>
> it might be some kind of SGML, in which case the standard sgmllib
> library might be helpful:
>
> http://effbot.org/librarybook/sgmllib.htm
>
> if that seems to work, try building some suitable data structure from
> the incoming data (lists of strings might work, but you might want to
> create some simple container objects that holds the lists for you).
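As a rough sketch of the parse-first approach suggested above, here is one
way to pull the verses into simple container objects before worrying about
the XML. It uses plain regular expressions rather than sgmllib or pyparsing.
Everything about the format is guessed from the sample alone: that <C> and
<V> each start a verse, that {{01:1}} encodes book and chapter numbers, that
the digits right after it are the verse number, that <$F ... >> wraps a
footnote, and that the remaining short tags (<RA>, <N1>, ...) can simply be
dropped. The names Verse, parse_verses, verse_to_osis and BOOK_IDS are made
up for illustration.

```python
import re
from collections import namedtuple
from xml.sax.saxutils import escape

# Simple container for one parsed verse (hypothetical name).
Verse = namedtuple("Verse", "book chapter verse text")

# Assumption: <C> or <V> starts a verse, {{BB:C}} holds book and chapter,
# and the digits that follow are the verse number. A verse body runs until
# the next verse marker, a <BN>/<CN>/<SH> heading, or the end of the input.
VERSE_RE = re.compile(
    r"<[CV]>\{\{(\d+):(\d+)\}\}(\d+)\s*(.*?)(?=<(?:[CV]|BN|CN|SH)>|\Z)",
    re.S,
)
FOOTNOTE_RE = re.compile(r"<\$F.*?>>", re.S)  # assumed footnote syntax
TAG_RE = re.compile(r"</?[A-Z][A-Z0-9]*>")    # leftover markers: <RA>, <N1>, ...

BOOK_IDS = {1: "Gen"}  # hypothetical mapping; extend for other books


def parse_verses(source):
    """Return a list of Verse objects found in the source text."""
    verses = []
    for book, chapter, verse, body in VERSE_RE.findall(source):
        body = FOOTNOTE_RE.sub("", body)  # drop footnotes entirely
        body = TAG_RE.sub("", body)       # drop inline reference markers
        text = " ".join(body.split())     # collapse line breaks and spaces
        verses.append(Verse(int(book), int(chapter), int(verse), text))
    return verses


def verse_to_osis(v):
    """Format one Verse as an OSIS-style paragraph with milestone tags."""
    osis_id = "%s.%d.%d" % (BOOK_IDS[v.book], v.chapter, v.verse)
    return '<p><verse sID="%s"/>%s<verse eID="%s"/></p>' % (
        osis_id, escape(v.text), osis_id)


SAMPLE = """<BN>GENESIS</BN>
<CN>CHAPTER 1</CN>
<SH>The Creation</SH>
<C>{{01:1}}1 <RA>In the beginning <RB>God <RC>created the heavens and
the earth.
<V>{{01:1}}2 The earth was <$FOr {a waste and
emptiness}>><N1><RA>formless and void, and <RB>darkness was over the
"""

if __name__ == "__main__":
    for v in parse_verses(SAMPLE):
        print(verse_to_osis(v))
```

Keeping the parsing and the output in separate functions means the guesses
about the input format can be corrected without touching the XML side. The
footnotes are thrown away here; a real conversion would probably want to
keep them.
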
If it turns out not to be valid SGML, you may need to look into using
pyparsing. There was a good introduction to it in a recent issue of
Python Magazine, and there are also a number of online tutorials.

-- 
Oook!
J. Cliff Dyer
Carolina Digital Library and Archives
UNC Chapel Hill