Well, maybe this is what I should have done to start with to avoid the
name collision problem:

from xml.etree.ElementTree import ElementTree as ET

f = open('test.txt', 'w')
et = ET(file='out_g4r_b.xml')
for statentityref in \
        et.findall('admin/responseanalyses/analysis/analysisdata/statentityref'):
    for ss in statentityref.findall('statentityref'):
        for statval in ss.findall('statval'):
            print >> f, statentityref.attrib['id'], ss.attrib['id'], '\t', \
                statval.attrib['type'], '\t', statval.attrib['value']
f.close()

This works and formats output as desired. Just checking to see if this
is the way others would tackle this.
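
For comparison, here is a rough sketch of the same two-level walk written for Python 3, where the `print >> f` form is no longer available. Only the element names, attribute names, and the path come from the code above; the root tag `<export>`, the sample values' arrangement, and the helper names `flatten_stats` and `write_rows` are assumptions made up for illustration, so treat this as one possible way to structure it rather than the recommended one.

```python
import io
import sys
import xml.etree.ElementTree as ET

# Hypothetical sample document: the root tag and exact nesting are assumed,
# only the element/attribute names and the path are taken from the thread.
SAMPLE_XML = """\
<export>
  <admin><responseanalyses><analysis><analysisdata>
    <statentityref id="13963">
      <statentityref id="0.000000">
        <statval type="UncollapsedMeanScore" value="23.863636"/>
        <statval type="ScorePtPct" value="0.018333"/>
      </statentityref>
      <statentityref id="1.000000">
        <statval type="UncollapsedMeanScore" value="34.941426"/>
      </statentityref>
    </statentityref>
  </analysisdata></analysis></responseanalyses></admin>
</export>
"""

ITEM_PATH = 'admin/responseanalyses/analysis/analysisdata/statentityref'


def flatten_stats(source):
    """Yield (item id, score point id, stat type, stat value) tuples."""
    tree = ET.parse(source)
    for item in tree.findall(ITEM_PATH):
        # Relative path: only the statentityref children of this item.
        # A distinct loop variable keeps the outer id available.
        for score_point in item.findall('statentityref'):
            for statval in score_point.findall('statval'):
                yield (item.get('id'), score_point.get('id'),
                       statval.get('type'), statval.get('value'))


def write_rows(rows, out):
    """Write one tab-separated line per row to the file-like object out."""
    for row in rows:
        out.write('\t'.join(row) + '\n')


if __name__ == '__main__':
    write_rows(flatten_stats(io.StringIO(SAMPLE_XML)), sys.stdout)
```

To run it against the real data, `flatten_stats('out_g4r_b.xml')` should work as well, since `ET.parse` accepts a filename, and `write_rows` can be handed a file opened with `open('test.txt', 'w')`.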
> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Doran, Harold
> Sent: Tuesday, April 08, 2008 9:48 AM
> To: Stefan Behnel
> Cc: [email protected]; J. Cliff Dyer
> Subject: Re: [XML-SIG] Learning to use elementtree
>
> Thanks. I'm piecing this together slowly, but I did get the
> following to work.
>
> Test.py
>
> from xml.etree.ElementTree import ElementTree as ET
>
> f = open('test.txt', 'w')
> et = ET(file='out_g4r_b.xml')
> for statentityref in \
>         et.findall('admin/responseanalyses/analysis/analysisdata/statentityref'):
>     print >> f, statentityref.attrib['id']
>     for statentityref in statentityref.findall('statentityref'):
>         for statval in statentityref.findall('statval'):
>             print >> f, statentityref.attrib['id'], '\t', \
>                 statval.attrib['type'], '\t', statval.attrib['value']
> f.close()
>
> And this gives output like:
>
> 13963
> 0.000000 UncollapsedMeanScore 23.863636
> 0.000000 ScorePtPct 0.018333
> 0.000000 ScorePtBiserial -0.496309
> 0.000000 ScorePtAdjBiserial -0.452588
> 1.000000 UncollapsedMeanScore 34.941426
> 1.000000 ScorePtPct 0.981667
> 1.000000 ScorePtBiserial 0.496309
> 1.000000 ScorePtAdjBiserial 0.452588
> omit ScorePtPct 0.000000
> omit ScorePtBiserial -99999.990000
> omit ScorePtAdjBiserial -99999.990000
> 13962
> 0.000000 UncollapsedMeanScore 29.305195
> 0.000000 ScorePtPct 0.256667
> 0.000000 ScorePtBiserial -0.484469
> 0.000000 ScorePtAdjBiserial -0.425165
> 1.000000 UncollapsedMeanScore 36.614350
> 1.000000 ScorePtPct 0.743333
> 1.000000 ScorePtBiserial 0.484469
> 1.000000 ScorePtAdjBiserial 0.425165
> omit ScorePtPct 0.000000
> omit ScorePtBiserial -99999.990000
> omit ScorePtAdjBiserial -99999.990000
>
> ...
>
> This is almost exactly what I want, and I can live with this if needed.
> What would be most convenient, however, is to format the output as
> follows:
>
> 13963 0.000000 UncollapsedMeanScore 23.863636
> 13963 0.000000 ScorePtPct 0.018333
> 13963 0.000000 ScorePtBiserial -0.496309
> 13963 0.000000 ScorePtAdjBiserial -0.452588
> 13963 1.000000 UncollapsedMeanScore 34.941426
> 13963 1.000000 ScorePtPct 0.981667
> 13963 1.000000 ScorePtBiserial 0.496309
> 13963 1.000000 ScorePtAdjBiserial 0.452588
>
> I think this may be what Cliff meant by name collision. That
> is, the number 13963 comes from the id attribute of the outer
> statentityref. But 0.000 and 1.0 also come from the id
> attribute, this time of the statentityref nested inside it. So, I'm a
> bit confused as to how to go about printing them out side by side.
>
>
> > -----Original Message-----
> > From: Stefan Behnel [mailto:[EMAIL PROTECTED]
> > Sent: Monday, April 07, 2008 8:32 AM
> > To: Doran, Harold
> > Cc: J. Cliff Dyer; [email protected]
> > Subject: Re: [XML-SIG] Learning to use elementtree
> >
> > Hi,
> >
> > Doran, Harold wrote:
> > > Well, I think I'm getting close. But, I think this is similar to the
> > > problem I had when I started. This seems to create a huge data file
> > > with all information under the first item, and then again all
> > > information under the second item and so forth.
> > >
> > > for statentityref in \
> > >         et.findall('admin/responseanalyses/analysis/analysisdata/statentityref'):
> > >     print >> f, statentityref.attrib['id']
> > >     for statentityref in \
> > >             et.findall('admin/responseanalyses/analysis/analysisdata/statentityref/statentityref'):
> > >         for statval in statentityref.findall('statval'):
> > >             print >> f, statentityref.attrib['id'], '\t', \
> > >                 statval.attrib['type'], '\t', statval.attrib['value']
> >
> > I think you should read the previous post again. You are nesting three
> > loops here where two would do what you want.
> >
> > Stefan
> >
> >
> > >> -----Original Message-----
> > >> From: J. Cliff Dyer [mailto:[EMAIL PROTECTED]
> > >> Sent: Wednesday, April 02, 2008 3:36 PM
> > >> To: Doran, Harold
> > >> Cc: [email protected]
> > >> Subject: Re: [XML-SIG] Learning to use elementtree
> > >>
> > >> On Wed, 2008-04-02 at 15:28 -0400, Doran, Harold wrote:
> > >>> Indeed, navigating the xml is tough (for me). I have been able to get
> > >>> the following to work. I put in "Sub Element" to indicate the new
> > >>> section of data. But, from looking at the text output, one doesn't
> > >>> know which item these sub elements belong to. I think the solution is
> > >>> to create an index like 13965-0 to show that this is the
> > >>> subinformation from the item above it. That seems to be where I am
> > >>> getting stuck. Although, I am open to other suggestions on how to best
> > >>> represent the output.
> > >>>
> > >>> from xml.etree.ElementTree import ElementTree as ET
> > >>>
> > >>> filename = raw_input("Please enter the AM XML file: ")
> > >>> new_file = raw_input("Save this file as: ")
> > >>>
> > >>> # create a new file defined by the user
> > >>> f = open(new_file, 'w')
> > >>>
> > >>> et = ET(file=filename)
> > >>>
> > >>> for statentityref in \
> > >>>         et.findall('admin/responseanalyses/analysis/analysisdata/statentityref'):
> > >>>     for statval in statentityref.findall('statval'):
> > >>>         print >> f, statentityref.attrib['id'], '\t', \
> > >>>             statval.attrib['type'], '\t', statval.attrib['value']
> > >>>
> > >>> f.write("\n\n")
> > >>> f.write("Sub Element\n\n")
> > >>>
> > >>> for statentityref in \
> > >>>         et.findall('admin/responseanalyses/analysis/analysisdata/statentityref/statentityref'):
> > >>>     for statval in statentityref.findall('statval'):
> > >>>         print >> f, statentityref.attrib['id'], '\t', \
> > >>>             statval.attrib['type'], '\t', statval.attrib['value']
> > >>> f.close()
> > >>
> > >> Do you want your second statentityref loop to be based on its parent
> > >> statentityref? If so, you need to nest it in the original loop, and
> > >> use an xpath relative to your outer statentityref (and watch for name
> > >> collisions).
_______________________________________________
XML-SIG maillist - [email protected]
http://mail.python.org/mailman/listinfo/xml-sig