Re: [XML-SIG] Learning to use elementtree

Doran, Harold Tue, 08 Apr 2008 06:48:39 -0700

Thanks. I'm piecing this together slowly, but I did get the following to
work.


Test.py
from xml.etree.ElementTree import ElementTree as ET
f = open('test.txt', 'w')
et = ET(file='out_g4r_b.xml')
for statentityref in \
      et.findall('admin/responseanalyses/analysis/analysisdata/statentityref'):
   print >> f, statentityref.attrib['id']
   for statentityref in statentityref.findall('statentityref'):
      for statval in statentityref.findall('statval'):
         print >> f, statentityref.attrib['id'], '\t', \
            statval.attrib['type'], '\t', statval.attrib['value']
f.close()

And this gives output like:

13963
0.000000        UncollapsedMeanScore    23.863636
0.000000        ScorePtPct      0.018333
0.000000        ScorePtBiserial         -0.496309
0.000000        ScorePtAdjBiserial      -0.452588
1.000000        UncollapsedMeanScore    34.941426
1.000000        ScorePtPct      0.981667
1.000000        ScorePtBiserial         0.496309
1.000000        ScorePtAdjBiserial      0.452588
omit    ScorePtPct      0.000000
omit    ScorePtBiserial         -99999.990000
omit    ScorePtAdjBiserial      -99999.990000
13962
0.000000        UncollapsedMeanScore    29.305195
0.000000        ScorePtPct      0.256667
0.000000        ScorePtBiserial         -0.484469
0.000000        ScorePtAdjBiserial      -0.425165
1.000000        UncollapsedMeanScore    36.614350
1.000000        ScorePtPct      0.743333
1.000000        ScorePtBiserial         0.484469
1.000000        ScorePtAdjBiserial      0.425165
omit    ScorePtPct      0.000000
omit    ScorePtBiserial         -99999.990000
omit    ScorePtAdjBiserial      -99999.990000

...

This is almost exactly what I want, and I can live with it if needed.
What would be most convenient, however, is to format the output as
follows:

13963   0.000000        UncollapsedMeanScore    23.863636
13963   0.000000        ScorePtPct      0.018333
13963   0.000000        ScorePtBiserial         -0.496309
13963   0.000000        ScorePtAdjBiserial      -0.452588
13963   1.000000        UncollapsedMeanScore    34.941426
13963   1.000000        ScorePtPct      0.981667
13963   1.000000        ScorePtBiserial         0.496309
13963   1.000000        ScorePtAdjBiserial      0.452588

I think this may be what Cliff meant by name collisions. That is, the
number 13963 comes from the id attribute of the outer statentityref, but
0.000000 and 1.000000 also come from the id attribute, this time of the
statentityref nested inside it. Both loops use the same variable name, so
I'm a bit confused as to how to go about printing the two ids side by
side.
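
One way to get the two ids side by side is to give the outer and inner
loop variables different names, so the outer id is still available when
the inner rows are written. The following is only a sketch in Python 3
syntax: the sample XML, its <report> root element, and the names item,
option, ITEM_PATH and iter_rows are made up for illustration and are not
taken from the real file.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the real file's root element and layout are assumed.
SAMPLE_XML = """
<report>
  <admin><responseanalyses><analysis><analysisdata>
    <statentityref id="13963">
      <statentityref id="0.000000">
        <statval type="ScorePtPct" value="0.018333"/>
        <statval type="ScorePtBiserial" value="-0.496309"/>
      </statentityref>
      <statentityref id="1.000000">
        <statval type="ScorePtPct" value="0.981667"/>
      </statentityref>
    </statentityref>
  </analysisdata></analysis></responseanalyses></admin>
</report>
"""

ITEM_PATH = 'admin/responseanalyses/analysis/analysisdata/statentityref'


def iter_rows(root):
    """Yield (outer id, inner id, type, value) for every nested statval."""
    for item in root.findall(ITEM_PATH):
        # 'item' keeps the outer element, so its id is not overwritten below.
        for option in item.findall('statentityref'):
            for statval in option.findall('statval'):
                yield (item.attrib['id'], option.attrib['id'],
                       statval.attrib['type'], statval.attrib['value'])


root = ET.fromstring(SAMPLE_XML)
lines = ['\t'.join(row) for row in iter_rows(root)]
print('\n'.join(lines))
```

With the real file, the same loops could write each tab-joined line to
the output file instead of printing it.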


> -----Original Message-----
> From: Stefan Behnel [mailto:[EMAIL PROTECTED] 
> Sent: Monday, April 07, 2008 8:32 AM
> To: Doran, Harold
> Cc: J. Cliff Dyer; [email protected]
> Subject: Re: [XML-SIG] Learning to use elementtree
> 
> Hi,
> 
> Doran, Harold wrote:
> > Well, I think I'm getting close. But, I think this is similar to the
> > problem I had when I started. This seems to create a huge data file
> > with all information under the first item, and then again all
> > information under the second item and so forth.
> > 
> > for statentityref in \
> >       et.findall('admin/responseanalyses/analysis/analysisdata/statentityref'):
> >    print >> f, statentityref.attrib['id']
> >    for statentityref in \
> >          et.findall('admin/responseanalyses/analysis/analysisdata/statentityref/statentityref'):
> >       for statval in statentityref.findall('statval'):
> >          print >> f, statentityref.attrib['id'], '\t', \
> >             statval.attrib['type'], '\t', statval.attrib['value']
> 
> I think you should read the previous post again. You are 
> nesting three loops here where two would do what you want.
> 
> Stefan
> 
> 
> >> -----Original Message-----
> >> From: J. Cliff Dyer [mailto:[EMAIL PROTECTED]
> >> Sent: Wednesday, April 02, 2008 3:36 PM
> >> To: Doran, Harold
> >> Cc: [email protected]
> >> Subject: Re: [XML-SIG] Learning to use elementtree
> >>
> >> On Wed, 2008-04-02 at 15:28 -0400, Doran, Harold wrote:
> >>> Indeed, navigating the xml is tough (for me). I have been able to get
> >>> the following to work. I put in "Sub Element" to indicate the new
> >>> section of data. But, from looking at the text output, one doesn't
> >>> know which item these sub elements belong to. I think the solution is
> >>> to create an index like 13965-0 to show that this is the
> >>> subinformation from the item above it. That seems to be where I am
> >>> getting stuck. Although, I am open to other suggestions on how to best
> >>> represent the output.
> >>>
> >>> from xml.etree.ElementTree import ElementTree as ET
> >>>
> >>> filename = raw_input("Please enter the AM XML file: ")
> >>> new_file = raw_input("Save this file as: ")
> >>>
> >>> # create a new file defined by the user
> >>> f = open(new_file, 'w')
> >>>
> >>> et = ET(file=filename)
> >>>
> >>> for statentityref in \
> >>>         et.findall('admin/responseanalyses/analysis/analysisdata/statentityref'):
> >>>     for statval in statentityref.findall('statval'):
> >>>       print >> f, statentityref.attrib['id'], '\t', \
> >>>           statval.attrib['type'], '\t', statval.attrib['value']
> >>>
> >>> f.write("\n\n")
> >>> f.write("Sub Element\n\n")
> >>>
> >>> for statentityref in \
> >>>         et.findall('admin/responseanalyses/analysis/analysisdata/statentityref/statentityref'):
> >>>     for statval in statentityref.findall('statval'):
> >>>       print >> f, statentityref.attrib['id'], '\t', \
> >>>           statval.attrib['type'], '\t', statval.attrib['value']
> >>> f.close()
> >> Do you want your second statentityref loop to be based on its parent
> >> statentityref?  If so, you need to nest it in the original loop, and
> >> use an xpath relative to your outer statentityref (and watch for name
> >> collisions).
> 
> 
_______________________________________________
XML-SIG maillist  -  [email protected]
http://mail.python.org/mailman/listinfo/xml-sig
