
XML Parsing Speed - ruby libxml & REXML

subimage
Posted: Wed May 31, 2006 9:20 am
Guest
WHOAH!

Ok so I finally dug into the stream parser and this is lightning fast!

Thanks everyone for the advice...this is really sweet.

PS: I learned a lot from the tutorial available here:

http://www.rubyxml.com/articles/REXML/Stream_Parsing_with_REXML

I wrote a BasicStreamListener that throws each item into a hash,
complete with pseudo-XPaths as keys...

Let me know if anyone would be interested in it, or in a tutorial. I
might write something up for my blog as well, since there doesn't seem
to be much information out there on the subject.
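
For anyone curious what such a listener might look like, here is a minimal sketch. It is not the BasicStreamListener mentioned above (that code was never posted); the class name PathHashListener, the sample XML, and the choice of slash-joined element names as keys are all assumptions for illustration. It only uses REXML::StreamListener and REXML::Document.parse_stream from the standard REXML library.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical listener (not the original poster's code): collects element
# text into a hash keyed by a pseudo-XPath such as "/catalog/book/title".
class PathHashListener
  include REXML::StreamListener

  attr_reader :items

  def initialize
    @stack = []                                   # names of the currently open elements
    @items = Hash.new { |hash, key| hash[key] = [] }
  end

  # Called for every opening tag; attributes are ignored in this sketch.
  def tag_start(name, _attrs)
    @stack.push(name)
  end

  # Called for every closing tag.
  def tag_end(_name)
    @stack.pop
  end

  # Called for character data between tags.
  def text(data)
    value = data.strip
    return if value.empty?                        # skip whitespace between tags
    @items['/' + @stack.join('/')] << value
  end
end

# Made-up sample document.
xml = <<~XML
  <catalog>
    <book><title>Programming Ruby</title><price>44.95</price></book>
    <book><title>The Ruby Way</title><price>39.99</price></book>
  </catalog>
XML

listener = PathHashListener.new
REXML::Document.parse_stream(xml, listener)
titles = listener.items['/catalog/book/title']
```

Every value lands in an array under its path, so repeated elements are kept in document order. A real version would probably also want to handle attributes and text that arrives in several chunks.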
 
Adam Sanderson
Posted: Mon Jun 05, 2006 8:44 pm
Guest
Yeah, I ran into similar problems earlier using XPath. Streaming the
XML and plucking out what you need is a little more complicated, but it
is more efficient on three counts:

1) It will probably consume less memory, since you will likely only
store a small subset of the data.
2) You either don't need to build a full DOM tree, or you can build a
very lightweight one.
3) Parsing and executing XPath expressions takes some time; if you're
doing a ton of them, it might have a noticeable effect.

It might be best for people to test it out using XPath and such first.
If that works, keep it; if not, you can always fall back on building a
stream parser. Wish I had seen your post earlier ;)
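
To make the two options concrete, here is a rough sketch that pulls the same values out of a small document both ways: once through a full REXML DOM with XPath, and once with a stream listener that keeps only what it needs. The sample XML and the TotalListener class are made up for illustration, and nothing here measures speed.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Made-up sample document.
xml = <<~XML
  <orders>
    <order id="1"><customer>Ann</customer><total>10.50</total></order>
    <order id="2"><customer>Bob</customer><total>7.25</total></order>
  </orders>
XML

# Approach 1: build the full DOM tree, then query it with XPath.
doc = REXML::Document.new(xml)
dom_totals = REXML::XPath.match(doc, '/orders/order/total').map { |el| el.text.to_f }

# Approach 2: stream the document and keep only the <total> values.
# Hypothetical listener; no tree is built and everything else is discarded.
class TotalListener
  include REXML::StreamListener

  attr_reader :totals

  def initialize
    @totals = []
    @in_total = false          # true while inside a <total> element
  end

  def tag_start(name, _attrs)
    @in_total = true if name == 'total'
  end

  def tag_end(name)
    @in_total = false if name == 'total'
  end

  def text(data)
    @totals << data.to_f if @in_total
  end
end

listener = TotalListener.new
REXML::Document.parse_stream(xml, listener)
stream_totals = listener.totals
```

Both approaches should give the same list of totals. To compare them on a real file, wrapping each approach in Benchmark.realtime from Ruby's benchmark library is a simple way to see whether the extra code in the listener is worth it.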

.adam

 
Mathieu Blondel
Posted: Wed Jun 07, 2006 7:21 pm
Guest
For large files, stream parsers are faster and have a smaller memory
footprint.

A few months ago, I also tested the expat bindings for Ruby, which
turned out to be up to 20 times faster than the stream parser provided
by REXML.

 
subimage
Posted: Thu Jun 08, 2006 2:33 am
Guest
Got a URL or more info on these bindings? Test code? How to get it
running?

 
Mathieu Blondel
Posted: Thu Jun 08, 2006 3:02 pm
Guest
http://www.yoshidam.net/Ruby.html#xmlparser

You will most certainly have to compile the bindings yourself.

Look at samples/xmlevent.rb, which ships with the source code. It
shows how to use the event-based (à la SAX) xmlparser.

HTH
Mathieu

 
 
All times are GMT