
Thread: Java XML parsing for HUGE size file and no root tags

  1. #1
    Junior Member
    Join Date
    Jun 2013
    Posts
    3
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Java XML parsing for HUGE size file and no root tags

    Hi All,

    I have to write code to parse XML files with the following characteristics:

    1) The files will become huge in the future, up to a maximum of about 8 GB, so the code has to be able to cope with that.
    2) The XML is not well-formed: there is no single root tag. Instead, multiple XML documents are concatenated in a single file. So we may need to add a root tag and delete the repeated <?xml version ...?> and <!DOCTYPE ...> declarations, or split the file into multiple XML files before parsing.
    3) The client has not provided an XSD; only a DTD is given.

    Does anyone have anything to share for this kind of problem? I am considering several approaches, such as a shell script, SAX, or StAX.

    I just came to this forum to see whether someone has faced a similar problem in the past, so that I can get lucky.


  2. #2
    Administrator copeg
    Join Date
    Oct 2009
    Location
    US
    Posts
    5,237
    Thanks
    176
    Thanked 817 Times in 760 Posts
    Blog Entries
    5

    Re: Java XML parsing for HUGE size file and no root tags

    This thread has been cross posted here:

    http://www.java-forums.org/xml/79540-java-xml-parsing-huge-size-file-no-root-tags.html

    Although cross posting is allowed, for everyone's benefit, please read:

    Java Programming Forums Cross Posting Rules

    The Problems With Cross Posting


  3. #3
    Forum VIP
    Join Date
    Jul 2010
    Posts
    1,609
    Thanks
    25
    Thanked 316 Times in 295 Posts

    Re: Java XML parsing for HUGE size file and no root tags

    Out of curiosity: why use XML if the files ignore some of the basic rules of the standard (like a single root node) and grow to enormous sizes?
    I'm asking because I suspect you may be attempting to do something in XML that would be far better suited to some other format.

    NOTE TO NEW PEOPLE LOOKING FOR HELP ON FORUM:

    When asking for help, please follow these guidelines to receive better and more prompt help:
    1. Put your code in Java Tags. To do this, put [highlight=java] before your code and [/highlight] after your code.
    2. Give full details of errors and provide us with as much information about the situation as possible.
    3. Give us an example of what the output should look like when done correctly.

    Join the Airline Management Simulation Game to manage your own airline against other users in a virtual recreation of the United States Airline Industry. For more details, visit: http://airlinegame.orgfree.com/

  4. #4
    Junior Member
    Join Date
    Jun 2013
    Posts
    3
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Java XML parsing for HUGE size file and no root tags

    Originally posted by aussiemcgr:
    Out of curiosity: why use XML if the files ignore some of the basic rules of the standard (like a single root node) and grow to enormous sizes?
    I'm asking because I suspect you may be attempting to do something in XML that would be far better suited to some other format.

    Hi aussiemcgr,

    We receive this XML from another application that we are going to integrate with, so this is the requirement and we have to use the XML as it is.

    We have to parse this XML and finally, after validation, insert the data from it into our database.

    Please let me know if you have any thoughts.

    thanks...
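
Since only a DTD is available and the data has to be validated before it goes into the database, a validating SAX parse is one option. The following is a minimal sketch, assuming each document (after splitting or repair) carries its own <!DOCTYPE> declaration that points at, or embeds, the client's DTD. The element name "record" and the class name are hypothetical and used only for illustration.

```java
import java.io.IOException;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class DtdValidatingParser {

    /** Counts hypothetical <record> elements and turns DTD violations into exceptions. */
    static class StrictHandler extends DefaultHandler {
        int records = 0;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("record".equals(qName)) {
                records++; // a real handler would collect the data for the database here
            }
        }

        @Override
        public void error(SAXParseException e) throws SAXException {
            // Validation errors are only reported, not thrown, by default; make them fatal.
            throw e;
        }
    }

    /** Parses one document with DTD validation switched on and returns the record count. */
    static int countRecords(InputSource source)
            throws ParserConfigurationException, SAXException, IOException {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true); // validate against the DTD named in the DOCTYPE
        SAXParser parser = factory.newSAXParser();
        StrictHandler handler = new StrictHandler();
        parser.parse(source, handler);
        return handler.records;
    }
}
```

Because SAX delivers events as it reads, memory use stays roughly constant regardless of file size. If the DTD is referenced by a relative system identifier, the InputSource needs a system ID so the parser can resolve it.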

  5. #5
    Forum VIP
    Join Date
    Jul 2010
    Posts
    1,609
    Thanks
    25
    Thanked 316 Times in 295 Posts

    Re: Java XML parsing for HUGE size file and no root tags

    So is the file itself an XML document, or is it a text file that contains a handful of XML structures? I ask because, if it is the latter, you will probably not be able to read it directly with any of the standard XML readers.

    Also, if it is the latter, my immediate suggestion (which may not be the "best" way of doing it) would be to create a handful of temp files and put each XML structure in its own temp file, which you can then read independently as XML. I suggest this because the large file that you get from the other program will have to be read through a buffered stream. As far as I know, there is no other way, since you cannot hold 8 GB in memory (unless you have a really powerful computer and you somehow raise the amount of memory Java is allowed to use). So, if you break up each structure into its own temp XML file, you may be able to load each file into memory independently while you parse it; or, at the very least, you will be able to use one of the Java XML parsers. Thoughts?
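
The temp-file idea could be sketched roughly as follows. This is only an illustration under stated assumptions: every embedded document starts with an <?xml ...?> declaration on a line of its own, and the input is UTF-8 text. The class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class XmlSplitter {

    /**
     * Streams the input line by line and starts a new temp file whenever a line
     * begins with an XML declaration. Only one line is held in memory at a time.
     */
    static List<Path> split(BufferedReader in, Path outDir) throws IOException {
        List<Path> parts = new ArrayList<>();
        BufferedWriter out = null;
        try {
            String line;
            while ((line = in.readLine()) != null) {
                boolean startsDocument = line.startsWith("<?xml ");
                boolean firstContent = out == null && !line.trim().isEmpty();
                if (startsDocument || firstContent) {
                    if (out != null) {
                        out.close(); // finish the previous document
                    }
                    Path part = Files.createTempFile(outDir, "part-", ".xml");
                    parts.add(part);
                    out = Files.newBufferedWriter(part, StandardCharsets.UTF_8);
                }
                if (out != null) {
                    out.write(line);
                    out.newLine();
                }
            }
        } finally {
            if (out != null) {
                out.close();
            }
        }
        return parts;
    }
}
```

Each returned path can then be handed to a regular XML parser. If the declarations name an encoding other than UTF-8, the reader and writer charsets would have to match it, otherwise the declaration and the bytes on disk disagree.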

  6. #6
    Junior Member
    Join Date
    Jun 2013
    Posts
    3
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Java XML parsing for HUGE size file and no root tags

    Hi,

    It is an XML file only. We are thinking of two approaches:

    1) Manipulate the XML by adding a root tag and deleting some of the repeated content, then parse it using SAX or StAX, as they don't load everything into memory.
    2) As you said, break it up, put the pieces in a temp folder, and then process them one by one. (This looks like the right choice if option 1 doesn't perform well.)

    thanks
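
Option 1 can be done without rewriting the file on disk. Below is a minimal sketch, assuming each repeated <?xml ...?> and <!DOCTYPE ...> declaration fits on a single line of its own. The wrapper element name "wrapper", the element name "record", and the class names are hypothetical. Note that dropping the DOCTYPE lines also drops DTD validation and any entities the DTD defines.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class RootWrappingDemo {

    /** Emits "<wrapper>", then the source minus declaration lines, then "</wrapper>". */
    static class RootWrappingReader extends Reader {
        private final BufferedReader source;
        private String pending = "<wrapper>\n"; // text waiting to be handed out
        private int pos = 0;
        private boolean rootClosed = false;

        RootWrappingReader(Reader source) {
            this.source = new BufferedReader(source);
        }

        @Override
        public int read(char[] buf, int off, int len) throws IOException {
            if (len == 0) return 0;
            while (pending != null && pos >= pending.length()) {
                pending = nextChunk();
                pos = 0;
            }
            if (pending == null) return -1; // end of the wrapped stream
            int n = Math.min(len, pending.length() - pos);
            pending.getChars(pos, pos + n, buf, off);
            pos += n;
            return n;
        }

        /** Returns the next kept line, then the closing tag once, then null. */
        private String nextChunk() throws IOException {
            String line;
            while ((line = source.readLine()) != null) {
                String t = line.trim();
                if (t.startsWith("<?xml ") || t.startsWith("<!DOCTYPE")) continue; // skip prolog
                return line + "\n";
            }
            if (!rootClosed) {
                rootClosed = true;
                return "</wrapper>";
            }
            return null;
        }

        @Override
        public void close() throws IOException {
            source.close();
        }
    }

    /** Streams through the wrapped input with StAX and counts elements with the given name. */
    static int countElements(Reader raw, String name) throws XMLStreamException, IOException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
        int count = 0;
        try (Reader wrapped = new RootWrappingReader(raw)) {
            XMLStreamReader xml = factory.createXMLStreamReader(wrapped);
            while (xml.hasNext()) {
                if (xml.next() == XMLStreamConstants.START_ELEMENT
                        && name.equals(xml.getLocalName())) {
                    count++;
                }
            }
            xml.close();
        }
        return count;
    }
}
```

For a real file, the raw reader would be something like Files.newBufferedReader(path), and the counting loop would be replaced by whatever extracts the data. Memory use depends on the longest line and the parser's buffers, not on the total file size.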
