
Thread: Printing xml to the console from .wmdb without printing junks

  1. #1
    Junior Member
    Join Date
    Mar 2009
    Posts
    28
    Thanks
    5
    Thanked 0 Times in 0 Posts

    Default Printing xml to the console from .wmdb without printing junks

    Hi guys

    Further to a previous post that I made, I've been trying to print some XML from a .wmdb (Windows Media database) file. The file stores metadata about media that has been played using Windows Media Player. However, the file contains a lot of other junk besides the XML, and for now I'm only concerned with printing the XML from the file to the console. The following is some XML taken from the file:

    < M E T A D A T A x m l n s : s q l = " u r n : s c h e m a s - m i c r o s o f t - c o m : x m l - s q l " >

    < M D R - C D >

    < v e r s i o n > 4 . 0 < / v e r s i o n >

    < W M C o l l e c t i o n I D > 9 9 9 F B C 3 A - E 1 E A - 4 F 9 A - A B 8 3 - A 9 A 8 C A 0 D F 7 F B < / W M C o l l e c t i o n I D >

    < W M C o l l e c t i o n G r o u p I D > 9 9 9 F B C 3 A - E 1 E A - 4 F 9 A - A B 8 3 - A 9 A 8 C A 0 D F 7 F B < / W M C o l l e c t i o n G r o u p I D >

    < m d q R e q u e s t I D / >

    < u n i q u e F i l e I D > A M G a _ i d = R 2 0 6 7 1 4 < / u n i q u e F i l e I D >

    < a l b u m T i t l e > M T V U n p l u g g e d i n N e w Y o r k < / a l b u m T i t l e >

    < a l b u m A r t i s t > N i r v a n a < / a l b u m A r t i s t >

    < r e l e a s e D a t e > 1 9 9 4 - 1 1 - 0 1 < / r e l e a s e D a t e >

    < l a b e l > D i v i n e R e c o r d i n g s < / l a b e l >

    < g e n r e > R o c k < / g e n r e >

    < p r o v i d e r S t y l e > R o c k < / p r o v i d e r S t y l e >

    < p u b l i s h e r R a t i n g > 9 < / p u b l i s h e r R a t i n g >

    The code that I'm using is as follows. It's basically the same as the DOMParseExample on this website. For the time being I've just tried to get the program to read the WMCollectionID tag, as you can see below.

    The program compiles with no errors, but when I try to run it I get the following error:
    [Fatal Error] CurrentDatabase_360.wmdb:1:1: Content is not allowed in prolog.
    org.xml.sax.SAXParseException: Content is not allowed in prolog.
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:264)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:292)
    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:172)
    at DOMParseExample.main(DOMParseExample.java:16)


    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.Document;
    import org.w3c.dom.Element;
    import org.w3c.dom.NodeList;
    import java.io.File;

    public class DOMParseExample {

        public static void main(String[] args) {

            File file = new File("CurrentDatabase_360.wmdb");

            try {
                DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
                Document doc = builder.parse(file); // throws if the file is not well-formed XML

                // Element names are case-sensitive, so match the tag as it appears in the file
                NodeList nodes = doc.getElementsByTagName("METADATA");

                for (int i = 0; i < nodes.getLength(); i++) {
                    Element element = (Element) nodes.item(i);

                    NodeList ids = element.getElementsByTagName("WMCollectionID");
                    if (ids.getLength() == 0) {
                        continue; // this METADATA block has no WMCollectionID
                    }

                    System.out.println("WM Collection ID: " + ids.item(0).getTextContent());
                    System.out.println();
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

    I'm not sure whether this is because the program doesn't recognize the file, or because there is other data in the file besides the XML.

    Any help on this would be much appreciated, as I'm stumped!

    Thanks

    John
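A note on the error itself: "Content is not allowed in prolog" is what the parser reports when it meets non-XML content before the root element. The sketch below reproduces that on a small in-memory string rather than on the .wmdb file. The class name `PrologErrorDemo` and the sample strings are made up for illustration and are not from the original post.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class, not part of the original post.
class PrologErrorDemo {

    // Returns true if the text parses as a well-formed XML document.
    static boolean parses(String text) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new InputSource(new StringReader(text)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the input, for example because of content before the root element
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("clean XML parses:     " + parses("<a>ok</a>"));
        System.out.println("junk before root tag: " + parses("junk<a>ok</a>"));
    }
}
```

If the .wmdb file starts with binary data rather than XML, the same failure at line 1, column 1 would be expected, which matches the position in the error message above.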


  2. #2
    mmm.. coffee JavaPF's Avatar
    Join Date
    May 2008
    Location
    United Kingdom
    Posts
    3,336
    My Mood
    Mellow
    Thanks
    258
    Thanked 286 Times in 225 Posts
    Blog Entries
    4

    Default Re: Parsing xml problem

    Hello John,

    Could you please zip & attach the CurrentDatabase_360.wmdb file?

    If it won't let you, then please PM me and I will send you my email address.

    Has your other thread been solved now? If so, please mark it as solved:

    http://www.javaprogrammingforums.com/jpf-rules-guides/188-important-marking-your-thread-solved-new-post.html
    Please use [highlight=Java] code [/highlight] tags when posting your code.
    Forum Tip: Add to peoples reputation by clicking the button on their useful posts.

    Looking for a Java job? Visit - Java Programming Careers

  3. #3
    mmm.. coffee JavaPF's Avatar
    Join Date
    May 2008
    Location
    United Kingdom
    Posts
    3,336
    My Mood
    Mellow
    Thanks
    258
    Thanked 286 Times in 225 Posts
    Blog Entries
    4

    Default Re: Parsing xml problem

    Hey John,

    When I open this file with notepad there is soooooo much junk in it. You will never be able to parse this without editing out the rubbish first.

    Maybe you should write a program to read in the .wmdb file, clean it, and write the XML out to another file. Then you can parse that...

    Try using that code above with a file that contains just the wmdb XML.
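The read, clean, and write-out idea could be sketched roughly as follows. This is only a sketch under several assumptions that the thread does not confirm: that the gaps between characters are zero bytes, that the XML itself is plain ASCII, and that each block of interest runs from `<METADATA` to a closing `</METADATA>` tag. The class name `WmdbXmlExtractor` and the output file name `extracted.xml` are invented for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not the program discussed in the thread.
class WmdbXmlExtractor {

    // Assumed block boundaries, based only on the METADATA tag shown in the first post.
    private static final Pattern BLOCK =
            Pattern.compile("<METADATA\\b.*?</METADATA>", Pattern.DOTALL);

    // Drops zero bytes, reads the rest one byte per character, and collects METADATA blocks.
    static List<String> extractBlocks(byte[] raw) {
        byte[] kept = new byte[raw.length];
        int n = 0;
        for (byte b : raw) {
            if (b != 0) {
                kept[n++] = b; // crude: this would mangle any non-ASCII text in the file
            }
        }
        String text = new String(kept, 0, n, StandardCharsets.ISO_8859_1);

        List<String> blocks = new ArrayList<>();
        Matcher m = BLOCK.matcher(text);
        while (m.find()) {
            blocks.add(m.group());
        }
        return blocks;
    }

    public static void main(String[] args) {
        Path in = Paths.get(args.length > 0 ? args[0] : "CurrentDatabase_360.wmdb");
        Path out = Paths.get(args.length > 1 ? args[1] : "extracted.xml"); // made-up output name
        try {
            List<String> blocks = extractBlocks(Files.readAllBytes(in));
            Files.write(out, blocks, StandardCharsets.UTF_8); // one block per line group
            System.out.println("Wrote " + blocks.size() + " block(s) to " + out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If more than one block is found, the output file has several root elements, so each block would need to be parsed on its own or wrapped in a single enclosing element first.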

  4. #4
    Junior Member
    Join Date
    Mar 2009
    Posts
    28
    Thanks
    5
    Thanked 0 Times in 0 Posts

    Default Re: Parsing xml problem

    Quote Originally Posted by JavaPF View Post
    Hey John,

    When I open this file with notepad there is soooooo much junk in it. You will never be able to parse this without editing out the rubbish first.

    Maybe you should write a program to read in the .wmdb file, clean it, and write the XML out to another file. Then you can parse that...

    Try using that code above with a file that contains just the wmdb XML.
    Hey, thanks for taking the time to look at the file; it's much appreciated.

    I'm just curious: do you think using regular expressions to parse the XML would work?

    John

  5. #5
    mmm.. coffee JavaPF's Avatar
    Join Date
    May 2008
    Location
    United Kingdom
    Posts
    3,336
    My Mood
    Mellow
    Thanks
    258
    Thanked 286 Times in 225 Posts
    Blog Entries
    4

    Default Re: Parsing xml problem

    I suppose it would work but it would be complicated. Is the XML always printed in the same format?

    I think there are easier ways to do it. When I viewed that file you sent me, I could not see the clear XML like your example above. There was data clogging it inside and outside of the tags. Did you clean this up before you posted?
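For reference, the regular-expression idea might look something like the sketch below when the text is already readable and the tag has no attributes. It is not a general XML parser and would miss nested or malformed tags. The class name `TagValueFinder` and the sample input are made up for illustration; only the tag names and values come from the first post.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not code from the thread.
class TagValueFinder {

    // Returns the text between every <tag>...</tag> pair found in the input.
    static List<String> valuesOf(String tag, String text) {
        String quoted = Pattern.quote(tag); // treat the tag name literally
        Pattern p = Pattern.compile("<" + quoted + ">(.*?)</" + quoted + ">", Pattern.DOTALL);
        Matcher m = p.matcher(text);
        List<String> values = new ArrayList<>();
        while (m.find()) {
            values.add(m.group(1).trim());
        }
        return values;
    }

    public static void main(String[] args) {
        // Sample text with junk around the tags, loosely modelled on the first post
        String text = "###<albumArtist>Nirvana</albumArtist>%%%<genre>Rock</genre>@@@";
        System.out.println(valuesOf("albumArtist", text));
        System.out.println(valuesOf("genre", text));
    }
}
```

Because the pattern ignores whatever sits outside the tags, junk between elements does not matter, but junk inside a tag name or value would still break the match.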

  6. #6
    Junior Member
    Join Date
    Mar 2009
    Posts
    28
    Thanks
    5
    Thanked 0 Times in 0 Posts

    Default Re: Parsing xml problem

    To be honest, the XML for each piece of media appears quite randomly. Generally the XML tags for each piece of media are similar, but some of the media contains XML that others don't. The file I sent you was one of the smaller ones; some of the larger ones I've looked at can be quite messy.

    The XML above was one of the better structured pieces I found, but not all of it is like that, as you've seen from the file. I copied some of the XML into an XML file, tidied it up manually, and it worked. Ideally, though, I'd like something that could read the XML straight from the .wmdb file, or copy it to a text file and read from that, instead of me having to tidy up the file by hand. Perhaps I'm being unrealistic.

    One of the problems I've found with the XML is that each tag seems to contain whitespace between each character, which is a problem when using the DOM parser. It means I have to go through each tag and remove all the whitespace, which would take me hours. This is why I was thinking that by using regular expressions I could deal with the whitespace as well.

    Perhaps I'm wrong though, as I'm not really familiar with regular expressions.

    John
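One possible explanation for the gap after every character, which the thread does not confirm, is that the XML is stored as UTF-16 and is being viewed one byte per character. The sketch below shows the effect on a short string built in memory; it does not read the .wmdb file. The class name `Utf16GapDemo` is invented for the example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the UTF-16 explanation is an assumption, not a confirmed fact.
class Utf16GapDemo {

    // Decodes the bytes one byte per character, the way a simple text viewer might.
    static String asSingleByteText(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    // Decodes the same bytes as little-endian UTF-16.
    static String asUtf16Le(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        byte[] raw = "<genre>Rock</genre>".getBytes(StandardCharsets.UTF_16LE);

        // Every second byte is zero; show it as a space so the gaps are visible
        System.out.println(asSingleByteText(raw).replace('\0', ' '));

        // Decoded with the matching charset, the gaps disappear
        System.out.println(asUtf16Le(raw));
    }
}
```

If that is what is happening here, decoding the XML portion with the right charset would avoid removing the gaps tag by tag.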

  7. #7
    mmm.. coffee JavaPF's Avatar
    Join Date
    May 2008
    Location
    United Kingdom
    Posts
    3,336
    My Mood
    Mellow
    Thanks
    258
    Thanked 286 Times in 225 Posts
    Blog Entries
    4

    Default Re: Parsing xml problem

    Hmm... This does look like it's going to be a bit of a problem!

    I think your best bet is to read in the file, try to isolate the XML and delete any of the crap/white space.

    Then you can take the clean XML and parse it.
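As a rough illustration of the second step, cleaned XML held in a string can be handed to the DOM parser directly, without writing it to a file first. The sample document below is a hand-made, well-formed cut of the fragment in the first post, and the class name `CleanXmlReader` is invented for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not code from the thread.
class CleanXmlReader {

    // Parses the XML text and returns the text of the first element with the given name, or null.
    static String firstValue(String xml, String tag) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName(tag);
            return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Hand-cleaned sample using tag names and values from the first post
        String xml = "<METADATA><MDR-CD>"
                + "<albumArtist>Nirvana</albumArtist>"
                + "<releaseDate>1994-11-01</releaseDate>"
                + "</MDR-CD></METADATA>";
        System.out.println("Album artist: " + firstValue(xml, "albumArtist"));
        System.out.println("Release date: " + firstValue(xml, "releaseDate"));
    }
}
```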

    I will try to write something to do the first step when I get a chance. I'm not sure how straightforward it's going to be!!!

  8. #8
    mmm.. coffee JavaPF's Avatar
    Join Date
    May 2008
    Location
    United Kingdom
    Posts
    3,336
    My Mood
    Mellow
    Thanks
    258
    Thanked 286 Times in 225 Posts
    Blog Entries
    4

    Default Re: Parsing xml problem

    This is so nasty. I've written a program to look for the XML elements in the .wmdb file, but in the file you sent me I can't pick up any of the tags like the ones you posted above.

  9. #9
    Junior Member
    Join Date
    Mar 2009
    Posts
    28
    Thanks
    5
    Thanked 0 Times in 0 Posts

    Default Re: Parsing xml problem

    Quote Originally Posted by JavaPF View Post
    This is so nasty. I've written a program to look for the XML elements in the .wmdb file, but in the file you sent me I can't pick up any of the tags like the ones you posted above.
    Hey, thanks for taking the time to look at it. I'm still working on it, but I've got another problem!

    I'll post it in a new thread though.

  10. #10
    mmm.. coffee JavaPF's Avatar
    Join Date
    May 2008
    Location
    United Kingdom
    Posts
    3,336
    My Mood
    Mellow
    Thanks
    258
    Thanked 286 Times in 225 Posts
    Blog Entries
    4

    Default Re: Parsing xml problem

    OK John, I'll do my best to help you solve your new problem!
