Thread: Java Regex Help

  1. #1
    Forum Squatter newbie's Avatar
    Join Date
    Nov 2010
    Location
    North Wales
    Posts
    661
    My Mood
    Stressed
    Thanks
    28
    Thanked 115 Times in 106 Posts
    Blog Entries
    1

    Default Java Regex Help

    I'm currently learning regular expressions, and one of my tasks is to get all the text between <body> and </body> in an HTML file. When <body> and </body> are on the same line I get a result, but the contents of the body are supposed to span multiple lines, and in that case nothing is returned, as it seems to check only one line at a time.

    What I don't understand is where in this regex I could put a newline:
    String regex = "<body>.*?</body>";
    Also, which newline expression should I use? I've seen examples such as:
    \n \\n '\n' ('\u000A')

    Thanks.

    Alternatively, any good tutorials would be appreciated.
    Last edited by newbie; January 25th, 2011 at 06:38 PM.
    Please use [highlight=Java]//code goes here...[/highlight] tags when posting your code


  2. #2
    Administrator copeg's Avatar
    Join Date
    Oct 2009
    Location
    US
    Posts
    5,320
    Thanks
    181
    Thanked 833 Times in 772 Posts
    Blog Entries
    5

    Default Re: Java Regex Help

    By default, a dot (".") in a regular expression does not match line terminators. Pass Pattern.DOTALL to Pattern.compile so that the dot matches any character, including line terminators:
    Pattern pattern = Pattern.compile(regex, Pattern.DOTALL);
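
    A minimal sketch of the difference the flag makes, using a made-up sample string; the class and method names (DotallDemo, firstMatch) are only for illustration and are not from the thread:

    ```java
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    class DotallDemo {
        // Returns the first <body>...</body> match, or null if there is none.
        // 'flags' is passed straight to Pattern.compile (0 means no flags).
        static String firstMatch(String html, int flags) {
            Pattern pattern = Pattern.compile("<body>.*?</body>", flags);
            Matcher matcher = pattern.matcher(html);
            return matcher.find() ? matcher.group() : null;
        }

        public static void main(String[] args) {
            // Hypothetical input where the body spans several lines.
            String html = "<html>\n<body>\nHello\n</body>\n</html>";

            // Without DOTALL the dot cannot cross a line break, so this prints null.
            System.out.println(firstMatch(html, 0));

            // With DOTALL the dot also matches line terminators, so the
            // whole <body>...</body> element is printed.
            System.out.println(firstMatch(html, Pattern.DOTALL));
        }
    }
    ```

    This only works if the whole text is handed to the matcher as one string.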
    Last edited by copeg; January 25th, 2011 at 07:18 PM.

  3. #3
    Forum Squatter newbie's Avatar

    Default Re: Java Regex Help

    Hmm, yeah, thanks for that. The description of Pattern.DOTALL, as used in pattern = Pattern.compile(regex, Pattern.DOTALL);
    seems to be what I was after, but unfortunately it still fails to find a match when <body> and </body> are on different lines.

  4. #4
    Administrator copeg's Avatar

    Default Re: Java Regex Help

    In that case you will have to post an example that does not work (both the code and the text to parse), as it works for me...

  5. #5
    Forum Squatter newbie's Avatar

    Default Re: Java Regex Help

    ----VOID----
    Last edited by newbie; February 6th, 2011 at 03:02 PM.

  6. #6
    Administrator copeg's Avatar

    Default Re: Java Regex Help

    This isn't a problem with the regular expression per se, but with how the file is read. If you want to search for a regular expression across the whole file, read the file in full and then pass its contents to the regular expression engine. Right now you are reading with a Scanner, which splits on whitespace by default, so you feed the regular expression engine the contents of the file word by word, and a pattern that spans several tokens, such as <body>...</body>, will never match.
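
    The original code is no longer in the thread, so the following is only a sketch of the two approaches, assuming the Scanner was used with next() and its default whitespace delimiter. The class and method names (WholeInputSearch, foundTokenByToken, bodyOf, readFile) are invented for the example:

    ```java
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.Scanner;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    class WholeInputSearch {
        static final Pattern BODY = Pattern.compile("<body>(.*?)</body>", Pattern.DOTALL);

        // Assumed shape of the failing approach: each whitespace-separated
        // token is matched on its own, so the opening and closing tags are
        // never seen together once the body contains any whitespace.
        static boolean foundTokenByToken(String text) {
            try (Scanner scanner = new Scanner(text)) {
                while (scanner.hasNext()) {
                    if (BODY.matcher(scanner.next()).find()) {
                        return true;
                    }
                }
            }
            return false;
        }

        // Whole-input approach: run the matcher once over the full contents.
        // Returns the text between the tags, or null if there is no match.
        static String bodyOf(String contents) {
            Matcher matcher = BODY.matcher(contents);
            return matcher.find() ? matcher.group(1) : null;
        }

        // One way to get the full contents of a file as a single string
        // (UTF-8 is an assumption about the file's encoding).
        static String readFile(Path file) throws IOException {
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        }

        public static void main(String[] args) {
            String page = "<html>\n<body>\nHello world\n</body>\n</html>";
            System.out.println(foundTokenByToken(page)); // false
            System.out.println(bodyOf(page) != null);    // true
        }
    }
    ```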

  7. The Following User Says Thank You to copeg For This Useful Post:

    newbie (January 26th, 2011)

  8. #7
    mmm.. coffee JavaPF's Avatar
    Join Date
    May 2008
    Location
    United Kingdom
    Posts
    3,336
    My Mood
    Mellow
    Thanks
    258
    Thanked 294 Times in 227 Posts
    Blog Entries
    4

    Default Re: Java Regex Help

    Please use [highlight=Java] code [/highlight] tags when posting your code.

  9. The Following User Says Thank You to JavaPF For This Useful Post:

    newbie (January 26th, 2011)

  10. #8
    Forum Squatter newbie's Avatar

    Default Re: Java Regex Help

    Thank you both for your input. I read the file into a StringBuilder and then searched that, which worked a treat.
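
    The final code was not posted, so this is just one possible shape of the StringBuilder approach, assuming the input is read line by line with a BufferedReader. The names (StringBuilderRead, readAll, bodyOf) are made up for the sketch:

    ```java
    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.Reader;
    import java.io.StringReader;
    import java.io.UncheckedIOException;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    class StringBuilderRead {
        // Reads every line from the source into one StringBuilder.
        // readLine() strips the line terminator, so '\n' is appended back.
        static String readAll(Reader source) {
            StringBuilder contents = new StringBuilder();
            try (BufferedReader reader = new BufferedReader(source)) {
                String line;
                while ((line = reader.readLine()) != null) {
                    contents.append(line).append('\n');
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return contents.toString();
        }

        // Searches the accumulated text once, with DOTALL so '.' spans lines.
        static String bodyOf(String contents) {
            Matcher matcher = Pattern
                    .compile("<body>(.*?)</body>", Pattern.DOTALL)
                    .matcher(contents);
            return matcher.find() ? matcher.group(1) : null;
        }

        public static void main(String[] args) {
            // A StringReader stands in for the file here; for a real file,
            // a java.io.FileReader could be passed to readAll instead.
            String text = readAll(new StringReader("<html>\n<body>\nHi\n</body>\n</html>"));
            System.out.println(bodyOf(text));
        }
    }
    ```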
