
Thread: Simple Hashsets and regular expression problem !

  1. #1
    Junior Member
    Join Date
    Mar 2011
    Posts
    2
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Simple Hashsets and regular expression problem !

    Hey people,

    I am trying to write a program that gathers URLs from a web page using HTML Parser and then filters them.

    The part that gathers the URLs is done; they are stored in a HashSet.

    But I find it hard to take the links that are in the HashSet and filter them with a regular expression!

    I just want to take the hyperlinks and filter them!

    I am sure this is a simple problem, but Java is not my speciality, so I am slow when it comes to this.

    Here is the program so far:

    import org.htmlparser.util.*;
    import java.util.Iterator;
    import org.htmlparser.*;
    import org.htmlparser.tags.*;
    import org.htmlparser.filters.*;
    import java.util.HashSet;
    import java.util.regex.Pattern;
    import java.util.regex.Matcher;

    class Hash
    {
        static Matcher m;
        static String st;

        public static HashSet<String> visit (String s)
        {
            HashSet <String> s1 = new HashSet();
            try {
                Parser parser1 = new Parser (s);
                NodeList list1 = parser1.parse (new LinkStringFilter("http:"));

                for (int i=0;i<list1.size();i++)
                {
                    String st = ((LinkTag)(list1.elementAt(i))).extractLink();
                    s1.add(st);
                    Iterator iter = s1.iterator();
                    while (iter.hasNext()) {
                        String str = (String)iter.next();
                        Pattern pattern = Pattern.compile("html");
                        m = pattern.matcher(str);
                    }
                    if (m.find()) {
                        System.out.println(m);
                    }
                }
                return s1;
            }
            catch (Exception e)
            {
                return new HashSet();
            }
        }
    }



    I tried to add a regular expression check, but it prints out all of the links instead of only the links with "html" in them.

    I hope someone can help me!


  2. #2
    Administrator copeg
    Join Date
    Oct 2009
    Location
    US
    Posts
    5,320
    Thanks
    181
    Thanked 833 Times in 772 Posts
    Blog Entries
    5

    Re: Simple Hashsets and regular expression problem !

    I'm not sure I understand what you are asking. Are you just trying to get the links that end in ".html"? Does your code print out every link? Your code is quite difficult to read with the non-preserved tabbing and excess spaces, and I would recommend posting an SSCCE (a short, self-contained, correct example) with a hardcoded example that demonstrates the issue you are having.
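
To make the two readings of the question concrete, here is a minimal hardcoded sketch that contrasts links that merely contain "html" with links that end in ".html". The class name HtmlLinkCheck, its method names, and the example.com URLs are made up for illustration and do not come from the original post.

```java
import java.util.regex.Pattern;

public class HtmlLinkCheck {
    // Matches if the text "html" appears anywhere in the link.
    static final Pattern CONTAINS_HTML = Pattern.compile("html");
    // Matches only if the link ends with the literal suffix ".html".
    static final Pattern ENDS_WITH_HTML = Pattern.compile("\\.html$");

    static boolean containsHtml(String link) {
        return CONTAINS_HTML.matcher(link).find();
    }

    static boolean endsWithHtml(String link) {
        return ENDS_WITH_HTML.matcher(link).find();
    }

    public static void main(String[] args) {
        // Hypothetical links, hardcoded so the example needs no network access.
        String[] links = {
            "http://example.com/index.html",
            "http://example.com/html/logo.png",
            "http://example.com/about.htm"
        };
        for (String link : links) {
            System.out.println(link
                + " contains=" + containsHtml(link)
                + " endsWith=" + endsWithHtml(link));
        }
    }
}
```

The second link is the interesting case: it contains "html" in its path but is not an HTML page, so the two patterns disagree on it.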

  3. #3
    Junior Member
    Join Date
    Mar 2011
    Posts
    2
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Simple Hashsets and regular expression problem !

    Yes, I want to filter the links that contain "html".
    Yes, it just prints out every link!

    Okay, I don't want to confuse you, so I will show the program without the regular expression and an example of how it works.

    import org.htmlparser.util.*;
    import java.util.Iterator;
    import org.htmlparser.*;
    import org.htmlparser.tags.*;
    import org.htmlparser.filters.*;
    import java.util.HashSet;
    import java.util.regex.Pattern;
    import java.util.regex.Matcher;

    class Hash
    {
        static Matcher m;
        static String st;

        public static HashSet<String> visit (String s)
        {
            HashSet <String> s1 = new HashSet();
            try {
                Parser parser1 = new Parser (s);
                NodeList list1 = parser1.parse (new LinkStringFilter("http:"));

                for (int i=0;i<list1.size();i++)
                {
                    String st = ((LinkTag)(list1.elementAt(i))).extractLink();
                    s1.add(st);
                }
                return s1;
            }
            catch (Exception e)
            {
                return new HashSet();
            }
        }
    }








    This is the output of the program:



    So all I want to do is add a regular expression check to the code so that it can filter all the links the program has collected.

    That is where you saw my failed attempt above: I tried to use a regex, but all it did was print all the links instead of filtering them.

    I hope that makes more sense to you now!
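
One way the filtering step could look, offered as a minimal sketch under stated assumptions rather than a definitive fix: collect the links first, then copy only those whose text matches the pattern into a second set. The class name LinkFilter, the filter method, and the example.com links are hypothetical; in the program above, the input set would be whatever Hash.visit returns.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Pattern;

public class LinkFilter {
    // Compiled once and reused, instead of recompiling inside a loop.
    private static final Pattern HTML = Pattern.compile("html");

    // Returns a new set holding only the links in which the pattern is found.
    // The input set is left unchanged.
    static Set<String> filter(Set<String> links, Pattern pattern) {
        Set<String> kept = new HashSet<>();
        for (String link : links) {
            // A fresh Matcher per link; find() looks for the pattern anywhere in it.
            if (pattern.matcher(link).find()) {
                kept.add(link);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the set that Hash.visit(...) would return.
        Set<String> links = new HashSet<>(Arrays.asList(
            "http://example.com/index.html",
            "http://example.com/logo.png",
            "http://example.com/docs/page.html"));

        // Print the link itself, not the Matcher object.
        for (String link : filter(links, HTML)) {
            System.out.println(link);
        }
    }
}
```

Compared with the first attempt in this thread, three things differ: the Matcher is no longer overwritten on every pass of an inner loop, the link is printed rather than the Matcher, and the returned set itself is filtered, so a caller that prints the result no longer sees every link.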
