Website content extractor
I have been trying to write a program that displays the text of a site that is mainly text, in this case hackaday.com. I just can't find the right methods. Can anyone get it working and explain it to me?
import javax.swing.JLabel;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import javax.swing.JFrame;
public class SeeWebsite
{
    public static void main(String[] argv) throws Exception
    {
        URL url = new URL("http://www.hackaday.com");
        System.out.println(url.toExternalForm());
    }
}
Re: Website content extractor
What is printed out when you execute your program?
Why do you expect your code to get the contents of the html file from the server?
Re: Website content extractor
All it prints is "http://www.hackaday.com".
I don't expect it to, not yet. That's why I posted here. Also, I'd like to be able to view the text as the browser shows it, not the raw HTML.
Re: Website content extractor
One way would be to use the HttpURLConnection class to connect to the server and read what the server returns. What comes back is the HTML.
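A minimal sketch of that approach, in case it helps. The class name FetchHtml and the helper methods readAll and fetch are made up for illustration, and the code assumes the page is UTF-8 encoded. Also note that HttpURLConnection does not follow a redirect from http to https, so if the site redirects you may need to use the https address instead.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class FetchHtml
{
    // Reads every line from the stream and joins the lines with '\n'.
    // Assumption: the content is UTF-8; a real page may declare another charset.
    static String readAll(InputStream in) throws IOException
    {
        StringBuilder html = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8)))
        {
            String line;
            while ((line = reader.readLine()) != null)
            {
                html.append(line).append('\n');
            }
        }
        return html.toString();
    }

    // Opens an HTTP connection to the address and returns the response body.
    static String fetch(String address) throws IOException
    {
        HttpURLConnection conn =
                (HttpURLConnection) new URL(address).openConnection();
        conn.setRequestMethod("GET");
        try
        {
            return readAll(conn.getInputStream());
        }
        finally
        {
            conn.disconnect();
        }
    }

    public static void main(String[] argv) throws Exception
    {
        // Prints the raw HTML, tags and all.
        System.out.println(fetch("http://www.hackaday.com"));
    }
}
```

The reading is split out into readAll so it can be tried on any InputStream without touching the network.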
If you want a browser-like display (it handles simple HTML only), use the JEditorPane class.
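Here is a rough sketch of the JEditorPane route. The class name HtmlViewer, the helper createPane, and the window size are my own choices for illustration. JEditorPane.setPage fetches and renders the page itself, so no separate connection code is needed for this variant.

```java
import java.io.IOException;
import javax.swing.JEditorPane;
import javax.swing.JFrame;
import javax.swing.JScrollPane;
import javax.swing.SwingUtilities;

class HtmlViewer
{
    // Builds a read-only pane that renders the given HTML string.
    static JEditorPane createPane(String html)
    {
        JEditorPane pane = new JEditorPane("text/html", html);
        pane.setEditable(false);
        return pane;
    }

    public static void main(String[] argv)
    {
        // Swing components should be created on the event dispatch thread.
        SwingUtilities.invokeLater(() -> {
            JEditorPane pane =
                    createPane("<html><body><p>Loading...</p></body></html>");
            try
            {
                // Asks the pane to load and render the page at this address.
                pane.setPage("http://www.hackaday.com");
            }
            catch (IOException e)
            {
                pane.setText("<html><body><p>Could not load page: "
                        + e.getMessage() + "</p></body></html>");
            }

            JFrame frame = new JFrame("SeeWebsite");
            frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            frame.add(new JScrollPane(pane));
            frame.setSize(800, 600);
            frame.setVisible(true);
        });
    }
}
```

Expect a modern page to look rough in this pane, since it only copes with simple HTML.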
Re: Website content extractor
Yes, but how would I convert the HTML into plain text, like what would be shown in a browser?
Re: Website content extractor
Are you asking how to parse an HTML page and extract the text into Strings? I think there are third-party packages that do some of that. Try asking Google.
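If a third-party package is not an option, one alternative is the HTML parser that ships with Swing in the JDK. This is only a sketch of that swapped-in technique, not a full text extractor: the class name HtmlToText and the method extractText are hypothetical, it puts each run of text on its own line, and it makes no attempt to reproduce a browser's layout.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

class HtmlToText
{
    // Parses the HTML and collects the text the parser reports,
    // one run of text per line. Tags themselves are dropped.
    static String extractText(Reader html) throws IOException
    {
        final StringBuilder text = new StringBuilder();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback()
        {
            @Override
            public void handleText(char[] data, int pos)
            {
                text.append(data).append('\n');
            }
        };
        // The last argument tells the parser to ignore charset declarations,
        // since the Reader has already decoded the characters.
        new ParserDelegator().parse(html, callback, true);
        return text.toString();
    }

    public static void main(String[] argv) throws Exception
    {
        String sample = "<html><body><h1>Hello</h1><p>World</p></body></html>";
        System.out.println(extractText(new StringReader(sample)));
    }
}
```

To use it on a live page, pass it a Reader wrapped around the connection's input stream instead of the StringReader.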