Problem to get info from HTML file
I am trying to get all the text from the following URL: "https://svenskaspel.se/p4.aspx?pageid=264".
The problem is that I can't get the text from one specific tag, and I have no idea why I can't get the text between the opening and closing tag.
Here is my code:
Code Java:
    public getExpert() {
        try {
            sc = new Scanner(new URL("https://svenskaspel.se/p4.aspx?pageid=264").openStream(), "iso-8859-1");
        } catch (IOException ex) {
            Logger.getLogger(getTips.class.getName()).log(Level.SEVERE, null, ex);
        }
        String s = null;
        do {
            s = sc.nextLine();
            Matcher m = Pattern.compile("<SPAN\\b[^>]*CLASS=\"mbr entry-content>(.*?)</SPAN>").matcher(s);
            if (m.find()) {
                System.out.println(m.group(1).trim());
            }
        } while (sc.hasNextLine() && !s.matches("<SPAN\\b[^>]*CLASS=\"mbr entry-content>(.*?)</SPAN>"));
    }
}
Re: Problem to get info from HTML file
I did some tests using this slightly edited code:
Code Java:
try {
    Scanner sc = new Scanner(new URL("https://svenskaspel.se/p4.aspx?pageid=264").openStream(), "iso-8859-1");
    while (sc.hasNextLine()) {
        String s = sc.nextLine();
        Matcher m = Pattern.compile("<SPAN\\b[^>]*CLASS=").matcher(s);
        if (m.find()) {
            System.out.println(m.group(0));
            //System.out.println(s);
        }
    }
} catch (IOException e) {
    System.out.println("Ouch! " + e);
}
It returns results but not the exact results you are looking for.
I think you need to play with your regular expression.
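One thing that stands out in the original pattern is that the closing quote after entry-content is missing, so CLASS=\"mbr entry-content> only matches if the page literally contains that text with no quote before the ">". The pattern is also case-sensitive, and matching line by line will miss a span whose text continues on the next line. Here is a minimal sketch of an adjusted pattern. It assumes the page marks the text up as <span class="mbr entry-content">...</span>, which I have not verified against the live page; the class name SpanTextDemo, the extract helper, and the sample string are made up for illustration, so it runs without a network connection.
Code Java:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SpanTextDemo {
    // Assumed markup: <span class="mbr entry-content">text</span>.
    // Note the closing \" after entry-content, which the original pattern lacks.
    // CASE_INSENSITIVE accepts both <SPAN CLASS=...> and <span class=...>;
    // DOTALL lets (.*?) run across line breaks inside the span.
    static final Pattern SPAN = Pattern.compile(
            "<span\\b[^>]*class=\"mbr entry-content\"[^>]*>(.*?)</span>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Hypothetical helper: returns the trimmed text of every matching span.
    static List<String> extract(String html) {
        List<String> out = new ArrayList<>();
        Matcher m = SPAN.matcher(html);
        while (m.find()) {
            out.add(m.group(1).trim());
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample standing in for the downloaded page.
        String sample = "<div><span class=\"mbr entry-content\"> First tip </span>\n"
                + "<SPAN ID=\"x\" CLASS=\"mbr entry-content\">Second\ntip</SPAN></div>";
        for (String text : extract(sample)) {
            System.out.println(text);
        }
    }
}
```

To use it on the real page you would read the whole response into one String first (for example by appending every sc.nextLine() to a StringBuilder) and pass that to extract, instead of matching one line at a time. If the markup turns out to be nested or irregular, a regular expression will probably stay fragile and an HTML parser would be the safer route.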