How to download protected web page using JAVA
We have a task: design a class that can download the source of any web page. But when I test my code by fetching a page like http://anidb.net/perl-bin/animedb.pl?show=main, nothing works.
Standard code like this fails:
Code :
import java.net.*;
import java.io.*;

public class URLReader {
    public static void main(String[] args) throws Exception {
        URL link = new URL("http://www.anidb.net/");
        BufferedReader in = new BufferedReader(
                new InputStreamReader(link.openStream()));
        String inputLine;
        while ((inputLine = in.readLine()) != null)
            System.out.println(inputLine);
        in.close();
    }
}
Here is the result I got: Šwq>²"¦§5´ïÇUº=ôÙö?kŠ}~“bd`?l“Ïçz¢Çêõ>"?j׉R“y}K¸\ Ìc_DLÙªÏ_ –óMm_¼_0”•ö°ËC_aí½sî¤ìÁS ‚>dC0ìs_–y¹ñ±ÏÝÜAø%ÈäÖáæ©A@,4x„жëɃ?
I have tried everything I could think of, including cookies and request headers, but nothing seems to work. If you have a hint for me, I would appreciate it. I've been thinking about this problem for two weeks. Thanks.
Re: How to download protected web page using JAVA
Try using a different class that will return the HTTP headers etc. instead of just the contents.
Try your code with another website that returns an HTML page. It's possible that the site you are going to does not return an HTML page.
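One way to look at the headers is java.net.HttpURLConnection, whose getHeaderFields() method returns the response headers as a map. This is only a rough sketch: the class name HeaderDump and the helper formatHeaders are made up for illustration, and the "hdr> " prefix is just a label for the printed lines.

```java
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; the names are illustrative only.
class HeaderDump {

    // Turns the header map into printable lines, one per header value.
    static List<String> formatHeaders(Map<String, List<String>> headers) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : headers.entrySet()) {
            String name = entry.getKey(); // a null key holds the HTTP status line
            for (String value : entry.getValue()) {
                lines.add("hdr> " + (name == null ? "" : name + ": ") + value);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        URL link = new URL("http://www.anidb.net/");
        HttpURLConnection conn = (HttpURLConnection) link.openConnection();
        try {
            // getHeaderFields() sends the request and returns the response headers.
            for (String line : formatHeaders(conn.getHeaderFields())) {
                System.out.println(line);
            }
        } finally {
            conn.disconnect();
        }
    }
}
```

The interesting headers to check are Content-Type and Content-Encoding, since they say how the body bytes have to be interpreted.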
When I read from that site I get this:
hdr> Date: Sat, 22 Sep 2012 12:22:41 GMT
hdr> Server: Apache
hdr> Cache-control: no-cache
hdr> Pragma: no-cache
hdr> Content-Type: text/html; charset=UTF-8
hdr> Expires: Sat, 22 Sep 2012 12:22:41 GMT
hdr> Set-Cookie: adbuin=1348316562-wmNq; path=/; expires=Tue, 20-Sep-2022 12:22:42 GMT
hdr> Vary: Accept-Encoding
hdr> Content-Encoding: gzip
hdr> Keep-Alive: timeout=4, max=50
hdr> Connection: Keep-Alive
hdr> Transfer-Encoding: chunked
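Note the Content-Encoding: gzip line in the headers above. It suggests the body is gzip-compressed, which would explain the unreadable output when the raw bytes are printed as text. A minimal sketch of one way to handle that, assuming the server keeps answering with gzip and UTF-8 as in the headers above; the class GzipAwareReader and its methods decode and readAll are hypothetical names, not an existing API:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;

// Hypothetical helper class; the names are illustrative only.
class GzipAwareReader {

    // Wraps the raw stream in a GZIPInputStream when the server reports
    // Content-Encoding: gzip; otherwise returns the stream unchanged.
    static InputStream decode(InputStream raw, String contentEncoding) throws IOException {
        if (contentEncoding != null && contentEncoding.trim().equalsIgnoreCase("gzip")) {
            return new GZIPInputStream(raw);
        }
        return raw;
    }

    // Reads the whole stream as UTF-8 text (assumed from the Content-Type header above).
    static String readAll(InputStream in) throws IOException {
        StringBuilder text = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                text.append(line).append('\n');
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        URL link = new URL("http://www.anidb.net/");
        HttpURLConnection conn = (HttpURLConnection) link.openConnection();
        // Tell the server explicitly that a gzip-compressed body is acceptable.
        conn.setRequestProperty("Accept-Encoding", "gzip");
        InputStream body = decode(conn.getInputStream(), conn.getContentEncoding());
        System.out.print(readAll(body));
    }
}
```

Checking getContentEncoding() instead of always unzipping keeps the same code usable for sites that send plain, uncompressed HTML.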
Also posted at http://www.java-forums.org/advanced-...sing-java.html