Thread: Thread Blocked

  1. #1
    Junior Member
    Join Date
    Feb 2010
    Thanked 0 Times in 0 Posts

    Thread Blocked

    Hi all,

    My question in detail:

    I am working on a web-crawler project in which we fetch web pages using the HttpClient 4.0 beta2 API (HttpClient4.0 beta2.jar).
    It is a multithreaded application, and the project follows a client-server approach.
    We fetch the URLs from the database and request each one with the HttpClient.execute() method.

    Code snippet as follows:

        URL url = new URL(html.GetURL().trim());
        // get1 was declared but never initialized in the snippet as posted;
        // building it from the trimmed URL is the presumed intent.
        HttpGet get1 = new HttpGet(url.toString());
        DefaultHttpClient client = new DefaultHttpClient();
        String body = null;
        // Declared but not used in this fragment.
        ResponseHandler<String> handler = new BasicResponseHandler();
        HttpResponse preresponse = client.execute(get1);
        StringBuffer source = new StringBuffer();
        HttpEntity entity = preresponse.getEntity();
        InputStream is = entity.getContent();
        BufferedReader br = new BufferedReader(new InputStreamReader(is));
        String line = br.readLine();

    When we start the application, all the threads crawl the URLs without problems. After some time, say 4 to 5 hours or more, some of the threads get blocked after executing the statement
    StringBuffer source=new StringBuffer();

    and they do not crawl any further URLs. We also log the status of every thread every 5 minutes, and the blocked threads are still reported in the Runnable state. In addition, we catch all exceptions from the java.io, java.util, and org.apache.http.client packages and write each one to the log file with the time and the URL, but no exception is logged for the blocked threads.

    Please help me
    Thanks & Regards
    Rakesh Yadav
    Last edited by Json; February 19th, 2010 at 03:53 AM. Reason: Please use code tags.
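
A thread that is waiting inside a native socket read is still reported as RUNNABLE by the JVM, so a status-only log cannot show where it is stuck. The sketch below is one way the periodic status log described above could also record each thread's current stack, using only the standard library. The class name ThreadStatusLogger and its method names are hypothetical, and the report is returned as a string so it can be written to whatever log file the application already uses.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.Map;

// Hypothetical helper, not part of the original crawler.
class ThreadStatusLogger {

    /** Builds a report with the name, state, and current stack of every live thread. */
    static String dumpAllThreads() {
        StringBuilder report = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            report.append(thread.getName())
                  .append(" [").append(thread.getState()).append("]\n");
            // The top frames show the call the thread is currently inside.
            for (StackTraceElement frame : entry.getValue()) {
                report.append("    at ").append(frame).append('\n');
            }
        }
        return report.toString();
    }

    /** Returns how many threads the JVM reports as deadlocked (0 if none). */
    static int countDeadlockedThreads() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        long[] ids = bean.findDeadlockedThreads(); // null when no deadlock is found
        return ids == null ? 0 : ids.length;
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
        System.out.println("Deadlocked threads: " + countDeadlockedThreads());
    }
}
```

If the stuck threads show a socket read near the top of their stack, the likely cause is a remote server that stopped sending data rather than a lock inside the application.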

  2. #2
    copeg, Administrator
    Join Date
    Oct 2009
    Thanked 832 Times in 772 Posts

    Re: Thread Blocked

    Problems like this can be a pain to diagnose. My first thought is that, after running for 4-5 hours and depending on how (or whether) you are keeping data in memory, you could be running out of memory (OutOfMemoryError). You can raise the maximum heap size with the -Xmx option when starting the virtual machine. The cause could also be the host, because the thread seems to block at preresponse.getEntity(). I've never used HttpClient 4.0, but there should be a way to set timeouts and to deal with HTTP status codes (why crawl pages that return 3xx, 4xx, or 5xx?). Finally, blocked threads can be the result of deadlock, although that may not be the case here.
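
To illustrate the timeout and status-code suggestions, here is a minimal sketch that uses the standard library's java.net.HttpURLConnection instead of HttpClient, so it runs without extra jars. The class name TimeoutFetch, the method names, and the two timeout values are assumptions chosen for illustration. HttpClient 4.0 should offer comparable settings through HttpConnectionParams.setConnectionTimeout and HttpConnectionParams.setSoTimeout on the parameters returned by client.getParams(), but check that against the documentation for the exact release in use.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper, not part of the original crawler.
class TimeoutFetch {
    static final int CONNECT_TIMEOUT_MS = 10000; // assumed value
    static final int READ_TIMEOUT_MS = 30000;    // assumed value

    /** Creates a connection object (nothing is sent yet) with both timeouts set. */
    static HttpURLConnection open(String address) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        conn.setConnectTimeout(CONNECT_TIMEOUT_MS); // limit on establishing the connection
        conn.setReadTimeout(READ_TIMEOUT_MS);       // limit on each blocking read
        return conn;
    }

    /** Only 2xx responses are worth parsing; 3xx, 4xx, and 5xx are skipped. */
    static boolean shouldCrawl(int statusCode) {
        return statusCode >= 200 && statusCode < 300;
    }

    public static void main(String[] args) throws IOException {
        HttpURLConnection conn = open(args[0]);
        try {
            int status = conn.getResponseCode();
            if (!shouldCrawl(status)) {
                System.out.println("Skipping " + args[0] + ", status " + status);
                return;
            }
            BufferedReader reader = new BufferedReader(
                    new InputStreamReader(conn.getInputStream()));
            try {
                String line;
                // A stalled server now causes a SocketTimeoutException here
                // instead of blocking the thread indefinitely.
                while ((line = reader.readLine()) != null) {
                    System.out.println(line);
                }
            } finally {
                reader.close();
            }
        } finally {
            conn.disconnect();
        }
    }
}
```

With a read timeout in place, a server that stops responding surfaces as an exception that the existing logging can record with the time and URL, and the thread can move on to the next URL.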
