
Thread: Word Frequency in News Articles

  1. #1
    Junior Member mblem22's Avatar
    Join Date
    Jun 2012
    Thanked 0 Times in 0 Posts

    Word Frequency in News Articles

    Hello everyone. I was wondering if anyone could point me in the right direction. I am trying to write a Java method that takes two arguments: a string, 'searchText', and an array of strings, 'keyWords'.

    It will then open a URL connection to Google News or some other news website and perform a search using the given 'searchText' string. Then it will open the first 50 news articles returned and count the number of times each of the 'keyWords' strings shows up in each article.
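
    The counting part of the method described above might look roughly like the sketch below. It assumes the article text is already available as a plain string; the class name KeywordCounter and the choice of case-insensitive, whole-word matching are illustrative assumptions, not requirements stated in the post.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name and matching rules are assumptions.
class KeywordCounter {

    // Counts case-insensitive, whole-word occurrences of each keyword in the text.
    // Results keep the order of the keyWords array.
    static Map<String, Integer> countKeywords(String articleText, String[] keyWords) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String keyWord : keyWords) {
            // Pattern.quote makes the keyword literal, so characters like '.' are not treated as regex.
            Pattern pattern = Pattern.compile(
                    "\\b" + Pattern.quote(keyWord) + "\\b",
                    Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
            Matcher matcher = pattern.matcher(articleText);
            int occurrences = 0;
            while (matcher.find()) {
                occurrences++;
            }
            counts.put(keyWord, occurrences);
        }
        return counts;
    }

    public static void main(String[] args) {
        String text = "Java is popular. Many developers learn java first; "
                + "JavaScript is a different language.";
        // Prints {Java=2, language=1, python=0}
        System.out.println(countKeywords(text, new String[] {"Java", "language", "python"}));
    }
}
```

    Whole-word matching keeps "Java" from being counted inside "JavaScript". The \b boundary only behaves as expected for keywords that begin and end with a letter or digit, so keywords such as "C++" would need different handling.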

    I am fairly experienced with Java programming for local applications, but I have never tried to do anything extensive with web access.

    Basically, I feel capable of opening a URL connection to the website, but I'm not sure how to:

    1. Execute the search
    2. Open/download the contents of each returned web page
    3. Isolate the main body of the article from the banners, sidebars, comments, etc.
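
    A rough sketch of the three steps above, using only the standard library (Java 11 or later for java.net.http). Everything site-specific here is an assumption: the class name ArticleFetcher, the base URL passed in by the caller, and the "q" query parameter name all depend on the news site actually used. The markup stripping is deliberately crude; it removes tags but does not separate the article body from sidebars or comments, which usually calls for a real HTML parser (jsoup is one commonly used option) plus site-specific selectors.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; names and URL layout are assumptions.
class ArticleFetcher {

    // Step 1: build the search URL. The "q" parameter name is an assumption
    // and must be checked against the site being queried.
    static String buildSearchUrl(String baseUrl, String searchText) {
        return baseUrl + "?q=" + URLEncoder.encode(searchText, StandardCharsets.UTF_8);
    }

    // Step 2: download the raw HTML of a page (search results or an article).
    static String download(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    // Step 3 (crude): drop script/style blocks, then all remaining tags,
    // then collapse whitespace. HTML entities such as &amp; are left as-is.
    static String stripMarkup(String html) {
        String withoutScripts = html.replaceAll("(?is)<(script|style)\\b.*?</\\1>", " ");
        String withoutTags = withoutScripts.replaceAll("(?s)<[^>]*>", " ");
        return withoutTags.replaceAll("\\s+", " ").trim();
    }
}
```

    For example, buildSearchUrl("https://news.example.com/search", "word frequency") yields https://news.example.com/search?q=word+frequency, where news.example.com is a placeholder host. Extracting the article links from the downloaded results page is not shown, because it depends entirely on that page's markup.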

    Once I have the raw content from the articles, it should be a breeze.

    Can anyone point me in the right direction or give me some pointers? I would really appreciate it.

  2. #2
    Administrator copeg's Avatar
    Join Date
    Oct 2009
    Thanked 827 Times in 770 Posts

    Re: Word Frequency in News Articles

    This thread has been cross posted here:


    Although cross posting is allowed, for everyone's benefit, please read:

    Java Programming Forums Cross Posting Rules

    The Problems With Cross Posting
