Thread: Word Frequency in News Articles

  1. #1
    mblem22 (Junior Member, joined Jun 2012)

    Hello everyone. I was wondering if anyone could point me in the right direction. I am trying to write a Java method that takes two arguments: a string, 'searchText', and an array of strings, 'keyWords'.

    It will then open a URL connection to Google News or some other news website and run a search using the given 'searchText' string. Then it will open the first 50 news articles returned and count the number of times each of the 'keyWords' strings shows up in each article.

    I am fairly experienced with Java programming for local applications, but I have never tried to do anything extensive with web access.

    Basically, I feel capable of opening a URL connection to the website, but I'm not sure how to:

    1. Execute the search
    2. Open/download the contents of each returned web page
    3. Isolate the main body of the article from the banners, sidebars, comments, etc.
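
For reference, the three steps above could be sketched roughly as follows. This is only a minimal sketch under several assumptions: the class name `NewsFetcher`, the helper names, and the `news.example.com` search URL are all hypothetical placeholders (a real site's search URL and query parameter will differ), `java.net.http.HttpClient` needs Java 11 or later, and the tag stripping in step 3 is a crude regex stand-in that does not truly separate the article body from banners or sidebars. A proper HTML parser would likely do that part better.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

class NewsFetcher {
    // Hypothetical search endpoint; substitute the real site's search URL.
    static final String SEARCH_BASE = "https://news.example.com/search?q=";

    // Step 1: build the search URL, encoding the query text for use in a URL.
    static String buildSearchUrl(String searchText) {
        return SEARCH_BASE + URLEncoder.encode(searchText, StandardCharsets.UTF_8);
    }

    // Step 2: download the page at the given URL and return its body as text.
    static String fetch(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    // Step 3 (crude): drop script/style blocks, then all remaining tags,
    // then collapse whitespace. This keeps sidebar and banner text too.
    static String stripTags(String html) {
        return html.replaceAll("(?is)<(script|style)[^>]*>.*?</\\1>", " ")
                   .replaceAll("(?s)<[^>]+>", " ")
                   .replaceAll("\\s+", " ")
                   .trim();
    }
}
```

Something like `NewsFetcher.stripTags(NewsFetcher.fetch(NewsFetcher.buildSearchUrl(searchText)))` would then give the visible text of the results page. Pulling the article links out of that page depends on the site's markup, so it is left out here.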

    Once I have the raw content from the articles, counting the keywords should be a breeze.
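
The counting step could look something like the sketch below. The class name `KeywordCounter` and the method `countKeywords` are made up for illustration. It assumes matching should be case-insensitive and on whole words only, and that each keyword starts and ends with a letter or digit (the `\b` word boundary would not behave as intended for something like "C++").

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class KeywordCounter {
    // Counts whole-word, case-insensitive occurrences of each keyword
    // in the article text. Results keep the order of the keyWords array.
    static Map<String, Integer> countKeywords(String articleText, String[] keyWords) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String keyWord : keyWords) {
            // Pattern.quote treats the keyword as literal text, not as a regex.
            Pattern pattern = Pattern.compile(
                    "\\b" + Pattern.quote(keyWord) + "\\b",
                    Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
            Matcher matcher = pattern.matcher(articleText);
            int occurrences = 0;
            while (matcher.find()) {
                occurrences++;
            }
            counts.put(keyWord, occurrences);
        }
        return counts;
    }

    public static void main(String[] args) {
        String text = "Java is popular. Many Java developers use java daily; "
                + "JavaScript is different.";
        // prints {Java=3, developers=1, python=0}
        System.out.println(countKeywords(text, new String[] {"Java", "developers", "python"}));
    }
}
```

Note that "JavaScript" is not counted as a hit for "Java" because of the whole-word match. If substring hits should count as well, the `\b` anchors can be dropped.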

    Can anyone point me in the right direction or give me some pointers? I would really appreciate it.


  2. #2
    copeg (Administrator, joined Oct 2009, US)

    This thread has been cross-posted here:

    http://www.java-forums.org/new-java/60973-word-frequency-news-articles.html

    Although cross-posting is allowed, for everyone's benefit, please read:

    Java Programming Forums Cross Posting Rules

    The Problems With Cross Posting

