removing stopwords from a word list
hi,
I have a word list from which I need to remove stopwords. The stopword list is stored in an array a. I read the array and check each stopword against my word list. My word list contains elements as follows:
it
Belgium
's IMEC
a threeyear collaboration agreement
all
advanced metallization process
I want to remove words like "it" and "all" from this list, but my code does not recognise the standalone entries "it" and "all". It only recognises the "all" inside "advanced metallization process". I only want to remove the pronouns that make up a whole entry in the list. Here is my code, please help.
Code Java:
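// words1 is the word list, a is the String[] of stopwords
for (Iterator f = words1.iterator(); f.hasNext();) {
    String anew = f.next().toString();
    System.out.println(anew);
    for (int iFilt = 0; iFilt < a.length; iFilt++) {
        System.out.println("checking for " + a[iFilt]);
        // true when the stopword occurs anywhere inside the entry
        if (anew.indexOf(a[iFilt]) != -1) {
            System.out.println(a[iFilt] + "" + "found!" + "in" + anew);
            // only changes the local copy; words1 itself is not modified
            anew = anew.replace(a[iFilt], "");
        }
    }
}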
Re: removing stopwords from a word list
Hello jessie,
Welcome to the Java Programming Forums :)
I have moved this thread to File I/O & Other I/O Streams.
There are several things you could try. How about splitting the line String into individual words? You could use space as a delimiter.
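As a rough sketch of that idea, and only under some assumptions: I am guessing words1 is a List of Strings and a is a String[] of stopwords. The class and method names below (StopwordFilter, removeWholeEntries, removeStopwordTokens) are made up for illustration. The first method compares each entry as a whole against the stopwords, so "all" is removed but "advanced metallization process" is left alone. The second splits an entry on whitespace and drops only the tokens that are stopwords, in case you also want to clean up the phrases.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper class, not part of the original post.
class StopwordFilter {

    // Builds a lower-cased, trimmed lookup set from the stopword array.
    static Set<String> toSet(String[] stopwords) {
        Set<String> stop = new HashSet<>();
        for (String s : stopwords) {
            stop.add(s.trim().toLowerCase(Locale.ROOT));
        }
        return stop;
    }

    // Returns a copy of the list without the entries that are, as a whole,
    // equal to a stopword (ignoring case and surrounding whitespace).
    static List<String> removeWholeEntries(List<String> words, String[] stopwords) {
        Set<String> stop = toSet(stopwords);
        List<String> result = new ArrayList<>(words);
        for (Iterator<String> it = result.iterator(); it.hasNext();) {
            String entry = it.next().trim().toLowerCase(Locale.ROOT);
            if (stop.contains(entry)) {
                it.remove(); // safe way to delete while iterating
            }
        }
        return result;
    }

    // Splits one entry on whitespace and keeps only the non-stopword tokens.
    static String removeStopwordTokens(String entry, String[] stopwords) {
        Set<String> stop = toSet(stopwords);
        StringBuilder kept = new StringBuilder();
        for (String token : entry.trim().split("\\s+")) {
            if (token.isEmpty() || stop.contains(token.toLowerCase(Locale.ROOT))) {
                continue;
            }
            if (kept.length() > 0) {
                kept.append(' ');
            }
            kept.append(token);
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        String[] a = {"it", "all", "a"}; // sample stopwords, assumed
        List<String> words1 = Arrays.asList(
                "it", "Belgium", "'s IMEC",
                "a threeyear collaboration agreement",
                "all", "advanced metallization process");

        System.out.println(removeWholeEntries(words1, a));
        System.out.println(removeStopwordTokens("a threeyear collaboration agreement", a));
    }
}
```

The key difference from indexOf is that a set lookup on the whole (trimmed) entry can never match a stopword that merely appears inside a longer word, such as the "all" in "metallization". If your stopwords are read from a file, the trim() calls may also matter, since stray spaces or line endings would stop an exact comparison from matching.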