using Scanner for 75mb file
Hello,
I am trying to parse a 75 MB .list/.txt file, printing it to the screen first and eventually loading it into a database once it works.
I am using Scanner, and it stops reading after 9-10 lines. The application then finishes normally, with no crash and no error. The file I am trying to read is the 100,000-line IMDB rating.list file.
Do I need to do some memory management or something?
Thanks in advance.
Part of my code:
Code :
...
try {
    scanner.findWithinHorizon("Title", 0);
    while (scanner.hasNextLine()) {
        String nextLine = scanner.nextLine();
        System.out.println(nextLine);
        if (!isValid(nextLine)) {
            continue;
        }
        processLine(nextLine);
    }
} catch (Exception e) {
    e.printStackTrace();
} finally {
    scanner.close();
}
...
// For each line I use another scanner
private void processLine(String aLine) {
    lineScanner = new Scanner(aLine);
    lineScanner.useDelimiter("\\s{2,}");
    // process the line...
}
Re: using Scanner for 75mb file
For such a long file, I'd recommend NOT displaying everything you parse to the screen.
For example, try this:
Code :
for (int i = 0; i < 1000000; i++)
{
    for (int j = 0; j < 1000000; j++)
    {
        System.out.println(i*j);
    }
}
vs. this:
Code :
for (int i = 0; i < 1000000; i++)
{
    for (int j = 0; j < 1000000; j++)
    {
        // Same computation, but nothing is printed.
        int product = i*j;
    }
}
The second version will finish many times faster because printing to the screen is extremely slow.
To test whether your algorithm is working, I'd recommend taking the first 10 lines or so of your file and testing with those (with the screen output in place). If that works, remove the screen output code and process the larger file.
If your computer was made in the 2000s or later, it probably has roughly 512 MB to 3 GB of memory, which is plenty to deal with your file (I once tried to allocate an array of size 2000000 and it succeeded).
Re: using Scanner for 75mb file
Depending on whether and how you are reading the data into memory, you may also need to raise the maximum JVM heap size (although, based on your description, this may not be the problem: you should see an OutOfMemoryError). Just add something like -Xmx512m or -Xmx1g on the command line to safeguard against running out of memory.
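To check which heap limit the JVM is actually running with, something like the following minimal sketch may help. The class name HeapInfo and the helper maxHeapMiB are made up for illustration; Runtime.maxMemory() is the standard library call.

```java
class HeapInfo {
    /** Returns the maximum amount of memory the JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Compare the value with and without an -Xmx flag on the command line.
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

Running it as `java -Xmx512m HeapInfo` should report a value of roughly 512 MiB; the exact figure can be slightly lower depending on the JVM.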
Re: using Scanner for 75mb file
Thank you for your quick replies. I'll try to explain my problem further.
When I run the code on testfile.txt, it reads all of its 90 lines.
Movie title : The Shawshank Redemption
Movie title : The Godfather
Movie title : The Godfather: Part II
Movie title : Il buono, il brutto, il cattivo.
Movie title : Pulp Fiction
Movie title : Schindler's List
...
Movie title : Who Made the Potatoe Salad?
Movie title : Who Makes Movies?
BUILD SUCCESSFUL (total time: 0 seconds)
If I try it with the IMDB file (100,000 lines), it stops reading after 10 lines.
...
0000000124 335002 8.7 Fight Club (1999)
0000000124 63810 8.7 C'era una volta il West (1968)
BUILD SUCCESSFUL (total time: 0 seconds)
...
And on some runs it actually stops reading in the middle of a long line.
...
0000000124 335002 8.7 Fight Club (1999)
0000000124 349139 8.7 The Lord of the Rings: The Fellowshi
java.lang.ArrayIndexOutOfBoundsException: 1
// The out-of-bounds exception occurs when it tries to process an incomplete line.
...
I understand the screen printing issue, but I would understand it better if it crashed while trying to print those numerous lines.
I suspect it has something to do with the file being huge. How can I make it crash at least? :)
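One thing that can make a Scanner stop silently: if the underlying reader throws an IOException, Scanner treats it as the end of input, hasNextLine() simply returns false, and the exception is only available through scanner.ioException(). Whether that is what happens with this file is an assumption; the sketch below only shows how to surface such a hidden error. The class ScannerErrorCheck and the method countLines are hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Scanner;

class ScannerErrorCheck {
    /**
     * Reads every line from the source and returns how many lines were read.
     * Throws if the Scanner stopped early because of a swallowed IOException.
     */
    static int countLines(Readable source) {
        int count = 0;
        try (Scanner scanner = new Scanner(source)) {
            while (scanner.hasNextLine()) {
                scanner.nextLine();
                count++;
            }
            // Scanner hides read errors; ask for the last one explicitly.
            IOException hidden = scanner.ioException();
            if (hidden != null) {
                throw new UncheckedIOException(hidden);
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countLines(new StringReader("a\nb\nc\n")));
    }
}
```

With a check like this in place, a read that ends early for an I/O reason fails loudly instead of looking like a normal end of file.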
Re: using Scanner for 75mb file
That last exception, java.lang.ArrayIndexOutOfBoundsException, says a lot, especially if it is not being caught. There is possibly something in your processLine function that is the culprit.
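A defensive version of the per-line parsing could look like the sketch below. It assumes each data line has four fields (distribution, votes, rating, title) separated by two or more whitespace characters, as the "\\s{2,}" delimiter in the original code suggests; the class RatingLineParser, the method parseFields, and the field count are assumptions, not the poster's actual code.

```java
class RatingLineParser {
    // Assumed layout: distribution, votes, rating, title.
    static final int EXPECTED_FIELDS = 4;

    /** Returns the fields of a line, or null if the line is incomplete or malformed. */
    static String[] parseFields(String line) {
        if (line == null) {
            return null;
        }
        // Split on runs of two or more whitespace characters, so titles
        // containing single spaces stay in one piece.
        String[] fields = line.trim().split("\\s{2,}");
        if (fields.length != EXPECTED_FIELDS) {
            return null; // caller can skip or log the line instead of indexing past the end
        }
        return fields;
    }

    public static void main(String[] args) {
        String complete = "0000000124  335002   8.7  Fight Club (1999)";
        String truncated = "0000000124  349139";
        System.out.println(parseFields(complete) != null);
        System.out.println(parseFields(truncated) != null);
    }
}
```

Checking the field count before indexing turns an incomplete line into a skipped record rather than an ArrayIndexOutOfBoundsException.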
Re: using Scanner for 75mb file
So are you reading each line and passing that line to the scanner?
// Json