Um, ya, wanting to do a little more than my program can take
Ok, so I have an Excel Sheet with 38,500 rows of data. Now, only 11,591 of those rows are relevant. The bad news is that I have to go through those 11,591 rows at least 72 times, which amounts to reading 834,552 rows. So I let it run for probably 10-15 minutes, and it eventually froze and crashed (yay!).
Now, to put it into context, the Excel Sheet contains data like this:
Column 1 = The Year for the Data (2010 or 2011 in this case)
Column 2 = The Month of the Data as a month number (where 1 is January); in this case 9 (September 2010) through 7 (July 2011)
Column 3 = A name of some sort (sorry, can't be more specific)
Column 4 = A city name
These are the only relevant columns for my current issue.
Now, what is happening here is: Given a Year, a Month, and a City, I am creating a data structure to store an Object that I created that holds a Name, two Cities, and two numbers. The Name is acquired from Column 3, the 1st City from Column 4, the 2nd City from Column 5, the 1st number from Column 7, and the 2nd number from Column 10. Now, these Objects need to be separated by their Name because they will be treated differently later on, so my data structure is an ArrayList that contains ArrayLists of Objects that all share the same Name.
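For illustration, the grouping described above could be sketched roughly as follows. This is only a minimal sketch, not the actual code: the class names `RowEntry` and `EntryGrouper`, the field names, and the choice of `double` for the two numbers are all assumptions. It uses a Map keyed by Name instead of searching an ArrayList of ArrayLists, and converts back to the nested-list shape at the end.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the Object described above:
// a Name, two Cities, and two numbers (types are assumed).
final class RowEntry {
    final String name;
    final String city1;
    final String city2;
    final double number1;
    final double number2;

    RowEntry(String name, String city1, String city2, double number1, double number2) {
        this.name = name;
        this.city1 = city1;
        this.city2 = city2;
        this.number1 = number1;
        this.number2 = number2;
    }
}

final class EntryGrouper {
    // Groups entries by Name; LinkedHashMap keeps names in first-seen order.
    static Map<String, List<RowEntry>> groupByName(List<RowEntry> entries) {
        Map<String, List<RowEntry>> byName = new LinkedHashMap<>();
        for (RowEntry e : entries) {
            byName.computeIfAbsent(e.name, k -> new ArrayList<>()).add(e);
        }
        return byName;
    }

    // Recovers the ArrayList-of-ArrayLists shape from the map, one inner list per Name.
    static List<List<RowEntry>> asNestedLists(Map<String, List<RowEntry>> byName) {
        return new ArrayList<>(byName.values());
    }
}
```

With a Map, finding the list for a given Name is a single lookup rather than a scan over the outer list.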
My current method of finding each Object is to increment rows until the correct Year is found, then increment rows until the correct Month is found, then increment rows looking for the correct City, creating Objects from each matching row until the row no longer corresponds to the correct Month.
So clearly that isn't working for me. Does anyone else have any ideas as to how to approach this?
Re: Um, ya, wanting to do a little more than my program can take
Why the 72 passes? I missed the reason for that.
Re: Um, ya, wanting to do a little more than my program can take
Well, the 11,591 rows cover a three-month period, so that's about 4,000 rows for each month. Each month's 4,000 rows are ordered by Name and then by City Name. I receive 72 City Names, and I need to search each Name for that City Name to see if I need to include it in my Array. So I need to make 3 passes (for 3 months) of about 4,000 rows, for all 72 City Names.
Re: Um, ya, wanting to do a little more than my program can take
How about putting the 72 City Names in a Set and then for each record you could do one lookup to see if its City Name is in the Set?
One pass looks at all 12K records.
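A minimal sketch of that single-pass idea, assuming the relevant rows have already been read into memory as String arrays laid out as {year, month, name, city}. The class name `CityFilter`, the method name, and that row layout are assumptions made for illustration, not anything from the original program.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class CityFilter {
    // Assumed row layout: row[0] = year, row[1] = month, row[2] = name, row[3] = city.
    private static final int CITY_COLUMN = 3;

    // Visits every row exactly once and keeps those whose city is in the wanted set.
    // With a HashSet, each contains() call is constant time on average,
    // so the 72 city names no longer require 72 separate passes.
    static List<String[]> keepWantedCities(List<String[]> rows, Set<String> wantedCities) {
        List<String[]> kept = new ArrayList<>();
        for (String[] row : rows) {
            if (wantedCities.contains(row[CITY_COLUMN])) {
                kept.add(row);
            }
        }
        return kept;
    }
}
```

The caller would build the Set once, for example a `HashSet<String>` holding the 72 city names, and then call `keepWantedCities` a single time over the roughly 12K rows.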
Re: Um, ya, wanting to do a little more than my program can take
I'm still a bit unclear on your goal (it sounds a bit complex; a short example snippet would define the problem better), however:
Quote:
increment rows until the correct Year is found, then increment rows until the correct Month is found, then increment rows looking for the correct City and creating Objects with that row until the row doesn't correspond with the correct Month anymore
This sentence sounds like some sort of tree data structure might help. Makes searching fast and easy. You could do it through a node/pointer type reference structure, or possibly easier through a few Maps whose values index the keys in another map until you reach your goal.
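As a rough sketch of the nested-Maps variant, assuming each row is held as a String array and is indexed once while reading the sheet: the class name `RowIndex` and its method names are hypothetical, and this is just one way to arrange the maps (Year, then Month, then City).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class RowIndex {
    // year -> month -> city -> rows matching all three keys
    private final Map<Integer, Map<Integer, Map<String, List<String[]>>>> index = new HashMap<>();

    // Files one row under its year, month, and city, creating inner maps as needed.
    void add(int year, int month, String city, String[] row) {
        index.computeIfAbsent(year, y -> new HashMap<>())
             .computeIfAbsent(month, m -> new HashMap<>())
             .computeIfAbsent(city, c -> new ArrayList<>())
             .add(row);
    }

    // Returns the rows for the given keys, or an empty list if nothing was indexed.
    List<String[]> find(int year, int month, String city) {
        return index.getOrDefault(year, Collections.emptyMap())
                    .getOrDefault(month, Collections.emptyMap())
                    .getOrDefault(city, Collections.emptyList());
    }
}
```

After one pass that calls `add` for every relevant row, each Year/Month/City query becomes three map lookups instead of a scan over the rows.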
Re: Um, ya, wanting to do a little more than my program can take
Quote:
Originally Posted by Norm
How about putting the 72 City Names in a Set and then for each record you could do one lookup to see if its City Name is in the Set?
One pass looks at all 12K records.
I made a HUGE rookie mistake and I feel like an idiot now. I forgot to increment the rows...
Regardless, that thought came into my mind after I made this post, and now that you said it I am happy I forgot to increment the rows because I'm going to give that a shot.
It also seems to be overlooking some things with my current setup; I will have to look into that.
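For anyone hitting the same forgotten-increment problem, here is a small sketch of a row scan where the index update lives in the for header, so no branch inside the body can skip it. The class name `RowScan` and the {year, month, name, city} row layout are assumptions for illustration only.

```java
import java.util.List;

final class RowScan {
    // Assumed row layout: row[3] holds the city name.
    private static final int CITY_COLUMN = 3;

    // Counts the rows whose city column equals the given city.
    static int countMatches(List<String[]> rows, String city) {
        int count = 0;
        // r++ runs after every iteration, whether or not the row matched,
        // so the loop always advances and always terminates.
        for (int r = 0; r < rows.size(); r++) {
            if (city.equals(rows.get(r)[CITY_COLUMN])) {
                count++;
            }
        }
        return count;
    }
}
```

A while loop with a manual increment works too, but the increment then has to be reachable on every path through the body.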