Thread: Externally sorting ints as fast as possible

  1. #1
    Junior Member
    Join Date
    Sep 2010
    Posts
    2
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Externally sorting ints as fast as possible

    Hi all,

    Just to get it out of the way: I'm a bit of a Java noob, but I have a guide on reading and writing files with RandomAccessFile, BufferedInputStream/BufferedOutputStream and DataInputStream/DataOutputStream.

    I'm trying to write a Java program that sorts any file of ints as fast as possible. The file could be any size, and for each instance of the problem I'm given a second file (the same size as the one I have to sort) to store intermediate results. I think the amount of memory available to the JVM will also be restricted, but how hasn't been specified, so I'll deal with that when I run it against the test harness.

    I was considering writing some kind of external mergesort, but I was wondering whether a counting sort would be possible.

    Here are the problems I don't know how to deal with:

    1. The test harness may give me a file containing both the most negative int and the largest positive int (Integer.MIN_VALUE and Integer.MAX_VALUE). The value range would then be too wide for my algorithm to allocate a "count" array. Is it possible to get around this with some kind of sparse data structure instead of an array, while still preserving the O(n) time complexity of counting sort?

    2. I think this is REALLY unlikely, but worth mentioning: the test harness may give me a file with so many copies of a certain integer that its counter in the "count" array/data structure overflows.

    So, any ideas?
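
    To make the two problems concrete, here is a rough sketch of a sparse count, assuming the number of distinct values (not the value range) is small enough to fit in memory. The class name SparseCountingSort and the in-memory int[] input are only for illustration; a real version would stream the values from the file. Note that a TreeMap costs O(log k) per update for k distinct values, so this is O(n log k) rather than the strict O(n) of an array-based counting sort.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration: counting sort with a sorted map instead of a huge array.
class SparseCountingSort {
    /** Counts occurrences per distinct value. A long counter only overflows past 2^63 - 1 occurrences. */
    static TreeMap<Integer, Long> count(int[] values) {
        TreeMap<Integer, Long> counts = new TreeMap<>();
        for (int v : values) {
            counts.merge(v, 1L, Long::sum); // insert 1 or add 1 to the existing count
        }
        return counts;
    }

    /** Writes each value back out as many times as it was counted, in ascending key order. */
    static int[] expand(TreeMap<Integer, Long> counts, int total) {
        int[] out = new int[total];
        int i = 0;
        for (Map.Entry<Integer, Long> e : counts.entrySet()) {
            for (long c = e.getValue(); c > 0; c--) {
                out[i++] = e.getKey();
            }
        }
        return out;
    }
}
```

    Only keys that actually occur take up memory, so Integer.MIN_VALUE and Integer.MAX_VALUE in the same file are not a problem by themselves. If nearly every int in the file is distinct, though, the map ends up no smaller than the data.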


  2. #2
    Administrator copeg's Avatar
    Join Date
    Oct 2009
    Location
    US
    Posts
    5,320
    Thanks
    181
    Thanked 833 Times in 772 Posts
    Blog Entries
    5

    Re: Externally sorting ints as fast as possible

    Rather than write your own, you can use the Collections framework that Java provides. First, store the integers in a List, then use Collections to sort the list (Collections.sort uses a modified mergesort with guaranteed n*log(n) performance).

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;

    String[] line = //...readfile
    List<Integer> ints = new ArrayList<Integer>();
    for (String s : line) {
        // No error handling: this expects each line to hold a single, parsable integer
        ints.add(Integer.parseInt(s));
    }
    Collections.sort(ints); // sort the list in place

  3. #3
    Junior Member
    Join Date
    Sep 2010
    Posts
    2
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Externally sorting ints as fast as possible

    Thanks for the quick and sensible reply. However, I have to write the sorting part myself. Also, there are too many ints to put in a list.

  4. #4
    Super Moderator Norm's Avatar
    Join Date
    May 2010
    Location
    Eastern Florida
    Posts
    25,042
    Thanks
    63
    Thanked 2,708 Times in 2,658 Posts

    Re: Externally sorting ints as fast as possible

    "there are too many ints to put in a list."
    That makes it slightly harder.
    You can define a large array and fill it. If it is going to overflow, create a larger array, copy the first to the new one and continue. At some point, though, there will be two very large arrays in memory at once while you copy the contents of the smaller one into the larger one. I don't know how to solve this.

    Alternatively, leave some of the ints on disk and read only part of them into memory at a time: read a subsection of the ints, sort it and write it out to a to-be-merged file. After all the ints have been read, sorted and written to these to-be-merged files, read the files and merge them into the new sorted output file.
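
    The read-sort-write-merge idea above can be sketched roughly as follows. This is only an illustration under several assumptions: the file holds binary ints as written by DataOutputStream.writeInt, the class name ExternalIntSort and the chunkSize parameter are made up, Arrays.sort stands in for the hand-written sort the assignment requires, and temporary files are used for the runs instead of the single scratch file described in the first post.

```java
import java.io.*;
import java.util.*;

// Hypothetical sketch of an external merge sort over a file of binary ints.
class ExternalIntSort {
    static DataOutputStream open(File f) throws IOException {
        return new DataOutputStream(new BufferedOutputStream(new FileOutputStream(f)));
    }

    /** Splits the input into sorted runs of at most chunkSize ints, then merges the runs. */
    static void sort(File in, File out, int chunkSize) throws IOException {
        List<File> runs = new ArrayList<>();
        try (DataInputStream din = new DataInputStream(
                new BufferedInputStream(new FileInputStream(in)))) {
            int[] buf = new int[chunkSize];
            while (true) {
                int n = 0;
                try {
                    while (n < chunkSize) { buf[n] = din.readInt(); n++; }
                } catch (EOFException eof) { /* end of input: n holds what was read */ }
                if (n == 0) break;
                Arrays.sort(buf, 0, n); // stand-in for your own in-memory sort
                File run = File.createTempFile("run", ".bin");
                run.deleteOnExit();
                try (DataOutputStream dout = open(run)) {
                    for (int i = 0; i < n; i++) dout.writeInt(buf[i]);
                }
                runs.add(run);
                if (n < chunkSize) break; // short chunk means the input is exhausted
            }
        }
        merge(runs, out);
        for (File run : runs) run.delete();
    }

    /** K-way merge: the heap always holds the next unread value of each non-empty run. */
    static void merge(List<File> runs, File out) throws IOException {
        // Heap entries are {value, runIndex}, ordered by value.
        PriorityQueue<int[]> heap = new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        List<DataInputStream> ins = new ArrayList<>();
        try (DataOutputStream dout = open(out)) {
            for (File f : runs) {
                DataInputStream din = new DataInputStream(
                        new BufferedInputStream(new FileInputStream(f)));
                ins.add(din);
                heap.add(new int[] { din.readInt(), ins.size() - 1 }); // runs are never empty
            }
            while (!heap.isEmpty()) {
                int[] e = heap.poll();
                dout.writeInt(e[0]);
                try {
                    e[0] = ins.get(e[1]).readInt(); // refill from the same run
                    heap.add(e);
                } catch (EOFException eof) { /* this run is exhausted */ }
            }
        } finally {
            for (DataInputStream din : ins) din.close();
        }
    }
}
```

    Only chunkSize ints plus one buffered stream per run are in memory at any time, so chunkSize is the knob to turn once the JVM memory limit is known. With a very large number of runs you would probably merge in several passes rather than open every run at once.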

  5. #5
    Administrator copeg's Avatar
    Join Date
    Oct 2009
    Location
    US
    Posts
    5,320
    Thanks
    181
    Thanked 833 Times in 772 Posts
    Blog Entries
    5

    Re: Externally sorting ints as fast as possible

    By "too many ints to put in a list", do you mean that you will end up getting a stack overflow or out-of-memory error?

    It sounds like you may want to look into creating a binary tree structure whose nodes hold values (or ranges of values) as well as point to files. Here, your tree structure will not only help you sort the values, but also tell you which file to place each integer into based upon its value. It will result in several files, which can then be merged by traversing the tree in a particular order and, if needed (for example, when each file contains a range of values), each one sorted prior to merging using a quicksort or mergesort.
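
    As a rough illustration of the range-to-file idea, the sketch below uses java.util.TreeMap (a red-black tree, so a balanced binary search tree) keyed by the lower bound of each value range. It is a sketch under assumptions, not a finished solution: the class name RangeBucketSort and the buckets parameter are made up, the input is assumed to be binary ints as written by DataOutputStream.writeInt, each bucket file is assumed to fit in memory, and Arrays.sort stands in for your own quicksort or mergesort.

```java
import java.io.*;
import java.util.*;

// Hypothetical sketch: route each int to a per-range file via a tree, then sort file by file.
class RangeBucketSort {
    static void sort(File in, File out, int buckets) throws IOException {
        // Values per bucket, computed in long because the int range spans 2^32 values.
        long width = ((1L << 32) + buckets - 1) / buckets;
        TreeMap<Integer, File> files = new TreeMap<>();
        TreeMap<Integer, DataOutputStream> outs = new TreeMap<>();
        for (long lo = Integer.MIN_VALUE; lo <= Integer.MAX_VALUE; lo += width) {
            File f = File.createTempFile("bucket", ".bin");
            f.deleteOnExit();
            files.put((int) lo, f);
            outs.put((int) lo, new DataOutputStream(
                    new BufferedOutputStream(new FileOutputStream(f))));
        }
        // Distribution pass: floorEntry finds the bucket whose lower bound is <= v.
        try (DataInputStream din = new DataInputStream(
                new BufferedInputStream(new FileInputStream(in)))) {
            while (true) {
                int v = din.readInt();
                outs.floorEntry(v).getValue().writeInt(v);
            }
        } catch (EOFException eof) {
            // end of input
        } finally {
            for (DataOutputStream d : outs.values()) d.close();
        }
        // In-order traversal of the tree visits the ranges from lowest to highest,
        // so sorting each bucket and appending it yields a fully sorted output.
        try (DataOutputStream dout = new DataOutputStream(
                new BufferedOutputStream(new FileOutputStream(out)))) {
            for (File f : files.values()) {
                int[] a = new int[(int) (f.length() / 4)];
                try (DataInputStream din = new DataInputStream(
                        new BufferedInputStream(new FileInputStream(f)))) {
                    for (int i = 0; i < a.length; i++) a[i] = din.readInt();
                }
                Arrays.sort(a); // stand-in for a hand-written quicksort or mergesort
                for (int v : a) dout.writeInt(v);
                f.delete();
            }
        }
    }
}
```

    Fixed-width ranges only work well when the values are spread fairly evenly. If most ints fall into one range, that bucket would need to be split further or sorted externally.
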
    Last edited by copeg; September 21st, 2010 at 03:02 PM.
