Thread: String Comparisons, HashSet, Duplicates

  1. #1
    Junior Member
    Join Date
    Apr 2014
    Location
    Louisville, KY
    Posts
    3
    Thanks
    1
    Thanked 0 Times in 0 Posts

    Default String Comparisons, HashSet, Duplicates

    I'll be completely honest, I'm 31 and developing a Minecraft plugin, but this isn't a Bukkit question, it's a Java question. Anyway, I've read a couple of other posts on this same general subject, but they haven't really helped with my issue.

    OK, I have a HashSet, which I created to prevent duplicates upon output, but of course it's printing duplicates (or else I wouldn't be posting this). The order of my output does not matter, nor does the order of the input. The data type is String in the format (x + "," + z), where x and z are integers, creating a collection of coordinate sets. So to prevent the output of duplicates, I'm trying to get rid of the duplicates before they are added to the collection. I've tried doing a '.equals()' string comparison but what happens is, since my string is added via one variable, it compares itself to itself and if itself equals itself it won't be added to the collection. I really need to keep this as a comparison of a single variable, because creating a key for each value would be sooo ridiculous for this volume of inputs.

    So, with that being said, I would like to add one copy of the string, discard the duplicates, and do this thousands of times. I can post a snippet of code or give more details if needed, but I would prefer not to (give a snippet of code). Thanks in advance for any help. P.S. It is Easter, for me, so if I don't get right back to you, I'm probably covered in chocolate.

    [EDIT]: Totally meant to post this in the Collections thread, not the 'What's Wrong With My Code?' thread. Sorry, admins! Feel free to move it.
    Last edited by marmalade; April 20th, 2014 at 11:56 AM. Reason: wording
    int i = 5, i++ + ++i pffft yeah right, ......no wait, yeah that's, no, uuuuumm 12!


  2. #2
    Member
    Join Date
    Feb 2014
    Posts
    180
    Thanks
    0
    Thanked 48 Times in 45 Posts

    Default Re: String Comparisons, HashSet, Duplicates

    I'm not sure if I completely understand... Are you saying that you have Strings that look something like x + "," + z that you added to a HashSet, and when you print out the contents of the HashSet there were duplicates? That's quite impossible...

    So to prevent the output of duplicates, I'm trying to get rid of the duplicates before they are added to the collection. I've tried doing a '.equals()' string comparison but what happens is, since my string is added via one variable, it compares itself to itself and if itself equals itself it won't be added to the collection.
    That shouldn't be necessary. When you add elements into a HashSet, if the element already exists in the set, the set will be unchanged. There's no need for you to write any code to check if there are duplicates. The very action of adding an element into a set will take care of duplicates.
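To make the point about add() concrete, here is a minimal sketch, assuming plain coordinate strings like the ones described in the first post. The names HashSetAddDemo and countRejected are made up for this example and do not come from the poster's plugin. Set.add() returns false when an equal element is already present, so no separate duplicate check is needed.

```java
import java.util.HashSet;
import java.util.Set;

class HashSetAddDemo {
    // Builds "x,z" strings, adds each one to the set, and counts how many
    // adds were rejected because an equal string was already present.
    static int countRejected(Set<String> seen, int[][] coords) {
        int rejected = 0;
        for (int[] c : coords) {
            String coordSet = c[0] + "," + c[1];
            if (!seen.add(coordSet)) {
                rejected++; // add() returned false: the set is unchanged
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        Set<String> seen = new HashSet<>();
        int[][] coords = { {12, 11}, {58, -14}, {12, 11}, {12, 11} };
        int rejected = countRejected(seen, coords);
        // prints "2 unique, 2 rejected"
        System.out.println(seen.size() + " unique, " + rejected + " rejected");
    }
}
```

The same String variable can be reused on every iteration; the set compares the string contents, not the variable that holds them.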

    PS: Happy Easter!

  3. #3
    Junior Member
    Join Date
    Apr 2014
    Location
    Louisville, KY
    Posts
    3
    Thanks
    1
    Thanked 0 Times in 0 Posts

    Default Re: String Comparisons, HashSet, Duplicates

    Quote Originally Posted by jashburn View Post
    I'm not sure if I completely understand... Are you saying that you have Strings that look something like x + "," + z that you added to a HashSet, and when you print out the contents of the HashSet there were duplicates? That's quite impossible...



    That shouldn't be necessary. When you add elements into a HashSet, if the element already exists in the set, the set will be unchanged. There's no need for you to write any code to check if there are duplicates. The very action of adding an element into a set will take care of duplicates.

    PS: Happy Easter!
    Well, actually, it's not quite impossible to have a HashSet output duplicates. Yes, a HashSet, by definition, cannot store duplicates. But I assure you it can and it will AND there are threads, in this very forum actually, that address the issue; otherwise, like I said, I wouldn't be here.

    More importantly (back to the issue), what happens is I have a method that grabs an integer for an x-axis coordinate, then grabs an integer for a z-axis coordinate. It then stores both integers in a String, joined by a comma, so that (e.g.) it will display the coordinate location as '12,11' or '58,-14' or whatever the coordinates may be. Let me give you a foundation of what exactly happens:

    Chunks of data are loaded in the game. A chunk is a 16x16x256 volume of blocks. I use a for loop that grabs all the chunks that have been loaded and iterates through each chunk on a 16x16 plane. For each slice of the plane it encounters, I change its biome. OK, so now that I've changed the biome for all loaded chunks, I then grab (from each block that has been processed) the chunk coordinate x and the chunk coordinate z (which are different from a player's location coordinates and can only be obtained using the methods within the API). I take these coords and store them in a string in their desired output format (x,z), and I assign all this to a command that is only available if the player is OP. And then I store them in a HashSet to be pulled upon retrieval by the user.

    So to review, the command changes the biome of all loaded chunks, grabs their coords, and prints them out from a HashSet. The problem is, since the coordinates are stored within the for loop that iterates through BLOCKS, it grabs the location of every block in the chunk whether it's in the same chunk or not. So the HashSet gets stored with hundreds if not thousands of duplicates. It sounds unbelievable, I know. Let me put this into perspective: typically the number of chunks loaded in a normal game, depending on settings, is somewhere between 3 and 15. Take that number times 16x16x256, so (65,536 x number of chunks) is how many strings are being stored in this HashSet.

    Regardless, I think the problem lies within the nature of the data manipulation prior to entering the collection, which is why I would like to nip the problem in the bud before it gets to the collection. I've read that the '.equals()' method can cause problems with HashSets as well regarding duplicates. Just to be clear, my code works and does what it's supposed to do except for the duplicates upon output from the HashSet. It is quite possible.

    Sorry for the long post but here's what would be ideal: Store my coords in the string, store 1 copy of the string in the collection, and if any more copies of that are encountered, then discard. Move on to the next. Keep in mind I'm using 1 variable for all coords.
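The scenario described above can be sketched without any Bukkit calls. This sketch assumes, and the thread does not state this, that a chunk coordinate is the block coordinate floor-divided by 16; ChunkCoordDemo and collectChunkCoords are hypothetical names. It visits every block column in a rectangle and adds the chunk's "x,z" string on every visit, yet the set ends up with one entry per chunk.

```java
import java.util.HashSet;
import java.util.Set;

class ChunkCoordDemo {
    static final int CHUNK_WIDTH = 16; // assumed chunk footprint: 16x16 columns

    // Visits every block column in the rectangle and records the chunk it
    // belongs to. The same String variable is reused on each iteration.
    static Set<String> collectChunkCoords(int minBlockX, int maxBlockX,
                                          int minBlockZ, int maxBlockZ) {
        Set<String> coords = new HashSet<>();
        for (int x = minBlockX; x <= maxBlockX; x++) {
            for (int z = minBlockZ; z <= maxBlockZ; z++) {
                // floorDiv keeps negative block coordinates in the right chunk
                int chunkX = Math.floorDiv(x, CHUNK_WIDTH);
                int chunkZ = Math.floorDiv(z, CHUNK_WIDTH);
                String coordSet = chunkX + "," + chunkZ;
                coords.add(coordSet); // repeated strings are silently ignored
            }
        }
        return coords;
    }

    public static void main(String[] args) {
        // 32 x 16 = 512 block columns spanning two chunks
        Set<String> coords = collectChunkCoords(-16, 15, 0, 15);
        System.out.println(coords.size() + " chunk coords from 512 columns");
    }
}
```

If the printed output still shows repeats in a setup like this, the repeats are probably coming from somewhere other than the set itself, for example a print statement inside the loop or a collection that is not actually a HashSet.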

  4. #4
    Member
    Join Date
    Feb 2014
    Posts
    180
    Thanks
    0
    Thanked 48 Times in 45 Posts

    Default Re: String Comparisons, HashSet, Duplicates

    Sorry, HashSet cannot store duplicates. If it did, it would be a major bug that breaks applications everywhere. Having said that, an object that is a "duplicate" of another implies the objects are "equal", and in this context "equality" is defined by their implementations of the equals() and hashCode() methods: equals() returns true for the pair, and hashCode() returns the same value for both.

    To put this in another way, if object objA is a duplicate or is equal to object objB, then
    1. objA.equals(objB) and objB.equals(objA) are true, and
    2. (objA.hashCode() == objB.hashCode()) is true

    In this case you can only store either objA or objB in a HashSet, and not both.
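A short sketch of the two conditions above, using two hypothetical coordinate classes (RawCoord and ValueCoord are invented for illustration). RawCoord inherits the identity-based equals() and hashCode() from Object, so two instances with the same fields both end up in the set. ValueCoord overrides both methods, and String already does, so only one copy is kept.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class EqualityDemo {
    // No overrides: equality means "same object", not "same field values".
    static final class RawCoord {
        final int x, z;
        RawCoord(int x, int z) { this.x = x; this.z = z; }
    }

    // Value-based equality: equals() and hashCode() agree with each other.
    static final class ValueCoord {
        final int x, z;
        ValueCoord(int x, int z) { this.x = x; this.z = z; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof ValueCoord)) return false;
            ValueCoord other = (ValueCoord) o;
            return x == other.x && z == other.z;
        }

        @Override
        public int hashCode() { return Objects.hash(x, z); }
    }

    static int rawSetSize() {
        Set<RawCoord> set = new HashSet<>();
        set.add(new RawCoord(12, 11));
        set.add(new RawCoord(12, 11));
        return set.size(); // 2: the set sees two unrelated objects
    }

    static int valueSetSize() {
        Set<ValueCoord> set = new HashSet<>();
        set.add(new ValueCoord(12, 11));
        set.add(new ValueCoord(12, 11));
        return set.size(); // 1: the second add is rejected
    }

    static int stringSetSize() {
        Set<String> set = new HashSet<>();
        set.add(12 + "," + 11);
        set.add(new String("12,11")); // distinct object, equal contents
        return set.size(); // 1: String compares by contents
    }
}
```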

    If you were referring to a post such as http://www.javaprogrammingforums.com...ted-value.html where a HashSet stores duplicate elements, then be aware that it's an example of not properly overriding the equals() and hashCode() methods. See http://www.javaprogrammingforums.com...-use-sets.html for a further write-up on this.

    If the above doesn't help, and/or you've come across a case where duplicated Strings are stored in a HashSet, then I'm afraid I'll need to see some example code to further understand the problem.

  5. #5
    Junior Member
    Join Date
    Apr 2014
    Location
    Louisville, KY
    Posts
    3
    Thanks
    1
    Thanked 0 Times in 0 Posts

    Default Re: String Comparisons, HashSet, Duplicates

    Quote Originally Posted by jashburn View Post
    Sorry, HashSet cannot store duplicates. If it did, it would be a major bug that breaks applications everywhere. Having said that, an object that is a "duplicate" of another implies the objects are "equal", and in this context "equality" is defined by their implementations of the equals() and hashCode() methods: equals() returns true for the pair, and hashCode() returns the same value for both.

    To put this in another way, if object objA is a duplicate or is equal to object objB, then
    1. objA.equals(objB) and objB.equals(objA) are true, and
    2. (objA.hashCode() == objB.hashCode()) is true

    In this case you can only store either objA or objB in a HashSet, and not both.

    If you were referring to a post such as http://www.javaprogrammingforums.com...ted-value.html where a HashSet stores duplicate elements, then be aware that it's an example of not properly overriding the equals() and hashCode() methods. See http://www.javaprogrammingforums.com...-use-sets.html for a further write-up on this.

    If the above doesn't help, and/or you've come across a case where duplicated Strings are stored in a HashSet, then I'm afraid I'll need to see some example code to further understand the problem.
    Well, first of all, thank you for your quick responses. The link, How to Use Sets, was helpful in furthering my understanding of possible problems in the future. However, I solved my own problem by just adding an if statement that checks if the string equals itself and, if it does, creates a new string with 'replaceAll(regex, replacement)', and then I store the new string. The regex is the value of my pre-converted string and the replacement is the pre-converted string variable.

    So the code (small snippet) looked something like this:

    Original:

    int chunkx = chunkCoords.getX(); //get chunk location of said chunk
    int chunkz = chunkCoords.getZ();
    String coordSet = (chunkx + "," + chunkz); //store set of coords as string
    Collection.add(coordSet);
    Collection.add(coordSet);
    count++;
    }

    New:

    int chunkx = chunkCoords.getX(); //get chunk location of said chunk
    int chunkz = chunkCoords.getZ();
    String cx = new Integer(chunkx).toString();
    String cz = new Integer(chunkz).toString();
    String coordSet = (cx + "," + cz); //store set of coords as string
    if(coordSet.equals(coordSet)){
    String cxz = coordSet.replaceAll((cx + "," + cz), coordSet);
    Collection.add(cxz);
    count++;
    }
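For readers comparing the two snippets, here is a hedged sketch of what the "New" version appears to compute; ReplaceAllNoOpDemo and its methods are hypothetical names. A string always equals itself, so the if condition is always true, and replacing a coordinate string with itself yields an equal string. The stored value is therefore the same as with a plain concatenation, which suggests the visible difference between the two versions is the removed second add() call. Integer.toString is used here because the Integer(int) constructor is deprecated in newer Java versions.

```java
import java.util.HashSet;
import java.util.Set;

class ReplaceAllNoOpDemo {
    // Mirrors the "New" snippet: self-comparison followed by replaceAll.
    static String viaReplaceAll(int chunkx, int chunkz) {
        String cx = Integer.toString(chunkx);
        String cz = Integer.toString(chunkz);
        String coordSet = cx + "," + cz;
        if (coordSet.equals(coordSet)) { // always true for a non-null string
            // digits, ',' and '-' have no special meaning in this regex,
            // so the pattern matches the whole string exactly once
            return coordSet.replaceAll(cx + "," + cz, coordSet);
        }
        return null; // never reached in practice
    }

    // The straightforward equivalent.
    static String direct(int chunkx, int chunkz) {
        return chunkx + "," + chunkz;
    }

    public static void main(String[] args) {
        Set<String> set = new HashSet<>();
        set.add(viaReplaceAll(58, -14));
        set.add(direct(58, -14));
        // prints "true 1"
        System.out.println(viaReplaceAll(58, -14).equals(direct(58, -14)) + " " + set.size());
    }
}
```

If the collection really is a HashSet, neither version can store the same string twice. If it is a List or another collection that allows duplicates, the doubled add() in the original would explain repeated output.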
