Home arrow static arrow Java Programming [Archive] - Boolean hashCode
Warning: Creating default object from empty value in /www/htdocs/w008deb8/wiki/components/com_staticxt/staticxt.php on line 51
Java Programming [Archive] - Boolean hashCode
This topic has 30 replies on 3 pages.    1 | 2 | 3 | Next »

Posts:5,904
Registered: 04/03/99
Boolean hashCode  
Aug 2, 2004 3:33 AM



 
Why is Boolean's hashCode method defined as returing 1231 (true) and 1237 (false) ?

Are these values arbitrary, or is there deeper magic behind it ?

Dave.
 

Posts:2,830
Registered: 9/1/03
Re: Boolean hashCode  
Aug 2, 2004 3:47 AM (reply 1 of 30)



 
well they are both prime, i think, so they are less likely to intersect
with any other hashcodes ... ? hmm.
 

Posts:319
Registered: 11/8/00
Re: Boolean hashCode  
Aug 2, 2004 3:48 AM (reply 2 of 30)



 
They are constant values from the Boolean class

        /**     * Returns a hash code for this <tt>Boolean</tt> object.     *     * @return  the integer <tt>1231</tt> if this object represents      * <tt>true</tt>; returns the integer <tt>1237</tt> if this      * object represents <tt>false</tt>.      */    public int hashCode() {	return value ? 1231 : 1237;    }
 

Posts:5,904
Registered: 04/03/99
Re: Boolean hashCode  
Aug 2, 2004 3:51 AM (reply 3 of 30)



 

They are constant values from the Boolean class

That's because the API defines hashCode as returning those values. It is a consequence of the definition, not the reason for the definition. Read the question, please.

As to the primes observation - yep, ok, they're primes. So why those primes rather than 2 and 3 (for instance). Does being larger primes help in this respect ?

Dave.
 

Posts:2,830
Registered: 9/1/03
Re: Boolean hashCode  
Aug 2, 2004 4:06 AM (reply 4 of 30)



 
Does being larger primes help in this respect ?

Dunno... infact now that you mention it, it would seem that a smaller-sized
prime would be more effective at hiding from collisions.
 

Posts:5,965
Registered: 5/17/03
Re: Boolean hashCode  
Aug 2, 2004 4:58 AM (reply 5 of 30)



 
So why those primes rather than 2 and 3 (for
instance). Does being larger primes help in this
respect ?

I think the most rational hashCode for true and false would be 0 and 1. This would make it impossible for Boolean to collide if it were to be used as key in a hash table. But a hash table really isn't necessary because an array with just two elements already covers all possible values.

Say the hash table size for some reason is 6 and the standard modulo hash function is used:

1231 % 6 = 1
1237 % 6 = 1

which means a collision. So never use Boolean as key in a HashMap of size 6 -:)
 

Posts:12,831
Registered: 2/22/00
Re: Boolean hashCode  
Aug 2, 2004 5:06 AM (reply 6 of 30)



 
If a possible hash code were 0, then poorly defined hashCode methods that reference Boolean would have a higher rate of collisions.

I mean suppose some idiot wrote this:
public int hashCode() {  return myBooleanField.hashCode() * 17 * someOtherField.hashCode() * 23;}

But then I guess we can't be held responsible for others' foolishness. At least not that much.
 

Posts:5,904
Registered: 04/03/99
Re: Boolean hashCode  
Aug 2, 2004 5:13 AM (reply 7 of 30)



 
But a hash table really isn't necessary
because an array with just two elements already covers
all possible values.

I think that assumes that you're not putting anything but Boolean objects into your map. There's no reason why you shouldn't be sticking Integer and String objects in as well.

Presumably the hashCodes were selected to ensure that they played well with other objects. But I don't know what the criteria were, or indeed if this is true. Perhaps I should email "Arthur van Hoff" who's the attributed author of Sun's implementation...

Dave.
 

Posts:5,965
Registered: 5/17/03
Re: Boolean hashCode  
Aug 2, 2004 5:46 AM (reply 8 of 30)



 
I think that assumes that you're not putting anything
but Boolean objects into your map. There's no reason
why you shouldn't be sticking Integer and String
objects in as well.

Well a Map is normally used to store key/item pairs where all keys are the same class and all items are the same class. This is for example a pre-requise for the type safety provided by generics in Tiger. But you're right 0 and 1 wasn't that good but 1 and 2 would do the trick I think. No collision is possible.

But you have the situation of course where a Boolean hashCode is used with other hashCodes to produce a composite hashCode. Then maybe two quite large primes would be the right choise. Especially if there are many Booleans involved. Larger primes probably contributes better to avoid clustering than smaller do in that case. I think this is the true reason actually.

Perhaps I should email "Arthur van Hoff" who's the
attributed author of Sun's implementation...

Yes that would be interesting.
 

Posts:5,904
Registered: 04/03/99
Re: Boolean hashCode  
Aug 2, 2004 6:08 AM (reply 9 of 30)



 

Well a Map is normally used to store key/item pairs
where all keys are the same class and all items are
the same class. This is for example a pre-requise for
the type safety provided by generics in Tiger.

Not strictly true. The following is perfectly legal:

Map<Serializable,String> m = new HashMap<Serializable,String>();
m.put(Boolean.FALSE, "Huh ?");
m.put(42,"Hah !");

But
you're right 0 and 1 wasn't that good but 1 and 2
would do the trick I think. No collision is possible.

2 & 3 have the attraction of being primes under the meaning of the act.
Although not being a mathematician, I've always been dubious about the status of the number one as a non-prime.

http://mathworld.wolfram.com/PrimeNumber.html

But you have the situation of course where a Boolean
hashCode is used with other hashCodes to produce a
composite hashCode. Then maybe two quite large primes
would be the right choise. Especially if there are
many Booleans involved. Larger primes probably
contributes better to avoid clustering than smaller do
in that case. I think this is the true reason
actually.

I'm going to have a think about that. The question still remains, but it sounds like the answer may be that "they're arbitrary given certain constraints".

Perhaps I should email "Arthur van Hoff" who's the
attributed author of Sun's implementation...

Yes that would be interesting.

Well, I'll see if anyone drops anything definitively conclusive into this forum, dig out some of his papers and (if that all fails) drop him a line. But he might be rather busy these days...

http://www.strangeberry.com/about/avh-resume.html

Dave.
 

Posts:441
Registered: 2/25/04
Re: Boolean hashCode  
Aug 2, 2004 7:16 AM (reply 10 of 30)



 
Well a Map is normally used to store key/item pairs
where all keys are the same class and all items are
the same class.

Many public APIs use a map to pass a set implementation specific properties, where key and value types are implementation specific. Though in most cases the key will be a string, it's rarely restricted.

This is for example a pre-requise for the type safety provided by generics in Tiger.

Which tells you something about how important type safety is in these cases. Generics may be nice, as they make up a little for the lack of type inference, but IME errors that such type safety stops show at the first unit test, so are negligable in terms of productivity.

Pete

 

Posts:5,904
Registered: 04/03/99
Re: Boolean hashCode  
Aug 2, 2004 7:25 AM (reply 11 of 30)



 
Which tells you something about how important type safety is in these cases. Generics may be nice, as they make up a little for the lack of type inference, but IME errors that such type safety stops show at the first unit test, so are negligable in terms of productivity.

No. You should always have a clear idea what types your container will permit. It may on rare occasions be as general as "Object", but in practice it will usually be much more specific.

In my example, the type was Serializable. Unless I explicitly cast, I am constrained to treat the values as Serializable Object instances. If I need to cast, I am very probably using an inappropriate generic type.

My experience has been that class cast exceptions are very common bugs (not as common as Null Pointer Exceptions, but definitely up there). The collection classes pretty much eliminated fencepost errors from projects I worked on. I anticipate generics eliminating class cast exceptions to the same extent.

The reduction in the length of code required to iterate over a collection is a nice side benefit (and remember that code length is related to bug count).

Dave.
 

Posts:5,904
Registered: 04/03/99
Re: Boolean hashCode  
Aug 2, 2004 7:27 AM (reply 12 of 30)



 
Oh, and I don't buy the argument that Unit Testing eliminates X, therefore features that reduce instances of X are a Waste of Time. Because unit tests are rarely as comprehensive as the authors would like to think. X being a Bad Thing doesn't make a treatment for a cure to X a Good Thing, but it makes it worthy of consideration regardless of unit testing.

Dave.
 

Posts:4,496
Registered: 19/06/02
Re: Boolean hashCode  
Aug 2, 2004 7:38 AM (reply 13 of 30)



 
Why is Boolean's hashCode method defined as returing
1231 (true) and 1237 (false) ?

The question is, why not? These values are documented in the JLS, and I'm inclined to believe that they were randomly chosen. A good hashCode should be a hard-coded, randomly chosen, non-zero, odd, prime number. It should of course also ensure that if two objects are equal, then their hashCode values are equal as well. So 1231 and 1237 look ok to me.
 

Posts:5,904
Registered: 04/03/99
Re: Boolean hashCode  
Aug 2, 2004 7:43 AM (reply 14 of 30)



 

I don't know of a requirement that a hashcode be randomly chosen. Again, got a reference ?

Dave.
 
This topic has 30 replies on 3 pages.    1 | 2 | 3 | Next »