java.util.IdentityHashMap

by Mikhail Vorontsov

IdentityHashMap is the only map in JDK useful for tracking objects by identity (object is identical only to itself) instead of tracking by equality (if object1.equals( object2 ), they are equal). IdentityHashMap is used mostly in cases when you can’t modify tracked objects definition by either adding additional fields or writing equals and hashCode methods.

Let’s see how some of these limitations could be removed:

  • If you can’t add equals and hashCode to the class (or it has different implementations of these methods), it could be solved by Trove maps custom hashing strategies.
  • If there is a primary key field in the object – try to use an ordinary map with this field as a key instead of IdentityHashMap with object itself as a key.

IdentityHashMap calls System.identityHashCode to get object identity hash code – some semi-unique int value identifying each object. System.identityHashCode belongs to a set of Java intrinsic methods, which is a small set of core Java methods, which are directly (or nearly directly) mapped onto single (or just a few) CPU instructions.

You can find the list of Java 8 intrinsic methods in this listing. Look for do_intrinsic macros in the second half of the file.

Non-deterministic order of IdentityHashMap iteration

Unfortunately, a value returned by System.identityHashCode method for the same object in program will be different on nearly every run. It means that the iteration order on IdentityHashMap is not consistent between two runs of the same program. Different iteration orders means some difficulties in writing your unit tests as well as the different program output if it is being generated inside such loop.

Never try to iterate IdentityHashMap if you want predictable results.

You can try to run the following method several times (it does not matter if it will be called in different program runs or repeatedly on the same run). It will print you something different every time. As you can see, it does not matter for IdentityHashMap if its keys are actually implementing equals/hashCode methods.

1
2
3
4
5
6
7
8
private static void testOrder()
{
    final Map<String, String> map = new IdentityHashMap<String, String>();
    for ( int i = 0; i < 10; ++i )
        map.put( "Line #" + i, "Value #" + i );
    for ( final String key : map.keySet() )
        System.out.println( key );
}
private static void testOrder()
{
    final Map<String, String> map = new IdentityHashMap<String, String>();
    for ( int i = 0; i < 10; ++i )
        map.put( "Line #" + i, "Value #" + i );
    for ( final String key : map.keySet() )
        System.out.println( key );
}

IdentityHashMap vs HashMap performance

Let's compare an IdentityHashMap with an ordinary HashMap. We will use String keys, which have one useful property for this test: String caches the calculated hashCode, so for the second and following calls, String.hashCode is very cheap to execute - it reads and returns a single int value.

Object.identityHashCode applies the similar logic to an object header, so by using HashMap and IdentityHashMap with String keys we may approximately compare the identity hash code calculation/access speed with a single field access speed.

In order to demonstrate this, a small test was written. Five million strings were created and stored into an array. After that they are added to the map with their positions in the array as map values. When the map is ready, we check that all keys are present and have correct values. We need an array, because otherwise different strings would be used while calling IdentityHashMap.put and IdentityHashMap.get.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
public static void main( final String... args )
{
    compareMaps( 20000 );    //warmup
    compareMaps( 5000000 );  //test
}
 
private static void compareMaps( final int size ) {
    final String[] ar = new String[ size ];
    for ( int i = 0; i < size; ++i )
        ar[ i ] = "String #" + i;
    
    for ( int i = 0; i < 10; ++i )
    {
        testMap(ar, new HashMap<String, Integer>( size ), "HashMap");
        System.gc();
        testMap(ar, new IdentityHashMap<String, Integer>( size ), "IdentityHashMap");
        System.gc();
    }
}
 
private static void testMap( final String[] ar, final Map<String, Integer> map, final String name )
{
    final long start = System.currentTimeMillis();
    for ( int i = 0 ; i < ar.length; ++i )
        map.put( ar[ i ], i );
    
    int failed = 0;
    for ( int i = 0; i < ar.length; ++i )
    {
        if ( map.get( ar[ i ] ) != i )
            ++failed;
    }
    System.out.println( name + " time = " + ( System.currentTimeMillis() - start ) / 1000. + " sec" );
    if ( map.size() != ar.length )
        System.out.println( "Size check failed" );
    if ( failed != 0 )
        System.out.println( "Test failed");
}
public static void main( final String... args )
{
    compareMaps( 20000 );    //warmup
    compareMaps( 5000000 );  //test
}

private static void compareMaps( final int size ) {
    final String[] ar = new String[ size ];
    for ( int i = 0; i < size; ++i )
        ar[ i ] = "String #" + i;
    
    for ( int i = 0; i < 10; ++i )
    {
        testMap(ar, new HashMap<String, Integer>( size ), "HashMap");
        System.gc();
        testMap(ar, new IdentityHashMap<String, Integer>( size ), "IdentityHashMap");
        System.gc();
    }
}

private static void testMap( final String[] ar, final Map<String, Integer> map, final String name )
{
    final long start = System.currentTimeMillis();
    for ( int i = 0 ; i < ar.length; ++i )
        map.put( ar[ i ], i );
    
    int failed = 0;
    for ( int i = 0; i < ar.length; ++i )
    {
        if ( map.get( ar[ i ] ) != i )
            ++failed;
    }
    System.out.println( name + " time = " + ( System.currentTimeMillis() - start ) / 1000. + " sec" );
    if ( map.size() != ar.length )
        System.out.println( "Size check failed" );
    if ( failed != 0 )
        System.out.println( "Test failed");
}

The first HashMap test took longer than the following tests, because String calculates and caches its hashcode. On the following iterations we will compare String.hash field access for HashMap tests with a Java object header access for IdentityHashMap tests:

HashMap time = 2.487 sec
IdentityHashMap time = 1.218 sec
HashMap time = 0.613 sec
IdentityHashMap time = 0.729 sec
HashMap time = 0.695 sec
IdentityHashMap time = 0.721 sec

As you can see, field access takes about the same time as identity hash code access. By this we have proven that accessing the object identity hash code is a very cheap operation.

This article will not be complete without a warning. Object identity hash code is stored in the union part of the Java header, which means that calculating it may erase some other information. This Oracle article warns that an object for which identity hash code was calculated can not be used for biased locking. It is a very unlikely scenario, but in theory some serialization algorithms could call Object.identityHashCode while traversing an object graph during the serialization.

See also

Java collections overview - an overview of all standard JDK collections.

Summary

  • java.util.IdentityHashMap uses System.identityHashCode to get object identity hash code. Avoid using IdentityHashMap if you either have primary key field in the objects (use them as a key for ordinary HashMap) or use Trove maps custom hashing strategy if you need to add your own equals and hashCode methods, but can't update the objects you are working on.
  • Do not try to iterate IdentityHashMap contents, because iteration order will be different on every run of your program, thus making your program results inconsistent.
  • Accessing the object identity hash code is a very cheap Java intrinsic operation.
  • Beware that an object with the calculated identity hash code can not be used for biased locking. While very rare in normal circumstances, you may end up in this situation if your lock will be accessed by any Java object graph traversal algorithm (serialization, for example).

Leave a Reply

Your email address will not be published. Required fields are marked *