Tag Archives: collections

Trove library: using primitive collections for performance

by Mikhail Vorontsov


19 July 2014: article text cleanup, added a chapter on JDK to Trove migration.
16 July 2012: original version.

This article would describe Trove library, which contains a set of primitive collection implementations. The latest version of Trove (3.1a1 at the time of writing) would be described here.

Why should you use Trove? Why not to keep using well known JDK collections? The answer is performance and memory consumption. Trove doesn’t use any java.lang.Number subclasses internally, so you don’t have to pay for boxing/unboxing each time you want to pass/query a primitive value to/from the collection. Besides, you don’t have to waste memory on the boxed numbers (24 bytes for Long/Double, 16 bytes for smaller types) and reference to them. For example, if you want to store an Integer in JDK map, you need 4 bytes for a reference (or 8 bytes on huge heaps) and 16 bytes for an Integer instance. Trove, on the other hand, uses just 4 bytes to store an int. Trove also doesn’t create Map.Entry for each key-value pair unlike java.util.HashMap, further reducing the map memory footprint. For sets, it doesn’t use a Map internally, just keeping set values.

There are 3 main collection types in Trove: array lists, sets and maps. There are also queues, stacks and linked lists, but they are not so important (and usually instances of these collections tend to be rather small).

Array lists

All array lists are built on top of an array of corresponding data type (for example, int[] for TIntArrayList). There is a small problem which you should deal with: a value for the absent elements (default value). It is zero by default, but you can override it using

1
2
public TIntArrayList(int capacity, int no_entry_value);
public static TIntArrayList wrap(int[] values, int no_entry_value);
public TIntArrayList(int capacity, int no_entry_value);
public static TIntArrayList wrap(int[] values, int no_entry_value);

There are 2 useful methods called getQuick/setQuick – they just access the underlying array without any additional checks. As a side-effect they would allow to access elements between list size and capacity (don’t use it too much – it is still better to add values legally and when getQuick values as long as you are inside array list boundaries.

Each Trove array list has several helper methods which implement the java.util.Collections functionality:

1
2
3
4
5
6
7
8
9
10
11
12
public void reverse()
public void reverse(int from, int to)
public void shuffle(java.util.Random rand)
public void sort()
public void sort(int fromIndex, int toIndex)
public void fill(int val)
public void fill(int fromIndex, int toIndex, int val)
public int binarySearch(int value)
public int binarySearch(int value, int fromIndex, int toIndex)
public int max()
public int min()
public int sum()
public void reverse()
public void reverse(int from, int to)
public void shuffle(java.util.Random rand)
public void sort()
public void sort(int fromIndex, int toIndex)
public void fill(int val)
public void fill(int fromIndex, int toIndex, int val)
public int binarySearch(int value)
public int binarySearch(int value, int fromIndex, int toIndex)
public int max()
public int min()
public int sum()

Continue reading

Core Java 7 Change Log

by Mikhail Vorontsov


01 Jan 2014 update – covered all Java 7 versions up to Java 7u45.

This article will list performance related changes made in the core part of various Java 7 updates. I will track changes in the following packages:

  • java.lang.*
  • java.io
  • java.math
  • java.net
  • java.nio.*
  • java.text.*
  • java.util.*

These changes have one peculiar property – Oracle usually does not include them in the release notes. Probably, they consider them too minor… Anyway, this page will try to help you a little: hopefully, you will not miss updates like a “famous” String.substring change in Java 7u6.

This page will be updated after new Java 7 updates. I will also add a list of changes done before Java 7u45 (give me some time!).

Java 7u45 (compared to Java 7u25)

Most of these changes were made in Java 7u40, but I have unfortunately missed that release.

Shared empty table in ArrayList and HashMap update

File changed: \util\ArrayList.java
File changed: \util\HashMap.java

Two most popular Java collections – ArrayList and HashMap are now consuming less memory if their instances are empty (size=0). The only reason for this update, from my point of view is a critical mass of cases in Oracle performance benchmarks when a map/list is created, but not populated later due to a conditional logic (for example, empty parameter list was provided in the HTTP request).

This update reduces the garbage collector workload by creating less unused garbage.

In case of ArrayList, only objects created via an empty constructor are updated. If you are using a constructor specifying the initial size, there are no changes at all for you.

1
2
3
4
5
6
7
8
9
/**
 * Shared empty array instance used for empty instances.
 */
private static final Object[] EMPTY_ELEMENTDATA = {};
 
public ArrayList() {
    super();
    this.elementData = EMPTY_ELEMENTDATA;
}
/**
 * Shared empty array instance used for empty instances.
 */
private static final Object[] EMPTY_ELEMENTDATA = {};

public ArrayList() {
    super();
    this.elementData = EMPTY_ELEMENTDATA;
}

HashMap got a more thorough update – internal table initialization is now moved away from the constructor. It is now always initialized with an empty table. As a consequence, all getters and setters in the HashMap are now checking if a map is empty. This makes getters slightly faster in case of an empty map – you no longer have to look at the actual table, but at the same time it makes all setters/getters slower in every other case (this is a very tiny slowdown – a zero check of a single int field, but this is still an extra instruction to execute).

I hope that this HashMap change has a solid justification – there are enough always empty maps in the Oracle benchmark applications. In my personal opinion this change is quite questionable – I do not like “taxes” you always have to pay just for a corner case optimization.

1
2
3
4
5
6
7
8
9
/**
 * An empty table instance to share when the table is not inflated.
 */
static final Entry<?,?>[] EMPTY_TABLE = {};
 
/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry<K,V>[] table = (Entry<K,V>[]) EMPTY_TABLE;
/**
 * An empty table instance to share when the table is not inflated.
 */
static final Entry<?,?>[] EMPTY_TABLE = {};

/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry<K,V>[] table = (Entry<K,V>[]) EMPTY_TABLE;

An author of these changes has responded me here.

Continue reading

A few more memory saving techniques in Java

by Mikhail Vorontsov

This article will list a few more ideas on cheap memory saving techniques in Java. This article follows An overview of memory saving techniques in Java, Memory consumption of popular Java data types – part 1 and Memory consumption of popular Java data types – part 2 articles earlier published in this blog. I strongly recommend you to read An overview of memory saving techniques in Java before reading this article. This article assumes that size of Object references is equal to 4 bytes.

Static inner classes

Java lacks a normal library tuple support like the one Scala has. This forces Java developers to create a lot of usually inner classes, whose only purpose is to compose a few elements as a single entity for a collection. One really popular example of such inner class is Pair, which is often implemented as a generic class.

Inner non-static classes have an important property – they store a reference to their parent class. Such reference allows them to seamlessly call parent methods and access parent fields. Such a reference is often not needed for the simple tuple-like classes. You can get rid of it by marking your class as static. This will save you 4 bytes and increase its chances to occupy 8 bytes less (see An overview of memory saving techniques in Java for more details).

You should generally mark all your inner classes as static until there is a need to remove such qualifier.

Continue reading

Memory consumption of popular Java data types – part 1

by Mikhail Vorontsov

This article will give you an overview of some popular Java data types memory consumption. This article follows An overview of memory saving techniques in Java article earlier published in this blog. I strongly recommend you to read the previous article before reading this one.

Enum, EnumMap, EnumSet

Enums were added to Java 1.5. Technically speaking, each enum is an Object and has all memory consumption properties of objects described in the previous article. Luckily, they have several useful properties:

All enum objects are singletons. This is guaranteed by Java language – you can create an instance of enum only by its definition. This means that you pay for enum object once and then you only spend 4 bytes per its reference (in this article I will assume that object references occupy 4 bytes). But what is the benefit of enums if they consume as much memory as int variables and 4 times more than byte variables, besides type safety and better support in IDE?

The answer is the ordinal() method implemented by every enum. It is an increasing number starting from 0 and ending at the number of values in the given enum minus 1. Such enum property allows us to use arrays for enum to any other object mapping: a value related to the first enum value will be stored in array[0], second enum will be mapped to array[1] and so on according to Enum.ordinal() method result. By the way, it was a short description of JDK EnumMap class implementation.

If you need to implement a map from Object into enum, you won’t have a tailored JDK implementation at hand. Of course, you can use any JDK Object to Object map, but it wouldn’t be that efficient. The easy way here is to use Trove TObjectByteMap (or TObjectIntMap in rare cases when you have more than 128 values in your enum) and map from Object key into Enum.ordinal() value. You will need a decoding method for getters in order to convert byte into an enum. This method will require 1 byte per entry, which is the best we can do without paying CPU algorithmic penalty (of course, we can use less than 1 byte per enum, if there are less than 128, 64, 32, etc elements in the enum, but it may make your code more complicated for a very little memory gain).

With all this knowledge at hand, you may now realize that EnumSet is implemented similarly to BitSet. There are 2 EnumSet implementations in JDK: RegularEnumSet and JumboEnumSet. The former is used for enums having less than 65 values (it covers 99,9999% of real-world enums), the latter is used for larger enumerations.

RegularEnumSet utilizes the knowledge of the fact that there are less or equal to 64 values in the enum. It allows RegularEnumSet to use a single long to store all “enum present in the set” flags, rather than using long[] (utilized by JumboEnumSet or BitSet). Using a single long instead of long[1] allows to save 4 bytes on long[] reference and 16 bytes on long value (long occupies 8 bytes, long[1] needs 12 bytes for header, 8 bytes for a single long and 4 bytes for alignment).

Continue reading

Java collections overview

by Mikhail Vorontsov

This article will give you an overview of all standard Java collections. We will categorize their distinguishable properties and main use cases. Besides that, we will list all correct ways of transforming your data between various collection types.

Arrays

Array is the only collection type built in Java. It is useful when you know an upper bound on the number of processed elements in advance. java.util.Arrays contains a lot of useful methods for array processing:

  • Arrays.asList – conversion from array to List, which could be passed to other standard collection constructors.
  • Arrays.binarySearch – fast lookup in a sorted array or its subsection.
  • Arrays.copyOf – use this method if you need to expand your array while keeping its contents.
  • Arrays.copyOfRange – if you need to make a copy of the whole array or its subsection.
  • Arrays.deepEquals, Arrays.deepHashCode – versions of Arrays.equals/hashCode supporting nested sub-arrays.
  • Arrays.equals – if you need to compare two arrays for equality, use this method instead of array equals method ( array.equals is not overridden in any array, so it only compares references to arrays, rather than their contents ). This method may be combined with Java 5 boxing and varargs in order to write a simple implementation of your class equals method – just pass all your class fields to Arrays.equals after comparing object types.
  • Arrays.fill – populate the whole array or its subsection with a given value.
  • Arrays.hashCode – useful method for calculating a hashcode of array contents ( array own hashcode method can not be used for this purpose ). This method may be combined with Java 5 boxing and varargs in order to write a simple implementation of your class hashcode method – just pass all your class fields to Arrays.hashcode.
  • Arrays.sort – sort the full array or its subsection using natural ordering. There is also a pair of Arrays.sort methods for sorting an Object[] using a provided Comparator.
  • Arrays.toString – fine-print the array contents.

If you need to copy one part of array (or the whole array) into another already existing array, you need to use System.arraycopy, which copies a given number of elements from a given position in the source array to a given position in the destination array. Generally, this is the fastest way to copy array contents in Java (but in some cases you may need to check if ByteBuffer bulk copy works faster ).

Finally, we need to mention that any Collection could be copied into an array using T[] Collection.toArray( T[] a ) method. The usual pattern of this method call:

1
return coll.toArray( new T[ coll.size() ] );
return coll.toArray( new T[ coll.size() ] );

Such method call allocates a sufficient array for storing the whole collection, so that toArray doesn’t have to allocate sufficiently large array to return.

Single-threaded collections

This part of the article describes non-thread-safe collections. All these collections are stored in the java.util package. Some of these collections were present in Java since Java 1.0 (and are now deprecated), most of them were already present in Java 1.4. Enum collections were added in Java 1.5 with the support of generics in all collection classes. PriorityQueue was also added in Java 1.5. The latest addition to the non-thread-safe collections framework is ArrayDeque, which was added in Java 1.6.

Lists

  • ArrayList – the most useful List implementation. Backed by an array and an int – position of the first not used element in the array. Like all Lists, expands itself when necessary. Has constant element access time. Cheap updates at the tail (constant complexity), expensive at the head (linear complexity) due to ArrayList invariant – all elements start from index = 0 in the underlying array, which means that everything to the right from the update position must be moved to the right for insertions and to the left for removals. CPU-cache friendly collection due to being backed by an array (unfortunately, not too friendly, because contains Objects, which are just pointers to the actual objects).
  • LinkedListDeque implementation – each Node consists of a value, prev and next pointers. It means that element access/updates have linear complexity (due to an optimization, these methods do not traverse more than a half of the list, so the most expensive elements are located in the middle of the list). You need to use ListIterators if you want to try to write fast LinkedList code. If you want a Queue/Deque implementation (you need to access only first and last elements) – consider using ArrayDeque instead.
  • Vector – a prehistoric version of ArrayList with all synchronized methods. Use ArrayList instead.

Continue reading

java.util.IdentityHashMap

by Mikhail Vorontsov

IdentityHashMap is the only map in JDK useful for tracking objects by identity (object is identical only to itself) instead of tracking by equality (if object1.equals( object2 ), they are equal). IdentityHashMap is used mostly in cases when you can’t modify tracked objects definition by either adding additional fields or writing equals and hashCode methods.

Let’s see how some of these limitations could be removed:

  • If you can’t add equals and hashCode to the class (or it has different implementations of these methods), it could be solved by Trove maps custom hashing strategies.
  • If there is a primary key field in the object – try to use an ordinary map with this field as a key instead of IdentityHashMap with object itself as a key.

IdentityHashMap calls System.identityHashCode to get object identity hash code – some semi-unique int value identifying each object. System.identityHashCode belongs to a set of Java intrinsic methods, which is a small set of core Java methods, which are directly (or nearly directly) mapped onto single (or just a few) CPU instructions.

You can find the list of Java 8 intrinsic methods in this listing. Look for do_intrinsic macros in the second half of the file.

Continue reading

hashCode method performance tuning

by Mikhail Vorontsov

In this chapter we will discuss various implications of hashCode method implementation on application performance.

The main purpose of hashCode method is to allow an object to be a key in the hash map or a member of a hash set. In this case an object should also implement equals(Object) method, which is consistent with hashCode implementation:

  • If a.equals(b) then a.hashCode() == b.hashCode()
  • If hashCode() was called twice on the same object, it should return the same result provided that the object was not changed

hashCode from performance point of view

From the performance point of view, the main objective for your hashCode method implementation is to minimize the number of objects sharing the same hash code. All JDK hash based collections store their values in an array. Hash code is used to calculate an initial lookup position in this array. After that equals is used to compare given value with values stored in the internal array. So, if all values have distinct hash codes, this will minimize the possibility of hash collisions. On the other hand, if all values will have the same hash code, hash map (or set) will degrade into a list with operations on it having O(n2) complexity.

For more details, read about collision resolution in hash maps. JDK is using a method called open addressing, but there is another method called “chaining” – all key-value pairs with the same hash code are stored in a linked list.

Continue reading

java.util.LinkedList performance

by Mikhail Vorontsov

We will discuss LinkedList performance in this article. Generally, it is a bad idea to use it in performance-critical code. But, sometimes you may need it…

LinkedList is a list implementation with each node having pointers to previous and next nodes. Such implementation allows to add/remove/update elements fast, but only if:

  • Either they are first/last elements or
  • You have scrolled to required element using ListIterator

In all other cases list modification is O(n) complexity operation. Access to list elements (get) is also O(n) operation (actually, list will scroll either from head or from tail, so it would take no more than size() / 2 operations to access any LinkedList node.

Author is aware of only two cases when you could use a LinkedList:

  1. You are implementing a FIFO buffer and you don’t need to add/remove elements to the middle of the buffer (or rarely need to do it). Adding/removing from LinkedList head/tail is very fast. Nevertheless, consider using java.util.ArrayDeque in this case. It is also optimized for fast head/tail operations (more on this later).
  2. You need to add/remove elements from the middle of the list too often.

Continue reading

java.util.ArrayList performance guide

by Mikhail Vorontsov

We will discuss most of possible ArrayList performance problems in this article. ArrayList methods will be divided into several groups and their performance will be discussed.

ArrayList is a general list implementation suitable for most use cases. It is backed by an Object array, which size is dynamically adjusted while user adds or removes elements from the list. If you need a resizable list of primitive type values, you should read a Trove library article.

Adding elements to the list

There are 2 pairs of methods used to add elements:

  • add(E)
  • add(int, E)
  • addAll(List<E>)
  • addAll(int, List<E>)

Single argument methods are adding new elements to the list tail. They are the fastest ones. Methods specifying insertion position have to copy all array elements to the right from insertion point by System.arraycopy call, that’s why these methods have O(n) complexity (performance penalty is small if new element is added near the tail, but getting larger when insertion point moves towards the head). This means that you shouldn’t add too many elements to the head of big ArrayList. Adding 1M elements to the head of List<String> took 125 seconds on my computer. Adding 0.5M elements took 30 seconds, which proves that n calls of a method with O(n) complexity have O(n2) complexity (adding 2 times more elements took 4 times longer).

If there is not enough space in the array to fit new elements, it should be extended. Extension logic was changed in Java 7: previously new array size was oldSize * 3 / 2 + 1, but it is oldSize * 3 in Java 7.

Continue reading

Use case: how to compact a long-to-long mapping

by Mikhail Vorontsov

In this post we will discuss what techniques could be applied to an integer-to-integer map in order to further reduce memory consumption. Trove collections will be used in this post.

Suppose we have 2 similar message streams. Each stream contains “main” messages, which define new identifiers, and “child” messages, which refer to earlier defined identifiers. Identifiers in both streams are positive long values starting from 1, but there may be some gaps in both identifier sequences.

We need to merge two such streams into one stream, keeping references between “main” and “child” messages. We are allowed to change any identifiers as long as references will not be broken.

Here is an example of how it should work. First id you see is 1 from stream A – you will map it to 1. Next comes 1 from stream B – you will map it to 2. The next one is 4 from stream B (two ids are missing in stream B) – you will map it to 3. It is followed by 2 from stream A – it will be mapped to 4. Now you’ll get 1 from stream A again – you should use previously assigned 1 in this case. It is followed by 2 from stream A – you should use previously assigned 4. And so on…

Continue reading