Tag Archives: datatype optimization

Creating an exception in Java is very slow

by Mikhail Vorontsov


29 June 2014 update – now using JMH for testing. Added 2 ways to avoid the cost of exceptions. Made some editorial changes to the original text to reflect JMH usage.

Filling in the stack trace is slow…

Creating an exception in Java is a very slow operation. Expect that throwing an exception will cost you around 1-5 microseconds. Nearly all this time is spent on filling in the exception thread stack. The deeper the stack trace is, the more time it will take to populate it.

Usually we throw an exception in case of unexpected problems. This means that we don’t expect exceptions to be thrown at a rate of thousands per second per process. But sometimes you will encounter a method which uses exceptions for more likely events. We have already seen a good (actually bad) example of such behavior in Base 64 encoding and decoding performance article: sun.misc.BASE64Decoder is extremely slow due to using an exception for “give me more data” requests:

at java.lang.Throwable.fillInStackTrace(Throwable.java:-1)
at java.lang.Throwable.fillInStackTrace(Throwable.java:782)
- locked <0x6c> (a sun.misc.CEStreamExhausted)
at java.lang.Throwable.<init>(Throwable.java:250)
at java.lang.Exception.<init>(Exception.java:54)
at java.io.IOException.<init>(IOException.java:47)
at sun.misc.CEStreamExhausted.<init>(CEStreamExhausted.java:30)
at sun.misc.BASE64Decoder.decodeAtom(BASE64Decoder.java:117)
at sun.misc.CharacterDecoder.decodeBuffer(CharacterDecoder.java:163)
at sun.misc.CharacterDecoder.decodeBuffer(CharacterDecoder.java:194)

You may encounter the same problem if you try to run pack method from String packing part 2: converting Strings to any other objects with a string starting with a digit, but followed by letters. Let’s see how long does it take to pack ‘12345’ and ‘12345a’ with that method:

Benchmark                        (m_param)   Mode   Samples         Mean   Mean error    Units
t.StringPacking2Tests.testPack      12345a  thrpt        10        0.044        0.000   ops/us
t.StringPacking2Tests.testPack       12345  thrpt        10        7.934        0.154   ops/us

As you can see, we were able to convert “12345” from String about 200 times faster than “12345a”. Most of processing time is again being spent filling in stack traces:

at java.lang.Throwable.fillInStackTrace(Throwable.java:-1)
at java.lang.Throwable.fillInStackTrace(Throwable.java:782)
- locked <0x87> (a java.lang.NumberFormatException)
at java.lang.Throwable.<init>(Throwable.java:265)
at java.lang.Exception.<init>(Exception.java:66)
at java.lang.RuntimeException.<init>(RuntimeException.java:62)
at java.lang.IllegalArgumentException.<init>(IllegalArgumentException.java:53)
at java.lang.NumberFormatException.<init>(NumberFormatException.java:55)
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Long.parseLong(Long.java:441)
at java.lang.Long.valueOf(Long.java:540)
at tests.StringPacking2Tests.pack(StringPacking2Tests.java:69)
...

Continue reading

An overview of memory saving techniques in Java

by Mikhail Vorontsov

This article will give you the general advices on memory consumption optimization in Java.

Memory usage optimization is the most important type of optimization in Java. Current systems are limited by memory access times rather than by CPU frequency (otherwise, why CPU producers are implementing all these L1, L2 and L3 caches?). It means that by reducing your application memory footprint you will most likely improve your program data processing speed by making your CPU to wait for smaller amount of data. Now let’s get back to Java.

General Java memory layout information

First of all, we have to revise the memory layout of Java objects: any Java Object occupies at least 16 bytes, 12 out of which are occupied by a Java object header. Besides that, all Java objects are aligned by 8 bytes boundary. It means that, for example, an object containing 2 fields: int and byte will occupy not 17 (12 + 4 + 1), but 24 bytes (17 aligned by 8 bytes).

Each Object reference occupies 4 bytes if the Java heap is under 32G and XX:+UseCompressedOops is turned on (it is turned on by default in the recent Oracle JVM versions). Otherwise, Object references occupy 8 bytes.

All primitive data types occupy their exact byte size:

byte, boolean 1 byte
short, char 2 bytes
int, float 4 bytes
long, double 8 bytes

In essence, this information is sufficient for Java memory optimization. But it will be more convenient if you will be aware of arrays / String / numeric wrappers memory consumption.

Continue reading

String packing part 2: converting Strings to any other objects

by Mikhail Vorontsov

As we have found out in the previous string packing article, even an empty String will occupy at least 48 bytes. In some cases we may expect that some String variables will contain values which could be easily converted to primitive values. Unfortunately, other values of these variables may not be converted, so you will still need a String.

You may notice it either for single fields of some objects or you may have some collection of strings, like Map<Integer, String> – it doesn’t matter: you may apply this chapter ideas for both cases.

In order to save memory you need to replace Strings with Objects. You need a conversion method, which will be applied to all Strings you are trying to store and return either an original String or any other less memory-hungry Object. When you will need your original string back, all you need is just to call toString() method on your stored Object (probably, supporting null values as well). All other code must access your fields via such set/get methods.

The easiest types of replacement objects to support are Character for strings with length = 1, Integer for shorter numbers (because it has better caching support than other wrapper types), Long for longer numbers and Double for decimal point numbers.

There are still a few easy to forget pitfalls. First of all, all integer number conversion methods, like Integer.valueOf, support strings starting with zeroes, like “01234”. They will skip leading zeroes and convert the rest of the input string (result will be 1234 in the example). Conversion of 1234 back to string will return “1234”, of course. That’s why we can’t use numbers starting with ‘0’ (except zero itself).

Continue reading

hashCode method performance tuning

by Mikhail Vorontsov

In this chapter we will discuss various implications of hashCode method implementation on application performance.

The main purpose of hashCode method is to allow an object to be a key in the hash map or a member of a hash set. In this case an object should also implement equals(Object) method, which is consistent with hashCode implementation:

  • If a.equals(b) then a.hashCode() == b.hashCode()
  • If hashCode() was called twice on the same object, it should return the same result provided that the object was not changed

hashCode from performance point of view

From the performance point of view, the main objective for your hashCode method implementation is to minimize the number of objects sharing the same hash code. All JDK hash based collections store their values in an array. Hash code is used to calculate an initial lookup position in this array. After that equals is used to compare given value with values stored in the internal array. So, if all values have distinct hash codes, this will minimize the possibility of hash collisions. On the other hand, if all values will have the same hash code, hash map (or set) will degrade into a list with operations on it having O(n2) complexity.

For more details, read about collision resolution in hash maps. JDK is using a method called open addressing, but there is another method called “chaining” – all key-value pairs with the same hash code are stored in a linked list.

Continue reading

Use case: compacting price field disk representation

by Mikhail Vorontsov

In this article I’ll describe a small trick you can use while working with prices (at least, in countries with rather expensive currency).

There are 2 well-known ways to store price in memory and on disk:

  • In smallest currency units of your currency, using integral type of your choice
  • Keep it as double and don’t worry about number of smallest currency units in any currency. This approach has advantage if you are writing a system which supports prices in fractions of smallest currency unit (for example, sub-penny prices on a large number of stock exchanges).

Impact on compression

If you will ever need to compress your data as a stream, when integral types have one crucial advantage: they are shorter and most prices use only a small number of bits from a corresponding datatype. Floating point types, on the other hand, have sign, exponent and mantissa at various places of a datatype binary representation (see IEEE-754 standard for details), which generally makes data compression less effective.

Continue reading

Use case: how to compact a long-to-long mapping

by Mikhail Vorontsov

In this post we will discuss what techniques could be applied to an integer-to-integer map in order to further reduce memory consumption. Trove collections will be used in this post.

Suppose we have 2 similar message streams. Each stream contains “main” messages, which define new identifiers, and “child” messages, which refer to earlier defined identifiers. Identifiers in both streams are positive long values starting from 1, but there may be some gaps in both identifier sequences.

We need to merge two such streams into one stream, keeping references between “main” and “child” messages. We are allowed to change any identifiers as long as references will not be broken.

Here is an example of how it should work. First id you see is 1 from stream A – you will map it to 1. Next comes 1 from stream B – you will map it to 2. The next one is 4 from stream B (two ids are missing in stream B) – you will map it to 3. It is followed by 2 from stream A – it will be mapped to 4. Now you’ll get 1 from stream A again – you should use previously assigned 1 in this case. It is followed by 2 from stream A – you should use previously assigned 4. And so on…

Continue reading