Tag Archives: long bitset

Bit sets

by Mikhail Vorontsov

From time to time you may need to map an integer key to one or a few flags (boolean values). Bit sets are the best data structure in such cases.

There is only one bit set implementation in the core Java: java.util.BitSet. Internally it is using a long[], each bit of which maps to one integer: bit 0 in the first long is mapped to key 0, bit 1 – to key 1 and so on. This means that if you want to map keys starting from some high number, like 100,000,000 to booleans, then an unchanged BitSet may not be a best choice, because it will allocate all bits for keys from 0 to highest bit you’ve ever set in your bit set. For a bit set with a single set bit at position 100,000,000 it means allocation of about 12.5M memory.

Problem of high base key can be avoided be adding and subtracting this base key on all bit set operations (but, of course, you’ll need to know this base key in advance). The same should be done if you need to map negative values to booleans: java.util.BitSet does not support negative keys.

The third problem to avoid is mapping of very large values, which are outside of int values range. Ordinary java.util.BitSet supports only int keys.

And the last, but not the least is partitioning problem – if you need to store a few values here and a few values there – you still need to allocate all the memory to store keys from 0 to a highest set bit.

Continue reading