When hash-maps are an order of magnitude too slow
Recently I was writing some performance-critical code: an algorithm that had to run in linear time. Out of habit, I reached for hash-maps to keep the per-element lookups constant-time. The execution speed was not as expected, so I profiled the code. Hash-map lookups were consuming the largest fraction of the run-time, so there was no point in tuning any other part of the code first.
Seeking inspiration in the performance characteristics of Scala's collections, it turned out that hash-map lookups are only eC – "effectively constant" – while other data structures, namely BitSets, offer lookups in C – "(fast) constant" – time. That sparked the idea of experimenting with BitSets and plain arrays for the aforementioned performance-critical lookups.
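For illustration, a minimal BitSet sketch looks like this (the element values here are arbitrary):
import scala.collection.mutable.BitSet
// A BitSet stores membership as raw bits in an Array[Long],
// so a lookup is a word load plus a bit test, with no hashing.
val bits = BitSet(3, 17, 42)
bits.contains(42) // true
bits.contains(7)  // false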
Since there was no apparent performance difference between BitSets and arrays, the following Scala code focuses on arrays due to their broader applicability.
import scala.collection.mutable.HashSet
// Suppose we have sparse indices between zero and one million
val n: Int = 1e6.toInt
// Let's generate some random data to simulate the problem
val random = new java.security.SecureRandom
val b: Array[Boolean] = Array.fill(n) { random.nextBoolean() }
// Via a hash set it can be efficiently checked whether a particular
// index is present, right? So we store our data in a HashSet
val h: HashSet[Int] = new HashSet[Int]()
val r = 0 until n
for (i <- r) {
  if (b(i)) h += i
}
// And here is our naive timing function
def time[A](a: => A): (A, Long) = {
  val now = System.nanoTime
  val result = a
  val micros = (System.nanoTime - now) / 1000
  (result, micros)
}
// Now we want to query a number of arbitrary indices.
// The HashSet goes first
def queryHashSet(): Long = {
  var numHits: Long = 0
  for (i <- r) {
    if (h.contains(i)) numHits += 1
  }
  numHits
}
time(queryHashSet())
// But what if we just used an array instead of the hash set?
def queryArray(): Long = {
  var numHits: Long = 0
  for (i <- r) {
    if (b(i)) numHits += 1
  }
  numHits
}
time(queryArray())
// After some warm-up calls to make the JVM hotspot compiler
// come up with efficient machine code, let's run both
// a hundred times and average the speedup ratio
(1 to 100)
  .map(_ => time(queryHashSet())._2.toDouble / time(queryArray())._2.toDouble)
  .sum / 100
// -> array is approx. 7 times faster
It turned out that the array-based lookups were 7 to 9 times faster. After thinking about it, this result is entirely plausible: fewer CPU cycles are needed to fetch a byte at a known offset than to first compute a hash, locate the bucket, and compare keys.
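To make that concrete, here is a simplified sketch of the two lookup paths. The hashLookup function is illustrative only; it assumes a power-of-two bucket table and is not how scala.collection.mutable.HashSet is actually implemented:
// Illustrative only: roughly the work a hash lookup entails,
// assuming a power-of-two table of buckets (not the real
// HashSet internals).
def hashLookup(table: Array[List[Int]], key: Int): Boolean = {
  val idx = key.## & (table.length - 1) // compute hash, mask to a bucket
  table(idx).contains(key)              // scan the bucket for the key
}
// The array variant is a single bounds-checked memory load:
def arrayLookup(flags: Array[Boolean], key: Int): Boolean = flags(key)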
Limitations
This solution does not fit all cases. Here, the keys had already been transformed into space-efficient, gap-free indices beforehand; without such a mapping an array cannot replace a hash-map. Also, memory was traded for execution speed: on a typical JVM the Array[Boolean] occupies one byte per possible index, whether it is set or not.
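If the keys are not dense to begin with, they first have to be mapped to gap-free indices once, up front. A minimal sketch of that preprocessing, with made-up names (keys, keyToIndex):
// One-time preprocessing: assign each distinct key a dense index.
// Building the mapping costs hash lookups itself, but it happens
// once, while the hot loop afterwards only touches the array.
val keys = Seq(1000003L, 42L, 987654321L)
val keyToIndex: Map[Long, Int] = keys.zipWithIndex.toMap
val flags = new Array[Boolean](keys.size)
flags(keyToIndex(42L)) = true // array write instead of a hash-map put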