Recently had to implement some microbenchmarks for a project. First time I ran them, I thought they had gone stuck. It turned out they weren't, but the processing took forever.

Some smaller benchmarks revealed that runtime was not scaling linearly with the input size as one would expect for the problem. It turned out that code was iterating over the input to find corresponding entries in the output.

We changed the processing so that it creates the output in the same order as the input and just compare entries at the same position. With that we were able to cut runtime from a few hours to a few seconds.

Add Comment