Lately I’ve been testing the main components of OpenStack Infra, namely the Puppet manifests.
I ran into an interesting problem last week where starting Gerrit would work on the first try, then fail on subsequent attempts.
The interesting thing is that if I increased the timeout value of the Gerrit init upstart script to some ludicrously high value (900 seconds), it would eventually start.
I thought it could be due to upstream using a forked Gerrit version, but the git diff showed the differences were minimal.
Since I had been running this Gerrit test on an HP Cloud instance, I tried running it in a Vagrant VM on my rusty but still working home server.
It turned out that there it would start and stop immediately without any problems, so the problem clearly had something to do with running it on a cloud instance.
I shared the problem with my colleagues and one of them said ‘hey, this could be something about entropy’.
Suddenly, something clicked in my mind and I remembered that upstream the Nodepool images have the haveged package baked in. So I did an apt-get install haveged and voilà, Gerrit would start and stop without ANY problems.
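If you suspect the same thing on your own instance, a quick way to check before installing anything is to look at the kernel's entropy estimate. Here is a minimal sketch, assuming a Linux kernel that exposes a meaningful estimate in /proc/sys/kernel/random/entropy_avail. The function names and the threshold of 200 bits are my own illustrative choices, not anything taken from Gerrit or the OpenStack Infra manifests.

```python
from pathlib import Path

# Linux-only: the kernel's estimate of available entropy, in bits.
ENTROPY_PATH = Path("/proc/sys/kernel/random/entropy_avail")


def read_entropy(path=ENTROPY_PATH):
    """Return the entropy estimate stored in the given file as an int."""
    return int(Path(path).read_text().strip())


def is_starved(bits, threshold=200):
    """Return True if the estimate is below the threshold.

    The default threshold is an arbitrary illustrative value (an assumption).
    """
    return bits < threshold


if __name__ == "__main__":
    if ENTROPY_PATH.exists():
        bits = read_entropy()
        state = "low" if is_starved(bits) else "ok"
        print(f"entropy_avail: {bits} bits ({state})")
    else:
        print("entropy_avail not found; this check only applies to Linux")
```

Running it before and after installing haveged should show whether the pool was the likely bottleneck, though I have not verified which numbers Gerrit actually needs.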
P.S. Thanks to my colleague Nicola Heald for putting me on track to resolving this problem. I spent a whole morning doing all sorts of testing and didn’t think about entropy!