When using a binary cache store, the queue runner receives NARs from the build machines, compresses them, and uploads them to the cache. However, keeping multiple large NARs in memory can cause the queue runner to run out of memory. This can happen for instance when it's processing multiple ISO images concurrently.
The fix is to use a TokenServer to prevent the builder threads from storing more than a certain total size of NARs concurrently (at the moment, this is hard-coded at 4 GiB). Builder threads that would cause the limit to be exceeded block until other threads have finished.
The 4 GiB limit does not include certain other allocations, such as for xz compression or for FSAccessor::readFile(). But since these are unlikely to be more than the size of the NARs and hydra.nixos.org has 32 GiB RAM, it should be fine.
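The blocking behaviour described above can be sketched as a counting token server built on a mutex and a condition variable. This is a minimal illustration, not the actual nix::TokenServer: the class name SketchTokenServer and the helper currentUse() are hypothetical, and it uses std::condition_variable where the real code uses Nix's own synchronisation wrappers. It also rejects only requests strictly larger than the maximum, which is an assumption of this sketch.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <stdexcept>

// Hypothetical stand-in for nix::TokenServer; all names are illustrative.
class SketchTokenServer {
    std::mutex m;
    std::condition_variable cv;
    std::size_t inUse = 0;        // tokens currently handed out
    const std::size_t maxTokens;  // total budget, e.g. 4 GiB worth of bytes

public:
    explicit SketchTokenServer(std::size_t maxTokens) : maxTokens(maxTokens) {}

    // RAII handle: owns `tokens` tokens and returns them on destruction.
    class Token {
        SketchTokenServer * ts;
        std::size_t tokens;
        friend class SketchTokenServer;
        Token(SketchTokenServer * ts, std::size_t tokens) : ts(ts), tokens(tokens) {}

    public:
        Token(Token && t) noexcept : ts(t.ts), tokens(t.tokens) { t.ts = nullptr; }
        Token(const Token &) = delete;
        Token & operator=(const Token &) = delete;

        ~Token() {
            if (!ts) return;  // moved-from handle owns nothing
            {
                std::lock_guard<std::mutex> lock(ts->m);
                ts->inUse -= tokens;
            }
            ts->cv.notify_all();  // wake threads waiting for capacity
        }
    };

    // Block until `tokens` tokens are available, then reserve them.
    Token get(std::size_t tokens) {
        if (tokens > maxTokens)
            throw std::invalid_argument("requesting more tokens than exist");
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [&] { return inUse + tokens <= maxTokens; });
        inUse += tokens;
        return Token(this, tokens);
    }

    // Hypothetical helper for inspection; not part of the original patch.
    std::size_t currentUse() {
        std::lock_guard<std::mutex> lock(m);
        return inUse;
    }
};
```

A builder thread would hold the returned Token for as long as the NAR is in memory; when the handle goes out of scope the tokens are returned and any blocked threads re-check whether their request now fits.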
MaintainCount mc(nrStepsCopyingFrom);
/* Query the size of the output paths. */
size_t totalNarSize = 0;
to << cmdQueryPathInfos << outputs;
to.flush();
while (true) {
    if (readString(from) == "") break;
    readString(from); // deriver
    readStrings<PathSet>(from); // references
    readLongLong(from); // download size
    totalNarSize += readLongLong(from);
}

printMsg(lvlDebug, format("copying outputs of ‘%s’ from ‘%s’ (%d bytes)")
    % step->drvPath % machine->sshName % totalNarSize);

/* Block until we have the required amount of memory
   available. FIXME: only need this for binary cache
   destination stores. */
auto resStart = std::chrono::steady_clock::now();
auto memoryReservation(memoryTokens.get(totalNarSize));
auto resStop = std::chrono::steady_clock::now();
result.accessor = destStore->getFSAccessor();
auto resMs = std::chrono::duration_cast<std::chrono::milliseconds>(resStop - resStart).count();
if (resMs >= 1000)
    printMsg(lvlError, format("warning: had to wait %d ms for %d memory tokens for %s")
        % resMs % totalNarSize % step->drvPath);
/* Token server to prevent threads from allocating too many big
   strings concurrently while importing NARs from the build
   machines. When a thread imports a NAR of size N, it will first
   acquire N memory tokens, causing it to block until that many
   tokens are available. */
nix::TokenServer memoryTokens;
   available. Calling get() will return a Token object, representing
   ownership of a token. If no token is available, get() will sleep
   until another thread returns a token. */
   available. Calling get(N) will return a Token object, representing
   ownership of N tokens. If the requested number of tokens is
   unavailable, get() will sleep until another thread returns a
   token. */
auto curTokens(ts->curTokens.lock());
while (*curTokens >= ts->maxTokens)
if (tokens >= ts->maxTokens)
    throw NoTokens(format("requesting more tokens (%d) than exist (%d)") % tokens % ts->maxTokens);

auto inUse(ts->inUse.lock());
while (*inUse + tokens > ts->maxTokens)
}