When using a binary cache store, the queue runner receives NARs from the build machines, compresses them, and uploads them to the cache. However, keeping multiple large NARs in memory can cause the queue runner to run out of memory. This can happen for instance when it's processing multiple ISO images concurrently.
The fix is to use a TokenServer to prevent the builder threads from storing more than a certain total size of NARs concurrently (at the moment, this is hard-coded at 4 GiB). Builder threads that would cause the limit to be exceeded block until other threads have finished.
The 4 GiB limit does not include certain other allocations, such as for xz compression or for FSAccessor::readFile(). But since these are unlikely to be more than the size of the NARs and hydra.nixos.org has 32 GiB RAM, it should be fine.
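The blocking behaviour described above can be illustrated with a minimal, self-contained sketch. This is not the actual nix::TokenServer code: the names MemoryBudget, Reservation and used() are hypothetical, and the sketch assumes a plain mutex and condition variable rather than whatever synchronisation primitives Hydra uses. It only shows the idea of reserving N tokens up front and returning them when the reservation goes out of scope.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <stdexcept>

// Hypothetical stand-in for a token server; names are illustrative only.
class MemoryBudget {
    std::mutex m;
    std::condition_variable wakeup;
    const std::size_t maxTokens;
    std::size_t inUse = 0;

public:
    explicit MemoryBudget(std::size_t maxTokens) : maxTokens(maxTokens) {}

    // RAII handle: the reserved tokens are returned when it is destroyed.
    class Reservation {
        MemoryBudget* owner;
        std::size_t tokens;
    public:
        Reservation(MemoryBudget* owner, std::size_t tokens)
            : owner(owner), tokens(tokens) {}
        Reservation(Reservation&& other) noexcept
            : owner(other.owner), tokens(other.tokens) { other.owner = nullptr; }
        Reservation(const Reservation&) = delete;
        Reservation& operator=(const Reservation&) = delete;
        ~Reservation() { if (owner) owner->release(tokens); }
    };

    // Block until `tokens` units fit under the limit, then reserve them.
    Reservation get(std::size_t tokens) {
        // A request larger than the whole budget could never be satisfied,
        // so fail immediately instead of blocking forever.
        if (tokens > maxTokens)
            throw std::invalid_argument("requesting more tokens than exist");
        std::unique_lock<std::mutex> lock(m);
        wakeup.wait(lock, [&] { return inUse + tokens <= maxTokens; });
        inUse += tokens;
        return Reservation(this, tokens);
    }

    // Number of tokens currently reserved (for inspection only).
    std::size_t used() {
        std::lock_guard<std::mutex> lock(m);
        return inUse;
    }

private:
    void release(std::size_t tokens) {
        {
            std::lock_guard<std::mutex> lock(m);
            assert(inUse >= tokens);
            inUse -= tokens;
        }
        // Waiters may need different amounts, so wake all of them and let
        // each one re-check its own condition.
        wakeup.notify_all();
    }
};
```

A builder thread would hold the Reservation for as long as the NAR is in memory. Waking all waiters on release is a simplification; it does not guarantee fairness, so a very large request could in principle be starved by a stream of small ones.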
MaintainCount mc(nrStepsCopyingFrom);
/* Query the size of the output paths. */
size_t totalNarSize = 0;
to << cmdQueryPathInfos << outputs;
to.flush();
while (true) {
    if (readString(from) == "") break;
    readString(from); // deriver
    readStrings<PathSet>(from); // references
    readLongLong(from); // download size
    totalNarSize += readLongLong(from);
}

printMsg(lvlDebug, format("copying outputs of ‘%s’ from ‘%s’ (%d bytes)")
    % step->drvPath % machine->sshName % totalNarSize);

/* Block until we have the required amount of memory
   available. FIXME: only need this for binary cache
   destination stores. */
auto resStart = std::chrono::steady_clock::now();
auto memoryReservation(memoryTokens.get(totalNarSize));
auto resStop = std::chrono::steady_clock::now();
result.accessor = destStore->getFSAccessor();
auto resMs = std::chrono::duration_cast<std::chrono::milliseconds>(resStop - resStart).count();
if (resMs >= 1000)
    printMsg(lvlError, format("warning: had to wait %d ms for %d memory tokens for %s")
        % resMs % totalNarSize % step->drvPath);
/* Token server to prevent threads from allocating too many big
   strings concurrently while importing NARs from the build
   machines. When a thread imports a NAR of size N, it will first
   acquire N memory tokens, causing it to block until that many
   tokens are available. */
nix::TokenServer memoryTokens;
   available. Calling get() will return a Token object, representing
   ownership of a token. If no token is available, get() will sleep
   until another thread returns a token. */
   available. Calling get(N) will return a Token object, representing
   ownership of N tokens. If the requested number of tokens is
   unavailable, get() will sleep until another thread returns a
   token. */
auto curTokens(ts->curTokens.lock());
while (*curTokens >= ts->maxTokens)
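if (tokens >= ts->maxTokens)
    /* The format string has two %d placeholders, so both values must be supplied. */
    throw NoTokens(format("requesting more tokens (%d) than exist (%d)") % tokens % ts->maxTokens);

auto inUse(ts->inUse.lock());
while (*inUse + tokens > ts->maxTokens)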
}