Edd Barrett, Tue 24 October 2023.
tags: Compilers Build-systems
I'm currently working on a project that's based on LLVM
and I have to build (and re-build) it from source quite a lot. Although the
LLVM build system is highly parallel, my development machine isn't beefy, and
doesn't have many cores. Building LLVM can take a long time.
I do, however, have SSH access to machines with lots of cores! It's just that
those are not machines I can realistically develop on.
This article discusses how to build LLVM quickly on a remote machine using two
quite old tools that you may have heard of: ccache and distcc.
I'm not discussing anything particularly profound here, but I found setting
this up (distcc in particular) quite fiddly. Since there are no guides
on how to do this specific to LLVM, I thought I'd write it down here and
hopefully save someone else (at least my future-self) some time.
Background information and goals
Let's start by briefly discussing what we are working with and what we want to achieve.
ccache is a compiler cache for C and C++. It works by creating a mapping from
hashed source files to compiled object files. When you compile a
source file, the C preprocessor is invoked and the resulting text is hashed.
The cache is then queried with this hash, at which point one of two things can happen:
A cache miss. The C/C++ compiler is invoked and the resulting object file is
stored into the cache. The compiler is (compared to the cache) slow, so we
want to avoid this case as much as we can.
A cache hit. The pre-compiled object file is immediately available in the
cache and there's no need to invoke a compiler.
After building a project from scratch for the first time, you could clean your
build directory and rebuild everything without invoking the compiler at all
(assuming your cache is large enough).
(With caveats, you could also share the cache between users
so as to not duplicate compilation effort.)
In the words of the project's website,
distcc is "a fast, free distributed
C/C++ compiler". It lets you spread your compilation jobs across many hosts,
instead of just compiling everything locally.
It has two modes:
"plain" mode: where preprocessed files are distributed and compiled remotely.
"pump" mode: where raw, unpreprocessed source files are distributed and both preprocessed and compiled remotely.
The two modes have benefits and drawbacks. Plain mode, for example, isn't as
sensitive to discrepancies between the libraries on the different systems
involved: since preprocessing is local, all compile hosts use the same library
versions (note that linking is not distributed).
Pump mode, on the other hand, is faster, since both compilation and
preprocessing can be distributed. According to the
distcc manual, pump mode
can speed things up by "up to an order of magnitude over plain distcc".
My goal is to build the LLVM code base faster using both tools together: I
want one local ccache, and to use distcc upon cache misses.
I'll use plain mode distcc because I didn't want to have to keep the
libraries in sync on all the machines (and anyway, the distcc manual page says
that distcc's pump mode is not compatible with ccache).
I also want to do all of the network communication over SSH, since the machines
are on a network with other hosts that we don't control.
How's it done?
First some terminology. Let's call the machine that will initiate the
compilation the "initiator", and the remote machine(s) that will receive jobs
over SSH "remotes".
Suppose we have two remotes: remote1 and remote2.
(For brevity, I'm assuming all of the hosts run Debian.)
Installing stuff and starting the distcc daemon
The first thing to do is install distcc on all hosts involved:
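On Debian, that's just the distcc package (the # prompt denotes a root shell):

```shell
# apt install distcc
```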
Then, on the remotes, ensure that only the localhost is allowed to start
compile jobs (we can do this because jobs coming in over SSH will be considered
local). Edit /etc/distcc/clients.allow on the remotes so that it contains only one
line:
127.0.0.1
Then start and enable distcc on the remotes:
# systemctl start distcc && systemctl enable distcc
On the initiator, install ccache and distcc:
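Again on Debian, something like:

```shell
# apt install ccache distcc
```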
Now we need to make sure that the initiator can SSH to all of the remotes
without a password. Chances are, you already know how to do this, but here's a
quick refresher.
To make a passphrase-less SSH key, do something like the following
(-t chooses an encryption algorithm -- choose one you are comfortable with):
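For example (the key file name ~/.ssh/distcc_key is a hypothetical choice of mine, not something distcc requires):

```shell
# Generate an ed25519 key pair into a dedicated file.
$ ssh-keygen -t ed25519 -f ~/.ssh/distcc_key
```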
When prompted for a passphrase, just hit enter. This ensures that we don't have
to enter a passphrase every time the initiator wants to send a compile job to a
remote.
Put the public part of the key into
~/.ssh/authorized_keys on all the
remotes, then on the initiator put something like this in ~/.ssh/config
(substitute the path for the key you just made and adjust the hostname(s) as
necessary):
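A minimal sketch of such an entry (assuming the hypothetical key path ~/.ssh/distcc_key from above):

```
Host remote1 remote2
    IdentityFile ~/.ssh/distcc_key
```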
Now check you can SSH from the initiator to all of the remotes without a
password.
To speed up SSH comms, we can multiplex connections using a master connection.
This means that a fresh SSH connection doesn't have to be established every
time the initiator wants to send a job to a remote.
To enable a master connection, you will want to expand the entry in your
~/.ssh/config to look more like:
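For example (ControlMaster, ControlPath and ControlPersist are standard OpenSSH client options; the key path is the hypothetical one from above):

```
Host remote1 remote2
    IdentityFile ~/.ssh/distcc_key
    ControlMaster auto
    ControlPath ~/.ssh/control-%r@%h:%p
    ControlPersist yes
```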
On your remotes, make sure that MaxSessions (in /etc/ssh/sshd_config) is
set high enough for the number of compile jobs you've allocated to each remote.
(If you are not planning on using an SSH master connection, instead read up on
MaxStartups.)
(Don't forget to restart sshd after changing its configuration.)
Scheduling the simplest compile job
Before we dive into LLVM, let's do something smaller first. Let's compile
nothing in a distributed fashion :)
$ cd /tmp && touch empty.c
$ DISTCC_VERBOSE=1 DISTCC_HOSTS="@remote1/128 @remote2/128" ccache distcc -c empty.c
If all goes well, you should get an
empty.o file and see output like this:
distcc exec on localhost: x86_64-linux-gnu-gcc -E /tmp/empty.c
distcc exec on @remote1/128: x86_64-linux-gnu-gcc -c -o empty.o /tmp/empty.c
You can see that preprocessing was local, but compilation was remote.
A quick primer on
DISTCC_HOSTS: it lets us choose where to schedule jobs.
It's a space-separated list and (for our purposes) each entry is of the form
[user]@hostname[/jobs] for an SSH host, or
localhost[/jobs] for the local host.
jobs is the maximum number of compilation jobs to
send to the host at once -- for a remote, I usually set this to the core count.
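For example, a value that also allows some jobs on the local host (the exact job limits here are hypothetical; tune them to your machines):

```shell
# Up to 8 jobs locally, and up to 128 on each remote via SSH.
DISTCC_HOSTS="localhost/8 @remote1/128 @remote2/128"
```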
If you don't want to have to set
DISTCC_HOSTS in the environment every time,
you can edit ~/.distcc/hosts (or the system-wide /etc/distcc/hosts) instead.
Now would be a good time to inspect what ccache has cached:
$ ccache --show-stats
cache directory /home/vext01/.cache/ccache
primary config /home/vext01/.config/ccache/ccache.conf
secondary config (readonly) /etc/ccache.conf
stats updated Thu May 4 11:56:23 2023
cache hit (direct) 0
cache hit (preprocessed) 0
cache miss 1
cache hit rate 0.00 %
cleanups performed 0
files in cache 2
cache size 8.2 kB
max cache size 5.0 GB
You can see that we had one cache miss. If you re-run the compile command then
you will see
cache hit (direct) increment while
cache miss stays at 1. You
will also notice that there is no
distcc logging output the second time
around. This is because
ccache never invoked
distcc! It used the object
file from the first time we compiled the file.
If you got this all working, it's time to get this working with the LLVM build
system.
distcc and ccache for LLVM
LLVM uses cmake as its build system, so we have to trick cmake into invoking
ccache and (where necessary)
distcc instead of using a C/C++ compiler directly.
LLVM's cmake setup already provisions for
ccache, which makes this much
easier. To use it, pass
-DLLVM_CCACHE_BUILD=On when you configure the
build. We should also probably use (roughly) the same C/C++ compilers on all
remotes, so it would be prudent to pass
-DCMAKE_C_COMPILER= and -DCMAKE_CXX_COMPILER= explicitly.
Configuring LLVM would therefore look something like this:
mkdir -p build
cd build && cmake \
    -DLLVM_CCACHE_BUILD=On \
    -DCMAKE_C_COMPILER=gcc \
    -DCMAKE_CXX_COMPILER=g++ \
    ... \
    ../llvm
(where ... are the rest of your configure arguments)
But how do we put
distcc into the mix? The easiest way I've found is to set
CCACHE_PREFIX=distcc in the environment when building.
Then there's one last consideration. By default
cmake (or rather the tool
cmake is generating for, e.g.
ninja) spawns a number of parallel
compile jobs suitable for the core count of the local host's CPU. We need to up
this so that we can make the most of the larger number of cores available to us
on the remotes. For two remotes with 128 cores each, 256 parallel jobs seems a
reasonable starting point.
So when I build LLVM (after configuring the build), it looks something like this:
DISTCC_HOSTS="@remote1/128 @remote2/128" CCACHE_PREFIX=distcc cmake --build build -j256
If you got it right, LLVM should build faster using the cores of the remotes.
You should also find that rebuilds from scratch should be even faster due to
the cache. After finishing a build, remove your entire build directory, then
reconfigure and rebuild. You should be surfing on a wave of cache hits. The
build only slows during linking and when LLVM's
tablegen stuff is run. These tasks cannot
be cached, as they are not C/C++ compilation jobs.
You may have to fiddle around with
-j a bit to find what works best --
remember that although you are compiling remotely, you are still preprocessing
locally. Can the local host handle 256 concurrent preprocessors?
I hope this helps. This setup is working well for me. In 2023, I'm using this
setup on a daily basis for development of our experimental project.
If you spot any mistakes, please email me, or message me on mastodon.
- With a few build system hacks, sharing a cache between users should be
possible. At the time of writing, LLVM's build system sets
CCACHE_HASHDIR=yes, which makes all cache lookups sensitive to the
directory in which a user is building. This means that compiling the same
file twice, but in different directories (as different users typically will),
will result in cache misses. Another issue with sharing the cache over
different build directories is that any paths that get encoded into the
resulting object files at compile-time (e.g. paths in DWARF debug sections)
could be incorrect for later consumers of the cache.