⚙️ Turbocharging Banana Pi R4 Kernel Builds with distcc and ccache

Building Linux kernels on embedded devices like the Banana Pi R4 can be painfully slow: out of the box, a full kernel build on the R4 takes over two hours. Fortunately, we can cut this dramatically by offloading compilation to powerful x86_64 machines with distcc and caching results with ccache. Here’s how to set up the distributed build system.


Hardware Setup

We’re working with:

Banana Pi R4 (×3)

  • CPU: 4-core ARM Cortex-A73
  • RAM: 8GB
  • OS: Debian Bookworm (arm64)
  • Storage: eMMC, 64 GB SD card, 256 GB NVMe

Homer (Intel i9-13900K)

  • Architecture: x86_64
  • Cores: 24 cores (32 threads)
  • RAM: 128 GB DDR5
  • Host OS: Arch Linux, with an aarch64 Debian Bookworm chroot

Worker (Xeon E5-2683 v4)

  • Architecture: x86_64
  • Cores: 16 cores (32 threads)
  • RAM: 256 GB
  • Host OS: Arch Linux, with an aarch64 Debian Bookworm chroot

Combined, Homer and Worker offer 384 GB of RAM and 40 cores (64 threads) to share the build load.


Why distcc and ccache?

distcc distributes compile jobs across multiple machines. Instead of compiling locally, the R4 preprocesses each source file and sends it to the x86 hosts, which compile it and return the resulting object file.

ccache caches compiler outputs. If a file and its compiler options haven’t changed, ccache returns the cached object file instantly instead of compiling it again.

Used together:

  • distcc speeds up builds by distributing work.
  • ccache avoids unnecessary work.
  • Both reduce build times and heat on the Banana Pi R4.

In practice, distcc and ccache can reduce a kernel build on the R4 from ~2 hours to under 20 minutes.


Setting up the chroot environment

The goal is to let Homer and Worker compile aarch64 code even though they’re x86_64. We use:

  • Debian Bookworm rootfs (arm64)
  • qemu-aarch64-static
  • binfmt_misc for automatic binary translation

Create the chroot

On Homer and Worker:

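# Bootstrap the Debian arm64 rootfs (the host needs qemu-user-static and its
# binfmt_misc registration first; see the install commands further down)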
debootstrap --arch=arm64 bookworm /mnt/bpi-chroot http://deb.debian.org/debian

Copy QEMU into the chroot:

cp /usr/bin/qemu-aarch64-static /mnt/bpi-chroot/usr/bin/

Bind mount essential filesystems:

mount -t proc proc /mnt/bpi-chroot/proc
mount --bind /sys /mnt/bpi-chroot/sys
mount --bind /dev /mnt/bpi-chroot/dev

Enter the chroot:

chroot /mnt/bpi-chroot /bin/bash

Inside the chroot:

apt update
apt install distcc ccache build-essential
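
When a build host is no longer needed, the mounts created above should be released before the chroot directory is moved or deleted. The following is a minimal sketch, assuming the same /mnt/bpi-chroot path. The helper names run and teardown_chroot and the DRY_RUN switch are made up for illustration and are not part of any tool.

```shell
# Path of the chroot; override by setting CHROOT in the environment.
CHROOT=${CHROOT:-/mnt/bpi-chroot}

# Hypothetical helper: print the command when DRY_RUN is set, otherwise run it.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Unmount in the reverse order of the mount commands above.
teardown_chroot() {
  local dir
  for dir in dev sys proc; do
    run umount "$CHROOT/$dir"
  done
}
```

Running DRY_RUN=1 teardown_chroot first prints the umount commands without executing them; running teardown_chroot as root then performs them.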

Installing distcc and ccache

On Debian/Ubuntu

sudo apt install distcc ccache qemu-user-static

On Arch Linux

sudo pacman -S distcc ccache qemu-user-static

The distcc wrapper script

The kernel build invokes the compiler by name (here aarch64-linux-gnu-gcc). To route those calls through distcc, we place a wrapper with the same name earlier in PATH.

On the R4:

mkdir -p ~/distcc_bin

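cat > ~/distcc_bin/aarch64-linux-gnu-gcc <<'EOF'
#!/bin/bash
# Call the real compiler by absolute path so distcc does not find this
# wrapper again on PATH and abort with a recursion error.
# Adjust the path if your toolchain lives elsewhere.
exec distcc /usr/bin/aarch64-linux-gnu-gcc "$@"
EOF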

chmod +x ~/distcc_bin/aarch64-linux-gnu-gcc

export PATH=~/distcc_bin:$PATH

This ensures any call to aarch64-linux-gnu-gcc uses distcc automatically, simplifying setup and avoiding the need to modify kernel Makefiles directly.
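
The wrapper above sends compiles to distcc but never consults ccache. One common way to chain the two is ccache's CCACHE_PREFIX setting, which names a command that ccache puts in front of the compiler on a cache miss. The sketch below generates such a wrapper. The function name make_ccache_wrapper is hypothetical, and the compiler path /usr/bin/aarch64-linux-gnu-gcc is an assumption about where Debian installs it.

```shell
# Write a wrapper named after the compiler into DIR.
# Arguments: DIR, absolute path of the real compiler.
# An absolute path keeps distcc from picking up the wrapper itself via PATH.
make_ccache_wrapper() {
  local dir=$1 real=$2
  local name
  name=$(basename "$real")
  mkdir -p "$dir"
  cat > "$dir/$name" <<EOF
#!/bin/bash
# ccache checks its cache first; on a miss it runs "distcc <compiler> ..."
# because CCACHE_PREFIX names the command to put in front of the compiler.
export CCACHE_PREFIX=distcc
exec ccache "$real" "\$@"
EOF
  chmod +x "$dir/$name"
}
```

For example, make_ccache_wrapper ~/distcc_bin /usr/bin/aarch64-linux-gnu-gcc would replace the plain distcc wrapper with one that only distributes cache misses.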


Configuring distcc hosts

On the R4, edit:

/etc/distcc/hosts

Example contents:

172.16.0.1
172.16.0.2

Replace these with the IPs of Homer and Worker.
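
distcc host entries also accept a /LIMIT suffix that caps how many jobs the client sends to that machine; without it, distcc defaults to four jobs per remote host. The sketch below builds such entries from HOST:JOBS pairs. The function name make_hosts is hypothetical, and the limit of 28 simply mirrors the --jobs 28 value used for distccd in this article.

```shell
# Print one distcc host spec per argument; each argument is HOST:JOBS.
make_hosts() {
  local spec host jobs
  for spec in "$@"; do
    host=${spec%%:*}   # text before the first colon
    jobs=${spec##*:}   # text after the last colon
    printf '%s/%s\n' "$host" "$jobs"
  done
}
```

Usage might look like make_hosts 172.16.0.1:28 172.16.0.2:28 | sudo tee /etc/distcc/hosts, which writes 172.16.0.1/28 and 172.16.0.2/28 on separate lines.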


Starting distccd

Inside each chroot on Homer and Worker:

/usr/bin/distccd \
  --pid-file=/var/run/distccd.pid \
  --log-file=/var/log/distccd.log \
  --daemon \
  --allow 127.0.0.1 \
  --allow 172.16.0.0/24 \
  --allow 172.17.0.0/24 \
  --listen 172.16.0.1 \
  --nice 10 \
  --jobs 28

Set --listen to the address of the machine the daemon runs on, and adjust --jobs to its hardware capacity.


Testing distcc

This simple test verifies distcc setup:

On the R4:

echo 'int main() { return 0; }' > test.c
DISTCC_VERBOSE=1 distcc aarch64-linux-gnu-gcc -c test.c -o test.o

Confirming distributed builds:

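  • Run distccmon-text 1 on the R4 to view live job distribution; it reads the distcc client’s state, so it belongs on the machine that starts the build.
  • Inspect /var/log/distccd.log inside the chroot on Homer and Worker for compile job logs.
  • Use ifstat to monitor network traffic; sustained traffic during a build indicates jobs are being distributed.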

If your Banana Pi R4 shows high CPU load but no network activity, distcc is misconfigured.
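
A quick first check in that situation is whether the R4 can reach distccd at all. The sketch below tries a plain TCP connection using bash's /dev/tcp redirection. The function name distccd_reachable is hypothetical; 3632 is distcc's default port and applies only if the daemon was started without a different --port.

```shell
# Return 0 if a TCP connection to HOST:PORT succeeds within 2 seconds.
# PORT defaults to 3632, the distcc default.
distccd_reachable() {
  local host=$1 port=${2:-3632}
  # bash opens a TCP socket when redirecting to /dev/tcp/HOST/PORT;
  # timeout (coreutils) bounds the wait for unreachable hosts.
  timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}
```

For example, for h in 172.16.0.1 172.16.0.2; do distccd_reachable "$h" || echo "$h: no distccd"; done reports hosts that refuse the connection. A successful connection only proves the port is open; the --allow list can still reject the client.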


Building the kernel

Compile the kernel from the Frank-W repo:

export ARCH=arm64
export CROSS_COMPILE=aarch64-linux-gnu-
make -j60

  • The -j60 flag runs up to 60 jobs in parallel.
  • Watch the load to avoid memory exhaustion.
  • Check cache efficiency afterwards with ccache -s:

ccache -s
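
Rather than hard-coding the job count, it can be derived from the hosts file. The sketch below sums the /LIMIT values in a distcc hosts file and counts four jobs for any host without a limit, which is distcc's documented default for remote hosts. The function name total_jobs is hypothetical, and the parser is deliberately simple: it treats localhost like any other host and does not understand special entries such as --randomize.

```shell
# Sum the per-host job limits in a distcc hosts file (path in $1).
total_jobs() {
  awk '
    { sub(/#.*/, "") }              # drop comments
    {
      for (i = 1; i <= NF; i++) {
        spec = $i
        sub(/,.*/, "", spec)        # drop options such as ,lzo
        n = split(spec, parts, "/")
        total += (n > 1) ? parts[2] : 4
      }
    }
    END { print total + 0 }
  ' "$1"
}
```

With two hosts limited to 28 jobs each this prints 56, so make -j"$(total_jobs /etc/distcc/hosts)" would keep both machines busy; adding the R4’s own four cores happens to give the 60 used above.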

Why binfmt_misc?

Without binfmt_misc, an x86 host can’t run ARM binaries. This Linux kernel feature hands aarch64 binaries to QEMU automatically, so the arm64 compiler inside the chroot runs transparently on the x86_64 host.
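
Whether the registration is active can be checked from the host before entering the chroot. Each registered format appears as a file under /proc/sys/fs/binfmt_misc whose first line reads enabled or disabled. The sketch below tests for that. The function name binfmt_enabled is hypothetical, and the entry name qemu-aarch64 is an assumption based on what the qemu-user-static packages usually register; list the directory to confirm the name on your system.

```shell
# Return 0 if binfmt_misc has an enabled entry called NAME.
# DIR defaults to the usual mount point; it is a parameter so the
# function can also be pointed at a test directory.
binfmt_enabled() {
  local name=$1 dir=${2:-/proc/sys/fs/binfmt_misc}
  [ -r "$dir/$name" ] && [ "$(head -n 1 "$dir/$name")" = "enabled" ]
}
```

A call such as binfmt_enabled qemu-aarch64 || echo "aarch64 emulation is not registered" makes a missing registration visible before chroot fails with an exec format error.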


Real-World Results

  • Banana Pi R4 local build: ~2 hours
  • Using distcc: ~20 minutes or less
  • Network traffic peaks at roughly 10 Gbps during builds.

This setup transforms the Banana Pi R4 from a slow single-board computer into a serious development platform. Happy hacking!

Conclusion

With the right tooling, you don’t have to settle for 2-hour kernel builds. Combining distcc, ccache, and an emulated arm64 chroot cuts them to about 20 minutes and keeps your workflow productive.


Want to replicate this? Feel free to reach out or check the ongoing work at:

https://articles.akadata.ltd/how-to-speed-up-bpi-r4-development-with-nfs-root-and-cross-chrooting/