Merinorus

Building Android ROMs on limited RAM: zRAM vs. zswap comparison

Building an Android ROM is a memory-intensive task. The RAM requirements to build AOSP (Android Open Source Project) have kept growing for the last 15 years:

I’ll show you how I still build ROMs in a reasonable amount of time with 16 GB of RAM. Even less, actually, since I’m compiling in WSL (Windows Subsystem for Linux), which is basically a Linux virtual machine running on Windows.

Spoiler: 8 GB is tight to build Android, but 16 GB is plenty. Read below.

Why does an AOSP build consume so much memory?

From my experience, building a custom ROM based on AOSP requires about 30-40 GB of memory. It may be more for ROMs with more requirements, such as LineageOS-based ROMs.

Just lower the number of jobs!

Well, this doesn’t work. Why? The Android build process occurs in multiple stages:

  1. Build dependency analysis: The build system (Soong) analyzes the build dependencies by parsing the .bp blueprint files. A dependency graph is generated to determine the build order, with relationships between thousands of modules. As far as I know, everything is loaded in memory, so this step uses a huge, fixed amount of memory regardless of the number of jobs: about 30 GB or more with Android 14 (RAM and swap combined).
  2. Compilation and Linking: these steps can be run in parallel, so the number of jobs will determine how much memory will be needed. From my experience, with 8 jobs, less than 30 GB are used.
  3. Image generation: It is quite RAM-hungry, but I remember it consumes less than 10 GB on my build setup.

You’re telling me that if I have less than 32 GB of RAM, I’m screwed?

No, of course not. While you may theoretically need 30 GB, you can still compile AOSP smoothly with just 16 GB of RAM, or even less.
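To see where a given machine stands against that roughly 30 GB figure, it helps to add up the installed RAM and the configured swap. The snippet below is a minimal sketch that does this by parsing /proc/meminfo. The helper name total_memory_gb is made up for this example.

```shell
#!/bin/sh
# Sum MemTotal and SwapTotal (both reported in kB) from /proc/meminfo-style
# input on stdin, and print the result in GB with one decimal.
# total_memory_gb is a hypothetical helper name, not an existing tool.
total_memory_gb() {
    LC_ALL=C awk '/^(MemTotal|SwapTotal):/ { kb += $2 }
                  END { printf "%.1f\n", kb / 1024 / 1024 }'
}

if [ -r /proc/meminfo ]; then
    echo "RAM + swap available: $(total_memory_gb < /proc/meminfo) GB"
fi
```

If the printed total is well below 30 GB, the Soong analysis step is likely to run out of memory unless swap or in-memory compression is added.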

Virtually increase your system memory

If your system needs more RAM, adding more may not be an option: you are low on budget, you have maxed out the available RAM on your computer, or the RAM is soldered…

Yet, Linux provides two main techniques to virtually increase system memory: swapping memory to a physical drive, or using in-memory compression.

1. Use a SWAP partition on a physical drive

SWAP increases the available memory by extending it to the physical drives. For example, you can create a 30 GB SWAP partition on your SSD. I did that for months, and while it “works”, it’s painfully slow. Since everything must be loaded in memory and the SSD is so much slower than RAM, the slightest change needs 50 minutes on my machine for a rebuild.

→ You trade SSD space and IOPS to virtually increase the memory. The SSD becomes the bottleneck, and using a hard drive instead will be significantly slower.

2. Enable in-memory compression (zRAM / zswap)

Instead of swapping data out to a physical drive, Linux can also compress data and keep it directly in RAM. This is very interesting because you can usually achieve a 3:1 compression ratio, or even more depending on the data and the compression algorithm. Also, swap performance is miles ahead of the SSD, so you might not even notice the performance drop. The caveat is that the CPU must compress and decompress the memory for the application, so there is more latency and less throughput than when reading from uncompressed RAM directly. You can choose the compression algorithm to trade off speed against compression ratio.

Currently, the most supported ways to enable memory compression in Linux are zRAM and zswap.

→ You trade CPU cycles to virtually increase the memory. The CPU becomes the bottleneck.
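A rough way to reason about this: if z MB of physical RAM hold compressed pages at a ratio of c:1 and the remaining RAM stays uncompressed, the usable total is about (R - z) + z × c, where R is the installed RAM. Solving for a build that needs N MB gives z ≥ (N - R) / (c - 1). The sketch below computes that bound. It is a simplified model that ignores allocator overhead and the memory used by the rest of the system, it assumes an integer ratio greater than 1, and min_compressed_mb is a name invented for this example.

```shell
#!/bin/sh
# Minimum amount of physical RAM (in MB) that must hold compressed pages
# so that uncompressed RAM + compressed data covers the build's needs.
# Simplified model: usable = (ram - z) + z * ratio.
# $1 = memory needed (MB), $2 = installed RAM (MB), $3 = integer ratio > 1
# min_compressed_mb is a hypothetical helper name.
min_compressed_mb() {
    need=$1
    ram=$2
    ratio=$3
    if [ "$need" -le "$ram" ]; then
        echo 0
        return
    fi
    # Ceiling division of (need - ram) by (ratio - 1)
    echo $(( ($need - $ram + $ratio - 2) / ($ratio - 1) ))
}

# Example: a 30 GB build on a 12 GB machine with a 3:1 ratio
min_compressed_mb 30720 12288 3
```

With these example numbers the result is 9216 MB, meaning that about three quarters of the 12 GB would end up holding compressed pages. If the result is larger than the installed RAM, the build cannot fit at that ratio without also swapping to disk.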

Cool. Which one should I use?

Here are the main recommendations I found on the Internet:

Let’s compare to see if these recommendations hold.

Benchmark

Here’s the setup I’m using for benchmarking:

Fixed parameters:

Each result is the mean of three runs of the following script:

#!/bin/bash

NB_JOBS=8  # After verification, this has no impact on the Soong analysis stage.

# Log file to store memory usage (truncated so earlier runs do not skew the maximum)
MEMORY_LOG_FILE="memory_use.log"
: > "$MEMORY_LOG_FILE"

# Monitor the device where swap is written
DEVICE="sdb"

# Return the total RAM + Swap use in MB
get_total_memory_use(){
    free | awk '/^Mem:/ {ram_used=$3} /^Swap:/ {swap_used=$3; total_used=(ram_used + swap_used) / 1024; print total_used}'
}

# Function to monitor memory usage
monitor_memory() {
    while true; do
        get_total_memory_use >> "$MEMORY_LOG_FILE"
        sleep 1
    done
}

source build/envsetup.sh
breakfast redfin  # redfin is Google Pixel 5. Could be any other device.

# Add a dummy comment to the Android.bp file to trigger a new Soong analysis
echo "// dummy comment" >> Android.bp

echo "System memory use before benchmark: $(get_total_memory_use) MB (RAM + Swap)"

# Start memory monitoring in the background
monitor_memory &

# Store the PID of the monitoring process
MONITOR_PID=$!

echo "$(date '+%Y-%m-%d %H:%M:%S'): Starting benchmark"

# Get initial SSD swap partition write stats
initial_write=$(iostat -m | awk -v dev="$DEVICE" '$1 == dev {print $6}')

# Run the build. The "nothing" target is to do the Soong step only.
/usr/bin/time -v bash -c "source build/envsetup.sh && m -j${NB_JOBS} nothing"

# Stop the memory monitoring
kill "$MONITOR_PID"
wait "$MONITOR_PID" 2>/dev/null

current_write=$(iostat -m | awk -v dev="$DEVICE" '$1 == dev {print $6}')
write_mb=$((current_write - initial_write))

max_mem_used=$(awk 'NR==1 {max=$1} $1>max {max=$1} END {print max}' "$MEMORY_LOG_FILE")
echo "Maximum memory usage during build (RAM + Swap): ${max_mem_used} MB"
echo "Total MB Written to $DEVICE: $write_mb MB"
# printf is used here because bash's echo prints "\n" literally without -e
printf '%s: End of benchmark\n\n' "$(date '+%Y-%m-%d %H:%M:%S')"
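Since each result is the mean of three runs, a small helper can average the elapsed times. The sketch below is one possible way to do it and is not part of the benchmark script. mean_duration is a hypothetical name, and the durations in the example call are made-up values, not measurements.

```shell
#!/bin/sh
# Average several durations given as h:mm:ss or mm:ss and print the mean
# as h:mm:ss (rounded down to the second).
# mean_duration is a hypothetical helper name.
mean_duration() {
    printf '%s\n' "$@" | LC_ALL=C awk -F: '
        {
            # Convert one duration to seconds, whatever its number of fields
            secs = 0
            for (i = 1; i <= NF; i++) secs = secs * 60 + $i
            sum += secs
        }
        END {
            mean = int(sum / NR)
            printf "%d:%02d:%02d\n", int(mean / 3600), int(mean % 3600 / 60), mean % 60
        }'
}

# Example with three made-up run times
mean_duration 15:00 15:30 16:00
```

The example call prints 0:15:30.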

SSD Swap performance

Let’s start with SSD swap only. Here is the time (hh:mm:ss) to build the dependency graph (Soong step only) for LineageOS 21 (Android 14), along with the amount of memory swapped out to the SSD during the build, if any:

| Compression settings \ System RAM | 7 GB RAM | 12 GB RAM |
|---|---|---|
| No zRAM, no zswap | 2:16:55 / 224 GB | 2:00:37 / 216 GB |

This is the time it took on my machine… for the dependency graph analysis only. Note how little difference there is between 7 GB and 12 GB of RAM! Most of the build time is spent swapping memory. Do this every day and your SSD will eventually die.

Only the memory swapped out to the SSD is measured: the other writes, such as the build output or other processes, are excluded from the results.

zRAM performance

Now, let’s try with zRAM. zRAM acts as a compressed swap partition stored directly in RAM, reducing or eliminating the need for an SSD swap partition. Still, I kept the SSD swap, but with a lower priority. So the system can swap out to the zRAM, until either of the two conditions is met:

#!/bin/sh

# Enable in-memory compressed swap with zRAM
echo lz4 | tee /sys/block/zram0/comp_algorithm
echo 40G | tee /sys/block/zram0/disksize
mkswap /dev/zram0
swapon -p 5 /dev/zram0  # Higher swap priority than the physical disk

# Change page cluster from 3 to 0. Better, as long as there is no physical disk swapping
echo 0 | tee /proc/sys/vm/page-cluster
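After running a setup like this, it is worth confirming that the kernel will actually prefer the zRAM device over the SSD swap. /proc/swaps lists every active swap area with its priority in the fifth column, and the highest priority is used first. The sketch below picks out that device. preferred_swap is a name invented for this example.

```shell
#!/bin/sh
# Print the swap area with the highest priority, reading /proc/swaps-style
# input on stdin (columns: Filename Type Size Used Priority).
# preferred_swap is a hypothetical helper name.
preferred_swap() {
    awk 'NR > 1 && (best == "" || $5 + 0 > max) { max = $5 + 0; best = $1 }
         END { print best }'
}

if [ -r /proc/swaps ]; then
    echo "Swap area used first: $(preferred_swap < /proc/swaps)"
fi
```

With the configuration above, the expected answer is /dev/zram0, since it was enabled with priority 5.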

The main settings to adjust are the zRAM device size and the compression algorithm. Here are the results:

| Compression settings \ System RAM | 7 GB RAM | 12 GB RAM |
|---|---|---|
| 24 GB zRAM [lz4] | 1:39:11 / 126 GB | 18:49 / 8 GB |
| 24 GB zRAM [zstd] | 59:16 / 61 GB | 17:21 |
| 40 GB zRAM [lz4] | 29:35 (FAIL) | 15:01 |
| 40 GB zRAM [lz4hc] | 1:26:58 (FAIL) | 34:46 |
| 40 GB zRAM [zstd] | 34:57 | 19:08 |
| 40 GB zRAM [deflate] | Not tested | 33:09 |

No doubt, you can achieve much faster builds with in-memory compression. Here is an explanation of the results:

You can check how much RAM your zRAM device is using at any time:

> watch -n 5 sudo zramctl
NAME       ALGORITHM DISKSIZE  DATA COMPR TOTAL STREAMS MOUNTPOINT
/dev/zram0 lz4            40G  6,5G  1,7G  1,7G       8 [SWAP]

During the Soong build, you can expect a compression ratio of approximately 4:1 to 5:1 when using lz4. With zstd, CPU load is higher but the ratio is about 10:1!
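The ratio can also be computed directly instead of being read off the zramctl columns. The kernel exposes the raw counters in /sys/block/zram0/mm_stat, where the first field is the uncompressed data size and the second is the compressed size, both in bytes. The following is a minimal sketch. zram_ratio is a hypothetical helper name, and zram0 is assumed to be the device configured earlier.

```shell
#!/bin/sh
# Print the compression ratio (original size / compressed size) from a
# zRAM mm_stat line read on stdin.
# Field 1 = orig_data_size, field 2 = compr_data_size (bytes).
# zram_ratio is a hypothetical helper name.
zram_ratio() {
    LC_ALL=C awk '{ if ($2 > 0) printf "%.1f\n", $1 / $2; else print "n/a" }'
}

if [ -r /sys/block/zram0/mm_stat ]; then
    echo "zRAM compression ratio: $(zram_ratio < /sys/block/zram0/mm_stat):1"
fi
```

Applied to the zramctl sample above (6.5G of data stored in 1.7G), the same division gives roughly 3.8:1.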

Note: I didn’t enable zRAM writeback, since it is not automatic: it requires a custom service to monitor memory pressure and write data back to the SSD when needed.

Zswap performance

Zswap works a bit differently from zRAM. It requires a swap partition (hard drive, SSD… or even zRAM!) and acts as a compressed cache in RAM. So, when the system starts to swap out, the data is compressed and kept in a reserved part of RAM. The size is configurable: for instance, you can dedicate up to 20% of your RAM to the compressed pool (max_pool_percent=20).
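To get a feel for what a given max_pool_percent means in absolute terms, the pool limit can be worked out from the RAM size. The sketch below does this for the RAM sizes and percentages used in this article, taking 7 GB and 12 GB as 7168 MB and 12288 MB. zswap_pool_limit_mb is a name invented for this example.

```shell
#!/bin/sh
# Upper bound of the zswap pool, in MB, for a given RAM size and
# max_pool_percent value (integer arithmetic, rounded down).
# $1 = total RAM in MB, $2 = max_pool_percent
# zswap_pool_limit_mb is a hypothetical helper name.
zswap_pool_limit_mb() {
    echo $(( $1 * $2 / 100 ))
}

for ram_mb in 7168 12288; do
    for percent in 20 60 80; do
        echo "${ram_mb} MB RAM, max_pool_percent=${percent}: pool limit $(zswap_pool_limit_mb "$ram_mb" "$percent") MB"
    done
done
```

With 7 GB of RAM and max_pool_percent=20, the pool tops out at about 1.4 GB of compressed data, so a workload that needs tens of GB will probably still spill heavily to the backing swap device.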

#!/bin/sh

# Enable in-memory compressed swap cache with zswap

# The zpool allocator must be set to zsmalloc to achieve the best compression ratios
echo zsmalloc | tee /sys/module/zswap/parameters/zpool
echo zstd | tee /sys/module/zswap/parameters/compressor

# Should be very high for specific cases when a lot of RAM is needed in a short time, e.g. building AOSP
echo 80 | tee /sys/module/zswap/parameters/max_pool_percent

# When a page is read back into RAM, also keep its compressed copy (like zRAM)
echo N | tee /sys/module/zswap/parameters/exclusive_loads

# Change page cluster from 3 to 0. Fast as long as there is no SSD swapping
echo 0 | tee /proc/sys/vm/page-cluster
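zswap has no equivalent of zramctl, but its counters are exposed through debugfs. Assuming debugfs is mounted at /sys/kernel/debug and the script runs as root, stored_pages and pool_total_size give enough information to estimate the compression ratio. This is a minimal sketch, and zswap_ratio is a hypothetical helper name.

```shell
#!/bin/sh
# Estimate the zswap compression ratio.
# $1 = stored_pages, $2 = pool_total_size (bytes), $3 = page size (bytes)
# zswap_ratio is a hypothetical helper name.
zswap_ratio() {
    LC_ALL=C awk -v pages="$1" -v pool="$2" -v page="$3" \
        'BEGIN { if (pool > 0) printf "%.1f\n", pages * page / pool; else print "n/a" }'
}

# Assumption: debugfs is mounted at /sys/kernel/debug and readable (root only)
ZSWAP_DEBUG=/sys/kernel/debug/zswap
if [ -r "$ZSWAP_DEBUG/stored_pages" ] && [ -r "$ZSWAP_DEBUG/pool_total_size" ]; then
    ratio=$(zswap_ratio "$(cat "$ZSWAP_DEBUG/stored_pages")" \
                        "$(cat "$ZSWAP_DEBUG/pool_total_size")" \
                        "$(getconf PAGESIZE)")
    echo "zswap compression ratio: ${ratio}:1"
fi
```

To double-check that the settings were applied, `grep -r . /sys/module/zswap/parameters/` prints every zswap parameter with its current value.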

Let’s see the results:

| Compression settings \ System RAM | 7 GB RAM | 12 GB RAM |
|---|---|---|
| zswap, max_pool_percent=20 [lz4] | 59:59 / 105 GB | 25:21 / 33 GB |
| zswap, max_pool_percent=20 [zstd] | 51:55 / 73 GB | 21:53 / 14 GB |
| zswap, max_pool_percent=60 [lz4] | 44:52 / 26 GB | 23:00 |
| zswap, max_pool_percent=60 [zstd] | 44:11 / 3 GB | 17:24 |
| zswap, max_pool_percent=80 [lz4] | 52:26 / 8 GB | 15:50 |
| zswap, max_pool_percent=80 [zstd] | 44:30 | 15:29 |

The results are close to the benchmark with zRAM:

Note: zswap seems to handle memory pressure better than zRAM: with 7 GB of RAM, 80% (maximum) allocated to zswap, and lz4 compression, it most probably starts swapping out to the SSD long before the allocated space is full. To achieve the same result with zRAM, you must monitor it and trigger memory recompression and/or write-back yourself, either manually or through a script or a service.
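As a rough illustration of what such a script could look like (it was not used for the benchmark), the sketch below checks how much physical RAM zRAM is consuming and, past a threshold, asks the kernel to write idle pages to the backing device. It assumes a kernel built with CONFIG_ZRAM_WRITEBACK and a backing device already configured through /sys/block/zram0/backing_dev before the disk size was set. The function names and the 50% threshold are arbitrary choices made for this example.

```shell
#!/bin/sh
# Sketch only: assumes CONFIG_ZRAM_WRITEBACK and a backing device already
# set in /sys/block/zram0/backing_dev. Names and threshold are hypothetical.

ZRAM_SYSFS=/sys/block/zram0

# Succeed when zRAM's physical footprint exceeds a percentage of total RAM.
# $1 = mem_used_total (bytes), $2 = total RAM (bytes), $3 = threshold (percent)
zram_over_threshold() {
    [ $(( $1 * 100 )) -gt $(( $2 * $3 )) ]
}

# One monitoring pass: read the counters, write idle pages back if needed.
# Must run as root.
zram_writeback_pass() {
    # Field 3 of mm_stat is mem_used_total, in bytes
    used=$(awk '{ print $3 }' "$ZRAM_SYSFS/mm_stat")
    # MemTotal is reported in kB
    total=$(awk '/^MemTotal:/ { printf "%.0f\n", $2 * 1024 }' /proc/meminfo)
    if zram_over_threshold "$used" "$total" 50; then
        echo all > "$ZRAM_SYSFS/idle"        # mark every stored page as idle
        echo idle > "$ZRAM_SYSFS/writeback"  # write idle pages to the backing device
    fi
}
```

Calling zram_writeback_pass from a loop with a sleep, or from a timer, would turn this into a crude monitoring service. A real one would need more care, for instance to avoid writing back pages that are about to be read again.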

Conclusion: zRAM or zswap?

| Features | zRAM | zswap |
|---|---|---|
| Can work without a swap partition | yes | no |
| High compression ratio | yes | yes (with zsmalloc as zpool allocator) |
| Multiple compression algorithms & recompression | yes (manual) | no |
| Write back to the SSD | yes (manual) | yes (automatic) |
| Compressed write back to the SSD | yes | no |

Both technologies offer similar performance.

If you can’t have a swap partition, use zRAM. Many well-known devices shouldn’t swap to physical storage: Raspberry Pi and Android devices use zRAM as swap because their flash storage is too slow and not durable enough for repetitive write operations. Synology NAS devices also use zRAM.

If your system already comes with a swap partition on a hard drive or an SSD, both will work, but it might be easier to use zswap. However, your kernel must support zsmalloc as the zpool allocator; otherwise, you will end up with a much lower compression ratio than with zRAM.

On a recent machine with a fast NVMe SSD, I’d probably use zswap: it’s simpler to set up, and a fast NVMe drive reduces the penalty when pages do spill to disk. For machines with limited CPU and/or storage performance, I’d give a slight advantage to zRAM, for two reasons:

These two advantages come at a cost: write-back and recompression are not automatic, and you must tinker with the settings and write a dedicated service for your system to behave correctly. This goes beyond the scope of this article.

For the compression algorithm: lz4 will offer a good compression ratio with minimal performance cost. However, if memory constraints remain an issue, you can opt for zstd to free up more RAM, though it comes at the expense of higher CPU usage. It’s the trade-off between speed and memory efficiency.

Finally, not all kernels support both zRAM and zswap, so you may just end up using whatever is available on your system. Both technologies are in active development and keep converging in features, so keep an eye on their evolution.

#Android #Linux