Marcel van der Boom
  • NX-GZIP hardware acceleration on Talos II POWER9

    calendar 2026-05-15 · 5 min read · power9 ppc64le talos2 hardware compression linux performance  ·

    POWER9 processors [1] ship with two hardware compression engines: NX-842 (a kernel crypto API accelerator for the 842 algorithm) and NX-GZIP (a gzip/deflate-compatible engine accessible from userspace). NX-GZIP is the interesting one: it provides zlib-compatible acceleration via LD_PRELOAD without rebuilding any software. This post documents a possible setup on a Talos II running Arch Linux ppc64le (kernel 6.19.11) and shows some simple benchmarks to get a feel for the performance improvement to expect.

    Background

    POWER9 exposes the compression hardware through the VAS (Virtual Accelerator Switchboard) subsystem [2]. There are two engines:

    • NX-842: 842 algorithm, registered in the kernel crypto API. Becomes available automatically once nx-compress-powernv.ko is loaded. No userspace setup required. [3]
    • NX-GZIP: gzip/deflate engine, exposed as /dev/crypto/nx-gzip. Requires a one-time NVRAM flag and the libnxz userspace library. [2]

    The Arch Linux ppc64le kernel has CONFIG_PPC_VAS=y and CONFIG_CRYPTO_DEV_NX=y already set — no kernel rebuild is needed.

    Step 1: Enable VAS userspace access in NVRAM

    The skiboot firmware reads the vas-user-space NVRAM option at boot to decide whether the engine is exposed to userspace. A kexec (fast reset) is not sufficient; only a full IPL picks up NVRAM changes.

        sudo nvram --update-config "vas-user-space=enable" --partition ibm,skiboot
        # Full power cycle required; kexec will not work

        # Verify
        sudo nvram -p ibm,skiboot --print-config
        # "ibm,skiboot" Partition
        # --------------------------
        # vas-user-space=enable

    After reboot, verify in the OPAL message log:

        grep "vas-user-space" /sys/firmware/opal/msglog
        # NVRAM: Searched for 'vas-user-space' found 'enable'
        # VAS: Initialized chip 0 / VAS: Initialized chip 8
        # NX0: gzip Coprocessor Enabled / NX8: gzip Coprocessor Enabled
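
    The same check can be scripted. The following is a minimal sketch that looks for the three kinds of log lines quoted above; it assumes the firmware words the messages exactly as shown, which may differ between skiboot versions. The function name check_opal_log and the sample text are illustrative only.

```python
import re

# Patterns derived from the log lines quoted above. The exact wording is an
# assumption and may vary between skiboot versions.
EXPECTED = {
    "nvram_flag": re.compile(r"NVRAM: Searched for 'vas-user-space' found 'enable'"),
    "vas_init": re.compile(r"VAS: Initialized chip \d+"),
    "gzip_enabled": re.compile(r"NX\d+: gzip Coprocessor Enabled"),
}


def check_opal_log(text):
    """Return a dict mapping each check name to True when a matching line exists."""
    return {name: bool(pattern.search(text)) for name, pattern in EXPECTED.items()}


# Illustrative sample; on a real system pass in the contents of
# /sys/firmware/opal/msglog (reading it usually requires root).
sample = (
    "NVRAM: Searched for 'vas-user-space' found 'enable'\n"
    "VAS: Initialized chip 0\n"
    "NX0: gzip Coprocessor Enabled\n"
)
for name, found in check_opal_log(sample).items():
    print(f"{name}: {'ok' if found else 'MISSING'}")
```

    Any check reported as MISSING suggests that the NVRAM flag was not picked up, for example because the machine was restarted via kexec instead of a full power cycle.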

    Step 2: Verify /dev/crypto/nx-gzip exists

    With CONFIG_PPC_VAS and CONFIG_CRYPTO_DEV_NX enabled (both present in the Arch ppc64le kernel), the device node is created automatically by the VAS subsystem on boot:

        ls -la /dev/crypto/nx-gzip
        # crw------- 1 root root 236, 0 ...

    Step 3: Create system group and udev rule

    To make the device a bit easier to use, we assign it a group and suitable permissions. udev requires a system group (GID < 1000) for device node rules; with a regular user group the rule is rejected as "Not a system group" and ignored.

        sudo groupadd --system nx-gzip
        sudo usermod -aG nx-gzip "$USER"
        echo 'KERNEL=="nx-gzip", GROUP="nx-gzip", MODE="0660"' | \
          sudo tee /etc/udev/rules.d/99-nx-gzip.rules
        sudo udevadm control --reload
        sudo udevadm trigger --subsystem-match=nx-gzip

    Verify:

        sudo udevadm test /devices/virtual/nx-gzip/nx-gzip 2>&1 | grep -E "GROUP|MODE"
        # GROUP="nx-gzip": Set group ID: 939
        # MODE="0660":    Set mode: 0660

    Users in the nx-gzip group can now use the crypto device. Log out and back in (or use newgrp nx-gzip) for group membership to take effect.
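
    To confirm the result from the point of view of an unprivileged user, a small script can inspect the device node. This is a minimal sketch using only the Python standard library; describe_device is a hypothetical helper name, and the default path is the device node from Step 2.

```python
import grp
import os
import stat


def describe_device(path="/dev/crypto/nx-gzip"):
    """Summarise type, permissions, group and accessibility of a device node."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return {"exists": False}
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)  # GID without a name in the group database
    return {
        "exists": True,
        "char_device": stat.S_ISCHR(st.st_mode),
        "mode": oct(stat.S_IMODE(st.st_mode)),
        "group": group,
        # True when the current process may open the node for reading and writing
        "usable": os.access(path, os.R_OK | os.W_OK),
    }


print(describe_device())
```

    With the udev rule from this step active, the expectation is a character device with mode 0o660 and group nx-gzip; "usable" only becomes True once the new group membership is in effect for the current session.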

    Step 4: Build libnxz

    The library that catches calls to libz and reroutes them to the hardware lives at github.com/libnxz/power-gzip. Clone it.

    One build fix is needed: lib/nx_gzlib.c line 140 declares digit as char*, but strpbrk returns const char*, which gcc -Werror rejects:

        --- a/lib/nx_gzlib.c
        +++ b/lib/nx_gzlib.c
        @@ -140 +140 @@
        -	char* digit;
        +	const char* digit;

    If you would rather not apply the patch by hand, the following commands make the same change with sed and build the library:

        cd ~/dat/src/power-gzip
        ./configure
        sed -i 's/\tchar\* digit;/\tconst char* digit;/' lib/nx_gzlib.c
        make -j$(nproc) -C lib
        # Produces: lib/.libs/libnxz.so.0.0.65

    Step 5: Use via LD_PRELOAD

    libnxz intercepts zlib API calls (compress, deflate, inflate, etc.) from programs that dynamically link libz.so. It does not work with the gzip binary — GNU gzip bundles its own deflate and never calls libz.so.

    The generic use of the library, for programs that use libz, is:

        LD_PRELOAD=<path-to>/libnxz.so <program>

    To verify hardware is actually used:

        # Run it
        NX_GZIP_LOGFILE=/tmp/nx.log NX_GZIP_VERBOSE=2 NX_GZIP_TRACE=8 LD_PRELOAD=.../libnxz.so <program>

        # Verify use
        grep -c "deflate(nx)" /tmp/nx.log   # count should be > 0

    This works with python3 (via the zlib module), rsync, pigz, Java, and any other program that dynamically links libz.so.

    It does not work with gzip, zstd, or lz4; these have their own compression implementations.

    Before wrapping a binary, check that it does link libz.so and not libz-ng.so.2 (zlib-ng):

        ldd /usr/bin/someprogram | grep libz
        # Must show libz.so.1, not libz-ng.so.2

    For example, git on my machine links libz-ng.so.2, so libnxz cannot intercept it — zero NX operations will be recorded.
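
    Since grep libz matches both library names, checking many binaries by eye is error-prone. The sketch below classifies ldd output automatically. The helper names zlib_flavour and check_program are hypothetical, and the sample ldd lines (including the load addresses) are made up for illustration.

```python
import re
import subprocess


def zlib_flavour(ldd_output):
    """Classify ldd output as 'zlib', 'zlib-ng', 'both' or None."""
    # libz.so[.N] is classic zlib; libzstd.so or libz-ng.so do not match this.
    has_zlib = re.search(r"\blibz\.so\b", ldd_output) is not None
    has_ng = re.search(r"\blibz-ng\.so\b", ldd_output) is not None
    if has_zlib and has_ng:
        return "both"
    if has_zlib:
        return "zlib"
    if has_ng:
        return "zlib-ng"
    return None


def check_program(path):
    """Run ldd on a binary and classify which zlib it links against."""
    result = subprocess.run(["ldd", path], capture_output=True, text=True, check=False)
    return zlib_flavour(result.stdout)


# Illustrative samples of what ldd might print for two different binaries
print(zlib_flavour("\tlibz.so.1 => /usr/lib/libz.so.1 (0x00007fff00000000)\n"))
print(zlib_flavour("\tlibz-ng.so.2 => /usr/lib/libz-ng.so.2 (0x00007fff00000000)\n"))
```

    Only binaries classified as zlib are candidates for LD_PRELOAD; for a real binary, call check_program with its path.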

    Benchmarks

    Benchmark: Python's zlib.compress() with level 1, comparing software zlib against NX-GZIP under LD_PRELOAD.

        import os
        import time
        import zlib

        # Random data (incompressible, worst case)
        data = os.urandom(1024 * 1024) * 50   # 50 MB
        t = time.perf_counter()
        out = zlib.compress(data, 1)
        print(f'{len(data) / (time.perf_counter() - t) / 1024**2:.0f} MB/s')

        # Compressible data
        data = (b'Hello world this is some compressible text data ' * 64) * (1024 * 32)   # 96 MB
        t = time.perf_counter()
        out = zlib.compress(data, 1)
        print(f'{len(data) / (time.perf_counter() - t) / 1024**2:.0f} MB/s')

    NX-GZIP results (zlib level 1, POWER9 DD2.2, 2 sockets / 8 cores):

    Input type     Size    Software zlib   NX-GZIP      Speedup
    Random data    50 MB   24 MB/s         710 MB/s     ~29×
    Compressible   96 MB   355 MB/s        6000 MB/s    ~17×

    NX-842 (kernel crypto API, separate engine, via nx-compress-powernv.ko):

    Path                    Throughput     Speedup
    Software 842 fallback   ~104 MB/s
    NX-842 hardware         ~11000 MB/s    ~106×

    It's an artificial benchmark, but the benefit is pretty clear.
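
    Throughput alone does not show whether the output is valid or how well it compresses. As a complement, here is a minimal sketch of a helper that times one compression, verifies the round trip, and reports the size ratio. The name measure and the 3 MB sample are my own choices, not part of the benchmark above; run it with and without LD_PRELOAD to compare.

```python
import time
import zlib


def measure(data, level=1):
    """Compress once; return (throughput in MB/s, compressed/original size ratio)."""
    start = time.perf_counter()
    packed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    # A fast result is worthless if it does not decompress to the input
    if zlib.decompress(packed) != data:
        raise ValueError("round trip mismatch")
    return len(data) / elapsed / 1024**2, len(packed) / len(data)


sample = b"Hello world this is some compressible text data " * 64 * 1024   # 3 MB
speed, ratio = measure(sample)
print(f"{speed:.0f} MB/s, output is {ratio:.1%} of the input size")
```

    Note that under LD_PRELOAD the decompression in the round-trip check goes through the preloaded library as well, so it exercises inflate too.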

    Real-world use: mksquashfs (270 MB source tree, 32 threads)

    Input: the skiboot git repository (~270 MB of source code).

    Variant    Wall time   User CPU   CPU%    Output size
    Software   0.516s      7.87s      1561%   63.8 MB
    NX-GZIP    0.102s      0.22s      522%    67.2 MB
    Speedup    ~5×         ~36×               -5% ratio

    NX trades ~5% compression ratio for a 5× reduction in wall time and a 36× reduction in CPU time. At 32 threads the software path saturates all cores; NX offloads deflate to the coprocessor and frees the CPU for other work. ERR_NX_TARGET_SPACE retries in the verbose log are normal for highly compressible data: they mean the engine is splitting chunks, not falling back to software.
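
    For reference, the rounded factors follow directly from the measured values in the mksquashfs table. The short calculation below only restates that arithmetic; the variable names are mine.

```python
# Figures copied from the mksquashfs table
software = {"wall_s": 0.516, "user_s": 7.87, "size_mb": 63.8}
nx_gzip = {"wall_s": 0.102, "user_s": 0.22, "size_mb": 67.2}

wall_speedup = software["wall_s"] / nx_gzip["wall_s"]        # wall-time factor
cpu_speedup = software["user_s"] / nx_gzip["user_s"]         # user-CPU factor
size_growth = nx_gzip["size_mb"] / software["size_mb"] - 1   # relative output growth

print(f"wall time: {wall_speedup:.1f}x faster")
print(f"user CPU:  {cpu_speedup:.1f}x less")
print(f"output:    {size_growth:.1%} larger")
```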

    Example Wrapper scripts

    To use the accelerated compression with, say, curl, a simple wrapper script early in your PATH is sufficient. A thin wrapper in ~/bin/ transparently applies LD_PRELOAD for curl, which links libz.so:

        #!/bin/sh
        exec env LD_PRELOAD=/usr/local/lib/libnxz.so /usr/bin/curl "$@"
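
    When several programs need the same treatment, the wrapper text can be generated instead of typed. This is a minimal sketch; wrapper_text is a hypothetical helper, the library path is the one used in the wrapper above, and the /usr/bin/wget location is an assumption about where that binary lives.

```python
import shlex


def wrapper_text(program, library="/usr/local/lib/libnxz.so"):
    """Return the text of a /bin/sh wrapper that preloads `library` for `program`."""
    return (
        "#!/bin/sh\n"
        f"exec env LD_PRELOAD={shlex.quote(library)} {shlex.quote(program)} \"$@\"\n"
    )


# Print the wrappers for two programs; the paths are assumptions
for program in ("/usr/bin/curl", "/usr/bin/wget"):
    print(wrapper_text(program), end="")
```

    Each generated text would be saved under the program's name in ~/bin/ and made executable, the same as the hand-written wrapper.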

    Programs verified to link libz.so and confirmed working: curl, wget, cargo, ffmpeg, bsdtar, mksquashfs, unsquashfs, qemu-img, mariadb-dump, pg_dump, pg_restore, pg_basebackup.

    References

    • libnxz wiki: Enable nx-gzip on POWER9
    • power-gzip source on GitHub

    [1] Wikipedia: POWER9
    [2] kernel.org: VAS userspace API
    [3] power-gzip wiki: NX-842 compression accelerator

  • Use GUIX to fix an upstream package issue

    calendar 2023-12-25 · 5 min read · guix ppc64le openssh power9  ·

    I am using a Talos II PowerPC (ppc64le) machine as my daily computer. This poses some challenges every now and then as the support for that architecture is less ubiquitous than the default x86-64. It's the price to pay for having a completely open, documented machine.

    On this system I run the archpower distribution to get a Linux kernel onto the machine, but I prefer to run GUIX on top of it for package management. I still need the native pacman package manager though: some packages are either not in GUIX yet (not so much of a problem in practice) or do not support ppc64le at all, which is a bigger problem.

    There's a third category, which I'd like to write about in this post: packages that do support the architecture, but not as a first-class citizen. By this I mean that support is there, but the PowerPC package gets less attention and testing. This is somewhat inevitable, as far fewer people use these machines than a normal PC, so these packages are unlikely to get as many eyes.

    Depending on your distribution and package manager, you can get stuck on a certain version of a package if there are limited options to upgrade, or if building the package locally is beyond your reach or available time. Often the only option, for me anyway, is waiting for upstream to fix the situation. For this reason I still can't have packages that depend on Rust in GUIX.

    Because GUIX is basically a Scheme library, you can use it to define how packages get onto your machine, and there is usually a way to create a solution without having to figure out all the build details of the package itself.

    One example of this is a recent ppc64le specific issue with OpenSSH.

    The problematic release (9.6p1) was packaged for GUIX and eventually reached my system through a guix pull. However, the package build failed, apparently only on ppc64le machines, due to this issue.

    Because OpenSSH is used by many other packages, its failing build affected all those packages.

    Here's what I did in GUIX to get to the new OpenSSH version which contained a number of security fixes which I wanted to have.

    First, I created a package definition called openssh-next which inherits from the existing openssh package in GUIX in such a way that the fix for the problem outlined above is included. In this case, it takes the commit just after the release from upstream.

        ;; Define openssh-next package which takes openssh from upstream which has the fix applied
        ;; See https://github.com/openssh/openssh-portable/commit/1036d77b34a5fa15e56f516b81b9928006848cbd
        (define-public openssh-next
          (let ((xcommit "1036d77b34a5fa15e56f516b81b9928006848cbd"))
            (package
              (inherit openssh)
              (name "openssh-next")
              (version "9.6p1-1")
              (native-inputs
               (list autoconf
                     automake
                     pkg-config))
              (source
               (origin
                 (method git-fetch)
                 (uri (git-reference
                       (url "https://github.com/openssh/openssh-portable.git")
                       (commit xcommit)))
                 (file-name (git-file-name name version))
                 (patches (search-patches "openssh-trust-guix-store-directory.patch"))
                 (sha256
                  (base32 "1sary1ig972l4zjvpzncf9whfp5ab8snff2fw9sy5a8pda5n2a7w")))))))

    The crux of the above snippet is the inherit line, which hides all the package definition complexity, and the adapted source block, which takes a specific git commit from the OpenSSH repository. Another way would be to add an extra patch to the source block which contains just the upstream fix for ppc64.

    With this new package definition, the new version can be installed, but all packages which depend on openssh are still using the original version. We want some way to go over everything that depends on openssh and replace its dependency with the new openssh-next package.

    This is where GUIX can make your life easier. The GUIX api provides a couple of ways to do this. I built it up around the functions package-input-rewriting/spec and package-mapping.

    The first takes a list of replacements, where each element of the list is a pair of a package spec and a procedure that takes the matching package and returns its replacement.

    The second takes a procedure and returns a new procedure that, given a package, applies the first one to all packages it depends on and returns the resulting package. I wrapped both functions to make the call for the relevant packages a bit simpler.

        ;; Given a package spec, replace input `old` with `new` for that package incl. its dependents
        ;; Return a procedure which takes the package as parameter
        (define (package-input-replace old new)
          (package-input-rewriting/spec
           `((,old . ,(const (specification->package new))))))

        ;; Apply the input replace for openssh
        ;; pass each package which fails to build due to openssh as dependency
        (define (openssh-fix package)
          ((package-mapping
            (package-input-replace "openssh" "openssh-next"))
           (specification->package package)))

        ;; Two examples on what to put in the manifest
        (openssh-fix "gvfs")
        (openssh-fix "remmina")

    With this solution in place, the whole local package set builds again (it takes a while though, as it often does with GUIX) and I can take advantage of the new OpenSSH release. There is a bit of 'keeping an eye on it' involved from this point on though. Once upstream fixes the issue, I want to take the above Scheme code out of my manifest and go back to the upstream openssh package instead of my openssh-next definition.

  • Jaguar XKR leather interior restoration

    calendar 2023-12-06 · 1 min read · jaguar car restoration video  ·

    I had the leather interior of my Jaguar XKR restored last week and the supplier provided me with a video of it; in the "before/after" style. Rob did an amazing job on the whole of the interior.

    /assets/video/xkr-interior.mp4
    Impression of XKR interior restoration
  • Process DMARC reports with sieve

    calendar 2022-08-10 · 4 min read · sieve dovecot mail  ·

    I get a lot of DMARC reports because I host mail for a couple of domains. Most of these mails require no attention as they are just notifications that others use one of our domains. I want to separate these mails from my normal mail workflow and auto archive them if I haven't looked at them within, say, 2 weeks.

    Doing this with sieve server-side has my preference, but apparently it's not trivial to determine the age of a message, which is the core logic needed here. Also, the processing of sieve rules is normally only during reception of messages, not ad-hoc or on some other event, although dovecot and pigeonhole have some options for this, among others the sieve-filter tool.

    I really only found one implementation online which roughly solves the same problem I was having, but it involved more than I think is needed.

    My solution consists of 3 parts:

    1. the sieve script that handles DMARC reports on reception and ageing;
    2. use of an extension that calls an external program to evaluate expressions to determine age;
    3. a daily job that runs the sieve script in the scope of the designated folder.

    Here's the sieve script which deals with DMARC reports both in the normal INBOX flow and a special treatment after 14 days. The latter part is not automatic by dovecot on reception of emails, but triggered by a run of the sieve-filter program.

        require ["date","fileinto","relational","variables","environment","imap4flags",
                 "vnd.dovecot.execute", "vnd.dovecot.environment"];

        # Parameters
        set "dmarc_folder" "Folder.for.dmarc-reports";
        set "purge_days" "14";

        # Move DMARC notifications when received
        if environment :is "vnd.dovecot.default-mailbox" "INBOX" {
          if anyof (
            header :contains "From" "dmarcreport@microsoft.com",
            header :contains "From" "noreply-dmarc-support@google.com",
            header :contains "From" "opendmarc@mail.arctype.co",
            header :contains "From" "opendmarc@box.euandre.org"  )
          {
            addflag "\\Seen";
            fileinto "${dmarc_folder}";
            stop;
          }
        }

        # When running in the dmarc_folder, archive when age is <purge_days>
        if environment :is "vnd.dovecot.default-mailbox" "${dmarc_folder}"
        {
          if currentdate :matches "julian" "*"
          {
            # Run a simple bc expression to get <purge_days> ago from today's julian day
            execute :output "purge_date" "bc" "${1} - ${purge_days}";

            # Compare this with Date header and archive when age reached
            if date :value "le" "Date" "julian" "${purge_date}"
            {
              fileinto "Trash";
              stop;
            }
          }
        }

    The first part of the sieve script just moves the mails into the dmarc-reports folder and is a normal sieve processing rule. The second part runs if the default folder is the dmarc-reports folder. If so, it uses the extprograms extension of the sieve interpreter to let the bc program evaluate the expression for the purge date, against which the age of the message is tested.

    This uses a tiny script in the configured sieve execute bin directory of the extprograms extension:

        #!/bin/sh
        # Evaluate the expression passed as first argument with bc
        echo "${1}" | /usr/bin/bc

    which just pipes the input given by the sieve line into the bc program. On return, stdout is put into the purge_date variable. I'm using execute because I do not need to pipe the whole message into the external program; I only pass the input explicitly.

    With the above configuration I can set a cron job in the crontab of the vmail user to run

        sieve-filter -We -u <mymailaccount> \
                     /path/to/vmail/mymailaccount/sieve/dmarc-archiver.sieve \
                     Folder.for.dmarc-reports

    which executes the sieve script mentioned above in the IMAP folder <dmarc_folder> only.
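
    The age test in the sieve script boils down to integer arithmetic on day numbers. The sketch below mirrors it in Python, under the assumption that the "julian" date part is the Modified Julian Day (days since 1858-11-17), as defined by the sieve date extension in RFC 5260. The function names are mine and not part of any sieve tooling.

```python
from datetime import date

MJD_EPOCH = date(1858, 11, 17)   # Modified Julian Day 0


def mjd(day):
    """Modified Julian Day number of a calendar date."""
    return (day - MJD_EPOCH).days


def should_archive(message_day, today, purge_days=14):
    """Mirror of the sieve logic: archive when message date <= today - purge_days."""
    purge_date = mjd(today) - purge_days      # the expression handed to bc
    return mjd(message_day) <= purge_date     # the `date :value "le"` test


today = date(2022, 8, 10)
print(mjd(today))
print(should_archive(date(2022, 7, 27), today))   # exactly 14 days old
print(should_archive(date(2022, 7, 28), today))   # 13 days old
```

    A message that is exactly purge_days old is archived, and a newer one is left alone, which matches the "le" comparison in the script.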

    I'm not sure why sieve makes it so difficult to get the age of an email (unless I'm missing something). Protonmail solves this with a custom extension, 'vnd.proton.eval', which does something similar to the above, but within the sieve language itself, without having to shell out to an external program explicitly. (I think; I have not seen their implementation.)

    My approach above obviously has some drawbacks:

    • the external bc program is called for every mail that matches; fine for 10 or 20, I guess, but rather inefficient if the number of matched messages is large. For now, not a problem.
    • I am unsure what security consequences this has; the execution scope and environment are very limited, but we're still giving control to a script that calls other programs.
  • Sunset on dutch beach

    calendar 2017-11-22 · 0 min read · hiking nl photo dro  ·

Copyright © 2003-  MARCEL VAN DER BOOM. All rights reserved
