Reproducible software provisioning
for HPC and RSE using Spack

Martin Lang,¹² Henning Glawe,¹² Jehferson Mello,¹² Hans Fangohr¹²³

¹Max Planck Institute for the Structure and Dynamics of Matter, Hamburg, Germany
²Center for Free-Electron Laser Science, Hamburg, Germany
³University of Southampton, Southampton, United Kingdom

martin.lang@mpsd.mpg.de

Slides: https://s.gwdg.de/pYcb48

Outline

  • Introduction
  • Spack package manager
  • Setup for our HPC
  • Summary

Introduction

Overall aim

  • We run an HPC system at our institute
  • Provide full software stack on local HPC
    • Pre-installed packages: Octopus, Python via miniforge, …
    • Toolchains (compiler + MPI + other dependencies)
    • Expose software stack via environment modules
  • Ability to install the same software stack on a laptop or desktop
  • Script/automate installation as much as possible

Final setup

$ module avail
--------------------- /opt_mpsd/linux-debian12/25a/lmod/Core ---------------------
   gcc/12.3.0    gcc/14.2.0            (L)   intel-oneapi-compilers/2023.2.4
   gcc/13.2.0    miniforge3/24.11.2-1        intel-oneapi-compilers/2025.0.0

------------------ /opt_mpsd/linux-debian12/25a/lmod/gcc/14.2.0 ------------------
   fftw/3.3.10       octopus-dependencies/full              py-h5py/3.12.1
   gsl/2.8           openblas/0.3.28                        python/3.11.7
   hdf5/1.14.5       openmpi/4.1.5              (L)         valgrind/3.23.0

------- /opt_mpsd/linux-debian12/25a/lmod/openmpi/4.1.5-jmeo3kl/gcc/14.2.0 -------
   berkeleygw/3.1.0         fftw/3.3.10       netcdf-fortran/4.6.1
   berkeleygw/4.0           hdf5/1.14.5       netlib-scalapack/2.2.0

Installing software is hard

  • Packages offer many features; the available options and their syntax differ for almost every package
  • Many build systems, each with its own syntax and features
    • plain Makefiles
    • Autotools
    • CMake
  • Version constraints on dependencies (direct and transitive)

Package managers

  • Automate installation process
  • Abstract the different build tools
  • Examples:
    • Distribution package managers (apt, rpm, …)
    • pip
    • Conda
    • Nix
    • Spack

Spack

https://spack.readthedocs.io

Spack demo:
installing and managing packages

Spack: minimal example

$ git clone https://github.com/spack/spack.git
$ cd spack && source share/spack/setup-env.sh
$ spack install zlib
[...]
==> Installing zlib-1.3.1-u37gjgxqi7fzjox4gmajk4wxj7jyisn2 [4/4]
==> No binary for zlib-1.3.1-u37gjgxqi7fzjox4gmajk4wxj7jyisn2 found: installing from source
==> Fetching https://mirror.spack.io/_source-cache/archive/9a/9a93b2b7dfdac77ceba5a558a580e74667dd6fede4585b91eefb60f03b72df23.tar.gz
==> No patches needed for zlib
==> zlib: Executing phase: 'edit'
==> zlib: Executing phase: 'build'
==> zlib: Executing phase: 'install'
==> zlib: Successfully installed zlib-1.3.1-u37gjgxqi7fzjox4gmajk4wxj7jyisn2
  Stage: 0.24s.  Edit: 0.41s.  Build: 1.17s.  Install: 0.04s.  Post-install: 0.01s.  Total: 1.97s
[+] /spack/opt/spack/linux-debian12-zen3/gcc-12.2.0/zlib-1.3.1-u37gjgxqi7fzjox4gmajk4wxj7jyisn2
$ spack find
-- linux-debian12-zen3 / gcc@12.2.0 -----------------------------
gcc-runtime@12.2.0  glibc@2.36  gmake@4.4.1  zlib@1.3.1
==> 4 installed packages

Simple spec syntax

$ spack install zlib@1.2.13~optimize ^gmake@4.1 %gcc@12.2.0
$ spack info zlib
MakefilePackage:   zlib

Description:
  A free, general-purpose, legally unencumbered lossless data-compression
  library.

Homepage: https://zlib.net

Preferred version:
  1.3.1     http://zlib.net/fossils/zlib-1.3.1.tar.gz

Safe versions:
  1.3.1     http://zlib.net/fossils/zlib-1.3.1.tar.gz
  1.3       http://zlib.net/fossils/zlib-1.3.tar.gz
  1.2.13    http://zlib.net/fossils/zlib-1.2.13.tar.gz

Deprecated versions:  None

Variants:
  build_system [makefile]   generic, makefile
    Build systems supported by the package

  optimize [true]           false, true
    Enable -O2 for a more optimized lib

  pic [true]                false, true
    Produce position-independent code (for shared libs)

  shared [true]             false, true
    Enables the build of shared libraries.

Build Dependencies:
  c  cxx  gmake

Link Dependencies:
  None

Run Dependencies:
  None

Licenses:  Zlib
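
The spec string on this slide combines several sigils: `@` for a version, `+`/`~` to enable or disable a variant, `^` for a dependency and `%` for a compiler. The toy parser below splits such a string into its parts to make the structure explicit. It is a simplified sketch, not Spack's parser: `parse_spec` and `parse_node` are hypothetical names, only the sigils used on this slide are handled, and the question of which package a `%` constraint attaches to is deliberately not modelled.

```python
import re

# name, optional @version, then any number of +variant / ~variant flags
NODE = re.compile(
    r"^(?P<name>[A-Za-z0-9_-]+)(?:@(?P<version>[^+~]+))?(?P<flags>(?:[+~][A-Za-z0-9_]+)*)$"
)


def parse_node(token):
    """Parse a single token such as 'zlib@1.2.13~optimize'."""
    match = NODE.match(token)
    if match is None:
        raise ValueError(f"unsupported spec token: {token!r}")
    variants = {
        flag[1:]: flag[0] == "+"  # '+' enables a variant, '~' disables it
        for flag in re.findall(r"[+~][A-Za-z0-9_]+", match["flags"])
    }
    return {"name": match["name"], "version": match["version"], "variants": variants}


def parse_spec(text):
    """Split a whitespace-separated spec into root, dependencies and compiler."""
    root, dependencies, compiler = None, [], None
    for token in text.split():
        if token.startswith("^"):
            dependencies.append(parse_node(token[1:]))
        elif token.startswith("%"):
            compiler = token[1:]
        elif root is None:
            root = parse_node(token)
        else:
            raise ValueError(f"unexpected token: {token!r}")
    return {"root": root, "dependencies": dependencies, "compiler": compiler}


print(parse_spec("zlib@1.2.13~optimize ^gmake@4.1 %gcc@12.2.0"))
```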

Multiple instances of zlib

$ spack spec -l zlib
-   yrjz5zr  zlib@1.3.1+optimize+pic+shared build_system=makefile platform=linux os=debian13 target=zen3 %c,cxx=gcc@14.2.0
-   ftt2ys2      ^gmake@4.4.1~guile build_system=generic platform=linux os=debian13 target=zen3 %c=gcc@14.2.0
...
$ spack spec -l zlib@1.2.13
-   cgmszik  zlib@1.2.13+optimize+pic+shared build_system=makefile platform=linux os=debian13 target=zen3 %c,cxx=gcc@14.2.0
-   ftt2ys2      ^gmake@4.4.1~guile build_system=generic platform=linux os=debian13 target=zen3 %c=gcc@14.2.0
$ spack spec -l zlib@1.2.13~optimize
-   5ocghed  zlib@1.2.13~optimize+pic+shared build_system=makefile platform=linux os=debian13 target=zen3 %c,cxx=gcc@14.2.0
-   ftt2ys2      ^gmake@4.4.1~guile build_system=generic platform=linux os=debian13 target=zen3 %c=gcc@14.2.0
$ spack spec -l zlib@1.2.13~optimize ^gmake@4.1
-   m6aoqx7  zlib@1.2.13~optimize+pic+shared build_system=makefile platform=linux os=debian13 target=zen3 %c,cxx=gcc@14.2.0
-   xneupdj      ^gmake@4.1~guile build_system=generic patches:=ca60bd9 platform=linux os=debian13 target=zen3 %c=gcc@14.2.0
$ spack location -i zlib@1.3.1
.../spack/opt/spack/linux-zen3/zlib-1.3.1-yrjz5zrq33hmxyud6owlnllfanicnisq
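
In the output above, every change to version, variant or dependency yields a new hash, while the unchanged gmake@4.4.1 keeps the same one. Because the hash is part of the install prefix, all these zlib builds can coexist. The principle can be sketched by hashing a canonical description of the fully specified package. This is an illustration only: `toy_hash` is a hypothetical helper, the dictionaries are hand-written stand-ins for a concretized spec, and Spack computes its hashes differently in detail.

```python
import base64
import hashlib
import json


def toy_hash(spec, length=7):
    """Hash a canonical JSON form of the spec and return a short base32 string."""
    canonical = json.dumps(spec, sort_keys=True).encode()
    digest = hashlib.sha256(canonical).digest()
    return base64.b32encode(digest).decode().lower()[:length]


# Hand-written stand-ins for three of the concretized specs shown above.
base = {
    "name": "zlib",
    "version": "1.2.13",
    "variants": {"optimize": True, "pic": True, "shared": True},
    "dependencies": {"gmake": "4.4.1"},
}
no_opt = {**base, "variants": {**base["variants"], "optimize": False}}
old_make = {**no_opt, "dependencies": {"gmake": "4.1"}}

for spec in (base, no_opt, old_make):
    # Short hash, then a prefix-like directory name containing the longer hash.
    print(toy_hash(spec), f"{spec['name']}-{spec['version']}-{toy_hash(spec, 32)}")
```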

Other Spack features

  • environments: group software together
  • automatic generation of module files
  • source mirror and binary cache
  • use external packages, e.g. Slurm

Python packages

$ spack list py-
py-3to2                                   py-munch
py-4suite-xml                             py-munkres
py-a2wsgi                                 py-murmurhash
py-abcpy                                  py-mutagen
py-abipy                                  py-mx
py-about-time                             py-mxfold2
py-absl-py                                py-myhdl
py-accelerate                             py-mypy
...
py-multidict                              py-zope-event
py-multiecho                              py-zope-interface
py-multipledispatch                       py-zstandard
py-multiprocess                           py-zxcvbn
py-multiqc                                pypy-bootstrap
==> 2774 packages
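
The names in this listing follow a convention: the PyPI name is lowercased, `.` and `_` become `-`, and `py-` is prepended. The Python class inside a package file is then the CamelCase form of the Spack name. The sketch below illustrates both steps. The function names are hypothetical, the logic is a simplification of Spack's own name handling, and individual packages may deviate from the convention.

```python
import string


def spack_name_from_pypi(pypi_name):
    """Apply the usual naming convention for Python packages in Spack."""
    return "py-" + pypi_name.lower().replace("_", "-").replace(".", "-")


def class_name(package_name):
    """Turn a Spack package name into the class name used in package.py."""
    name = string.capwords(package_name.replace("_", "-"), "-").replace("-", "")
    # A Python identifier must not start with a digit.
    return f"_{name}" if name[0].isdigit() else name


for pypi in ("zope.interface", "mypy", "4Suite-XML"):
    spack_name = spack_name_from_pypi(pypi)
    print(pypi, spack_name, class_name(spack_name))
```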

Defining a package: py-tomli/package.py

# Copyright Spack Project Developers. See COPYRIGHT file for details.
#
# SPDX-License-Identifier: (Apache-2.0 OR MIT)

from spack_repo.builtin.build_systems.python import PythonPackage

from spack.package import *


class PyTomli(PythonPackage):
    """Tomli is a Python library for parsing TOML.

    Tomli is fully compatible with TOML v1.0.0."""

    homepage = "https://github.com/hukkin/tomli"
    pypi = "tomli/tomli-2.0.1.tar.gz"
    git = "https://github.com/hukkin/tomli.git"

    maintainers("charmoniumq")

    license("MIT")

    version("2.0.1", sha256="de526c12914f0c550d15924c62d72abc48d6fe7364aa87328337a31007fe8a4f")
    version("1.2.2", sha256="c6ce0015eb38820eaf32b5db832dbc26deb3dd427bd5f6556cf0acac2c214fee")
    version("1.2.1", sha256="a5b75cb6f3968abb47af1b40c1819dc519ea82bcc065776a866e8d74c5ca9442")

    # https://github.com/hukkin/tomli/blob/2.0.1/pyproject.toml#L2
    depends_on("py-flit-core@3.2:3", type="build")

    # https://github.com/hukkin/tomli/blob/2.0.1/pyproject.toml#L13
    depends_on("python@3.6:", type=("build", "run"))
    depends_on("python@3.7:", type=("build", "run"), when="@2.0.1:")
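
The `depends_on` lines use version ranges (`3.2:3`, `3.6:`) and a `when=` condition that activates a constraint only for matching package versions. The sketch below shows how such ranges can be read. It is a simplified model for purely numeric, dotted versions, not Spack's version implementation, and `satisfies` and `python_requirements` are hypothetical helper names.

```python
def as_tuple(version):
    """Convert '3.2.1' to (3, 2, 1); only numeric components are supported."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version, constraint):
    """Check a version against 'lo:hi', 'lo:', ':hi' or a plain prefix 'x.y'."""
    v = as_tuple(version)
    if ":" not in constraint:
        prefix = as_tuple(constraint)
        return v[: len(prefix)] == prefix
    lo, hi = constraint.split(":")
    if lo and v < as_tuple(lo):
        return False
    if hi:
        upper = as_tuple(hi)
        # The upper bound is inclusive at its own precision: ':3' admits 3.9.0.
        if v[: len(upper)] > upper:
            return False
    return True


def python_requirements(tomli_version):
    """Mirror the two depends_on("python@...") lines of the package above."""
    requirements = ["3.6:"]
    if satisfies(tomli_version, "2.0.1:"):  # corresponds to when="@2.0.1:"
        requirements.append("3.7:")
    return requirements


print(satisfies("3.9.0", "3.2:3"), satisfies("4.0", "3.2:3"))
print(python_requirements("1.2.2"), python_requirements("2.0.1"))
```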

Our setup

Why Spack?

  • Provides required flexibility
  • Focus on HPC (different architectures, optimised compilation)
  • Large software stack for HPC and RSE (~8500 packages)
  • Active development and active community (~1500 contributors, ~40,000 commits)
  • Distribute our own software

Required software

  • toolchains
    • compilers: gcc and intel in multiple versions
    • MPI: openmpi and intel-oneapi-mpi
    • other dependencies: ~30–40 packages depending on toolchain
  • applications
    • Python environment based on miniforge (anaconda-like)

Custom tooling

Our workflow

  • separate Spack environment per toolchain
  • group toolchains into MPSD software releases
    (git branches)
  • install full toolchain with a single command:

    $ mpsd-software install 25a gcc-14_2_0
    
  • overall ~600 packages in one software release
  • compile for 6 different (micro-)architectures available in our HPC system

HPC user perspective

Challenges

  • terse or confusing error messages from the Spack concretizer
  • problematic mirrors (e.g. Dropbox, also Intel)
  • bugs
    • in the actual packages
    • in Spack's package definitions
    • in Spack
  • Spack's documentation (albeit quite good) lacks detail in some areas
  • not completely independent of OS

Summary