3. Software
3.1. SCM Installation Guidelines
3.1.1. For beginners (Docker+Jupyter)
The simplest way to start using the SCM (symbolic-compartmental-model) package is to use our Docker image. This works on any OS (Windows, macOS, Linux) and requires nothing except a Docker installation (which is free and simple to set up). You can then do all the coding, including running analyses and generating plots, in your browser (via Jupyter notebooks).
If this is what you want, simply follow the instructions at Running SCM in a JupyterLab Environment using Docker.
3.1.2. For Python novices (PyPI package)
SCM (symbolic-compartmental-model) can be installed with any recent installation of pip.
It is highly recommended to use a dedicated Python virtual environment. If you don’t know how to set one up, many online guides explain it.
After setting up the environment, install SCM using
pip install symbolic-compartmental-model
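To confirm that pip registered the package in the active environment, you can query its installed metadata. The following is a minimal sketch using only the Python standard library; the helper name installed_version is ours, and only the distribution name symbolic-compartmental-model is taken from the command above.

```python
from importlib import metadata


def installed_version(dist_name="symbolic-compartmental-model"):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("symbolic-compartmental-model is not installed in this environment")
    else:
        print(f"symbolic-compartmental-model {version} is installed")
```

If the check reports that the package is missing, the most likely cause is that pip installed it into a different environment than the one running the script.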
3.1.3. For advanced developers (cloning from git)
SCM is completely open source and community developed, hosted on GitLab. If you’d like to contribute to its development, feel free to fork, open new issues, and suggest pull requests.
3.1.4. Another advanced option (CLI)
We also have a Command Line Interface (CLI) which can be useful in some scenarios. Follow the instructions at Using the SCM Command-Line Interface.
3.2. Running SCM in a JupyterLab Environment using Docker
This section explains how to run the pre-configured JupyterLab environment on your own computer using Docker. You do not need to install Python or any dependencies — Docker takes care of everything.
3.2.1. What you need
Docker Desktop: the free application that runs Docker on your machine. Download it from https://docs.docker.com/get-docker/ and follow the installation instructions for your operating system (Windows, macOS, or Linux). After installation, start Docker Desktop and wait until it shows a green “running” status in the system tray.
The docker-compose.yml file: download it from the repository and place it in a folder of your choice. You do not need anything else from the repository; this single file is enough.
3.2.2. Starting the Docker container
Open a terminal (macOS/Linux) or PowerShell / Command Prompt (Windows) and
navigate to the folder that contains docker-compose.yml:
cd /path/to/folder-with-docker-compose
Then run:
docker compose up
Docker will pull the pre-built image from Docker Hub and start JupyterLab. This may take a minute the first time while the image downloads. When you see output like:
jupyter | http://localhost:8888/lab
the server is ready.
3.2.3. Opening JupyterLab
Open your web browser and go to:
http://localhost:8888/lab
You will see the JupyterLab interface with four folders in the file browser on the left:
scripts_readonly/: example notebooks which can help you start your own.
scripts/: the place to store your analysis notebooks.
data/: the place for your input files.
results/: where you save outputs.
The first one (scripts_readonly/) is an internal read-only directory, and can only be viewed from within JupyterLab.
The other three are created when you run Docker for the first time and are empty at first. They are intended
for your own scripts and data, and for the results you write. Importantly, the files in these directories remain
available even after you stop the Docker container, and you can access them with your computer's file browser.
When you start the Docker container again, the files will still be there.
3.2.4. Uploading your data
In the JupyterLab file browser, navigate to the data/ folder.
Click the Upload Files button (arrow icon at the top of the file browser).
Select your CSV file from your computer.
Alternatively, you can copy files directly into the data/ folder on your
computer — they will appear inside the Docker container immediately.
3.2.5. Running a notebook
Navigate to the scripts_readonly/ folder in the file browser and copy one of the examples (e.g. using Ctrl+C).
Navigate to the scripts/ folder and paste the notebook there (using the right-click menu).
Double-click the notebook (.ipynb file) to open it.
Follow any instructions in the notebook to point it at your data file in data/.
Run the notebook by selecting Run → Run All Cells from the menu, or by pressing Shift+Enter to run it cell by cell.
3.2.6. Saving results
Save any output files (plots, processed CSV files, etc.) to the results/
folder inside JupyterLab. Because the data/, scripts/, and results/ folders
are linked to folders of the same names next to docker-compose.yml on your
computer, their files remain available even after the Docker container is stopped.
3.2.7. Stopping the Docker container
Press Ctrl+C in the terminal where docker compose up is running, then
run:
docker compose down
This shuts down the Docker container cleanly. Your data/, results/, and scripts/ folders
are not affected, and you can still access them using a file browser.
3.2.8. Starting again
After the first run, subsequent starts are fast because the image is already cached locally:
docker compose up
To get the latest version of the image (e.g. after a new release), run:
docker compose pull
docker compose up
3.3. Using the SCM Command-Line Interface
scm-fit is a command-line tool for fitting a symbolic compartmental model to labeling
time-series data stored in a CSV file. It wraps the fit()
method and exposes most of its tuning knobs as command-line flags.
3.3.1. Download
Pre-built binaries for Linux x86_64 and Windows x86_64 are attached to each release (no Python installation required — all dependencies are bundled). Download them from the Releases page.
Linux:
chmod +x scm-fit-linux-x86_64
./scm-fit-linux-x86_64 --help
Windows — download scm-fit-windows-x86_64.exe and run it from a
Command Prompt or PowerShell:
scm-fit-windows-x86_64.exe --help
The binaries are produced by PyInstaller with Python 3.13 as part of the CI pipeline.
3.3.2. Quick start
scm-fit data.csv
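The CSV file must contain at least two columns named time and unlabeled_fraction. Options can be added to the same call, for example to fit a 3-state model with 100 basin-hopping iterations and a fixed random seed, saving the plot to fit.pdf: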
scm-fit data.csv -n 3 --niter 100 --seed 42 -o fit.pdf
3.3.3. Synopsis
scm-fit [-h] [-n N] [-g MU] [--lb LB] [--ub UB] [-o FILE]
[--check-mass-balance] [--no-basinhopping] [--no-precompile]
[--niter N] [-T T] [--stepsize S] [--niter-success N] [--seed SEED]
[--tol TOL] [--maxiter N]
csv_file
3.3.4. Positional argument
csv_file
Path to a CSV file containing the labeling time series. The file must include columns named time (time points) and unlabeled_fraction (observed fraction of unlabeled molecules at each time point). All other columns are ignored.
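Before starting a long fit, it can be worth checking that the input file has the two required columns. The sketch below is not part of scm-fit; the helper name missing_columns is hypothetical. It assumes a comma-separated file whose first row is the header and matches column names exactly. The example values are made up for illustration.

```python
import csv
import tempfile
from pathlib import Path

REQUIRED_COLUMNS = ("time", "unlabeled_fraction")


def missing_columns(csv_path):
    """Return the required column names that are absent from the CSV header row."""
    with open(csv_path, newline="") as handle:
        # An empty file yields an empty header, so both names are reported missing.
        header = next(csv.reader(handle), [])
    return [name for name in REQUIRED_COLUMNS if name not in header]


if __name__ == "__main__":
    # Write a tiny example file with made-up values, then check its header.
    with tempfile.TemporaryDirectory() as tmp:
        example = Path(tmp) / "data.csv"
        example.write_text(
            "time,unlabeled_fraction,replicate\n"
            "0,1.00,A\n"
            "2,0.71,A\n"
            "4,0.52,A\n"
        )
        # An empty list means both required columns are present.
        print(missing_columns(example))
```

Extra columns such as replicate in the example are harmless, since all columns other than time and unlabeled_fraction are ignored.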
3.3.5. Model arguments
-n N / --num_states N (default: 2)
Number of compartment states in the model, between 1 and 10. Each state represents a distinct pool with its own turnover rate constant k. Increasing the number of states allows the model to describe more complex, multi-exponential labeling dynamics, but raises the risk of over-fitting for sparse data. Use the reported BIC and ΔAIC values to compare models of different complexity.

-g MU / --growth-rate MU (default: 0.0)
Specific growth rate µ (same units as the turnover rate constants, typically h⁻¹ or d⁻¹). Set this to the measured growth rate of the organism when fitting data from a growing culture. A non-zero value activates a feasibility check that penalises parameter combinations for which no positive steady-state pool-size vector exists under the given growth rate.

--lb LB (default: 0.01)
Lower bound applied to all rate parameters during optimisation. Parameters are constrained to the interval [lb, ub]. Setting a smaller lower bound (e.g. 1e-4) is useful when very slow turnover is expected; raising it can speed up convergence when slow rates are biologically implausible.

--ub UB (default: 10.0)
Upper bound applied to all rate parameters. If the fitted rates cluster near the upper bound, consider increasing --ub to ensure the optimiser can explore faster turnover.

-o FILE / --output FILE
Save a plot of the data and the fitted curve to FILE. The format is inferred from the file extension (e.g. fit.pdf, fit.png). Requires matplotlib.
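To make the model-comparison advice for -n concrete, the sketch below shows the textbook least-squares forms of AIC and BIC. How scm-fit itself computes its reported values is not specified here, so treat this as an illustration under the assumption of independent Gaussian errors. The function names, and the idea of passing the residual sum of squares (rss) and the parameter count by hand, are ours.

```python
import math


def aic_bic(rss, n_points, n_params):
    """Least-squares AIC and BIC, up to an additive constant shared by all models.

    rss      -- residual sum of squares of the fit (must be positive)
    n_points -- number of data points
    n_params -- number of free parameters in the model
    """
    if rss <= 0 or n_points <= 0:
        raise ValueError("rss and n_points must be positive")
    log_term = n_points * math.log(rss / n_points)
    aic = log_term + 2 * n_params
    bic = log_term + n_params * math.log(n_points)
    return aic, bic


def delta_aic(aic_values):
    """Difference between each model's AIC and the smallest AIC in the set."""
    best = min(aic_values)
    return [value - best for value in aic_values]
```

Both criteria reward a lower residual and penalise extra parameters, with BIC penalising more strongly once there are more than about seven data points (ln n > 2). The model with ΔAIC = 0 is the preferred one in the set; an additional state is worth keeping only if it lowers the criterion despite the penalty.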
3.3.6. Fit options
--check-mass-balance (default: off)
Enforce a mass-balance constraint during optimisation. When enabled, the objective function adds a large penalty for any parameter combination that violates the mass-balance condition (total influx ≠ total efflux across all pools). This is appropriate for closed systems where no net synthesis or degradation is expected. Leaving this off (the default) allows the model to describe open systems or growing cultures where net flux may exist.

--no-basinhopping (default: off, i.e. basin-hopping is on)
Disable the basin-hopping global search and fall back to a single local minimisation. Basin-hopping is a stochastic global optimisation algorithm that repeatedly perturbs the current solution and runs a local minimiser, accepting or rejecting the new point based on a Metropolis criterion. It is strongly recommended for problems with multiple local minima (which is common for multi-state compartmental models). Use --no-basinhopping only when the optimisation landscape is known to be unimodal, or when a quick deterministic run is needed for scripting.

--no-precompile (default: off, i.e. pre-compilation is on)
Disable symbolic pre-compilation of the labeling function f(t). By default, the matrix exponential is computed symbolically once and then turned into a fast numerical lambda. This is approximately 20× faster than numerical matrix exponentiation at every time point. However, the symbolic approach can fail with a division-by-zero when two or more eigenvalues of the turnover matrix are numerically identical (degenerate spectrum). In that case, disable pre-compilation with this flag so that the matrix exponential is evaluated numerically at each time step, which is more robust.
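The division-by-zero failure behind --no-precompile can be reproduced in a much smaller setting than SCM's general model. The sketch below is an illustration only and is not SCM's symbolic code. It assumes a hypothetical two-pool chain in which pool 1 drains into pool 2 at rate k1 and pool 2 drains at rate k2, starting from x1(0) = 1 and x2(0) = 0. The closed-form solution for pool 2 contains the factor 1/(k2 − k1), which is undefined when the two rates (the eigenvalues of this system) coincide.

```python
import math


def pool2_closed_form(t, k1, k2):
    """Closed-form content of pool 2 for the chain 1 -> 2 -> out.

    Valid only for k1 != k2; raises ZeroDivisionError when the rates coincide.
    """
    return k1 / (k2 - k1) * (math.exp(-k1 * t) - math.exp(-k2 * t))


def pool2_robust(t, k1, k2, rel_tol=1e-9):
    """Same quantity, using the degenerate limit k1*t*exp(-k1*t) when k1 ~ k2."""
    if math.isclose(k1, k2, rel_tol=rel_tol):
        return k1 * t * math.exp(-k1 * t)
    return pool2_closed_form(t, k1, k2)


if __name__ == "__main__":
    print(pool2_robust(1.0, 0.5, 0.5))       # degenerate case handled via the limit
    print(pool2_closed_form(1.0, 0.5, 0.6))  # distinct rates: closed form is fine
    try:
        pool2_closed_form(1.0, 0.5, 0.5)
    except ZeroDivisionError as error:
        print("closed form failed:", error)
```

The fallback here switches to the analytic limit because that is the simplest standard-library stand-in. According to the description above, scm-fit instead evaluates the matrix exponential numerically when --no-precompile is given; either way the problematic 1/(k2 − k1) factor is avoided.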
3.3.7. Basin-hopping options
These options are ignored when --no-basinhopping is set.
--niter N (default: 20)
Number of basin-hopping iterations. Each iteration perturbs the current solution and runs a local minimisation. Increasing --niter gives the optimiser more opportunities to escape local minima and is recommended for problems with many parameters (e.g. -n 4 or higher) or noisy data. Values of 50–200 are common for difficult fits; the trade-off is proportionally longer run time.

-T T / --temperature T (default: 1.0)
Temperature parameter for the Metropolis accept/reject criterion. After each local minimisation the new solution is accepted if its objective value is lower than that of the current solution, or with probability exp(−ΔE / T) if it is higher (where ΔE is the increase in objective value). A higher temperature makes the search more exploratory (worse solutions are accepted more readily), while a lower temperature makes it greedier. For well-scaled residuals in [0, 1], the default of 1.0 works well; if the objective values are much larger (e.g. hundreds), try increasing T proportionally so that uphill moves are still occasionally accepted.

--stepsize S (default: 0.5)
Initial step size for the random displacement applied between basin-hopping iterations. The step is drawn uniformly from [−S, +S] and added to each parameter. Because parameters are bounded to [lb, ub], the effective step is clipped. A larger step size promotes exploration of distant basins but may also waste iterations on infeasible regions. If ub − lb is much smaller than 1, reduce --stepsize accordingly.

--niter-success N (default: disabled)
Stop the basin-hopping loop early if the best objective value does not improve for N consecutive iterations. This is a convergence criterion that can substantially reduce run time when a good solution is found quickly. For example, --niter 100 --niter-success 20 will run up to 100 iterations but stop after 20 consecutive non-improving steps.

--seed SEED (default: no fixed seed)
Integer random seed passed to the basin-hopping random number generator. Setting a fixed seed makes runs fully reproducible. Without a seed, each run may find a slightly different solution due to the stochastic perturbations (though all should be close if --niter is large enough).
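The roles of -T and --stepsize are easier to see in code. The sketch below implements the two rules as they are described above, namely the Metropolis acceptance probability and a uniform step clipped to [lb, ub]. It is a standalone illustration rather than the code scm-fit runs, and the function names are ours.

```python
import math
import random


def acceptance_probability(delta_e, temperature):
    """Metropolis rule: downhill moves always pass, uphill moves pass with exp(-dE/T)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if delta_e <= 0:
        return 1.0
    return math.exp(-delta_e / temperature)


def perturb(params, stepsize, lb, ub, rng):
    """Add a uniform step from [-stepsize, +stepsize] to each parameter, clipped to [lb, ub]."""
    return [min(ub, max(lb, p + rng.uniform(-stepsize, stepsize))) for p in params]


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed, analogous to --seed
    for temperature in (0.5, 1.0, 2.0):
        print(temperature, acceptance_probability(1.0, temperature))
    print(perturb([0.02, 1.0, 9.9], stepsize=0.5, lb=0.01, ub=10.0, rng=rng))
```

For an uphill move with ΔE = 1, the acceptance probability is about 0.14 at T = 0.5, 0.37 at T = 1.0, and 0.61 at T = 2.0, which is why T should be scaled with the typical size of the objective. The clipping in perturb also shows why a step size comparable to or larger than ub − lb is wasteful, since most proposals then land on a bound.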
3.3.8. Local minimizer options
--tol TOL (default: solver default)
Convergence tolerance for the local minimiser that is called at each basin-hopping iteration. The exact meaning depends on the underlying solver, but generally a smaller value requires a tighter fit before declaring convergence, at the cost of more function evaluations per iteration. The default (None) lets the solver choose an appropriate tolerance automatically.

--maxiter N (default: solver default)
Maximum number of iterations allowed for the local minimiser in each basin-hopping step. Increasing this can help when the local landscape is flat and the minimiser struggles to converge within the default budget. Reducing it speeds up individual steps but may produce less accurate local solutions.
3.3.9. Examples
Fit a 2-state model (default) to data.csv and display results:
scm-fit data.csv
Fit a 3-state model with 100 basin-hopping iterations and a fixed random seed, saving the result plot:
scm-fit data.csv -n 3 --niter 100 --seed 42 -o fit.pdf
Fit a growing culture (µ = 0.05 h⁻¹) with mass-balance enforcement:
scm-fit data.csv -g 0.05 --check-mass-balance
Run a quick deterministic fit without basin-hopping (useful for scripting):
scm-fit data.csv --no-basinhopping
Use numerical matrix exponentiation when the symbolic approach fails:
scm-fit data.csv --no-precompile
Run a thorough global search with early stopping and higher temperature:
scm-fit data.csv --niter 200 --niter-success 30 -T 2.0 --stepsize 1.0
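When the same fit has to be repeated over many files, the command lines above can be assembled from a Python script. The helper below is a hypothetical convenience wrapper, not part of SCM. It uses only flags listed in the synopsis (-n, --niter, --seed, -o) and assumes the executable is reachable as scm-fit; for the pre-built binaries, pass the downloaded file name as executable instead.

```python
def build_scm_fit_command(csv_file, num_states=None, niter=None, seed=None,
                          output=None, executable="scm-fit"):
    """Assemble an scm-fit argument list; options left as None are omitted."""
    command = [executable]
    if num_states is not None:
        command += ["-n", str(num_states)]
    if niter is not None:
        command += ["--niter", str(niter)]
    if seed is not None:
        command += ["--seed", str(seed)]
    if output is not None:
        command += ["-o", str(output)]
    command.append(str(csv_file))  # the positional argument goes last, as in the synopsis
    return command


if __name__ == "__main__":
    # Only prints the command; to execute it, pass the list to
    # subprocess.run(command, check=True) from the standard library.
    print(build_scm_fit_command("data.csv", num_states=3, niter=100,
                                seed=42, output="fit.pdf"))
```

Building the command as a list rather than a single string avoids shell-quoting problems with file names that contain spaces.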
3.4. Config File Generator
The interactive form below lets you design a symbolic compartmental model and
download the SBtab config file used by scm-fit-config. Fill in the model
structure, free parameters, contributed-turnover matrix, and observed pool
weights, then click Generate to download config.csv.
Once you have a config file and a data CSV, run the fit with:
scm-fit-config data.csv config.csv -o results.csv
Download the scm-fit-config pre-built binary (Linux x86_64 and Windows
x86_64, no Python required) from the
Releases page.
3.5. Fit Results Viewer
The interactive viewer below lets you inspect the output of scm-fit-config.
Upload the CSV results file to see the fitted labeling curve overlaid on your
measured data, the fitted model network, and a table of goodness-of-fit statistics.