Best Practices for Reproducibility

Reproducibility is one of the main reasons we use Docker for simulations. A container packages your code together with its libraries and system dependencies, so the same code runs in the same environment tomorrow or on someone else’s system and should produce the same result.

Here are a few simple but powerful habits to make your simulation setups fully reproducible.

1. Use Specific Image Versions

Avoid pulling images with the :latest tag — they can change over time. Always use a versioned tag so your environment stays consistent.

For example:

bash
docker pull iitrabhi/fenics_notebook:2024.1

This locks your setup to a known working version.
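The same rule applies to the FROM line of any Dockerfile you write. As a rough sketch, a small check like the one below can flag base images that have no tag or use :latest. The function name unpinned_images is made up for this example, and the parsing is deliberately simple: it handles at most one option such as --platform before the image name, and it will also flag FROM scratch or a reference to an earlier build stage. Images pinned by digest (image@sha256:...) are treated as pinned, since a digest is stricter than a tag.

```shell
# Hypothetical helper: print every base image in a Dockerfile
# that has no tag or uses :latest.
unpinned_images() {
  awk 'toupper($1) == "FROM" {
         img = $2
         if (img ~ /^--/) img = $3        # skip an option such as --platform=...
         if (img ~ /@sha256:/) next       # pinned by digest
         n = split(img, parts, "/")
         last = parts[n]                  # a tag can only follow the last path segment
         if (last !~ /:/ || last ~ /:latest$/) print img
       }' "$1"
}

# Example: list unpinned base images in the current project
#   unpinned_images Dockerfile
```

An empty output means every FROM line names an explicit version.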

2. Keep a Dockerfile in Every Project

Even if you’re just using a base image, include a small Dockerfile that records your environment setup. This acts like a lab record for your software — anyone can rebuild your exact setup later.

Example:

dockerfile
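FROM iitrabhi/fenics_notebook:2024.1
# Pin package versions as well (for example numpy==<version>) once you know which ones work
RUN pip install --no-cache-dir numpy matplotlib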

Then build it using:

bash
docker build -t my_project .
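Once the image is built, it can help to record the exact package versions that ended up inside it and commit that list next to the Dockerfile. The sketch below assumes pip is on the PATH inside the image, as the Dockerfile above implies. The function name snapshot_packages, the default output file name, and the DOCKER override used for dry runs are illustrative choices, not part of Docker itself.

```shell
# Hypothetical helper: save the package versions installed in an image.
# Set DOCKER=echo to print the command instead of running it (dry run).
snapshot_packages() {
  image="$1"
  out="${2:-requirements.lock.txt}"
  # --entrypoint bypasses whatever the image would normally start
  "${DOCKER:-docker}" run --rm --entrypoint pip "$image" freeze > "$out"
}

# Example (requires Docker and a built image):
#   snapshot_packages my_project
```

The resulting file lists each package as name==version, which you can later feed back into pip install -r to pin the same versions.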

3. Document Mounted Volumes and Paths

Your simulation might use mounted folders for meshes, scripts, or results. Always note these in your project’s README.md — for example:

docker run -v D:\Codes:/root/ -w /root/ iitrabhi/fenics_notebook:2024.1

This ensures that anyone running your setup knows exactly where data is stored and shared.
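One way to keep the README and the actual command from drifting apart is to put the mount into a small script that lives in the project. This is only a sketch: the variable names HOST_DIR and CONTAINER_DIR, the function run_sim, and the DOCKER override for dry runs are all made up for illustration, and the host folder defaults to the current directory rather than D:\Codes.

```shell
# run.sh: single place where the mount is defined (names are illustrative)
IMAGE="iitrabhi/fenics_notebook:2024.1"
HOST_DIR="${HOST_DIR:-$PWD}"   # folder on the host holding meshes, scripts, results
CONTAINER_DIR="/root"          # where that folder appears inside the container

run_sim() {
  # Extra arguments are passed through as the command to run in the container.
  # Set DOCKER=echo to print the command instead of running it.
  "${DOCKER:-docker}" run -v "$HOST_DIR:$CONTAINER_DIR" -w "$CONTAINER_DIR" "$IMAGE" "$@"
}
```

Anyone reading the script can see at a glance which host folder is shared and where it lands in the container.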

4. Use Docker Compose for Multi-Container Workflows

If your project involves more than one container (say FEniCS, ParaView, and Jupyter), define them in a docker-compose.yml. This file captures your full environment and lets you start everything with one command:

bash
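docker compose up

It keeps your workflow portable and organized. On older installations that only have the standalone Compose v1 binary, the equivalent command is docker-compose up.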
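To make this concrete, here is a minimal sketch that writes a two-service compose file. The service names solver and postprocess, the folder layout, and the script postprocess.py are hypothetical. The second service uses python:3.10-slim only as a stand-in for whatever post-processing container your project needs.

```shell
# Write a minimal two-service compose file (service names and paths are illustrative).
cat > docker-compose.yml <<'EOF'
services:
  solver:
    image: iitrabhi/fenics_notebook:2024.1   # pinned tag, as in the Dockerfile above
    working_dir: /root
    volumes:
      - ./:/root                # project folder shared with the container
  postprocess:
    image: python:3.10-slim
    working_dir: /work
    volumes:
      - ./results:/work         # only the results folder is shared here
    command: python postprocess.py   # hypothetical script in ./results
EOF

# With Docker installed, check the file and then start both services:
#   docker compose config
#   docker compose up
```

Because image tags, mounts, and working directories all sit in one file, committing it records the whole multi-container setup.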

5. Keep Images Lean

Install only what you need for the current project. Smaller base images (such as ubuntu:20.04 or python:3.10-slim) build and download faster and are easier to share, which matters most when you work with multiple simulations or collaborators.

6. Version Control Everything

Keep your Dockerfiles, compose files, and simulation scripts under version control with Git, hosted on GitHub or GitLab. This gives you a time-stamped history of your environment, which is ideal for revisiting old runs or tracking improvements.

7. Test Your Environment on Another Machine

The final step to confirm reproducibility: run your setup on a different system, such as a teammate’s laptop, a lab workstation, or even a cloud instance. If everything runs without modification and produces the same results, your workflow is truly reproducible.
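A simple way to compare results between two machines is to record checksums of the output files on the first and verify them on the second. The sketch below assumes GNU sha256sum is available. The function names and the SHA256SUMS file name are illustrative. Bit-for-bit identical output is a strict test, and floating-point results can legitimately differ in the last digits across CPUs or library builds, so treat a mismatch as a prompt to compare the numbers rather than as proof of a broken setup.

```shell
# Hypothetical helpers: record and verify checksums of a results folder.
record_checksums() {
  # Run on the first machine; writes SHA256SUMS inside the given folder.
  (cd "$1" && find . -type f ! -name SHA256SUMS -exec sha256sum {} + | sort -k 2 > SHA256SUMS)
}

verify_checksums() {
  # Run on the second machine after repeating the simulation there.
  (cd "$1" && sha256sum -c SHA256SUMS)
}

# Example:
#   record_checksums results     # machine A, then commit or copy results/SHA256SUMS
#   verify_checksums results     # machine B
```

verify_checksums returns a non-zero exit status if any file differs, so it can also be used in an automated check.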

In Short

Reproducibility isn’t about memorizing commands — it’s about clarity and consistency.

If you can rebuild your simulation environment from scratch next month — and get the same result — you’re already following the right practices.

Next, we will cover how to set up your system to run codes and simulations efficiently on your machine.