Using milabench (DEVELOPERS)

To use milabench, you need:

A YAML configuration file to define the benchmarks to install, prepare or run.
The base directory for code, virtual environments, data and outputs, set either with the $MILABENCH_BASE environment variable or the --base option. The base directory will be automatically constructed by milabench and will be organized as follows:

$MILABENCH_BASE/
|- venv/                            # Virtual environments and dependencies
|  |- bench1/                       # venv for benchmark bench1
|  |- ...                           # etc
|- code/                            # Benchmark code
|  |- bench1/                       # Code for benchmark bench1
|  |- ...                           # etc
|- data/                            # Datasets
|  |- dataset1/                     # A dataset
|  |- ...                           # etc
|- runs/                            # Outputs of benchmark runs
   |- calimero.2022-03-30_15:00:00/ # Auto-generated run name
   |  |- bench1.0.stdout            # Output for the first run of bench1
   |  |- bench1.0.stderr            # Stderr for the first run of bench1
   |  |- bench1.0.data              # Structured data for the first run of bench1
   |  |- bench1.1.stdout            # Output for the second run of bench1
   |  |- ...                        # etc
   |- blah/                         # Can set name with --run

It is possible to change the structure in the YAML to e.g. force benchmarks to all use the same virtual environment.
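
As a rough illustration of the default layout above, the following Python sketch resolves the base directory and derives the per-benchmark locations. The helper names `resolve_base` and `default_dirs` are hypothetical and not part of milabench, and letting --base take precedence over $MILABENCH_BASE is an assumption of this sketch, not something stated here.

```python
import os
from pathlib import PurePosixPath
from typing import Dict, Mapping, Optional


def resolve_base(cli_base: Optional[str],
                 environ: Mapping[str, str] = os.environ) -> PurePosixPath:
    """Pick the base directory from --base or $MILABENCH_BASE (hypothetical helper)."""
    # Assumption: an explicit --base value wins over the environment variable.
    base = cli_base or environ.get("MILABENCH_BASE")
    if not base:
        raise ValueError("set $MILABENCH_BASE or pass --base")
    return PurePosixPath(base)


def default_dirs(base: PurePosixPath, bench: str) -> Dict[str, PurePosixPath]:
    """Default locations for one benchmark, following the tree shown above."""
    return {
        "venv": base / "venv" / bench,  # virtual environment for this benchmark
        "code": base / "code" / bench,  # benchmark code
        "data": base / "data",          # datasets, possibly shared
        "runs": base / "runs",          # outputs of benchmark runs
    }


base = resolve_base(None, {"MILABENCH_BASE": "/tmp/milabench"})
print(default_dirs(base, "bench1")["venv"])  # /tmp/milabench/venv/bench1
```

If the YAML changes the structure, for example to share one virtual environment, these default paths no longer apply.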

Important options

Use the --select option with a comma-separated list of benchmarks in order to only install/prepare/run these benchmarks (or use --exclude to run all benchmarks except a specific set).
You may use --use-current-env to force the use of the currently active virtual environment.
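
When milabench is driven from a script, the options above can be assembled into an argument list. The sketch below is only an illustration: `milabench_argv` is a hypothetical helper, not part of milabench, and it uses only the flags mentioned in this document. Treating --select and --exclude as mutually exclusive, and passing --exclude a comma-separated list like --select, are assumptions of this sketch.

```python
from typing import List, Optional, Sequence


def milabench_argv(
    command: str,
    config: str,
    select: Optional[Sequence[str]] = None,
    exclude: Optional[Sequence[str]] = None,
    base: Optional[str] = None,
    use_current_env: bool = False,
) -> List[str]:
    """Build an argument list for a milabench subcommand (hypothetical helper)."""
    if select and exclude:
        # Assumption of this sketch: use one filtering mode at a time.
        raise ValueError("pass either select or exclude, not both")
    argv = ["milabench", command, "--config", config]
    if base is not None:
        argv += ["--base", base]
    if select:
        # --select takes a comma-separated list of benchmark names.
        argv += ["--select", ",".join(select)]
    if exclude:
        # Assumption: --exclude accepts the same comma-separated format.
        argv += ["--exclude", ",".join(exclude)]
    if use_current_env:
        argv.append("--use-current-env")
    return argv


print(" ".join(milabench_argv("install", "config/standard.yaml", select=["mybench"])))
# milabench install --config config/standard.yaml --select mybench
```

The resulting list can be handed to `subprocess.run` as is, which avoids shell quoting issues.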

milabench install

milabench install --config config/standard.yaml --select mybench

Installs the benchmark specified in the definition field of the benchmark’s YAML entry; the definition is resolved relative to the YAML file itself.
Creates/reuses a virtual environment in $MILABENCH_BASE/venv/mybench (unless install_group is set to something different) and installs all pip dependencies in it.

milabench prepare

milabench prepare --config config/standard.yaml --select mybench

Prepares data for the benchmark into $MILABENCH_BASE/data/dataset_name. Multiple benchmarks can share the same data. Some benchmarks need no preparation; for these, the prepare step does nothing.
May also download model weights or preprocess data.

milabench run

milabench run --config config/standard.yaml --select mybench

Creates a certain number of tasks from the benchmark using the plan defined in the YAML. For instance, one plan might be to run it in parallel on each GPU on the machine.
The benchmark is run from that directory using a command like voir [VOIR_OPTIONS] main.py [SCRIPT_OPTIONS]. Both option groups are defined in the YAML. The VOIR_OPTIONS determine/tweak which instruments to use and what data to forward to milabench. The SCRIPT_OPTIONS are benchmark dependent.
Standard output/error and other data (training rates, etc.) are forwarded to the main dispatcher process and saved into $MILABENCH_BASE/runs/run_name/mybench.run_number.stdout, with matching .stderr and .data files. The name of the run directory is printed out for easy reference.
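
The naming scheme for run outputs can be sketched in a few lines of Python, which may help when collecting results afterwards. `run_output_paths` is a hypothetical helper, not a milabench function; it assumes run numbers start at 0, as in the directory tree shown earlier.

```python
from pathlib import PurePosixPath
from typing import Dict


def run_output_paths(base: str, run_name: str, bench: str,
                     run_number: int) -> Dict[str, PurePosixPath]:
    """Return the expected stdout/stderr/data paths for one benchmark run."""
    run_dir = PurePosixPath(base) / "runs" / run_name
    stem = f"{bench}.{run_number}"  # e.g. mybench.0 for the first run
    return {ext: run_dir / f"{stem}.{ext}" for ext in ("stdout", "stderr", "data")}


paths = run_output_paths("/tmp/milabench", "blah", "mybench", 0)
print(paths["data"])  # /tmp/milabench/runs/blah/mybench.0.data
```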

milabench pin

milabench pin --config config/standard.yaml --select mybench --variant cuda

The basic idea behind milabench pin is to pin software versions for stability and reproducibility. Using the command above, the base requirements in benchmarks/mybench/requirements.in will be saved in requirements.cuda.txt. If --variant is not specified, the value of install_variant in the config file will be used (in standard.yaml, this is install_variant: "{{arch}}", which resolves to either “rocm” or “cuda” depending on the machine’s architecture).

For a given variant, the installation is also constrained by constraints/variant.txt, if the file exists. The file specifies constraints appropriate for the architecture, the CUDA version, or other aspects that are specific to the environment.

You can add more constraints with --constraints path/to/constraints.txt.
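
The variant fallback and file naming described above can be summarized in a short Python sketch. Both helpers are hypothetical and only mirror the description in this section; in particular, writing the pinned file next to requirements.in and looking up the constraints directory relative to the working directory are assumptions.

```python
from pathlib import PurePosixPath
from typing import Dict, Optional


def resolve_variant(cli_variant: Optional[str], install_variant: str, arch: str) -> str:
    """Use --variant if given, else the config's install_variant with {{arch}} filled in."""
    if cli_variant:
        return cli_variant
    return install_variant.replace("{{arch}}", arch)


def pin_files(bench_dir: str, variant: str) -> Dict[str, PurePosixPath]:
    """File names involved in pinning one benchmark for one variant."""
    bench = PurePosixPath(bench_dir)
    return {
        "input": bench / "requirements.in",
        # Assumption: the pinned file is written next to requirements.in.
        "output": bench / f"requirements.{variant}.txt",
        # Only applied if this file exists; its location is an assumption.
        "constraints": PurePosixPath("constraints") / f"{variant}.txt",
    }


variant = resolve_variant(None, "{{arch}}", "rocm")
print(variant)  # rocm
print(pin_files("benchmarks/mybench", "cuda")["output"])
# benchmarks/mybench/requirements.cuda.txt
```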

milabench report

TODO.

milabench report --config config/standard.yaml --runs <path_to_runs>

milabench compare

TODO.