Scaling
Milabench can select a batch size based on the capacity of the underlying GPU.
The feature is driven by the config/scaling.yaml file,
which records the memory usage of a given benchmark for each batch size.
convnext_large-fp32:
    arg: --batch-size
    default: 128
    model:
        8: 5824.75 MiB
        16: 8774.75 MiB
        32: 14548.75 MiB
        64: 26274.75 MiB
        128: 49586.75 MiB
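Conceptually, the selection amounts to picking the largest measured batch size whose memory usage fits the GPU capacity. The sketch below illustrates that idea only; the function name and fallback behavior are assumptions, not milabench's actual implementation.

```python
# Measured memory usage per batch size, in MiB, taken from the
# scaling.yaml entry above.
MODEL = {
    8: 5824.75,
    16: 8774.75,
    32: 14548.75,
    64: 26274.75,
    128: 49586.75,
}

def pick_batch_size(capacity_mib: float, default: int = 128) -> int:
    """Largest batch size whose measured memory fits the capacity.

    Falls back to the configured default when no measurement fits
    (hypothetical behavior, for illustration only).
    """
    fitting = [bs for bs, mem in MODEL.items() if mem <= capacity_mib]
    return max(fitting) if fitting else default
```

For example, with an 81920 MiB GPU every measured batch size fits, so 128 is selected; with roughly 16000 MiB the largest fitting entry is 32.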
Auto Batch size
To enable batch resizing, set the environment variable below. Milabench will use the GPU capacity declared in the system.yaml configuration file.
system:
    arch: cuda
    gpu:
        capacity: 81920 MiB
    nodes: []
MILABENCH_SIZER_AUTO=1 milabench run --system system.yaml
For better performance, a multiple constraint can be added. This forces the selected batch size to be a multiple of the given value, here 8.
MILABENCH_SIZER_MULTIPLE=8 milabench run
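The multiple constraint can be pictured as rounding the selected batch size down to the nearest multiple. This is a conceptual sketch under that assumption, not milabench's actual code:

```python
def apply_multiple(batch_size: int, multiple: int = 8) -> int:
    """Round batch_size down to the nearest multiple.

    Keeps at least one multiple so the result never drops to zero
    (an assumption made for this illustration).
    """
    return max(multiple, (batch_size // multiple) * multiple)
```

For instance, a candidate batch size of 100 with a multiple of 8 becomes 96, while 64 is already aligned and stays unchanged.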
Batch size override
The batch size can also be overridden globally:
MILABENCH_SIZER_BATCH_SIZE=64 milabench run
Memory Usage Extractor
To automate the gathering of batch size <=> memory usage data,
a validation layer that retrieves the batch size and the memory usage
of each run can be enabled.
In the example below, once milabench has finished running, it generates a new scaling configuration with the data extracted from the run.
export MILABENCH_SIZER_SAVE="newscaling.yaml"
MILABENCH_SIZER_BATCH_SIZE=64 milabench run
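Conceptually, the extractor merges the observed (batch size, memory usage) samples of a run into a scaling-style mapping like the one at the top of this page. The sketch below is purely illustrative; the function name and data shapes are assumptions, not milabench internals.

```python
def merge_observations(scaling: dict, bench: str, samples: list) -> dict:
    """Merge (batch_size, memory_mib) samples into a scaling-style dict.

    Hypothetical helper for illustration: creates the benchmark entry
    if needed, then records each sample under its batch size.
    """
    entry = scaling.setdefault(bench, {"arg": "--batch-size", "model": {}})
    for batch_size, mem_mib in samples:
        entry["model"][batch_size] = f"{mem_mib} MiB"
    return scaling

# Example: a single run at batch size 64 observed using 26274.75 MiB.
scaling = {}
merge_observations(scaling, "convnext_large-fp32", [(64, 26274.75)])
```

Serializing such a mapping to YAML would yield entries in the same shape as config/scaling.yaml shown earlier.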