Scaling
=======

Milabench is able to select a batch size depending on the
underlying GPU capacity.

The feature is drivent by the ``config/scaling.yaml`` file, 
which holds information about the memory usage of a given bench
given the batch size.


.. code-block:: yaml

   convnext_large-fp32:
     arg: --batch-size
     default: 128
     model:
       8: 5824.75 MiB
       16: 8774.75 MiB
       32: 14548.75 MiB
       64: 26274.75 MiB
       128: 49586.75 MiB


Auto Batch size
---------------

To enable batch resizing an environment variable can be specified.
It will use the capacity inside the `system.yaml` configurattion file.

.. code-block:: yaml

   system:
     arch: cuda
     gpu:
       capacity: 81920 MiB
     nodes: []


.. code-block:: bash
    
   MILABENCH_SIZER_AUTO=1 milabench run --system system.yaml


For better performance, a multiple constraint can be added.
This will force batch size to be a multiple of 8.

.. code-block:: bash
   
   MILABENCH_SIZER_MULTIPLE=8 milabench run


Batch size override
-------------------

The batch size can be globally overriden

.. code-block:: bash

   MILABENCH_SIZER_BATCH_SIZE=64 milabench run


Memory Usage Extractor
----------------------

To automate batch size ``<=>`` memory usage data gathering
a validation layer that retrieve the batch size and the memory usage
can be enabled.

In the example below, once milabench has finished running it will
generate a new scaling configuration with the data extracted from the run.


.. code-block:: bash

   export MILABENCH_SIZER_SAVE="newscaling.yaml"
   MILABENCH_SIZER_BATCH_SIZE=64 milabench run