
Conftest

Fixtures and test utilities.

This module contains PyTest fixtures that are used by tests.

How this works#

Our goal here is to make sure that the way we create networks/datasets/algorithms during tests matches as closely as possible how they are created in a real run, for example when running `python project/main.py algorithm=image_classifier`.

We achieve this as follows: every component of an experiment is created by a fixture. The first fixtures to be invoked are the ones that supply the command-line arguments corresponding to the chosen configs.

Then the dict_config fixture is created: the Hydra config obtained by loading the configs with those command-line arguments. This is the same as the input to the main function: an omegaconf.DictConfig.

If there are interpolations in the configs, they are resolved and the result is the config fixture.

From there, the different components are created using the config fixture, like the datamodule, trainer, algorithm, etc.

```mermaid
---
title: Fixture dependency graph
---
flowchart TD
datamodule_config[
    <a href="#project.conftest.datamodule_config">datamodule_config</a>
] -- 'datamodule=A' --> command_line_arguments
algorithm_config[
    <a href="#project.conftest.algorithm_config">algorithm_config</a>
] -- 'algorithm=B' --> command_line_arguments
command_line_overrides[
    <a href="#project.conftest.command_line_overrides">command_line_overrides</a>
] -- 'seed=123' --> command_line_arguments
command_line_arguments[
    <a href="#project.conftest.command_line_arguments">command_line_arguments</a>
] -- load configs for 'datamodule=A algorithm=B seed=123' --> dict_config
dict_config[
    <a href="#project.conftest.dict_config">dict_config</a>
] -- instantiate objects from configs --> config
config[
    <a href="#project.conftest.config">config</a>
] --> datamodule & algorithm
datamodule[
    <a href="#project.conftest.datamodule">datamodule</a>
] --> algorithm
algorithm[
    <a href="#project.conftest.algorithm">algorithm</a>
] -- is used by --> some_test
algorithm & datamodule -- is used by --> some_other_test
```
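To make the graph above concrete, here is a hedged sketch of what a test using these fixtures could look like. The `"cifar10"` config name is a placeholder, not necessarily a config that exists in this project; the point is that requesting `datamodule` transitively triggers `config`, `dict_config` and the command-line fixtures, so the object is built just as it would be in a real run.

```python
import pytest
from lightning import LightningDataModule


# Hypothetical test: "cifar10" is a placeholder datamodule config name.
@pytest.mark.parametrize("datamodule_config", ["cifar10"], indirect=True)
def test_datamodule_is_created_like_in_a_real_run(datamodule: LightningDataModule | None):
    # Requesting `datamodule` pulls in `config`, `dict_config` and the command-line
    # fixtures above, so the datamodule is created exactly like in a real run.
    if datamodule is not None:
        assert isinstance(datamodule, LightningDataModule)
```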

original_datadir #

original_datadir(original_datadir: Path)

Overwrite the original_datadir fixture value to change where regression files are created.

By default, they are in a folder next to the source. Here instead we move them to a different folder to keep the source code folder as neat as we can.

TODO: The large regression files (the .npz files containing tensors) could be stored in a cache on $SCRATCH and referenced via a symlink in the test folder. There could be some issues, though, if scratch gets cleaned up.
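For reference, overriding a fixture provided by a plugin (here, `original_datadir` from pytest-regressions, via pytest-datadir) just means redeclaring it and requesting the upstream value. A minimal sketch, assuming a hypothetical `.regression_files` folder next to this conftest:

```python
from pathlib import Path

import pytest


@pytest.fixture
def original_datadir(original_datadir: Path) -> Path:
    # `.regression_files` is a placeholder location; the real fixture may use a
    # different folder. Requesting `original_datadir` here gives us the plugin's
    # default value, which we then redirect.
    return Path(__file__).parent / ".regression_files" / original_datadir.name
```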

algorithm_config #

algorithm_config(request: FixtureRequest) -> str | None

The algorithm config to use in the experiment, as if algorithm=<value> was passed.

This is parametrized with all the configurations for a given algorithm type when using the included tests, for example as is done in project.algorithms.image_classifier_test.

datamodule_config #

datamodule_config(request: FixtureRequest) -> str | None

The datamodule config to use in the experiment, as if datamodule=<value> was passed.

algorithm_network_config #

algorithm_network_config(
    request: FixtureRequest,
) -> str | None

The network config to use in the experiment, as in algorithm/network=<value>.

command_line_arguments #

command_line_arguments(
    algorithm_config: str | None,
    datamodule_config: str | None,
    algorithm_network_config: str | None,
    command_line_overrides: tuple[str, ...],
    request: FixtureRequest,
)

Fixture that returns the command-line arguments that will be passed to Hydra to run the experiment.

The algorithm_config, algorithm_network_config and datamodule_config values here are parametrized indirectly by most tests, using the project.utils.testutils.run_for_all_configs_of_type function, so that the respective components are created in the same way as they would be by Hydra in a regular run.

dict_config #

dict_config(
    command_line_arguments: tuple[str, ...],
    tmp_path_factory: TempPathFactory,
) -> DictConfig

The omegaconf.DictConfig that is created by Hydra from the command-line arguments.

This fixture returns exactly what would be the input to the main function.

Interpolations in the configs have not yet been resolved at this point.
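A fixture like this one typically relies on Hydra's compose API. A minimal sketch, assuming (hypothetically) that the configs live in a `project.configs` module and that the primary config is named `config`:

```python
from hydra import compose, initialize_config_module
from omegaconf import DictConfig


def load_dict_config(command_line_arguments: tuple[str, ...]) -> DictConfig:
    # "project.configs" and "config" are assumptions about where the Hydra
    # configs live and what the primary config file is called.
    with initialize_config_module(config_module="project.configs", version_base=None):
        return compose(config_name="config", overrides=list(command_line_arguments))
```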

config #

config(dict_config: DictConfig) -> Config

The experiment configuration, with all interpolations resolved.
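Resolving the interpolations is essentially an OmegaConf call; a rough sketch of the step this fixture performs (how the project converts the result into its `Config` class is an assumption here):

```python
from omegaconf import DictConfig, OmegaConf


def resolve_config(dict_config: DictConfig):
    # Resolve all interpolations in place, then convert the DictConfig into a
    # plain (structured) object.
    OmegaConf.resolve(dict_config)
    return OmegaConf.to_object(dict_config)
```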

datamodule #

datamodule(
    dict_config: DictConfig,
) -> LightningDataModule | None

Fixture that creates the datamodule for the given config.
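Conceptually this boils down to `hydra.utils.instantiate` on the datamodule part of the config. A hedged sketch; the `"datamodule"` key name and the handling of configs without a datamodule are assumptions:

```python
import hydra.utils
from lightning import LightningDataModule
from omegaconf import DictConfig


def make_datamodule(dict_config: DictConfig) -> LightningDataModule | None:
    # Some experiments do not use a datamodule at all, hence the `| None`.
    if "datamodule" not in dict_config or dict_config["datamodule"] is None:
        return None
    return hydra.utils.instantiate(dict_config["datamodule"])
```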

algorithm #

algorithm(
    config: Config,
    datamodule: LightningDataModule | None,
    trainer: Trainer | JaxTrainer,
    seed: int,
    device: device,
)

Fixture that creates the "algorithm" (usually a LightningModule).

seed #

seed(
    request: FixtureRequest, make_torch_deterministic: None
)

Fixture that seeds everything for reproducibility and yields the random seed used.
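A minimal sketch of what such a fixture usually does, assuming a placeholder default seed of 42 and that tests can parametrize `seed` indirectly:

```python
import pytest
from lightning import seed_everything

DEFAULT_SEED = 42  # placeholder; the real fixture may use a different default.


@pytest.fixture
def seed(request: pytest.FixtureRequest, make_torch_deterministic: None):
    # Use the indirectly-parametrized value when there is one, otherwise the default.
    random_seed = getattr(request, "param", DEFAULT_SEED)
    seed_everything(random_seed, workers=True)
    yield random_seed
```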

accelerator #

accelerator(request: FixtureRequest)

Returns the accelerator to use during unit tests.

By default, returns "cuda" if CUDA is available. When the tests are run with -vvv, the CPU is also used.
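A rough sketch of the selection logic described above (not the fixture's actual implementation); pytest exposes the `-v` count via `config.getoption("verbose")`:

```python
import pytest
import torch


def pick_accelerators(request: pytest.FixtureRequest) -> list[str]:
    # "cuda" when a GPU is visible; with -vvv (verbosity >= 3) also run on the CPU.
    accelerators = ["cuda"] if torch.cuda.is_available() else ["cpu"]
    if request.config.getoption("verbose") >= 3 and "cpu" not in accelerators:
        accelerators.append("cpu")
    return accelerators
```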

devices #

devices(
    accelerator: str, request: FixtureRequest
) -> Generator[
    list[int] | int | Literal["auto"], None, None
]

Fixture that creates the 'devices' argument for the Trainer config.

Splits up the GPUs between pytest-xdist workers when using distributed testing. This isn't currently used in the CI.

TODO: Design dilemma here: should we parametrize the devices command-line override and force experiments to run with that value during tests, or should we change things based on this value in the config?
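As a sketch of the idea: pytest-xdist names its workers "gw0", "gw1", ... in the PYTEST_XDIST_WORKER environment variable, so GPUs can be handed out round-robin. The exact assignment scheme used by this fixture is an assumption here:

```python
import os

import torch


def pick_devices_for_this_worker(accelerator: str) -> list[int] | int:
    if accelerator != "cuda" or not torch.cuda.is_available():
        return 1  # e.g. a single CPU process
    worker = os.environ.get("PYTEST_XDIST_WORKER")  # e.g. "gw3"; None without xdist
    if worker is None:
        return [0]
    # Round-robin assignment of GPUs to workers (illustrative only).
    return [int(worker.removeprefix("gw")) % torch.cuda.device_count()]
```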

command_line_overrides #

command_line_overrides(
    request: FixtureRequest,
) -> tuple[str, ...]

Fixture that makes it possible to specify command-line overrides to use in a given test.

Tests that require running an experiment should use the experiment_config fixture below.

Multiple tests using the same overrides will use the same experiment.
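Since this fixture is meant to be parametrized indirectly, a test can pass its own overrides as in this minimal sketch (the `seed=123` override mirrors the example in the diagram above):

```python
import pytest
from omegaconf import DictConfig


@pytest.mark.parametrize("command_line_overrides", [("seed=123",)], indirect=True)
def test_uses_the_given_seed(dict_config: DictConfig):
    # The overrides end up in the command-line arguments used to load `dict_config`.
    assert dict_config["seed"] == 123
```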

setup_with_overrides #

setup_with_overrides(
    overrides: (
        str
        | ParameterSet
        | list[str]
        | list[ParameterSet]
        | list[str | ParameterSet]
    ),
)

Configures tests to run with the Hydra configs loaded from these command-line arguments.

The command-line arguments are used to create the Hydra config (the input to the main function). From there the different components (trainer, algorithm, callbacks, optionally datamodule) are created by fixtures with the same names.

This should be applied to tests that use some of these components created from Hydra configs, for example:

```python
@setup_with_overrides("algorithm=example trainer.max_epochs=1")
def test_something(dict_config: omegaconf.DictConfig):
    """This test receives the `dict_config` loaded from Hydra with the given overrides."""
    assert dict_config["algorithm"]["_target_"] == "project.algorithms.image_classifier.ImageClassifier"
    assert dict_config["trainer"]["max_epochs"] == 1
```

make_torch_deterministic #

make_torch_deterministic()

Set torch to deterministic mode for unit tests that use the tensor_regression fixture.
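A minimal sketch of what this fixture amounts to: enable torch's deterministic-algorithms mode for the duration of the test and restore the previous mode afterwards (the real fixture may do more, e.g. set environment variables):

```python
import pytest
import torch


@pytest.fixture
def make_torch_deterministic():
    # Turn deterministic algorithms on for the duration of the test, then restore
    # whatever mode was active before.
    previous = torch.are_deterministic_algorithms_enabled()
    torch.use_deterministic_algorithms(True)
    yield
    torch.use_deterministic_algorithms(previous)
```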

pytest_runtest_makereport #

pytest_runtest_makereport(item: Function, call: CallInfo)

Used to set up the pytest.mark.incremental mark, following the incremental-testing example in the pytest documentation.

pytest_runtest_setup #

pytest_runtest_setup(item: Function)

Used to set up the pytest.mark.incremental mark, following the incremental-testing example in the pytest documentation.
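For reference, these two hooks together implement (a variant of) the incremental-testing recipe from the pytest documentation: once one test of a class marked `incremental` fails, the following tests in that class are xfailed. A condensed version of that recipe, not necessarily the exact implementation used here:

```python
import pytest

# Failure history: test class name -> {parametrization index -> name of the failed test}.
_test_failed_incremental: dict[str, dict[tuple[int, ...], str]] = {}


def pytest_runtest_makereport(item, call):
    if "incremental" in item.keywords and call.excinfo is not None:
        # Remember which (possibly parametrized) test failed for this class.
        param_index = tuple(item.callspec.indices.values()) if hasattr(item, "callspec") else ()
        _test_failed_incremental.setdefault(str(item.cls), {}).setdefault(
            param_index, item.originalname or item.name
        )


def pytest_runtest_setup(item):
    if "incremental" in item.keywords:
        param_index = tuple(item.callspec.indices.values()) if hasattr(item, "callspec") else ()
        failed_test = _test_failed_incremental.get(str(item.cls), {}).get(param_index)
        if failed_test is not None:
            # A previous test in this class already failed: xfail the rest.
            pytest.xfail(f"previous test failed ({failed_test})")
```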

pytest_generate_tests #

pytest_generate_tests(metafunc: Metafunc) -> None

Allows one to define custom parametrization schemes or extensions.

This is used to implement the parametrize_when_used mark, which allows one to parametrize an argument when it is used.

See https://docs.pytest.org/en/7.1.x/how-to/parametrize.html#how-to-parametrize-fixtures-and-test-functions
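A hypothetical sketch of how a mark like `parametrize_when_used` could be honoured from this hook; the mark's exact arguments and whether the parametrization is indirect are assumptions:

```python
import pytest


def pytest_generate_tests(metafunc: pytest.Metafunc) -> None:
    for marker in metafunc.definition.iter_markers(name="parametrize_when_used"):
        argname, values = marker.args  # assumed mark signature: (argname, values)
        # Only parametrize the argument if this test (directly or via one of its
        # fixtures) actually uses it.
        if argname in metafunc.fixturenames:
            metafunc.parametrize(argname, values, indirect=True)
```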