LLM fine-tuning
Example: fine-tuning a language model (GPT, GPT-2, CTRL, OPT, etc.) on a text dataset.
Large chunks of the code here are taken from this example script in the transformers GitHub repository.
If you haven't already, you should definitely check out this walkthrough of that script from the HuggingFace docs.
NetworkConfig
Configuration options related to the choice of network.
When instantiated by Hydra, this calls the target function passed to the decorator. In this case, this pulls the pretrained network weights from the HuggingFace model hub.
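A minimal sketch of what such a config could look like, assuming a Hydra structured config whose `_target_` points at the weight-loading function (the field names and defaults below are illustrative, not taken from the actual source):

```python
from dataclasses import dataclass


@dataclass
class NetworkConfig:
    # Fully-qualified path of the function Hydra calls on instantiation.
    _target_: str = "transformers.AutoModelForCausalLM.from_pretrained"
    # Model id on the HuggingFace model hub (e.g. "gpt2", "facebook/opt-125m").
    pretrained_model_name_or_path: str = "gpt2"


# Hydra would instantiate this roughly as:
#   network = hydra.utils.instantiate(NetworkConfig())
# which resolves to:
#   AutoModelForCausalLM.from_pretrained("gpt2")
```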
TokenizerConfig
Configuration options for the tokenizer.
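By analogy, a hypothetical sketch of the tokenizer config (again assuming a Hydra structured config; the `use_fast` field is an assumption, not confirmed by the source):

```python
from dataclasses import dataclass


@dataclass
class TokenizerConfig:
    # Target function Hydra calls to load the matching tokenizer from the hub.
    _target_: str = "transformers.AutoTokenizer.from_pretrained"
    pretrained_model_name_or_path: str = "gpt2"
    # Prefer the fast (Rust-backed) tokenizer implementation when available.
    use_fast: bool = True
```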
DatasetConfig (dataclass)
Configuration options related to the dataset preparation.
dataset_path: str (instance-attribute)
Name of the dataset "family".
For example, to load "wikitext/wikitext-103-v1", this would be "wikitext".
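The split between the "family" and the specific configuration mirrors the two positional arguments of `datasets.load_dataset("wikitext", "wikitext-103-v1")`. A small illustrative helper (hypothetical, not part of the actual module) makes the convention concrete:

```python
def split_dataset_id(dataset_id: str):
    """Split a combined id like "wikitext/wikitext-103-v1" into the dataset
    "family" (the `dataset_path`) and the configuration name.

    Returns a (path, name) tuple; name is None when no "/" is present.
    """
    path, sep, name = dataset_id.partition("/")
    return path, (name if sep else None)
```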
LLMFinetuningExample
Bases: LightningModule
Example of a LightningModule used to fine-tune a HuggingFace model.
setup
setup(stage: str)
Hook from Lightning that is called at the start of training, validation and testing.
TODO: Later, we could perhaps do the preprocessing in a distributed manner, as described here: https://discuss.huggingface.co/t/how-to-save-datasets-as-distributed-with-save-to-disk/25674/2
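The preprocessing done in this hook typically follows the `run_clm.py` example this code is adapted from: tokenized examples are concatenated and re-split into fixed-size blocks for causal language modeling. A self-contained sketch of that grouping step (the function name and exact signature are assumptions based on the referenced script):

```python
from itertools import chain


def group_texts(examples: dict, block_size: int) -> dict:
    """Concatenate every column end to end, then cut into `block_size` chunks."""
    # Flatten each column (e.g. input_ids, attention_mask) into one long list.
    concatenated = {k: list(chain(*examples[k])) for k in examples}
    total_length = len(concatenated[next(iter(examples))])
    # Drop the trailing remainder so every block has exactly `block_size` tokens.
    total_length = (total_length // block_size) * block_size
    result = {
        k: [tokens[i : i + block_size] for i in range(0, total_length, block_size)]
        for k, tokens in concatenated.items()
    }
    # For causal LM the labels are the inputs themselves; the model shifts them.
    result["labels"] = result["input_ids"].copy()
    return result
```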
configure_optimizers
Prepare the optimizer and learning-rate schedule (linear warmup and decay).
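The linear warmup-and-decay schedule can be written as a learning-rate multiplier, matching the shape produced by schedulers such as `transformers.get_linear_schedule_with_warmup` (the standalone function below is an illustrative sketch, not the actual implementation):

```python
def linear_warmup_decay(step: int, num_warmup_steps: int, num_training_steps: int) -> float:
    """Multiplier applied to the base learning rate at a given step."""
    if step < num_warmup_steps:
        # Ramp linearly from 0 up to 1 over the warmup steps.
        return step / max(1, num_warmup_steps)
    # Then decay linearly from 1 down to 0 at the end of training.
    return max(
        0.0,
        (num_training_steps - step) / max(1, num_training_steps - num_warmup_steps),
    )
```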