Run Config

RunConfig

class pytorch_accelerated.run_config.TrainerRunConfig(num_epochs: int, train_per_device_batch_size: int, train_dl_kwargs: dict, eval_per_device_batch_size: int, eval_dl_kwargs: dict, gradient_accumulation_steps: int, gradient_clip_value: Number | None, train_total_batch_size: int, eval_total_batch_size: int, num_update_steps_per_epoch: int, max_num_train_steps: int | None, is_local_process_zero: bool, is_world_process_zero: bool, is_distributed: bool, mixed_precision: str, num_processes: int)

An immutable dataclass holding values representing the current state of the Trainer

Parameters:

num_epochs – the number of epochs in the current training run
train_per_device_batch_size – the batch size per device used during training epochs
train_dl_kwargs – the arguments that have been used to create the training dataloader
eval_per_device_batch_size – the batch size per device used during evaluation epochs
eval_dl_kwargs – the arguments that have been used to create the evaluation dataloader
gradient_accumulation_steps – the number of gradient accumulation steps which will be used during training
gradient_clip_value – the value used to determine the threshold to clip the gradients of the model’s parameters
train_total_batch_size – the total batch size used during training
eval_total_batch_size – the total batch size used during evaluation
num_update_steps_per_epoch – the number of steps per training epoch where the model’s parameters will be updated
max_num_train_steps – the maximum number of steps to train for; if present, this takes precedence over num_epochs
is_local_process_zero – True if the current process is the main process on the current node, False otherwise
is_world_process_zero – True if the current process is the main process across all nodes, False otherwise
is_distributed – True if the trainer is set up to perform distributed training, False otherwise
mixed_precision – a string containing the type of mixed precision the trainer is set up to use, or "no" if mixed precision is not used
num_processes – the number of processes being used during training
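
The following is a minimal sketch of how an immutable config of this kind can be read in user code. It does not import the library: ExampleRunConfig is a hypothetical stand-in that mirrors a subset of the fields listed above, and describe_run is an illustrative helper, not part of pytorch_accelerated. The field values are made up for the example.

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import Optional


@dataclass(frozen=True)
class ExampleRunConfig:
    """Hypothetical stand-in mirroring a subset of the documented fields."""

    num_epochs: int
    train_per_device_batch_size: int
    gradient_accumulation_steps: int
    max_num_train_steps: Optional[int]
    is_world_process_zero: bool
    mixed_precision: str
    num_processes: int


def describe_run(config: ExampleRunConfig) -> str:
    """Build a one-line summary from the config's fields (illustrative helper)."""
    # max_num_train_steps, when present, takes precedence over num_epochs
    if config.max_num_train_steps is not None:
        length = f"{config.max_num_train_steps} steps"
    else:
        length = f"{config.num_epochs} epochs"
    return (
        f"{length}, batch size {config.train_per_device_batch_size} per device "
        f"on {config.num_processes} process(es), "
        f"mixed precision: {config.mixed_precision}"
    )


# Made-up values, for illustration only
config = ExampleRunConfig(
    num_epochs=3,
    train_per_device_batch_size=32,
    gradient_accumulation_steps=2,
    max_num_train_steps=None,
    is_world_process_zero=True,
    mixed_precision="no",
    num_processes=2,
)

# Log only from the main process so multi-process runs do not repeat the message
if config.is_world_process_zero:
    print(describe_run(config))

# A frozen dataclass rejects attribute assignment after construction
try:
    config.num_epochs = 10
    mutation_rejected = False
except FrozenInstanceError:
    mutation_rejected = True
```

Because the stand-in is declared with frozen=True, the assignment raises FrozenInstanceError and the config keeps its original values, which is the behaviour an immutable dataclass is expected to give.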