Run Config
RunConfig
- class pytorch_accelerated.run_config.TrainerRunConfig(num_epochs: int, train_per_device_batch_size: int, train_dl_kwargs: dict, eval_per_device_batch_size: int, eval_dl_kwargs: dict, gradient_accumulation_steps: int, gradient_clip_value: Number | None, train_total_batch_size: int, eval_total_batch_size: int, num_update_steps_per_epoch: int, num_local_update_steps_per_epoch: int, max_num_train_steps: int | None, is_local_process_zero: bool, is_world_process_zero: bool, is_distributed: bool, mixed_precision: str, num_processes: int)
An immutable dataclass holding values representing the current state of the Trainer.
- Parameters:
num_epochs – the number of epochs in the current training run
train_per_device_batch_size – the batch size per device used during training epochs
train_dl_kwargs – the arguments that have been used to create the training dataloader
eval_per_device_batch_size – the batch size per device used during evaluation epochs
eval_dl_kwargs – the arguments that have been used to create the evaluation dataloader
gradient_accumulation_steps – the number of gradient accumulation steps which will be used during training
gradient_clip_value – the value used to determine the threshold to clip the gradients of the model’s parameters
train_total_batch_size – the total batch size used during training
eval_total_batch_size – the total batch size used during evaluation
num_update_steps_per_epoch – the number of steps per training epoch where the model’s parameters will be updated
num_local_update_steps_per_epoch – the number of steps per training epoch where the model’s parameters will be updated on each process
max_num_train_steps – the maximum total number of update steps to train for; if present, this takes precedence over num_epochs
is_local_process_zero – True if the current process is the main process on the current node, False otherwise
is_world_process_zero – True if the current process is the main process across all nodes, False otherwise
is_distributed – True if the trainer is set up to perform distributed training, False otherwise
mixed_precision – a string containing the type of mixed precision the trainer is set up to use, "no" otherwise
num_processes – the number of processes being used during training
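
To illustrate what "immutable dataclass" means in practice, here is a minimal sketch using only the Python standard library. `RunConfigSketch` is a hypothetical stand-in that mirrors a few of the documented field names; it is not the library's class. The relationship used for the total batch size (per-device size × processes × accumulation steps) is an assumption made for illustration and is not stated in this reference.

```python
from dataclasses import FrozenInstanceError, dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class RunConfigSketch:
    """Hypothetical stand-in mirroring a few of the documented fields."""

    num_epochs: int
    train_per_device_batch_size: int
    gradient_accumulation_steps: int
    num_processes: int
    max_num_train_steps: Optional[int] = None
    is_world_process_zero: bool = True

    @property
    def assumed_train_total_batch_size(self) -> int:
        # Assumption for illustration only: the total batch size is the
        # per-device size times the process count times accumulation steps.
        return (
            self.train_per_device_batch_size
            * self.num_processes
            * self.gradient_accumulation_steps
        )


config = RunConfigSketch(
    num_epochs=3,
    train_per_device_batch_size=8,
    gradient_accumulation_steps=2,
    num_processes=4,
)

# A frozen dataclass rejects attribute assignment after construction.
try:
    config.num_epochs = 10
    mutated = True
except FrozenInstanceError:
    mutated = False

# To obtain different values, build a new instance instead of mutating.
updated = replace(config, num_epochs=10)

# Typical read-only use: log only from the main process across all nodes.
if config.is_world_process_zero:
    print(f"epochs={config.num_epochs}, assumed total batch size="
          f"{config.assumed_train_total_batch_size}")
```

Because the values cannot be reassigned, code that receives such a config (for example, a callback) can read it safely without risk of altering the state of the training run.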