Run Config

RunConfig

class pytorch_accelerated.run_config.TrainerRunConfig(num_epochs: int, train_per_device_batch_size: int, train_dl_kwargs: dict, eval_per_device_batch_size: int, eval_dl_kwargs: dict, gradient_accumulation_steps: int, gradient_clip_value: Number | None, train_total_batch_size: int, eval_total_batch_size: int, num_update_steps_per_epoch: int, max_num_train_steps: int | None, is_local_process_zero: bool, is_world_process_zero: bool, is_distributed: bool, mixed_precision: str, num_processes: int)[source]

An immutable dataclass holding values representing the current state of the Trainer

Parameters:
  • num_epochs – the number of epochs in the current training run

  • train_per_device_batch_size – the batch size per device used during training epochs

  • train_dl_kwargs – the arguments that have been used to create the training dataloader

  • eval_per_device_batch_size – the batch size per device used during evaluation epochs

  • eval_dl_kwargs – the arguments that have been used to create the evaluation dataloader

  • gradient_accumulation_steps – the number of gradient accumulation steps which will be used during training

  • gradient_clip_value – the value used to determine the threshold to clip the gradients of the model’s parameters

  • train_total_batch_size – the total batch size used during training

  • eval_total_batch_size – the total batch size used during evaluation

  • num_update_steps_per_epoch – the number of steps per training epoch where the model’s parameters will be updated

  • max_num_train_steps – the maximum number of steps to train for; if present, this will take precedence over num_epochs

  • is_local_process_zero – True if the current process is the main process on the current node, False otherwise

  • is_world_process_zero – True if the current process is the main process across all nodes, False otherwise

  • is_distributed – True if the trainer is set up to perform distributed training, False otherwise

  • mixed_precision – a string containing the type of mixed precision the trainer is set up to use, or "no" if mixed precision is not being used

  • num_processes – the number of processes being used during training
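Since TrainerRunConfig is described as an immutable dataclass, attempting to assign to its fields after construction will raise an error. The sketch below illustrates this behaviour with a hypothetical, cut-down stand-in (RunConfigSketch is not the library class, and only a few fields are reproduced); the total-batch-size relationship shown is an illustrative assumption, not necessarily the library's exact computation.

```python
from dataclasses import dataclass, FrozenInstanceError


# Hypothetical stand-in for TrainerRunConfig: frozen=True makes the
# dataclass immutable, matching the "immutable dataclass" description above.
@dataclass(frozen=True)
class RunConfigSketch:
    num_epochs: int
    train_per_device_batch_size: int
    gradient_accumulation_steps: int
    num_processes: int

    @property
    def train_total_batch_size(self) -> int:
        # Illustrative assumption: total batch size is the per-device batch
        # size scaled by the number of processes and accumulation steps.
        return (
            self.train_per_device_batch_size
            * self.num_processes
            * self.gradient_accumulation_steps
        )


config = RunConfigSketch(
    num_epochs=3,
    train_per_device_batch_size=8,
    gradient_accumulation_steps=2,
    num_processes=4,
)
print(config.train_total_batch_size)  # 8 * 4 * 2 = 64

try:
    config.num_epochs = 5  # assignment to a frozen dataclass field fails
except FrozenInstanceError:
    print("run config is immutable")
```

Reading values from the config is always safe; only mutation is disallowed, which keeps the recorded training state trustworthy for logging and callbacks.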