ner.trainers.trainer module

class ner.trainers.trainer.Trainer(model: Module, optimizer: Optimizer, data_collator: DataCollator, train_data: Dataset, val_data: Dataset | None = None, grad_clip_max_norm: float | None = None, use_class_weights: bool = False, class_weights: List | ndarray | Tensor | None = None, tracker: Tracker | None = None, device: device = device(type='cpu'), label_colname='NER')

Bases: object

Creates a Trainer to train a neural network (here, FFNN or RNN).

Parameters:

model (Module): The neural model to train: NERPredictor.
optimizer (torch.optim.Optimizer): The optimizer to be used in training.
data_collator (DataCollator): The data collator to collate data into batches for batched processing.
train_data (Dataset): The training data (arrow format) to train the model.
val_data (Optional[Dataset], default: None): The validation data (arrow format) to evaluate the model.
grad_clip_max_norm (Optional[float], default: None): The maximum norm to use in gradient clipping.
use_class_weights (bool, default: False): Whether to use class weights in the loss computation.
class_weights (Optional[Union[List, np.ndarray, torch.Tensor]], default: None): The class weights to use in the loss computation. If use_class_weights is True but class_weights is not specified, the class weights are computed automatically from the class distribution in the training data.
tracker (Optional[Tracker], default: None): The tracker to use for logging.
device (torch.device, default: torch.device("cpu")): The device (e.g., cuda) to be used for training and validation.
label_colname (str, default: "NER"): The name of the column in the arrow dataset that contains the NER labels.
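
To make the grad_clip_max_norm parameter concrete, here is a minimal pure-Python sketch of clipping by global L2 norm. It assumes the Trainer follows the same scaling rule as torch.nn.utils.clip_grad_norm_, which this page does not confirm; the function name clip_by_global_norm and the nested-list gradient layout are hypothetical and used only for illustration.

```python
import math
from typing import List


def clip_by_global_norm(grads: List[List[float]], max_norm: float) -> float:
    """Scale gradients in place so their global L2 norm is at most max_norm.

    Returns the global norm measured before clipping.
    """
    # Global norm over all parameters, not a per-parameter norm.
    total_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    # Assumed rule (as in torch.nn.utils.clip_grad_norm_): the coefficient is
    # capped at 1.0 so small gradients are never scaled up; the epsilon
    # guards against division by zero.
    clip_coef = min(1.0, max_norm / (total_norm + 1e-6))
    for grad in grads:
        for i in range(len(grad)):
            grad[i] *= clip_coef
    return total_norm
```

With grad_clip_max_norm left at None, no clipping of this kind would be expected to happen.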

static _compute_class_weights(train_data, label_colname='NER')

Takes in the training data and computes the class weights for the loss function using class distribution in the training data.

Parameters:

train_data (Dataset): The training data (arrow format) used to train the model.
label_colname (str): The name of the column in the arrow dataset that contains the NER labels.

Returns:

class_weights (np.ndarray): The computed class weights for the loss function (as a numpy array).
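
The exact weighting formula is not stated here. As an illustration only, the sketch below uses inverse-frequency ("balanced") weighting, n_samples / (n_classes * count), which is one common way to derive weights from a class distribution. The function name inverse_frequency_weights and the flat list of labels are assumptions, not this module's API; the real method returns a NumPy array rather than a dictionary.

```python
from collections import Counter
from typing import Dict, List


def inverse_frequency_weights(labels: List[str]) -> Dict[str, float]:
    """Map each label to n_samples / (n_classes * count_of_label)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    # Rare labels receive weights above 1.0, frequent labels below 1.0.
    return {
        label: n_samples / (n_classes * count)
        for label, count in sorted(counts.items())
    }


# Hypothetical flattened label column: 6 "O" tokens and 2 "PER" tokens.
weights = inverse_frequency_weights(["O"] * 6 + ["PER"] * 2)
```

Under this scheme the rare "PER" label is weighted three times as heavily as "O" in the loss.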

_eval_epoch(dataloader) → Dict[str, float]

Evaluates the model for one epoch on the given dataloader.

Parameters:

dataloader (DataLoader): The (validation) DataLoader used to evaluate the trained model.

Returns:

Dict[str, float]: A dictionary of metrics, including loss, precision, recall, accuracy, F1, and the weighted average of entity-level F1 scores.
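
As a rough illustration of the last metric, the sketch below computes a support-weighted average of per-entity F1 scores. It assumes each entity type's weight is its number of gold entities (true positives plus false negatives), which is the usual meaning of "weighted average" but is not confirmed by this page. The helper names and the (tp, fp, fn) input layout are hypothetical.

```python
from typing import Dict, Tuple


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 from true-positive, false-positive, and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def weighted_average_f1(counts: Dict[str, Tuple[int, int, int]]) -> float:
    """Average per-entity F1, weighting each entity type by its support."""
    # Support = number of gold entities of that type = tp + fn.
    total_support = sum(tp + fn for tp, _, fn in counts.values())
    if total_support == 0:
        return 0.0
    weighted_sum = sum(
        f1_score(tp, fp, fn) * (tp + fn) for tp, fp, fn in counts.values()
    )
    return weighted_sum / total_support
```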

_train_epoch(dataloader) → Dict[str, float]

Trains the model for one epoch on the given dataloader.

Parameters:

dataloader (DataLoader): The (train) DataLoader used to train the model.

Returns:

Dict[str, float]: A dictionary of metrics, including loss, precision, recall, accuracy, F1, and the weighted average of entity-level F1 scores.

from_checkpoint(checkpoint_path: str) → None

Loads the epoch, model state, and optimizer state from the given path.

Parameters:

checkpoint_path (str): Path to load the checkpoint from.

save_checkpoint(checkpoint_path: str) → None

Saves the training epoch, model state, and optimizer state to the given path.

Parameters:

checkpoint_path (str): Path to save the checkpoint to.

static test(test_data: Dataset, data_collator: DataCollator, model: Module, batch_size: int = 128, num_workers: int = 0, index_colname: str = 'index', device: device = device(type='cpu')) → Dict[str, List[Tuple[int]]]

Tests the trained model on the given test (unseen) data.

Parameters:

test_data (Dataset): The test data (arrow format) to test the model.
data_collator (DataCollator): The data collator to collate data into batches for batched processing.
model (Module): The neural model to test: NERPredictor.
batch_size (int, default: 128): The batch size to be used in testing.
num_workers (int, default: 0): The number of workers to use for data loading; set num_workers to a positive value to enable multiprocess data loading.
index_colname (str, default: "index"): The name of the column in the arrow dataset that contains the token indices.
device (torch.device, default: torch.device("cpu")): The device (e.g., cuda) to be used for testing.

Returns:

Dict[str, List[Tuple[int]]]: A dictionary of named-entity spans, keyed by "LOC", "PER", "MISC", and "ORG".
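
To show what a span dictionary of this shape might look like, the sketch below groups a sequence of predicted tags into spans per entity type. It assumes BIO-style tags such as "B-PER" and "I-PER" and represents each span as an inclusive (start, end) pair of token positions; neither the tagging scheme nor the tuple layout is stated on this page, and bio_to_spans is a hypothetical helper, not part of the module.

```python
from typing import Dict, List, Optional, Tuple


def bio_to_spans(tags: List[str]) -> Dict[str, List[Tuple[int, int]]]:
    """Group BIO tags into inclusive (start, end) spans per entity type."""
    spans: Dict[str, List[Tuple[int, int]]] = {
        "LOC": [], "PER": [], "MISC": [], "ORG": []
    }
    start: Optional[int] = None
    current: Optional[str] = None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        prefix, _, entity = tag.partition("-")
        continues = prefix == "I" and entity == current
        if current is not None and not continues:
            spans.setdefault(current, []).append((start, i - 1))
            start, current = None, None
        # A stray "I-" tag with no open span is treated leniently as a start.
        if prefix == "B" or (prefix == "I" and current is None):
            start, current = i, entity
    return spans


tags = ["B-PER", "I-PER", "O", "B-LOC", "O", "B-ORG", "I-ORG", "I-ORG"]
spans = bio_to_spans(tags)
```

Since test takes an index_colname argument, the real method may report spans in terms of the dataset's token indices rather than positions within a single sequence.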

train_and_eval(batch_size: int = 128, num_epochs: int = 8, checkpoint_every: int = 1, num_workers: int = 0) → None

Trains and evaluates the model for the given number of epochs.

Parameters:

batch_size (int, default: 128): The batch size to be used in training and evaluation.
num_epochs (int, default: 8): The total number of epochs to train for.
checkpoint_every (int, default: 1): How often, in epochs, to save a checkpoint.
num_workers (int, default: 0): The number of workers to use for data loading; set num_workers to a positive value to enable multiprocess data loading.
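
The interaction between num_epochs and checkpoint_every can be sketched as a simple loop. This is an assumption about the control flow, not the actual implementation: it supposes epochs are counted from 1, that each epoch trains and then evaluates, and that a checkpoint is saved whenever the epoch number is a multiple of checkpoint_every. The function run_epochs and its callback arguments are hypothetical stand-ins for _train_epoch, _eval_epoch, and save_checkpoint.

```python
from typing import Callable, Dict, List


def run_epochs(
    train_epoch: Callable[[], Dict[str, float]],
    eval_epoch: Callable[[], Dict[str, float]],
    save_checkpoint: Callable[[int], None],
    num_epochs: int = 8,
    checkpoint_every: int = 1,
) -> List[Dict[str, object]]:
    """Train and evaluate once per epoch, checkpointing on a fixed schedule."""
    history: List[Dict[str, object]] = []
    for epoch in range(1, num_epochs + 1):
        train_metrics = train_epoch()
        val_metrics = eval_epoch()
        history.append(
            {"epoch": epoch, "train": train_metrics, "val": val_metrics}
        )
        # Assumed schedule: save after every checkpoint_every-th epoch.
        if epoch % checkpoint_every == 0:
            save_checkpoint(epoch)
    return history


saved: List[int] = []
history = run_epochs(
    lambda: {"loss": 1.0},  # stand-in for one training epoch
    lambda: {"loss": 2.0},  # stand-in for one validation epoch
    saved.append,           # records which epochs were checkpointed
    num_epochs=5,
    checkpoint_every=2,
)
```

In this sketch, five epochs with checkpoint_every=2 save checkpoints after epochs 2 and 4 only, so the final epoch is not checkpointed unless num_epochs is a multiple of checkpoint_every.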