ner.trainers.trainer module

class ner.trainers.trainer.Trainer(model: Module, optimizer: Optimizer, data_collator: DataCollator, train_data: Dataset, val_data: Dataset | None = None, grad_clip_max_norm: float | None = None, use_class_weights: bool = False, class_weights: List | ndarray | Tensor | None = None, tracker: Tracker | None = None, device: device = device(type='cpu'), label_colname='NER')

Bases: object

Creates a Trainer to train a neural network (here, FFNN or RNN).

Parameters:
model : Module

The neural model to train: NERPredictor.

optimizer : torch.optim.Optimizer

The optimizer to be used in training.

data_collator : DataCollator

The data collator to collate data into batches for batched processing.

train_data : Dataset

The training data (arrow format) to train the model.

val_data : Optional[Dataset], default: None

The validation data (arrow format) to evaluate the model.

grad_clip_max_norm : Optional[float], default: None

The maximum norm to use in gradient clipping.

use_class_weights : bool, default: False

Whether to use class weights in the loss computation.

class_weights : Optional[Union[List, np.ndarray, torch.Tensor]], default: None

The class weights to use in the loss computation. If use_class_weights is True but class_weights is not specified, the class weights are computed automatically from the class distribution in the training data.

tracker : Optional[Tracker], default: None

Tracker to use for logging.

device : torch.device, default: torch.device("cpu")

The device (e.g., cuda) to be used for training and validation.

label_colname : str, default: "NER"

The name of the column in the arrow dataset that contains the NER labels.
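
As an illustration of what grad_clip_max_norm controls, the following is a minimal pure-Python sketch of clipping by global L2 norm, which is the rule PyTorch's torch.nn.utils.clip_grad_norm_ applies. Whether Trainer uses exactly this rule is an assumption, and the helper name clip_by_global_norm is hypothetical.

```python
import math
from typing import List, Optional


def clip_by_global_norm(
    grads: List[List[float]], max_norm: Optional[float]
) -> List[List[float]]:
    """Rescale gradients so their global L2 norm does not exceed max_norm.

    Hypothetical helper for illustration only; not part of the ner package.
    """
    if max_norm is None:
        return grads  # clipping disabled, mirroring the documented default
    # Global norm: L2 norm over every gradient value of every parameter.
    total_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    if total_norm <= max_norm:
        return grads  # already within the limit, leave untouched
    scale = max_norm / total_norm
    return [[g * scale for g in grad] for grad in grads]


grads = [[3.0, 4.0], [0.0]]  # two "parameters"; global norm is 5.0
clipped = clip_by_global_norm(grads, max_norm=1.0)
```

All gradients are scaled by the same factor, so the update direction is preserved and only its magnitude is limited.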

static _compute_class_weights(train_data, label_colname='NER')

Takes in the training data and computes the class weights for the loss function using class distribution in the training data.

Parameters:
train_data : Dataset

The training data (arrow format) used to train the model.

label_colname : str

The name of the column in the arrow dataset that contains the NER labels.

Returns:
class_weights : np.ndarray

The computed class weights for the loss function (as a numpy array).
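
The exact weighting formula is not stated here. One common choice is inverse class frequency ("balanced" weights), sketched below in pure Python. The formula, the helper name compute_class_weights, and the dictionary return type are all assumptions for illustration; the documented method returns a numpy array.

```python
from collections import Counter
from typing import Dict, List


def compute_class_weights(labels: List[List[str]]) -> Dict[str, float]:
    """Inverse-frequency weights: total / (num_classes * count).

    Assumed formula for illustration; the library may compute weights differently.
    """
    # Count every tag across all sentences in the label column.
    counts = Counter(tag for sentence in labels for tag in sentence)
    total = sum(counts.values())
    num_classes = len(counts)
    return {
        tag: total / (num_classes * count)
        for tag, count in sorted(counts.items())
    }


# Toy stand-in for the "NER" column of the training data.
ner_column = [["O", "O", "B-PER"], ["O", "B-LOC", "O"], ["O", "O"]]
weights = compute_class_weights(ner_column)
```

Rare tags receive larger weights than the dominant "O" tag, which counteracts the class imbalance typical of NER data.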

_eval_epoch(dataloader) → Dict[str, float]

Evaluates the model for one epoch on the given dataloader.

Parameters:
dataloader : DataLoader

The (validation) Dataloader used in evaluating the trained model.

Returns:
Dict[str, float]

A dictionary of metrics, including loss, precision, recall, accuracy, F1, and the weighted average of entity-level F1 scores.
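
The metric definitions are not spelled out here. As a rough sketch, a support-weighted average of per-class F1 scores could be computed as below. The function names are hypothetical, and treating each list item as one labelled entity is a simplifying assumption.

```python
from collections import Counter
from typing import Dict, List


def f1_per_class(y_true: List[str], y_pred: List[str]) -> Dict[str, float]:
    """Per-class F1 from paired gold and predicted labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for gold, pred in zip(y_true, y_pred):
        if gold == pred:
            tp[gold] += 1
        else:
            fp[pred] += 1  # predicted class gains a false positive
            fn[gold] += 1  # gold class gains a false negative
    scores = {}
    for label in set(y_true) | set(y_pred):
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        scores[label] = 2 * precision * recall / pr_sum if pr_sum else 0.0
    return scores


def weighted_f1(y_true: List[str], y_pred: List[str]) -> float:
    """Average of per-class F1 scores, weighted by gold-label support."""
    support = Counter(y_true)
    scores = f1_per_class(y_true, y_pred)
    return sum(scores[label] * n for label, n in support.items()) / len(y_true)


score = weighted_f1(["PER", "PER", "LOC", "ORG"], ["PER", "LOC", "LOC", "ORG"])
```

Weighting by support means frequent entity types dominate the average, unlike a macro average where every type counts equally.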

_train_epoch(dataloader) → Dict[str, float]

Trains the model for one epoch on the given dataloader.

Parameters:
dataloader : DataLoader

The (train) Dataloader used in training the model.

Returns:
Dict[str, float]

A dictionary of metrics, including loss, precision, recall, accuracy, F1, and the weighted average of entity-level F1 scores.

from_checkpoint(checkpoint_path: str) → None

Loads the epoch, model state, and optimizer state from the given path.

Parameters:
checkpoint_path : str

Path to load the checkpoint from.

save_checkpoint(checkpoint_path: str) → None

Saves the training epoch, model state, and optimizer state to the given path.

Parameters:
checkpoint_path : str

Path to save the checkpoint to.

static test(test_data: Dataset, data_collator: DataCollator, model: Module, batch_size: int = 128, num_workers: int = 0, index_colname: str = 'index', device: device = device(type='cpu')) → Dict[str, List[Tuple[int]]]

Tests the trained model on the given test (unseen) data.

Parameters:
test_data : Dataset

The test data (arrow format) to test the model.

data_collator : DataCollator

The data collator to collate data into batches for batched processing.

model : Module

The neural model to test: NERPredictor.

batch_size : int, default: 128

The batch size to be used in testing.

num_workers : int, default: 0

The number of workers to use for data loading; enable multiprocess data loading by simply setting the argument num_workers to a positive value.

index_colname : str, default: "index"

The name of the column in the arrow dataset that contains the token indices.

device : torch.device, default: torch.device("cpu")

The device (e.g., cuda) to be used for testing.

Returns:
Dict[str, List[Tuple[int]]]

A dictionary of named-entity spans with keys being “LOC”, “PER”, “MISC”, and “ORG”.
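
To make the shape of the returned dictionary concrete, the sketch below groups predicted tags into spans per entity type. It assumes a BIO tagging scheme and spans encoded as inclusive (start, end) token indices; neither is confirmed here, and bio_to_spans is a hypothetical helper.

```python
from typing import Dict, List, Optional, Tuple


def bio_to_spans(tags: List[str]) -> Dict[str, List[Tuple[int, int]]]:
    """Group BIO tags into inclusive (start, end) spans per entity type.

    Assumes BIO tags such as "B-PER" / "I-PER" / "O" (not confirmed by the docs).
    """
    spans: Dict[str, List[Tuple[int, int]]] = {
        "LOC": [], "PER": [], "MISC": [], "ORG": []
    }
    start: Optional[int] = None
    current: Optional[str] = None
    # A trailing "O" sentinel closes any entity still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and tag[2:] == current
        if current is not None and not continues:
            spans.setdefault(current, []).append((start, i - 1))
            start, current = None, None
        if tag.startswith("B-"):
            start, current = i, tag[2:]
    return spans


spans = bio_to_spans(["B-PER", "I-PER", "O", "B-LOC", "B-ORG", "I-ORG"])
```

In this sketch, an I- tag that does not continue an open entity of the same type is ignored rather than starting a new span.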

train_and_eval(batch_size: int = 128, num_epochs: int = 8, checkpoint_every: int = 1, num_workers: int = 0) → None

Trains and evaluates the model for the given number of epochs.

Parameters:
batch_size : int, default: 128

The batch size to be used in training and evaluation.

num_epochs : int, default: 8

The total number of epochs to train for.

checkpoint_every : int, default: 1

How often, in epochs, to save a checkpoint.

num_workers : int, default: 0

The number of workers to use for data loading; enable multiprocess data loading by simply setting the argument num_workers to a positive value.
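
As a small illustration of how num_epochs and checkpoint_every might interact, the sketch below lists the epochs after which a checkpoint would be written. The 1-based epoch counting and the modulo rule are assumptions, and checkpoint_epochs is a hypothetical helper.

```python
from typing import List


def checkpoint_epochs(num_epochs: int = 8, checkpoint_every: int = 1) -> List[int]:
    """Epochs (1-based, assumed) after which a checkpoint would be saved."""
    return [
        epoch
        for epoch in range(1, num_epochs + 1)
        if epoch % checkpoint_every == 0  # assumed cadence rule
    ]


every_epoch = checkpoint_epochs()          # documented defaults
every_third = checkpoint_epochs(8, 3)
```

Under these assumptions, a checkpoint_every value that does not divide num_epochs leaves the final epochs without a checkpoint.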