Train an ML Capability
Prerequisites
This tutorial assumes that:
- You already have labelled data in a Highlighter Assessment Process
- You are using an existing Model Template (i.e. training a model supported by Highlighter Training, not a custom model)
The steps are as follows:
- Create a capability: Define the inputs/outputs of the capability
- Create datasets: Create an immutable snapshot of the data used to train/test the model
- Configure the training run: Select the type of model you wish to train and configure it
- Schedule training
- Inspect training metrics
- Deploy the capability
Create a Capability
- From the Develop tab select Capabilities/Library and click New Capability
- Select a Capability Type:
  - DetectorElement: Locates and classifies objects within an image using a bounding box or polygon. e.g. locates all cats and dogs in images and classifies them as cat or dog
  - BoxClassifierElement: Performs classification on regions of an image. Typically downstream of a DetectorElement. e.g. given a collection of cat regions produced by the above cat/dog DetectorElement, classify if the cat is evil or not
  - BoxEmbedderElement: Similar to BoxClassifierElement but returns embeddings for the regions rather than classifications
- Give it a name and description
- Refer to the following for information on each tab:
  - DetectorElement:
    - Interface:
      - inputs: image
      - outputs: entities
    - Parameters:
      - ToDo
    - Model Parameters:
      - Set every Head to "0" for now
      - Add each output of your capability to a different Position. Positions must start at 0 and increment by 1
  - BoxClassifierElement:
    - Interface:
      - inputs: image, entities
      - outputs: entities
    - Parameters:
      - ToDo
    - Model Parameters:
      - Set every Head to "0" for now
      - Add each output of your capability to a different Position. Positions must start at 0 and increment by 1
  - BoxEmbedderElement:
    - Interface:
      - inputs: image, entities
      - outputs: entities
    - Parameters:
      - ToDo
    - Model Parameters:
      - Set every Head to "0" for now
      - Add each output of your capability to a different Position. Positions must start at 0 and increment by 1
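The Model Parameters rules above (every Head set to 0, one output per Position, Positions starting at 0 and incrementing by 1) can be checked before you fill in the form. The sketch below is plain Python and does not call any Highlighter API; the function name `validate_positions` and the `(head, position, output_name)` row layout are assumptions made purely for illustration.

```python
from typing import List, Tuple

# Hypothetical row layout: (head, position, output_name). Not a Highlighter type.
ModelOutput = Tuple[int, int, str]


def validate_positions(outputs: List[ModelOutput]) -> None:
    """Raise ValueError if the rows break the rules described above."""
    if not outputs:
        raise ValueError("at least one output is required")
    # Every Head is set to 0 for now.
    bad_heads = [name for head, _, name in outputs if head != 0]
    if bad_heads:
        raise ValueError(f"head must be 0 for: {bad_heads}")
    # Positions must be exactly 0, 1, 2, ... with no gaps or repeats.
    positions = sorted(position for _, position, _ in outputs)
    expected = list(range(len(outputs)))
    if positions != expected:
        raise ValueError(f"positions must be {expected}, got {positions}")


# Two outputs on Head 0 at Positions 0 and 1: passes silently.
validate_positions([(0, 0, "cat"), (0, 1, "dog")])
```

A list such as `[(0, 1, "cat"), (0, 2, "dog")]` would raise, because its Positions start at 1 rather than 0.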
Create datasets
- From the Develop tab select Datasets and create an empty dataset with a useful name
- From the Develop tab select Search
- Check Only latest submissions
- Use the other fields to search for the desired data
- Click + Add to Dataset and select your dataset
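This tutorial does not spell out what Only latest submissions does. One plausible reading, offered here as an assumption rather than documented behaviour, is that when a file has been submitted more than once only its most recent submission is kept. The sketch below illustrates that idea in plain Python; the `Submission` record and its field names are hypothetical and are not part of Highlighter.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Submission:
    # Hypothetical record; the field names are illustrative only.
    file_id: str
    submitted_at: datetime
    label: str


def latest_per_file(submissions: Iterable[Submission]) -> List[Submission]:
    """Keep only the most recent submission for each file_id."""
    latest: Dict[str, Submission] = {}
    for sub in submissions:
        current = latest.get(sub.file_id)
        if current is None or sub.submitted_at > current.submitted_at:
            latest[sub.file_id] = sub
    return list(latest.values())


subs = [
    Submission("img-1", datetime(2024, 1, 1), "cat"),
    Submission("img-1", datetime(2024, 2, 1), "dog"),
    Submission("img-2", datetime(2024, 1, 15), "cat"),
]
kept = latest_per_file(subs)  # img-1 keeps its February submission only
```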
Configure the training run
- From the Develop tab select Training then click Train new model
- Select the Capability (Model) whose interface this training run should follow. This will determine the output classes of the trained model and filtering that will be performed on the dataset prior to training.
- Name the training run and select the datasets for each split. The train set cannot have any overlap with the dev or test set
  - Train: Used to train the model
  - Dev: Holdout set used to compute the metrics during training
  - Test: [Optional] Additional holdout set typically used to compute metrics for academic reporting
- Select the Model Template
- Apply overrides
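The rule that the train set must not share data with the dev or test set can be checked up front if you have a list of item identifiers for each dataset. The following is a minimal sketch under that assumption; `find_split_overlap` and the string ids are hypothetical names for illustration, and nothing here calls Highlighter itself.

```python
from typing import Dict, Iterable, Set


def find_split_overlap(
    train_ids: Iterable[str],
    holdout_splits: Dict[str, Iterable[str]],
) -> Dict[str, Set[str]]:
    """Return, per holdout split, the ids that also appear in the train split."""
    train = set(train_ids)
    overlaps: Dict[str, Set[str]] = {}
    for name, ids in holdout_splits.items():
        shared = train & set(ids)
        if shared:
            overlaps[name] = shared
    return overlaps


overlap = find_split_overlap(
    ["img-1", "img-2", "img-3"],
    {"dev": ["img-4"], "test": ["img-3", "img-5"]},
)
print(overlap)  # {'test': {'img-3'}}
```

An empty result means the splits are disjoint. Any ids returned should be removed from one of the datasets before scheduling the run.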
Schedule training
- Click Train
- If training:
  a. Succeeds, you will receive an email with a link to the training metrics and the training run artefact.
  b. Fails, you will receive an email about the failure.