Scaffolding a new agent

How to use Scaffolds to bootstrap your Highlighter agents

Setup

If you haven't already, check out Getting Started With Highlighter SDK.

Create a simple Agent

  1. Run hl generate agent. This will:
  • create an agents/ directory with an agent definition and a data source Capability for the data type specified in the prompts
  • create src/<title_slug>/<capability_name>.py with a dummy implementation
  • add a ## Run Agent section to your README.md
  2. Run the command in the ## Run Agent section of the README.md (a quick check of the generated layout is sketched below)
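
After scaffolding, it is worth confirming the generated artifacts are where this guide expects them. Below is a minimal stdlib sketch, assuming only the layout described above; my_awesome_project is a hypothetical <title_slug>, reused from the agent definition later in this guide.

from pathlib import Path

# Agent definitions generated by `hl generate agent`
agent_defs = list(Path("agents").glob("*.json"))
print("agent definitions:", [p.name for p in agent_defs])

# The dummy Capability implementation (my_awesome_project is hypothetical)
for py_file in Path("src/my_awesome_project").rglob("*.py"):
    print("capability module:", py_file)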

Create a new Capability and add it to the Agent

  • Install the cv2 optional dependency for image processing: pip install "highlighter-sdk[cv2]"
  • Download the pretrained weights from Hugging Face: https://huggingface.co/SpotLab/YOLOv8Detection/blob/3005c6751fb19cdeb6b10c066185908faf66a097/yolov8n.onnx (a scripted alternative is sketched after the capability code below)
  • Create a weights directory and move the file to weights/yolov8n.onnx
  • Create a new file at src/<title_slug>/capabilities/detector_capability.py
  • Follow the instructions in the code snippet below
# Update your imports
from typing import Dict, List, Optional, Tuple
from uuid import UUID

from highlighter.agent.capabilities import StreamEvent
from highlighter.agent.capabilities.image_to_enum import OnnxYoloV8
from highlighter.core.data_models import DataSample

# Add "MyPersonDetector" to the __all__, it should now have "MyPersonDetector"
# and the name of your original Capability that was generated by the
# `hl generate agent` script
# Add this class to the end of the file
# This class extends the base OnnxYoloV8 Capability by adding some default
# parameters and a log statement, you can do whatever you want later.
class MyPersonDetector(OnnxYoloV8):
    class DefaultStreamParameters(OnnxYoloV8.DefaultStreamParameters):
        num_classes: int = 1
        # Map the model's class index 0 to a (UUID, label) pair
        class_lookup: Optional[Dict[int, Tuple[UUID, str]]] = {0: (UUID(int=0), "person")}
        # conf_thresh: float = 0.1
        # nms_iou_thresh: float = 0.5
        is_absolute: bool = False

    def process_frame(self, stream, data_samples: List[DataSample]) -> Tuple[StreamEvent, Dict]:
        stream_event, result = super().process_frame(stream, data_samples)
        self.logger.info(f"processed: {data_samples[0].media_frame_index} with {len(result['annotations'][0])} annotations")
        return stream_event, result
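
If you prefer to script the weights download from the earlier step, here is a minimal sketch using the huggingface_hub package (an assumption: it is not a Highlighter SDK dependency, so install it separately). The repo, filename, and revision are taken from the URL above.

from pathlib import Path
from shutil import copy

from huggingface_hub import hf_hub_download  # assumes: pip install huggingface_hub

# Fetch the exact revision referenced above, then place a copy at weights/yolov8n.onnx
cached = hf_hub_download(
    repo_id="SpotLab/YOLOv8Detection",
    filename="yolov8n.onnx",
    revision="3005c6751fb19cdeb6b10c066185908faf66a097",
)
Path("weights").mkdir(exist_ok=True)
copy(cached, "weights/yolov8n.onnx")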
  • Modify your agent definition
# Update the "graph" value
"graph": ["(Source MyPersonDetector ImageOverlay ImageWrite)"],
# Add this to your "elements" list
    {
      "name": "MyPersonDetector",
      "parameters": {
          "onnx_file": "weights/yolov8n.onnx"
      },
      "input": [
        {
          "name": "data_files",
          "type": "List[DataFile]"
        }
      ],
      "output": [
        {
          "name": "annotations",
          "type": "List[Annotation]"
        }
      ],
      "deploy": {
        "local": {
          "module": "my_awesome_project.capabilities"
        }
      }
    },
    {
      "name": "ImageOverlay",
      "parameters": {
      },
      "input": [
        {
          "name": "data_files",
          "type": "List[DataFile]"
        },
        {
          "name": "annotations",
          "type": "List[Annotation]"
        }
      ],
      "output": [
        {
          "name": "data_files",
          "type": "List[DataFile]"
        }
      ],
      "deploy": {
        "local": {
          "module": "highlighter.agent.capabilities.image_transforms"
        }
      }
    },
    {
      "name": "ImageWrite",
      "parameters": {
          "output_dir": "image_overlays",
          "output_pattern": "{media_frame_index}.jpg"
      },
      "input": [
        {
          "name": "data_files",
          "type": "List[DataFile]"
        }
      ],
      "output": [
      ],
      "deploy": {
        "local": {
          "module": "highlighter.agent.capabilities"
        }
      }
    }
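
Before running, you can sanity-check that every Capability named in the graph has a matching entry in elements. A minimal stdlib sketch, assuming "graph" and "elements" are top-level keys of the definition (as the snippets above suggest) and using the agents/YOUR_AGENT_DEF.json placeholder from the run commands below:

import json

with open("agents/YOUR_AGENT_DEF.json") as f:
    agent_def = json.load(f)

# "(Source MyPersonDetector ImageOverlay ImageWrite)" -> ["Source", ...]
graph_names = agent_def["graph"][0].strip("()").split()
element_names = {e["name"] for e in agent_def["elements"]}

missing = [name for name in graph_names if name not in element_names]
print("missing elements:", missing or "none")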

Run Detector Agent

Run all commands from the top level of the repository

  • Create a directory called image_overlays (the output_dir configured for ImageWrite above)
# process one file (Linux/macOS)
hl agent start agents/YOUR_AGENT_DEF.json VIDEO_PATH
# process one file (Windows)
hl agent start agents\YOUR_AGENT_DEF.json VIDEO_PATH
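
After a run completes, ImageWrite should have written one JPEG per processed frame into image_overlays/, following the {media_frame_index}.jpg pattern configured above. A quick stdlib check:

from pathlib import Path

# Frame indices are numeric, so sort numerically rather than lexicographically
overlays = sorted(Path("image_overlays").glob("*.jpg"), key=lambda p: int(p.stem))
print(f"wrote {len(overlays)} overlay frames")
if overlays:
    print("first:", overlays[0].name, "last:", overlays[-1].name)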

Training

Training a model is done in three steps (four if you want to tweak the config):

  1. Create a TrainingRun in Highlighter and configure the Datasets. (You can ignore the rest of the configuration for now.)

  2. Generate a new training run in your scaffold directory. This will download the Dataset annotations defined in the Highlighter TrainingRun and add a directory to the ml_training directory of the scaffold.

hl generate training-run TRAINING_RUN_ID MODEL_TYPE PROJECT_DIR

# MODEL_TYPE: {yolo-det} <-- More to come

Optionally, you can modify the PROJECT_DIR/ml_training/TRAINING_RUN_ID/trainer.py file to customize your training.

  3. Start the training: hl train start TRAINING_RUN_DIR
  4. Modify the config in ml_training/TRAINING_RUN_ID/ and train again
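
Before step 3, it can help to confirm what hl generate training-run produced. A minimal stdlib sketch, assuming only the layout described above (PROJECT_DIR and TRAINING_RUN_ID are the placeholders from the command):

from pathlib import Path

run_dir = Path("PROJECT_DIR/ml_training/TRAINING_RUN_ID")  # substitute your values

# List everything the generator produced, e.g. trainer.py and the config
for path in sorted(run_dir.rglob("*")):
    print(path.relative_to(run_dir))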