config.video controls how the pipeline samples frames and detects meaningful changes. Higher sensitivity captures more images; lower sensitivity captures fewer.
The defaults are screenshot_interval_seconds: 1, sensitivity: 0.1, openai_model: gpt-4.1, and whisper_model: whisper-1. Send only the fields you want to override.
import type { JobInput } from "@trelent/data-ingestion";

const job: JobInput = {
  connector: { type: "url", urls: ["https://signed.example.com/webinar.mp4"] },
  output: { type: "s3-signed-url" },
  config: {
    video: {
      screenshot_interval_seconds: 0.5,
      sensitivity: 0.15,
      openai_model: "gpt-4o-mini",
    },
  },
};
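
To preview which values apply when only some fields are sent, the defaults listed above can be merged with the overrides locally. This is a minimal sketch: VideoConfig, VIDEO_DEFAULTS, and resolveVideoConfig are hypothetical names defined here for illustration and are not imported from the SDK, and the service may resolve defaults differently.

```typescript
// Local stand-in for the documented video options (hypothetical type, not from the SDK).
interface VideoConfig {
  screenshot_interval_seconds: number;
  sensitivity: number;
  openai_model: string;
  whisper_model: string;
}

// Default values as listed in this section.
const VIDEO_DEFAULTS: VideoConfig = {
  screenshot_interval_seconds: 1,
  sensitivity: 0.1,
  openai_model: "gpt-4.1",
  whisper_model: "whisper-1",
};

// Hypothetical helper: overrides win, every other field keeps its default.
function resolveVideoConfig(overrides: Partial<VideoConfig> = {}): VideoConfig {
  return { ...VIDEO_DEFAULTS, ...overrides };
}

// Only sensitivity is overridden; the other three fields stay at their defaults.
const effective = resolveVideoConfig({ sensitivity: 0.15 });
console.log(effective);
```

Keeping the defaults in one object makes it easy to log the effective configuration next to each job.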

Field reference

config.video.screenshot_interval_seconds
number
default:"1"
Number of seconds between sampled frames. Lower values capture more context for fast-moving scenes; higher values save cost for static footage.
config.video.sensitivity
number
default:"0.1"
Normalized threshold for detecting visual changes. Raise it to capture subtle transitions; lower it to ignore small changes.
config.video.openai_model
string
default:"gpt-4.1"
LLM used for visual reasoning and summarization. Pick a lighter model (for example, gpt-4o-mini) when you need lower latency.
config.video.whisper_model
string
default:"whisper-1"
Model used for audio transcription. Swap this if you require multilingual or domain-specific tuning. The model must support the verbose_json output format.
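
The cost trade-off for screenshot_interval_seconds can be estimated before submitting a job. The sketch below assumes one frame is sampled at time 0 and one at every interval after it; the pipeline's exact boundary handling is not stated here, so treat the result as approximate. estimateSampledFrames is a hypothetical helper, not part of the SDK.

```typescript
// Hypothetical helper: approximate number of frames sampled for a video.
// Assumes a frame at t = 0 and one every `intervalSeconds` afterwards.
function estimateSampledFrames(durationSeconds: number, intervalSeconds: number): number {
  if (!Number.isFinite(durationSeconds) || durationSeconds < 0) {
    throw new RangeError("durationSeconds must be a non-negative finite number");
  }
  if (!Number.isFinite(intervalSeconds) || intervalSeconds <= 0) {
    throw new RangeError("intervalSeconds must be a positive finite number");
  }
  return Math.floor(durationSeconds / intervalSeconds) + 1;
}

// A one-hour video at the default interval versus a 0.5 second interval.
console.log(estimateSampledFrames(3600, 1));   // 3601
console.log(estimateSampledFrames(3600, 0.5)); // 7201
```

Halving the interval roughly doubles the number of sampled frames. Assuming change detection then filters those samples, the count is a ceiling on captured images rather than the exact number.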
Fine-tune how the system judges frame similarity. The sensitivity field generates default values for these fields; for more granular control, override them manually.
config.video.tile
integer | null
default:"null"
Tile size for block-based comparisons. Leave null to use the default value tuned for general content.
config.video.mad_thresh
number | null
default:"null"
Mean absolute deviation threshold. Lower values pick up subtler changes, including noise, at the cost of more detections.
config.video.local_ssim_drop
number | null
default:"null"
Triggers when the local SSIM (structural similarity index) drops below the specified value. Helpful for scene-change detection.
config.video.max_bad_frac
number | null
default:"null"
Fraction of tiles that can cross the SSIM threshold before the frame counts as “changed.” Lower fractions make detection stricter.
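
To build intuition for how a tile size, a per-tile threshold, and a tile fraction interact, here is a generic tile-based change check. It is an illustration only and not the pipeline's actual algorithm: it treats the MAD threshold as a mean absolute difference between corresponding pixels, it omits the SSIM check entirely, and GrayFrame, changedTileFraction, and frameChanged are hypothetical names. The numeric values are arbitrary.

```typescript
// Illustrative only: a generic tile-based change check, NOT the pipeline's implementation.
// Frames are grayscale, row-major, with pixel values in 0..255.
interface GrayFrame {
  width: number;
  height: number;
  pixels: Uint8Array; // length === width * height
}

// Fraction of tiles whose normalized mean absolute pixel difference exceeds madThresh.
function changedTileFraction(a: GrayFrame, b: GrayFrame, tile: number, madThresh: number): number {
  if (a.width !== b.width || a.height !== b.height) {
    throw new Error("frames must have the same dimensions");
  }
  if (!Number.isInteger(tile) || tile <= 0) {
    throw new RangeError("tile must be a positive integer");
  }
  let tiles = 0;
  let changed = 0;
  for (let ty = 0; ty < a.height; ty += tile) {
    for (let tx = 0; tx < a.width; tx += tile) {
      const yEnd = Math.min(ty + tile, a.height);
      const xEnd = Math.min(tx + tile, a.width);
      let sum = 0;
      for (let y = ty; y < yEnd; y++) {
        for (let x = tx; x < xEnd; x++) {
          const i = y * a.width + x;
          sum += Math.abs(a.pixels[i] - b.pixels[i]);
        }
      }
      const count = (yEnd - ty) * (xEnd - tx);
      const mad = sum / count / 255; // normalize to the 0..1 range
      tiles++;
      if (mad > madThresh) changed++;
    }
  }
  return tiles === 0 ? 0 : changed / tiles;
}

// In this sketch, a frame counts as changed once more than maxBadFrac of its tiles differ.
function frameChanged(
  a: GrayFrame,
  b: GrayFrame,
  opts: { tile: number; madThresh: number; maxBadFrac: number },
): boolean {
  return changedTileFraction(a, b, opts.tile, opts.madThresh) > opts.maxBadFrac;
}

// Two 4x4 frames that differ only in the top-left 2x2 tile.
const before: GrayFrame = { width: 4, height: 4, pixels: new Uint8Array(16) };
const after: GrayFrame = { width: 4, height: 4, pixels: new Uint8Array(16) };
for (const i of [0, 1, 4, 5]) after.pixels[i] = 255;

console.log(changedTileFraction(before, after, 2, 0.05)); // 0.25 (one of four tiles)
console.log(frameChanged(before, after, { tile: 2, madThresh: 0.05, maxBadFrac: 0.2 })); // true
console.log(frameChanged(before, after, { tile: 2, madThresh: 0.05, maxBadFrac: 0.5 })); // false
```

Smaller tiles localize changes more precisely but produce more tiles to evaluate. The per-tile threshold and the tile fraction then decide together whether a localized difference is enough to flag the whole frame.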