Quickstart: Process uploaded files

Use this quickstart to go from a file on your machine to Markdown output in three steps. It relies on the file connector, which processes files you upload directly to the Data Ingestion API.

Prerequisites

TRELENT_DATA_INGESTION_API_URL, pointing to your deployment
TRELENT_DATA_INGESTION_API_TOKEN, a token with the FILE:UPLOAD permission
A PDF, video, or Office document to ingest (for example, sample.pdf)
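
Before running the snippets, it can help to confirm that both settings are present. The sketch below assumes the two names are read as environment variables, which this page implies but does not state. `missing_settings` is a hypothetical helper for illustration, not part of the SDK.

```python
import os

# Names taken from the prerequisites list. Treating them as environment
# variables is an assumption, not something this page confirms.
REQUIRED_SETTINGS = (
    "TRELENT_DATA_INGESTION_API_URL",
    "TRELENT_DATA_INGESTION_API_TOKEN",
)


def missing_settings(env=None):
    """Return the required names that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SETTINGS if not env.get(name)]
```

Calling `missing_settings()` with no arguments inspects the current process environment; an empty list means both values are set.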

TypeScript

Installation

bun add @trelent/data-ingestion

Upload a file

Upload a file to the API. The API stores the file under your account and returns a file ID that stays valid until the upload expires.

import { DataIngestionClient } from "@trelent/data-ingestion";
import { readFileSync } from "fs";

const client = new DataIngestionClient();

const fileBuffer = readFileSync("sample.pdf");
const blob = new Blob([fileBuffer], { type: "application/pdf" });

const upload = await client.uploadFile(blob, "sample.pdf", { expiresInDays: 30 });
console.log("Uploaded file:", upload.id);

Files expire after 30 days by default. Adjust expiresInDays if you need a shorter retention period.

Start a job with the file connector

Use the returned file ID to create a job. The connector definition references one or more uploaded files and tells the service to pull inputs from managed storage.

import { DataIngestionClient } from "@trelent/data-ingestion";
import type { JobInput } from "@trelent/data-ingestion";

const client = new DataIngestionClient();

const job: JobInput = {
  connector: {
    type: "file_upload",
    file_ids: [upload.id],
  },
  output: {
    type: "s3-signed-url",
    expires_minutes: 120,
  },
};

const response = await client.submitJob(job);
console.log("Job submitted:", response.job_id);

On success, the API responds with a job_id. Capture that ID for status polling.

Poll job status and retrieve outputs

Poll the job status until it reports completed. The snippet below makes a single request; repeat the call until the job finishes. When the job is complete, the response includes delivery pointers (signed URLs or bucket locations) for the generated Markdown and images.

import { DataIngestionClient } from "@trelent/data-ingestion";

const client = new DataIngestionClient();

const status = await client.getJobStatus(response.job_id, {
  includeMarkdown: false,
  includeFileMetadata: true,
});

console.log("Status:", status.status);
if (status.delivery) {
  console.log("Deliveries:", Object.keys(status.delivery));
}

If you need to reuse uploads across multiple jobs, call listFiles() to list every file ID still within its retention window.

If a file fails to process, check status.errors for details. Errors are keyed by the input identifier (file ID).

Python

Installation

pip install trelent-data-ingestion

Upload a file

Upload a file to the API. The API stores the file under your account and returns a file ID that stays valid until the upload expires.

from pathlib import Path
from trelent_data_ingestion_sdk import DataIngestionClient

client = DataIngestionClient()

file_data = Path("sample.pdf").read_bytes()
upload = client.upload_file(file_data, "sample.pdf", content_type="application/pdf", expires_in_days=30)
print("Uploaded file:", upload.id)

Files expire after 30 days by default. Adjust expires_in_days if you need a shorter retention period.
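
The upload snippet hard-codes `application/pdf`. When the input may also be a video or Office document, the content type can be derived from the file name instead. This is a minimal sketch using Python's standard `mimetypes` module; `guess_content_type` is a hypothetical helper, and whether the API accepts the generic fallback type is an assumption.

```python
import mimetypes


def guess_content_type(filename, default="application/octet-stream"):
    """Guess a MIME type from the file extension, falling back to `default`."""
    content_type, _encoding = mimetypes.guess_type(filename)
    # guess_type returns None for the type when the extension is unknown.
    return content_type or default
```

The result can be passed as `content_type` to `upload_file`, in place of the literal string used above.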

Start a job with the file connector

Use the returned file ID to create a job. The connector definition references one or more uploaded files and tells the service to pull inputs from managed storage.

from trelent_data_ingestion_sdk import DataIngestionClient, JobInput, FileUploadConnector, S3SignedUrlOutput

client = DataIngestionClient()

job = JobInput(
    connector=FileUploadConnector(file_ids=[upload.id]),
    output=S3SignedUrlOutput(expires_minutes=120),
)

response = client.submit_job(job)
print("Job submitted:", response.job_id)

On success, the API responds with a job_id. Capture that ID for status polling.
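
To make the job structure easier to see, the sketch below builds a plain dictionary with the same field names as the TypeScript `JobInput` shown on this page, including several file IDs in one job. `build_job_payload` is a hypothetical helper; whether the Python models serialize to exactly this shape is an assumption, and the validation rules are illustrative rather than documented.

```python
def build_job_payload(file_ids, expires_minutes=120):
    """Build a dict shaped like the TypeScript JobInput on this page."""
    file_ids = list(file_ids)
    # Illustrative checks only; the service may enforce different rules.
    if not file_ids:
        raise ValueError("a job needs at least one uploaded file ID")
    if expires_minutes <= 0:
        raise ValueError("expires_minutes must be positive")
    return {
        "connector": {"type": "file_upload", "file_ids": file_ids},
        "output": {"type": "s3-signed-url", "expires_minutes": expires_minutes},
    }
```

In practice the typed `JobInput`, `FileUploadConnector`, and `S3SignedUrlOutput` classes are the safer choice; the dictionary only shows how one connector can reference multiple uploads.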

Poll job status and retrieve outputs

Poll the job status until it reports completed. The snippet below makes a single request; repeat the call until the job finishes. When the job is complete, the response includes delivery pointers (signed URLs or bucket locations) for the generated Markdown and images.

from trelent_data_ingestion_sdk import DataIngestionClient

client = DataIngestionClient()

status = client.get_job_status(
    response.job_id,
    include_markdown=False,
    include_file_metadata=True,
)

print("Status:", status.status)
if status.delivery:
    print("Deliveries:", list(status.delivery.keys()))
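
The snippet above checks the status once. A polling loop along the following lines repeats the check until the job finishes or a deadline passes. `wait_for_status` is a hypothetical helper, not part of the SDK. Only the `completed` status is named on this page, so the default interval, the timeout, and any other terminal statuses are assumptions to adjust for your deployment.

```python
import time


def wait_for_status(fetch_status, done_statuses=("completed",),
                    interval_seconds=5.0, timeout_seconds=600.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call `fetch_status()` until it returns a status in `done_statuses`.

    Raises TimeoutError if the deadline passes first.
    """
    deadline = clock() + timeout_seconds
    while True:
        status = fetch_status()
        if status in done_statuses:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout_seconds}s")
        # Wait before the next request to avoid hammering the API.
        sleep(interval_seconds)
```

With the client from the snippet above, `fetch_status` could be `lambda: client.get_job_status(response.job_id, include_markdown=False, include_file_metadata=True).status`. If your deployment reports a failure status, add it to `done_statuses` so the loop stops early instead of waiting for the timeout.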

If you need to reuse uploads across multiple jobs, call list_files() to list every file ID still within its retention window.

If a file fails to process, check status.errors for details. Errors are keyed by the input identifier (file ID).
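
Because errors are keyed by file ID, a job's inputs can be separated into those that succeeded and those that failed. The sketch below assumes `status.errors` behaves like a dictionary mapping each failed file ID to its error detail, and that it may be empty or `None` when nothing failed. `split_by_outcome` is a hypothetical helper.

```python
def split_by_outcome(file_ids, errors):
    """Partition file IDs into (succeeded, failed) using an errors mapping."""
    errors = errors or {}
    # IDs absent from the mapping are treated as having succeeded.
    succeeded = [fid for fid in file_ids if fid not in errors]
    failed = {fid: errors[fid] for fid in file_ids if fid in errors}
    return succeeded, failed
```

For example, `split_by_outcome([upload.id], status.errors)` returns the IDs that processed cleanly alongside a mapping of failed IDs to their error details, which is convenient when deciding which files to resubmit.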
