Use this quickstart to go from a file on your machine to Markdown output in three steps. It relies on the file connector, which processes files you upload directly to the Data Ingestion API.

Prerequisites

  • DATA_INGESTION_API_URL (or TRELENT_DATA_INGESTION_API_URL) pointing to your deployment
  • DATA_INGESTION_API_TOKEN (or TRELENT_DATA_INGESTION_API_TOKEN)
  • A PDF, video, or Office document to ingest (for example sample.pdf)
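Before running the steps, it can help to confirm that both settings are present. The sketch below is a hypothetical pre-flight check, not part of the SDK: `resolveSetting` and `resolveConfig` are illustrative names, and checking the unprefixed variable before the `TRELENT_`-prefixed one is an assumption, since this page does not state which name takes precedence.

```typescript
// Hypothetical pre-flight check; not part of @trelent/data-ingestion.
type Env = Record<string, string | undefined>;

// Return the first non-empty value found among the candidate variable names.
function resolveSetting(env: Env, names: string[]): string {
  for (const name of names) {
    const value = env[name];
    if (value !== undefined && value.trim() !== "") {
      return value.trim();
    }
  }
  throw new Error(`Missing setting: set one of ${names.join(", ")}`);
}

// Assumption: the unprefixed name is checked first, then the TRELENT_ variant.
function resolveConfig(env: Env): { apiUrl: string; apiToken: string } {
  return {
    apiUrl: resolveSetting(env, [
      "DATA_INGESTION_API_URL",
      "TRELENT_DATA_INGESTION_API_URL",
    ]),
    apiToken: resolveSetting(env, [
      "DATA_INGESTION_API_TOKEN",
      "TRELENT_DATA_INGESTION_API_TOKEN",
    ]),
  };
}
```

In a Node.js script you would typically call `resolveConfig(process.env)` once at startup so that a missing variable fails fast with a readable message.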
1. Upload a file

Upload a file to the API. The API stores the object under your account and returns a file ID that stays valid until the upload expires.
TypeScript:
import { DataIngestionClient } from "@trelent/data-ingestion";
import { readFileSync } from "fs";

const client = new DataIngestionClient();

const fileBuffer = readFileSync("sample.pdf");
const blob = new Blob([fileBuffer], { type: "application/pdf" });

const upload = await client.uploadFile(blob, "sample.pdf", { expiresInDays: 30 });
console.log("Uploaded file:", upload.id);
Files expire after 30 days by default. Adjust expires_in_days / expiresInDays if you need a shorter retention period.
2. Start a job with the file connector

Use the returned file ID to create a job. The connector definition references one or more uploaded files and tells the service to pull inputs from managed storage.
TypeScript:
import { DataIngestionClient } from "@trelent/data-ingestion";
import type { JobInput } from "@trelent/data-ingestion";

const client = new DataIngestionClient();

// `upload` is the value returned by client.uploadFile() in step 1.
const job: JobInput = {
  connector: {
    type: "file_upload",
    file_ids: [upload.id],
  },
  output: {
    type: "s3-signed-url",
    expires_minutes: 120,
  },
};

const response = await client.submitJob(job);
console.log("Job submitted:", response.job_id);
On success, the API responds with a job_id. Capture that ID for status polling.
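Because the connector accepts one or more file IDs, a small builder can keep the job shape in one place. This is a sketch only: `FileJobInput` is a local mirror of the fields shown in this quickstart (the SDK's own `JobInput` type is authoritative), `buildFileJob` is a hypothetical helper, and the validation rules (dropping blank or duplicate IDs, requiring a positive whole number of minutes) are assumptions rather than documented API behavior.

```typescript
// Local mirror of the job shape used in this quickstart (assumption: the
// SDK's JobInput may allow more fields and other connector or output types).
interface FileJobInput {
  connector: { type: "file_upload"; file_ids: string[] };
  output: { type: "s3-signed-url"; expires_minutes: number };
}

// Hypothetical helper: build a file-connector job for one or more uploads.
function buildFileJob(fileIds: string[], expiresMinutes = 120): FileJobInput {
  // Drop blank entries and duplicates, keeping first-seen order.
  const unique = Array.from(new Set(fileIds.filter((id) => id.trim() !== "")));
  if (unique.length === 0) {
    throw new Error("At least one file ID is required");
  }
  if (!Number.isInteger(expiresMinutes) || expiresMinutes <= 0) {
    throw new Error("expiresMinutes must be a positive whole number");
  }
  return {
    connector: { type: "file_upload", file_ids: unique },
    output: { type: "s3-signed-url", expires_minutes: expiresMinutes },
  };
}
```

The result can be passed to `client.submitJob(...)` in place of the literal object above, for example `buildFileJob([upload.id])`.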
3. Poll job status and retrieve outputs

Query the job status until it reports completed; the call below checks it once, so repeat it at an interval. When the job is complete, the response includes delivery pointers (signed URLs or bucket locations) for the generated Markdown and images.
TypeScript:
import { DataIngestionClient } from "@trelent/data-ingestion";

const client = new DataIngestionClient();

// `response` is the value returned by client.submitJob() in step 2.
const status = await client.getJobStatus(response.job_id, {
  includeMarkdown: false,
  includeFileMetadata: true,
});

console.log("Status:", status.status);
if (status.delivery) {
  console.log("Deliveries:", Object.keys(status.delivery));
}
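The snippet above makes a single status request. One way to repeat it until the job finishes is a small polling loop like the sketch below. `pollUntilDone` is a hypothetical helper, not an SDK function; the 2-second interval and 60-attempt cap are arbitrary defaults, and treating "failed" as a terminal status alongside "completed" is an assumption, since this page only names completed.

```typescript
// Minimal shape the loop needs; real status responses carry more fields.
interface JobStatusLike {
  status: string;
}

interface PollOptions {
  intervalMs?: number;
  maxAttempts?: number;
  terminalStatuses?: string[];
}

// Hypothetical helper: call fetchStatus repeatedly until a terminal status
// is seen, or give up after maxAttempts requests.
async function pollUntilDone<T extends JobStatusLike>(
  fetchStatus: () => Promise<T>,
  options: PollOptions = {},
): Promise<T> {
  const intervalMs = options.intervalMs ?? 2000;
  const maxAttempts = options.maxAttempts ?? 60;
  // Assumption: "failed" is a terminal status; only "completed" is documented here.
  const terminalStatuses = options.terminalStatuses ?? ["completed", "failed"];

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const current = await fetchStatus();
    if (terminalStatuses.indexOf(current.status) !== -1) {
      return current;
    }
    if (attempt < maxAttempts) {
      // Wait before the next request so the API is not hammered.
      await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Job did not reach a terminal status after ${maxAttempts} attempts`);
}
```

With the client from this step, the loop could be driven by passing a function that wraps `client.getJobStatus(response.job_id, { includeFileMetadata: true })`, then reading `delivery` from the returned status once it reports completed.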
If you need to reuse uploads across multiple jobs, call listFiles() / list_files() to list every file ID still within its retention window.
Next: learn more about the file connector and how to manage uploads at scale.