Datasets

Endpoints for managing datasets. A dataset is a named collection of objects (files) owned by a tenant.


Create Dataset

Create a new dataset within the authenticated tenant.

POST /v1/datasets

Authentication: Authorization: Bearer {token} or X-API-Key: ipto_{prefix}_{secret}
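Both credential forms above map to a single request header. As a minimal sketch, assuming a request should carry only one of the two, a small helper can build the header dict. The function name `auth_headers` is hypothetical and not part of the API.

```python
def auth_headers(token=None, api_key=None):
    """Return the header dict for exactly one of the two documented credential types."""
    # Assumption: a request carries either a bearer token or an API key, never both.
    if (token is None) == (api_key is None):
        raise ValueError("provide exactly one of token or api_key")
    if token is not None:
        return {"Authorization": f"Bearer {token}"}
    # The documented key shape is ipto_{prefix}_{secret}; only the leading
    # "ipto_" is checked here, the rest is left to the server.
    if not api_key.startswith("ipto_"):
        raise ValueError("API key should start with 'ipto_'")
    return {"X-API-Key": api_key}
```

The returned dict can be passed directly as the `headers` argument of `requests.post` or `requests.get`.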

Scope: datasets:write

Request

cURL

curl -X POST https://api.ipto.ai/v1/datasets \
  -H "Authorization: Bearer {token}" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "AP Invoices 2025",
    "description": "Invoice corpus for vendor dispute search",
    "source_modality": "document",
    "monetization_mode": "premium",
    "pricing_model": "demand_curve",
    "visibility": "restricted"
  }'

Python

import requests

response = requests.post(
    "https://api.ipto.ai/v1/datasets",
    headers={"Authorization": "Bearer {token}"},
    json={
        "name": "AP Invoices 2025",
        "description": "Invoice corpus for vendor dispute search",
        "source_modality": "document",
        "monetization_mode": "premium",
        "pricing_model": "demand_curve",
        "visibility": "restricted",
    },
)
data = response.json()

JavaScript

const response = await fetch("https://api.ipto.ai/v1/datasets", {
  method: "POST",
  headers: {
    Authorization: "Bearer {token}",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    name: "AP Invoices 2025",
    description: "Invoice corpus for vendor dispute search",
    source_modality: "document",
    monetization_mode: "premium",
    pricing_model: "demand_curve",
    visibility: "restricted",
  }),
});
const data = await response.json();

Request Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Human-readable name for the dataset. |
| description | string | No | Optional description of the dataset's contents and purpose. |
| source_modality | string | No | Type of content. Default: "document". |
| monetization_mode | string | No | Monetization strategy. One of open, premium, restricted. Default: "open". |
| pricing_model | string | No | Pricing model applied to the dataset. Default: "demand_curve". |
| visibility | string | No | Who can discover the dataset. One of private, restricted, listed. Default: "private". |
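The defaults and allowed values in the table above can be checked on the client before a request is sent. The following is a sketch only: `build_create_payload` is a hypothetical helper, and the non-empty check on `name` is an assumption, since the reference only states that the field is required.

```python
# Allowed values as listed in the request body table.
MONETIZATION_MODES = {"open", "premium", "restricted"}
VISIBILITY_LEVELS = {"private", "restricted", "listed"}


def build_create_payload(
    name,
    description=None,
    source_modality="document",
    monetization_mode="open",
    pricing_model="demand_curve",
    visibility="private",
):
    """Build a JSON-ready body for POST /v1/datasets using the documented defaults."""
    # Assumption: an empty or whitespace-only name is treated as missing.
    if not isinstance(name, str) or not name.strip():
        raise ValueError("name is required")
    if monetization_mode not in MONETIZATION_MODES:
        raise ValueError(f"monetization_mode must be one of {sorted(MONETIZATION_MODES)}")
    if visibility not in VISIBILITY_LEVELS:
        raise ValueError(f"visibility must be one of {sorted(VISIBILITY_LEVELS)}")
    payload = {
        "name": name,
        "source_modality": source_modality,
        "monetization_mode": monetization_mode,
        "pricing_model": pricing_model,
        "visibility": visibility,
    }
    # description is optional, so it is only sent when the caller supplies one.
    if description is not None:
        payload["description"] = description
    return payload
```

The result can be passed as the `json` argument of `requests.post`.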

Response

{
  "data": {
    "dataset_id": "dset_a1b2c3d4e5f6",
    "tenant_id": "tnt_f6e5d4c3b2a1",
    "name": "AP Invoices 2025",
    "description": "Invoice corpus for vendor dispute search",
    "source_modality": "document",
    "monetization_mode": "premium",
    "pricing_model": "demand_curve",
    "visibility": "restricted",
    "status": "active",
    "object_count": 0,
    "active_object_count": 0,
    "total_bytes": 0,
    "created_at": "2026-04-05T10:00:00Z",
    "updated_at": "2026-04-05T10:00:00Z",
    "last_ingest_at": null
  },
  "request_id": "req_ds001",
  "timestamp": "2026-04-05T10:00:00Z"
}

Response Fields

| Field | Type | Description |
| --- | --- | --- |
| dataset_id | string | Unique identifier for the dataset. |
| tenant_id | string | Identifier of the owning tenant. |
| name | string | Human-readable name. |
| description | string \| null | Description of the dataset. |
| source_modality | string | Content type classification. |
| monetization_mode | string | Monetization strategy. |
| pricing_model | string | Pricing model. |
| visibility | string | Visibility level (private, restricted, listed). |
| status | string | Current dataset status (e.g., active, archived). |
| object_count | integer | Total number of objects in the dataset. |
| active_object_count | integer | Number of objects that are approved and indexed. |
| total_bytes | integer | Total storage used by all objects in bytes. |
| created_at | string | ISO 8601 creation timestamp. |
| updated_at | string | ISO 8601 last-modification timestamp. |
| last_ingest_at | string \| null | ISO 8601 timestamp of the most recent object ingestion, or null. |
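Because the timestamps arrive as ISO 8601 strings and `last_ingest_at` may be null, a client will usually convert the response into typed values. Here is one possible sketch: the `Dataset` class and `parse_dataset` function are hypothetical names, and only a subset of the documented fields is mapped.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


def parse_timestamp(value):
    """Parse a timestamp such as 2026-04-05T10:00:00Z; None passes through unchanged."""
    if value is None:
        return None
    # datetime.fromisoformat() accepts a trailing "Z" only from Python 3.11,
    # so it is rewritten as an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


@dataclass
class Dataset:
    dataset_id: str
    name: str
    status: str
    object_count: int
    active_object_count: int
    total_bytes: int
    created_at: datetime
    last_ingest_at: Optional[datetime]


def parse_dataset(envelope):
    """Extract the dataset from the response envelope shown above."""
    d = envelope["data"]
    return Dataset(
        dataset_id=d["dataset_id"],
        name=d["name"],
        status=d["status"],
        object_count=d["object_count"],
        active_object_count=d["active_object_count"],
        total_bytes=d["total_bytes"],
        created_at=parse_timestamp(d["created_at"]),
        last_ingest_at=parse_timestamp(d["last_ingest_at"]),
    )
```

Calling `parse_dataset(response.json())` on a Create Dataset response yields a `Dataset` whose `last_ingest_at` is `None` until the first object is ingested.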

List Datasets

List all datasets visible to the authenticated tenant.

GET /v1/datasets

Authentication: Authorization: Bearer {token} or X-API-Key: ipto_{prefix}_{secret}

Scope: datasets:read

Request

cURL

curl -X GET "https://api.ipto.ai/v1/datasets?limit=20" \
  -H "Authorization: Bearer {token}"

Python

import requests

response = requests.get(
    "https://api.ipto.ai/v1/datasets",
    headers={"Authorization": "Bearer {token}"},
    params={"limit": 20},
)
data = response.json()

JavaScript

const response = await fetch(
  "https://api.ipto.ai/v1/datasets?limit=20",
  {
    headers: { Authorization: "Bearer {token}" },
  }
);
const data = await response.json();

Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| cursor | string | No | Opaque pagination cursor from a previous response. |
| limit | integer | No | Maximum number of datasets to return. Default: 20. |
| status | string | No | Filter by status (e.g., active, archived). |
| visibility | string | No | Filter by visibility (private, restricted, listed). |

Response

{
  "data": {
    "datasets": [
      {
        "dataset_id": "dset_a1b2c3d4e5f6",
        "tenant_id": "tnt_f6e5d4c3b2a1",
        "name": "AP Invoices 2025",
        "description": "Invoice corpus for vendor dispute search",
        "source_modality": "document",
        "monetization_mode": "premium",
        "pricing_model": "demand_curve",
        "visibility": "restricted",
        "status": "active",
        "object_count": 142,
        "active_object_count": 138,
        "total_bytes": 58720256,
        "created_at": "2026-04-05T10:00:00Z",
        "updated_at": "2026-04-05T12:30:00Z",
        "last_ingest_at": "2026-04-05T12:28:00Z"
      }
    ],
    "total": 1
  },
  "request_id": "req_ds002",
  "timestamp": "2026-04-05T12:30:00Z"
}

Response Fields

| Field | Type | Description |
| --- | --- | --- |
| datasets | Dataset[] | Array of dataset objects. See Create Dataset for field definitions. |
| total | integer | Total number of datasets matching the query. |
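The `cursor` parameter implies that large result sets are read page by page. The loop below is a rough sketch of that pattern under one important assumption: this reference does not show which response field carries the cursor for the next page, so the field name `next_cursor` is hypothetical and is exposed as a parameter. The `fetch_page` callable is also an illustration device that keeps the loop independent of any HTTP library.

```python
def iter_datasets(fetch_page, limit=20, cursor_field="next_cursor"):
    """Yield dataset dicts across pages of GET /v1/datasets.

    fetch_page(params) must return the parsed JSON envelope for one request.
    cursor_field is an ASSUMED field name inside "data"; adjust it to whatever
    the API actually returns.
    """
    cursor = None
    while True:
        params = {"limit": limit}
        if cursor is not None:
            params["cursor"] = cursor
        page = fetch_page(params)["data"]
        yield from page["datasets"]
        # Stop when the (assumed) cursor field is missing or empty.
        cursor = page.get(cursor_field)
        if not cursor:
            return
```

With the `requests` library, `fetch_page` could be a small function that calls `requests.get("https://api.ipto.ai/v1/datasets", headers=..., params=params).json()`.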

Get Dataset

Retrieve a single dataset by ID.

GET /v1/datasets/{id}

Authentication: Authorization: Bearer {token} or X-API-Key: ipto_{prefix}_{secret}

Scope: datasets:read

Request

cURL

curl -X GET https://api.ipto.ai/v1/datasets/dset_a1b2c3d4e5f6 \
  -H "Authorization: Bearer {token}"

Python

import requests

response = requests.get(
    "https://api.ipto.ai/v1/datasets/dset_a1b2c3d4e5f6",
    headers={"Authorization": "Bearer {token}"},
)
data = response.json()

JavaScript

const response = await fetch(
  "https://api.ipto.ai/v1/datasets/dset_a1b2c3d4e5f6",
  {
    headers: { Authorization: "Bearer {token}" },
  }
);
const data = await response.json();

Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| id | string | Yes | The dataset ID (path parameter). |

Response

{
  "data": {
    "dataset_id": "dset_a1b2c3d4e5f6",
    "tenant_id": "tnt_f6e5d4c3b2a1",
    "name": "AP Invoices 2025",
    "description": "Invoice corpus for vendor dispute search",
    "source_modality": "document",
    "monetization_mode": "premium",
    "pricing_model": "demand_curve",
    "visibility": "restricted",
    "status": "active",
    "object_count": 142,
    "active_object_count": 138,
    "total_bytes": 58720256,
    "created_at": "2026-04-05T10:00:00Z",
    "updated_at": "2026-04-05T12:30:00Z",
    "last_ingest_at": "2026-04-05T12:28:00Z"
  },
  "request_id": "req_ds003",
  "timestamp": "2026-04-05T12:30:00Z"
}

Response Fields

See Create Dataset for the full list of dataset fields.
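The counters in a dataset response lend themselves to simple derived figures. As an illustration only, the hypothetical helper `summarize` computes how many objects are not yet approved and indexed, and the storage footprint in MiB.

```python
def summarize(dataset):
    """Derive two convenience figures from the documented counter fields."""
    # Objects that exist but are not (yet) approved and indexed.
    inactive = dataset["object_count"] - dataset["active_object_count"]
    # total_bytes is documented in bytes; 1 MiB = 1024 * 1024 bytes.
    size_mib = dataset["total_bytes"] / (1024 * 1024)
    return {"inactive_objects": inactive, "size_mib": round(size_mib, 2)}
```

For the example response above (142 objects, 138 active, 58720256 bytes) this gives 4 inactive objects and 56.0 MiB.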


Update Dataset

Update mutable fields on an existing dataset. Only the fields included in the request body are modified.

PATCH /v1/datasets/{id}

Authentication: Authorization: Bearer {token} or X-API-Key: ipto_{prefix}_{secret}

Scope: datasets:write

Request

cURL

curl -X PATCH https://api.ipto.ai/v1/datasets/dset_a1b2c3d4e5f6 \
  -H "Authorization: Bearer {token}" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "AP Invoices 2025 (Updated)",
    "visibility": "listed"
  }'

Python

import requests

response = requests.patch(
    "https://api.ipto.ai/v1/datasets/dset_a1b2c3d4e5f6",
    headers={"Authorization": "Bearer {token}"},
    json={
        "name": "AP Invoices 2025 (Updated)",
        "visibility": "listed",
    },
)
data = response.json()

JavaScript

const response = await fetch(
  "https://api.ipto.ai/v1/datasets/dset_a1b2c3d4e5f6",
  {
    method: "PATCH",
    headers: {
      Authorization: "Bearer {token}",
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      name: "AP Invoices 2025 (Updated)",
      visibility: "listed",
    }),
  }
);
const data = await response.json();

Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| id | string | Yes | The dataset ID (path parameter). |

Request Body

All fields are optional. Only provided fields are updated.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | No | Updated dataset name. |
| description | string | No | Updated description. |
| source_modality | string | No | Updated content type. |
| monetization_mode | string | No | Updated monetization strategy. |
| pricing_model | string | No | Updated pricing model. |
| visibility | string | No | Updated visibility level. |
| status | string | No | Updated status (e.g., active, archived). |
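Since only the fields present in the body are modified, a client can send just the values that actually changed. A minimal sketch of that idea follows; `diff_patch_body` is a hypothetical helper that compares the current dataset with the desired values and keeps only the differences.

```python
# Fields listed as updatable in the request body table.
UPDATABLE_FIELDS = (
    "name",
    "description",
    "source_modality",
    "monetization_mode",
    "pricing_model",
    "visibility",
    "status",
)


def diff_patch_body(current, desired):
    """Return a PATCH body holding only the updatable fields whose value differs."""
    unknown = set(desired) - set(UPDATABLE_FIELDS)
    if unknown:
        raise ValueError(f"not updatable: {sorted(unknown)}")
    # Unchanged fields are left out, so the server does not touch them.
    return {key: value for key, value in desired.items() if current.get(key) != value}
```

An empty result means there is nothing to update, so the caller can skip the request entirely.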

Response

{
  "data": {
    "dataset_id": "dset_a1b2c3d4e5f6",
    "tenant_id": "tnt_f6e5d4c3b2a1",
    "name": "AP Invoices 2025 (Updated)",
    "description": "Invoice corpus for vendor dispute search",
    "source_modality": "document",
    "monetization_mode": "premium",
    "pricing_model": "demand_curve",
    "visibility": "listed",
    "status": "active",
    "object_count": 142,
    "active_object_count": 138,
    "total_bytes": 58720256,
    "created_at": "2026-04-05T10:00:00Z",
    "updated_at": "2026-04-05T14:00:00Z",
    "last_ingest_at": "2026-04-05T12:28:00Z"
  },
  "request_id": "req_ds004",
  "timestamp": "2026-04-05T14:00:00Z"
}

Response Fields

See Create Dataset for the full list of dataset fields.


Delete Dataset

Delete a dataset and all of its objects. This operation is irreversible.

DELETE /v1/datasets/{id}

Authentication: Authorization: Bearer {token} or X-API-Key: ipto_{prefix}_{secret}

Scope: datasets:write

Request

cURL

curl -X DELETE https://api.ipto.ai/v1/datasets/dset_a1b2c3d4e5f6 \
  -H "Authorization: Bearer {token}"

Python

import requests

response = requests.delete(
    "https://api.ipto.ai/v1/datasets/dset_a1b2c3d4e5f6",
    headers={"Authorization": "Bearer {token}"},
)
data = response.json()

JavaScript

const response = await fetch(
  "https://api.ipto.ai/v1/datasets/dset_a1b2c3d4e5f6",
  {
    method: "DELETE",
    headers: { Authorization: "Bearer {token}" },
  }
);
const data = await response.json();

Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| id | string | Yes | The dataset ID (path parameter). |

Response

{
  "data": {},
  "request_id": "req_ds005",
  "timestamp": "2026-04-05T14:05:00Z"
}

Response Fields

On success, the data field is an empty object.

Irreversible operation

Deleting a dataset permanently removes all associated objects, index entries, and metadata. This action cannot be undone.
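Because the deletion cannot be undone, client code may want a guard in front of the call. The sketch below shows one possible client-side convention, not a feature of the API: the hypothetical `delete_dataset` helper requires the caller to repeat the dataset ID, and the actual HTTP call is passed in as `send_delete`.

```python
def delete_dataset(dataset_id, confirm_id, send_delete):
    """Call send_delete(dataset_id) only when the ID is repeated as confirmation.

    send_delete is any callable that issues DELETE /v1/datasets/{id} and
    returns the parsed response.
    """
    if confirm_id != dataset_id:
        # Refuse before any request is sent.
        raise ValueError("confirmation does not match dataset ID; nothing was deleted")
    return send_delete(dataset_id)
```

With `requests`, `send_delete` could wrap `requests.delete(f"https://api.ipto.ai/v1/datasets/{dataset_id}", headers=...).json()`.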