Datasets¶
Endpoints for managing datasets. A dataset is a named collection of objects (files) owned by a tenant.
Create Dataset¶
Create a new dataset within the authenticated tenant.
Authentication: `Authorization: Bearer {token}` or `X-API-Key: ipto_{prefix}_{secret}`
Scope: `datasets:write`
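The two authentication schemes above can be wrapped in a small helper. This is a minimal sketch rather than part of an official client: `auth_headers` and `API_KEY_PATTERN` are hypothetical names, and the characters allowed in the key's prefix and secret are an assumption based only on the `ipto_{prefix}_{secret}` placeholder.

```python
import re

# Assumed key shape, inferred from the "ipto_{prefix}_{secret}" placeholder.
# The real character set for prefix and secret may differ.
API_KEY_PATTERN = re.compile(r"^ipto_[A-Za-z0-9]+_[A-Za-z0-9]+$")


def auth_headers(token=None, api_key=None):
    """Return request headers for exactly one of the two documented schemes."""
    if (token is None) == (api_key is None):
        raise ValueError("provide exactly one of token or api_key")
    if token is not None:
        return {"Authorization": f"Bearer {token}"}
    if not API_KEY_PATTERN.match(api_key):
        raise ValueError("API key does not look like ipto_{prefix}_{secret}")
    return {"X-API-Key": api_key}
```

The returned dict can be passed as `headers=` to whichever HTTP client is in use.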
Request¶
cURL:

    curl -X POST https://api.ipto.ai/v1/datasets \
      -H "Authorization: Bearer {token}" \
      -H "Content-Type: application/json" \
      -d '{
        "name": "AP Invoices 2025",
        "description": "Invoice corpus for vendor dispute search",
        "source_modality": "document",
        "monetization_mode": "premium",
        "pricing_model": "demand_curve",
        "visibility": "restricted"
      }'

Python:

    import requests

    response = requests.post(
        "https://api.ipto.ai/v1/datasets",
        headers={"Authorization": "Bearer {token}"},
        json={
            "name": "AP Invoices 2025",
            "description": "Invoice corpus for vendor dispute search",
            "source_modality": "document",
            "monetization_mode": "premium",
            "pricing_model": "demand_curve",
            "visibility": "restricted",
        },
    )
    data = response.json()

JavaScript:

    const response = await fetch("https://api.ipto.ai/v1/datasets", {
      method: "POST",
      headers: {
        Authorization: "Bearer {token}",
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        name: "AP Invoices 2025",
        description: "Invoice corpus for vendor dispute search",
        source_modality: "document",
        monetization_mode: "premium",
        pricing_model: "demand_curve",
        visibility: "restricted",
      }),
    });
    const data = await response.json();
Request Body¶
| Field | Type | Required | Description |
|---|---|---|---|
| `name` | string | Yes | Human-readable name for the dataset. |
| `description` | string | No | Optional description of the dataset's contents and purpose. |
| `source_modality` | string | No | Type of content. Default: "document". |
| `monetization_mode` | string | No | Monetization strategy. One of `open`, `premium`, `restricted`. Default: "open". |
| `pricing_model` | string | No | Pricing model applied to the dataset. Default: "demand_curve". |
| `visibility` | string | No | Who can discover the dataset. One of `private`, `restricted`, `listed`. Default: "private". |
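The defaults and allowed values in the table above can be checked on the client before a request is sent. The sketch below assumes a hypothetical helper named `build_create_payload`; it validates only the two fields whose allowed values are listed here and passes `source_modality` and `pricing_model` through unchecked, since their full value sets are not documented on this page.

```python
# Allowed values as listed in the Request Body table.
MONETIZATION_MODES = {"open", "premium", "restricted"}
VISIBILITIES = {"private", "restricted", "listed"}


def build_create_payload(
    name,
    description=None,
    source_modality="document",
    monetization_mode="open",
    pricing_model="demand_curve",
    visibility="private",
):
    """Build a Create Dataset body, applying the documented defaults."""
    if not name:
        raise ValueError("name is required")
    if monetization_mode not in MONETIZATION_MODES:
        raise ValueError(f"unknown monetization_mode: {monetization_mode!r}")
    if visibility not in VISIBILITIES:
        raise ValueError(f"unknown visibility: {visibility!r}")
    payload = {
        "name": name,
        "source_modality": source_modality,
        "monetization_mode": monetization_mode,
        "pricing_model": pricing_model,
        "visibility": visibility,
    }
    # description is optional, so it is omitted entirely when not given.
    if description is not None:
        payload["description"] = description
    return payload
```

The result can be passed as the `json=` argument of `requests.post`, as in the Python request example.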
Response¶
    {
      "data": {
        "dataset_id": "dset_a1b2c3d4e5f6",
        "tenant_id": "tnt_f6e5d4c3b2a1",
        "name": "AP Invoices 2025",
        "description": "Invoice corpus for vendor dispute search",
        "source_modality": "document",
        "monetization_mode": "premium",
        "pricing_model": "demand_curve",
        "visibility": "restricted",
        "status": "active",
        "object_count": 0,
        "active_object_count": 0,
        "total_bytes": 0,
        "created_at": "2026-04-05T10:00:00Z",
        "updated_at": "2026-04-05T10:00:00Z",
        "last_ingest_at": null
      },
      "request_id": "req_ds001",
      "timestamp": "2026-04-05T10:00:00Z"
    }
Response Fields¶
| Field | Type | Description |
|---|---|---|
| `dataset_id` | string | Unique identifier for the dataset. |
| `tenant_id` | string | Identifier of the owning tenant. |
| `name` | string | Human-readable name. |
| `description` | string \| null | Description of the dataset. |
| `source_modality` | string | Content type classification. |
| `monetization_mode` | string | Monetization strategy. |
| `pricing_model` | string | Pricing model. |
| `visibility` | string | Visibility level (`private`, `restricted`, `listed`). |
| `status` | string | Current dataset status (e.g., `active`, `archived`). |
| `object_count` | integer | Total number of objects in the dataset. |
| `active_object_count` | integer | Number of objects that are approved and indexed. |
| `total_bytes` | integer | Total storage used by all objects in bytes. |
| `created_at` | string | ISO 8601 creation timestamp. |
| `updated_at` | string | ISO 8601 last-modification timestamp. |
| `last_ingest_at` | string \| null | ISO 8601 timestamp of the most recent object ingestion, or null. |
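The field list above maps naturally onto a typed record. The following is one possible sketch, not an official model: the `Dataset` class and `parse_timestamp` helper are hypothetical, and the code assumes the `data` object contains exactly the fields in the table (an extra field would raise `TypeError`).

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


def parse_timestamp(value):
    """Parse an ISO 8601 UTC timestamp such as 2026-04-05T10:00:00Z, or pass None through."""
    if value is None:
        return None
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


@dataclass
class Dataset:
    dataset_id: str
    tenant_id: str
    name: str
    description: Optional[str]
    source_modality: str
    monetization_mode: str
    pricing_model: str
    visibility: str
    status: str
    object_count: int
    active_object_count: int
    total_bytes: int
    created_at: datetime
    updated_at: datetime
    last_ingest_at: Optional[datetime]

    @classmethod
    def from_dict(cls, data):
        """Build a Dataset from the "data" object of a response envelope."""
        fields = dict(data)
        for key in ("created_at", "updated_at", "last_ingest_at"):
            fields[key] = parse_timestamp(fields[key])
        return cls(**fields)
```

With the Python request example, usage would look like `Dataset.from_dict(response.json()["data"])`.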
List Datasets¶
List all datasets visible to the authenticated tenant.
Authentication: `Authorization: Bearer {token}` or `X-API-Key: ipto_{prefix}_{secret}`
Scope: `datasets:read`
Request¶
Parameters¶
| Field | Type | Required | Description |
|---|---|---|---|
| `cursor` | string | No | Opaque pagination cursor from a previous response. |
| `limit` | integer | No | Maximum number of datasets to return. Default: 20. |
| `status` | string | No | Filter by status (e.g., `active`, `archived`). |
| `visibility` | string | No | Filter by visibility (`private`, `restricted`, `listed`). |
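The parameters above could be assembled into a request URL as sketched below. This page does not show the request line for this endpoint, so the sketch assumes a `GET` on the same collection URL used by Create Dataset with the parameters sent as a query string. `build_list_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Collection URL taken from the Create Dataset example; its use for
# listing is an assumption.
BASE_URL = "https://api.ipto.ai/v1/datasets"


def build_list_url(cursor=None, limit=None, status=None, visibility=None):
    """Return the list URL, including only the parameters that were supplied."""
    params = {
        "cursor": cursor,
        "limit": limit,
        "status": status,
        "visibility": visibility,
    }
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}?{query}" if query else BASE_URL
```

Omitted parameters are left out of the query string, so the server-side defaults (such as `limit` 20) apply.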
Response¶
    {
      "data": {
        "datasets": [
          {
            "dataset_id": "dset_a1b2c3d4e5f6",
            "tenant_id": "tnt_f6e5d4c3b2a1",
            "name": "AP Invoices 2025",
            "description": "Invoice corpus for vendor dispute search",
            "source_modality": "document",
            "monetization_mode": "premium",
            "pricing_model": "demand_curve",
            "visibility": "restricted",
            "status": "active",
            "object_count": 142,
            "active_object_count": 138,
            "total_bytes": 58720256,
            "created_at": "2026-04-05T10:00:00Z",
            "updated_at": "2026-04-05T12:30:00Z",
            "last_ingest_at": "2026-04-05T12:28:00Z"
          }
        ],
        "total": 1
      },
      "request_id": "req_ds002",
      "timestamp": "2026-04-05T12:30:00Z"
    }
Response Fields¶
| Field | Type | Description |
|---|---|---|
| `datasets` | Dataset[] | Array of dataset objects. See Create Dataset for field definitions. |
| `total` | integer | Total number of datasets matching the query. |
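As an illustration of how the list response might be consumed, the sketch below totals a page of results. `summarize_listing` is a hypothetical helper. It treats the difference between `object_count` and `active_object_count` as objects that are not yet approved and indexed, which follows from the field descriptions but is not stated explicitly.

```python
def summarize_listing(envelope):
    """Summarize one page of a List Datasets response envelope."""
    data = envelope["data"]
    datasets = data["datasets"]
    total_bytes = sum(d["total_bytes"] for d in datasets)
    return {
        "returned": len(datasets),
        "total": data["total"],
        # Objects present in the dataset but not counted as active.
        "not_yet_active": sum(
            d["object_count"] - d["active_object_count"] for d in datasets
        ),
        "total_mib": total_bytes / (1024 * 1024),
    }


# Trimmed copy of the sample response above.
sample = {
    "data": {
        "datasets": [
            {"object_count": 142, "active_object_count": 138, "total_bytes": 58720256}
        ],
        "total": 1,
    }
}
print(summarize_listing(sample))
# {'returned': 1, 'total': 1, 'not_yet_active': 4, 'total_mib': 56.0}
```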
Get Dataset¶
Retrieve a single dataset by ID.
Authentication: `Authorization: Bearer {token}` or `X-API-Key: ipto_{prefix}_{secret}`
Scope: `datasets:read`
Request¶
Parameters¶
| Field | Type | Required | Description |
|---|---|---|---|
| `id` | string | Yes | The dataset ID (path parameter). |
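Because `id` is a path parameter, a client has to place it in the URL. The sketch below assumes the path shape `https://api.ipto.ai/v1/datasets/{id}` (taken from the Update Dataset example) and a `GET` method, neither of which is shown for this endpoint on this page. `build_get_request` is a hypothetical helper that only builds the request arguments and sends nothing.

```python
from urllib.parse import quote

# Collection URL from the Create Dataset example.
BASE_URL = "https://api.ipto.ai/v1/datasets"


def build_get_request(dataset_id, token):
    """Return keyword arguments describing a Get Dataset request."""
    if not dataset_id:
        raise ValueError("dataset_id is required")
    return {
        "method": "GET",  # assumed; the page does not state the method
        # Percent-encode the ID so it cannot alter the path structure.
        "url": f"{BASE_URL}/{quote(dataset_id, safe='')}",
        "headers": {"Authorization": f"Bearer {token}"},
    }
```

With the `requests` library from the Create Dataset example, the result could be sent as `requests.request(**build_get_request("dset_a1b2c3d4e5f6", token))`.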
Response¶
    {
      "data": {
        "dataset_id": "dset_a1b2c3d4e5f6",
        "tenant_id": "tnt_f6e5d4c3b2a1",
        "name": "AP Invoices 2025",
        "description": "Invoice corpus for vendor dispute search",
        "source_modality": "document",
        "monetization_mode": "premium",
        "pricing_model": "demand_curve",
        "visibility": "restricted",
        "status": "active",
        "object_count": 142,
        "active_object_count": 138,
        "total_bytes": 58720256,
        "created_at": "2026-04-05T10:00:00Z",
        "updated_at": "2026-04-05T12:30:00Z",
        "last_ingest_at": "2026-04-05T12:28:00Z"
      },
      "request_id": "req_ds003",
      "timestamp": "2026-04-05T12:30:00Z"
    }
Response Fields¶
See Create Dataset for the full list of dataset fields.
Update Dataset¶
Update mutable fields on an existing dataset. Only the fields included in the request body are modified.
Authentication: `Authorization: Bearer {token}` or `X-API-Key: ipto_{prefix}_{secret}`
Scope: `datasets:write`
Request¶
JavaScript:

    const response = await fetch(
      "https://api.ipto.ai/v1/datasets/dset_a1b2c3d4e5f6",
      {
        method: "PATCH",
        headers: {
          Authorization: "Bearer {token}",
          "Content-Type": "application/json",
        },
        body: JSON.stringify({
          name: "AP Invoices 2025 (Updated)",
          visibility: "listed",
        }),
      }
    );
    const data = await response.json();
Parameters¶
| Field | Type | Required | Description |
|---|---|---|---|
| `id` | string | Yes | The dataset ID (path parameter). |
Request Body¶
All fields are optional. Only provided fields are updated.
| Field | Type | Required | Description |
|---|---|---|---|
| `name` | string | No | Updated dataset name. |
| `description` | string | No | Updated description. |
| `source_modality` | string | No | Updated content type. |
| `monetization_mode` | string | No | Updated monetization strategy. |
| `pricing_model` | string | No | Updated pricing model. |
| `visibility` | string | No | Updated visibility level. |
| `status` | string | No | Updated status (e.g., `active`, `archived`). |
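Since only the fields present in the body are modified, a client should send just the fields it intends to change. The sketch below shows one way to enforce that in Python. `build_update_body` and `MUTABLE_FIELDS` are hypothetical names, and the set of fields is copied from the table above. Whether sending `null` clears a field such as `description` is not stated here, so the sketch passes values through unchanged.

```python
# Field names copied from the Update Dataset request body table.
MUTABLE_FIELDS = {
    "name",
    "description",
    "source_modality",
    "monetization_mode",
    "pricing_model",
    "visibility",
    "status",
}


def build_update_body(**changes):
    """Return a PATCH body containing only the supplied, updatable fields."""
    if not changes:
        raise ValueError("no fields to update")
    unknown = set(changes) - MUTABLE_FIELDS
    if unknown:
        # Read-only fields such as dataset_id or object_count end up here.
        raise ValueError(f"not updatable: {sorted(unknown)}")
    return dict(changes)
```

For example, `build_update_body(name="AP Invoices 2025 (Updated)", visibility="listed")` produces the same body as the JavaScript request in this section.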
Response¶
    {
      "data": {
        "dataset_id": "dset_a1b2c3d4e5f6",
        "tenant_id": "tnt_f6e5d4c3b2a1",
        "name": "AP Invoices 2025 (Updated)",
        "description": "Invoice corpus for vendor dispute search",
        "source_modality": "document",
        "monetization_mode": "premium",
        "pricing_model": "demand_curve",
        "visibility": "listed",
        "status": "active",
        "object_count": 142,
        "active_object_count": 138,
        "total_bytes": 58720256,
        "created_at": "2026-04-05T10:00:00Z",
        "updated_at": "2026-04-05T14:00:00Z",
        "last_ingest_at": "2026-04-05T12:28:00Z"
      },
      "request_id": "req_ds004",
      "timestamp": "2026-04-05T14:00:00Z"
    }
Response Fields¶
See Create Dataset for the full list of dataset fields.
Delete Dataset¶
Delete a dataset and all of its objects. This operation is irreversible.
Authentication: `Authorization: Bearer {token}` or `X-API-Key: ipto_{prefix}_{secret}`
Scope: `datasets:write`
Request¶
Parameters¶
| Field | Type | Required | Description |
|---|---|---|---|
| `id` | string | Yes | The dataset ID (path parameter). |
Response¶
Response Fields¶
Returns an empty object on success.
Irreversible operation
Deleting a dataset permanently removes all associated objects, index entries, and metadata. This action cannot be undone.
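Given that deletion cannot be undone, client code may want a guard in front of the call. The sketch below is one possible pattern, not a feature of the API: `build_delete_request` is a hypothetical helper that refuses to build the request unless the caller repeats the dataset ID as confirmation. The `DELETE` method and the `https://api.ipto.ai/v1/datasets/{id}` path are assumptions, since the request line for this endpoint is not shown on this page.

```python
# Collection URL from the Create Dataset example.
BASE_URL = "https://api.ipto.ai/v1/datasets"


def build_delete_request(dataset_id, confirm, token):
    """Return request arguments for a delete, only if confirm repeats the ID."""
    if not dataset_id:
        raise ValueError("dataset_id is required")
    if confirm != dataset_id:
        raise ValueError("confirmation does not match dataset ID; refusing to delete")
    return {
        "method": "DELETE",  # assumed; the page does not state the method
        "url": f"{BASE_URL}/{dataset_id}",
        "headers": {"Authorization": f"Bearer {token}"},
    }
```

Nothing is sent until the returned arguments are handed to an HTTP client, for example `requests.request(**kwargs)`.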