AI Agent Data Access¶
The problem¶
AI agents need programmatic access to curated private datasets. They cannot browse catalogs, negotiate licenses, or manually download files. They need a reliable API that provides scoped access, returns structured results, and tracks usage for billing and compliance.
Most data sources are not agent-ready. They require manual authentication flows, lack fine-grained access controls, and have no built-in mechanism for usage metering or citation tracking.
The solution¶
IPTO provides a REST API designed for machine access. Agents authenticate with scoped API keys, search across authorized datasets using hybrid retrieval, retrieve structured results with citation locators, and generate auditable billing events on every interaction.
How it works¶
1. Create an API key with scoped access¶
An administrator creates a tenant-scoped API key with specific permissions and optional dataset restrictions.
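As a rough illustration of least-privilege key creation, the sketch below builds the request body for `POST /v1/api-keys` using the field names from the example workflow on this page. The helper name `build_api_key_request` is hypothetical, and the check that scopes look like `resource:action` is an assumption based on the sample values (`search:query`, `usage:read`), not a documented rule.

```python
from typing import Any


def build_api_key_request(
    name: str, scopes: list[str], dataset_ids: list[str]
) -> dict[str, Any]:
    """Build a JSON body for POST /v1/api-keys with an explicit dataset allow list."""
    if not name.strip():
        raise ValueError("name must not be empty")
    if not scopes:
        raise ValueError("at least one scope is required")
    for scope in scopes:
        # Assumption: scopes follow the 'resource:action' pattern seen in the examples.
        resource, sep, action = scope.partition(":")
        if not (resource and sep and action):
            raise ValueError(f"scope {scope!r} is not of the form 'resource:action'")
    if not dataset_ids:
        raise ValueError("allow_list mode needs at least one dataset ID")
    return {
        "name": name,
        "scopes": sorted(set(scopes)),  # drop duplicates, keep a stable order
        "dataset_access_mode": "allow_list",
        "dataset_ids": sorted(set(dataset_ids)),
    }
```

Validating the body before sending it keeps an agent from silently requesting broader access than intended.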
2. Search datasets¶
The agent sends a search query to the /v1/search endpoint. IPTO runs hybrid search (lexical + vector) across all authorized datasets and returns ranked results with snippets and citation locators.
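This page does not say how IPTO merges the lexical and vector rankings. Purely as an illustration of what hybrid retrieval can look like, the sketch below uses reciprocal rank fusion (RRF), one common way to combine two ranked lists. The function name, the constant `k = 60`, and the sample IDs are assumptions, not IPTO's implementation.

```python
from collections import defaultdict


def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60, top_k: int = 10
) -> list[tuple[str, float]]:
    """Fuse several ranked ID lists into one ranking using RRF scores."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, unit_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank); items found by both lists add up.
            scores[unit_id] += 1.0 / (k + rank)
    # Sort by descending score, breaking ties by ID for a deterministic order.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ordered[:top_k]


lexical_hits = ["a", "b", "c"]  # hypothetical keyword ranking
vector_hits = ["b", "d", "a"]   # hypothetical embedding ranking
fused = reciprocal_rank_fusion([lexical_hits, vector_hits])
```

In this sample, items that appear in both lists (`a` and `b`) outrank items found by only one retriever, which is the behavior a hybrid mode is meant to provide.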
3. Retrieve and cite results¶
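Each returned result carries a retrieval_event_id. When the agent uses a result in its output, it records a citation event by sending a POST request to /v1/retrieval-events/{retrieval_event_id}/cite. The citation event supports provenance tracking and downstream billing.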
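One way an agent might decide which citations to record is sketched below. The URL pattern and payload fields come from the example workflow on this page. The helper name `build_citation_requests`, and the choice to cite each `search_unit_id` at most once per run, are assumptions made for illustration.

```python
from typing import Any

BASE_URL = "https://api.ipto.ai/v1"


def build_citation_requests(
    results: list[dict[str, Any]], used_unit_ids: set[str], run_id: str
) -> list[tuple[str, dict[str, str]]]:
    """Return (url, json_body) pairs for the results the agent actually used."""
    pending: list[tuple[str, dict[str, str]]] = []
    seen: set[str] = set()
    for result in results:
        unit_id = result["search_unit_id"]
        # Skip results the agent did not use, and units already queued for this run.
        if unit_id not in used_unit_ids or unit_id in seen:
            continue
        seen.add(unit_id)
        url = f"{BASE_URL}/retrieval-events/{result['retrieval_event_id']}/cite"
        body = {
            "consumer_type": "agent_run",
            "consumer_id": run_id,
            "search_unit_id": unit_id,
        }
        pending.append((url, body))
    return pending
```

Building the requests separately from sending them makes it easy to log or review what will be billed as a citation before any call is made.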
4. Track usage¶
All search, retrieval, citation, and download events are logged. Administrators can review agent activity, spend, and access patterns through the buyer activity API.
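The response format of the buyer activity API is not shown on this page, so the following is only a sketch of how an operator might summarize activity once events have been fetched. The record fields `principal_id` and `event_type`, and the helper name `summarize_activity`, are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Any


def summarize_activity(events: list[dict[str, Any]]) -> dict[str, dict[str, int]]:
    """Count events per principal and event type."""
    summary: dict[str, Counter] = defaultdict(Counter)
    for event in events:
        # Assumed record shape: {"principal_id": ..., "event_type": ...}
        summary[event["principal_id"]][event["event_type"]] += 1
    return {principal: dict(counts) for principal, counts in summary.items()}


sample_events = [
    {"principal_id": "key_research_agent", "event_type": "search"},
    {"principal_id": "key_research_agent", "event_type": "citation"},
    {"principal_id": "key_research_agent", "event_type": "search"},
    {"principal_id": "key_support_agent", "event_type": "download"},
]
activity = summarize_activity(sample_events)
```

A per-principal breakdown like this is a simple starting point for spotting agents whose access patterns differ from what their key was issued for.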
Example workflow¶
Create an API key
Search datasets
Install the IPTO skill for AI agents
AI agent integration
After installing the IPTO skill, AI agents can interact with IPTO through natural language, without needing to construct API calls directly.
Create an API key
curl -X POST https://api.ipto.ai/v1/api-keys \
  -H "Authorization: Bearer $SESSION_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "research-agent-prod",
    "scopes": ["search:query", "usage:read"],
    "dataset_access_mode": "allow_list",
    "dataset_ids": ["dset_abc123", "dset_def456"]
  }'
Search datasets
curl -X POST https://api.ipto.ai/v1/search \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "quarterly revenue breakdown by region",
    "top_k": 10,
    "retrieval_mode": "hybrid",
    "include_citations": true
  }'
Record a citation (Python, complete workflow: create a key, search, then cite)
import requests

BASE_URL = "https://api.ipto.ai/v1"
SESSION_TOKEN = "your_session_token"

# Create an API key restricted to two datasets
key_response = requests.post(
    f"{BASE_URL}/api-keys",
    headers={
        "Authorization": f"Bearer {SESSION_TOKEN}",
        "Content-Type": "application/json",
    },
    json={
        "name": "research-agent-prod",
        "scopes": ["search:query", "usage:read"],
        "dataset_access_mode": "allow_list",
        "dataset_ids": ["dset_abc123", "dset_def456"],
    },
)
key_response.raise_for_status()  # stop early on an HTTP error
api_key = key_response.json()["data"]["secret"]

# Search datasets with the new key
search_response = requests.post(
    f"{BASE_URL}/search",
    headers={
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    },
    json={
        "query": "quarterly revenue breakdown by region",
        "top_k": 10,
        "retrieval_mode": "hybrid",
        "include_citations": True,
    },
)
search_response.raise_for_status()
results = search_response.json()["data"]["results"]

# Record a citation for the top result, if the search returned any
if results:
    top_result = results[0]
    cite_response = requests.post(
        f"{BASE_URL}/retrieval-events/{top_result['retrieval_event_id']}/cite",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        json={
            "consumer_type": "agent_run",
            "consumer_id": "run_20260405_001",
            "search_unit_id": top_result["search_unit_id"],
        },
    )
    cite_response.raise_for_status()
JavaScript (complete workflow; run as an ES module because it uses top-level await)
const BASE_URL = "https://api.ipto.ai/v1";
const SESSION_TOKEN = "your_session_token";

// Create an API key restricted to two datasets
const keyResponse = await fetch(`${BASE_URL}/api-keys`, {
  method: "POST",
  headers: {
    Authorization: `Bearer ${SESSION_TOKEN}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    name: "research-agent-prod",
    scopes: ["search:query", "usage:read"],
    dataset_access_mode: "allow_list",
    dataset_ids: ["dset_abc123", "dset_def456"],
  }),
});
const { data: keyData } = await keyResponse.json();
const apiKey = keyData.secret;

// Search datasets with the new key
const searchResponse = await fetch(`${BASE_URL}/search`, {
  method: "POST",
  headers: {
    Authorization: `Bearer ${apiKey}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    query: "quarterly revenue breakdown by region",
    top_k: 10,
    retrieval_mode: "hybrid",
    include_citations: true,
  }),
});
const { data: searchData } = await searchResponse.json();

// Record a citation for the top result, if the search returned any
if (searchData.results.length > 0) {
  const topResult = searchData.results[0];
  await fetch(
    `${BASE_URL}/retrieval-events/${topResult.retrieval_event_id}/cite`,
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        consumer_type: "agent_run",
        consumer_id: "run_20260405_001",
        search_unit_id: topResult.search_unit_id,
      }),
    }
  );
}
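The examples above keep error handling to a minimum. Because agents call the API without a human in the loop, some retry logic is usually worth adding. The sketch below is one possible shape, assuming that HTTP 429 and 5xx responses are transient; this page does not state which status codes IPTO returns. The name `call_with_retries` and its parameters are hypothetical, and `send` stands in for any function that performs one request and returns `(status_code, json_body)`.

```python
import time
from typing import Any, Callable

# Assumption: these statuses indicate a temporary condition worth retrying.
RETRYABLE_STATUS = {429, 500, 502, 503, 504}


def call_with_retries(
    send: Callable[[], tuple[int, dict[str, Any]]],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Call send() until it succeeds, retrying transient errors with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if 200 <= status < 300:
            return body
        # Give up on non-retryable errors or when attempts are exhausted.
        if status not in RETRYABLE_STATUS or attempt == max_attempts:
            raise RuntimeError(f"request failed with HTTP {status}")
        # Wait base_delay, then 2x, 4x, ... before the next attempt.
        sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("retry loop exited unexpectedly")
```

Passing `sleep` as a parameter keeps the helper testable without real delays. Note that retrying a search or citation call may create additional billing events if the first request was in fact processed, so the retry policy should be chosen with that in mind.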
Benefits¶
Why IPTO for AI agent data access
- Scoped access: API keys can be restricted to specific datasets and permission scopes, enforcing least-privilege access for each agent.
- Audit trail: Every search, retrieval, citation, and download is logged with timestamps, principal IDs, and billing amounts.
- Metered billing: Agents are billed per retrieval and per citation, giving operators clear cost visibility without flat-fee commitments.
- No manual intervention: Agents self-serve through the API. No human needs to approve individual searches or downloads.
- Citation tracking: Retrieval event IDs allow agents to record provenance when using results in downstream outputs.
- Hybrid search: Agents benefit from combined lexical and vector retrieval, supporting both keyword-precise and semantic queries.
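To make the metered billing point concrete, the sketch below estimates spend from event counts. The per-event prices are placeholders chosen for the arithmetic only; this page does not list IPTO's actual rates, and the function name `estimate_spend` is hypothetical.

```python
from decimal import Decimal


def estimate_spend(
    retrievals: int,
    citations: int,
    price_per_retrieval: Decimal,
    price_per_citation: Decimal,
) -> Decimal:
    """Estimate metered spend as retrievals * rate + citations * rate."""
    if retrievals < 0 or citations < 0:
        raise ValueError("event counts must not be negative")
    return retrievals * price_per_retrieval + citations * price_per_citation


# Placeholder rates, not IPTO pricing: 1,000 retrievals and 120 citations.
monthly_estimate = estimate_spend(
    retrievals=1000,
    citations=120,
    price_per_retrieval=Decimal("0.002"),
    price_per_citation=Decimal("0.010"),
)
```

With these placeholder rates the estimate is 2.000 for retrievals plus 1.200 for citations, or 3.200 in total. `Decimal` is used instead of `float` so that currency amounts do not pick up binary rounding errors.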