Observability OTEL Endpoints Documentation
This documentation provides an overview of the OpenTelemetry (OTel)-based observability endpoints within AI Refinery. These endpoints enable you to query logs, metrics, and distributed traces from your AI applications. You can access telemetry data through direct API calls to monitor application performance, debug issues, and gain insights into your AI workloads.
Note: To use the Observability APIs, set the environment variable `USE_AIR_API_V2_BASE_URL=True` in your SDK environment. Queries to the observability endpoints will then use `https://api-prod-k8s.airefinery.accenture.com/`. This feature is available starting from SDK version 1.25.0. This is a temporary setup, and we will transition to the regular URL soon.
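For scripts that configure the SDK from Python, the flag can be set in the process environment before the SDK is imported. This is a minimal sketch, assuming the SDK reads the variable from the process environment (the note does not say when it is read); `OBSERVABILITY_BASE_URL` is a hypothetical constant name used only for direct API calls.

```python
import os

# Set the flag before importing or initializing the SDK.
# Assumption: the SDK picks this variable up from the process environment.
os.environ["USE_AIR_API_V2_BASE_URL"] = "True"

# Base URL quoted in the note above (trailing slash dropped for joining paths).
OBSERVABILITY_BASE_URL = "https://api-prod-k8s.airefinery.accenture.com"
```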
Overview
We provide access to three types of telemetry data collected via OpenTelemetry: logs, metrics, and traces. Each type has its own endpoint:
- `/logs` - Query Loki for AIRefinery logs
    - Logs, stored in Loki, capture time-stamped records of discrete events for debugging and auditing.
- `/metrics` - Query Prometheus for AIRefinery metrics
    - Metrics, stored in Prometheus, aggregate numerical measurements over time for monitoring performance trends.
- `/traces` - Query Tempo for AIRefinery traces
    - Traces, stored in Tempo, track request flows across AIRefinery services for identifying agent workflows and dependencies.
All endpoints support two-scope filtering:
- Organization-level: Filter by `organization_id` (returns data for all projects)
- Project-level: Filter by `project_name` (returns data for a specific project)
Authentication
All endpoints require authentication, just like other AIRefinery services. For SDK versions higher than 1.13.0, a bearer access token is required in the request header.
Additionally, the `organization_id` from the request is enforced: tenants can only access observability data within their own organization.
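The header set used by the curl examples on this page can be assembled in one place. The sketch below is illustrative only: `build_headers` is a hypothetical helper name, and the header names are copied from those examples rather than from a formal header specification.

```python
def build_headers(access_token: str, sdk_version: str) -> dict:
    """Return the request headers used by the curl examples on this page."""
    if not access_token:
        # Fail early instead of sending a request that is bound to be rejected.
        raise ValueError("access_token must be a non-empty string")
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {access_token}",
        "sdk_version": sdk_version,
    }


headers = build_headers("<api-key>", "<sdk-version>")
```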
POST /observability/logs
Query Loki for AIRefinery logs. Users can view application logs with timestamps, filterable by labels and time range. These logs capture request handling, authentication flows, system interactions, and external dependency behavior, helping diagnose runtime issues and system health.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| `organization_id` | string | No | Organization ID to filter logs (indexed as Loki label) |
| `project_name` | string | No | Project name to filter logs (indexed as Loki label) |
| `time_window` | string | No | Time range for logs (e.g., '5m', '1h', '24h'). Default: '24h' |
| `limit` | integer | No | Maximum number of log entries to return. Default: 500 |
Example Usage
Get logs for a specific organization from the last hour:
curl -X POST https://api-prod-k8s.airefinery.accenture.com/observability/logs \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <api-key>" \
-H "sdk_version: <sdk-version>" \
-d '{"organization_id": "org-123", "time_window": "1h"}'
Get up to 100 log entries for a specific project from the last 30 minutes:
curl -X POST https://api-prod-k8s.airefinery.accenture.com/observability/logs \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <api-key>" \
-H "sdk_version: <sdk-version>" \
-d '{
"organization_id": "org-123",
"project_name": "project-x",
"time_window": "30m",
"limit": 100
}'
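The same logs query can be prepared from Python with only the standard library. This is a sketch rather than an official client: `build_logs_request` and `BASE_URL` are hypothetical names, the defaults mirror the parameter table above, and the request is built but not sent because the response schema is not described on this page.

```python
import json
import urllib.request

# Base URL taken from the curl examples above.
BASE_URL = "https://api-prod-k8s.airefinery.accenture.com"


def build_logs_request(access_token, sdk_version, organization_id=None,
                       project_name=None, time_window="24h", limit=500):
    """Build (but do not send) a POST request for /observability/logs."""
    payload = {"time_window": time_window, "limit": limit}
    # Only include the scope filters the caller actually supplied.
    if organization_id is not None:
        payload["organization_id"] = organization_id
    if project_name is not None:
        payload["project_name"] = project_name
    return urllib.request.Request(
        f"{BASE_URL}/observability/logs",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_token}",
            "sdk_version": sdk_version,
        },
        method="POST",
    )


request = build_logs_request("<api-key>", "<sdk-version>",
                             organization_id="org-123",
                             project_name="project-x",
                             time_window="30m", limit=100)
# To send it (needs network access and valid credentials):
# with urllib.request.urlopen(request, timeout=30) as response:
#     raw_body = response.read()
```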
POST /observability/metrics
Query Prometheus for application metrics. This endpoint provides access to a series of metrics covering inference performance, agent operations, token consumption, RAI compliance, and session analytics. For a complete list of available metrics and their descriptions, see the configuration of observability data retrieval.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| `metric` | string | Yes | Metric name from the configuration of observability data retrieval (e.g., 'token_consumption', 'agent_task_total') |
| `organization_id` | string | No | Organization ID to filter metrics |
| `project_name` | string | No | Project name to filter metrics |
| `agent_name` | string | No | Agent name to filter metrics (for agent metrics) |
| `agent_class` | string | No | Agent class to filter metrics (e.g., 'ToolUseAgent', 'SearchAgent'). Useful for aggregating across all agents of a given type |
| `model_key` | string | No | Model identifier for inference metrics |
| `status` | string | No | Status filter for agent metrics (e.g., 'success', 'failure', 'timeout'). Default: 'success' |
| `category` | string | No | RAI rejection category filter: harassment, hate, self-harm, sexual, violence, illicit |
| `percentile` | string | No | Percentile for latency/distribution metrics (e.g., '0.50', '0.95', or '50', '95'). Default: '0.95' |
| `time_window` | string | No | Time range for rate/increase queries (e.g., '5m', '1h', '24h'). Default: '1h' |
| `step` | string | No | Bucket interval for time-series output (e.g., '15m', '1h', '1d'). When provided, returns matrix data over time. Default: '1h' |
Example Usage
Token consumption metrics (organization-level):
curl -X POST https://api-prod-k8s.airefinery.accenture.com/observability/metrics \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <api-key>" \
-H "sdk_version: <sdk-version>" \
-d '{
"metric": "token_consumption",
"organization_id": "org-123",
"time_window": "1h"
}'
Agent task metrics (project-level):
curl -X POST https://api-prod-k8s.airefinery.accenture.com/observability/metrics \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <api-key>" \
-H "sdk_version: <sdk-version>" \
-d '{
"metric": "agent_task_total",
"organization_id": "org-123",
"project_name": "project-x",
"time_window": "1h"
}'
Agent metrics filtered by agent class:
curl -X POST https://api-prod-k8s.airefinery.accenture.com/observability/metrics \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <api-key>" \
-H "sdk_version: <sdk-version>" \
-d '{
"metric": "agent_task_total",
"organization_id": "org-123",
"agent_class": "ToolUseAgent",
"time_window": "1h"
}'
Inference latency at p95 (default):
curl -X POST https://api-prod-k8s.airefinery.accenture.com/observability/metrics \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <api-key>" \
-H "sdk_version: <sdk-version>" \
-d '{
"metric": "inference_latency",
"organization_id": "org-123",
"time_window": "1h"
}'
Inference latency at p50:
curl -X POST https://api-prod-k8s.airefinery.accenture.com/observability/metrics \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <api-key>" \
-H "sdk_version: <sdk-version>" \
-d '{
"metric": "inference_latency",
"organization_id": "org-123",
"time_window": "1h",
"percentile": "0.50"
}'
Token consumption filtered by agent:
curl -X POST https://api-prod-k8s.airefinery.accenture.com/observability/metrics \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <api-key>" \
-H "sdk_version: <sdk-version>" \
-d '{
"metric": "token_consumption",
"organization_id": "org-123",
"agent_name": "orchestrator",
"time_window": "1h"
}'
RAI rejection total filtered by category:
curl -X POST https://api-prod-k8s.airefinery.accenture.com/observability/metrics \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <api-key>" \
-H "sdk_version: <sdk-version>" \
-d '{
"metric": "rai_rejection_total",
"organization_id": "org-123",
"category": "harassment",
"time_window": "1h"
}'
Time-series token consumption (for charting):
curl -X POST https://api-prod-k8s.airefinery.accenture.com/observability/metrics \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <api-key>" \
-H "sdk_version: <sdk-version>" \
-d '{
"metric": "token_consumption",
"organization_id": "org-123",
"time_window": "24h",
"step": "1h"
}'
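Because the metrics endpoint accepts many optional filters, it can help to build the JSON body programmatically and reject misspelled parameter names before sending. The sketch below assumes only the parameter names listed in the table above; `build_metrics_payload` and `ALLOWED_FILTERS` are hypothetical names, and no server-side validation behavior is implied.

```python
import json

# Optional parameters listed in the metrics parameter table.
ALLOWED_FILTERS = {
    "organization_id", "project_name", "agent_name", "agent_class",
    "model_key", "status", "category", "percentile", "time_window", "step",
}


def build_metrics_payload(metric, **filters):
    """Return the JSON-serializable body for POST /observability/metrics."""
    if not metric:
        raise ValueError("metric is required")
    unknown = set(filters) - ALLOWED_FILTERS
    if unknown:
        raise ValueError(f"unknown metrics parameters: {sorted(unknown)}")
    payload = {"metric": metric}
    # Drop filters left as None so the documented defaults apply.
    payload.update({k: v for k, v in filters.items() if v is not None})
    return payload


# Mirrors the time-series token consumption example above.
payload = build_metrics_payload(
    "token_consumption",
    organization_id="org-123",
    time_window="24h",
    step="1h",
    project_name=None,
)
body = json.dumps(payload)
```

Passing a name outside the table, such as `agent="orchestrator"` instead of `agent_name`, raises `ValueError` locally.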
POST /observability/traces
Query Tempo for distributed traces using trace definitions from the configuration of observability data retrieval. This endpoint provides access to request traces across AIRefinery services, enabling you to inspect agent workflows, identify performance bottlenecks, and debug cross-service interactions.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| `trace` | string | Yes | Trace name from the configuration of observability data retrieval (e.g., 'inference_traces', 'distiller_traces') |
| `organization_id` | string | Yes | Organization ID to filter traces |
| `project_name` | string | No | Project name to filter traces |
| `trace_id` | string | No | Specific trace ID to retrieve |
| `time_window` | string | No | Time range for query (e.g., '5m', '1h', '24h') |
| `detail` | boolean | No | Whether to include detailed trace information. Default: true |
| `limit` | integer | No | Maximum number of traces to return. Default: 100 |
Example Usage
Organization-level inference traces:
curl -X POST "https://api-prod-k8s.airefinery.accenture.com/observability/traces" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <api-key>" \
-H "sdk_version: <sdk-version>" \
-d '{
"trace": "inference_traces",
"organization_id": "org-123",
"time_window": "1h"
}'
Project-level distiller traces:
curl -X POST https://api-prod-k8s.airefinery.accenture.com/observability/traces \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <api-key>" \
-H "sdk_version: <sdk-version>" \
-d '{
"trace": "distiller_traces",
"organization_id": "org-123",
"project_name": "project-x",
"time_window": "30m"
}'
Get specific trace by ID:
curl -X POST https://api-prod-k8s.airefinery.accenture.com/observability/traces \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <api-key>" \
-H "sdk_version: <sdk-version>" \
-d '{
"trace": "inference_traces",
"organization_id": "org-123",
"trace_id": "abc123def456"
}'
Search without detailed trace data:
curl -X POST https://api-prod-k8s.airefinery.accenture.com/observability/traces \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <api-key>" \
-H "sdk_version: <sdk-version>" \
-d '{
"trace": "inference_traces",
"organization_id": "org-123",
"detail": false,
"limit": 50
}'
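The traces body differs from the other two endpoints in that both `trace` and `organization_id` are required. A minimal sketch of a payload builder that enforces this locally follows; `build_traces_payload` is a hypothetical name, and the defaults for `detail` and `limit` are copied from the parameter table.

```python
import json


def build_traces_payload(trace, organization_id, project_name=None,
                         trace_id=None, time_window=None,
                         detail=True, limit=100):
    """Return the JSON-serializable body for POST /observability/traces."""
    if not trace or not organization_id:
        # Both fields are marked as required in the parameter table.
        raise ValueError("trace and organization_id are required")
    payload = {
        "trace": trace,
        "organization_id": organization_id,
        "detail": detail,
        "limit": limit,
    }
    optional = {
        "project_name": project_name,
        "trace_id": trace_id,
        "time_window": time_window,
    }
    # Add only the optional fields that were supplied.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


# Mirrors the "search without detailed trace data" example above.
search = build_traces_payload("inference_traces", "org-123",
                              detail=False, limit=50)
body = json.dumps(search)  # Python False is serialized as JSON false
```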
Notes
- The `organization_id` from authentication is required in the request payload to restrict access to observability data within the organization.
- Time windows support units: 'm' (minutes), 'h' (hours), 'd' (days). Default: '1h'.
- The `percentile` parameter accepts values in 0–1 format (e.g., `0.95`) or 1–100 format (e.g., `95`). Default: `0.95` (p95).
- Any metric preset can return time-series data by passing the `step` parameter (e.g., `"step": "15m"`). This returns matrix data with multiple data points over time, where `time_window` controls the lookback period and `step` controls the bucket interval.
- The `agent_class` filter is available on all agent-related metrics, letting you aggregate by agent type (e.g., `ToolUseAgent`, `SearchAgent`) instead of individual agent names.
- The `status` filter defaults to `success` for `agent_performance_rate`. Pass `failure` or `timeout` to query other status rates.
- The `category` filter on `rai_rejection_total` lets you narrow down rejections to a specific category (e.g., `harassment`, `hate`, `violence`).
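The time-window units and the two percentile formats described in these notes can be checked on the client before a request is sent. The helpers below are an illustrative sketch: `window_seconds` and `normalize_percentile` are hypothetical names, and treating values up to 1 as fractions is an assumption, since the notes do not say how the service resolves the boundary value `1`.

```python
import re

# Units listed in the notes: minutes, hours, days.
_UNIT_SECONDS = {"m": 60, "h": 3600, "d": 86400}


def window_seconds(window: str) -> int:
    """Convert a window such as '15m', '1h' or '1d' to seconds."""
    match = re.fullmatch(r"(\d+)([mhd])", window)
    if match is None:
        raise ValueError(f"unsupported time window: {window!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def normalize_percentile(value: str) -> float:
    """Map '0.95' or '95' to the fraction 0.95."""
    number = float(value)
    if 0 < number <= 1:
        return number  # already a fraction (assumption for the value 1)
    if 1 < number <= 100:
        return number / 100  # 1-100 format
    raise ValueError(f"percentile out of range: {value!r}")


# Rough number of buckets to expect for a 24h lookback with a 1h step.
# The exact count returned by the service may differ.
approx_points = window_seconds("24h") // window_seconds("1h")
```

With `time_window` set to `24h` and `step` set to `1h`, this estimate gives about 24 buckets, which is a useful sanity check when sizing a chart.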