Observability REST Endpoints

This page documents the REST endpoints for querying observability data from AI Refinery. These OpenTelemetry (OTel)-based endpoints enable you to query logs, metrics, and distributed traces from your AI applications through direct API calls.

Note: To use the Observability APIs, set the environment variable USE_AIR_API_V2_BASE_URL=True in your SDK environment. Queries to the observability endpoints will then use https://api-prod-k8s.airefinery.accenture.com/. This feature is available starting from SDK version 1.25.0.
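If your application is written in Python, one way to set this flag is from within the process itself. This is only a minimal sketch: it assumes the SDK reads the variable when it is imported or initialized, which the note above does not state, so the flag is set as early as possible.

```python
import os

# Set the flag before the SDK is imported or initialized.
# Assumption: that is the point at which the SDK reads it.
os.environ["USE_AIR_API_V2_BASE_URL"] = "True"

print(os.environ["USE_AIR_API_V2_BASE_URL"])
```

Exporting the variable in the shell that launches your application should have the same effect.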

For SDK-based access, see Observability API.

Overview

AI Refinery exposes three types of telemetry data collected via OpenTelemetry: logs, metrics, and traces. Each type has its own endpoint under /observability:

  • /logs - Query AIRefinery logs

    • Logs capture time-stamped records of discrete events for debugging and auditing.
  • /metrics - Query AIRefinery metrics

    • Metrics aggregate numerical measurements over time for monitoring performance trends.
  • /traces - Query AIRefinery traces

    • Traces track request flows across AIRefinery services for identifying agent workflows and dependencies.

All endpoints support filtering at two scopes:

  • Organization-level: Filter by organization_id (returns data for all projects in the organization)

  • Project-level: Filter by project_name (returns data for a specific project)
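The difference between the two scopes comes down to which fields appear in the request body. The sketch below illustrates this with a hypothetical helper, scoped_filter, which is not part of the SDK; the field names are the ones listed above.

```python
def scoped_filter(organization_id, project_name=None):
    """Return the filter fields for an organization- or project-level query."""
    body = {"organization_id": organization_id}
    if project_name is not None:
        # Adding project_name narrows the query to a single project.
        body["project_name"] = project_name
    return body


org_level = scoped_filter("org-123")
project_level = scoped_filter("org-123", "project-x")
print(org_level)
print(project_level)
```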

Authentication

All endpoints require authentication, just like other AIRefinery services. For SDK versions higher than 1.13.0, a bearer access token is required in the request header:

-H "Authorization: Bearer <api-key>"

The server also enforces the organization_id associated with the request: tenants can only access observability data within their own organization.
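For callers who prefer Python to curl, the following sketch builds an authenticated request with the standard library. The helper name build_request is hypothetical, and the sdk_version header is included only because the curl examples on this page send it. The request is constructed but not sent.

```python
import json
import urllib.request

BASE_URL = "https://api-prod-k8s.airefinery.accenture.com"


def build_request(path, payload, api_key, sdk_version):
    """Build (but do not send) an authenticated POST request."""
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
            "sdk_version": sdk_version,
        },
        method="POST",
    )


req = build_request(
    "/observability/logs", {"time_window": "1h"}, "<api-key>", "<sdk-version>"
)
# To send the request: urllib.request.urlopen(req)
```

Replace the placeholders with your own API key and SDK version before sending.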


POST /observability/logs

Query AIRefinery logs. Users can view application logs with timestamps, filterable by labels and time range. These logs capture request handling, authentication flows, system interactions, and external dependency behavior, helping diagnose runtime issues and system health.

Parameters:

| Parameter | Type | Required | Description |
|---|---|---|---|
| organization_id | string | No | Organization ID to filter logs. Auto-resolved from bearer token if omitted |
| project_name | string | No | Project name to filter logs |
| severity | string | No | Filter logs by severity level: debug, info, warning, error |
| time_window | string | No | Time range for logs (e.g., '5m', '1h', '24h'). Default: '24h' |
| limit | integer | No | Maximum number of log entries to return. Default: 500 |
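Since every parameter is optional, a client only needs to send the fields it wants to override. The sketch below uses a hypothetical helper, build_logs_payload, to assemble a request body and reject severity values outside the documented set. Client-side validation is an illustration here, not a documented requirement.

```python
ALLOWED_SEVERITIES = {"debug", "info", "warning", "error"}


def build_logs_payload(organization_id=None, project_name=None,
                       severity=None, time_window=None, limit=None):
    """Assemble a JSON-serializable body for POST /observability/logs."""
    if severity is not None and severity not in ALLOWED_SEVERITIES:
        raise ValueError(f"unsupported severity: {severity!r}")
    fields = {
        "organization_id": organization_id,
        "project_name": project_name,
        "severity": severity,
        "time_window": time_window,
        "limit": limit,
    }
    # Omit unset fields so the server applies its documented defaults.
    return {key: value for key, value in fields.items() if value is not None}


logs_payload = build_logs_payload(
    project_name="project-x", severity="error", time_window="30m", limit=100
)
print(logs_payload)
```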

Example Usage

Get logs for a specific organization from the last hour:

curl -X POST https://api-prod-k8s.airefinery.accenture.com/observability/logs \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <api-key>" \
  -H "sdk_version: <sdk-version>" \
  -d '{"organization_id": "org-123", "time_window": "1h"}'

Get up to 100 logs for a specific project from the last 30 minutes:

curl -X POST https://api-prod-k8s.airefinery.accenture.com/observability/logs \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <api-key>" \
  -H "sdk_version: <sdk-version>" \
  -d '{
    "organization_id": "org-123",
    "project_name": "project-x",
    "time_window": "30m",
    "limit": 100
  }'


POST /observability/metrics

Query application metrics. This endpoint provides access to a series of metrics covering inference performance, agent operations, token consumption, RAI compliance, and session analytics. For a complete list of available metrics and their descriptions, see the configuration of observability data retrieval.

Parameters:

| Parameter | Type | Required | Description |
|---|---|---|---|
| metric | string | Yes | Metric name from the configuration of observability data retrieval. (e.g., 'token_consumption', 'agent_task_total') |
| organization_id | string | No | Organization ID to filter metrics. Auto-resolved from bearer token if omitted |
| project_name | string | No | Project name to filter metrics |
| agent_name | string | No | Agent name to filter metrics (for agent metrics) |
| agent_class | string | No | Agent class to filter metrics (e.g., 'ToolUseAgent', 'SearchAgent'). Useful for aggregating across all agents of a given type |
| model_key | string | No | Model identifier for inference metrics |
| session_id | string | No | Session ID to filter metrics to a specific user session |
| status | string | No | Status filter for agent metrics (e.g., 'success', 'failure', 'timeout'). Default: 'success' |
| category | string | No | RAI rejection category filter: harassment, hate, self-harm, sexual, violence, illicit |
| percentile | string | No | Percentile for latency/distribution metrics (e.g., '0.50', '0.95', or '50', '95'). Default: '0.95' |
| time_window | string | No | Time range for rate/increase queries (e.g., '5m', '1h', '24h'). Default: '1h' |
| step | string | No | Bucket interval for time-series output (e.g., '15m', '1h', '1d'). When provided, returns matrix data over time. Default: '1h' |
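With this many optional filters, it can help to catch misspelled parameter names before a request is sent. The following sketch uses a hypothetical helper, build_metrics_payload, that requires metric, accepts only the parameter names listed above, and checks category against the documented values. How the server itself handles unknown fields is not specified on this page.

```python
OPTIONAL_FIELDS = {
    "organization_id", "project_name", "agent_name", "agent_class",
    "model_key", "session_id", "status", "category", "percentile",
    "time_window", "step",
}
RAI_CATEGORIES = {
    "harassment", "hate", "self-harm", "sexual", "violence", "illicit",
}


def build_metrics_payload(metric, **filters):
    """Assemble a JSON-serializable body for POST /observability/metrics."""
    if not metric:
        raise ValueError("metric is required")
    unknown = set(filters) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    category = filters.get("category")
    if category is not None and category not in RAI_CATEGORIES:
        raise ValueError(f"unsupported category: {category!r}")
    payload = {"metric": metric}
    # Keep only the filters the caller actually set.
    payload.update({k: v for k, v in filters.items() if v is not None})
    return payload


metrics_payload = build_metrics_payload(
    "inference_latency",
    organization_id="org-123",
    time_window="1h",
    percentile="0.50",
)
print(metrics_payload)
```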

Example Usage

Token consumption metrics (organization-level):

curl -X POST https://api-prod-k8s.airefinery.accenture.com/observability/metrics \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <api-key>" \
  -H "sdk_version: <sdk-version>" \
  -d '{
    "metric": "token_consumption",
    "organization_id": "org-123",
    "time_window": "1h"
  }'

Agent task metrics (project-level):

curl -X POST https://api-prod-k8s.airefinery.accenture.com/observability/metrics \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <api-key>" \
  -H "sdk_version: <sdk-version>" \
  -d '{
    "metric": "agent_task_total",
    "organization_id": "org-123",
    "project_name": "project-x",
    "time_window": "1h"
  }'

Agent metrics filtered by agent class:

curl -X POST https://api-prod-k8s.airefinery.accenture.com/observability/metrics \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <api-key>" \
  -H "sdk_version: <sdk-version>" \
  -d '{
    "metric": "agent_task_total",
    "organization_id": "org-123",
    "agent_class": "ToolUseAgent",
    "time_window": "1h"
  }'

Inference latency at p95 (default):

curl -X POST https://api-prod-k8s.airefinery.accenture.com/observability/metrics \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <api-key>" \
  -H "sdk_version: <sdk-version>" \
  -d '{
    "metric": "inference_latency",
    "organization_id": "org-123",
    "time_window": "1h"
  }'

Inference latency at p50:

curl -X POST https://api-prod-k8s.airefinery.accenture.com/observability/metrics \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <api-key>" \
  -H "sdk_version: <sdk-version>" \
  -d '{
    "metric": "inference_latency",
    "organization_id": "org-123",
    "time_window": "1h",
    "percentile": "0.50"
  }'

Token consumption filtered by agent:

curl -X POST https://api-prod-k8s.airefinery.accenture.com/observability/metrics \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <api-key>" \
  -H "sdk_version: <sdk-version>" \
  -d '{
    "metric": "token_consumption",
    "organization_id": "org-123",
    "agent_name": "orchestrator",
    "time_window": "1h"
  }'

RAI rejection total filtered by category:

curl -X POST https://api-prod-k8s.airefinery.accenture.com/observability/metrics \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <api-key>" \
  -H "sdk_version: <sdk-version>" \
  -d '{
    "metric": "rai_rejection_total",
    "organization_id": "org-123",
    "category": "harassment",
    "time_window": "1h"
  }'

Time-series token consumption (for charting):

curl -X POST https://api-prod-k8s.airefinery.accenture.com/observability/metrics \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <api-key>" \
  -H "sdk_version: <sdk-version>" \
  -d '{
    "metric": "token_consumption",
    "organization_id": "org-123",
    "time_window": "24h",
    "step": "1h"
  }'
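When sizing a chart, a rough estimate of the number of buckets is the lookback period divided by the step. The sketch below computes that estimate, assuming durations of the form "<integer><unit>" with the units m, h, and d. The helper names are hypothetical, and the exact number of points the server returns may differ (for example, by a boundary point).

```python
UNIT_SECONDS = {"m": 60, "h": 3600, "d": 86400}


def to_seconds(duration):
    """Convert a duration such as '15m', '1h', or '1d' to seconds."""
    value, unit = int(duration[:-1]), duration[-1]
    return value * UNIT_SECONDS[unit]


def approx_bucket_count(time_window, step):
    """Estimate how many step-sized buckets fit in the lookback period."""
    return to_seconds(time_window) // to_seconds(step)


print(approx_bucket_count("24h", "1h"))  # 24
print(approx_bucket_count("1h", "15m"))  # 4
```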


POST /observability/traces

Query distributed traces using trace definitions from the configuration of observability data retrieval. This endpoint provides access to request traces across AIRefinery services, enabling you to inspect agent workflows, identify performance bottlenecks, and debug cross-service interactions.

Parameters:

| Parameter | Type | Required | Description |
|---|---|---|---|
| trace | string | Yes | Trace name from the configuration of observability data retrieval. (e.g., 'inference_traces', 'distiller_traces') |
| organization_id | string | No | Organization ID to filter traces. Auto-resolved from bearer token if omitted |
| project_name | string | No | Project name to filter traces |
| trace_id | string | No | Specific trace ID to retrieve |
| time_window | string | No | Time range for query (e.g., '5m', '1h', '24h') |
| detail | boolean | No | Whether to include detailed trace information. Default: true |
| limit | integer | No | Maximum number of traces to return. Default: 100 |
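Because detail is a boolean that defaults to true, a client has to distinguish "not set" from an explicit false when building the body. The sketch below uses a hypothetical helper, build_traces_payload, that drops only unset (None) fields so that detail=False is preserved and serialized as JSON false.

```python
import json


def build_traces_payload(trace, organization_id=None, project_name=None,
                         trace_id=None, time_window=None, detail=None,
                         limit=None):
    """Assemble a JSON-serializable body for POST /observability/traces."""
    if not trace:
        raise ValueError("trace is required")
    fields = {
        "trace": trace,
        "organization_id": organization_id,
        "project_name": project_name,
        "trace_id": trace_id,
        "time_window": time_window,
        "detail": detail,
        "limit": limit,
    }
    # Compare against None rather than truthiness so detail=False survives.
    return {key: value for key, value in fields.items() if value is not None}


traces_body = json.dumps(
    build_traces_payload("inference_traces", detail=False, limit=50)
)
print(traces_body)
```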

Example Usage

Organization-level inference traces:

curl -X POST "https://api-prod-k8s.airefinery.accenture.com/observability/traces" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <api-key>" \
  -H "sdk_version: <sdk-version>" \
  -d '{
    "trace": "inference_traces",
    "organization_id": "org-123",
    "time_window": "1h"
  }'

Project-level distiller traces:

curl -X POST https://api-prod-k8s.airefinery.accenture.com/observability/traces \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <api-key>" \
  -H "sdk_version: <sdk-version>" \
  -d '{
    "trace": "distiller_traces",
    "organization_id": "org-123",
    "project_name": "project-x",
    "time_window": "30m"
  }'

Get specific trace by ID:

curl -X POST https://api-prod-k8s.airefinery.accenture.com/observability/traces \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <api-key>" \
  -H "sdk_version: <sdk-version>" \
  -d '{
    "trace": "inference_traces",
    "organization_id": "org-123",
    "trace_id": "abc123def456"
  }'

Search without detailed trace data:

curl -X POST https://api-prod-k8s.airefinery.accenture.com/observability/traces \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <api-key>" \
  -H "sdk_version: <sdk-version>" \
  -d '{
    "trace": "inference_traces",
    "organization_id": "org-123",
    "detail": false,
    "limit": 50
  }'


Notes

  • The organization_id is automatically resolved from the bearer token in the Authorization header, so you do not need to include it in the request body. The server enforces tenant isolation by injecting the authenticated organization ID.
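  • Time windows support the units 'm' (minutes), 'h' (hours), and 'd' (days). The default is '1h' unless an endpoint's parameter table states otherwise (for example, /logs defaults to '24h').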
  • The percentile parameter accepts values in 0–1 format (e.g., 0.95) or 1–100 format (e.g., 95). Default: 0.95 (p95).
  • Any metric preset can return time-series data by passing the step parameter (e.g., "step": "15m"). This returns matrix data with multiple data points over time, where time_window controls the lookback period and step controls the bucket interval.
  • The agent_class filter is available on all agent-related metrics, letting you aggregate by agent type (e.g., ToolUseAgent, SearchAgent) instead of individual agent names.
  • The status filter defaults to success for agent_performance_rate. Pass failure or timeout to query other status rates.
  • The severity filter on /logs accepts values: debug, info, warning, error.
  • The session_id filter is available on most metrics for narrowing results to a specific user session.
  • The category filter on rai_rejection_total lets you narrow down rejections to a specific category (e.g., harassment, hate, violence).
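The two accepted percentile formats and the time-window units can be checked on the client before a request is sent. The sketch below is one possible interpretation, with hypothetical helper names: values up to and including 1 are treated as fractions and larger values up to 100 as percentages. How the server treats the ambiguous value 1 is not documented here, so that choice is an assumption.

```python
import re


def normalize_percentile(value):
    """Map '0.95' or '95' to the fraction 0.95."""
    p = float(value)
    if 0 < p <= 1:
        return p  # already in 0-1 format (assumption: 1 means 100%)
    if 1 < p <= 100:
        return p / 100  # 1-100 format
    raise ValueError(f"percentile out of range: {value!r}")


def is_valid_time_window(window):
    """Check for an integer followed by one of the units m, h, or d."""
    return re.fullmatch(r"\d+[mhd]", window) is not None


print(normalize_percentile("95"))    # 0.95
print(normalize_percentile("0.50"))  # 0.5
print(is_valid_time_window("30m"))   # True
print(is_valid_time_window("5s"))    # False
```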