Observability Router Configuration
Note: To use the Observability APIs, set the environment variable `USE_AIR_API_V2_BASE_URL=True` in your SDK environment. Queries to the observability endpoints will then use https://api-prod-k8s.airefinery.accenture.com/. This feature is available starting from SDK version 1.25.0. This is a temporary setup, and we will transition to the regular URL soon.
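If you prefer to set the variable from Python rather than the shell, a minimal sketch follows. It assumes the SDK reads the variable from the process environment; exactly when the SDK reads it is not stated here, so the sketch sets it before any SDK import or client creation.

```python
import os

# Set the flag first. When the SDK reads this variable is an assumption,
# so doing this before importing or initializing the SDK is the safe order.
os.environ["USE_AIR_API_V2_BASE_URL"] = "True"

# Confirm the value that the rest of the process will see.
print(os.environ["USE_AIR_API_V2_BASE_URL"])
```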
This document describes the metrics and traces defined in the observability router configuration. These definitions provide parameterized query templates for monitoring AIRefinery inference services, agent workflows, and user sessions, enabling access to common telemetry patterns without writing raw PromQL or TraceQL queries.
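To make the idea of a parameterized query template concrete, here is an illustrative sketch. The router's actual templates are not shown in this document, so the PromQL string, the metric name `example_requests_total`, and the `render` helper are all placeholders invented for illustration.

```python
from string import Template

# Hypothetical template: the metric and label names are placeholders,
# not the router's actual PromQL.
TEMPLATE = Template(
    'sum(rate(example_requests_total{organization_id="$organization_id", '
    'project_name=~"$project_name"}[$time_window]))'
)


def render(organization_id: str, time_window: str, project_name: str = ".*") -> str:
    """Fill the template; the optional project_name defaults to a match-all regex."""
    return TEMPLATE.substitute(
        organization_id=organization_id,
        project_name=project_name,
        time_window=time_window,
    )


print(render("org-123", "5m"))
```

The caller supplies only parameter values; the query text itself stays inside the template.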
Metrics
Inference Metrics
inference_requests_total
- Total number of inference requests across filtered dimensions.
Parameters:
- `organization_id` (required)
- `project_name` (optional)
- `model_key` (optional)
average_response_time
- Average response time calculated over a specified time window.
Parameters:
- `organization_id` (required)
- `project_name` (optional)
- `model_key` (optional)
- `time_window` (required)
active_model_count
- Number of distinct models that have received requests.
Parameters:
- `organization_id` (required)
- `project_name` (optional)
- `model_key` (optional)
inference_request_rate_min
- Request rate per minute over a specified time window.
Parameters:
- `organization_id` (required)
- `project_name` (optional)
- `model_key` (optional)
- `time_window` (required)
model_usage_trend
- Per-model usage rate trend over time.
Parameters:
- `organization_id` (required)
- `project_name` (optional)
- `model_key` (optional)
- `time_window` (required)
response_time_performance
- 95th percentile response time performance.
Parameters:
- `organization_id` (required)
- `project_name` (optional)
- `model_key` (optional)
- `time_window` (required)
token_consumption
- 95th percentile token consumption.
Parameters:
- `organization_id` (required)
- `project_name` (optional)
- `model_key` (optional)
- `time_window` (required)
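The inference metrics above share one pattern: `organization_id` is always required, `project_name` and `model_key` are always optional, and `time_window` is required only for the windowed metrics. A client-side check could be sketched as follows; the `INFERENCE_REQUIRED` table and the `check_params` helper are hypothetical names transcribed from the lists above, not part of the SDK.

```python
# Hypothetical lookup tables transcribed from the parameter lists above.
INFERENCE_OPTIONAL = {"project_name", "model_key"}
INFERENCE_REQUIRED = {
    "inference_requests_total": {"organization_id"},
    "average_response_time": {"organization_id", "time_window"},
    "active_model_count": {"organization_id"},
    "inference_request_rate_min": {"organization_id", "time_window"},
    "model_usage_trend": {"organization_id", "time_window"},
    "response_time_performance": {"organization_id", "time_window"},
    "token_consumption": {"organization_id", "time_window"},
}


def check_params(metric: str, params: dict) -> list:
    """Return a list of problems; an empty list means the call looks valid."""
    if metric not in INFERENCE_REQUIRED:
        return [f"unknown metric: {metric}"]
    required = INFERENCE_REQUIRED[metric]
    given = set(params)
    problems = [f"missing required parameter: {name}"
                for name in sorted(required - given)]
    problems += [f"unexpected parameter: {name}"
                 for name in sorted(given - required - INFERENCE_OPTIONAL)]
    return problems


# A windowed metric called without time_window is reported as incomplete.
print(check_params("average_response_time", {"organization_id": "org-123"}))
```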
Agent Metrics
agent_task_total
- Total agent tasks broken down by agent name and status (success/failure/timeout).
Parameters:
- `organization_id` (required)
- `project_name` (optional)
- `agent_name` (optional)
agent_success_rate
- Percentage of successful agent tasks.
Parameters:
- `organization_id` (required)
- `project_name` (optional)
- `agent_name` (optional)
agent_task_throughput
- Agent task completion rate in tasks per second.
Parameters:
- `organization_id` (required)
- `project_name` (optional)
- `agent_name` (optional)
- `time_window` (required)
agent_task_duration_p50 / agent_task_duration_p95
- 50th and 95th percentile task duration for agents.
Parameters:
- `organization_id` (required)
- `project_name` (optional)
- `agent_name` (optional)
- `time_window` (required)
agent_dependency_calls
- Count of external dependency calls broken down by agent, API type, and source.
Parameters:
- `organization_id` (required)
- `project_name` (optional)
- `agent_name` (optional)
agent_tool_calls
- Count of tool calls broken down by agent, API type, and tool name.
Parameters:
- `organization_id` (required)
- `project_name` (optional)
- `agent_name` (optional)
agent_messages
- Inter-agent message counts by sender and receiver.
Parameters:
- `organization_id` (required)
- `project_name` (optional)
agent_orchestration_overhead_avg / agent_orchestration_overhead_p50
- Average and 50th percentile orchestration overhead ratio.
Parameters:
- `organization_id` (required)
- `project_name` (optional)
- `agent_name` (optional)
- `time_window` (required for p50)
Session Metrics
sessions_total
- Total number of sessions started.
Parameters:
- `organization_id` (required)
- `project_name` (optional)
sessions_active
- Number of currently active sessions.
Parameters:
- `organization_id` (required)
- `project_name` (optional)
session_duration_avg
- Average session duration in seconds.
Parameters:
- `organization_id` (required)
- `project_name` (optional)
- `time_window` (required)
session_duration_p95
- 95th percentile session duration in seconds.
Parameters:
- `organization_id` (required)
- `project_name` (optional)
- `time_window` (required)
session_requests_total
- Total requests processed within sessions.
Parameters:
- `organization_id` (required)
- `project_name` (optional)
session_requests_rate
- Session request rate in requests per second.
Parameters:
- `organization_id` (required)
- `project_name` (optional)
- `time_window` (required)
Traces
inference_traces
- Traces for inference service requests.
Parameters:
- `organization_id` (required)
- `project_name` (optional)
distiller_traces
- Traces for distiller service operations.
Parameters:
- `organization_id` (required)
- `project_name` (optional)
Notes
- Required parameters: Must be provided for the query to work.
- Optional parameters: Use `.*` or `.+` for wildcard matching when omitted.
- Regex patterns: Use the `=~` operator for pattern matching (e.g., `project_name=~"prod.*"`).
- Time windows: Use Prometheus duration format (e.g., "5m", "1h", "24h").
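
The wildcard, regex, and time-window conventions above can be sketched as two small helpers. Both `label_matcher` and `is_prometheus_duration` are hypothetical names for illustration, not SDK functions, and the duration check covers only single-unit values like those in the examples.

```python
import re

# Single-unit Prometheus durations such as "5m", "1h", "24h".
# Prometheus also accepts compound values like "1h30m"; this sketch does not.
_DURATION = re.compile(r"[0-9]+(ms|s|m|h|d|w|y)")


def is_prometheus_duration(value: str) -> bool:
    """Check that a time_window value looks like a simple Prometheus duration."""
    return _DURATION.fullmatch(value) is not None


def label_matcher(name: str, value=None) -> str:
    """Build a regex label matcher, using the '.*' wildcard when value is omitted."""
    pattern = ".*" if value is None else value
    return f'{name}=~"{pattern}"'


print(label_matcher("project_name", "prod.*"))  # project_name=~"prod.*"
print(label_matcher("model_key"))               # model_key=~".*"
print(is_prometheus_duration("5m"))             # True
print(is_prometheus_duration("5 minutes"))      # False
```

Note that `.*` also matches an empty label value while `.+` does not, so the choice of wildcard decides whether series without that label are included.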