Observability API

The Observability API provides access to logs, metrics, and distributed traces collected from your AI Refinery projects via OpenTelemetry. You can use this API through the AIRefinery or AsyncAIRefinery client.

Note: The server automatically resolves the caller's organization_id from the bearer token, so you do not need to supply it.

Overview

The AIRefinery and AsyncAIRefinery clients expose three observability sub-clients:

Sub-client       Endpoint                  Description
client.logs      /observability/logs       Query application logs
client.metrics   /observability/metrics    Query application metrics
client.traces    /observability/traces     Query distributed traces

Each sub-client provides a query(**parameters) method that accepts the same parameters as the corresponding REST endpoint.

  • On AsyncAIRefinery, query() is an async method, so call it with await.
  • On AIRefinery, query() is a synchronous blocking method.

Method: query(**parameters) → dict

Submits a POST request to the configured endpoint.

Parameters

  • **parameters: Request body fields for the target endpoint (e.g., time_window, project_name, metric). See REST Endpoints for full parameter lists.

Returns

  • dict: The parsed JSON response from the observability service.

Raises

  • httpx.HTTPStatusError: If the server returns a non-2xx status code.
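
Since query() raises on a non-2xx response, callers may want to wrap it in a small retry loop. The helper below is a minimal sketch, not an SDK feature: query_with_retry, its parameters, and the back-off schedule are all hypothetical. It accepts any callable, so it can be exercised without the SDK or httpx installed; the exception types to retry on are passed in by the caller.

```python
import time


def query_with_retry(query_fn, parameters, retry_on=(Exception,), attempts=3, base_delay=0.5):
    """Call query_fn(**parameters), retrying when one of `retry_on` is raised.

    Sleeps base_delay, 2 * base_delay, 4 * base_delay, ... between attempts
    and re-raises the last error once `attempts` calls have failed.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return query_fn(**parameters)
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            time.sleep(base_delay * 2 ** attempt)
```

With the synchronous client this might be called as query_with_retry(client.logs.query, {"time_window": "5m"}, retry_on=(httpx.HTTPStatusError,)). Retrying is usually only worthwhile for transient server-side failures, so a stricter version could inspect the status code on the exception's response attribute before deciding to retry.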

Logs

Query application logs with timestamps, filterable by project, severity, and time range. Logs capture request handling, authentication flows, system interactions, and external dependency behavior.

Asynchronous Log Retrieval

AsyncAIRefinery.logs.query()

Parameters:
  • project_name (string, Optional): Project name to filter logs.
  • severity (string, Optional): Filter by severity level: debug, info, warning, error.
  • time_window (string, Optional): Time range for logs (e.g., '5m', '1h', '24h'). Default: '24h'.
  • limit (integer, Optional): Maximum number of log entries to return. Default: 500.

Example Usage:
import asyncio
import os

from air import AsyncAIRefinery
from dotenv import load_dotenv


load_dotenv()

# Fail fast with KeyError if API_KEY is unset; str(os.getenv(...)) would yield the string "None"
api_key = os.environ["API_KEY"]


async def get_logs():
    client = AsyncAIRefinery(api_key=api_key)

    # Fetch recent error-level logs for a specific project
    response = await client.logs.query(
        project_name="project-x",
        severity="error",
        time_window="30m",
        limit=100,
    )
    print(response)


if __name__ == "__main__":
    asyncio.run(get_logs())
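
The parameter list above can also be checked on the client side before a request is sent. The following is a sketch under stated assumptions: build_log_query is a hypothetical helper, not part of the SDK, and the time-window pattern only accepts the digit-plus-m/h forms shown in the examples (the service may accept other units).

```python
import re

ALLOWED_SEVERITIES = {"debug", "info", "warning", "error"}
# Assumption: a time window is a positive integer followed by "m" or "h".
_TIME_WINDOW = re.compile(r"^[1-9]\d*[mh]$")


def build_log_query(project_name=None, severity=None, time_window=None, limit=None):
    """Validate log query arguments and return only the ones that were set."""
    if severity is not None and severity not in ALLOWED_SEVERITIES:
        raise ValueError(f"severity must be one of {sorted(ALLOWED_SEVERITIES)}")
    if time_window is not None and not _TIME_WINDOW.match(time_window):
        raise ValueError(f"unrecognised time_window: {time_window!r}")
    if limit is not None and limit < 1:
        raise ValueError("limit must be a positive integer")
    candidates = {
        "project_name": project_name,
        "severity": severity,
        "time_window": time_window,
        "limit": limit,
    }
    # Fields left out fall back to the documented defaults (24h window, 500 entries).
    return {key: value for key, value in candidates.items() if value is not None}
```

The resulting dictionary can be unpacked straight into the call, for example client.logs.query(**build_log_query(severity="error", time_window="30m")).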

Synchronous Log Retrieval

AIRefinery.logs.query()

This method supports the same parameters and return structure as the asynchronous method (AsyncAIRefinery.logs.query()) described above.

Example Usage:
import os

from air import AIRefinery
from dotenv import load_dotenv


load_dotenv()

# Raises KeyError if API_KEY is not set, instead of passing the string "None"
api_key = os.environ["API_KEY"]


def get_logs():
    client = AIRefinery(api_key=api_key)

    # Fetch recent logs for a specific project
    response = client.logs.query(
        project_name="project-x",
        time_window="30m",
        limit=100,
    )
    print(response)


if __name__ == "__main__":
    get_logs()

Metrics

Query application metrics covering inference performance, agent operations, token consumption, RAI compliance, and session analytics. For a complete list of available metrics, see Metrics & Traces Reference.

Asynchronous Metrics Retrieval

AsyncAIRefinery.metrics.query()

Parameters:
  • metric (string, Required): Metric name from the Metrics & Traces Reference (e.g., 'token_consumption', 'agent_task_total').
  • project_name (string, Optional): Project name to filter metrics.
  • agent_name (string, Optional): Agent name to filter metrics.
  • agent_class (string, Optional): Agent class to filter metrics (e.g., 'ToolUseAgent', 'SearchAgent').
  • model_key (string, Optional): Model identifier for inference metrics.
  • session_id (string, Optional): Session ID to filter metrics to a specific user session.
  • status (string, Optional): Status filter (e.g., 'success', 'failure', 'timeout'). Default: 'success'.
  • category (string, Optional): RAI rejection category: harassment, hate, self-harm, sexual, violence, illicit.
  • percentile (string, Optional): Percentile for latency metrics (e.g., '0.50', '0.95'). Default: '0.95'.
  • time_window (string, Optional): Time range (e.g., '5m', '1h', '24h'). Default: '1h'.
  • step (string, Optional): Bucket interval for time-series output (e.g., '15m', '1h'). Default: '1h'.

Example Usage:
import asyncio
import os

from air import AsyncAIRefinery
from dotenv import load_dotenv


load_dotenv()

# Raises KeyError if API_KEY is not set, instead of passing the string "None"
api_key = os.environ["API_KEY"]


async def get_metrics():
    client = AsyncAIRefinery(api_key=api_key)

    # Fetch token consumption metrics
    response = await client.metrics.query(
        metric="token_consumption",
        project_name="project-x",
        time_window="1h",
    )
    print(response)


if __name__ == "__main__":
    asyncio.run(get_metrics())
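
The time_window and step parameters together determine how many time-series buckets a metrics query spans. The arithmetic can be sketched as below. parse_duration and estimated_buckets are hypothetical helpers, and only the m and h suffixes that appear in the examples are handled. The count is an estimate: the service may return one point more or fewer depending on how it treats interval boundaries.

```python
import math
from datetime import timedelta

# Assumption: only the "m" (minutes) and "h" (hours) suffixes are supported here.
_UNITS = {"m": "minutes", "h": "hours"}


def parse_duration(text: str) -> timedelta:
    """Convert a string such as '15m' or '24h' into a timedelta."""
    number, unit = text[:-1], text[-1:]
    if not number.isdigit() or unit not in _UNITS:
        raise ValueError(f"unsupported duration: {text!r}")
    return timedelta(**{_UNITS[unit]: int(number)})


def estimated_buckets(time_window: str = "1h", step: str = "1h") -> int:
    """Estimate how many `step`-sized buckets fit into `time_window`."""
    window, interval = parse_duration(time_window), parse_duration(step)
    if interval <= timedelta(0):
        raise ValueError("step must be greater than zero")
    # A partial trailing bucket still counts as one bucket.
    return math.ceil(window / interval)


if __name__ == "__main__":
    print(estimated_buckets("24h", "15m"))  # 24 h / 15 min = 96 buckets
```

With the documented defaults (a one-hour window and a one-hour step) the estimate is a single bucket, so a finer step such as '15m' is needed to see a trend within the hour.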

Synchronous Metrics Retrieval

AIRefinery.metrics.query()

This method supports the same parameters and return structure as the asynchronous method (AsyncAIRefinery.metrics.query()) described above.

Example Usage:
import os

from air import AIRefinery
from dotenv import load_dotenv


load_dotenv()

# Raises KeyError if API_KEY is not set, instead of passing the string "None"
api_key = os.environ["API_KEY"]


def get_metrics():
    client = AIRefinery(api_key=api_key)

    # Fetch agent task metrics for a specific session
    response = client.metrics.query(
        metric="agent_task_total",
        project_name="project-x",
        session_id="sess-abc123",
        time_window="1h",
    )
    print(response)


if __name__ == "__main__":
    get_metrics()

Traces

Query distributed traces across AI Refinery services, enabling you to inspect agent workflows, identify performance bottlenecks, and debug cross-service interactions. For available trace presets, see Metrics & Traces Reference.

Asynchronous Trace Retrieval

AsyncAIRefinery.traces.query()

Parameters:
  • trace (string, Required): Trace name from the Metrics & Traces Reference (e.g., 'inference_traces', 'distiller_traces').
  • project_name (string, Optional): Project name to filter traces.
  • trace_id (string, Optional): Specific trace ID to retrieve.
  • time_window (string, Optional): Time range for query (e.g., '5m', '1h', '24h').
  • detail (boolean, Optional): Whether to include detailed trace information. Default: true.
  • limit (integer, Optional): Maximum number of traces to return. Default: 100.

Example Usage:
import asyncio
import os

from air import AsyncAIRefinery
from dotenv import load_dotenv


load_dotenv()

# Raises KeyError if API_KEY is not set, instead of passing the string "None"
api_key = os.environ["API_KEY"]


async def get_traces():
    client = AsyncAIRefinery(api_key=api_key)

    # Fetch distiller traces
    response = await client.traces.query(
        trace="distiller_traces",
        project_name="project-x",
        time_window="1h",
    )
    print(response)


if __name__ == "__main__":
    asyncio.run(get_traces())
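
The detail and trace_id parameters suggest a two-step workflow: list traces with detail switched off, then fetch a single trace in full. The helpers below sketch the two request bodies. Both function names are hypothetical, and the workflow itself is an assumption drawn from the parameter descriptions rather than documented behavior. How a trace ID is read out of the listing response is not specified here, so that step is left out.

```python
def trace_list_params(trace, project_name=None, time_window=None, limit=None):
    """Request body for a lightweight listing, with detail switched off."""
    body = {"trace": trace, "detail": False}
    optional = {
        "project_name": project_name,
        "time_window": time_window,
        "limit": limit,
    }
    # Only send the filters the caller actually set.
    body.update({key: value for key, value in optional.items() if value is not None})
    return body


def trace_detail_params(trace, trace_id):
    """Request body for one fully detailed trace."""
    return {"trace": trace, "trace_id": trace_id, "detail": True}
```

Either dictionary can be unpacked into the call, for example await client.traces.query(**trace_list_params("distiller_traces", time_window="1h", limit=20)).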

Synchronous Trace Retrieval

AIRefinery.traces.query()

This method supports the same parameters and return structure as the asynchronous method (AsyncAIRefinery.traces.query()) described above.

Example Usage:
import os

from air import AIRefinery
from dotenv import load_dotenv


load_dotenv()

# Raises KeyError if API_KEY is not set, instead of passing the string "None"
api_key = os.environ["API_KEY"]


def get_traces():
    client = AIRefinery(api_key=api_key)

    # Fetch inference traces
    response = client.traces.query(
        trace="inference_traces",
        time_window="1h",
    )
    print(response)


if __name__ == "__main__":
    get_traces()