Explore the Capabilities of the Knowledge Graph API¶

Overview¶

The Knowledge Graph API in the AI Refinery SDK (AIR-SDK) lets you create, update, query, and visualize entity–relation graphs extracted from raw documents. These knowledge graphs can power:

Retrieval-Augmented Generation (RAG) applications
Multi-hop reasoning agents
Semantic search agents
Knowledge discovery pipelines

The API supports both:

GraphRAG: LLM-powered entity/relation extraction
FastGraphRAG: Lightweight NLP-based pipeline with LLM-assisted clustering and QA

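Choose between them based on your compute budget and latency constraints: GraphRAG relies on LLM calls for entity/relation extraction, while FastGraphRAG uses a lighter NLP pipeline and reserves the LLM for clustering and QA.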

Goals¶

By the end of this tutorial, you’ll be able to:

Construct a knowledge graph from .txt files
Update the graph with new documents and elements
Query using multiple retrieval methods (basic, local, global, drift)
Visualize graph structures and communities

Configuration¶

1. Install AIR-SDK with Knowledge API Extras¶

pip install "airefinery-sdk[knowledge]"

2. Host Your Models¶

You must host your own LLM and embedding models behind an OpenAI-compatible endpoint such as Azure OpenAI.

AIR-deployed LLM endpoints are not supported for this API.

3. Set Environment Variables¶

export KNOWLEDGE_GRAPH_API_BASE_URL=<your_base_url>
export KNOWLEDGE_GRAPH_API_KEY=<your_api_key>
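
Before building a graph, it can help to confirm that both variables are visible to the Python process. The following is a minimal sketch using only the standard library; the helper name `missing_env_vars` is hypothetical and not part of AIR-SDK.

```python
import os

# Variable names copied from the export commands above.
REQUIRED_VARS = ("KNOWLEDGE_GRAPH_API_BASE_URL", "KNOWLEDGE_GRAPH_API_KEY")


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_env_vars()` at the top of a script and stopping when the returned list is non-empty gives a clearer failure than an authentication error later in the build.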

Background¶

Input Formats¶

The Knowledge Graph API supports two ways of ingesting documents, depending on whether you're creating a new graph or updating an existing one:

build(files_path=...)
- Accepts a folder containing .txt files
- Used to construct the initial knowledge graph from raw unstructured text
update(docs=...)
- Accepts a list of Document objects, each with TextElement nodes
- Used to incrementally add or modify content in an existing graph
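
Since build(files_path=...) expects a folder of .txt files, a quick pre-flight check of that folder can catch an empty or mistyped path early. This is a sketch under assumptions: the helper `list_text_files` is hypothetical, and it only looks at the top level of the folder because the documentation above does not say whether subfolders are read.

```python
from pathlib import Path


def list_text_files(folder):
    """Return the non-empty .txt files directly inside `folder`, sorted by path.

    Hypothetical helper; it does not call AIR-SDK and does not recurse.
    """
    root = Path(folder)
    if not root.is_dir():
        raise NotADirectoryError(f"{folder!r} is not a directory")
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() == ".txt" and p.stat().st_size > 0
    )
```

If the returned list is empty, there is likely nothing for the build step to ingest.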

Query Modes¶

The Knowledge Graph API supports multiple query modes tailored to different semantic retrieval needs. Once a graph is built (and optionally updated with new documents), you can use these modes to retrieve contextually relevant answers from both structured and unstructured information.

basic: Embedding-based retrieval from raw text, similar to traditional RAG pipelines.
local: Combines graph entities and nearby context to answer entity-specific questions.
global: Leverages semantic clusters and high-level summaries to provide topic-wide insights.
drift: Integrates multiple views (local, community-level, and reasoning-based) to generate comprehensive answers with contextual nuance.
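
Because the mode is passed as a plain string, a small guard can catch typos before a query is sent. The sketch below is illustrative only: `QUERY_MODES` and `check_query_mode` are hypothetical names, and the summaries are condensed from the list above.

```python
# Mode names and summaries condensed from the list above.
QUERY_MODES = {
    "basic": "embedding-based retrieval from raw text",
    "local": "graph entities plus nearby context",
    "global": "semantic clusters and high-level summaries",
    "drift": "combined local, community-level, and reasoning-based views",
}


def check_query_mode(method):
    """Normalize a mode name and reject anything outside the four documented modes."""
    normalized = method.strip().lower()
    if normalized not in QUERY_MODES:
        valid = ", ".join(QUERY_MODES)
        raise ValueError(f"Unknown query mode {method!r}; expected one of: {valid}")
    return normalized
```

The normalized value can then be passed as the `method` argument of a query.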

Example Usage¶

In this example, we will walk through the end-to-end process of working with the Knowledge Graph API:

Initialize the AIR client with your API credentials.
Configure the Knowledge Graph build process, including model endpoints and chunking parameters.
Build the knowledge graph from a folder of .txt files.
Optionally update the graph by adding new Document objects containing structured TextElement nodes.
Query the graph using one of the available retrieval modes (in this case, local).
Visualize the resulting graph to explore entities, relationships, and communities.

import os
import asyncio

from dotenv import load_dotenv  # provided by the python-dotenv package
from air import AsyncAIRefinery
from air.types import Document, TextElement, KnowledgeGraphConfig


load_dotenv()  # loads API_KEY from your local '.env' file
api_key = str(os.getenv("API_KEY"))

async def main():
    # Initialize AIR client
    client = AsyncAIRefinery(
        api_key=api_key
    )

    # Define configuration
    config = KnowledgeGraphConfig(
        type="GraphRAG",
        work_dir="work_dir",
        api_type="azure",
        llm_model="deployed-llm-model",
        embedding_model="deployed-embedding-model",
        chunk_size=1200,
        chunk_overlap=200,
    )

    # Access the Knowledge Graph client
    kg_client = await client.knowledge.get_graph()
    kg_client.create_project(graph_config=config)

    # Build the graph from raw text files
    await kg_client.build(files_path="data/text_files")

    # Optional: Update with a document object
    docs = [
        Document(
            filename="sample",
            file_type="pdf",
            elements=[
                TextElement(
                    id="doc-1",
                    text="The Sun is the star at the heart of our solar system...",
                    page_number=1,
                    element_type="text"
                )
            ],
        )
    ]
    await kg_client.update(docs=docs)

    # Query using the local graph view
    answer = await kg_client.query(query="What is the Sun made of?", method="local")
    print(answer)

    # Visualize the graph
    kg_client.visualize(max_community_size=3, community_level=-1)

if __name__ == "__main__":
    asyncio.run(main())
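
The chunk_size=1200 and chunk_overlap=200 settings in the configuration suggest a sliding-window split in which consecutive chunks share some content. The sketch below illustrates that general technique only; it is not the SDK's implementation, the function name is hypothetical, and whether the SDK counts tokens or characters is an assumption left open here (the function works on any sequence).

```python
def chunk_with_overlap(tokens, chunk_size=1200, chunk_overlap=200):
    """Split a sequence into windows of `chunk_size` items sharing `chunk_overlap` items.

    Illustrative sliding-window chunker; not taken from AIR-SDK.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the input
    return chunks
```

With the defaults, each window advances by 1000 items, so the last 200 items of one chunk reappear at the start of the next. Under this scheme a larger overlap preserves more context across chunk boundaries at the cost of more chunks to process.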

Output Artifacts¶

Build Output¶

graph.graphml — structured graph file
output/entities.parquet — entity table
output/relations.parquet — relations table
output/community_reports.parquet — community analysis
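
One quick way to sanity-check a build is to count the nodes and edges in the GraphML file. The sketch below uses only the standard library and assumes the file is standard GraphML XML with node and edge elements; the helper name `count_graphml` is hypothetical, and the exact location of graph.graphml relative to your work_dir is not specified above.

```python
import xml.etree.ElementTree as ET


def count_graphml(path):
    """Return (node_count, edge_count) for a GraphML file."""
    nodes = edges = 0
    for element in ET.parse(path).getroot().iter():
        # Tags look like "{namespace}node"; keep only the local name.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "node":
            nodes += 1
        elif local_name == "edge":
            edges += 1
    return nodes, edges
```

A result of (0, 0) would suggest that extraction produced no entities, which is worth investigating before running queries.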

Query Output¶

Answer strings based on chosen retrieval mode
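
Since each mode returns an answer string, running the same question through all four modes is a simple way to see how they differ. The sketch below assumes `kg_client` is the object from the example usage and relies only on the `query(query=..., method=...)` call shown there; the helper `ask_all_modes` is hypothetical.

```python
async def ask_all_modes(kg_client, question, methods=("basic", "local", "global", "drift")):
    """Run one question through each retrieval mode and return {mode: answer}."""
    answers = {}
    for method in methods:
        # Sequential on purpose: whether the client supports
        # concurrent queries is not documented above.
        answers[method] = await kg_client.query(query=question, method=method)
    return answers
```

Inside an async function such as `main()`, this could be called as `answers = await ask_all_modes(kg_client, "What is the Sun made of?")`.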

Visualization¶

Generates an SVG with:

Node colors representing graph communities
Edge shading representing relationship weights

Example Visualization¶

[Figure: Knowledge Graph SVG Visualization]