Explore the Capabilities of the Knowledge Graph API
Overview
The Knowledge Graph API in the AI Refinery SDK (AIR-SDK) empowers users to create, update, query, and visualize entity–relation graphs extracted from raw documents. These knowledge graphs can power:
- Retrieval-Augmented Generation (RAG) applications
- Multi-hop reasoning agents
- Semantic search agents
- Knowledge discovery pipelines
The API supports both:
- GraphRAG: LLM-powered entity/relation extraction
- FastGraphRAG: Lightweight NLP-based pipeline with LLM-assisted clustering and QA
Choose the right method based on your compute budget and latency constraints.
Goals
By the end of this tutorial, you’ll be able to:
- Construct a knowledge graph from `.txt` files
- Update the graph with new documents and elements
- Query using multiple retrieval methods (`basic`, `local`, `global`, `drift`)
- Visualize graph structures and communities
Configuration
1. Install AIR-SDK with Knowledge API Extras
2. Host Your Models
You must self-host your LLM and embedding models using an OpenAI-compatible endpoint such as Azure OpenAI.
AIR-deployed LLM endpoints are not supported for this API.
3. Set Environment Variables
Background
Input Formats
The Knowledge Graph API supports two ways of ingesting documents, depending on whether you're creating a new graph or updating an existing one:
- `build(files_path=...)`
  - Accepts a folder containing `.txt` files
  - Used to construct the initial knowledge graph from raw unstructured text
- `update(docs=...)`
  - Accepts a list of `Document` objects, each with `TextElement` nodes
  - Used to incrementally add or modify content in an existing graph
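The `update` path expects content that is already split into elements. As a minimal sketch of one way to prepare that input, the helper below splits raw text on blank lines and returns one dictionary per paragraph. The keys mirror the `TextElement` arguments used in the example on this page (`id`, `text`, `page_number`, `element_type`), so each dictionary could be passed as `TextElement(**fields)`. The helper name `paragraphs_to_element_fields` and the `<doc_id>-<n>` id scheme are assumptions for illustration, not part of the SDK.

```python
import re


def paragraphs_to_element_fields(text: str, doc_id: str, page_number: int = 1) -> list[dict]:
    """Split raw text into paragraphs and return one field dict per paragraph.

    Hypothetical helper: the keys mirror the TextElement arguments shown in
    this tutorial, but the id scheme ("<doc_id>-<n>") is an assumption.
    """
    # Paragraphs are separated by one or more blank (or whitespace-only) lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    return [
        {
            "id": f"{doc_id}-{index}",
            "text": paragraph,
            "page_number": page_number,
            "element_type": "text",
        }
        for index, paragraph in enumerate(paragraphs, start=1)
    ]


if __name__ == "__main__":
    sample = "The Sun is a star.\n\nIt sits at the heart of our solar system."
    for fields in paragraphs_to_element_fields(sample, doc_id="doc"):
        print(fields["id"], "->", fields["text"])
```

Keeping the preparation step as plain dictionaries makes it easy to inspect or filter the elements before constructing the SDK objects.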
Query Modes
The Knowledge Graph API supports multiple query modes tailored to different semantic retrieval needs. Once a graph is built and updated with documents, you can use these modes to retrieve contextually relevant answers from both structured and unstructured information.
- basic: Embedding-based retrieval from raw text, similar to traditional RAG pipelines.
- local: Combines graph entities and nearby context to answer entity-specific questions.
- global: Leverages semantic clusters and high-level summaries to provide topic-wide insights.
- drift: Integrates multiple views (local, community-level, and reasoning-based) to generate comprehensive answers with contextual nuance.
Example Usage
In this example, we will walk through the end-to-end process of working with the Knowledge Graph API:
- Initialize the AIR client with your API credentials.
- Configure the Knowledge Graph build process, including model endpoints and chunking parameters.
- Build the knowledge graph from a folder of `.txt` files.
- Optionally update the graph by adding new `Document` objects containing structured `TextElement` nodes.
- Query the graph using one of the available retrieval modes (in this case, `local`).
- Visualize the resulting graph to explore entities, relationships, and communities.
```python
import os
import asyncio

from dotenv import load_dotenv

from air import AsyncAIRefinery
from air.types import Document, TextElement, KnowledgeGraphConfig

load_dotenv()  # loads your API_KEY from your local '.env' file
api_key = str(os.getenv("API_KEY"))


async def main():
    # Initialize AIR client
    client = AsyncAIRefinery(
        api_key=api_key
    )

    # Define configuration
    config = KnowledgeGraphConfig(
        type="GraphRAG",
        work_dir="work_dir",
        api_type="azure",
        llm_model="deployed-llm-model",
        embedding_model="deployed-embedding-model",
        chunk_size=1200,
        chunk_overlap=200,
    )

    # Access the Knowledge Graph client
    kg_client = await client.knowledge.get_graph()
    kg_client.create_project(graph_config=config)

    # Build the graph from raw text files
    await kg_client.build(files_path="data/text_files")

    # Optional: Update with a document object
    docs = [
        Document(
            filename="sample",
            file_type="pdf",
            elements=[
                TextElement(
                    id="doc-1",
                    text="The Sun is the star at the heart of our solar system...",
                    page_number=1,
                    element_type="text",
                )
            ],
        )
    ]
    await kg_client.update(docs=docs)

    # Query using the local graph view
    answer = await kg_client.query(query="What is the Sun made of?", method="local")
    print(answer)

    # Visualize the graph
    kg_client.visualize(max_community_size=3, community_level=-1)


if __name__ == "__main__":
    asyncio.run(main())
```
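The `chunk_size` and `chunk_overlap` settings in the configuration control how each document is cut into overlapping windows before extraction. The snippet below only illustrates that sliding-window idea and is not the SDK's chunker: it assumes whitespace-separated words as the unit, while the unit the API actually uses (tokens or characters) is not stated here. `chunk_words` is a hypothetical helper.

```python
def chunk_words(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of `chunk_size` words.

    Consecutive windows share `chunk_overlap` words. Words are an assumed
    stand-in for whatever unit the real pipeline counts.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    words = text.split()
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks


if __name__ == "__main__":
    text = " ".join(f"w{i}" for i in range(10))
    for chunk in chunk_words(text, chunk_size=4, chunk_overlap=1):
        print(chunk)
```

A larger overlap repeats more context across neighbouring chunks, which can help relations that straddle a boundary, at the cost of processing more text overall.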
Output Artifacts
Build Output
- `graph.graphml`: structured graph file
- `output/entities.parquet`: entity table
- `output/relations.parquet`: relations table
- `output/community_reports.parquet`: community analysis
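Because GraphML is a standard XML format, the graph file can be inspected with the Python standard library alone. The following is a minimal sketch that counts nodes and edges in GraphML text. The function name `summarize_graphml` and the tiny `SAMPLE` document are illustrative assumptions; the attributes stored on nodes and edges in the real `graph.graphml` are not described here, so the sketch reads only the `id` attribute that GraphML requires on every node.

```python
import xml.etree.ElementTree as ET

# Standard GraphML XML namespace.
GRAPHML_NS = {"g": "http://graphml.graphdrawing.org/xmlns"}


def summarize_graphml(graphml_text: str) -> dict:
    """Return node and edge counts plus the first few node ids from GraphML text."""
    root = ET.fromstring(graphml_text)
    nodes = root.findall(".//g:node", GRAPHML_NS)
    edges = root.findall(".//g:edge", GRAPHML_NS)
    return {
        "nodes": len(nodes),
        "edges": len(edges),
        "sample_node_ids": [node.get("id") for node in nodes[:5]],
    }


# Hand-written sample so the sketch runs without a built graph.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph edgedefault="undirected">
    <node id="SUN"/>
    <node id="SOLAR SYSTEM"/>
    <edge source="SUN" target="SOLAR SYSTEM"/>
  </graph>
</graphml>"""

if __name__ == "__main__":
    print(summarize_graphml(SAMPLE))
```

To inspect a real build, read the contents of the generated `graph.graphml` file and pass the text to `summarize_graphml`.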
Query Output
- Answer strings based on chosen retrieval mode
Visualization
Generates an SVG with:
- Node colors representing graph communities
- Edge shading representing relationship weights
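The two visual encodings above can be pictured as simple mappings: a community id selects a color, and a relationship weight is scaled to a shade. The sketch below shows one plausible way to compute such values. It is not how the SDK renders its SVG; the palette, the 0.2 to 1.0 opacity range, and both function names are assumptions for illustration.

```python
# Assumed palette; the SDK's actual colors are not documented here.
PALETTE = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd"]


def community_color(community_id: int) -> str:
    """Pick a fill color for a community, cycling through a fixed palette."""
    return PALETTE[community_id % len(PALETTE)]


def edge_opacity(weight: float, min_weight: float, max_weight: float) -> float:
    """Scale a relationship weight linearly to an opacity between 0.2 and 1.0."""
    if max_weight == min_weight:
        return 1.0  # every edge has the same weight, so draw them all fully
    fraction = (weight - min_weight) / (max_weight - min_weight)
    return round(0.2 + 0.8 * fraction, 2)


if __name__ == "__main__":
    weights = [1.0, 3.0, 5.0]
    for community_id, weight in enumerate(weights):
        print(
            community_color(community_id),
            edge_opacity(weight, min(weights), max(weights)),
        )
```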
Example Visualization
