Model Catalog¶
Our model catalog offers a diverse selection of models. To configure your agents to use any of these models, refer to our project configuration guidelines. The models currently supported are listed below. We continuously expand the catalog, so check this page regularly for the latest updates.
LLMs & VLMs¶
The table below lists the LLMs and VLMs currently supported:
LLM / VLM | Input Modalities | Output |
---|---|---|
meta-llama/Llama-3.1-8B-Instruct | text | text |
meta-llama/Llama-3.1-70B-Instruct | text | text |
meta-llama/Llama-3.3-70b-Instruct | text | text |
meta-llama/Llama-4-Maverick-17B-128E-Instruct | text | text |
meta-llama/Llama-3.2-90B-Vision-Instruct | text, image | text |
mistralai/Mistral-7B-Instruct-v0.3 | text | text |
mistralai/Mistral-Small-3.1-24B-Instruct-2503 | text, image | text |
Qwen/Qwen3-32B | text | text |
Configuring LLMs & VLMs for Your Project¶
To integrate any of the supported models into your project, update the relevant configuration section within the `base_config` or the `config` block of any utility agent in your YAML file. For models that support image input, ensure the agent is capable of handling images (e.g., `ImageUnderstandingAgent`). Set the `model` parameter to one of the supported model names listed above, and make sure that any required capabilities, such as image input, are supported by the selected agent.
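As a minimal sketch, an agent's `config` block might look like the following. The surrounding keys (`utility_agents`, `agent_class`, `agent_name`) are illustrative placeholders; only the `config` block and `model` parameter are described in this guide:

```yaml
utility_agents:
  - agent_class: ImageUnderstandingAgent   # an agent capable of handling images
    agent_name: "Image Agent"              # placeholder name
    config:
      model: "meta-llama/Llama-3.2-90B-Vision-Instruct"  # a supported VLM from the table above
```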
Using LLMs through Our Inference API¶
You can also directly use any of the models listed above through our inference API. See an example below:
```python
import os

from air import AIRefinery
from air import login

auth = login(
    account=str(os.getenv("ACCOUNT")),  # your account
    api_key=str(os.getenv("API_KEY")),  # your API key
)

base_url = os.getenv("AIREFINERY_ADDRESS", "")
client = AIRefinery(**auth.openai(base_url=base_url))

# Create a chat request
response = client.chat.completions.create(
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    model="meta-llama/Llama-3.1-70B-Instruct",  # an LLM from the list above
)
```
Embedding Models¶
The models that we support for embedding your data are as follows:

- `intfloat/e5-mistral-7b-instruct`
- `intfloat/multilingual-e5-large`
- `nvidia/nv-embedqa-mistral-7b-v2`
- `nvidia/llama-3-2-nv-embedqa-1b-v2`
Using Embedding Models in Your Project¶
To utilize any of these embedding models in your project, simply update the `embedding_config` within the `base_config` or within the `aisearch_config` section of the `ResearchAgent`. Ensure that the `model_name` parameter of the `embedding_config` is set to one of the names listed above.
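As a sketch, the relevant YAML fragment might look like this (the nesting shown is illustrative; `base_config`, `embedding_config`, and `model_name` are the keys described above):

```yaml
base_config:
  embedding_config:
    model_name: "intfloat/e5-mistral-7b-instruct"  # one of the supported embedding models
```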
Embedding Your Data Using Our Inference API¶
You can also directly use any of the models listed above to embed your data using our inference API. See an example below:
```python
import os

from air import AIRefinery
from air import login

auth = login(
    account=str(os.getenv("ACCOUNT")),  # your account
    api_key=str(os.getenv("API_KEY")),  # your API key
)

base_url = os.getenv("AIREFINERY_ADDRESS", "")
client = AIRefinery(**auth.openai(base_url=base_url))

# Create an embedding request
response = client.embeddings.create(
    input=["What is the capital of France?"],
    model="nvidia/nv-embedqa-mistral-7b-v2",  # required
    encoding_format="float",  # required
    extra_body={"input_type": "query", "truncate": "NONE"},  # extra_body is required for "nvidia" models,
    # where "input_type" can be either "query" or "passage"
)
```
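Once embeddings are returned, a common next step is comparing a query vector against passage vectors, e.g. with cosine similarity. The sketch below is generic and self-contained (it does not call the inference API); the toy vectors stand in for embeddings you would retrieve as shown above:

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# Toy vectors standing in for embeddings returned by the API
query_embedding = [0.1, 0.3, 0.5]
passage_embeddings = {
    "paris": [0.1, 0.29, 0.52],
    "tokyo": [0.9, 0.1, 0.0],
}

# Rank passages by similarity to the query, most similar first
ranked = sorted(
    passage_embeddings,
    key=lambda name: cosine_similarity(query_embedding, passage_embeddings[name]),
    reverse=True,
)
print(ranked[0])  # the closest passage
```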
Compressors¶
The prompt compression models that we support are:

To utilize any of these prompt compression models in your project, simply update the `compression_config` within the `base_config` of your project. To learn more about prompt compression, see this tutorial. Ensure that the `model` parameter of the `compression_config` is set to one of the names listed above.
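As a sketch, the fragment might look like the following, where the model name is a placeholder for one of the supported compression models:

```yaml
base_config:
  compression_config:
    model: "<supported-compression-model>"  # placeholder: pick one of the names listed above
```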
Rerankers¶
The reranker models that we support are:

To utilize any of these reranker models in your project, simply update the `reranker_config` within the `base_config` of your project. To learn more about reranking, see this tutorial. Ensure that the `model` parameter of the `reranker_config` is set to one of the names listed above.
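As a sketch, the fragment might look like the following, where the model name is a placeholder for one of the supported reranker models:

```yaml
base_config:
  reranker_config:
    model: "<supported-reranker-model>"  # placeholder: pick one of the names listed above
```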
Diffusers¶
The diffusers that we support are:

These diffusers can be used with our image generation agent and the Images API.