Databricks Agent¶
A Databricks Agent is a third-party agent hosted on the Databricks platform. These agents use Databricks Genie to let business teams interact with their data in natural language. Genie agents from Databricks use generative AI tailored to your organization's terminology and data, and their performance can be monitored and refined through user feedback.
Our AI Refinery SDK integrates with a user's customized Databricks Agent through the `DatabricksAgent` class. This integration brings the capabilities of Genie to applications built on our AI Refinery platform.
Creating Databricks Agents¶
Users can customize a Databricks Agent through the Databricks platform. To create an agent, follow these steps:
- Sign up for and log in to your user account on Databricks.
- Set up (or ask your account admin to set up) a Service Principal for your account to allow external connections.
- Obtain your account credentials (you will need these later to configure your agent in AI Refinery, abbreviated AIR):
  - Host: The URL of either your Databricks account console (http://accounts.cloud.databricks.com) or your Databricks workspace (https://{your-workspace-id}.cloud.databricks.com).
  - Client ID: The client ID you were assigned when creating your service principal.
  - Client Secret: The client secret you generated when creating your service principal.
- Set up a Genie Workspace to connect a Genie agent to your data. You can configure your Genie workspace with additional business-specific context and example SQL queries for database management and exploration.
- Obtain your Genie space ID. You can find it in the URL of your Genie space after you have set it up: carefully copy the part between `/rooms/` and the `?o=` separator. You will also need this ID later to configure your Databricks agent in AIR.
- Test your Genie agent in the Databricks platform by chatting with your data while viewing the data tables or unstructured data sources it has access to, so that you understand its capabilities.
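Copying the space ID by hand is error-prone, so the extraction described above can also be scripted. The following is a minimal sketch using only the Python standard library; the helper name `extract_genie_space_id` and the example URL are hypothetical and only illustrate the `/rooms/` ... `?o=` layout mentioned in the steps.

```python
from urllib.parse import urlparse


def extract_genie_space_id(genie_url: str) -> str:
    """Return the path segment that follows 'rooms' in a Genie space URL."""
    # urlparse separates the path from the query string, so everything after
    # '?' (such as the o=... parameter) is already excluded from .path.
    segments = [s for s in urlparse(genie_url).path.split("/") if s]
    if "rooms" not in segments:
        raise ValueError("URL does not contain a /rooms/ segment")
    index = segments.index("rooms")
    if index + 1 >= len(segments):
        raise ValueError("no space ID found after /rooms/")
    return segments[index + 1]


if __name__ == "__main__":
    # Hypothetical URL, for illustration only; paste your own Genie space URL.
    example = "https://example.cloud.databricks.com/genie/rooms/0123456789abcdef?o=1234567890"
    print(extract_genie_space_id(example))
```

The returned value is what you would store in the environment variable referenced by `genie_space_id`.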
Onboarding Databricks Agent¶
To use the Databricks agents through our AI Refinery SDK, users need the following parameters:
Variable | Description | Required |
---|---|---|
`client_id` | Name of the environment variable that holds your Databricks client ID. | Yes |
`client_secret` | Name of the environment variable that holds your Databricks client secret. | Yes |
`host_url` | Name of the environment variable that holds your Databricks host URL. | Yes |
`genie_space_id` | Name of the environment variable that holds your Genie space ID. | Yes |
`contexts` | Additional information to supply when communicating with the Databricks Agent, such as `date` or `chat_history`. | No |
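Because the required parameters name environment variables rather than holding the values themselves, a missing variable is an easy mistake to make. The sketch below checks that the variables are set before a project is started. It assumes the variable names used in the YAML template on this page (`DATABRICKS_CLIENT_ID`, `DATABRICKS_CLIENT_SECRET`, `DATABRICKS_HOST`, `GENIE_SPACE_ID`); the helper `missing_env_vars` is hypothetical and not part of the AI Refinery SDK.

```python
import os

# Environment variable names as used in the YAML template on this page;
# rename them if your project uses different ones.
REQUIRED_ENV_VARS = (
    "DATABRICKS_CLIENT_ID",
    "DATABRICKS_CLIENT_SECRET",
    "DATABRICKS_HOST",
    "GENIE_SPACE_ID",
)


def missing_env_vars(names=REQUIRED_ENV_VARS, environ=None):
    """Return the names that are unset or blank in the given environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
    print("All Databricks environment variables are set.")
```

Running the script before launching the project gives an early, readable failure instead of an authentication error later on.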
Workflow Overview¶
The workflow of the `DatabricksAgent` class consists of four components:
- Initialization: An agent is created in the Databricks platform under a Genie workspace and is registered in AI Refinery with the specified configuration.
- Sending a Query: A user query is forwarded from AI Refinery to the Genie Agent running on the Databricks platform.
- Databricks-side Processing: The Genie Agent answers the user's query either with a verbal response or by generating a SQL query. If a SQL query is generated, the Databricks agent runs it automatically, and the result (a numerical value or tabular data) is returned as a human-understandable answer to the user's query.
- Receiving and Parsing the Response: The `DatabricksAgent` returns the processed results as its final response to AI Refinery.
Usage and Quickstart¶
To quickly set up an AI Refinery project with a Databricks Agent, first create your own Genie agent in Databricks as explained above. Once the agent is ready, use the YAML configuration template below to integrate it into the AI Refinery project.
Specifically, ensure the following configurations are included:
- Add a utility agent with `agent_class: DatabricksAgent` under `utility_agents`.
- Ensure the `agent_name` you chose for your `DatabricksAgent` is listed in the `agent_list` under `orchestrator`.
Template YAML Configuration of DatabricksAgent¶
See the YAML template below for the `DatabricksAgent` configuration.
orchestrator:
  agent_list:
    - agent_name: "Database Assistant"

utility_agents:
  - agent_class: DatabricksAgent
    agent_name: "Database Assistant"
    agent_description: "The Database Assistant has access to the tables of an Accenture database and can answer questions about the data contained."
    config:
      client_id: "DATABRICKS_CLIENT_ID" # Required: Environment variable holding Databricks client ID
      client_secret: "DATABRICKS_CLIENT_SECRET" # Required: Environment variable holding Databricks client secret
      host_url: "DATABRICKS_HOST" # Required: Environment variable holding Databricks host URL
      genie_space_id: "GENIE_SPACE_ID" # Required: Environment variable holding Databricks Genie space ID
      contexts: # Optional
        - "date"
        - "chat_history"
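The two requirements listed before the template (a `DatabricksAgent` entry under `utility_agents`, and its `agent_name` appearing in the orchestrator's `agent_list`) can be checked mechanically. The sketch below is a hypothetical helper, not part of the AI Refinery SDK; it works on the plain dictionary a YAML parser would produce from the template, written out inline here so that the example has no third-party dependencies.

```python
REQUIRED_CONFIG_KEYS = ("client_id", "client_secret", "host_url", "genie_space_id")


def check_databricks_config(project: dict) -> list:
    """Return a list of readable problems; an empty list means none were found."""
    problems = []
    agents = project.get("utility_agents", [])
    databricks_agents = [a for a in agents if a.get("agent_class") == "DatabricksAgent"]
    if not databricks_agents:
        problems.append("no utility agent with agent_class DatabricksAgent")

    # Names the orchestrator is allowed to route queries to.
    listed = {
        entry.get("agent_name")
        for entry in project.get("orchestrator", {}).get("agent_list", [])
    }
    for agent in databricks_agents:
        name = agent.get("agent_name")
        if name not in listed:
            problems.append(f"{name!r} is missing from orchestrator.agent_list")
        config = agent.get("config") or {}
        for key in REQUIRED_CONFIG_KEYS:
            if not config.get(key):
                problems.append(f"{name!r} is missing required config key {key!r}")
    return problems


# The template from this page as a dictionary (agent_description omitted for brevity).
project = {
    "orchestrator": {"agent_list": [{"agent_name": "Database Assistant"}]},
    "utility_agents": [
        {
            "agent_class": "DatabricksAgent",
            "agent_name": "Database Assistant",
            "config": {
                "client_id": "DATABRICKS_CLIENT_ID",
                "client_secret": "DATABRICKS_CLIENT_SECRET",
                "host_url": "DATABRICKS_HOST",
                "genie_space_id": "GENIE_SPACE_ID",
                "contexts": ["date", "chat_history"],
            },
        }
    ],
}

if __name__ == "__main__":
    print(check_databricks_config(project) or "configuration looks consistent")
```

In a real project you would load the YAML file with a parser of your choice and pass the resulting dictionary to the same function.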