Image Segmentation API¶
This documentation introduces AI Refinery's Image Segmentation API. The API leverages advanced machine learning models to segment images into distinct, labeled regions.
Users guide the segmentation by providing point prompts, that is, specific image locations. The model responds with a single mask image, assigning unique categorical values to each detected region, such as objects, object parts, people, or backgrounds. This mask enables easy identification and analysis of specific areas within the original image.
You can access this functionality through our SDK using either the AIRefinery or AsyncAIRefinery clients.
Asynchronous Image Segmentation¶
AsyncAIRefinery.images.segment()¶
The AsyncAIRefinery client generates a mask asynchronously by sending a POST request to the segmentation endpoint.
Parameters:¶
image (str): A base64-encoded image used for segment extraction.
segment_prompt (str): Specifies points guiding the image segmentation. Provided as a 3D list of point pairs, e.g., [[[x1, y1], [x2, y2]]]. The model uses these prompts to determine whether to create distinct segments in the resulting mask.
model (str): The model name. A complete list can be found in the Segmentation Models section of our model catalog page.
timeout (float | None): The maximum time (in seconds) to wait for a response. Defaults to 60 seconds if not provided.
extra_headers (dict[str, str] | None): Request-specific headers that override any default headers.
extra_body (object | None): Additional data to include in the request body, if needed.
**kwargs: Additional segmentation parameters (e.g., "n", "size", "user").
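As an illustration of how the image and segment_prompt arguments might be prepared, here is a minimal sketch. The helper names encode_image_file and build_segment_prompt are hypothetical and not part of the SDK, and the sketch assumes the outer list wraps a single group of points, as in the [[[x1, y1], [x2, y2]]] example above.

```python
import base64
from pathlib import Path
from typing import Iterable, List, Tuple


def encode_image_file(path: str) -> str:
    """Read a local image file and return its contents as a base64 string."""
    return base64.b64encode(Path(path).read_bytes()).decode("utf-8")


def build_segment_prompt(points: Iterable[Tuple[int, int]]) -> List[List[List[int]]]:
    """Wrap (x, y) points in the nested [[[x1, y1], [x2, y2]]] layout.

    Assumption: one outer group holding all points, as in the documented example.
    """
    group = [[int(x), int(y)] for x, y in points]
    if not group:
        raise ValueError("at least one point is required")
    return [group]


print(build_segment_prompt([(450, 600), (500, 620)]))
# [[[450, 600], [500, 620]]]
```

The two return values can then be passed as image= and segment_prompt= to images.segment().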
Returns:¶
SegmentationResponse: A Pydantic model containing the generated masks and metadata.
SegmentationResponse Object¶
This object represents the complete response from the Images segment endpoint. Its attributes are:
created (int): The Unix timestamp of when the requested segmentation was created.
data (List[Mask]): The list of generated masks.
usage (Optional[Usage]): Token usage information (if available).
Mask Object¶
This object represents a single generated mask and its metadata. Its attributes are:
b64_json (Optional[str]): The mask data encoded in Base64 format.
label (Optional[str]): The semantic class label assigned to each segment, if available from the chosen model.
score (Optional[str]): The confidence score from the model for each created mask, given the prompt, if provided by the chosen model.
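Since b64_json carries the mask as Base64 text, a typical next step is to decode it and write it to disk. The sketch below shows one way this could look. MaskStandIn is a hypothetical stand-in that only mirrors the attributes listed above, save_masks is not an SDK function, and the ".png" suffix is an assumption because the mask's image format is not stated here.

```python
import base64
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional


@dataclass
class MaskStandIn:
    """Hypothetical stand-in mirroring the documented Mask attributes."""

    b64_json: Optional[str] = None
    label: Optional[str] = None
    score: Optional[str] = None


def save_masks(masks: List[MaskStandIn], out_dir: str, suffix: str = ".png") -> List[Path]:
    """Decode each mask's b64_json payload and write it into out_dir.

    The default suffix is an assumption; adjust it to the actual mask format.
    """
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    for index, mask in enumerate(masks):
        if mask.b64_json is None:  # the field is Optional, so skip empty entries
            continue
        path = target / f"mask_{index}{suffix}"
        path.write_bytes(base64.b64decode(mask.b64_json))
        written.append(path)
    return written
```

With a real response, the same function could be called as save_masks(response.data, "masks"), assuming the returned Mask objects expose b64_json as described.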
Usage Object¶
This object holds token-usage statistics for an image request. Its attributes are:
input_tokens (int): Number of tokens in the prompt.
input_tokens_details (Dict[str, int]): A breakdown of input token usage.
output_tokens (int): Number of tokens in the generated image.
total_tokens (int): Total tokens used.
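Because usage is optional, code that reads a response may want to guard against its absence. The following sketch gathers the documented top-level fields into a plain dictionary. The function summarize_response is hypothetical, and the SimpleNamespace object is only a stand-in for a real SegmentationResponse.

```python
from datetime import datetime, timezone
from types import SimpleNamespace
from typing import Any, Dict


def summarize_response(response: Any) -> Dict[str, Any]:
    """Collect created, the number of masks, and total tokens (if any)."""
    usage = getattr(response, "usage", None)
    created = datetime.fromtimestamp(response.created, tz=timezone.utc)
    return {
        "created_utc": created.isoformat(),
        "mask_count": len(response.data),
        # usage is Optional, so fall back to None when it is missing
        "total_tokens": usage.total_tokens if usage is not None else None,
    }


# Hypothetical stand-in for a response that carries no usage information
fake_response = SimpleNamespace(created=0, data=[SimpleNamespace(b64_json=None)], usage=None)
print(summarize_response(fake_response))
```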
Example Usage¶
import asyncio
import base64
import os

import requests
from air import AsyncAIRefinery
from air import login

# Authenticate using account and API key retrieved from environment variables
auth = login(
    account=str(os.getenv("ACCOUNT")),
    api_key=str(os.getenv("API_KEY")),
)

# Get base URL for AI Refinery service from environment variable
base_url = os.getenv("AIREFINERY_ADDRESS", "")


# Fetch the image and convert it to base64
def get_image_as_base64(url: str) -> str:
    """Fetches an image from a URL and returns it as a base64 encoded string."""
    response = requests.get(url, timeout=60)
    response.raise_for_status()  # Ensure the request was successful
    return base64.b64encode(response.content).decode("utf-8")


# Sample image:
IMG_URL = "https://huggingface.co/ybelkada/segment-anything/resolve/main/assets/car.png"
image_for_segmentation = get_image_as_base64(IMG_URL)


async def segment_image_async():
    # Initialize the asynchronous client for AI Refinery service with authentication details
    client = AsyncAIRefinery(**auth.openai(base_url=base_url))

    # Use the images sub-client to asynchronously generate a mask on the provided segment_prompt with the given model.
    response = await client.images.segment(
        image=image_for_segmentation,  # Provide desired base64 image
        segment_prompt=[[[450, 600]]],  # Provide best guess of segment you want to extract from the image
        model="syscv-community/sam-hq-vit-base",  # Specify the model to use for image segmentation
    )

    # Print the response from the image segmentation request
    print("Async image segmentation response: ", response)


# Execute the asynchronous image segmentation function when the script is run
if __name__ == "__main__":
    asyncio.run(segment_image_async())
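One reason to prefer the asynchronous client is that several segmentation requests can be awaited together. The sketch below illustrates the idea with asyncio.gather. The helper segment_many is hypothetical, it only assumes that the client exposes images.segment() with the image, segment_prompt, and model parameters documented above, and whether the service applies rate limits to concurrent requests is not covered here.

```python
import asyncio
from typing import Any, List, Sequence


async def segment_many(
    client: Any, image: str, prompts: Sequence[list], model: str
) -> List[Any]:
    """Issue one segment() call per prompt and wait for all of them together.

    Results are returned in the same order as the prompts.
    """
    calls = [
        client.images.segment(image=image, segment_prompt=prompt, model=model)
        for prompt in prompts
    ]
    return await asyncio.gather(*calls)
```

Inside an async function, this could be called as await segment_many(client, image_for_segmentation, [[[[450, 600]]], [[[200, 300]]]], "syscv-community/sam-hq-vit-base"), where the second point is an arbitrary illustrative location.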
Synchronous Image Segmentation¶
AIRefinery.images.segment()¶
The AIRefinery client generates masks synchronously. This method supports the same parameters and return structure as the asynchronous method (AsyncAIRefinery.images.segment()) described above.
Example Usage¶
import base64
import os

import requests
from air import AIRefinery
from air import login

# Authenticate using account and API key retrieved from environment variables
auth = login(
    account=str(os.getenv("ACCOUNT")),
    api_key=str(os.getenv("API_KEY")),
)

# Get base URL for AI Refinery service from environment variable
base_url = os.getenv("AIREFINERY_ADDRESS", "")


# Fetch the image and convert it to base64
def get_image_as_base64(url: str) -> str:
    """Fetches an image from a URL and returns it as a base64 encoded string."""
    response = requests.get(url, timeout=60)
    response.raise_for_status()  # Ensure the request was successful
    return base64.b64encode(response.content).decode("utf-8")


# Sample image:
IMG_URL = "https://huggingface.co/ybelkada/segment-anything/resolve/main/assets/car.png"
image_for_segmentation = get_image_as_base64(IMG_URL)


def segment_image_sync():
    # Initialize the synchronous client for AI Refinery service with authentication details
    client = AIRefinery(**auth.openai(base_url=base_url))

    # Use the images sub-client to synchronously generate a mask based on the provided segment_prompt with the given model.
    response = client.images.segment(
        image=image_for_segmentation,  # Provide desired base64 image
        segment_prompt=[[[450, 600]]],  # Provide best guess of segment you want to extract from the image
        model="syscv-community/sam-hq-vit-base",  # Specify the model to use for image segmentation
    )

    # Print the response from the image segmentation request
    print("Sync image segmentation response: ", response)


# Execute the synchronous image segmentation function when the script is run
if __name__ == "__main__":
    segment_image_sync()