
Image Segmentation API

This documentation introduces AI Refinery's Image Segmentation API. The API leverages advanced machine learning models to segment images into distinct, labeled regions.

Users guide the segmentation by providing point prompts: specific (x, y) locations in the image. The model responds with a single mask image, assigning a unique categorical value to each detected region, such as an object, object part, person, or background. This mask enables easy identification and analysis of specific areas within the original image.

You can access this functionality through our SDK using either the AIRefinery or AsyncAIRefinery clients.

Asynchronous Image Segmentation

AsyncAIRefinery.images.segment()

The AsyncAIRefinery client generates a mask asynchronously by sending a POST request to the segmentation endpoint.

Parameters:
  • image (str): A base64-encoded image used for segment extraction.
  • segment_prompt (str): Specifies points guiding the image segmentation. Provided as a 3D list of point pairs, e.g., [[[x1, y1], [x2, y2]]]. The model uses these prompts to determine whether to create distinct segments in the resulting mask.
  • model (str): The model name. A complete list can be found in the Segmentation Models section of our model catalog page.
  • timeout (float | None): The maximum time (in seconds) to wait for a response. Defaults to 60 seconds if not provided.
  • extra_headers (dict[str, str] | None): Request-specific headers that override any default headers.
  • extra_body (object | None): Additional data to include in the request body, if needed.
  • **kwargs: Additional segmentation parameters (e.g., "n", "size", "user").
Returns:
  • SegmentationResponse: A Pydantic model containing the generated masks and metadata.
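The segment_prompt is a 3D list: an outer list of prompt groups, each holding [x, y] point pairs. A minimal sketch of a shape check you might run before sending a request (the helper name is illustrative, not part of the SDK):

```python
def validate_segment_prompt(segment_prompt):
    """Check that segment_prompt is a 3D list of [x, y] point pairs,
    e.g. [[[450, 600]]] for a single point in a single group."""
    if not isinstance(segment_prompt, list) or not segment_prompt:
        raise ValueError("segment_prompt must be a non-empty list of point groups")
    for group in segment_prompt:
        if not isinstance(group, list) or not group:
            raise ValueError("each prompt group must be a non-empty list of points")
        for point in group:
            if (
                not isinstance(point, list)
                or len(point) != 2
                or not all(isinstance(c, (int, float)) for c in point)
            ):
                raise ValueError("each point must be an [x, y] pair of numbers")
    return segment_prompt
```

For example, `validate_segment_prompt([[[450, 600]]])` passes, while a flat list like `[450, 600]` raises a ValueError.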
SegmentationResponse Object

This object represents the complete response from the Images segment endpoint. Its attributes are:

  • created (int): The Unix timestamp of when the segmentation request was created.
  • data (List[Mask]): The list of generated masks.
  • usage (Optional[Usage]): Token usage information (if available).
Mask Object

This object represents a single generated mask and its metadata. Its attributes are:

  • b64_json (Optional[str]): The mask data encoded in Base64 format.
  • label (Optional[str]): The semantic class label assigned to each segment, if available from the chosen model.
  • score (Optional[str]): The confidence score from the model for each created mask, given the prompt, if provided by the chosen model.
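Because b64_json holds the mask bytes in Base64, a returned mask can be decoded with the standard library and written to disk. A minimal sketch (the helper name and output path are illustrative, not part of the SDK):

```python
import base64


def save_mask(b64_json: str, path: str) -> None:
    """Decode a Base64-encoded mask and write the raw bytes to a file."""
    mask_bytes = base64.b64decode(b64_json)
    with open(path, "wb") as f:
        f.write(mask_bytes)
```

In practice you would iterate over `response.data` and call something like `save_mask(mask.b64_json, f"mask_{i}.png")` for each Mask object.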
Usage Object

This object holds token-usage statistics for an image request. Its attributes are:

  • input_tokens (int): Number of tokens in the prompt.
  • input_tokens_details (Dict[str, int]): A breakdown of input token usage.
  • output_tokens (int): Number of tokens in the generated output.
  • total_tokens (int): Total tokens used.
Example Usage
import asyncio  
import base64
import os  

import requests

from air import AsyncAIRefinery  
from air import login  

# Authenticate using account and API key retrieved from environment variables  
auth = login(  
    account=str(os.getenv("ACCOUNT")),  
    api_key=str(os.getenv("API_KEY")),  
)  

# Get base URL for AI Refinery service from environment variable 
base_url = os.getenv("AIREFINERY_ADDRESS", "")  

# Fetch the image and convert it to base64
def get_image_as_base64(url: str) -> str:
    """Fetches an image from a URL and returns it as a base64 encoded string."""
    response = requests.get(url, timeout=60)
    response.raise_for_status()  # Ensure the request was successful
    return base64.b64encode(response.content).decode("utf-8")

# Sample image:
IMG_URL = "https://huggingface.co/ybelkada/segment-anything/resolve/main/assets/car.png"
image_for_segmentation = get_image_as_base64(IMG_URL)

async def segment_image_async():  
    # Initialize the asynchronous client for AI Refinery service with authentication details  
    client = AsyncAIRefinery(**auth.openai(base_url=base_url))  

    # Use the images sub-client to asynchronously generate a mask on the provided segment_prompt with the given model.
    response = await client.images.segment(  
        image=image_for_segmentation, # Provide desired base64 image 
        segment_prompt=[[[450, 600]]], # Provide best guess of segment you want to extract from the image 
        model="syscv-community/sam-hq-vit-base", # Specify the model to use for image segmentation  
    )  

    # Print the response from the image segmentation request  
    print("Async image segmentation response: ", response)  

# Execute the asynchronous image segmentation function when the script is run  
if __name__ == "__main__":  
    asyncio.run(segment_image_async()) 
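One advantage of the asynchronous client is that several segmentation requests can run concurrently with asyncio.gather. The sketch below substitutes a stub coroutine for client.images.segment() so it runs standalone; in practice, replace the stub with real awaited calls on the client:

```python
import asyncio


async def segment_stub(prompt):
    """Stand-in for `await client.images.segment(...)`: echoes the prompt it received."""
    await asyncio.sleep(0)  # placeholder for network latency
    return {"segment_prompt": prompt}


async def segment_many(prompts):
    """Issue one request per prompt and collect the responses concurrently."""
    tasks = [segment_stub(p) for p in prompts]
    return await asyncio.gather(*tasks)


# Two point prompts, each targeting a different region of the image
responses = asyncio.run(segment_many([[[[450, 600]]], [[[100, 200]]]]))
```

asyncio.gather preserves input order, so responses[0] corresponds to the first prompt even if a later request finishes first.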

Synchronous Image Segmentation

AIRefinery.images.segment()

The AIRefinery client generates masks in a synchronous manner. This method supports the same parameters and return structure as the asynchronous method (AsyncAIRefinery.images.segment()) described above.

Example Usage
import base64
import os  

import requests

from air import AIRefinery  
from air import login   

# Authenticate using account and API key retrieved from environment variables  
auth = login(  
    account=str(os.getenv("ACCOUNT")),  
    api_key=str(os.getenv("API_KEY")),  
)  

# Get base URL for AI Refinery service from environment variable
base_url = os.getenv("AIREFINERY_ADDRESS", "")  

# Fetch the image and convert it to base64
def get_image_as_base64(url: str) -> str:
    """Fetches an image from a URL and returns it as a base64 encoded string."""
    response = requests.get(url, timeout=60)
    response.raise_for_status()  # Ensure the request was successful
    return base64.b64encode(response.content).decode("utf-8")

# Sample image:
IMG_URL = "https://huggingface.co/ybelkada/segment-anything/resolve/main/assets/car.png"
image_for_segmentation = get_image_as_base64(IMG_URL)


def segment_image_sync():  
    # Initialize the synchronous client for AI Refinery service with authentication details  
    client = AIRefinery(**auth.openai(base_url=base_url))  

    # Use the images sub-client to synchronously generate a mask based on the provided segment_prompt with the given model.
    response = client.images.segment(  
        image=image_for_segmentation, # Provide desired base64 image 
        segment_prompt=[[[450, 600]]], # Provide best guess of segment you want to extract from the image 
        model="syscv-community/sam-hq-vit-base", # Specify the model to use for image segmentation  
    )  

    # Print the response from the image segmentation request  
    print("Sync image segmentation response: ", response)  

# Execute the synchronous image segmentation function when the script is run  
if __name__ == "__main__":  
    segment_image_sync()