Accenture AI Refinery SDK

paddlex/RT-DETR-H_layout_17cls

Model Information

paddlex/RT-DETR-H_layout_17cls is a high-accuracy layout detection model based on Baidu's RT-DETR (Real-Time DEtection TRansformer) architecture. It detects and classifies 17 classes of document layout elements, including text blocks, tables, figures, headers, and footers. The model is part of the PaddleX ecosystem.

Model Developer: Baidu / PaddlePaddle
Framework: PaddleX 3.x
Task: Document layout detection
Input: Document page image
Output: Bounding boxes with element labels and confidence scores

Model Architecture

Type: RT-DETR (Real-Time DEtection TRansformer), an end-to-end object detector that needs no non-maximum suppression (NMS) post-processing
Backbone: HGNetv2 (High Performance GPU Net v2)
Model Size: ~435 MB
Inference Time: ~32 ms per image (NVIDIA H100 GPU)

Supported Layout Classes (17)

Class	Description
`text`	Body text blocks
`title`	Document titles / headings
`figure`	Images, charts, diagrams
`figure_caption`	Captions for figures
`table`	Data tables
`table_caption`	Captions for tables
`header`	Page headers
`footer`	Page footers
`reference`	Bibliography / references
`equation`	Mathematical equations
`list-item`	Bullet/numbered list items
`index`	Table of contents / index
`code`	Code blocks
`algorithm`	Algorithm descriptions
`abstract`	Paper abstracts
`author`	Author information
`stamp`	Stamps and seals
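
The class names in the table can be collected into a constant for validating or grouping detections on the client side. The following is a minimal sketch and not part of the SDK: `LAYOUT_CLASSES` and `group_by_label` are hypothetical names, and detections are modeled as plain `(label, score)` tuples rather than the SDK's own result objects.

```python
from collections import defaultdict

# The 17 layout classes listed in the table above.
LAYOUT_CLASSES = frozenset({
    "text", "title", "figure", "figure_caption", "table", "table_caption",
    "header", "footer", "reference", "equation", "list-item", "index",
    "code", "algorithm", "abstract", "author", "stamp",
})


def group_by_label(detections):
    """Group (label, score) pairs by label, rejecting unknown labels."""
    groups = defaultdict(list)
    for label, score in detections:
        if label not in LAYOUT_CLASSES:
            raise ValueError(f"unknown layout class: {label!r}")
        groups[label].append(score)
    return dict(groups)


if __name__ == "__main__":
    # Made-up sample detections, for illustration only.
    sample = [("title", 0.98), ("text", 0.95), ("text", 0.91), ("table", 0.88)]
    for label, scores in group_by_label(sample).items():
        print(f"{label}: {len(scores)} element(s)")
```

Rejecting unknown labels early makes a typo such as `list_item` (underscore instead of hyphen) fail loudly instead of silently creating an empty group.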

Usage

from air.document_analysis import DocumentAnalysisClient

client = DocumentAnalysisClient(api_key="...")

# Run layout detection on a single document page image.
result = client.layout_detection(
    model="paddlex/RT-DETR-H_layout_17cls",
    image_path="document_page.png",
    threshold=0.5,
)

# Print each detected element with its label, confidence score, and bounding box.
for element in result.elements:
    print(f"{element.label}: score={element.score:.3f}, bbox={element.bbox}")
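
Detections often need some post-processing before downstream use. The sketch below filters elements by confidence and sorts them into a rough top-to-bottom, left-to-right order. It does not call the SDK: `Element` is a hypothetical stand-in for whatever `result.elements` contains, `filter_and_order` is an illustrative helper, and the `[x_min, y_min, x_max, y_max]` pixel layout of `bbox` is an assumption that should be checked against the API reference.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Element:
    """Hypothetical stand-in for one detected layout element."""
    label: str
    score: float
    bbox: Sequence[float]  # assumed [x_min, y_min, x_max, y_max] in pixels


def filter_and_order(elements: List[Element], min_score: float = 0.5) -> List[Element]:
    """Keep elements scoring at least min_score, sorted by top edge, then left edge."""
    kept = [e for e in elements if e.score >= min_score]
    # Simple heuristic: it will not reproduce true reading order
    # on multi-column pages.
    return sorted(kept, key=lambda e: (e.bbox[1], e.bbox[0]))


if __name__ == "__main__":
    # Made-up detections, for illustration only.
    page = [
        Element("text", 0.93, [50, 200, 550, 400]),
        Element("title", 0.97, [50, 40, 550, 90]),
        Element("footer", 0.31, [50, 780, 550, 800]),
    ]
    for e in filter_and_order(page):
        print(f"{e.label}: score={e.score:.3f}, bbox={list(e.bbox)}")
```

Filtering on the client side like this lets one request be reused with several confidence cut-offs, at the cost of transferring low-confidence detections that are later discarded.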

API Reference

See the Document Analysis API for full endpoint documentation.

External References