paddlex/PP-OCRv4_server_det
Model Information
paddlex/PP-OCRv4_server_det is a server-side text detection model from the PP-OCRv4 series, developed by Baidu's PaddlePaddle team. It detects text regions in document images and outputs quadrilateral bounding polygons suitable for downstream text recognition. The model is optimized for high accuracy on server-grade hardware.
- Model Developer: Baidu / PaddlePaddle
- Framework: PaddleX 3.x
- Task: Text detection (locating text regions in images)
- Input: Document page image
- Output: Quadrilateral bounding polygons for each text region
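Downstream recognition steps often need an axis-aligned crop box for each detected quadrilateral. The following is a minimal sketch, assuming each polygon arrives as four (x, y) points; `quad_to_bbox` and `quad_area` are hypothetical helpers written for illustration and are not part of PaddleX.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def quad_to_bbox(quad: List[Point]) -> Tuple[float, float, float, float]:
    """Return the axis-aligned box (x_min, y_min, x_max, y_max) enclosing a quadrilateral."""
    if len(quad) != 4:
        raise ValueError("expected exactly four (x, y) points")
    xs = [p[0] for p in quad]
    ys = [p[1] for p in quad]
    return min(xs), min(ys), max(xs), max(ys)


def quad_area(quad: List[Point]) -> float:
    """Area of a simple (non self-intersecting) polygon via the shoelace formula."""
    total = 0.0
    for i in range(len(quad)):
        x1, y1 = quad[i]
        x2, y2 = quad[(i + 1) % len(quad)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Made-up example: a slightly tilted text line.
example_quad = [(10.0, 20.0), (110.0, 25.0), (108.0, 55.0), (8.0, 50.0)]
bbox = quad_to_bbox(example_quad)
area = quad_area(example_quad)
print(f"bbox={bbox}, area={area}")
```

The enclosing box is convenient for cropping, while the polygon area can help discard tiny spurious regions before recognition.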
Model Architecture
- Type: DBNet++ (Differentiable Binarization) with ResNet backbone
- Model Size: ~109 MB
- Detection Accuracy: 69.2% (PaddleOCR 3.0 multilingual benchmark)
- Inference Time: ~128 ms (GPU, normal mode) / ~99 ms (GPU, high-performance mode)
- Languages: Language-agnostic (detects text regions regardless of script)
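The idea behind Differentiable Binarization, named in the architecture list above, is to replace a hard threshold with a steep sigmoid so the threshold map can be learned jointly with the probability map. The sketch below illustrates the approximate binarization function B = 1 / (1 + exp(-k (P - T))) for a single pixel in plain Python. The amplification factor k = 50 is the value commonly quoted for DB and is an assumption here, as are the sample inputs; this is not the PaddleX implementation.

```python
import math


def approx_binarize(p: float, t: float, k: float = 50.0) -> float:
    """Approximate binarization of probability p against threshold t.

    Computes sigmoid(k * (p - t)) in a numerically stable way.
    """
    z = k * (p - t)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Made-up probability values compared against a threshold of 0.3.
for p in (0.1, 0.3, 0.9):
    print(f"P={p:.1f} -> B={approx_binarize(p, 0.3):.4f}")
```

With a large k the output is close to 0 or 1 away from the threshold, yet the function stays differentiable, which is what allows gradients to flow through the binarization step during training.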
Benchmark
| Model | Accuracy (%) | GPU Inference (ms), normal / high-performance | Model Size (MB) |
|---|---|---|---|
| PP-OCRv4_server_det | 69.2 | 127.82 / 98.87 | 109 |
| PP-OCRv5_server_det | 83.8 | 89.55 / 70.19 | 84.3 |
| PP-OCRv4_mobile_det | 63.8 | 9.87 / 4.17 | 4.7 |
All models were evaluated on the PaddleOCR 3.0 multilingual dataset (Chinese, Traditional Chinese, English, Japanese), which covers street scenes, web images, documents, handwriting, and distorted text.
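To compare the trade-offs in the table, the figures can be loaded into a small structure and a few ratios derived. This is an illustrative sketch only: the `BenchmarkRow` class and its field names are hypothetical, and the two GPU timings per model are read as normal mode followed by high-performance mode, matching the inference-time entry in the architecture list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkRow:
    model: str
    accuracy_pct: float
    gpu_ms_normal: float
    gpu_ms_high_perf: float
    size_mb: float

    @property
    def high_perf_speedup(self) -> float:
        """How many times faster high-performance mode is than normal mode."""
        return self.gpu_ms_normal / self.gpu_ms_high_perf


# Values copied from the benchmark table.
ROWS = [
    BenchmarkRow("PP-OCRv4_server_det", 69.2, 127.82, 98.87, 109.0),
    BenchmarkRow("PP-OCRv5_server_det", 83.8, 89.55, 70.19, 84.3),
    BenchmarkRow("PP-OCRv4_mobile_det", 63.8, 9.87, 4.17, 4.7),
]

for row in ROWS:
    print(f"{row.model}: {row.high_perf_speedup:.2f}x speedup in high-performance mode")

v4_server, v5_server, v4_mobile = ROWS
gap_to_v5 = v5_server.accuracy_pct - v4_server.accuracy_pct
gain_over_mobile = v4_server.accuracy_pct - v4_mobile.accuracy_pct
print(f"v5 server leads v4 server by {gap_to_v5:.1f} points")
print(f"v4 server leads v4 mobile by {gain_over_mobile:.1f} points")
```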
Usage
```python
from air.document_analysis import DocumentAnalysisClient

client = DocumentAnalysisClient(api_key="...")

# Run text detection on a single page image.
result = client.text_detection(
    model="paddlex/PP-OCRv4_server_det",
    image_path="document_page.png",
    threshold=0.3,
)

# Print each detected region with its confidence score.
for region in result.regions:
    print(f"Region: bbox={region.bbox}, score={region.score:.3f}")
```
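After detection, a common next step is to drop low-confidence regions and order the rest roughly top to bottom before passing them to a recognizer. The sketch below assumes each region exposes `bbox` as four (x, y) points and a float `score`, as the loop above suggests; the `Region` dataclass is a hypothetical stand-in for the client's result type, and `filter_and_sort` is not part of any documented API.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Region:
    """Hypothetical stand-in for one detected region."""
    bbox: List[Tuple[float, float]]  # assumed: four (x, y) corner points
    score: float


def filter_and_sort(regions: Sequence[Region], min_score: float = 0.5) -> List[Region]:
    """Keep regions scoring at least min_score, ordered by top edge, then left edge."""
    kept = [r for r in regions if r.score >= min_score]
    return sorted(
        kept,
        key=lambda r: (min(y for _, y in r.bbox), min(x for x, _ in r.bbox)),
    )


# Made-up detections for illustration.
sample = [
    Region([(200, 10), (300, 10), (300, 30), (200, 30)], 0.92),
    Region([(10, 12), (150, 12), (150, 32), (10, 32)], 0.35),
    Region([(10, 60), (150, 60), (150, 80), (10, 80)], 0.88),
    Region([(10, 10), (150, 10), (150, 30), (10, 30)], 0.71),
]
ordered = filter_and_sort(sample, min_score=0.5)
for region in ordered:
    print(f"score={region.score:.2f}, top-left={region.bbox[0]}")
```

Sorting by exact top edge is a simplification: lines that are slightly skewed may need their top coordinates grouped into bands before sorting left to right.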
API Reference
See the Document Analysis API for full endpoint documentation.