paddlex/PP-OCRv4_server_det

Model Information

paddlex/PP-OCRv4_server_det is the server-side text detection model from the PP-OCRv4 series, developed by Baidu's PaddlePaddle team. It locates text regions in document images and outputs a quadrilateral bounding polygon for each one, ready to be passed to a downstream text recognition model. It favors accuracy over speed and size: compared with PP-OCRv4_mobile_det it is larger and slower but scores higher on the benchmark below.

  • Model Developer: Baidu / PaddlePaddle
  • Framework: PaddleX 3.x
  • Task: Text detection (locating text regions in images)
  • Input: Document page image
  • Output: Quadrilateral bounding polygons for each text region
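
The quadrilateral output can be illustrated with a short sketch. It assumes each region is a list of four (x, y) corner points in pixel coordinates, listed in order around the shape; this page does not specify the actual layout, and the helper names are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def quad_to_bbox(quad: List[Point]) -> Tuple[float, float, float, float]:
    """Return the axis-aligned box (x_min, y_min, x_max, y_max) enclosing a quadrilateral."""
    xs = [p[0] for p in quad]
    ys = [p[1] for p in quad]
    return min(xs), min(ys), max(xs), max(ys)


def quad_area(quad: List[Point]) -> float:
    """Area of a simple polygon via the shoelace formula (vertices listed in order)."""
    total = 0.0
    n = len(quad)
    for i in range(n):
        x1, y1 = quad[i]
        x2, y2 = quad[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# A slightly rotated text line, starting from the top-left corner (assumed order).
quad = [(10.0, 20.0), (110.0, 25.0), (108.0, 55.0), (8.0, 50.0)]
bbox = quad_to_bbox(quad)
area = quad_area(quad)
print(bbox, area)
```

The axis-aligned box is convenient for cropping, while the polygon itself preserves the rotation of slanted text lines.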

Model Architecture

  • Type: DB (Differentiable Binarization) detector with a PP-HGNet backbone
  • Model Size: ~109 MB
  • Detection Accuracy: 69.2% (PaddleOCR 3.0 multilingual benchmark)
  • Inference Time: ~128 ms (GPU, normal mode) / ~99 ms (GPU, high-performance mode)
  • Languages: Language-agnostic (detects text regions regardless of script)

Benchmark

Model                  Accuracy (%)   GPU Inference (ms), normal / high-performance   Model Size (MB)
PP-OCRv4_server_det    69.2           127.82 / 98.87                                   109
PP-OCRv5_server_det    83.8           89.55 / 70.19                                    84.3
PP-OCRv4_mobile_det    63.8           9.87 / 4.17                                      4.7

Evaluated on the PaddleOCR 3.0 multilingual dataset (Chinese, Traditional Chinese, English, Japanese), which covers street scenes, web images, documents, handwriting, and distorted text.
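
To make the trade-offs in the table concrete, the following sketch derives accuracy gaps and latency ratios from the published figures. The numbers are copied from the table; the dictionary layout and function names are illustrative only.

```python
# Figures copied from the benchmark table:
# (accuracy %, GPU ms normal mode, GPU ms high-performance mode, size MB)
BENCHMARK = {
    "PP-OCRv4_server_det": (69.2, 127.82, 98.87, 109.0),
    "PP-OCRv5_server_det": (83.8, 89.55, 70.19, 84.3),
    "PP-OCRv4_mobile_det": (63.8, 9.87, 4.17, 4.7),
}


def compare(baseline: str, candidate: str) -> dict:
    """Compare a candidate model against a baseline using the table figures."""
    b_acc, b_ms, _, b_mb = BENCHMARK[baseline]
    c_acc, c_ms, _, c_mb = BENCHMARK[candidate]
    return {
        "accuracy_delta_points": c_acc - b_acc,  # positive: candidate is more accurate
        "latency_ratio": b_ms / c_ms,            # above 1: candidate is faster (normal mode)
        "size_ratio": b_mb / c_mb,               # above 1: candidate is smaller
    }


def high_performance_speedup(model: str) -> float:
    """Ratio of normal-mode to high-performance-mode GPU latency for one model."""
    _, normal_ms, fast_ms, _ = BENCHMARK[model]
    return normal_ms / fast_ms


v5_vs_v4 = compare("PP-OCRv4_server_det", "PP-OCRv5_server_det")
mobile_vs_server = compare("PP-OCRv4_server_det", "PP-OCRv4_mobile_det")
for name in BENCHMARK:
    print(f"{name}: {high_performance_speedup(name):.2f}x faster in high-performance mode")
```

On these figures, PP-OCRv5_server_det is 14.6 points more accurate and roughly 1.4x faster in normal mode than this model, while PP-OCRv4_mobile_det gives up 5.4 points of accuracy for roughly 13x lower latency.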


Usage

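from air.document_analysis import DocumentAnalysisClient

# Create a client; replace "..." with your API key.
client = DocumentAnalysisClient(api_key="...")

# Run text detection on a single page image.
result = client.text_detection(
    model="paddlex/PP-OCRv4_server_det",
    image_path="document_page.png",
    threshold=0.3,  # detection threshold
)

# Each detected region exposes its bounding polygon and a confidence score.
for region in result.regions:
    print(f"Region: bbox={region.bbox}, score={region.score:.3f}")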
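
Once regions are returned, a common next step is to drop low-confidence detections and order the rest for recognition. The sketch below does not call the client: `Region` is a hypothetical stand-in that mirrors only the `bbox` and `score` attributes used in the loop above, and it assumes `bbox` is a list of four (x, y) corner points, which this page does not confirm. The `min_score` and `line_tol` parameters are illustrative and separate from the `threshold` argument of the API call.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Region:
    """Hypothetical stand-in for one detection result entry."""
    bbox: List[Tuple[float, float]]  # assumed: four (x, y) corner points
    score: float


def reading_order(regions: List[Region], min_score: float = 0.5,
                  line_tol: float = 10.0) -> List[Region]:
    """Keep regions scoring at least min_score, sorted top-to-bottom, then left-to-right."""
    kept = [r for r in regions if r.score >= min_score]

    def key(r: Region) -> Tuple[int, float]:
        top = min(y for _, y in r.bbox)
        left = min(x for x, _ in r.bbox)
        # Bucket the top edge so small vertical jitter does not split one text line.
        return (int(top // line_tol), left)

    return sorted(kept, key=key)


regions = [
    Region([(200, 12), (300, 12), (300, 30), (200, 30)], 0.90),  # first line, right
    Region([(10, 14), (150, 14), (150, 32), (10, 32)], 0.80),    # first line, left
    Region([(10, 52), (120, 52), (120, 70), (10, 70)], 0.95),    # second line
    Region([(10, 90), (60, 90), (60, 105), (10, 105)], 0.20),    # low score, dropped
]
ordered = reading_order(regions)
for r in ordered:
    print(r.bbox[0], r.score)
```

Fixed-height bucketing is a simple heuristic; regions whose top edges straddle a bucket boundary can still be split, so multi-column or heavily skewed pages may need a more careful layout step.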

API Reference

See the Document Analysis API for full endpoint documentation.

External References