OCR API Endpoint Documentation | PixLab Developers Handbook

API Endpoint Access URL

https://api.pixlab.io/ocr

Description

OCR API: Extract text from an image or video frame that contains human-readable characters. The endpoint detects the input language and then extracts the text content. Multiple languages are supported. OCR stands for Optical Character Recognition; PixLab uses state-of-the-art processing algorithms, so expect accurate results given a good-quality input image.
For more specialized tasks, such as scanning government-issued documents like passports or ID cards, use the /docscan endpoint. If you are dealing with PDF documents, first convert them to raw images via the /pdftoimg endpoint.

HTTP Methods

GET

HTTP Parameters

Required

Fields	Type	Description
`img`	URL	Input image URL. To upload an image directly from your app, call store first and use the link it returns. Only the JPEG, PNG, and BMP file formats are supported; convert is of particular help if your image is in a different format.
`key`	String	Your PixLab API Key.
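
The two required parameters can be combined into a request URL as in the minimal sketch below. It relies only on the Python standard library. `build_ocr_url` is a hypothetical helper, not part of any PixLab SDK, and the file-extension check is only a heuristic, since a URL path does not always reflect the actual image format.

```python
from urllib.parse import urlencode, urlparse

OCR_ENDPOINT = "https://api.pixlab.io/ocr"
# Extensions assumed to correspond to the supported JPEG, PNG and BMP formats.
SUPPORTED_EXTENSIONS = (".jpg", ".jpeg", ".png", ".bmp")


def build_ocr_url(image_url: str, api_key: str) -> str:
    """Return a GET URL for the OCR endpoint (hypothetical helper)."""
    path = urlparse(image_url).path.lower()
    if not path.endswith(SUPPORTED_EXTENSIONS):
        raise ValueError(f"Unsupported image format: {image_url}")
    # urlencode percent-encodes the image URL so it survives as a query value.
    return f"{OCR_ENDPOINT}?{urlencode({'img': image_url, 'key': api_key})}"


if __name__ == "__main__":
    print(build_ocr_url("https://example.com/scan.png", "PIXLAB_API_KEY"))
```

Rejecting an unsupported extension before the request is made saves a round trip, but the API response remains the authority on whether an image is accepted.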

Optional

Fields	Type	Description
`lang`	String	Input language code, if known. Do not set this field if you do not know the input language. The supported BCP-47 language codes as of this release are: en (English), de (German), ar (Arabic), he (Modern Hebrew), hi (Hindi), fr (French), cs (Czech), da (Danish), nl (Dutch), fi (Finnish), el (Greek), hu (Hungarian), it (Italian), ja (Japanese), ko (Korean), nb (Norwegian), pl (Polish), pt (Portuguese), zh-Hans (Chinese Simplified), zh-Hant (Chinese Traditional), ru (Russian), es (Spanish), sv (Swedish), tr (Turkish).
`orientation`	Boolean	Detect and correct text orientation in the input image.
`nl`	Boolean	Output new line (\n) character on each detected line.
`br`	Boolean	Output HTML line break (</br>) on each detected line.
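
A minimal sketch of assembling the optional parameters, assuming the language codes listed above are complete for this release. `build_optional_params` is a hypothetical helper. Flags are sent as the string `true`, which is what the JavaScript sample on this page produces through `URLSearchParams`; treating that as the expected encoding is an assumption. `lang` is omitted entirely when the language is unknown, as the table advises.

```python
from typing import Dict, Optional

# BCP-47 codes listed in the `lang` description above.
SUPPORTED_LANGS = frozenset({
    "en", "de", "ar", "he", "hi", "fr", "cs", "da", "nl", "fi", "el", "hu",
    "it", "ja", "ko", "nb", "pl", "pt", "zh-Hans", "zh-Hant", "ru", "es",
    "sv", "tr",
})


def build_optional_params(lang: Optional[str] = None,
                          orientation: bool = False,
                          nl: bool = False,
                          br: bool = False) -> Dict[str, str]:
    """Return only the optional parameters that were explicitly requested."""
    params: Dict[str, str] = {}
    if lang is not None:
        if lang not in SUPPORTED_LANGS:
            raise ValueError(f"Unsupported language code: {lang}")
        params["lang"] = lang
    # Assumption: boolean flags are accepted as the string "true".
    for name, enabled in (("orientation", orientation), ("nl", nl), ("br", br)):
        if enabled:
            params[name] = "true"
    return params


if __name__ == "__main__":
    print(build_optional_params(lang="fr", orientation=True, nl=True))
```

The returned dictionary can be merged with the required `img` and `key` parameters before the request is sent.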

HTTP Response

application/json

This endpoint returns a JSON response. The following fields are included in the response body:

Fields	Type	Description
`status`	Integer	HTTP status code. 200 indicates success; any other code indicates failure.
`output`	String	Extracted text content from the input image.
`lang`	String	BCP-47 language code of detected text.
`bbox`	Array	Bounding box coordinates for each extracted word. Each array element contains: `{ word: Extracted text, x: Top-left X coordinate, y: Top-left Y coordinate, w: Rectangle width, h: Rectangle height }` Use these coordinates with drawrectangles or crop endpoints. Available on `Prod plan` and higher tiers.
`error`	String	Error details when status ≠ 200.
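
To illustrate the response fields above, the sketch below parses a response body and converts each `bbox` entry into corner coordinates. The sample payload is invented for illustration and is not real API output; `parse_ocr_response` and `OcrError` are hypothetical names. Because `bbox` is limited to certain plans, the code treats it as optional.

```python
import json
from typing import Any, Dict, List, Tuple


class OcrError(Exception):
    """Raised when the response reports a non-200 status (hypothetical)."""


def parse_ocr_response(body: str) -> Tuple[str, str, List[Dict[str, Any]]]:
    """Return (text, language, word boxes) from a JSON response body."""
    reply = json.loads(body)
    if reply.get("status") != 200:
        raise OcrError(reply.get("error", "unknown error"))
    boxes = []
    # `bbox` may be missing or null on plans that do not include it.
    for box in reply.get("bbox") or []:
        boxes.append({
            "word": box["word"],
            "left": box["x"],
            "top": box["y"],
            "right": box["x"] + box["w"],    # top-left X plus width
            "bottom": box["y"] + box["h"],   # top-left Y plus height
        })
    return reply["output"], reply["lang"], boxes


if __name__ == "__main__":
    # Invented sample payload, for illustration only.
    sample = json.dumps({
        "status": 200,
        "output": "Hello",
        "lang": "en",
        "bbox": [{"word": "Hello", "x": 10, "y": 20, "w": 50, "h": 12}],
    })
    print(parse_ocr_response(sample))
```

Corner coordinates are often more convenient than width and height when drawing or cropping locally, which is why the sketch converts them.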

Code Samples


import requests
from typing import Dict, Any


def extract_text_from_image(image_url: str, api_key: str) -> None:
    """Extract text from an image using PixLab OCR API."""
    params = {
        'img': image_url,
        'orientation': True,  # Correct text orientation
        'nl': True,  # Output new lines if any
        'key': api_key
    }
    
    try:
        # A timeout keeps the call from hanging indefinitely on network issues
        response = requests.get('https://api.pixlab.io/ocr', params=params, timeout=30)
        response.raise_for_status()
        data: Dict[str, Any] = response.json()
        
        if data['status'] != 200:
            print(f"Error: {data['error']}")
            return
        
        print(f"Input language: {data['lang']}")
        print(f"Text Output: {data['output']}")
        
        # bbox is only returned on the Prod plan and higher tiers
        for box in data.get('bbox') or []:
            print(f"Word: {box['word']}")
            print(f"Bounding box - X: {box['x']} Y: {box['y']} "
                  f"Width: {box['w']} Height: {box['h']}")
    
    except requests.exceptions.RequestException as e:
        print(f"API request failed: {e}")


if __name__ == "__main__":
    extract_text_from_image(
        image_url="http://quotesten.com/wp-content/uploads/2016/06/Confucius-Quote.jpg",
        api_key="PIXLAB_API_KEY"
    )


// Given an image with human-readable characters, detect the input language and extract the text content.
// See https://pixlab.io/endpoints/ocr for additional information.

const params = new URLSearchParams({
    img: 'http://quotesten.com/wp-content/uploads/2016/06/Confucius-Quote.jpg',
    orientation: true, // Correct text orientation
    nl: true, // Output new lines if any
    key: 'PIXLAB_API_KEY'
});

fetch(`https://api.pixlab.io/ocr?${params}`)
    .then(response => response.json())
    .then(reply => {
        if (reply.status !== 200) {
            console.log(reply.error);
        } else {
            console.log("Input language: " + reply.lang);
            console.log("Text Output: " + reply.output);
            // Iterate over all extracted words (bbox is only returned on the Prod plan and higher tiers)
            (reply.bbox || []).forEach(box => {
                console.log("Word: " + box.word);
                console.log("Bounding box - X: " + box.x + " Y: " + box.y + " Width: " + box.w + " Height: " + box.h);
            });
        }
    })
    .catch(error => console.error('Error:', error));

<?php

$params = [
    'img' => 'http://quotesten.com/wp-content/uploads/2016/06/Confucius-Quote.jpg',
    'orientation' => true,
    'nl' => true,
    'key' => 'PIXLAB_API_KEY'
];

$url = 'https://api.pixlab.io/ocr?' . http_build_query($params);

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// TLS certificate verification is left at its default (enabled)

$response = curl_exec($ch);
curl_close($ch);

$reply = json_decode($response, true);

if ($reply['status'] != 200) {
    echo $reply['error'];
} else {
    echo "Input language: " . $reply['lang'] . "\n";
    echo "Text Output: " . $reply['output'] . "\n";
    
    // bbox is only returned on the Prod plan and higher tiers
    foreach ($reply['bbox'] ?? [] as $box) {
        echo "Word: " . $box['word'] . "\n";
        echo "Bounding box - X: " . $box['x'] . " Y: " . $box['y'] . " Width: " . $box['w'] . " Height: " . $box['h'] . "\n";
    }
}


require 'net/http'
require 'uri'
require 'json'

# Given an image with human-readable characters, detect the input language and extract the text content.
# See https://pixlab.io/endpoints/ocr for additional information.

uri = URI.parse('https://api.pixlab.io/ocr')
params = {
  'img' => 'http://quotesten.com/wp-content/uploads/2016/06/Confucius-Quote.jpg',
  'orientation' => true, # Correct text orientation
  'nl' => true, # Output new lines if any
  'key' => 'PIXLAB_API_KEY'
}
uri.query = URI.encode_www_form(params)

response = Net::HTTP.get_response(uri)
reply = JSON.parse(response.body)

if reply['status'] != 200
  puts reply['error']
else
  puts "Input language: " + reply['lang']
  puts "Text Output: " + reply['output']
  # Iterate over all extracted words (bbox is only returned on the Prod plan and higher tiers)
  (reply['bbox'] || []).each do |box|
    puts "Word: " + box['word']
    puts "Bounding box - X: " + box['x'].to_s + " Y: " + box['y'].to_s + " Width: " + box['w'].to_s + " Height: " + box['h'].to_s
  end
end

Similar API Endpoints

nsfw, docscan, bg-remove, facedetect, screencapture, tagimg, facelookup, pdftoimg