Tesseract

Reads text from images in various languages and typefaces.

Made by Unknown Author

  • Drag selection

  • text-recognition

What is Tesseract?

Tesseract is an open-source optical character recognition (OCR) library written entirely in JavaScript. It is built on the Tesseract OCR engine, which was one of the top-performing engines in the 1995 UNLV Accuracy test. After a period of relative inactivity between 1995 and 2006, the engine has become one of the most accurate open-source OCR solutions available. The library reads binary, grayscale, or color images. Its built-in TIFF reader handles uncompressed TIFF files, and support for compressed TIFF images can be added through the libtiff library. Language files are available for a wide range of languages, including those set in blackletter typefaces such as Fraktur.
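To make the three input types concrete, here is a minimal sketch of how a color image relates to its grayscale and binary forms. It is an illustration only, not Tesseract's own preprocessing: the helper names `rgbaToGray` and `grayToBinary`, the BT.601 luma weights, and the fixed threshold of 128 are all assumptions chosen for this example.

```javascript
// Hypothetical helpers for illustration; they are not part of any Tesseract API.

// Convert interleaved RGBA bytes (the layout used by canvas ImageData)
// to one 8-bit gray value per pixel, using the ITU-R BT.601 luma weights.
function rgbaToGray(rgba) {
  const gray = new Uint8Array(rgba.length / 4);
  for (let i = 0; i < gray.length; i++) {
    const r = rgba[4 * i];
    const g = rgba[4 * i + 1];
    const b = rgba[4 * i + 2];
    gray[i] = Math.round(0.299 * r + 0.587 * g + 0.114 * b);
  }
  return gray;
}

// Reduce grayscale to binary with a fixed global threshold (an assumption):
// 1 marks a dark pixel ("ink"), 0 marks a light pixel ("paper").
function grayToBinary(gray, threshold = 128) {
  return Uint8Array.from(gray, (v) => (v < threshold ? 1 : 0));
}

// Four sample pixels: black, white, pure red, light gray.
const rgba = new Uint8ClampedArray([
  0, 0, 0, 255,
  255, 255, 255, 255,
  255, 0, 0, 255,
  200, 200, 200, 255,
]);

const gray = rgbaToGray(rgba);     // gray values: 0, 255, 76, 200
const binary = grayToBinary(gray); // binary values: 1, 0, 1, 0

console.log(Array.from(gray), Array.from(binary));
```

A fixed threshold is the simplest possible choice and works poorly on unevenly lit scans. It is used here only to keep the sketch short.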

Highlights

  • Built on the Tesseract OCR engine, one of the top-performing engines in the 1995 UNLV Accuracy test
  • Reads binary, grayscale, or color images
  • Includes a built-in TIFF reader for uncompressed files, with optional support for compressed TIFF images through libtiff
  • Supports a wide range of languages, including those set in blackletter typefaces such as Fraktur
  • Written entirely in JavaScript, making it a pure client-side OCR solution

Platforms

  • Mac
  • Linux
  • Windows
  • Web

Languages

  • English

Features

  • OCR