Debian Accessibility Project
Summary
Optical character recognition (ocr)
Debian Accessibility Optical Character Recognition (OCR)

This metapackage will install packages which are useful for Optical Character Recognition (OCR).

Description


If you discover a project that looks like a good candidate for Debian Accessibility, or if you have prepared an unofficial Debian package, please do not hesitate to send a description of that project to the Debian Accessibility mailing list.


Debian Accessibility Optical character recognition (ocr) packages

Official Debian packages with high relevance

Ebook-speaker
eBook reader that reads aloud in a synthetic voice
Versions of package ebook-speaker
Release  Version         Architectures
sid      5.0.0-1         amd64, arm64, armel, armhf, i386, mips, mips64el, mipsel, ppc64el, s390x
wheezy   2.0-3           amd64, armel, armhf, i386, ia64, kfreebsd-amd64, kfreebsd-i386, mips, mipsel, powerpc, s390, s390x, sparc
jessie   2.8.1-1+deb8u1  amd64, armel, armhf, i386
stretch  4.1.0-2         amd64, arm64, armel, armhf, i386, mips, mips64el, mipsel, ppc64el, s390x
sid      4.1.0-2         hurd-i386, kfreebsd-amd64
buster   5.0.0-1         amd64, arm64, armel, armhf, i386, mips, mips64el, mipsel, ppc64el, s390x
Debtags of package ebook-speaker:
accessibility: speech
interface: commandline
role: program
scope: utility
sound: player
works-with: file
works-with-format: epub
Popcon: 26 users (10 upd.)*
License: DFSG free

This package provides a command-line e-reader that reads out electronic text using speech synthesis. It has a simple user interface appropriate for Braille terminals.

Currently the following formats are supported (some formats need additional packages as suggested by this package):

 AportisDoc
 ASCII mail text
 ASCII text
 Broadband eBooks (BBeB)
 Composite Document File (Microsoft Office Word)
 DAISY3 DTBook
 EPUB ebook data
 GIF image data
 GutenPalm zTXT
 GNU gettext message catalogue
 HTML document
 ISO-8859 text
 JPEG image data
 Microsoft Reader eBook Data
 Microsoft Windows HtmlHelp Data
 Microsoft Word 2007+
 Mobipocket E-book
 MS Windows HtmlHelp Data
 Netpbm PPM data
 OpenDocument Text
 PDF document
 PeanutPress PalmOS
 PNG image data
 POSIX shell script text
 PostScript document
 Rich Text Format
 troff or preprocessor text (e.g. Linux man-pages)
 UTF-8 Unicode mail text
 UTF-8 Unicode text
 WordPerfect
 XML document text
Gocr
Command line OCR
Maintainer: Gürkan Myczko
Versions of package gocr
Release  Version  Architectures
sid      0.52-1   amd64, arm64, armel, armhf, hurd-i386, i386, kfreebsd-amd64, kfreebsd-i386, mips, mips64el, mipsel, ppc64el, s390x
squeeze  0.48-1   amd64, armel, i386, ia64, kfreebsd-amd64, kfreebsd-i386, mips, mipsel, powerpc, s390, sparc
wheezy   0.49-1   amd64, armel, armhf, i386, ia64, kfreebsd-amd64, kfreebsd-i386, mips, mipsel, powerpc, s390, s390x, sparc
jessie   0.49-2   amd64, armel, armhf, i386
stretch  0.49-2   amd64, arm64, armel, armhf, i386, mips, mips64el, mipsel, ppc64el, s390x
buster   0.52-1   amd64, arm64, armel, armhf, i386, mips, mips64el, mipsel, ppc64el, s390x
Debtags of package gocr:
accessibility: ocr
interface: commandline
role: program
scope: application
use: converting
works-with: image, image:raster, text
Popcon: 309 users (161 upd.)*
License: DFSG free

This is a multi-platform OCR (Optical Character Recognition) program.

It can read pnm, pbm, pgm, ppm, some pcx and tga image files.

Currently the program should handle scans well when their text is in a single column and they contain no tables. Font sizes of 20 to 60 pixels are supported.

If you want to write your own OCR, libgocr is provided in a separate package. Documentation and a graphical wrapper are also provided in separate packages.
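
As a rough illustration of driving gocr from a script, the sketch below builds a gocr command line for one of the image types listed in the description and runs it only if the gocr binary is found. The helper names (`gocr_command`, `run_gocr`) and the file-suffix check are assumptions made for this example; only the `-i` input option comes from gocr itself.

```python
import shutil
import subprocess
from pathlib import Path

# Suffixes taken from the package description. Checking them up front is an
# assumption of this sketch, not something gocr asks callers to do.
SUPPORTED_SUFFIXES = {".pnm", ".pbm", ".pgm", ".ppm", ".pcx", ".tga"}


def gocr_command(image_path):
    """Build the argument list for a gocr run on a single image file."""
    path = Path(image_path)
    if path.suffix.lower() not in SUPPORTED_SUFFIXES:
        raise ValueError(f"unsupported image type for gocr: {path.suffix!r}")
    return ["gocr", "-i", str(path)]


def run_gocr(image_path):
    """Return the recognized text, or None when gocr is not installed."""
    if shutil.which("gocr") is None:
        return None
    result = subprocess.run(
        gocr_command(image_path), capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling `run_gocr("page.pgm")` would return whatever gocr prints on standard output for that scan. Recognition quality still depends on the single-column layout and font-size limits mentioned in the description.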

Hocr-gtk
GTK+ frontend for Hebrew OCR
Versions of package hocr-gtk
Release  Version    Architectures
wheezy   0.10.17-1  all
squeeze  0.8.2-6.1  amd64, armel, i386, ia64, kfreebsd-amd64, kfreebsd-i386, mips, mipsel, powerpc, s390, sparc
sid      0.10.18-3  all
buster   0.10.18-3  all
jessie   0.10.17-2  all
Debtags of package hocr-gtk:
accessibility: ocr
culture: hebrew
interface: x11
role: program
scope: application
uitoolkit: gtk
use: converting
works-with: image, image:raster, text
x11: application
Popcon: 5 users (0 upd.)*
License: DFSG free

Hocr-gtk is a GTK+ based graphical interface to the libhocr library. It can open multiple image formats and uses aspell for internal spell checking.

Lios
Linux intelligent OCR solution
Maintainer: Samuel Thibault
Versions of package lios
Release  Version  Architectures
sid      2.7-2    all
stretch  2.1-2    all
buster   2.7-2    all
Popcon: 69 users (9 upd.)*
License: DFSG free

Lios provides a graphical interface on top of the Cuneiform and Tesseract OCR backends to make OCR processing easier for impaired users, with full autorotation, brightness optimization, rectangle selection, audio feedback, etc.

Tesseract-ocr
Tesseract command line OCR tool
Versions of package tesseract-ocr
Release   Version          Architectures
wheezy    3.02.01-6        amd64, armel, armhf, i386, ia64, kfreebsd-amd64, kfreebsd-i386, mips, mipsel, powerpc, s390, s390x, sparc
stretch   3.04.01-5        amd64, arm64, armel, armhf, i386, mips, mips64el, mipsel, ppc64el, s390x
buster    4.0.0-2          amd64, arm64, armel, armhf, i386, mips, mips64el, mipsel, ppc64el, s390x
squeeze   2.04-2+squeeze1  amd64, armel, i386, ia64, kfreebsd-amd64, kfreebsd-i386, mips, mipsel, powerpc, s390, sparc
jessie    3.03.03-1        amd64, armel, armhf, i386
sid       4.0.0-2          amd64, arm64, armel, armhf, hurd-i386, i386, kfreebsd-amd64, kfreebsd-i386, mips, mips64el, mipsel, ppc64el, s390x
upstream  4.1.0-rc1
Debtags of package tesseract-ocr:
accessibility: ocr
interface: commandline
role: program
Popcon: 1241 users (408 upd.)*
Newer upstream version available (4.1.0-rc1).
License: DFSG free

Tesseract is an open source Optical Character Recognition (OCR) Engine. It can be used directly, or (for programmers) using an API to extract printed text from images. It supports a wide variety of languages. This package includes the command line tool.
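
To make the "used directly" case concrete, here is a minimal sketch of calling the command line tool from Python. It assumes the usual `tesseract IMAGE OUTPUTBASE -l LANG` invocation, which writes the recognized text to `OUTPUTBASE.txt`. The function names and the default language `eng` are choices made for this example, and the matching language data package is assumed to be installed.

```python
import shutil
import subprocess
from pathlib import Path


def tesseract_command(image_path, output_base, language="eng"):
    """Build the argument list for one tesseract run."""
    return ["tesseract", str(image_path), str(output_base), "-l", language]


def output_text_path(output_base):
    """Path of the text file tesseract writes for a given output base."""
    return Path(f"{output_base}.txt")


def ocr_to_text(image_path, output_base, language="eng"):
    """Run tesseract and return the text, or None if it is not installed."""
    if shutil.which("tesseract") is None:
        return None
    subprocess.run(
        tesseract_command(image_path, output_base, language), check=True
    )
    return output_text_path(output_base).read_text(encoding="utf-8")
```

For example, `ocr_to_text("scan.png", "scan", language="deu")` would read `scan.txt` after tesseract finishes. Programs that need finer control would use the API mentioned in the description instead of the command line tool.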

Ttf-ocr-a
transitional dummy package
Versions of package ttf-ocr-a
Release  Version  Architectures
squeeze  1.0-2    all
wheezy   1.0-4    all
Debtags of package ttf-ocr-a:
accessibility: ocr
made-of: font
role: data, dummy
x11: font
Popcon: 0 users (0 upd.)*
License: DFSG free

This package is a dummy transitional package. It can be safely removed.


Debian packages in contrib or non-free

Cuneiform
multi-language OCR system
Versions of package cuneiform
Release  Version                  Architectures
wheezy   1.1.0+dfsg-4 (non-free)  amd64, i386, ia64, kfreebsd-amd64, kfreebsd-i386, mipsel
sid      1.1.0+dfsg-7 (non-free)  amd64, arm64, armhf, hurd-i386, i386, kfreebsd-amd64, kfreebsd-i386, mips64el, mipsel
buster   1.1.0+dfsg-7 (non-free)  amd64, arm64, armhf, i386, mips64el, mipsel
jessie   1.1.0+dfsg-5 (non-free)  amd64, i386
Debtags of package cuneiform:
accessibility: ocr
interface: commandline
role: program
scope: utility
use: converting
works-with: image, image:raster
Popcon: 49 users (3 upd.)*
License: non-free

Cuneiform is an OCR system. In addition to text recognition it also does layout analysis and text format recognition.

The following languages are supported: Bulgarian, Croatian, Czech, Danish, Dutch, English, Estonian, French, German, Hungarian, Italian, Latvian, Lithuanian, Polish, Portuguese, Romanian, Russian, Serbian, Slovenian, Spanish, Swedish, Turkish and Ukrainian.

*Popularity contest (popcon) results: number of people who use this package regularly (number of people who upgraded this package recently) out of 200343.
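
The Popcon counts are easier to compare as shares of the total. The short sketch below does that arithmetic with the figures quoted on this page, treating 200343 as the total the footnote compares against. The names `POPCON_TOTAL`, `REGULAR_USERS`, and `usage_share` are made up for this example.

```python
# Figures copied from the Popcon lines and the footnote on this page.
POPCON_TOTAL = 200343
REGULAR_USERS = {
    "ebook-speaker": 26,
    "gocr": 309,
    "hocr-gtk": 5,
    "lios": 69,
    "tesseract-ocr": 1241,
    "ttf-ocr-a": 0,
    "cuneiform": 49,
}


def usage_share(users, total=POPCON_TOTAL):
    """Return regular users as a percentage of the popcon total."""
    return 100.0 * users / total


if __name__ == "__main__":
    # List packages from most to least used, with their share of the total.
    ranked = sorted(REGULAR_USERS.items(), key=lambda item: item[1], reverse=True)
    for name, users in ranked:
        print(f"{name:15s} {users:5d} users  {usage_share(users):.3f}%")
```

Even the most used package here, tesseract-ocr, accounts for well under one percent of the total, so the counts are best read as a relative ranking among these packages.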