GOCR is an OCR (Optical Character Recognition) program, developed under the GNU Public License. It converts scanned images of text back to text files. Joerg Schulenburg started the program, and now leads a team of developers. GOCR can be used with different front-ends, which makes it very easy to port to different OSes and architectures. It can open many different image formats, and its quality has been improving on a daily basis.
Recognize This! People are increasingly acquiring digital images of the world and of documents; often these images contain Roman letters. When viewing web pages with Flash and other accessibility issues we are increasingly faced with "pictures of letters". Such text cannot be copy-n-pasted into other documents, resized, offered to Google Language Tools for translation, etc. The free OCR software currently available offers poor recognition rates for realistic images. RecognizeThis! will identify
Since Sep 8, 2008 / Last update: Feb. 12, 2010. Introduction: NHocr is a command line OCR (Optical Character Recognition) program for the Japanese language, etc. It has been designed to recognize machine-printed Japanese characters and some ASCII characters/symbols in an image. NHocr is probably the first Open Source Japanese OCR software (offline, machine-printed), except for some experimental, partial code open to academic communities. You can also use NHocr through the WeOCR service at: http://maggie.ocrgr
Optical character recognition, usually abbreviated to OCR, is the mechanical or electronic translation of images of handwritten, typewritten or printed text (usually captured by a scanner) into machine-editable text. It is used to convert paper books and documents into electronic files. Lime OCR is built with tesseract-ocr, an OCR engine that was developed at HP Labs between 1985 and 1995, and now at Google. Lime OCR was initially developed for internal use of Lime Consultants, and now
For an effective disaster management system, fast and accurate data collection is essential. The data captured in forms has to be re-entered manually, adding an additional step and a potential bottleneck for the processing of information. Handwritten Character Recognition (HCR) technology can be used to automate this process to a great extent. Two algorithms for Optical Character Recognition are: a feature extraction method using traditional artificial intelligence techniques for classification,
Dear friends! For a few years our group has been developing an open-source OCR (optical character recognition) and translation system for Asian languages. The key features of the OCR system include: 1. Stream OCR processing. During the first stage of the project, we recognized 300,000 pages of the Tibetan Canon in Tibetan for the TBRC Digital Library (www.tbrc.org). We used a MacPro server that processed all 280 volumes with one OCR set. 2. Tibetan spell checker and online dictionary on 250000 wo
Lector: A graphical OCR solution for GNU/Linux based on Python, Qt4 and Tesseract OCR. Author: Davide Setti. Introduction: Lector can help you scan your tons of paper and create text documents! Lector lets you select areas on which you want to do OCR (Optical Character Recognition). Then you can run tesseract-ocr simply by clicking a button. The resulting text can be proofread, formatted and edited directly in Lector. Features: scanning (available only on Linux) OCR via tesseract (with support for mor
The project is currently on hold and only tested on Droid devices, but if you are interested in helping, let us know! Our team is working on a project called Mobile OCR (Mobile Optical Character Recognition). If you would like to try it out, you can download the 'MobileOCR_2.0' apk file in the source tab. This should work for phones with Android 2.0 and above. Check out the Mobile Accessibility Project at http://mobileaccessibility.cs.washington.edu/ where you can find several other free accessi
EnaOCR - OCR software that allows you to read multiple image files and convert them to ASCII text.
About: EnaOCR is an Optical Character Recognition desktop application that allows you to read multiple image files, convert them to ASCII text, and deploy the result in different formats or to external databases. With this application it is easy to deploy any data to any external database. You need just one click to process hundreds of images in seconds; it is very fast and detects words with strong accuracy. This application was written in Java with the help of the Tesseract engine. The project was develop
Hocr-tools - Tools for manipulating and evaluating the hOCR format for representing multi-lingual OC
About: hOCR is a format for representing OCR output, including layout information, character confidences, bounding boxes, and style information. It embeds this information invisibly in standard HTML. By building on standard HTML, it automatically inherits well-defined support for most scripts, languages, and common layout options. Furthermore, unlike previous OCR formats, the recognized text and OCR-related information co-exist in the same file and survive editing and manipulation. hOCR markup is