InftyReader Ver.3 series
OCR software to recognize scientific documents including mathematical formulas.
Various output formats: XML, LaTeX, MathML, HTML, Word, etc.
Direct conversion of PDF to the formats above including Math.
Recognition of an image on the clipboard, pasting the result into Word (see here).
InftyReader is OCR software that recognizes scientific documents including mathematical formulae, and outputs the recognition results in various file formats: LaTeX, MathML, XHTML, HR-TeX, IML and Microsoft Word document. It is developed in the laboratory of M. Suzuki, Faculty of Mathematics, Kyushu University, in collaboration with several partners.
*InftyReader Ver.220.127.116.11 (Feb. 3, 2019)
[Important] All users of InftyReader on Windows 10 are strongly recommended to update to Ver.18.104.22.168 or later.
What's new in Ver.3.1? ----- Significant improvement of PDF recognition.
All users of the InftyReader Ver.3.0 series can use the Ver.3.1 series at no additional fee.
Personal Use License package:
InftyReaderE3154.zip (English Edition, about 77MB) -------- Feb. 3, 2018
What's new in version 3?
For general information about InftyReader, please read "AboutInftyReaderE.txt" here.
In version 22.214.171.124, the one-year license package is the same as the normal license package, so there is no separate one-year license package this time. (The difference in the validity period is determined by the serial number.)
Enterprise License package:
InftyReaderE3154_Enterprise.zip (English Edition, about 77MB) -------- Feb. 3, 2019
What's the difference from the personal use edition? Please read: "About InftyReader Enterprise".
License Update. If you purchased InftyReader in 2013/2014, you can use InftyReader Ver.3 with the serial number (and license key) you already have. If you purchased InftyReader in 2012 or before and wish to use the new InftyReader Ver.3 series, you need to get a new serial number for the Ver.3 series.
For users holding a normal license of an old version purchased between 2010 and 2012, there are discount prices for a new serial number for the InftyReader Ver.3 series. The price depends on the year in which you purchased the license. For more details, please see here.
Trial Use. To use InftyReader in the Trial Mode, please see: AboutTrialUse.txt
InftyReader Ver.126.96.36.199 (Nov. 22, 2013)
Below is the final version of the Ver.2.9 series. In case you need to (re-)activate InftyReader Ver.2 series, please use this version.
InftyReaderE2972.zip (English Edition, about 48MB) -------- Nov. 22, 2013
Document for blind users.
Below is the Introduction to InftyReader for blind users given by Prof. John Gardner (Oregon State University & ViewPlus Technology) at the ICCHP Summer University 2011.
Introduction to InftyReader by Prof. John Gardner.
* Comments about output formats
- IML is the default XML file format of the editor "InftyEditor", an authoring tool for math documents developed by InftyProject. InftyEditor provides a very easy user interface to input and edit math expressions together with ordinary text.
The English edition of InftyEditor is free software. Please see the InftyEditor site.
- In XHTML format, mathematical expressions are output using MathML notation.
- HR-TeX is a simplified LaTeX-like notation that is easier "to read", specially designed for blind users.
Using InftyEditor, users can correct and edit the recognition results of InftyReader while comparing them with the original images, and convert the results into various formats: LaTeX, PDF, XHTML with MathML, etc.
Please note that InftyReader recognizes only <<Black and White>>, <<Binary>> images carefully scanned at either 600 dpi or 400 dpi. Please be aware that the program fails to run if the input image contains grayscale or color areas, even in part.
Image files have to be prepared in TIFF, PNG, or BMP format. InftyReader also recognizes PDF: it first converts the input PDF to PNG files and then recognizes the converted image files.
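As a rough pre-check before feeding a PNG file to InftyReader, the sketch below reads the PNG header to see whether the image is 1-bit black and white and whether its recorded resolution is 400 or 600 dpi. It is not part of InftyReader; the function names `png_scan_info` and `is_suitable` are hypothetical, and the rule "bit depth 1, grayscale color type" is only an assumption about what counts as a binary image here. Only the standard PNG chunks IHDR and pHYs are read.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def iter_png_chunks(data):
    """Yield (chunk_type, chunk_body) pairs from raw PNG bytes."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos + 8 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        yield ctype, data[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 length + 4 type + body + 4 CRC


def png_scan_info(data):
    """Return bit depth, color type and dpi (None if not recorded)."""
    info = {"bit_depth": None, "color_type": None, "dpi": None}
    for ctype, body in iter_png_chunks(data):
        if ctype == b"IHDR":
            _w, _h, depth, color = struct.unpack(">IIBB", body[:10])
            info["bit_depth"], info["color_type"] = depth, color
        elif ctype == b"pHYs":
            x_ppu, _y_ppu, unit = struct.unpack(">IIB", body)
            if unit == 1:  # unit 1 means pixels per metre
                info["dpi"] = round(x_ppu * 0.0254)
        elif ctype == b"IDAT":
            break  # image data starts; header chunks are done
    return info


def is_suitable(info):
    """Assumed rule: 1-bit grayscale PNG recorded at 400 or 600 dpi."""
    is_binary = info["bit_depth"] == 1 and info["color_type"] == 0
    return is_binary and info["dpi"] in (400, 600)
```

For a file on disk, the bytes can be obtained with `open("page.png", "rb").read()` (the file name is a placeholder). A PNG that lacks the pHYs chunk reports `dpi` as `None` and is treated as unsuitable, which may be stricter than necessary.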
Here are some features of InftyReader since Ver. 2.8:
- It uses the OCR engines of Toshiba Corporation, "ExpressReaderPro", and of MediaDrive Corporation, "WinReader", simultaneously to improve the recognition results of characters in ordinary text areas. (For the characters and math symbols in formulae, it uses Infty's OCR.)
- It can recognize tables including math expressions in the cells (provided the ruled lines are not broken).
- It can convert PDF files into LaTeX or XHTML (MathML) including mathematical expressions, except for PDF files that include color or gray images. (Note that InftyReader can process only black and white binary images.) It recognizes the page images of PDF files, referring to the text information embedded in the PDF.
Attention: The original PDF should be of high resolution, equivalent to images scanned at 600 dpi. Sometimes PDF files found on the web have a low resolution, at the level of 200 dpi images, in order to reduce the file size. In such cases, the recognition results will be of very low quality, almost useless!
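To judge whether a scanned PDF reaches the resolution mentioned above, the effective dpi of a page image can be estimated from its pixel width and the page width. The sketch below is only an illustration of that arithmetic; the function names are hypothetical, the pixel width of the embedded page image is assumed to be already known, and the default minimum of 400 dpi is an assumption taken from the scanning guidance on this page.

```python
POINTS_PER_INCH = 72  # PDF page sizes are expressed in points, 72 per inch


def effective_dpi(image_width_px, page_width_pt):
    """Horizontal resolution of a page image in dots per inch."""
    page_width_in = page_width_pt / POINTS_PER_INCH
    return image_width_px / page_width_in


def resolution_ok(image_width_px, page_width_pt, minimum_dpi=400):
    """True if the page image reaches the assumed minimum resolution."""
    return effective_dpi(image_width_px, page_width_pt) >= minimum_dpi


# A US Letter page is 612 pt (8.5 inches) wide.
print(effective_dpi(5100, 612))   # 600.0
print(effective_dpi(1700, 612))   # 200.0
print(resolution_ok(1700, 612))   # False
```

A page image 1700 pixels wide on a Letter page therefore corresponds to the 200 dpi case described above, well below the recommended level.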
* Caution ---- Important!
- Source documents have to be clearly printed.
- They should be scanned at 600 dpi (or 400 dpi). Usually, binary images are better for recognition than color images.
- InftyReader removes small noise, automatically segments page images into picture areas, table areas and text areas, and then recognizes the text/table areas including mathematical expressions. However, to get better recognition results, users are <<recommended>> to erase noise and pictures before recognition.
- In scanning, it is important to adjust the binarization threshold of the scanner so that the number of touching or broken characters is less than 1% of the total number of characters in each scanned page image.
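The threshold guidance in the last item can be illustrated with a small sketch. It shows a plain fixed-threshold binarization of gray values and the "less than 1%" check on touching or broken characters. This is not how a scanner or InftyReader is implemented; the function names are hypothetical, and the count of defective characters is assumed to come from a manual inspection of the scanned page.

```python
def binarize(gray_rows, threshold):
    """Map gray values (0 = black, 255 = white) to 1 (ink) or 0 (paper)."""
    return [[1 if value < threshold else 0 for value in row]
            for row in gray_rows]


def defect_rate(defective_chars, total_chars):
    """Fraction of characters on a page that are touching or broken."""
    if total_chars <= 0:
        raise ValueError("total_chars must be positive")
    return defective_chars / total_chars


def threshold_acceptable(defective_chars, total_chars, limit=0.01):
    """True if fewer than 1% of the characters are touching or broken."""
    return defect_rate(defective_chars, total_chars) < limit


gray = [[250, 40, 245],
        [30, 128, 220]]
print(binarize(gray, 128))             # [[0, 1, 0], [1, 0, 0]]
print(threshold_acceptable(12, 2000))  # True  (0.6%)
print(threshold_acceptable(25, 2000))  # False (1.25%)
```

With this convention, raising the threshold turns more pixels black, which tends to make neighbouring characters touch, while lowering it tends to break thin strokes; the threshold is adjusted until the defect rate falls below the limit.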
* Operating Environment
InftyReader runs on Windows 10, Windows 8.1, Windows 8, Windows 7, Vista, and XP, on a PC with at least 1GB of free memory available for applications.
Note that it does not run on Windows 98, Me, or 2000.
* How to use InftyReader?
- Select file(s) or folder.
- Input/select the output document name.
- Press the "Start" button.
Then the recognition results of the selected image files are saved into the file you specified as the "output document name". When you select a folder instead of files, all the image files in the folder of the specified file type (TIF/GIF/PNG/BMP/PDF) are recognized and the results are output into file(s) named after the folder(s).
If you check the "Search Sub Folders" item under the "Option" menu, InftyReader recognizes all the image files in the subfolders of the selected folder. For example, if you select the folder "foldertop" having the subfolders
foldertop
|-- subfolder1
| |-- a.tif
| |-- b.tif
|-- subfolder2
| |-- c.tif
| |-- d.tif
and if you select "IML" as the output file type, then you will get the files "subfolder1.iml" and "subfolder2.iml" in the folder "foldertop". The recognition results of a.tif and b.tif (resp. c.tif and d.tif) are saved in the file subfolder1.iml (resp. subfolder2.iml).
If you select LaTeX as the output file type, you will get "subfolder1.tex" and "subfolder2.tex", and similarly for the other file types, HR-TeX and XHTML.
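The naming rule for the "Search Sub Folders" option can be sketched as follows. This only mimics the behaviour described above and is not InftyReader's own code: the function name `plan_outputs` is hypothetical, the input is assumed to be a list of file paths relative to the selected folder, and files nested deeper than one subfolder are simply ignored here.

```python
from pathlib import PurePosixPath

# File types listed in the text above (TIF/GIF/PNG/BMP/PDF).
INPUT_SUFFIXES = {".tif", ".gif", ".png", ".bmp", ".pdf"}


def plan_outputs(relative_paths, out_suffix):
    """Group image files by subfolder and name one output file per subfolder."""
    plan = {}
    for raw in sorted(relative_paths):
        path = PurePosixPath(raw)
        # Keep only recognizable files that sit directly inside a subfolder.
        if len(path.parts) != 2 or path.suffix.lower() not in INPUT_SUFFIXES:
            continue
        output_name = path.parent.name + out_suffix
        plan.setdefault(output_name, []).append(path.name)
    return plan


files = [
    "subfolder1/a.tif", "subfolder1/b.tif",
    "subfolder2/c.tif", "subfolder2/d.tif",
    "subfolder1/notes.txt",  # not an image type, so it is skipped
]
print(plan_outputs(files, ".iml"))
# {'subfolder1.iml': ['a.tif', 'b.tif'], 'subfolder2.iml': ['c.tif', 'd.tif']}
```

Passing ".tex" instead of ".iml" gives the LaTeX file names from the example, "subfolder1.tex" and "subfolder2.tex".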
To use InftyReader, please get a license key from sAccessNet -> click here.
As for the trial use, please see: AboutTrialUse.txt
InftyReader is usable under the following license agreement.
(1) You may not modify the software in any manner. You may not reverse engineer, decompile or disassemble the software.
(2) You may not sell the software without making a formal agreement with Science Accessibility Net.
You may distribute the software only free of charge, without modifying the zip-package of the software.
(3) The author shall have no obligation to correct errors and inconveniences of the software.
(4) The author shall not be responsible for any loss or damage caused by the use of the software.
(5) The license is basically limited to personal use, including the case where it is purchased by an institution for a specified user. Shared use by the members of a small group is also allowed. In the default setting, the number of pages recognizable under this license is limited to 10000 pages per month. If an institution uses the software to serve a number of clients or to digitize a huge number of volumes, please use the enterprise version; see the page here. For more details, please contact us.
Any reports about the software are welcome.
Non Profit Organization
Science Accessibility Net (sAccessNet)
e-mail: support1"at"mail.sciaccess.net (Please replace "at" by @.)