author Pierre Cazenave <pwcazenave at gmail {dot} com> 2010-07-30 02:46:02 -0500
committer Erik Hanson <erik@slackbuilds.org> 2010-07-31 22:31:51 -0500
commit e502945912c3ccc6d55a6819bc921cf5f47cc4fd
tree 5c008c3bf671002bdedefa3f8a7fde59d88bb113 /graphics/ocropus/ocroscript.1
parent 43cc5518b42cbafb28111cfa607ce8a50e64bb6a
graphics/ocropus: Added (document analysis and OCR system)
Signed-off-by: Robby Workman <rworkman@slackbuilds.org>
Diffstat (limited to 'graphics/ocropus/ocroscript.1')
-rw-r--r-- graphics/ocropus/ocroscript.1 43
1 files changed, 43 insertions, 0 deletions
diff --git a/graphics/ocropus/ocroscript.1 b/graphics/ocropus/ocroscript.1
new file mode 100644
index 0000000000..d8087203f7
--- /dev/null
+++ b/graphics/ocropus/ocroscript.1
@@ -0,0 +1,43 @@
+.TH ocroscript 1 "June 06, 2008"
+.SH NAME
+ocropus \- command line OCR tool
+.SH SYNOPSIS
+.B ocroscript
+.RI "<script> <arguments>"
+.SH DESCRIPTION
+You can see a list of all available commands by looking in the $OCROSCRIPTS
+(/usr/share/ocropus/scripts/ by default) path.
+.PP
+The \(oqrecognize\(cq script uses tesseract for recognition and sends the html-based hOCR
+output to stdout. Tesseract is probably the most mature text recognizer within
+OCRopus at the moment. Natively, Tesseract doesn't do layout analysis, but
+combined with OCRopus, it makes for a pretty good OCR system:
+.RS
+$ ocroscript recognize page.png > page.html
+.RE
+.PP
+Here is a brief summary of the remaining command line commands available.
+You will need to look at the script to see what the command line arguments are:
+.TP
+degrade.lua
+Simple document image degradation
+.TP
+hocr-to-text.lua
+Convert hOCR output to plain text.
+.TP
+line-clean.lua
+Given a line image, remove marginal noise and fix some other problems.
+.TP
+sauvola.lua
+Perform Sauvola thresholding.
+.SH SEE ALSO
+.BR tesseract (1),
+.br
+.PP
+.UR http://code.google.com/p/ocropus/w/list
+.UE
+.SH AUTHOR
+ocroscript was written by Thomas Breuel.
+.PP
+This manual page was written by Jeffrey Ratcliffe <Jeffrey.Ratcliffe@gmail.com>,
+for the Debian project (but may be used by others).