Installing and testing Tesseract on CentOS
(1) First, install the Leptonica library, which Tesseract depends on:
yum groupinstall "Development Tools" -y
yum -y install wget cmake
yum -y install libjpeg-devel libpng-devel libtiff-devel zlib-devel
yum -y install gcc gcc-c++ make numpy
wget http://www.leptonica.com/source/leptonica-1.71.tar.gz
tar zxvf leptonica-1.71.tar.gz
cd leptonica-1.71
./configure --prefix=/usr
make
make install
(2) Next, build Tesseract. The version used here is 3.04. The build requires automake and libtool, both of which can be installed directly with yum.
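Before running the build, it can help to confirm that the two tools are actually on the PATH. The following is a minimal sketch, not part of the original steps; check_tools is a hypothetical helper name introduced here for illustration.

```shell
# Hypothetical helper: report which of the named tools are not on PATH.
# Prints "missing: <names>" and returns 1 if any tool is absent.
check_tools() {
    missing=""
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
}
```

A possible use is: check_tools automake libtool || yum -y install automake libtool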
wget https://github.com/tesseract-ocr/tesseract/archive/3.04.00.tar.gz
mv 3.04.00.tar.gz Tesseract3.04.00.tar.gz
tar -xvf Tesseract3.04.00.tar.gz
cd tesseract-3.04.00/
./autogen.sh
./configure
make && make install
(3) Download and install the language data for English, Traditional Chinese, and Simplified Chinese.
wget --no-check-certificate https://github.com/tesseract-ocr/tessdata/raw/master/eng.traineddata
wget --no-check-certificate https://github.com/tesseract-ocr/tessdata/raw/master/chi_sim.traineddata
wget --no-check-certificate https://github.com/tesseract-ocr/tessdata/raw/master/chi_tra.traineddata
cp *.traineddata /usr/local/share/tessdata/
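A failed download can leave an empty or missing file behind, so a quick check of the tessdata directory may save some confusion later. This is a rough sketch under that assumption; missing_langs is a hypothetical helper, and it only tests that each file exists and is non-empty, not that its contents are valid.

```shell
# Hypothetical helper: print each language whose .traineddata file is
# missing or empty in the given directory. Returns 1 if any is missing.
missing_langs() {
    dir="$1"
    shift
    rc=0
    for lang in "$@"; do
        if [ ! -s "$dir/$lang.traineddata" ]; then
            echo "$lang"
            rc=1
        fi
    done
    return "$rc"
}
```

For the files copied above, the call would be: missing_langs /usr/local/share/tessdata eng chi_sim chi_tra
Running tesseract --list-langs should also show which languages the installed binary can see.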
(4) Test the recognition result
wget -c "http://login.sina.com.cn/cgi/pin.php?r=98901353&s=0&p=xd-50ec7fb512ddaeac3cbde407a4499c0c324c" -O test.jpg
tesseract test.jpg ./b -psm 3 -l chi_sim+eng
The recognized text is written to b.txt.
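To run the same recognition over several images, the command above can be wrapped in a small loop. This is only a sketch: ocr_images and the TESSERACT variable are hypothetical names introduced here, and it assumes every image file name has an extension. The -psm spelling matches the 3.04 build used in this guide; Tesseract 4 and later spell the option --psm.

```shell
# Hypothetical helper: OCR each image passed as an argument.
# TESSERACT may be overridden to point at another binary; by default
# the tesseract found on PATH is used.
TESSERACT="${TESSERACT:-tesseract}"

ocr_images() {
    langs="$1"              # e.g. chi_sim+eng
    shift
    for img in "$@"; do
        base="${img%.*}"    # test.jpg -> test, so the text lands in test.txt
        "$TESSERACT" "$img" "$base" -psm 3 -l "$langs" || return 1
        echo "$img -> $base.txt"
    done
}
```

For example: ocr_images chi_sim+eng test.jpg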