Windows, pytesseract, tesseract, python 3, OCR
2017. 10. 25. 22:24
virtualenv py3_ocr
py3_ocr\Scripts\activate
cd py3_ocr
pip install Pillow
pip install pytesseract
pip freeze
=>
Pillow==4.3.0
pytesseract==0.1.7
Installing Tesseract on Windows
https://github.com/tesseract-ocr/tesseract/wiki
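After installing, it may help to confirm where the tesseract executable actually ended up before pointing pytesseract at it. The sketch below is only an illustration: find_tesseract is a hypothetical helper (not part of pytesseract), and the candidate path is the install location used later in this post, which may differ on your machine.

```python
import os
import shutil


def find_tesseract(candidates=()):
    """Hypothetical helper: return a path to a Tesseract executable, or None.

    Checks PATH first, then each candidate path in order.
    """
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


if __name__ == "__main__":
    # Assumed install location; adjust to wherever the installer put it.
    print(find_tesseract([r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"]))
```

If this prints None, Tesseract is neither on PATH nor at the assumed location, and tesseract_cmd has to be set to the real path by hand.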
test.png
test.py
try:
    import Image
except ImportError:
    from PIL import Image
import pytesseract
# pytesseract.pytesseract.tesseract_cmd = '<full_path_to_your_tesseract_executable>'
pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files (x86)\\Tesseract-OCR\\tesseract.exe'
# Include the above line, if you don't have tesseract executable in your PATH
# Example tesseract_cmd: 'C:\\Program Files (x86)\\Tesseract-OCR\\tesseract'
# print(pytesseract.image_to_string(Image.open('test.png')))
# print(pytesseract.image_to_string(Image.open('test.png'), lang='eng'))
image = Image.open('test.png')
tessdata_dir_config = '--tessdata-dir "C:\\Program Files (x86)\\Tesseract-OCR"'
# Example config: '--tessdata-dir "C:\\Program Files (x86)\\Tesseract-OCR\\tessdata"'
# It's important to add double quotes around the dir path.
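# Run OCR on the image and get the recognized text as a string.
Q = pytesseract.image_to_string(image, lang='eng', config=tessdata_dir_config)
# Drop the last two characters and turn "X" into "*" so the text becomes a Python expression.
Q = Q[:-2].replace("X","*")
# eval() executes whatever text it is given, so only use it on images you trust.
print(eval(Q))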
Result
2275
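Since eval() runs any Python it is handed, a stricter way to compute the OCR'd expression may be worth having. The following is a minimal sketch, not what the script above does: safe_arithmetic is a hypothetical helper that accepts only numbers and + - * /, it assumes Python 3.8 or newer (for ast.Constant), and the sample string "25*91" is made up for illustration rather than taken from test.png.

```python
import ast
import operator

# Operators allowed in the expression; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def safe_arithmetic(expression):
    """Hypothetical helper: evaluate + - * / on numeric literals without eval()."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        # Names, calls, attribute access, etc. all end up here.
        raise ValueError("unsupported expression: %r" % expression)

    return _eval(ast.parse(expression.strip(), mode="eval"))


if __name__ == "__main__":
    # Made-up input standing in for the cleaned-up OCR text.
    print(safe_arithmetic("25*91"))
```

In test.py this could replace the last line as print(safe_arithmetic(Q)); text that is not plain arithmetic then raises ValueError (or SyntaxError if it does not parse) instead of being executed.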