1. Optical Character Recognition
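    1. Scan text documents, then analyze and process the resulting image files to extract the text and layout information
  2. Neural network: pattern recognition (tutorial)
  3. Implementation in the Android source tree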
    1. external/tesseract/*
    2. Build: $ cd external/tesseract/ then $ mm. This produces libocr.so (liblept, libtess, libjpeg). Push it to /system/lib/ on the device, or bundle it in the app's install package
    3. doOcr
      1. /**
          * Run OCR on an image.
          *
          * @param bitmap   the image to recognize
          * @param language the recognition language
          * @return the recognized text
          */
         public String doOcr(Bitmap bitmap, String language) {
             TessBaseAPI baseApi = new TessBaseAPI();
             // Initialize with the path to the OCR language data: getSDPath() = "/mnt/sdcard"
             baseApi.init(getSDPath(), language);
             // baseApi.init(".", language);
             // This line is required: tess-two needs the bitmap in this configuration
             bitmap = bitmap.copy(Bitmap.Config.ARGB_8888, true);
             baseApi.setImage(bitmap);
             String text = baseApi.getUTF8Text();
             baseApi.clear();
             baseApi.end();
             return text;
         }
    4. getUTF8Text
      1. /**
          * The recognized text is returned as a String which is coded as UTF8.
          *
          * @return the recognized text
          */
         public String getUTF8Text() {
             // Trim because the text will have extra line breaks at the end
             String text = nativeGetUTF8Text();
             return text.trim();
         }
    5. Native methods
      1. private native String nativeGetUTF8Text();
      2. public class TessBaseAPI {
             /**
              * Used by the native implementation of the class.
              */
             private int mNativeData;

             // Load the native libraries before any native method is called
             static {
                 System.loadLibrary("lept");
                 System.loadLibrary("tess");
                 nativeClassInit();
             }
             // ...
         }
  4. Algorithm
    1. Segmentation, normalization, feature extraction, comparison against the database, result output
    2. Recognition rate
      1. a search strategy
      2. classification engine
    3. Feature extraction
      1. characterized by
      2. having a large set of symbols
    4. Matching capability
    5. Matching algorithms
      1. Template matching
      2. Artificial neural network training
      3. Structural analysis, feature statistics
      4. Training data
    6. http://code.google.com/p/tesseract-ocr/wiki/Documentation
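The steps listed under Algorithm (segmentation, comparison against stored data, result output) can be sketched with plain template matching. This is a toy illustration under stated assumptions, not how Tesseract works: the class `ToyOcr`, the 3x5 templates, and the `#`/`.` pixel encoding are all hypothetical, and normalization is skipped by assuming every glyph already has the template size.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ToyOcr {
    // Hypothetical 3x5 templates: '#' is ink, '.' is background.
    static final Map<Character, String[]> TEMPLATES = new LinkedHashMap<>();
    static {
        TEMPLATES.put('0', new String[] {"###", "#.#", "#.#", "#.#", "###"});
        TEMPLATES.put('1', new String[] {".#.", "##.", ".#.", ".#.", "###"});
        TEMPLATES.put('7', new String[] {"###", "..#", ".#.", ".#.", ".#."});
    }

    // Segmentation: split one text line into glyphs at columns that contain no ink.
    static List<String[]> segment(String[] line) {
        List<String[]> glyphs = new ArrayList<>();
        int width = line[0].length();
        int start = -1; // first column of the glyph being scanned, or -1 if none
        for (int x = 0; x <= width; x++) {
            boolean ink = false;
            if (x < width) {
                for (String row : line) {
                    if (row.charAt(x) == '#') { ink = true; break; }
                }
            }
            if (ink && start < 0) {
                start = x;
            } else if (!ink && start >= 0) {
                String[] glyph = new String[line.length];
                for (int y = 0; y < line.length; y++) {
                    glyph[y] = line[y].substring(start, x);
                }
                glyphs.add(glyph);
                start = -1;
            }
        }
        return glyphs;
    }

    // Comparison: Hamming distance, i.e. the number of differing pixels.
    static int distance(String[] a, String[] b) {
        if (a.length != b.length || a[0].length() != b[0].length()) {
            return Integer.MAX_VALUE; // sizes differ; a real system would normalize first
        }
        int d = 0;
        for (int y = 0; y < a.length; y++) {
            for (int x = 0; x < a[y].length(); x++) {
                if (a[y].charAt(x) != b[y].charAt(x)) d++;
            }
        }
        return d;
    }

    // Classification: pick the template with the smallest distance, '?' if none fits.
    static char classify(String[] glyph) {
        char best = '?';
        int bestDistance = Integer.MAX_VALUE;
        for (Map.Entry<Character, String[]> e : TEMPLATES.entrySet()) {
            int d = distance(glyph, e.getValue());
            if (d < bestDistance) { bestDistance = d; best = e.getKey(); }
        }
        return best;
    }

    // Result output: recognize every glyph in the line, left to right.
    static String recognize(String[] line) {
        StringBuilder sb = new StringBuilder();
        for (String[] glyph : segment(line)) sb.append(classify(glyph));
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] line = {
            ".#..###.###",
            "##....#.#.#",
            ".#...#..#.#",
            ".#...#..#.#",
            "###..#..###",
        };
        System.out.println(recognize(line));
    }
}
```

Because the nearest template wins, a glyph with a few flipped pixels is still classified correctly as long as it stays closer to its own template than to any other. Real engines replace the raw pixel grid with extracted features and the fixed templates with trained data.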
  5. Applications
    1. http://www.i2ocr.com/
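    2. Automatic letter-sorting systems based on postal code recognition
    3. Hanwang in China; abroad: Toshiba, IBM, HP, NEC
  6. Light, color
  7. Human cognition / AI / pattern recognition
  8. Holographic law of the universe: everything is a mapping
  9. 0/1 information technology: storage (Si disc), computation (Si CPU), display (LCD). The Tao begets one, one begets two, two begets three, three begets all things.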
  10. Computer graphics: image binarization algorithm (Nobuyuki Otsu, 1979)
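A minimal sketch of Otsu-style binarization, assuming the image is given as a flat array of 8-bit grayscale values. The class and method names (`OtsuThreshold`, `otsu`, `binarize`) are illustrative only. The threshold is chosen to maximize the between-class variance of the two pixel groups it creates.

```java
class OtsuThreshold {
    // Returns the threshold t in [0, 255] that maximizes between-class variance.
    // Pixels with value <= t form one class, pixels with value > t the other.
    static int otsu(int[] gray) {
        int[] hist = new int[256];
        for (int v : gray) hist[v]++;

        long total = gray.length;
        long sumAll = 0;
        for (int i = 0; i < 256; i++) sumAll += (long) i * hist[i];

        long w0 = 0;    // number of pixels in the lower class
        long sum0 = 0;  // sum of gray values in the lower class
        double bestVariance = -1.0;
        int best = 0;
        for (int t = 0; t < 256; t++) {
            w0 += hist[t];
            sum0 += (long) t * hist[t];
            if (w0 == 0) continue;   // lower class still empty
            long w1 = total - w0;
            if (w1 == 0) break;      // upper class empty from here on
            double mean0 = (double) sum0 / w0;
            double mean1 = (double) (sumAll - sum0) / w1;
            // Between-class variance, up to a constant factor of 1 / total^2.
            double variance = (double) w0 * w1 * (mean0 - mean1) * (mean0 - mean1);
            if (variance > bestVariance) {
                bestVariance = variance;
                best = t;
            }
        }
        return best;
    }

    // Maps each pixel to true (above the threshold) or false (at or below it).
    static boolean[] binarize(int[] gray, int threshold) {
        boolean[] out = new boolean[gray.length];
        for (int i = 0; i < gray.length; i++) out[i] = gray[i] > threshold;
        return out;
    }

    public static void main(String[] args) {
        int[] gray = {10, 12, 11, 200, 205, 198, 10, 202};
        int t = otsu(gray);
        System.out.println("threshold = " + t);
        System.out.println(java.util.Arrays.toString(binarize(gray, t)));
    }
}
```

For input with two well-separated brightness clusters, as in `main`, the chosen threshold falls between the clusters, so dark ink and light paper end up in different classes before segmentation.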
  11. QR Code (2D barcode)
  12. DC