Add the dependency:
<dependency>
    <groupId>org.bytedeco</groupId>
    <artifactId>javacv-platform</artifactId>
    <version>1.5.5</version>
</dependency>
Add the Simplified Chinese trained data, chi_sim, from the tessdata repository:
tesseract-ocr/tessdata (trained models with fast variant of the "best" LSTM models + legacy models): https://github.com/tesseract-ocr/tessdata

Code example:
import org.bytedeco.javacpp.BytePointer;
import org.bytedeco.leptonica.PIX;
import org.bytedeco.leptonica.global.lept;
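import org.bytedeco.tesseract.TessBaseAPI;

public class JavaCVOcr {

    // Runs OCR on the image at imagePath using the language model lng located under dataPath.
    public static String OCR(String lng, String dataPath, String imagePath) {
        TessBaseAPI api = new TessBaseAPI();
        // Init returns non-zero on failure; stop here rather than use an uninitialized engine.
        if (api.Init(dataPath, lng) != 0) {
            System.out.println("error");
            return "";
        }
        PIX image = lept.pixRead(imagePath);
        if (image == null) {
            api.End(); // release the engine before returning early
            return "";
        }
        api.SetImage(image);
        BytePointer outText = api.GetUTF8Text();
        String result = outText.getString();
        // Release native resources: the engine, the text buffer, and the image.
        api.End();
        outText.deallocate();
        lept.pixDestroy(image);
        return result;
    }

    public static void main(String[] args) {
        // The working directory is passed as the data path for the trained data.
        String property = System.getProperty("user.dir");
        String text = OCR("chi_sim", property, "C:\\Users\\Desktop\\1693147958548.png");
        System.out.println(text);
    }
}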
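
Since main passes the working directory as the data path, a missing model file is a likely reason for Init to fail. The following is a minimal sketch of a check that could run before calling OCR. It assumes Tesseract 4.x behaviour, where the data path is the directory that directly holds the .traineddata files. The class name TessdataCheck and the method hasTrainedData are hypothetical helpers, not part of JavaCV or Tesseract.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of JavaCV: checks that a language model file is present.
final class TessdataCheck {

    // Returns true if <dataPath>/<lang>.traineddata exists as a regular file.
    // Assumption: dataPath is the directory that directly contains the model files.
    static boolean hasTrainedData(String dataPath, String lang) {
        Path model = Paths.get(dataPath, lang + ".traineddata");
        return Files.isRegularFile(model);
    }
}
```

For example, calling TessdataCheck.hasTrainedData(System.getProperty("user.dir"), "chi_sim") before OCR makes it possible to report a missing chi_sim.traineddata explicitly instead of only printing "error".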