diff --git a/README.en.md b/README.en.md
new file mode 100644
index 0000000000000000000000000000000000000000..e12678cba04ed6f3336411f35eac2b8f1c4a3e32
--- /dev/null
+++ b/README.en.md
@@ -0,0 +1,76 @@
+# Text Autofill Using Photo Recognition
+
+## Overview
+
+This sample demonstrates how to use the general text recognition capability provided by @kit.CoreVisionKit: capture printed text (such as delivery information) as an image through CameraPicker or PhotoViewPicker, convert the image into text characters that the device can use through text recognition, and extract structured data based on actual service rules.
+
+## Effect
+
+| Home Page | Photo Capture | Recognition | Saving |
+|-------------------------------------------------------|---------------------------------------------------------------|-----------------------------------------------------------|-------------------------------------------------|
+| ![Home Page](screenshots/device/1_en.png "Home Page") | ![Photo Capture](screenshots/device/2_en.png "Photo Capture") | ![Recognition](screenshots/device/3_en.png "Recognition") | ![Saving](screenshots/device/4_en.png "Saving") |
+
+## How to Use
+1. Tap the Recognition button to open a dialog for selecting an image source. Choose the photo mode to capture the text you want to recognize, or select the album to pick the target image directly from your Gallery.
+2. Once the text in the image is recognized, the text content is automatically populated into the text box.
+3. Tap the Recognition button again. The information in the text box is extracted as structured data and displayed in the list below the button.
+4. Tap the Saving address button. A message indicates that the address was saved successfully, and the previous content in the text box is automatically cleared.
+
+## Project Directory
+
+```
+entry/src/main/
+├──ets
+|  ├──common
+|  |  ├──constants
+|  |  |  └──CommonConstants.ets          // Common constant class
+|  |  └──utils
+|  |     ├──AddressParse.ets             // Delivery information parsing class
+|  |     ├──OCRManager.ets               // Visual recognition class
+|  |     ├──PromptActionManager.ets      // Dialog management class
+|  |     └──Logger.ets                   // Log class
+|  ├──entryability
+|  |  └──EntryAbility.ets                // Entry Ability
+|  ├──viewmodel
+|  |  └──DataModel.ets                   // UI model class
+|  ├──views
+|  |  ├──ConsigneeInfoItem.ets           // List item UI
+|  |  └──DialogBuilder.ets               // Dialog UI
+|  └──pages
+|     └──Index.ets                       // Home page
+└──resources                             // Resources
+```
+
+## How to Implement
+
+You can implement the general text recognition capability by calling the textRecognition.recognizeText() method of @kit.CoreVisionKit.
+
+* Photo capture: call CameraPicker.pick() of @kit.CameraKit to take photos. The system provides the interactive UI, and no camera permissions are required.
+* Album: call the select() method of the photoAccessHelper.PhotoViewPicker object of @kit.MediaLibraryKit to open the Gallery and select an image.
+* Visual text recognition: call the textRecognition.recognizeText() method of @kit.CoreVisionKit to recognize image information.
+* Structured data extraction: extract data using regular expressions. This sample only performs simple extraction of common delivery information. To cover complex scenarios and achieve accurate extraction, consider using professional cloud services or NLP tools.
+
+## Required Permissions
+
+N/A
+
+## Constraints
+
+1. This sample is only supported on Huawei phones running standard systems.
+2. Core Vision Kit is available only in the Chinese mainland (excluding Hong Kong (China), Macao (China), and Taiwan (China)).
+3. The HarmonyOS version must be HarmonyOS 5.1.1 Release or later.
+4. The DevEco Studio version must be DevEco Studio 5.1.1 Release or later.
+5. The HarmonyOS SDK version must be HarmonyOS 5.1.1 Release SDK or later.
+6. Supported formats include JPG, JPEG, and PNG.
+7. Supported languages are simplified Chinese, English, Japanese, Korean, and traditional Chinese.
+8. The text length should not exceed 10,000 characters.
+9. Printed text is supported; recognition of handwritten text is limited.
+10. The input image should have suitable imaging quality (720p or higher recommended), with a height between 100 px and 15,210 px, a width between 100 px and 10,000 px, and a height-to-width ratio preferably below 10:1 (height less than 10 times the width), ideally close to the aspect ratio of a smartphone screen.
+11. The shooting angle should be less than 30 degrees from the vertical direction of the plane where the text is located.
+12. It is recommended that you use the following standard delivery information formats for recognition (replace \* with numbers):
+
+    1. Recipient: Zhao Liu Contact: 13\*\*\*\*\*\*\*\*\* Address: \*\*Floor, Chengjian Building, No. \*\*\*\*\*, Tiyuxi Road, Tianhe District, Guangzhou
+    2. Mr. Zhang (13\*-\*\*\*\*-\*\*\*\*) Address: Room \*\*\*\*, Building \*\*, No. \*\*\* Jianguo Road, Chaoyang District, Beijing
+    3. Recipient: Mr. Wang, Contact: 010-\*\*\*\*\*\*\*\*, Address: Room \*\*\*, Unit \*\*, Building \*\*, No. \*\*, South Street, Zhongguancun, Haidian District
+    4. Lucky Tea House, No. \*\*\*\*, Panshan Road, Xiangzhou District, Zhuhai City, Guangdong Province, Mr. Chen, 13\*\*\*\*\*\*\*\*\*
+ 
\ No newline at end of file
diff --git a/README.md b/README.md
index 4479cc8f0759a2dea4412d64b7d9091f65312927..9c52650d71d858233a19abe9afe64bd849e1ddb3 100644
--- a/README.md
+++ b/README.md
@@ -48,7 +48,7 @@ entry/src/main/
 * 拍照:通过调用`@kit.CameraKit`的CameraPicker.pick()拍摄照片。交互界面由系统提供,无需申请相机权限。
 * 相册:通过调用`@kit.MediaLibraryKit`的photoAccessHelper.PhotoViewPicker对象的select()方法拉起图库,选择图片。
 * 视觉文字识别:通过调用`@kit.CoreVisionKit`的textRecognition.recognizeText()方法对图像信息进行识别。
-* 结构化数据提取:通过正则处理和提取格式化数据。示例中只是对常见的收货信息进行简单提取。如需覆盖全面复杂的场景,做到精确提取,可以考虑使用专业的云服务或NLP工具。
+* 结构化数据提取:通过正则处理和提取数据。示例中只是对常见的收货信息进行简单提取。如需覆盖复杂的场景,做到精确提取,可以考虑使用专业的云服务或NLP工具。
 
 ## 相关权限
 
@@ -67,9 +67,9 @@ entry/src/main/
 9. 支持文档印刷体识别,在识别手写字体方面能力有所欠缺。
 10. 输入图像具有合适成像的质量(建议720p以上),100px<高度<15210px,100px<宽度<10000px,高宽比例建议10:1以下(高度小于宽度的10倍),接近手机屏幕高宽比例为宜。
 11. 拍摄角度与文本所在平面垂直方向的夹角应小于30度。
-12. 推荐使用以下常规的收货信息进行拍照识别:
+12. 推荐使用以下常规的收货信息进行拍照识别(使用时替换\*号为数字):
 
-    1. 收货人:赵六 联系方式:13800000000 收货地址:广州市天河区体育西路18999号城建大厦66层
-    2. 张先生 (138-0000-0000) 收货地址: 北京市朝阳区建国路888号院5号楼3203室
-    3. 收件人:王工,联系电话:010-88881234,地址:海淀区中关村南大街55号院33号楼44单元302室
-    4. 广东省珠海市香洲区盘山路2688号幸运茶馆,陈先生,13500000000
\ No newline at end of file
+    1. 收货人:赵六 联系方式:13\*\*\*\*\*\*\*\*\* 收货地址:广州市天河区体育西路\*\*\*\*\*号城建大厦\*\*层
+    2. 张先生 (13\*-\*\*\*\*-\*\*\*\*) 收货地址: 北京市朝阳区建国路\*\*\*号院\*\*号楼\*\*\*\*室
+    3. 收件人:王工,联系电话:010-\*\*\*\*\*\*\*\*,地址:海淀区中关村南大街\*\*号院\*\*号楼\*\*单元\*\*\*室
+    4. 广东省珠海市香洲区盘山路\*\*\*\*号幸运茶馆,陈先生,13\*\*\*\*\*\*\*\*\*
\ No newline at end of file
diff --git a/screenshots/device/1_en.png b/screenshots/device/1_en.png
new file mode 100644
index 0000000000000000000000000000000000000000..e2c4e1133fe30f9b0fbbb5ce1f481ba2073696a6
Binary files /dev/null and b/screenshots/device/1_en.png differ
diff --git a/screenshots/device/2_en.png b/screenshots/device/2_en.png
new file mode 100644
index 0000000000000000000000000000000000000000..b1b1d3d25d088a90b0c2433bdd0b7f3959b07a85
Binary files /dev/null and b/screenshots/device/2_en.png differ
diff --git a/screenshots/device/3_en.png b/screenshots/device/3_en.png
new file mode 100644
index 0000000000000000000000000000000000000000..831482ce53ea97b44197036c12c9bd367c0be417
Binary files /dev/null and b/screenshots/device/3_en.png differ
diff --git a/screenshots/device/4_en.png b/screenshots/device/4_en.png
new file mode 100644
index 0000000000000000000000000000000000000000..1208aed9c2567a461abef452ad6da133e185147a
Binary files /dev/null and b/screenshots/device/4_en.png differ
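
The README's "How to Implement" section says structured data is extracted with regular expressions. The following is a minimal TypeScript sketch of that idea, not the code in the sample's AddressParse.ets (which this patch does not show). The `ConsigneeInfo` interface, the `fieldAfter` and `parseConsignee` functions, the label set, and the phone pattern are all hypothetical names and assumptions introduced for illustration, and the sketch only handles the labeled "Recipient / Contact / Address" layout from the recommended formats.

```typescript
// Hypothetical result shape; not taken from the sample's DataModel.ets.
interface ConsigneeInfo {
  name: string;
  phone: string;
  address: string;
}

// Assumed label set, based on the labeled formats listed in the README.
const LABELS: string[] = ["Recipient", "Contact", "Address"];

// Assumed phone shapes: an 11-digit mobile number starting with 1 (hyphens optional),
// or a landline such as 010-88881234.
const PHONE_PATTERN = /1\d{2}-?\d{4}-?\d{4}|0\d{2,3}-\d{7,8}/;

// Returns the text that follows `label:` up to the next known label or the end of input.
function fieldAfter(label: string, text: string): string {
  const others = LABELS.filter((l) => l !== label).join("|");
  const pattern = new RegExp(
    `${label}\\s*:\\s*(.*?)(?=[,\\s]*(?:${others})\\s*:|$)`,
    "i"
  );
  return text.match(pattern)?.[1]?.trim() ?? "";
}

function parseConsignee(recognizedText: string): ConsigneeInfo {
  // OCR output may span several lines; collapse all whitespace runs to single spaces first.
  const text = recognizedText.replace(/\s+/g, " ").trim();
  return {
    name: fieldAfter("Recipient", text),
    phone: text.match(PHONE_PATTERN)?.[0] ?? "",
    address: fieldAfter("Address", text),
  };
}

const sample = parseConsignee(
  "Recipient: Mr. Wang, Contact: 010-88881234,\nAddress: Room 302, Haidian District"
);
console.log(sample);
```

Unlabeled layouts such as "Mr. Zhang (138-0000-0000) Address: ..." would leave `name` empty here; covering them needs extra rules, which is in line with the README's note that complex scenarios are better served by cloud services or NLP tools.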