# ChineseStrokes

**Repository Path**: ouyangpengdev/ChineseStrokes

## Basic Information

- **Project Name**: ChineseStrokes
- **Description**: Number of strokes for every Chinese character. 81k+ characters included currently. You can use this to sort Chinese characters by number of strokes. 每个中文的笔画数。收录了八万一千多个中文字。可用作按笔画排序。
- **Primary Language**: Go
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2021-03-23
- **Last Updated**: 2021-03-23

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# ChineseStrokes

Number of strokes for every Chinese character. More than 81,000 characters are currently included. You can use this data to sort Chinese characters by number of strokes.

每个中文的笔画数。收录了八万一千多个中文字。可用作按笔画排序。

Source (数据来源):

- zidian.911cha.com

## Usage

You can retrieve the stroke data manually:

```
# specify a code range
ChineseStrokes 4e00 4e0a

# specify a file
seq 19968 19978 | awk '{printf "%x\n", $1}' > codes
ChineseStrokes codes
```

Or you can use the existing data under `data/`.

To generate Go code:

```
grep -h -E '^[0-9]+ [0-9]+$' data/*.txt | sort -n | awk '
BEGIN{print "func getStrokes(charCode int) int {\n\tswitch charCode {"}
{printf "\tcase %s:\n\t\treturn %s\n", $1, $2}
END{print "\t}\n\treturn 0\n}"}' > code.go
```

To generate a Ruby hash:

```
grep -h -E '^[0-9]+ [0-9]+$' data/*.txt | sort -n | awk '
BEGIN{print "charCodes = {"}
{printf " %s => %s,\n", $1, $2}
END{print "}"}' > code.rb
```

To generate JSON:

```
grep -h -E '^[0-9]+ [0-9]+$' data/*.txt | sort -n | awk '
BEGIN{print "{"}
NR>1{ print "," }
{printf " \"%s\": %s", $1, $2}
END{print "\n}"}' > code.json
```

## CharCode

| Language   | Get char code         | To string                     |
|------------|-----------------------|-------------------------------|
| JavaScript | `'永'.codePointAt(0)` | `String.fromCodePoint(27704)` |
| Ruby       | `'永'.ord`            | `27704.chr(Encoding::UTF_8)`  |
| Python     | `ord('永')`           | `chr(27704)`                  |
| Go         | `[]rune("永")[0]`     | `string(rune(27704))`         |

Note: many characters in this data set lie outside the Basic Multilingual Plane (char codes above 65535), so in JavaScript use the code-point-aware `codePointAt` / `String.fromCodePoint` rather than UTF-16 code units.

## Statistics

Average number of strokes:

```
grep -h -E '^[0-9]+ [0-9]+$' data/*.txt | awk '{total += $2} END{print total/NR}'
=> 13.961
```

Distribution (the char code column shows the first character code in each group, i.e. an arbitrary representative):

```
grep -h -E '^[0-9]+ [0-9]+$' data/*.txt | sort -k 2 -n | uniq -f 1 -c
```

count | char code | strokes
------ | --------- | -------
20 | 131273 | 1
77 | 131073 | 2
173 | 131075 | 3
449 | 131072 | 4
808 | 131086 | 5
1620 | 131096 | 6
2703 | 131105 | 7
3776 | 131116 | 8
4751 | 131125 | 9
5625 | 131133 | 10
6445 | 131137 | 11
7037 | 131142 | 12
6765 | 131149 | 13
6650 | 131151 | 14
6377 | 131155 | 15
5816 | 131158 | 16
4715 | 131164 | 17
3988 | 131167 | 18
3282 | 131194 | 19
2669 | 131168 | 20
2080 | 131486 | 21
1589 | 131195 | 22
1185 | 131272 | 23
882 | 131487 | 24
564 | 132209 | 25
392 | 132210 | 26
308 | 133483 | 27
220 | 131488 | 28
127 | 131489 | 29
91 | 136579 | 30
50 | 135583 | 31
41 | 138176 | 32
30 | 133644 | 33
12 | 133323 | 34
11 | 136472 | 35
21 | 136473 | 36
5 | 145843 | 37
4 | 161759 | 38
6 | 136474 | 39
2 | 168403 | 40
1 | 173258 | 41
1 | 160152 | 42
2 | 161969 | 44
1 | 169571 | 45
1 | 169572 | 47
2 | 158149 | 48
1 | 158148 | 49
1 | 40856 | 51
1 | 19003 | 52
1 | 181929 | 53
2 | 132411 | 64
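
## Sorting example

A minimal sketch of the sorting use case, assuming you have generated `code.go` (and its `getStrokes` function) as shown above. The characters and the few hand-checked stub values below are only there so the sketch compiles on its own; in practice the generated switch covers all 81k+ characters.

```
package main

import (
	"fmt"
	"sort"
)

// getStrokes stands in for the function generated into code.go above.
// Only a few hand-checked entries are included so this sketch compiles
// by itself; use the generated version for real data.
func getStrokes(charCode int) int {
	switch charCode {
	case 19968: // 一
		return 1
	case 20013: // 中
		return 4
	case 27704: // 永
		return 5
	case 40845: // 龍
		return 16
	}
	return 0
}

func main() {
	chars := []rune("龍永中一")
	// Sort characters in place by stroke count, fewest strokes first.
	sort.Slice(chars, func(i, j int) bool {
		return getStrokes(int(chars[i])) < getStrokes(int(chars[j]))
	})
	fmt.Println(string(chars)) // prints 一中永龍
}
```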