郭少强/deeplearning-note

text_classification.py 1.04 KB
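import mojimoji
import neologdn
import MeCab


def normalize_text(text):
    """Normalize Japanese text before tokenization."""
    # Convert full-width ASCII letters and digits to half-width; leave kana unchanged.
    result = mojimoji.zen_to_han(text, kana=False)
    # Apply the neologd-style normalization rules.
    result = neologdn.normalize(result)
    return result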
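

def text_to_words(text):
    """Tokenize Japanese text with MeCab and return the kept words, space-separated."""
    m = MeCab.Tagger('-d /usr/local/lib/mecab/dic/mecab-ipadic-neologd')
    # Parse an empty string once; a common workaround for a string-lifetime
    # issue in older MeCab Python bindings.
    m.parse('')
    # Normalize the text before parsing.
    text = normalize_text(text)
    m_text = m.parse(text)
    basic_words = []
    # Each MeCab output row has the form "surface<TAB>comma-separated features".
    m_text = m_text.split('\n')
    for row in m_text:
        word = row.split("\t")[0]
        if word == 'EOS':
            break
        else:
            pos = row.split('\t')[1]
            slice_ = pos.split(',')
            parts = slice_[0]
            # POS tags are compared in Japanese because the IPA-based
            # dictionary emits Japanese tags.
            if parts == '記号':  # symbol: keep only the full stop
                if word != '。':
                    continue
                basic_words.append(word)
            elif slice_[0] in ('形容詞', '動詞'):  # adjective, verb: keep the base form
                basic_words.append(slice_[-3])
            elif slice_[0] in ('名詞', '副詞'):  # noun, adverb: keep the surface form
                basic_words.append(word)
    basic_words = ' '.join(basic_words)
    return basic_words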
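
The filtering step in text_to_words can be tried without installing MeCab or the neologd dictionary. The sketch below applies the same part-of-speech rules to a hand-written sample in the IPA dictionary's output layout. The name `filter_rows` and the `SAMPLE` string are hypothetical, introduced only for illustration; the sample is not captured MeCab output, and the sketch assumes the Japanese POS tags that IPA-based dictionaries emit.

```python
# Hand-written sample in the "surface<TAB>features" layout (illustrative only).
SAMPLE = "\n".join([
    "美しい\t形容詞,自立,*,*,形容詞・イ段,基本形,美しい,ウツクシイ,ウツクシイ",
    "花\t名詞,一般,*,*,*,*,花,ハナ,ハナ",
    "が\t助詞,格助詞,一般,*,*,*,が,ガ,ガ",
    "咲い\t動詞,自立,*,*,五段・カ行イ音便,連用タ接続,咲く,サイ,サイ",
    "た\t助動詞,*,*,*,特殊・タ,基本形,た,タ,タ",
    "。\t記号,句点,*,*,*,*,。,。,。",
    "EOS",
]) + "\n"


def filter_rows(parsed):
    """Apply the POS filter to MeCab-style output (hypothetical helper)."""
    words = []
    for row in parsed.split("\n"):
        surface, _, features = row.partition("\t")
        if surface == "EOS":
            break
        fields = features.split(",")
        pos = fields[0]
        if pos == "記号":
            # Symbols are dropped, except the full stop.
            if surface == "。":
                words.append(surface)
        elif pos in ("形容詞", "動詞"):
            # Adjectives and verbs: third field from the end is the base form.
            words.append(fields[-3])
        elif pos in ("名詞", "副詞"):
            # Nouns and adverbs: keep the surface form.
            words.append(surface)
    return " ".join(words)


print(filter_rows(SAMPLE))
```

Particles and auxiliary verbs fall through every branch and are dropped, so only content words and sentence-ending full stops remain in the space-separated result.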