
modelee / en_core_web_sm

Adriane Boyd committed on 2023-11-21 08:10: Update spaCy pipeline
| tags | language | license | model-index |
| --- | --- | --- | --- |
| spacy, token-classification | en | mit | en_core_web_sm |

| Task | Task type | Metric | Metric type | Value |
| --- | --- | --- | --- | --- |
| NER | token-classification | NER Precision | precision | 0.8454836771 |
| NER | token-classification | NER Recall | recall | 0.8456530449 |
| NER | token-classification | NER F Score | f_score | 0.8455683525 |
| TAG | token-classification | TAG (XPOS) Accuracy | accuracy | 0.97246532 |
| UNLABELED_DEPENDENCIES | token-classification | Unlabeled Attachment Score (UAS) | f_score | 0.9175304332 |
| LABELED_DEPENDENCIES | token-classification | Labeled Attachment Score (LAS) | f_score | 0.89874821 |
| SENTS | token-classification | Sentences F-Score | f_score | 0.9059485531 |

Details: https://spacy.io/models/en#en_core_web_sm

English pipeline optimized for CPU. Components: tok2vec, tagger, parser, senter, ner, attribute_ruler, lemmatizer.
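The following is a minimal usage sketch, assuming spaCy (in the version range this README lists) and the `en_core_web_sm` package are both installed. `load_pipeline` and `entity_label_counts` are hypothetical helper names introduced for illustration; the spaCy calls relied on are `spacy.load`, `Doc.ents`, and `Span.label_`.

```python
from collections import Counter


def load_pipeline(name="en_core_web_sm"):
    """Load the installed pipeline package by name.

    spaCy is imported lazily so that the helper below can be imported
    and exercised even where spaCy itself is not installed.
    """
    import spacy

    return spacy.load(name)


def entity_label_counts(doc):
    """Count entity labels on a processed document.

    Accepts any object exposing `.ents`, where each entity has a
    `.label_` string. That is the shape of a spaCy `Doc`.
    """
    return Counter(ent.label_ for ent in doc.ents)
```

With the package installed, `nlp = load_pipeline()` followed by `entity_label_counts(nlp("Some English text"))` returns a `Counter` keyed by NER labels such as `ORG` or `GPE`.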

| Feature | Description |
| --- | --- |
| Name | en_core_web_sm |
| Version | 3.7.1 |
| spaCy | >=3.7.2,<3.8.0 |
| Default Pipeline | tok2vec, tagger, parser, attribute_ruler, lemmatizer, ner |
| Components | tok2vec, tagger, parser, senter, attribute_ruler, lemmatizer, ner |
| Vectors | 0 keys, 0 unique vectors (0 dimensions) |
| Sources | OntoNotes 5 (Ralph Weischedel, Martha Palmer, Mitchell Marcus, Eduard Hovy, Sameer Pradhan, Lance Ramshaw, Nianwen Xue, Ann Taylor, Jeff Kaufman, Michelle Franchini, Mohammed El-Bachouti, Robert Belvin, Ann Houston); ClearNLP Constituent-to-Dependency Conversion (Emory University); WordNet 3.0 (Princeton University) |
| License | MIT |
| Author | Explosion |
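Two details in this table are easy to miss: `senter` is listed under Components but not under Default Pipeline, and the spaCy requirement is a bounded range. The sketch below works both out from the values copied from the table. The names `COMPONENTS`, `DEFAULT_PIPELINE`, `parse_version`, and `satisfies` are illustrative, and the version check is a simplification that handles only plain numeric versions with `>=` and `<` clauses.

```python
# Values copied from the feature table.
COMPONENTS = ["tok2vec", "tagger", "parser", "senter", "attribute_ruler", "lemmatizer", "ner"]
DEFAULT_PIPELINE = ["tok2vec", "tagger", "parser", "attribute_ruler", "lemmatizer", "ner"]
SPACY_REQUIREMENT = ">=3.7.2,<3.8.0"


def parse_version(text):
    """Turn '3.7.2' into (3, 7, 2). Assumes numeric release segments only."""
    return tuple(int(part) for part in text.strip().split("."))


def satisfies(version, requirement):
    """Check a version string against comma-separated '>=' and '<' clauses."""
    candidate = parse_version(version)
    for clause in requirement.split(","):
        clause = clause.strip()
        if clause.startswith(">="):
            if candidate < parse_version(clause[2:]):
                return False
        elif clause.startswith("<"):
            if candidate >= parse_version(clause[1:]):
                return False
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


# Components shipped in the package but absent from the default pipeline.
not_in_default = [name for name in COMPONENTS if name not in DEFAULT_PIPELINE]
print("Not in default pipeline:", not_in_default)

for spacy_version in ("3.7.1", "3.7.2", "3.8.0"):
    print(spacy_version, "compatible:", satisfies(spacy_version, SPACY_REQUIREMENT))
```

For real dependency checks, the `packaging` library's `SpecifierSet` covers the full version-specifier syntax. In spaCy itself, a component that is loaded but disabled can be switched on with `nlp.enable_pipe("senter")`.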

Label Scheme

View label scheme (113 labels for 3 components)
| Component | Labels |
| --- | --- |
| tagger | $, '', ,, -LRB-, -RRB-, ., :, ADD, AFX, CC, CD, DT, EX, FW, HYPH, IN, JJ, JJR, JJS, LS, MD, NFP, NN, NNP, NNPS, NNS, PDT, POS, PRP, PRP$, RB, RBR, RBS, RP, SYM, TO, UH, VB, VBD, VBG, VBN, VBP, VBZ, WDT, WP, WP$, WRB, XX, _SP, `` |
| parser | ROOT, acl, acomp, advcl, advmod, agent, amod, appos, attr, aux, auxpass, case, cc, ccomp, compound, conj, csubj, csubjpass, dative, dep, det, dobj, expl, intj, mark, meta, neg, nmod, npadvmod, nsubj, nsubjpass, nummod, oprd, parataxis, pcomp, pobj, poss, preconj, predet, prep, prt, punct, quantmod, relcl, xcomp |
| ner | CARDINAL, DATE, EVENT, FAC, GPE, LANGUAGE, LAW, LOC, MONEY, NORP, ORDINAL, ORG, PERCENT, PERSON, PRODUCT, QUANTITY, TIME, WORK_OF_ART |
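The "113 labels for 3 components" summary can be checked by tallying the labels per component. The sketch below copies the labels into a plain dictionary (the name `LABELS` is illustrative) and assumes the final tagger entry is the Penn Treebank opening-quote tag, written as two backticks.

```python
# Labels copied from the label scheme table, ten per row for easy counting.
LABELS = {
    "tagger": [
        "$", "''", ",", "-LRB-", "-RRB-", ".", ":", "ADD", "AFX", "CC",
        "CD", "DT", "EX", "FW", "HYPH", "IN", "JJ", "JJR", "JJS", "LS",
        "MD", "NFP", "NN", "NNP", "NNPS", "NNS", "PDT", "POS", "PRP", "PRP$",
        "RB", "RBR", "RBS", "RP", "SYM", "TO", "UH", "VB", "VBD", "VBG",
        "VBN", "VBP", "VBZ", "WDT", "WP", "WP$", "WRB", "XX", "_SP", "``",
    ],
    "parser": [
        "ROOT", "acl", "acomp", "advcl", "advmod", "agent", "amod", "appos", "attr", "aux",
        "auxpass", "case", "cc", "ccomp", "compound", "conj", "csubj", "csubjpass", "dative", "dep",
        "det", "dobj", "expl", "intj", "mark", "meta", "neg", "nmod", "npadvmod", "nsubj",
        "nsubjpass", "nummod", "oprd", "parataxis", "pcomp", "pobj", "poss", "preconj", "predet", "prep",
        "prt", "punct", "quantmod", "relcl", "xcomp",
    ],
    "ner": [
        "CARDINAL", "DATE", "EVENT", "FAC", "GPE", "LANGUAGE", "LAW", "LOC", "MONEY",
        "NORP", "ORDINAL", "ORG", "PERCENT", "PERSON", "PRODUCT", "QUANTITY", "TIME", "WORK_OF_ART",
    ],
}

# Number of labels per component, then the overall total.
counts = {component: len(labels) for component, labels in LABELS.items()}
total = sum(counts.values())

print(counts)
print("total labels:", total)
```

The per-component counts come to 50, 45, and 18, which sum to the 113 labels stated in the summary line.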

Accuracy

| Type | Score |
| --- | --- |
| TOKEN_ACC | 99.86 |
| TOKEN_P | 99.57 |
| TOKEN_R | 99.58 |
| TOKEN_F | 99.57 |
| TAG_ACC | 97.25 |
| SENTS_P | 92.02 |
| SENTS_R | 89.21 |
| SENTS_F | 90.59 |
| DEP_UAS | 91.75 |
| DEP_LAS | 89.87 |
| ENTS_P | 84.55 |
| ENTS_R | 84.57 |
| ENTS_F | 84.56 |
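The `_P`, `_R`, and `_F` rows are related: assuming each `_F` value is the balanced F1 score (the harmonic mean of precision and recall), it can be recomputed from the other two. The sketch below does this for the three row groups in the table. Because the table values are already rounded to two decimals, the comparison allows half a unit in the last printed digit. `f_score` and `ROWS` are names introduced here for illustration.

```python
def f_score(precision, recall):
    """Balanced F1: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F) copied from the accuracy table, in percent.
ROWS = {
    "TOKEN": (99.57, 99.58, 99.57),
    "SENTS": (92.02, 89.21, 90.59),
    "ENTS": (84.55, 84.57, 84.56),
}

for name, (precision, recall, reported) in ROWS.items():
    recomputed = f_score(precision, recall)
    # Inputs are rounded, so accept a difference of up to 0.005.
    matches = abs(recomputed - reported) <= 0.005
    print(f"{name}: recomputed {recomputed:.2f}, reported {reported:.2f}, match={matches}")
```

The same function applied to the unrounded NER precision and recall in the README metadata (0.8454836771 and 0.8456530449) gives a value that agrees with the listed NER F Score of 0.8455683525 to within rounding.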