俞涵 committed on 2023-01-29 16:21: add mindquantum and modify url

Function differences with torchtext.data.utils.ngrams_iterator

torchtext.data.utils.ngrams_iterator

torchtext.data.utils.ngrams_iterator(
    token_list,
    ngrams
)

For more information, see torchtext.data.utils.ngrams_iterator.

mindspore.dataset.text.Ngram

class mindspore.dataset.text.Ngram(
    n,
    left_pad=("", 0),
    right_pad=("", 0),
    separator=" "
)

For more information, see mindspore.dataset.text.Ngram.

Differences

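PyTorch: Returns an iterator that first yields each token in `token_list` and then yields the n-grams of order 2 up to `ngrams`, each joined by a single space.

MindSpore: A tensor operation that generates only the n-grams of the requested order(s) `n` from a one-dimensional string tensor. The tokens in each n-gram are joined by `separator`, and `left_pad` and `right_pad` optionally pad the sequence before the n-grams are formed.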
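The contrast can be sketched in plain Python without installing either library. The two helpers below, `ngrams_iterator_sketch` and `fixed_order_ngrams_sketch`, are hypothetical names used only for illustration and are not part of torchtext or MindSpore. The second one is a simplification: it assumes a single integer `n` and ignores padding and edge cases such as inputs shorter than `n`.

```python
from typing import Iterator, List


def ngrams_iterator_sketch(token_list: List[str], ngrams: int) -> Iterator[str]:
    """Mimic the torchtext-style behaviour: tokens first, then 2-grams up to `ngrams`-grams."""
    # Yield the original tokens unchanged.
    for token in token_list:
        yield token
    # Then yield every n-gram for n = 2 .. ngrams, joined by a single space.
    for n in range(2, ngrams + 1):
        for start in range(len(token_list) - n + 1):
            yield " ".join(token_list[start:start + n])


def fixed_order_ngrams_sketch(tokens: List[str], n: int, separator: str = " ") -> List[str]:
    """Mimic the MindSpore-style behaviour for one order n, without padding (assumption)."""
    # Only n-grams of exactly order n are produced; the tokens themselves are not included.
    return [separator.join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = ["here", "we", "are"]
mixed = list(ngrams_iterator_sketch(tokens, 2))
fixed = fixed_order_ngrams_sketch(tokens, 2, separator="-")
print(mixed)  # ['here', 'we', 'are', 'here we', 'we are']
print(fixed)  # ['here-we', 'we-are']
```

The practical consequence when porting code is that the torchtext-style output mixes unigrams with higher-order n-grams in one stream, while the MindSpore-style output contains only the requested order, so the original tokens may need to be combined with the result separately.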

Code Example

from mindspore.dataset import text
from torchtext.data.utils import ngrams_iterator

# In MindSpore, the n-grams are returned as a numpy.ndarray of strings.

ngram_op = text.Ngram(3, separator="-")
output = ngram_op(["WildRose Country", "Canada's Ocean Playground", "Land of Living Skies"])
print(output)
# Out:
# ["WildRose Country-Canada's Ocean Playground-Land of Living Skies"]

# In torchtext, ngrams_iterator returns an iterator that yields the given tokens followed by their n-grams.
token_list = ['here', 'we', 'are']
print(list(ngrams_iterator(token_list, 2)))
# Out:
# ['here', 'we', 'are', 'here we', 'we are']