dataset: add concatDataset operation

DONE
RFC
Created on 2020-05-14 16:19

Background

  • Add a concat op for Dataset to support concatenating multiple datasets into a single dataset

Introduction

  • The ConcatDataset operation is defined as:
    def __add__(self, datasets):
        """
        Concatenate the datasets in the input list of datasets.

        Note:
            The column names, column data types and ranks of the column data
            must be the same across the input datasets.

        Args:
            datasets (list or class Dataset): A list of datasets or a single Dataset
                to be concatenated with this dataset.

        Returns:
            ConcatDataset, the concatenated dataset.

        Examples:
            >>> import mindspore.dataset as ds
            >>> # ds1 and ds2 are instances of Dataset
            >>> # create a dataset by concatenating ds1 and ds2
            >>> data1 = ds1 + ds2
        """

Example

  1. code
import numpy as np

import mindspore.dataset as ds
from mindspore import log as logger


def generator_3():
    # Yields 0, 1, 2 as single-column rows
    for i in range(3):
        yield (np.array([i]),)


def generator_10():
    # Yields 3 through 9 as single-column rows
    for i in range(3, 10):
        yield (np.array([i]),)


def test_concat_01():
    """
    Test concat: concatenate 2 datasets that have the same column name and data type
    """
    data1 = ds.GeneratorDataset(generator_3, ["col1"])
    data2 = ds.GeneratorDataset(generator_10, ["col1"])

    data3 = data1 + data2

    # Here i refers to the index, d refers to the data element
    for i, d in enumerate(data3):
        logger.info("data: %s", d[0][0])
        assert i == d[0][0]

    assert sum([1 for _ in data3]) == 10

  2. output
0, 1, 2, 3, 4, 5, 6, 7, 8, 9
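
As a cross-check on the expected output, the same row order can be modelled without MindSpore by chaining the two generators. This sketch reuses the names `generator_3` and `generator_10` from the example but, as a simplifying assumption, yields plain integers instead of NumPy arrays. It models only the ordering that `data1 + data2` is expected to produce.

```python
from itertools import chain


def generator_3():
    # Rows 0, 1, 2 as one-element tuples (plain ints, not NumPy arrays)
    for i in range(3):
        yield (i,)


def generator_10():
    # Rows 3 through 9
    for i in range(3, 10):
        yield (i,)


# Plain-Python model of data1 + data2: every row of the first source,
# followed by every row of the second.
rows = list(chain(generator_3(), generator_10()))
values = [row[0] for row in rows]
print(values)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

The concatenated stream has 3 + 7 = 10 rows, and each value equals its index, which is what the assertions in `test_concat_01` check.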

Comments (1)

ms_yan created the RFC
ms_yan set the associated repository to MindSpore/mindspore
ms_yan set the assignee to ms_yan

Hey @ms_yan, Welcome to MindSpore Community.
All of the projects in MindSpore Community are maintained by @mindspore-ci-bot.
That means the developers can comment below every pull request or issue to trigger Bot Commands.
Please follow instructions at https://gitee.com/mindspore/community/blob/master/command.md to find the details.

ms_yan edited the description
ms_yan changed the task status from TODO to WIP
ms_yan changed the task status from WIP to DONE via mindspore/mindspore Pull Request !1157
