
Kubeflow Training SDK

Python SDK for Training Operator

Requirements

Python 2.7 and 3.5+

Installation & Usage

pip install

pip install kubeflow-training

Then import the package:

from kubeflow import training 
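To confirm that the install succeeded before running anything against a cluster, a small check like the one below can help. It is a minimal sketch for Python 3 only; the helper name `training_sdk_available` is made up for illustration and is not part of the SDK.

```python
import importlib.util


def training_sdk_available():
    """Return True if the `kubeflow.training` module can be located."""
    try:
        return importlib.util.find_spec("kubeflow.training") is not None
    except ImportError:
        # Raised when the parent `kubeflow` package is missing or fails to import.
        return False


if __name__ == "__main__":
    print("kubeflow.training importable:", training_sdk_available())
```

The check only locates the module; it does not verify the SDK version or cluster access.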

Setuptools

Install via Setuptools.

python setup.py install --user

(or sudo python setup.py install to install the package for all users)

Getting Started

Follow the sample to create, update, and delete a TFJob.
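The sample itself is not reproduced here. As a rough illustration of the kind of resource it works with, the sketch below builds a minimal TFJob manifest as a plain Python dictionary. The helper name `build_tfjob_manifest` and the image `example.com/my-training-image:latest` are placeholders invented for this example; how the manifest is handed to the SDK depends on the SDK version, so that step is left out.

```python
def build_tfjob_manifest(name, namespace="default",
                         image="example.com/my-training-image:latest",
                         workers=1):
    """Build a minimal TFJob manifest (hypothetical helper, not an SDK function)."""
    return {
        "apiVersion": "kubeflow.org/v1",
        "kind": "TFJob",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "tfReplicaSpecs": {
                "Worker": {
                    "replicas": workers,
                    "restartPolicy": "OnFailure",
                    "template": {
                        "spec": {
                            # The training container in a TFJob is named "tensorflow".
                            "containers": [{"name": "tensorflow", "image": image}]
                        }
                    },
                }
            }
        },
    }


if __name__ == "__main__":
    manifest = build_tfjob_manifest("mnist", workers=2)
    print(manifest["kind"], manifest["metadata"]["name"])
```

Keeping the manifest as a dictionary makes it easy to inspect or serialize before it is submitted to the cluster.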

Documentation for API Endpoints

TODO(andreyvelich): These docs are outdated. Please track this issue for the status: https://gitee.com/vak80/katib/issues/2081

| Class | Method | Description |
| --- | --- | --- |
| TFJobClient | create | Create a TFJob |
| TFJobClient | get | Get or watch the specified TFJob, or all TFJobs in the namespace |
| TFJobClient | patch | Patch the specified TFJob |
| TFJobClient | delete | Delete the specified TFJob |
| TFJobClient | wait_for_job | Wait for the specified job to finish |
| TFJobClient | wait_for_condition | Wait until any of the specified conditions occurs |
| TFJobClient | get_job_status | Get the TFJob status |
| TFJobClient | is_job_running | Check if the TFJob status is Running |
| TFJobClient | is_job_succeeded | Check if the TFJob status is Succeeded |
| TFJobClient | get_pod_names | Get the pod names of the TFJob |
| TFJobClient | get_logs | Get the training logs of the TFJob |
| PyTorchJobClient | create | Create a PyTorchJob |
| PyTorchJobClient | get | Get the specified PyTorchJob, or all PyTorchJobs in the namespace |
| PyTorchJobClient | patch | Patch the specified PyTorchJob |
| PyTorchJobClient | delete | Delete the specified PyTorchJob |
| PyTorchJobClient | wait_for_job | Wait for the specified job to finish |
| PyTorchJobClient | wait_for_condition | Wait until any of the specified conditions occurs |
| PyTorchJobClient | get_job_status | Get the PyTorchJob status |
| PyTorchJobClient | is_job_running | Check if the PyTorchJob status is Running |
| PyTorchJobClient | is_job_succeeded | Check if the PyTorchJob status is Succeeded |
| PyTorchJobClient | get_pod_names | Get the pod names of the PyTorchJob |
| PyTorchJobClient | get_logs | Get the training logs of the PyTorchJob |
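Since these docs are flagged as outdated, the exact client signatures are not shown here. The sketch below only illustrates the polling pattern that a method such as `wait_for_job` implies: query the job status repeatedly until it reaches a terminal state or a timeout expires. `FakeJobClient` and `wait_until_finished` are hypothetical stand-ins written for this example, and the `get_job_status(name, namespace=...)` signature is an assumption, not the SDK's confirmed API.

```python
import time


class FakeJobClient(object):
    """Hypothetical stand-in: reports Running for a few polls, then Succeeded."""

    def __init__(self, polls_until_done=2):
        self._remaining = polls_until_done

    def get_job_status(self, name, namespace="default"):
        if self._remaining > 0:
            self._remaining -= 1
            return "Running"
        return "Succeeded"


def wait_until_finished(client, name, namespace="default",
                        timeout_seconds=60, polling_interval=1.0):
    """Poll the job status until it is Succeeded or Failed, or raise on timeout."""
    deadline = time.time() + timeout_seconds
    while True:
        status = client.get_job_status(name, namespace=namespace)
        if status in ("Succeeded", "Failed"):
            return status
        if time.time() >= deadline:
            raise RuntimeError(
                "Timed out waiting for job %s/%s (last status: %s)"
                % (namespace, name, status)
            )
        time.sleep(polling_interval)


if __name__ == "__main__":
    final = wait_until_finished(FakeJobClient(), "mnist", polling_interval=0.1)
    print("final status:", final)
```

With a real client, the same loop would apply to any object that exposes a compatible `get_job_status` method; check the SDK source for the actual parameter names before relying on it.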

Documentation For Models

Building conformance tests

Run

docker build . -f Dockerfile.conformance -t <tag>