WideDeep Recommendation Model
This is an implementation of WideDeep as described in the Wide & Deep Learning for Recommender Systems paper.
The WideDeep model jointly trains a wide linear model and a deep neural network, combining the benefits of memorization and generalization for recommender systems.
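The joint formulation can be illustrated with a minimal, framework-agnostic sketch. This is an illustration of the idea only, not the code in src/WideDeep.py; the shapes and layer layout here are assumptions:

    import numpy as np

    def wide_deep_predict(x_wide, x_deep, w_wide, deep_layers):
        # x_wide:      (batch, n_wide) sparse cross-product features (wide part)
        # x_deep:      (batch, n_emb)  concatenated dense embeddings (deep part)
        # w_wide:      (n_wide,)       weights of the linear wide model
        # deep_layers: list of (W, b)  MLP weights; the last pair maps to 1 unit
        wide_logit = x_wide @ w_wide                  # memorization: linear model
        h = x_deep
        for W, b in deep_layers[:-1]:
            h = np.maximum(h @ W + b, 0.0)            # generalization: ReLU MLP
        W_out, b_out = deep_layers[-1]
        deep_logit = (h @ W_out + b_out).squeeze(-1)
        # Both logits feed a single sigmoid, so the wide and deep parts are
        # optimized jointly against the same log loss.
        return 1.0 / (1.0 + np.exp(-(wide_logit + deep_logit)))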
The Criteo dataset is used for model training and evaluation.
To download and process the dataset, please install the Pandas package first, then issue the following command:
python scripts/process_data.py
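For reference, Criteo preprocessing conventionally log-transforms the 13 integer columns and hashes the 26 categorical columns into a fixed vocabulary. A minimal sketch of that idea follows; the bucket count and exact transforms are assumptions and may differ from what scripts/process_data.py actually does:

    import math

    NUM_INT = 13          # Criteo integer columns I1..I13
    NUM_CAT = 26          # Criteo categorical columns C1..C26
    HASH_BUCKETS = 200000 # assumed hash space; the script may use another size

    def transform_row(fields):
        # fields: [label, I1..I13, C1..C26] from one tab-separated line
        label = int(fields[0])
        ids, vals = [], []
        for i, raw in enumerate(fields[1:1 + NUM_INT]):
            v = float(raw) if raw else 0.0
            ids.append(i)                              # one fixed id per column
            vals.append(math.log1p(v) if v > 0 else v)
        for j, raw in enumerate(fields[1 + NUM_INT:1 + NUM_INT + NUM_CAT]):
            # Python's built-in hash is salted per process; a real pipeline
            # would use a stable hash function instead.
            bucket = hash("%d:%s" % (j, raw)) % HASH_BUCKETS
            ids.append(NUM_INT + bucket)               # hashed categorical id
            vals.append(1.0)
        return label, ids, vals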
The entire code structure is as follows:
1. src/
callback.py "Callback classes: EvaluateCallback, LossCallback"
config.py "Configuration arguments for data, model, and training"
datasets.py "Dataset loader class"
WideDeep.py "Model structure code, including: DenseLayer, widedeepModel, NetWithLossClass, TrainStepWrap, PredictWithSigmoid, ModelBuilder"
2. ./
train.py "The main script for training and evaluation, initialized by deepfm_model_zoo/src/config.py"
test.py "The main script for prediction; loads the checkpoint file"
rank_table_2p.json "The config file for two NPUs"
rank_table_4p.json "The config file for four NPUs"
rank_table_8p.json "The config file for eight NPUs"
3. scripts/
process_data.py "The script for raw data download and processing"
run_train.sh "The shell script for training"
run_eval.sh "The shell script for evaluation"
4. scripts_16p/
start.sh "The launch script for sixteen NPUs"
5. tools/
cluster_16p.json "The cluster information of the NPUs"
common.sh "The SSH key configuration for the NPUs"
hccl.json "The HCCL config file for the NPUs"
To train and evaluate the model, issue the following command:
python tools/train_and_test.py
Arguments:
--data_path: This should be set to the same directory given to the data_download's data_dir argument.
--epochs: Total number of training epochs.
--batch_size: Training batch size.
--eval_batch_size: Evaluation batch size.
--field_size: The number of feature fields.
--vocab_size: The total number of features in the dataset.
--emb_dim: The dense embedding dimension of the sparse features.
--deep_layers_dim: The dimensions of the deep layers.
--deep_layers_act: The activation function of the deep layers.
--keep_prob: The keep rate of the dropout layer.
--ckpt_path: The location of the checkpoint file.
--eval_file_name: Eval output file.
--loss_file_name: Loss output file.
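For example, a run might look like this (the values below are illustrative placeholders, not the script's defaults):

    python tools/train_and_test.py --data_path=./data --epochs=15 --batch_size=16000 --eval_batch_size=16000 --keep_prob=1.0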
To train the model, issue the following command:
python tools/train.py
Arguments:
train.py accepts the same arguments as tools/train_and_test.py; see the list above.
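An illustrative training run (the paths and values are placeholders):

    python tools/train.py --data_path=./data --epochs=15 --ckpt_path=./checkpoints --loss_file_name=loss.log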
To evaluate the model, issue the following command:
python tools/test.py
Arguments:
test.py accepts the same arguments as tools/train_and_test.py; see the list above.
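An illustrative evaluation run against a saved checkpoint (the paths and file names are placeholders):

    python tools/test.py --data_path=./data --ckpt_path=./checkpoints/widedeep-15.ckpt --eval_file_name=eval.log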
There are other arguments related to the model and the training process. Use the --help or -h flag to get a full list of possible arguments with detailed descriptions.