HugeCTR
HugeCTR is a high-efficiency GPU framework designed for Click-Through-Rate (CTR) estimation training.
Design Goals:
Optimized for recommender systems
Easy to customize
For more information, please see the HugeCTR User Guide and the Doxygen files in the docs directory.
Requirements:
cuBLAS >= 9.1
Compute Capability >= 60 (P100)
CMake >= 3.8
cuDNN >= 7.5
NCCL >= 2.0
Clang-Format 3.8
OpenMPI >= 4.0 (optional, needed only for multi-node training)
$ git submodule init
$ git submodule update
To build in Release mode, specify the Compute Capability with -DSM=XX (the default is SM=60). Only one Compute Capability can be set per build.
$ mkdir -p build
$ cd build
$ cmake -DCMAKE_BUILD_TYPE=Release -DSM=XX ..
$ make
To build in Debug mode, specify the Compute Capability with -DSM=XX (the default is SM=60). Only one Compute Capability can be set per build.
$ mkdir -p build
$ cd build
$ cmake -DCMAKE_BUILD_TYPE=Debug -DSM=XX ..
$ make
To use mixed precision training, enable USE_WMMA and set SCALER to one of 128, 256, 512, or 1024:
$ mkdir -p build
$ cd build
$ cmake -DSM=XX -DUSE_WMMA=ON -DSCALER=YYY ..
For usage examples, please refer to samples/*
The default coding style follows the Google C++ Style Guide.
This project also uses Clang-Format to help developers fix style issues such as indentation and the number of spaces per tab.
Clang-Format is a tool that can automatically reformat source code.
Use the following instructions to install and enable Clang-Format:
$ sudo apt-get install clang-format
# First, configure CMake as usual
$ mkdir -p build
$ cd build
$ cmake -DCLANGFORMAT=ON ..
# Second, run Clang-Format
$ cmake --build . --target clangformat
# Third, check what Clang-Format modified
$ git status
# or
$ git diff
Doxygen is supported in HugeCTR. By default, an online documentation browser (in HTML) and an offline reference manual (in LaTeX) can be generated within docs/.
From the project home directory, run:
$ doxygen
Three kinds of files are used as input to HugeCTR training: a configuration file (.json), a model file, and a data set.
The configuration file must be in JSON format, e.g. simple_sparse_embedding.json
A configuration file has four sections: "solver", "optimizer", "data", and "layers". They may appear in any order.
In the solver section you can specify the device (or devices), batchsize, model_file, and so on;
the optimizer section specifies the optimizer that will be used in every layer.
The file list and the settings related to the data set are specified in the data section.
Finally, layers are listed in the layers section. Note that the embedding layer should always be the first layer.
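As a rough illustration of the structure described above, the following Python sketch checks that a configuration contains the four top-level sections, in any order. The helper name check_config_sections and the empty placeholder bodies are hypothetical and not part of HugeCTR; only the four section names are taken from the text above.

```python
import json

# The four top-level sections named in this README.
REQUIRED_SECTIONS = ("solver", "optimizer", "data", "layers")


def check_config_sections(config_text):
    """Return the required sections that are missing from a JSON config string.

    Hypothetical helper for illustration only; it is not part of HugeCTR.
    """
    config = json.loads(config_text)
    return [name for name in REQUIRED_SECTIONS if name not in config]


# Placeholder configuration. The section bodies are left empty on purpose:
# the actual field names are defined by HugeCTR, not by this sketch.
# The sections are deliberately listed in a different order.
example = json.dumps({"layers": [], "data": {}, "optimizer": {}, "solver": {}})
missing = check_config_sections(example)
print("missing sections:", missing)
```

A check like this only verifies the top-level layout; it says nothing about whether the contents of each section are valid for HugeCTR.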
The model file is a binary file that is loaded for weight initialization. Weights are stored in the model file in the same order as the layers in the configuration file.
A data set consists of a file list in ASCII format and a set of data files in binary format.
A file list starts with the number of files it contains, followed by the path of each data file.
$ cat simple_sparse_embedding_file_list.txt
10
./simple_sparse_embedding/simple_sparse_embedding0.data
./simple_sparse_embedding/simple_sparse_embedding1.data
./simple_sparse_embedding/simple_sparse_embedding2.data
./simple_sparse_embedding/simple_sparse_embedding3.data
./simple_sparse_embedding/simple_sparse_embedding4.data
./simple_sparse_embedding/simple_sparse_embedding5.data
./simple_sparse_embedding/simple_sparse_embedding6.data
./simple_sparse_embedding/simple_sparse_embedding7.data
./simple_sparse_embedding/simple_sparse_embedding8.data
./simple_sparse_embedding/simple_sparse_embedding9.data
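The file list format above (a count followed by one path per entry) could be written and parsed with a few lines of Python. This is a minimal sketch, assuming entries are separated by whitespace and that paths contain no spaces; write_file_list and read_file_list are hypothetical helper names, not HugeCTR functions.

```python
import os
import tempfile


def write_file_list(list_path, data_paths):
    """Write the number of data files, then one path per line (ASCII)."""
    with open(list_path, "w", encoding="ascii") as f:
        f.write(f"{len(data_paths)}\n")
        for path in data_paths:
            f.write(path + "\n")


def read_file_list(list_path):
    """Parse a file list and check that the count matches the number of paths.

    Assumes whitespace-separated entries and paths without spaces.
    """
    with open(list_path, "r", encoding="ascii") as f:
        tokens = f.read().split()
    if not tokens:
        raise ValueError("empty file list")
    count = int(tokens[0])
    paths = tokens[1:]
    if len(paths) != count:
        raise ValueError(f"expected {count} paths, found {len(paths)}")
    return paths


# Round trip in a temporary directory, using the names from the example above.
expected = [
    f"./simple_sparse_embedding/simple_sparse_embedding{i}.data" for i in range(10)
]
with tempfile.TemporaryDirectory() as tmp_dir:
    list_file = os.path.join(tmp_dir, "simple_sparse_embedding_file_list.txt")
    write_file_list(list_file, expected)
    parsed = read_file_list(list_file)
print(len(parsed), "paths listed")
```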
A data file (binary) contains a header and data (many samples).
Header Definition:
typedef struct DataSetHeader_{
  long long number_of_records; //the number of samples in this data file
  long long label_dim; //dimension of label
  long long slot_num; //the number of slots in each sample
  long long reserved; //reserved for future use
} DataSetHeader;
Data Definition (each sample):
typedef struct Data_{
  int label[label_dim];
  Slot slots[slot_num];
} Data;
typedef struct Slot_{
  int nnz;
  T* keys; //long long or uint
} Slot;
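To make the layout above more concrete, here is a minimal Python sketch that writes and reads a data file with the struct module. Several details are assumptions rather than facts stated in this README: little-endian byte order, no padding between fields, 4-byte int values, keys of type long long (8 bytes), and the nnz keys of each slot stored inline immediately after nnz. The function names are hypothetical.

```python
import io
import struct

# Header: four long long fields (assumed little-endian, no padding).
HEADER_FORMAT = "<4q"


def write_data_file(stream, samples, label_dim, slot_num):
    """Write a header followed by samples given as (labels, slots) pairs.

    Each element of slots is a list of keys; its length is that slot's nnz.
    """
    stream.write(struct.pack(HEADER_FORMAT, len(samples), label_dim, slot_num, 0))
    for labels, slots in samples:
        if len(labels) != label_dim or len(slots) != slot_num:
            raise ValueError("sample does not match label_dim/slot_num")
        stream.write(struct.pack(f"<{label_dim}i", *labels))
        for keys in slots:
            stream.write(struct.pack("<i", len(keys)))           # nnz
            stream.write(struct.pack(f"<{len(keys)}q", *keys))   # keys (assumed long long)


def read_data_file(stream):
    """Read back all samples written in the layout assumed above."""
    header = stream.read(struct.calcsize(HEADER_FORMAT))
    number_of_records, label_dim, slot_num, _reserved = struct.unpack(
        HEADER_FORMAT, header
    )
    samples = []
    for _ in range(number_of_records):
        labels = list(struct.unpack(f"<{label_dim}i", stream.read(4 * label_dim)))
        slots = []
        for _ in range(slot_num):
            (nnz,) = struct.unpack("<i", stream.read(4))
            slots.append(list(struct.unpack(f"<{nnz}q", stream.read(8 * nnz))))
        samples.append((labels, slots))
    return samples


# Round trip through an in-memory buffer: two samples, one label, two slots each.
original = [([1], [[10, 20], []]), ([0], [[30], [40, 50, 60]])]
buffer = io.BytesIO()
write_data_file(buffer, original, label_dim=1, slot_num=2)
buffer.seek(0)
restored = read_data_file(buffer)
print("round trip ok:", restored == original)
```

If the keys are uint rather than long long, the "q" format codes and the 8-byte reads would change to "I" and 4 bytes respectively.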