ck-request-asplos18-mobilenets-tvm-arm
This repository contains the experimental workflow and all related artifacts, packaged as portable, customizable and reusable Collective Knowledge (CK) components, for image classification from the 1st ReQuEST tournament at ASPLOS'18 on reproducible SW/HW co-design of deep learning (speed, accuracy, energy, costs).
Title: Optimizing Deep Learning Workloads on ARM GPU with TVM
Authors: Lianmin Zheng, Tianqi Chen
Details: Link
Algorithm: image classification with ResNet-18, MobileNet and VGG16
Program: TVM/NNVM, ARM Compute Library, MXNet, OpenBLAS
Compilation: g++
Transformations:
Binary: will be compiled on a target platform
Data set: ImageNet 2012 validation (50,000 images)
Run-time environment: Linux with OpenCL
Hardware: Firefly-RK3399 with ARM Mali-T860MP4 or other boards with ARM Mali GPUs
Run-time state: set by our scripts
Execution: inference speed
Metrics: total execution time; top1/top5 accuracy over a subset of (or all) images from the data set
Output: classification result; execution time; accuracy
Experiments: benchmarking the inference speed of different backends on ImageNet (automated via CK command line)
How much disk space required (approximately)? ~4GB
How much time is needed to prepare workflow (approximately)? several hours (mainly native compilation of packages)
How much time is needed to complete experiments (approximately)? hours for full ImageNet accuracy validation (50000 images)
Publicly available?: Yes
Code license(s)?: MIT license
CK workflow framework used? Yes
CK workflow URL: https://github.com/ctuning/ck-request-asplos18-mobilenets-tvm-arm
CK results URL: https://github.com/ctuning/ck-request-asplos18-results-mobilenets-tvm-arm
Original artifact: https://github.com/merrymercy/tvm-mali
NB: The # sign means sudo.
# apt-get install libtinfo-dev
# pip install numpy scipy decorator matplotlib
or
# pip3 install numpy scipy decorator matplotlib
The minimal installation requires:
Python 2.7 or 3.3+ (limitation is mainly due to unit tests)
Git command line client.
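Before installing CK, it can help to confirm that both prerequisites are on your PATH. The following is a minimal sketch; the helper name `require_cmd` is hypothetical and is not part of CK.

```shell
#!/bin/sh
# Hypothetical helper (not part of CK): report whether a command is on PATH.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1" >&2
        return 1
    fi
}

# Check the two prerequisites named above; either python or python3 will do.
# "|| true" keeps the script going so that every missing tool is reported.
require_cmd git || true
require_cmd python || require_cmd python3 || true
```

This only checks that the commands exist; it does not verify the Python version.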
You can install CK in your local user space as follows:
$ git clone http://github.com/ctuning/ck
$ export PATH=$PWD/ck/bin:$PATH
$ export PYTHONPATH=$PWD/ck:$PYTHONPATH
You can also install CK via pip with sudo to avoid setting up environment variables yourself:
$ sudo pip install ck
$ ck pull repo:ck-request-asplos18-mobilenets-tvm-arm
It is possible to install and test the snapshot of this workflow from the ACM Digital Library without interfering with your current CK installation. Download the related file "request-asplos18-artifact-?-ck-workflow.zip" to a temporary directory, unzip it, and then execute the following commands:
$ . ./prepare_virtual_ck.sh
$ . ./start_virtual_ck.sh
All CK repositories will be installed in your current directory. You can now proceed with further evaluation as described below.
$ ck detect platform.gpgpu --opencl
$ sudo apt-get install libblas*
To detect and register it in CK:
$ ck detect soft:lib.blas
To check the environment:
$ ck show env --tags=blas,no-openblas
A possible output:
cfe1e23a4472bb1d linux-32 32 BLAS library api-3 32bits,blas,blas,cblas,host-os-linux-32,lib,no-openblas,target-os-linux-32,v0,v0.3
$ ck install package:lib-openblas-0.2.18-universal
If you want to test another OpenBLAS version, list the available packages:
$ ck list package:lib-openblas*
$ ck install package:lib-lapack-3.4.2
$ ck install package --tags=compiler,llvm
Although the above is the suggested method, you can also install LLVM via apt and then detect it via CK:
# apt-get install llvm-4.0 clang-4.0
$ ck detect soft:compiler.llvm
$ ck install package:lib-armcl-opencl-17.12 --env.USE_GRAPH=ON --env.USE_NEON=ON --env.USE_EMBEDDED_KERNELS=ON
To check or install other versions available via CK:
$ ck list package:lib-armcl-opencl-*
$ ck install package --tags=lib,armcl --env.USE_GRAPH=ON --env.USE_NEON=ON --env.USE_EMBEDDED_KERNELS=ON
$ ck install package:lib-mxnet-master-cpu --env.USE_F16C=0
$ ck install package:lib-nnvm-tvm-master-opencl
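The CK package installations listed above can be collected into one function so that they run in order and stop at the first failure. This is only a sketch: the function name `install_request_deps` is hypothetical, the package names and flags are copied from the commands above, and `ck install` may still prompt interactively.

```shell
# Hypothetical wrapper (not part of CK): install the CK packages named in this
# README in order, stopping at the first failure.
install_request_deps() {
    if ! command -v ck >/dev/null 2>&1; then
        echo "ck not found on PATH; install CK first" >&2
        return 1
    fi
    ck install package:lib-openblas-0.2.18-universal || return 1
    ck install package:lib-lapack-3.4.2 || return 1
    ck install package --tags=compiler,llvm || return 1
    ck install package:lib-armcl-opencl-17.12 \
        --env.USE_GRAPH=ON --env.USE_NEON=ON --env.USE_EMBEDDED_KERNELS=ON || return 1
    ck install package:lib-mxnet-master-cpu --env.USE_F16C=0 || return 1
    ck install package:lib-nnvm-tvm-master-opencl || return 1
}
```

Source the file and call `install_request_deps` after pulling the repository. The system packages installed with apt-get are deliberately left out, since they need sudo.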
This program must first be compiled
$ ck compile program:request-armcl-inference
and then executed as follows:
$ ck run program:request-armcl-inference --cmd_key=all
You can also use "ck benchmark" command to automatically set CPU/GPU frequency to max, compile program, run it N times and perform statistical analysis on empirical characteristics:
$ ck benchmark program:request-armcl-inference --cmd_key=all
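To illustrate the kind of statistical summary that repeated runs make possible, here is a minimal sketch that reduces a list of timings (one per line) to count, mean, min, max and population standard deviation. It is not CK's implementation; the function name `summarize_times` and the input format are assumptions made for this example.

```shell
# Hypothetical helper: summarize one timing value per line read from stdin.
summarize_times() {
    awk '
        NF == 0 { next }                      # skip blank lines
        {
            x = $1 + 0
            n++; sum += x; sumsq += x * x
            if (n == 1 || x < min) min = x
            if (n == 1 || x > max) max = x
        }
        END {
            if (n == 0) { print "no samples"; exit 1 }
            mean = sum / n
            var = sumsq / n - mean * mean     # population variance
            if (var < 0) var = 0              # guard against rounding error
            printf "n=%d mean=%.4f min=%.4f max=%.4f stddev=%.4f\n", n, mean, min, max, sqrt(var)
        }'
}

# Example with three made-up timings (not measured results).
printf '%s\n' 0.10 0.20 0.30 | summarize_times
```

Reporting the minimum alongside the mean is a common choice for benchmarks, since the minimum is less sensitive to interference from other processes.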
We validated results from the authors:
backend: ARMComputeLib-mali model: vgg16 conv_method: gemm dtype: float32 cost: 1.6511
backend: ARMComputeLib-mali model: vgg16 conv_method: gemm dtype: float16 cost: 0.976307
backend: ARMComputeLib-mali model: vgg16 conv_method: direct dtype: float32 cost: 3.99093
backend: ARMComputeLib-mali model: vgg16 conv_method: direct dtype: float16 cost: 1.61435
backend: ARMComputeLib-mali model: mobilenet conv_method: gemm dtype: float32 cost: 0.172009
backend: ARMComputeLib-mali model: mobilenet conv_method: direct dtype: float32 cost: 0.174635
Extra info: CK program meta
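With several configurations per model, it is handy to pick out the fastest one automatically. The sketch below assumes one result record per line in the "key: value" layout shown above and keeps the lowest-cost record for each model. The function name `best_per_model` is hypothetical.

```shell
# Hypothetical helper: keep the lowest-cost record per model.
# Accepts both "cost: 1.23" and "cost:1.23" spellings.
best_per_model() {
    awk '
        {
            model = ""; cost = ""
            for (i = 1; i <= NF; i++) {
                if ($i == "model:") model = $(i + 1)
                if ($i == "cost:") cost = $(i + 1)
                if ($i ~ /^cost:./) cost = substr($i, 6)
            }
            if (model == "" || cost == "") next
            if (!(model in best) || cost + 0 < best[model] + 0) {
                best[model] = cost; line[model] = $0
            }
        }
        END { for (m in best) print line[m] }' | sort
}

# The six ARM Compute Library results reported above.
best_per_model <<'EOF'
backend: ARMComputeLib-mali model: vgg16 conv_method: gemm dtype: float32 cost: 1.6511
backend: ARMComputeLib-mali model: vgg16 conv_method: gemm dtype: float16 cost: 0.976307
backend: ARMComputeLib-mali model: vgg16 conv_method: direct dtype: float32 cost: 3.99093
backend: ARMComputeLib-mali model: vgg16 conv_method: direct dtype: float16 cost: 1.61435
backend: ARMComputeLib-mali model: mobilenet conv_method: gemm dtype: float32 cost: 0.172009
backend: ARMComputeLib-mali model: mobilenet conv_method: direct dtype: float32 cost: 0.174635
EOF
```

On these six records it selects the float16 gemm run for vgg16 (0.976307) and the float32 gemm run for mobilenet (0.172009).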
$ ck run program:request-mxnet-inference --cmd_key=all
or
$ ck benchmark program:request-mxnet-inference --cmd_key=all
We validated results from the authors:
backend: MXNet+OpenBLAS model: resnet18 dtype: float32 cost:0.4145
backend: MXNet+OpenBLAS model: mobilenet dtype: float32 cost:0.3408
backend: MXNet+OpenBLAS model: vgg16 dtype: float32 cost:3.1244
Extra info: CK program meta and code
$ ck run program:request-tvm-nnvm-inference --cmd_key=all
or
$ ck benchmark program:request-tvm-nnvm-inference --cmd_key=all
We validated results from the authors:
backend: TVM-mali model: vgg16 dtype: float32 cost:0.9599
backend: TVM-mali model: vgg16 dtype: float16 cost:0.5688
backend: TVM-mali model: resnet18 dtype: float32 cost:0.1748
backend: TVM-mali model: resnet18 dtype: float16 cost:0.1122
backend: TVM-mali model: mobilenet dtype: float32 cost:0.0814
backend: TVM-mali model: mobilenet dtype: float16 cost:0.0525
Extra info: CK program meta and code
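To compare the three backends, the validated costs above can be turned into speedup ratios (baseline cost divided by TVM cost). The sketch below uses only numbers from the listings above; the helper name `speedup` is hypothetical, and the ratio is unitless regardless of the unit in which the costs are reported.

```shell
# Hypothetical helper: ratio of a baseline cost to a new cost, two decimals.
# A value above 1 means the second argument is the faster configuration.
speedup() {
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%.2f\n", base / new }'
}

echo "vgg16 float32, ARMComputeLib gemm vs TVM: $(speedup 1.6511 0.9599)x"
echo "mobilenet float32, ARMComputeLib gemm vs TVM: $(speedup 0.172009 0.0814)x"
echo "resnet18 float32, MXNet+OpenBLAS vs TVM: $(speedup 0.4145 0.1748)x"
```

These ratios compare single validated runs, so they say nothing about run-to-run variation.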
The original benchmarking clients in this ReQuEST submission did not include real classification. We therefore provided code for real image classification for each of the above CK programs. This code is also required to calculate model accuracy on all (or a subset) of the ImageNet data set.
You can benchmark classification using MXNet with OpenBLAS as follows:
$ ck benchmark program:request-mxnet-inference --cmd_key=classify
You can also install ImageNet data sets for model accuracy validation as follows:
$ ck install package:imagenet-2012-val
or
$ ck install package:imagenet-2012-val-min-resized
$ ck install package:imagenet-2012-aux
You can then run accuracy test as follows:
$ ck run program:request-mxnet-inference --cmd_key=test --env.STAT_REPEAT=1
You can find raw accuracy results (top1/top5) for several models here.
Extra info: CK program meta and code
You can benchmark classification and test accuracy using TVM/NNVM as follows:
$ ck benchmark program:request-tvm-nnvm-inference --cmd_key=classify
$ ck run program:request-tvm-nnvm-inference --cmd_key=test --env.STAT_REPEAT=1
You can find raw accuracy results (top1/top5) for several models here.
Extra info: CK program meta and code
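The accuracy tests above report top1/top5 figures. Top-1 counts an image as correct when the highest-ranked prediction matches the true label. Top-5 counts it as correct when the true label appears among the five highest-ranked predictions. The sketch below computes both from a plain-text file. The input format (true label followed by up to five predicted labels per line) and the function name `topk_accuracy` are assumptions made for illustration, not the format the CK programs produce.

```shell
# Hypothetical helper. Assumed input, one image per line:
#   <true_label> <pred1> <pred2> <pred3> <pred4> <pred5>
topk_accuracy() {
    awk '
        NF < 2 { next }                       # need a label and a prediction
        {
            n++
            if ($2 == $1) top1++              # best prediction is correct
            for (i = 2; i <= NF && i <= 6; i++)
                if ($i == $1) { top5++; break }
        }
        END {
            if (n == 0) { print "no images"; exit 1 }
            printf "images=%d top1=%.4f top5=%.4f\n", n, top1 / n, top5 / n
        }'
}

# Example with made-up labels (not ImageNet results).
topk_accuracy <<'EOF'
cat dog cat fox owl bee
dog dog cat fox owl bee
EOF
```

Top-5 accuracy is always at least as high as top-1, because every top-1 hit is also a top-5 hit.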
ReQuEST promotes reusability of AI/ML workflows, packages and artifacts using the CK framework. Since image classification using ArmCL was already implemented and shared in the CK format and added to the ReQuEST scoreboard, we can simply reuse this workflow and compare against public results!
Please follow this README to reproduce the ArmCL classification results on Firefly-RK3399!
Validated experimental results were recorded and processed using the following scripts (we plan to automate this further for future ReQuEST editions):
$ ck find script:benchmark-request-tvm-arm
They are now available in this CK repo and on the public ReQuEST scoreboard.
This workflow was converted to CK and validated by the following reviewers: