ngc-container-replicator
Clones nvcr.io using either DGX (compute.nvidia.com) or NGC (ngc.nvidia.com) API keys.
The replicator will make an offline clone of the NGC/DGX container registry. In its current form, the replicator will download every CUDA container image as well as each Deep Learning framework image in the NVIDIA project.
Tarfiles will be saved in /output inside the container, so be sure to volume-mount that directory. In the following example, we will collect our images in /tmp on the host.
Use --min-version to limit the number of versions to download. In the example below, we will only clone DL framework images of version 17.12 and later.
docker run --rm -it -v /var/run/docker.sock:/var/run/docker.sock -v /tmp:/output deepops/replicator --project=nvidia --min-version=17.12 --api-key=<your-dgx-or-ngc-api-key>
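To make the effect of a YY.MM cutoff concrete, here is a minimal sketch of how such a filter partitions a list of tags. It is an illustration only, not the replicator's actual implementation: the helper name `filter_min_version` is hypothetical, and it assumes tags start with a `YY.MM` version optionally followed by a `-suffix` such as `-py3`.

```shell
#!/bin/sh
# Hypothetical helper (not part of the replicator): read tags on stdin and
# print those whose leading YY.MM version is at or above the given minimum.
filter_min_version() {
    min="$1"
    awk -v min="$min" '
        # Turn "YY.MM" into the integer YY*100+MM so versions compare numerically.
        function key(v,   p) { split(v, p, "."); return p[1] * 100 + p[2] }
        {
            # Keep only the part before any "-" suffix such as "-py3".
            split($0, parts, "-")
            # Skip tags that do not look like YY.MM (for example "latest").
            if (parts[1] ~ /^[0-9]+\.[0-9]+$/ && key(parts[1]) >= key(min)) print
        }'
}

printf '%s\n' 17.10 17.12 18.01-py3 latest | filter_min_version 17.12
```

With a cutoff of 17.12, only `17.12` and `18.01-py3` pass; `17.10` is older than the cutoff and `latest` carries no version to compare.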
You can also filter on specific images. If you only wanted TensorFlow, PyTorch and TensorRT, you would simply add --image for each option, e.g.
docker run --rm -it -v /var/run/docker.sock:/var/run/docker.sock -v /tmp:/output deepops/replicator --project=nvidia --min-version=17.12 --image=tensorflow --image=pytorch --image=tensorrt --dry-run --api-key=<your-dgx-or-ngc-api-key>
Note: the --dry-run option lets you see what will happen without committing to a lengthy download.
Note: a state.yml file will be created in the output directory. This saved state is used to avoid pulling images that were previously pulled. If you wish to re-pull and save an image, just delete the entry in state.yml corresponding to the image_name and tag you wish to refresh.
If you don't already have a deepops namespace, create one now.
kubectl create namespace deepops
Next, create a secret with your NGC API key:
kubectl -n deepops create secret generic ngc-secret --from-literal=apikey=<your-api-key-goes-here>
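If you want to confirm what was stored, note that Kubernetes keeps Secret values base64-encoded under `.data`. The sketch below shows the decode step on a placeholder value; the `kubectl` line in the comment assumes the secret and key names used above, and `example-key` is not a real API key.

```shell
#!/bin/sh
# Kubernetes stores Secret values base64-encoded under .data.
# With cluster access, the stored key could be read back with:
#   kubectl -n deepops get secret ngc-secret -o jsonpath='{.data.apikey}' | base64 -d

# Local round trip with a placeholder value (not a real key):
encoded="$(printf '%s' 'example-key' | base64)"
printf '%s\n' "$encoded"            # the encoded form, as it would appear in .data
printf '%s' "$encoded" | base64 -d  # decodes back to the original value
```

Because `--from-literal` stores the value exactly as typed, the decoded output should match your key with no trailing newline.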
Next, create a persistent volume claim that will live outside the lifecycle of the CronJob. If you are using DeepOps, you can use a Rook/Ceph PVC similar to:
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ngc-replicator-pvc
  namespace: deepops
  labels:
    app: ngc-replicator
spec:
  storageClassName: rook-raid0-retain  # <== Replace with your StorageClass
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 32Mi
Finally, create a CronJob that executes the replicator on a schedule. This example runs the replicator once a day at 04:00 (schedule "0 4 * * *"). Note: This example uses Rook block storage to provide a persistent volume to hold the state.yml between executions. This ensures you will only download new container images. For more details, see our DeepOps project.
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: replicator-config
  namespace: deepops
data:
  ngc-update.sh: |
    #!/bin/bash
    ngc_replicator \
      --project=nvidia \
      --min-version=$(date +"%y.%m" -d "1 month ago") \
      --py-version=py3 \
      --image=tensorflow --image=pytorch --image=tensorrt \
      --no-exporter \
      --registry-url=registry.local  # <== Replace with your local repo
---
apiVersion: batch/v1beta1  # removed in Kubernetes 1.25; use batch/v1 on 1.21 and later
kind: CronJob
metadata:
  name: ngc-replicator
  namespace: deepops
  labels:
    app: ngc-replicator
spec:
  schedule: "0 4 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          nodeSelector:
            node-role.kubernetes.io/master: ""
          containers:
          - name: replicator
            image: deepops/replicator
            imagePullPolicy: Always
            command: [ "/bin/sh", "-c", "/ngc-update/ngc-update.sh" ]
            env:
            - name: NGC_REPLICATOR_API_KEY
              valueFrom:
                secretKeyRef:
                  name: ngc-secret
                  key: apikey
            volumeMounts:
            - name: registry-config
              mountPath: /ngc-update
            - name: docker-socket
              mountPath: /var/run/docker.sock
            - name: ngc-replicator-storage
              mountPath: /output
          volumes:
          - name: registry-config
            configMap:
              name: replicator-config
              defaultMode: 0777
          - name: docker-socket
            hostPath:
              path: /var/run/docker.sock
              type: File
          - name: ngc-replicator-storage
            persistentVolumeClaim:
              claimName: ngc-replicator-pvc
          restartPolicy: Never
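The script above derives its cutoff with `date +"%y.%m" -d "1 month ago"`, so each run keeps roughly the last month of releases. The sketch below shows what that expression evaluates to for fixed reference dates. It assumes GNU date (coreutils), and the helper name `min_version_for` is hypothetical. GNU date applies "1 month ago" by decrementing the month number, so a run on the 31st can land back in the current month; anchoring on the 15th, as this sketch does, is one way to avoid that.

```shell
#!/bin/sh
# Assumes GNU date (coreutils); BSD/macOS date uses different flags.
# Hypothetical helper: print YY.MM for the month before a YYYY-MM-DD date.
min_version_for() {
    # Anchor on the 15th so month-end dates cannot spill into the next month.
    anchor="$(date -d "$1" +%Y-%m)-15"
    date -d "$anchor 1 month ago" +"%y.%m"
}

min_version_for 2018-03-15   # 18.02
min_version_for 2018-03-31   # 18.02 (the unanchored form can yield 18.03 here)
min_version_for 2018-01-10   # 17.12 (crosses the year boundary)
```

Whether the month-end edge case matters depends on how often new images are published; with a daily schedule it would only widen or narrow the window on a few days each year.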
make dev
py.test
save markdown readmes for each image. these are not version controlled
test local registry push service. coded, beta testing
add templater to workflow