gpt2-k8s-cloud-run

2020-03-03

Cloud Run - GPT2 on k8s

Containerising PyTorch models in a repeatable way. Deploy OpenAI's GPT-2 model and expose it over a Flask API. Finally, push the container to Google Container Registry (GCR) and publish it on a Kubernetes cluster using Cloud Run.

Before anything else, download the model:

mkdir models
curl --output models/gpt2-pytorch_model.bin https://s3.amazonaws.com/models.huggingface.co/bert/gpt2-pytorch_model.bin
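
The download step can also be scripted so that a re-run does not fetch the large weights file again. The following is a minimal sketch, not part of the original repository: the helper name `ensure_model` and its `fetch` parameter are hypothetical, and it only assumes the same URL and target path as the curl command above.

```python
import os
import urllib.request

# Same URL and destination as the curl command above.
MODEL_URL = "https://s3.amazonaws.com/models.huggingface.co/bert/gpt2-pytorch_model.bin"
MODEL_PATH = os.path.join("models", "gpt2-pytorch_model.bin")


def ensure_model(path=MODEL_PATH, url=MODEL_URL, fetch=urllib.request.urlretrieve):
    """Download the weights to `path` unless a non-empty file is already there.

    Returns True if a download happened, False if the existing file was reused.
    `fetch` is injectable (hypothetical parameter) so the logic can be tested offline.
    """
    if os.path.isfile(path) and os.path.getsize(path) > 0:
        return False
    # Equivalent of `mkdir models`, but safe to repeat.
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    fetch(url, path)
    return True
```

Calling `ensure_model()` from the repository root should leave the file where the rest of the steps expect it.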

Local

Local Python

python3 -m venv ./venv
source venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt

Then run the Python Flask server using the following:

cd deployment
python run_server.py

Conda

conda env create -f environment.yml -n ml-flask
source activate ml-flask

Then run the Python Flask server using the following:

cd deployment
python run_server.py

docker-compose

Setup

docker-compose up --build flask

Go to http://localhost:5000
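
The container can take a while to start, so it may help to wait for the port to answer before opening the browser. Below is a small sketch of such a check using only the standard library. The function name `wait_until_up` and its defaults are assumptions for illustration; it treats any HTTP response, including an error status, as "the server is up".

```python
import time
import urllib.error
import urllib.request


def wait_until_up(url, attempts=30, delay=1.0, timeout=5.0):
    """Poll `url` until the server sends any HTTP response.

    Returns True once a response arrives, False if every attempt fails
    to connect.
    """
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=timeout):
                return True
        except urllib.error.HTTPError:
            # The server answered, just not with a 2xx status.
            return True
        except (urllib.error.URLError, OSError):
            # Connection refused or timed out: wait and retry.
            time.sleep(delay)
    return False
```

For example, `wait_until_up("http://localhost:5000")` can be run right after `docker-compose up --build flask` in a second terminal.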

Shutdown

docker-compose down -v

GCP

First build and push the container to GCR (make sure to update the variables at the top of the script first)

./container_push.sh

# #!/usr/bin/env bash
# PROJECT_ID=devopstar
#
# # Set gcloud project
# gcloud config set project $PROJECT_ID
#
# # Authenticate Docker
# gcloud auth configure-docker
#
# # Build Container
# docker build -t gpt-2-flask-api .
#
# # Tag Image for GCR
# docker tag gpt-2-flask-api:latest \
#     asia.gcr.io/$PROJECT_ID/gpt-2-flask-api:latest
#
# # Push to GCR
# docker push asia.gcr.io/$PROJECT_ID/gpt-2-flask-api:latest
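
To make the relationship between the project ID and the image name explicit, the steps of the push script can be written out as a list of commands. This is an illustrative sketch only: `container_push_commands` is a hypothetical helper that builds the same five commands as `container_push.sh` without executing anything.

```python
def container_push_commands(project_id, image="gpt-2-flask-api",
                            registry="asia.gcr.io", tag="latest"):
    """Return the build/tag/push commands from container_push.sh as argv lists."""
    local = f"{image}:{tag}"
    remote = f"{registry}/{project_id}/{image}:{tag}"
    return [
        ["gcloud", "config", "set", "project", project_id],  # set gcloud project
        ["gcloud", "auth", "configure-docker"],              # authenticate Docker
        ["docker", "build", "-t", local, "."],               # build container
        ["docker", "tag", local, remote],                    # tag image for GCR
        ["docker", "push", remote],                          # push to GCR
    ]
```

Each entry could then be passed to `subprocess.run(cmd, check=True)` in order, which stops at the first failing step.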

Create the cluster

Making use of the following guide: https://cloud.google.com/run/docs/quickstarts/prebuilt-deploy-gke

Run ./gke.sh after changing the project ID (and region if you want somewhere closer)

#!/usr/bin/env bash
PROJECT_ID=devopstar
REGION=australia-southeast1

gcloud beta container clusters create "$PROJECT_ID-gpt2-demo" \
    --project "$PROJECT_ID" \
    --zone "$REGION-a" \
    --no-enable-basic-auth \
    --cluster-version "1.12.6-gke.10" \
    --machine-type "n1-standard-4" \
    --image-type "COS" \
    --disk-type "pd-standard" \
    --disk-size "100" \
    --metadata disable-legacy-endpoints=true \
    --scopes "https://www.googleapis.com/auth/cloud-platform" \
    --num-nodes "3" \
    --enable-stackdriver-kubernetes \
    --enable-ip-alias \
    --network "projects/$PROJECT_ID/global/networks/default" \
    --subnetwork "projects/$PROJECT_ID/regions/$REGION/subnetworks/default" \
    --default-max-pods-per-node "110" \
    --addons HorizontalPodAutoscaling,HttpLoadBalancing,Istio,CloudRun \
    --istio-config auth=MTLS_PERMISSIVE \
    --enable-autoupgrade \
    --enable-autorepair

Get Istio IP

kubectl get svc istio-ingressgateway -n istio-system

export GATEWAY_IP=$(kubectl -n istio-system get service \
    istio-ingressgateway \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo $GATEWAY_IP

curl -v -H "Host: gpt-2-flask-api.default.example.com" http://$GATEWAY_IP

To test the endpoint in a browser, use the Virtual Hosts Chrome extension: https://chrome.google.com/webstore/detail/virtual-hosts/aiehidpclglccialeifedhajckcpedom?hl=en

Set the virtual host to gpt-2-flask-api.default.example.com and the IP to the one you received from the Istio gateway.
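
The same Host-header trick used by the curl command and the browser extension can be reproduced from Python. A minimal sketch follows, assuming the gateway IP has already been looked up with kubectl; `build_gateway_request` is a hypothetical helper name, and the request is only constructed here, not sent.

```python
import urllib.request


def build_gateway_request(gateway_ip,
                          host="gpt-2-flask-api.default.example.com",
                          path="/"):
    """Build a request to the Istio gateway IP that carries the service's Host header.

    The ingress gateway routes on the Host header, so the URL uses the raw IP
    while the header names the Cloud Run service.
    """
    return urllib.request.Request(
        f"http://{gateway_ip}{path}",
        headers={"Host": host},
    )
```

Passing the result to `urllib.request.urlopen(req, timeout=30)` should behave like the curl call above.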

Delete Cluster

gcloud container clusters delete devopstar-gpt2-demo

Deploy to Cloud Run (GKE)

gcloud beta run deploy \
    --image asia.gcr.io/devopstar/gpt-2-flask-api \
    --cluster devopstar-gpt2-demo \
    --cluster-location australia-southeast1-a

# Service name: (gpt-2-flask-api):
# Deploying container to Cloud Run on GKE service [gpt-2-flask-api] in namespace [default] of cluster [devopstar-gpt2-demo]
# ⠧ Deploying new service... Configuration "gpt-2-flask-api" is waiting for a Revision to become ready.
#   ⠧ Creating Revision...
#   . Routing traffic...

Deploy to Cloud Run (No GKE)

Make sure to first enable Cloud Run APIs: https://console.developers.google.com/apis/api/run.googleapis.com

# Set Cloud Run region
gcloud config set run/region us-central1

# Run
gcloud beta run deploy \
    --image asia.gcr.io/devopstar/gpt-2-flask-api \
    --memory 2Gi

# Service name: (gpt-2-flask-api):
# Deploying container to Cloud Run service [gpt-2-flask-api] in project [devopstar] region [us-central1]
# Allow unauthenticated invocations to new service [gpt-2-flask-api]?
# (y/N)?  y
# ✓ Deploying new service... Done.
#   ✓ Creating Revision...
#   ✓ Routing traffic...
# Done.
# Service [gpt-2-flask-api] revision [gpt-2-flask-api-9eb49475-778f-4f11-8a5c-60d1ed3bd2ff] has been deployed and is serving traffic at https://gpt-2-flask-api-ulobqfivxa-uc.a.run.app

Navigate to URL https://gpt-2-flask-api-ulobqfivxa-uc.a.run.app/

Issues with Memory

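Unfortunately, due to memory limits, it doesn't look like we can use Cloud Run for this purpose at the moment. Generation runs at roughly 4 seconds per iteration, so the request hits the 300-second timeout and returns a 504 long before it finishes: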

# 2019-04-11T13:08:14.652058Z 16%|█▌ | 80/512 [04:23<29:04, 4.04s/it]
# 2019-04-11T13:08:18.862046Z 16%|█▌ | 81/512 [04:27<29:07, 4.06s/it]
# 2019-04-11T13:08:22.155164Z 16%|█▌ | 82/512 [04:31<29:24, 4.10s/it]
# 2019-04-11T13:08:26.152569Z 16%|█▌ | 83/512 [04:34<27:35, 3.86s/it]
# 2019-04-11T13:08:30.952013Z 16%|█▋ | 84/512 [04:38<27:49, 3.90s/it]
# 2019-04-11T13:08:34.552013Z 17%|█▋ | 85/512 [04:43<29:40, 4.17s/it]
# 2019-04-11T13:08:39.355836Z 17%|█▋ | 86/512 [04:47<28:23, 4.00s/it]
# 2019-04-11T13:08:43.051604Z 17%|█▋ | 87/512 [04:52<30:02, 4.24s/it]
# 2019-04-11T13:08:46.922621Z POST 504 234 B 300 s Chrome 73 /upstream request timeout

However... it does appear that the request timeout can be raised, either with an update command or at deploy time. The maximum that can be set is 15 minutes.

# On Update
gcloud beta run services update gpt-2-flask-api \
    --timeout=15m

# On Creation
gcloud beta run deploy \
    --image asia.gcr.io/devopstar/gpt-2-flask-api \
    --memory 2Gi \
    --timeout=15m

And... No dice

# 15 minute request only gets halfway
# 2019-04-12T12:59:52.451876Z 55%|█████▌ | 283/512 [14:23<06:42, 1.76s/it]
# 2019-04-12T12:59:57.242055Z 55%|█████▌ | 284/512 [14:26<08:26, 2.22s/it]
# 2019-04-12T13:00:02.343637Z 56%|█████▌ | 285/512 [14:31<11:19, 2.99s/it]
# 2019-04-12T13:00:06.045967Z 56%|█████▌ | 286/512 [14:36<13:39, 3.63s/it]
# 2019-04-12T13:00:09.955038Z 56%|█████▌ | 287/512 [14:40<13:40, 3.65s/it]
# 2019-04-12T13:00:14.548409Z 56%|█████▋ | 288/512 [14:44<13:54, 3.73s/it]
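
The progress lines make it possible to estimate why even the 15 minute limit is not enough. As a rough sketch (the helper names are hypothetical, and the projection simply assumes the average pace so far continues), the total run time can be extrapolated from the completed iterations and the elapsed time:

```python
def parse_mmss(text):
    """Convert a tqdm-style 'MM:SS' elapsed time into seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)


def projected_total_seconds(done, total, elapsed_seconds):
    """Extrapolate the full run time from the average pace so far."""
    return elapsed_seconds * total / done


# Last log line above: 288 of 512 iterations after 14:44.
estimate = projected_total_seconds(288, 512, parse_mmss("14:44"))
MAX_TIMEOUT_SECONDS = 15 * 60
print(f"projected {estimate / 60:.1f} min vs limit {MAX_TIMEOUT_SECONDS / 60:.0f} min")
```

On these numbers the projection comes out at roughly 26 minutes, well past the 15 minute maximum, which is consistent with the request only getting about halfway.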

Custom Domain

If you already have a verified custom domain, you can also attach a subdomain to the endpoint. Start by checking or verifying a domain you own:

gcloud domains verify devopstar.com

Map the service to a subdomain

gcloud beta run domain-mappings create \
    --service gpt-2-flask-api \
    --domain gpt2.devopstar.com

# Creating......done.
# Mapping successfully created. Waiting for certificate provisioning. You must configure your DNS records for certificate issuance to begin.
# RECORD TYPE  CONTENTS
# CNAME        ghs.googlehosted.com

In your DNS provider, add a CNAME entry for your service

Access your endpoint on the custom domain https://gpt2.devopstar.com

Bugs

Can't set PORT variable in GUI

Error: (Name cannot be one of reserved names (i.e. GOOGLE_CLOUD_PROJECT, GOOGLE_CLOUD_REGION, HOME, K_CONFIGURATION, K_REVISION, K_SERVICE, PATH, PORT, PWD and TMPDIR)
Because PORT is reserved and cannot be overridden, the service needs to listen on port 8080.
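
One way to cope with this is to have the server read the port from the environment and fall back to 8080, so the same code runs locally and on Cloud Run. The snippet below is a sketch under that assumption; `resolve_port` is a hypothetical helper and is not taken from the repository's `run_server.py`.

```python
import os


def resolve_port(environ=os.environ, default=8080):
    """Return the port to listen on: the PORT variable if set, else `default`."""
    return int(environ.get("PORT", default))
```

In a Flask entry point this would typically be used as `app.run(host="0.0.0.0", port=resolve_port())`, where `app` is assumed to be the Flask application object.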

Attribution
