
Optimize Cloud Costs with Vertex AI & PaLM 2 API

Written by Petar Ojdrovic | Aug 12, 2021 5:00:00 AM

Hey community! I am looking into replacing some of our OpenAI usage with Vertex AI / PaLM 2 and am wondering how to authenticate to the necessary APIs and use them in production. We would be calling these APIs from Cloud Functions.

 

gcloud auth print-access-token does not work in cloud functions.

I've used service account credentials to initialize clients for other APIs like translate, bigquery, etc., but this endpoint does not accept a service account credential.

Should I use the runtime's service account and get an access token that way?

curl -s "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token" -H "Metadata-Flavor: Google" | jq -r .access_token
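For what it's worth, the same metadata-server call can be made from inside the function without curl or jq. Below is a minimal Python sketch using only the standard library. It assumes the function runs with a service account attached, and the helper names (parse_token_response, fetch_access_token) are made up for illustration, not part of any Google library.

```python
import json
import urllib.request
from typing import Tuple

# Same endpoint as the curl command above; only reachable from inside Google Cloud.
TOKEN_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/token"
)


def parse_token_response(body: bytes) -> Tuple[str, int]:
    """Extract the access token and its remaining lifetime in seconds."""
    data = json.loads(body)
    return data["access_token"], int(data["expires_in"])


def fetch_access_token(timeout: float = 5.0) -> Tuple[str, int]:
    """Ask the metadata server for a token for the runtime's service account."""
    request = urllib.request.Request(TOKEN_URL, headers={"Metadata-Flavor": "Google"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_token_response(response.read())
```

The token expires, so a long-lived instance would need to call fetch_access_token() again once the returned lifetime runs out rather than caching the token forever.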

 

Here's the sample request from the Vertex AI playground:

API_ENDPOINT="us-central1-aiplatform.googleapis.com"
PROJECT_ID="home-service-62a33"
MODEL_ID="text-bison@001"

curl \
-X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://${API_ENDPOINT}/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/${MODEL_ID}:predict" -d \
$'{
  "instances": [
    {
      "content": ""
    }
  ],
  "parameters": {
    "temperature": 0.2,
    "maxOutputTokens": 256,
    "topP": 0.8,
    "topK": 40
  }
}'
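A rough Python equivalent of the playground request, for code running where gcloud is not available. This is only a sketch: it assumes a valid OAuth access token is passed in as an argument (however it was obtained), and build_predict_request and predict are hypothetical helper names. The URL shape and parameters are copied from the sample above.

```python
import json
import urllib.request


def build_predict_request(token, prompt, project_id,
                          model_id="text-bison@001", region="us-central1"):
    """Build the POST request from the playground sample (nothing is sent yet)."""
    url = (
        f"https://{region}-aiplatform.googleapis.com/v1/projects/{project_id}"
        f"/locations/{region}/publishers/google/models/{model_id}:predict"
    )
    payload = {
        "instances": [{"content": prompt}],
        "parameters": {
            "temperature": 0.2,
            "maxOutputTokens": 256,
            "topP": 0.8,
            "topK": 40,
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def predict(token, prompt, project_id):
    """Send the request and return the decoded JSON response."""
    request = build_predict_request(token, prompt, project_id)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Keeping the request construction separate from the network call makes it easy to check the URL and body in a unit test before anything is deployed.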

 

More broadly, how should we authenticate to these APIs from a Cloud Function?

 

Thanks in advance!

 

Best answer by pthiagar

Generally, once the service account policy binding is set up for the function, the most reliable way to generate tokens is to do so programmatically using the authentication libraries.
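As one possible illustration of the programmatic route, here is a sketch using the google-auth Python library. It assumes the google-auth and requests packages are installed in the function's environment, which the thread does not confirm, and that the cloud-platform scope is acceptable for your use case. The helper names bearer_header and default_bearer_header are hypothetical.

```python
# Broad scope covering Google Cloud APIs; narrow it if your setup allows.
CLOUD_PLATFORM_SCOPE = "https://www.googleapis.com/auth/cloud-platform"


def bearer_header(credentials, auth_request):
    """Refresh the credentials and return an Authorization header dict."""
    credentials.refresh(auth_request)
    return {"Authorization": f"Bearer {credentials.token}"}


def default_bearer_header():
    """Build the header from Application Default Credentials.

    Inside a Cloud Function this should resolve to the runtime's
    service account without any key file.
    """
    # Imported here so the module still loads where google-auth is absent.
    import google.auth
    from google.auth.transport.requests import Request

    credentials, _project_id = google.auth.default(scopes=[CLOUD_PLATFORM_SCOPE])
    return bearer_header(credentials, Request())
```

The library handles token refresh, which is the main advantage over fetching and caching tokens by hand.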

If you have to do this manually, there are several ways. In your case, you should be able to use the Compute metadata server to get an ID token for a specific audience, like this:

curl "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/identity?audience=AUDIENCE" \
     -H "Metadata-Flavor: Google"

Here, AUDIENCE is the URL of the invoked function. You can then pass the Google-signed ID token in the Authorization: Bearer header of the request.
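The identity call above could look roughly like this in Python, using only the standard library. It is a sketch with hypothetical helper names (build_identity_url, fetch_id_token). Note that this yields an ID token for invoking another function, whereas the playground sample in the question sends an access token (the output of gcloud auth print-access-token), so the two are not interchangeable.

```python
import urllib.parse
import urllib.request

# Same endpoint as the curl command above; only reachable from inside Google Cloud.
IDENTITY_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/identity"
)


def build_identity_url(audience: str) -> str:
    """Append the URL-encoded audience (the target function's URL)."""
    return f"{IDENTITY_URL}?{urllib.parse.urlencode({'audience': audience})}"


def fetch_id_token(audience: str, timeout: float = 5.0) -> str:
    """Return the Google-signed ID token (a JWT) as a string."""
    request = urllib.request.Request(
        build_identity_url(audience), headers={"Metadata-Flavor": "Google"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

The returned string goes into the Authorization header as "Bearer <token>" when calling the target function.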

See the docs on authentication options for invocation for more info.

Hope this helps!