# Getting Started
This guide walks you through pulling the AIMD Docker image, deploying it to Cloud Run, and making your first prediction.
## Prerequisites
- Docker installed
- Google Cloud SDK (`gcloud`) installed and authenticated
- Access granted to the AIMD Artifact Registry repository (provided by Beatdapp)
- API key for authentication (provided by Beatdapp)
## Step 1: Pull the Docker Image
Authenticate Docker with Artifact Registry:
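A typical invocation, assuming your repository's region is in a `${REGION}` shell variable (e.g. `us-central1`):

```shell
# Register gcloud as a Docker credential helper for Artifact Registry
gcloud auth configure-docker ${REGION}-docker.pkg.dev
```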
Pull the image using the tag from the Release Notes:
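For example, with `${TAG}` set to the release tag from the Release Notes (the image path below matches the one used in the deploy command in Step 3):

```shell
# Pull the AIMD inference image from Artifact Registry
docker pull ${REGION}-docker.pkg.dev/${GCP_PROJECT}/aimd-inference/aimd-inference-mvp:${TAG}
```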
**Verify the image digest**
After pulling, verify the image digest matches the one listed in the Release Notes for your version:
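One way to read the digest of the pulled image is with `docker inspect`; the `sha256:…` portion of the output is the value to compare against the Release Notes:

```shell
# Print the repository digest of the locally pulled image
docker inspect --format='{{index .RepoDigests 0}}' \
  ${REGION}-docker.pkg.dev/${GCP_PROJECT}/aimd-inference/aimd-inference-mvp:${TAG}
```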
## Step 2: Configure Environment Variables
At minimum, you need:
| Variable | Description |
|---|---|
| `GCP_PROJECT` | Your Google Cloud project ID |
| `API_KEYS` | Comma-separated API keys for authentication |
See Configuration for the full list of environment variables.
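If you are deploying from a shell, the variables referenced by the deploy command in Step 3 can be exported up front. The values below are placeholders, not defaults:

```shell
export GCP_PROJECT="your-project-id"   # your Google Cloud project ID
export REGION="us-central1"            # a region with NVIDIA L4 availability
export TAG="your-release-tag"          # image tag from the Release Notes
export API_KEY="your-api-key"          # key provided by Beatdapp
```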
## Step 3: Deploy to Cloud Run
The AIMD service requires a GPU. NVIDIA L4 GPUs are available in these Cloud Run regions: us-central1, us-east4, europe-west1, europe-west4, asia-southeast1.
```shell
gcloud run deploy aimd-inference-mvp \
  --image ${REGION}-docker.pkg.dev/${GCP_PROJECT}/aimd-inference/aimd-inference-mvp:${TAG} \
  --region us-central1 \
  --gpu 1 \
  --gpu-type nvidia-l4 \
  --cpu 4 \
  --memory 16Gi \
  --min-instances 0 \
  --max-instances 1 \
  --timeout 300 \
  --concurrency 4 \
  --set-env-vars "GCP_PROJECT=${GCP_PROJECT},API_KEYS=${API_KEY}" \
  --execution-environment gen2 \
  --no-allow-unauthenticated \
  --project ${GCP_PROJECT}
```
**Note**
- Models are baked into the Docker image — no GCS paths or volume mounts needed.
- If prompted about zonal redundancy quota, select Y to deploy without it.
- Use `--min-instances 0` to scale to zero when idle (cost optimization).
## Step 4: Verify the Deployment
Get your service URL:
```shell
SERVICE_URL=$(gcloud run services describe aimd-inference-mvp \
  --region us-central1 \
  --format 'value(status.url)')
```
Run a health check (no API key required):
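A minimal check with `curl`. The `/health` path is an assumption based on the response fields shown below; note that while no `X-API-Key` header is needed, an identity token is still required because the service is deployed with `--no-allow-unauthenticated`:

```shell
# Hit the health endpoint; Cloud Run IAM auth still applies
curl -H "Authorization: Bearer $(gcloud auth print-identity-token)" \
  ${SERVICE_URL}/health
```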
Expected response:
```json
{
  "status": "healthy",
  "device": "cuda",
  "models_loaded": 5,
  "xgb_loaded": true,
  "gpu_available": true,
  "gpu_memory_allocated_mb": 1234.5
}
```
## Step 5: Make Your First Prediction
```shell
curl -X POST ${SERVICE_URL}/predict \
  -H "Authorization: Bearer $(gcloud auth print-identity-token)" \
  -H "Content-Type: application/json" \
  -H "X-API-Key: ${API_KEY}" \
  -d '{"gcs_uri": "gs://your-bucket/audio.mp3"}'
```
See API Reference for the full request/response specification and Usage Examples for more detailed examples.