
API Reference

Base URL: Your Cloud Run service URL (e.g., https://aimd-inference-mvp-xxxxx.run.app)

When the service is deployed with --no-allow-unauthenticated, every request must carry a GCP identity token in the Authorization header. In addition, POST /predict and GET /status require an API key in the X-API-Key header.
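The two authentication layers can be combined in one small helper. The following is a minimal Python sketch, not part of the service itself: `build_headers` is a hypothetical name, and it assumes the identity token and API key are obtained elsewhere (for example from `gcloud auth print-identity-token` and a secret store).

```python
from typing import Dict, Optional


def build_headers(identity_token: Optional[str] = None,
                  api_key: Optional[str] = None) -> Dict[str, str]:
    """Assemble request headers for the two authentication layers.

    identity_token: GCP identity token, needed when the service is
        deployed with --no-allow-unauthenticated.
    api_key: application-level key, needed for POST /predict and GET /status.
    """
    headers: Dict[str, str] = {}
    if identity_token:
        headers["Authorization"] = f"Bearer {identity_token}"
    if api_key:
        headers["X-API-Key"] = api_key
    return headers
```

For GET /health and GET /metrics, only the identity token (if any) would be passed; for POST /predict and GET /status, both arguments would be supplied.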


POST /predict

Run AI detection on an audio file.

Authentication

  • Authorization: Bearer <identity-token> (Cloud Run IAM)
  • X-API-Key: <api-key> (application-level)

Request Body

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| gcs_uri | string | Yes | - | GCS URI of the audio file (gs://bucket/path/file.mp3) |
| num_snippets | integer or "max" | No | "max" | Number of 30-second snippets to analyze. "max" analyzes all possible snippets. |
| xgb_threshold | float (0.0–1.0) | No | 0.5 | Decision threshold for the ensemble classifier |
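Checking the request fields on the client side can catch mistakes before a round trip to the service. The sketch below builds the JSON body from the documented fields and defaults. The function name `build_predict_payload` is hypothetical, and the rule that an integer num_snippets must be at least 1 is an assumption; the service's own validation may differ.

```python
from typing import Any, Dict, Union


def build_predict_payload(gcs_uri: str,
                          num_snippets: Union[int, str] = "max",
                          xgb_threshold: float = 0.5) -> Dict[str, Any]:
    """Validate the documented fields and return the JSON body as a dict."""
    if not gcs_uri.startswith("gs://") or len(gcs_uri) <= len("gs://"):
        raise ValueError("gcs_uri must look like gs://bucket/path/file.mp3")
    # bool is a subclass of int in Python, so reject it explicitly.
    is_int = isinstance(num_snippets, int) and not isinstance(num_snippets, bool)
    # Assumption: integer values must be >= 1.
    if not (num_snippets == "max" or (is_int and num_snippets >= 1)):
        raise ValueError('num_snippets must be a positive integer or "max"')
    if not 0.0 <= xgb_threshold <= 1.0:
        raise ValueError("xgb_threshold must be between 0.0 and 1.0")
    return {
        "gcs_uri": gcs_uri,
        "num_snippets": num_snippets,
        "xgb_threshold": xgb_threshold,
    }
```

The returned dict can be serialized with `json.dumps` and sent as the POST body with Content-Type: application/json.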

Response (200 OK)

| Field | Type | Description |
| --- | --- | --- |
| filename | string | Original filename extracted from the GCS URI |
| prediction | "AI" or "REAL" | Final ensemble prediction |
| probability | float | Ensemble output probability (0.0 = definitely real, 1.0 = definitely AI) |
| confidence | float | Confidence score (0.0 = uncertain, 1.0 = highly confident) |
| snippet_results | array | Per-snippet breakdown (see below) |
| model_probabilities | object | Average probability from each individual model |
| processing_time_ms | float | Total processing time in milliseconds |

snippet_results items

| Field | Type | Description |
| --- | --- | --- |
| snippet_id | integer | 1-indexed snippet number |
| start_time | float | Snippet start time in seconds |
| end_time | float | Snippet end time in seconds |
| probability | float | Snippet-level AI probability |
| prediction | "AI" or "REAL" | Snippet-level prediction |

model_probabilities keys

The model_probabilities object contains one key per model in the ensemble; each value is that model's average probability across all snippets. The set of keys is fixed within a given release version.

Example

curl -X POST ${SERVICE_URL}/predict \
  -H "Authorization: Bearer $(gcloud auth print-identity-token)" \
  -H "Content-Type: application/json" \
  -H "X-API-Key: ${API_KEY}" \
  -d '{"gcs_uri": "gs://my-bucket/audio/song.mp3"}'

Response:

{
  "filename": "song.mp3",
  "prediction": "AI",
  "probability": 0.85,
  "confidence": 0.70,
  "snippet_results": [
    {
      "snippet_id": 1,
      "start_time": 0.0,
      "end_time": 30.0,
      "probability": 0.88,
      "prediction": "AI"
    },
    {
      "snippet_id": 2,
      "start_time": 30.0,
      "end_time": 60.0,
      "probability": 0.82,
      "prediction": "AI"
    }
  ],
  "model_probabilities": {
    "aggro_cnn": 0.88,
    "tory_cnn": 0.75,
    "aggro_stt": 0.91,
    "tory_stt": 0.80,
    "udio_lite": 0.85
  },
  "processing_time_ms": 2345.6
}
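
A client will usually want a few headline values from this response. The following Python sketch parses a response body shaped like the example above and pulls out the final prediction, the snippets labeled "AI", and the model with the highest average probability. `summarize_prediction` is a hypothetical helper, and it assumes all documented fields are present.

```python
import json
from typing import Any, Dict, List


def summarize_prediction(body: str) -> Dict[str, Any]:
    """Extract headline fields from a /predict response body (JSON text)."""
    data = json.loads(body)
    snippets: List[Dict[str, Any]] = data["snippet_results"]
    ai_ids = [s["snippet_id"] for s in snippets if s["prediction"] == "AI"]
    probs: Dict[str, float] = data["model_probabilities"]
    # Model with the highest average probability, or None if the object is empty.
    top_model = max(probs, key=probs.get, default=None)
    return {
        "filename": data["filename"],
        "prediction": data["prediction"],
        "probability": data["probability"],
        "ai_snippet_ids": ai_ids,
        "total_snippets": len(snippets),
        "top_model": top_model,
    }
```

Because the model_probabilities keys can differ between releases, the sketch iterates over whatever keys are present instead of hard-coding model names.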

Error Responses

| Status | Error | Description |
| --- | --- | --- |
| 400 | Invalid GCS URI | Malformed gs:// path or path traversal detected |
| 400 | Download failed | File not accessible in GCS (permissions or not found) |
| 401 | Unauthorized | Missing or invalid X-API-Key header |
| 422 | Inference failed | Model error during processing |
| 429 | Rate limit exceeded | Too many requests. Check Retry-After header. |
| 503 | Service busy | All GPU slots occupied. Retry after a few seconds. |
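
Of these errors, only 429 and 503 are described as worth retrying. The sketch below shows one way a client might choose a wait time: honor Retry-After on 429 when it is a number of seconds, and otherwise fall back to capped exponential backoff. The function name, the 2-second base delay, and the 60-second cap are all assumptions, not values defined by the service.

```python
from typing import Mapping, Optional

RETRYABLE_STATUSES = {429, 503}


def retry_delay_seconds(status: int,
                        headers: Mapping[str, str],
                        attempt: int,
                        base_delay: float = 2.0,
                        max_delay: float = 60.0) -> Optional[float]:
    """Return seconds to wait before retrying, or None if not retryable.

    attempt is 0 for the first retry, 1 for the second, and so on.
    """
    if status not in RETRYABLE_STATUSES:
        return None
    retry_after = headers.get("Retry-After")
    if status == 429 and retry_after is not None:
        try:
            return max(0.0, min(float(retry_after), max_delay))
        except ValueError:
            pass  # Retry-After may also be an HTTP date; fall back to backoff.
    # Capped exponential backoff: base, 2*base, 4*base, ...
    return min(base_delay * (2 ** attempt), max_delay)
```

The header lookup here is case-sensitive because it uses a plain mapping; most HTTP client libraries expose response headers through a case-insensitive mapping, which can be passed in directly.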

GET /health

Health check endpoint. Used by Cloud Run for startup and liveness probes.

Authentication

No X-API-Key required. Cloud Run IAM (Authorization header) may still apply.

Response (200 OK)

| Field | Type | Description |
| --- | --- | --- |
| status | "healthy" or "unhealthy" | Service health status |
| device | string | Compute device ("cuda" or "cpu") |
| models_loaded | integer | Number of loaded models |
| xgb_loaded | boolean | Whether the ensemble classifier is loaded |
| gpu_available | boolean | Whether a GPU is detected |
| gpu_memory_allocated_mb | float | Current GPU memory usage in MB |

Example

curl ${SERVICE_URL}/health \
  -H "Authorization: Bearer $(gcloud auth print-identity-token)"
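
A deployment script may want a stricter readiness check than the status field alone. The Python sketch below interprets a parsed /health response. The readiness rule it applies (status is "healthy", at least one model loaded, ensemble classifier loaded, and optionally a GPU in use) is an assumption made for illustration, and `is_ready` is a hypothetical name.

```python
from typing import Any, Mapping


def is_ready(health: Mapping[str, Any], require_gpu: bool = False) -> bool:
    """Decide whether the service looks ready, given a parsed /health body."""
    if health.get("status") != "healthy":
        return False
    # Assumption: readiness needs at least one model plus the ensemble classifier.
    if health.get("models_loaded", 0) < 1 or not health.get("xgb_loaded", False):
        return False
    if require_gpu:
        # Require both a detected GPU and the "cuda" compute device.
        if not health.get("gpu_available", False) or health.get("device") != "cuda":
            return False
    return True
```

Setting require_gpu=True would flag a deployment that silently fell back to CPU inference.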

GET /metrics

Prometheus metrics endpoint.

Authentication

No X-API-Key required.

Response

Returns metrics in Prometheus text exposition format.

Available metrics:

| Metric | Type | Labels | Description |
| --- | --- | --- | --- |
| inference_requests_total | Counter | status, endpoint | Total requests by status and endpoint |
| inference_request_duration_seconds | Histogram | endpoint | Request latency (buckets: 0.1s to 120s) |
| rate_limit_rejections_total | Counter | - | Total rate limit rejections |
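
For quick inspection without a Prometheus server, the text exposition format can be read with a few lines of code. The following Python sketch is a simplified parser that handles plain sample lines and skips comments; it is not a full implementation of the format. The function names are hypothetical, and the label values in the usage example (such as status="success") are illustrative, not values documented for this service.

```python
import re
from typing import Dict, List, Tuple

Sample = Tuple[str, Dict[str, str], float]

# metric_name{label="value",...} value   (the label block is optional)
_SAMPLE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
_LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text: str) -> List[Sample]:
    """Parse Prometheus text format into (name, labels, value) tuples."""
    samples: List[Sample] = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = _SAMPLE.match(line)
        if not match:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def sum_metric(samples: List[Sample], name: str, **labels: str) -> float:
    """Sum all samples of a metric whose labels include the given pairs."""
    return sum(v for n, l, v in samples
               if n == name and all(l.get(k) == val for k, val in labels.items()))
```

For example, `sum_metric(parse_metrics(text), "inference_requests_total", endpoint="/predict")` would total the request counter for one endpoint across all status values.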

GET /status

Detailed service status including rate limiter state. Useful for debugging.

Authentication

  • Authorization: Bearer <identity-token> (Cloud Run IAM)
  • X-API-Key: <api-key> (required)

Response (200 OK)

Returns a JSON object containing engine (the same fields as the /health response) and the rate_limiter state.

Example

curl ${SERVICE_URL}/status \
  -H "Authorization: Bearer $(gcloud auth print-identity-token)" \
  -H "X-API-Key: ${API_KEY}"

Request Tracing

Every request is assigned a unique request ID for tracing. You can supply your own in the X-Request-ID request header; otherwise the service generates one. The request ID is returned in the X-Request-ID response header.
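
Supplying the ID from the client makes it possible to log it before the request is sent and to match it against service logs afterwards. A minimal Python sketch follows; `with_request_id` is a hypothetical helper, and using a UUID as the ID format is an assumption, since no required format is documented.

```python
import uuid
from typing import Dict, Mapping, Optional


def with_request_id(headers: Mapping[str, str],
                    request_id: Optional[str] = None) -> Dict[str, str]:
    """Return a copy of headers with X-Request-ID set.

    A random UUID4 string is generated when no ID is given (assumed format).
    """
    out = dict(headers)  # copy so the caller's mapping is not mutated
    out["X-Request-ID"] = request_id or str(uuid.uuid4())
    return out
```

The client can then compare the X-Request-ID response header with the value it sent when correlating logs.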