Introduction

Send a POST request, get an AI response. No GPUs to provision, no infrastructure to manage. ZeroGPU handles model hosting, scaling, and routing. You get an API key and start building.

First API call in 5 minutes

Three steps: get credentials, send a request, see the response.

What you get

One API endpoint

POST /v1/responses — send text in, get AI output back. Six SDKs included.

Real-time analytics

Track every request: token counts, latency, volume, model distribution.

Project isolation

Separate dev, staging, and production with independent keys, logs, and analytics.
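As a rough illustration of what independent keys per project can look like on the client side, the sketch below selects credentials by environment name. The environment variable names (`ZEROGPU_DEV_API_KEY` and so on) and the `load_credentials` helper are hypothetical; they are not part of the platform.

```python
import os

# Hypothetical naming scheme: one API key and one project ID per environment,
# stored in environment variables. These names are assumptions for this sketch.
ENVIRONMENTS = ("dev", "staging", "production")


def load_credentials(environment, env=os.environ):
    """Return the API key and project ID for one environment."""
    if environment not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment!r}")
    prefix = f"ZEROGPU_{environment.upper()}"
    try:
        return {
            "api_key": env[f"{prefix}_API_KEY"],
            "project_id": env[f"{prefix}_PROJECT_ID"],
        }
    except KeyError as missing:
        # Fail loudly rather than falling back to another environment's key.
        raise RuntimeError(f"missing environment variable {missing}") from None
```

Because each environment resolves to its own key and project ID, a staging job cannot silently write into production logs or analytics.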

How it works

Create a project

One organization, multiple projects. Each gets its own API key and dashboard.

Send requests

POST to /v1/responses with your API key, project ID, and model. The response comes back as structured JSON.
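A minimal sketch of such a request, using only the Python standard library. Only the `POST /v1/responses` path comes from this page; the base URL, the header names, and the body field names (`model`, `input`) are assumptions made for illustration, so check the API Reference for the real schema.

```python
import json
import urllib.request

# Placeholder host: the real base URL is not given on this page.
BASE_URL = "https://api.example.com"


def build_request(api_key, project_id, model, text):
    """Build (but do not send) a POST request to /v1/responses."""
    # Assumed body shape: field names are illustrative only.
    body = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/responses",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "X-Project-Id": project_id,            # assumed header name
            "Content-Type": "application/json",
        },
    )


def send(request):
    """Send the request and decode the structured JSON response."""
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Separating `build_request` from `send` lets you inspect or log the outgoing request before any network call is made.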

Monitor everything

Token usage, request volume, latency, error rates — all visible in the dashboard. Debug individual requests in Logs.

Go deeper

Quickstart

Credentials → first request → response. Under 5 minutes.

API Reference

Full endpoint spec with request/response schemas.

SDK Examples

Copy-paste examples for Python, JavaScript, Rust, Go, Ruby.

Dashboard

Find your API key, project ID, and a ready-to-use code snippet.
