Skip to content

Universal Endpoint

You can use the Universal Endpoint to contact every provider.

https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}

Description

AI Gateway offers multiple endpoints for each Gateway you create - one endpoint per provider, and one Universal Endpoint. The Universal Endpoint requires some adjusting to your schema, but supports additional features. Some of these features are, for example, retrying a request if it fails the first time, or configuring a fallback model/provider.

You can use the Universal endpoint to contact every provider. The payload is expecting an array of message, and each message is an object with the following parameters:

  • provider : the name of the provider you would like to direct this message to. Can be OpenAI, workers-ai, or any of our supported providers.
  • endpoint: the pathname of the provider API you’re trying to reach. For example, on OpenAI it can be chat/completions, and for Workers AI this might be @cf/meta/llama-3.1-8b-instruct. See more in the sections that are specific to each provider.
  • authorization: the content of the Authorization HTTP Header that should be used when contacting this provider. This usually starts with “Token” or “Bearer”.
  • query: the payload as the provider expects it in their official API.

Example

Request
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id} \
--header 'Content-Type: application/json' \
--data '[
{
"provider": "workers-ai",
"endpoint": "@cf/meta/llama-3.1-8b-instruct",
"headers": {
"Authorization": "Bearer {cloudflare_token}",
"Content-Type": "application/json"
},
"query": {
"messages": [
{
"role": "system",
"content": "You are a friendly assistant"
},
{
"role": "user",
"content": "What is Cloudflare?"
}
]
}
},
{
"provider": "openai",
"endpoint": "chat/completions",
"headers": {
"Authorization": "Bearer {open_ai_token}",
"Content-Type": "application/json"
},
"query": {
"model": "gpt-4o-mini",
"stream": true,
"messages": [
{
"role": "user",
"content": "What is Cloudflare?"
}
]
}
}
]'

The above will send a request to Workers AI Inference API, if it fails it will proceed to OpenAI. You can add as many fallbacks as you need, just by adding another JSON in the array.

Websockets API beta

The Universal Endpoint can also be accessed via a WebSockets API which provides a single persistent connection, enabling continuous communication. This API supports all AI providers connected to AI Gateway, including those that do not natively support WebSockets.

Example request

import WebSocket from "ws";
const ws = new WebSocket(
"wss://gateway.ai.cloudflare.com/v1/my-account-id/my-gateway/",
{
headers: {
"cf-aig-authorization": "Bearer AI_GATEWAY_TOKEN",
},
},
);
ws.send(
JSON.stringify({
type: "universal.create",
request: {
eventId: "my-request",
provider: "workers-ai",
endpoint: "@cf/meta/llama-3.1-8b-instruct",
headers: {
Authorization: "Bearer WORKERS_AI_TOKEN",
"Content-Type": "application/json",
},
query: {
prompt: "tell me a joke",
},
},
}),
);
ws.on("message", function incoming(message) {
console.log(message.toString());
});