google.com

Command Palette

Search for a command to run...

OpenAI compatibility  |  Gemini API  |  Google AI for Developers

Last updated: 6/19/2026

Gemini Deep Research is now available in preview with collaborative planning, visualization, MCP support, and more.

Was this helpful?

Send feedback

OpenAI compatibility

Gemini models are accessible using the OpenAI libraries (Python and TypeScript / Javascript) along with the REST API, by updating three lines of code and using your Gemini API key. If you aren't already using the OpenAI libraries, we recommend that you call the Gemini API directly.

PythonJavaScriptRESTMore

from openai import OpenAI client = OpenAI( api_key="GEMINI_API_KEY", base_url="https://generativelanguage.googleapis.com/v1beta/openai/" ) response = client.chat.completions.create( model="gemini-3.5-flash", messages=[\ { "role": "system",\ "content": "You are a helpful assistant."\ },\ {\ "role": "user",\ "content": "Explain to me how AI works"\ }\ ] ) print(response.choices[0].message) import OpenAI from "openai"; const openai = new OpenAI({ apiKey: "GEMINI_API_KEY", baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/" }); const response = await openai.chat.completions.create({ model: "gemini-3.5-flash", messages: [\ { role: "system",\ content: "You are a helpful assistant."\ },\ {\ role: "user",\ content: "Explain to me how AI works",\ },\ ], }); console.log(response.choices[0].message); curl "https://generativelanguage.googleapis.com/v1beta/openai/chat/completions" \ -H "Content-Type: application/json" \ -H "Authorization: Bearer $GEMINI_API_KEY" \ -d '{ "model": "gemini-3.5-flash", "messages": [\ {\ "role": "user",\ "content": "Explain to me how AI works"\ }\ ] }'

What changed? Just three lines!

  • api_key="GEMINI_API_KEY": Replace "GEMINI_API_KEY" with your actual Gemini API key, which you can get in Google AI Studio.

  • base_url="https://generativelanguage.googleapis.com/v1beta/openai/": This tells the OpenAI library to send requests to the Gemini API endpoint instead of the default URL.

  • model="gemini-3.5-flash": Choose a compatible Gemini model

Thinking

Gemini models are trained to think through complex problems, leading to significantly improved reasoning. The Gemini API comes with thinking\ parameters which give fine grain control over how much the model will think.

Different Gemini models have different reasoning configurations, you can see how they map to OpenAI's reasoning efforts as follows:

reasoning_effort (OpenAI)thinking_level (Gemini 3.1 Pro)thinking_level (Gemini 3.1 Flash-Lite)thinking_level (Gemini 3 Flash)thinking_budget (Gemini 2.5)
minimallowminimalminimal1,024
lowlowlowlow1,024
mediummediummediummedium8,192
highhighhighhigh24,576

If no reasoning_effort is specified, Gemini uses the model's default level or budget.

If you want to disable thinking, you can set reasoning_effort to "none" for 2.5 models. Reasoning cannot be turned off for Gemini 2.5 Pro or 3 models.

PythonJavaScriptRESTMore

from openai import OpenAI client = OpenAI( api_key="GEMINI_API_KEY", base_url="https://generativelanguage.googleapis.com/v1beta/openai/" ) response = client.chat.completions.create( model="gemini-3.5-flash", reasoning_effort="low", messages=[\ { "role": "system",\ "content": "You are a helpful assistant."\ },\ {\ "role": "user",\ "content": "Explain to me how AI works"\ }\ ] ) print(response.choices[0].message) import OpenAI from "openai"; const openai = new OpenAI({ apiKey: "GEMINI_API_KEY", baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/" }); const response = await openai.chat.completions.create({ model: "gemini-3.5-flash", reasoning_effort: "low", messages: [\ { role: "system",\ content: "You are a helpful assistant."\ },\ {\ role: "user",\ content: "Explain to me how AI works",\ },\ ], }); console.log(response.choices[0].message); curl "https://generativelanguage.googleapis.com/v1beta/openai/chat/completions" \ -H "Content-Type: application/json" \ -H "Authorization: Bearer $GEMINI_API_KEY" \ -d '{ "model": "gemini-3.5-flash", "reasoning_effort": "low", "messages": [\ {\ "role": "user",\ "content": "Explain to me how AI works"\ }\ ] }'

Gemini thinking models also produce thought summaries. You can use the extra_body field to include Gemini fields in your request.

Note that reasoning_effort and thinking_level/thinking_budget overlap functionality, so they can't be used at the same time.

PythonJavaScriptRESTMore

from openai import OpenAI client = OpenAI( api_key="GEMINI_API_KEY", base_url="https://generativelanguage.googleapis.com/v1beta/openai/" ) response = client.chat.completions.create( model="gemini-3.5-flash", messages=[{"role": "user", "content": "Explain to me how AI works"}], extra_body={ 'extra_body': { "google": { "thinking_config": { "thinking_level": "low", "include_thoughts": True } } } } ) print(response.choices[0].message) import OpenAI from "openai"; const openai = new OpenAI({ apiKey: "GEMINI_API_KEY", baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/" }); const response = await openai.chat.completions.create({ model: "gemini-3.5-flash", messages: [{role: "user", content: "Explain to me how AI works",}], extra_body: { "google": { "thinking_config": { "thinking_level": "low", "include_thoughts": true } } } }); console.log(response.choices[0].message); curl "https://generativelanguage.googleapis.com/v1beta/openai/chat/completions" \ -H "Content-Type: application/json" \ -H "Authorization: Bearer GEMINI_API_KEY" \ -d '{ "model": "gemini-3.5-flash", "messages": [{"role": "user", "content": "Explain to me how AI works"}], "extra_body": { "google": { "thinking_config": { "thinking_level": "low", "include_thoughts": true } } } }'

Gemini 3 supports OpenAI compatibility for thought signatures in chat completion APIs. You can find the full example on the thought signatures page.

Streaming

The Gemini API supports streaming responses.

PythonJavaScriptRESTMore

from openai import OpenAI client = OpenAI( api_key="GEMINI_API_KEY", base_url="https://generativelanguage.googleapis.com/v1beta/openai/" ) response = client.chat.completions.create( model="gemini-3.5-flash", messages=[\ {\ "role": "system",\ "content": "You are a helpful assistant."\ },\ { "role": "user",\ "content": "Hello!"\ }\ ], stream=True ) for chunk in response: print(chunk.choices[0].delta) import OpenAI from "openai"; const openai = new OpenAI({ apiKey: "GEMINI_API_KEY", baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/" }); async function main() { const completion = await openai.chat.completions.create({ model: "gemini-3.5-flash", messages: [\ {\ "role": "system",\ "content": "You are a helpful assistant."\ },\ {\ "role": "user",\ "content": "Hello!"\ }\ ], stream: true, }); for await (const chunk of completion) { console.log(chunk.choices[0].delta.content); } } main(); curl "https://generativelanguage.googleapis.com/v1beta/openai/chat/completions" \ -H "Content-Type: application/json" \ -H "Authorization: Bearer GEMINI_API_KEY" \ -d '{ "model": "gemini-3.5-flash", "messages": [\ {"role": "user", "content": "Explain to me how AI works"}\ ], "stream": true }'

Function calling

Function calling makes it easier for you to get structured data outputs from generative models and is supported in the Gemini API.

PythonJavaScriptRESTMore

from openai import OpenAI client = OpenAI( api_key="GEMINI_API_KEY", base_url="https://generativelanguage.googleapis.com/v1beta/openai/" ) tools = [\ {\ "type": "function",\ "function": {\ "name": "get_weather",\ "description": "Get the weather in a given location",\ "parameters": {\ "type": "object",\ "properties": {\ "location": {\ "type": "string",\ "description": "The city and state, e.g. Chicago, IL",\ },\ "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},\ },\ "required": ["location"],\ },\ }\ }\ ] messages = [{"role": "user", "content": "What's the weather like in Chicago today?"}] response = client.chat.completions.create( model="gemini-3.5-flash", messages=messages, tools=tools, tool_choice="auto" ) print(response) import OpenAI from "openai"; const openai = new OpenAI({ apiKey: "GEMINI_API_KEY", baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/" }); async function main() { const messages = [{"role": "user", "content": "What's the weather like in Chicago today?"}]; const tools = [\ {\ "type": "function",\ "function": {\ "name": "get_weather",\ "description": "Get the weather in a given location",\ "parameters": {\ "type": "object",\ "properties": {\ "location": {\ "type": "string",\ "description": "The city and state, e.g. Chicago, IL",\ },\ "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},\ },\ "required": ["location"],\ },\ }\ }\ ]; const response = await openai.chat.completions.create({ model: "gemini-3.5-flash", messages: messages, tools: tools, tool_choice: "auto", }); console.log(response); } main(); curl "https://generativelanguage.googleapis.com/v1beta/openai/chat/completions" \ -H "Content-Type: application/json" \ -H "Authorization: Bearer GEMINI_API_KEY" \ -d '{ "model": "gemini-3.5-flash", "messages": [\ {\ "role": "user",\ "content": "What'\''s the weather like in Chicago today?"\ }\ ], "tools": [\ {\ "type": "function",\ "function": {\ "name": "get_weather",\ "description": "Get the current weather in a given location",\ "parameters": {\ "type": "object",\ "properties": {\ "location": {\ "type": "string",\ "description": "The city and state, e.g. Chicago, IL"\ },\ "unit": {\ "type": "string",\ "enum": ["celsius", "fahrenheit"]\ }\ },\ "required": ["location"]\ }\ }\ }\ ], "tool_choice": "auto" }'

Image understanding

Gemini models are natively multimodal and provide best in class performance on many common vision tasks.

PythonJavaScriptRESTMore

import base64 from openai import OpenAI client = OpenAI( api_key="GEMINI_API_KEY", base_url="https://generativelanguage.googleapis.com/v1beta/openai/" ) # Function to encode the image def encode_image(image_path): with open(image_path, "rb") as image_file: return base64.b64encode(image_file.read()).decode('utf-8') # Getting the base64 string base64_image = encode_image("Path/to/agi/image.jpeg") response = client.chat.completions.create( model="gemini-3.5-flash", messages=[\ {\ "role": "user",\ "content": [\ {\ "type": "text",\ "text": "What is in this image?",\ },\ {\ "type": "image_url",\ "image_url": {\ "url": f"data:image/jpeg;base64,{base64_image}"\ },\ },\ ],\ }\ ], ) print(response.choices[0]) import OpenAI from "openai"; import fs from 'fs/promises'; const openai = new OpenAI({ apiKey: "GEMINI_API_KEY", baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/" }); async function encodeImage(imagePath) { try { const imageBuffer = await fs.readFile(imagePath); return imageBuffer.toString('base64'); } catch (error) { console.error("Error encoding image:", error); return null; } } async function main() { const imagePath = "Path/to/agi/image.jpeg"; const base64Image = await encodeImage(imagePath); const messages = [\ {\ "role": "user",\ "content": [\ {\ "type": "text",\ "text": "What is in this image?",\ },\ {\ "type": "image_url",\ "image_url": {\ "url": `data:image/jpeg;base64,${base64Image}`\ },\ },\ ],\ }\ ]; try { const response = await openai.chat.completions.create({ model: "gemini-3.5-flash", messages: messages, }); console.log(response.choices[0]); } catch (error) { console.error("Error calling Gemini API:", error); } } main(); bash -c ' base64_image=$(base64 -i "Path/to/agi/image.jpeg"); curl "https://generativelanguage.googleapis.com/v1beta/openai/chat/completions" \ -H "Content-Type: application/json" \ -H "Authorization: Bearer GEMINI_API_KEY" \ -d "{ \"model\": \"gemini-3.5-flash\", \"messages\": [\ {\ \"role\": \"user\",\ \"content\": [\ { \"type\": \"text\", \"text\": \"What is in this image?\" },\ {\ \"type\": \"image_url\",\ \"image_url\": { \"url\": \"data:image/jpeg;base64,${base64_image}\" }\ }\ ]\ }\ ] }" '

Generate an image

Generate an image using gemini-2.5-flash-image or gemini-3-pro-image-preview. Supported parameters include prompt, model, n, size, and response_format. Any other parameters not listed here or in the extra_body section will be silently ignored by the compatibility layer.

PythonJavaScriptRESTMore

import base64 from openai import OpenAI from PIL import Image from io import BytesIO client = OpenAI( api_key="GEMINI_API_KEY", base_url="https://generativelanguage.googleapis.com/v1beta/openai/", ) response = client.images.generate( model="gemini-2.5-flash-image", prompt="a portrait of a sheepadoodle wearing a cape", response_format='b64_json', n=1, ) for image_data in response.data: image = Image.open(BytesIO(base64.b64decode(image_data.b64_json))) image.show() import OpenAI from "openai"; const openai = new OpenAI({ apiKey: "GEMINI_API_KEY", baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/", }); async function main() { const image = await openai.images.generate( { model: "gemini-2.5-flash-image", prompt: "a portrait of a sheepadoodle wearing a cape", response_format: "b64_json", n: 1, } ); console.log(image.data); } main(); curl "https://generativelanguage.googleapis.com/v1beta/openai/images/generations" \ -H "Content-Type: application/json" \ -H "Authorization: Bearer GEMINI_API_KEY" \ -d '{ "model": "gemini-2.5-flash-image", "prompt": "a portrait of a sheepadoodle wearing a cape", "response_format": "b64_json", "n": 1, }'

Generate a video

Generate a video using veo-3.1-generate-preview via the Sora-compatible /v1/videos endpoint. Supported top-level parameters are prompt and model. Additional parameters like duration_seconds, image, and aspect_ratio must be passed with extra_body. See the extra_body section for all available parameters.

Video generation is a long-running operation that returns an operation ID you can poll for completion.

PythonJavaScriptRESTMore

from openai import OpenAI client = OpenAI( api_key="GEMINI_API_KEY", base_url="https://generativelanguage.googleapis.com/v1beta/openai/" ) # Returns a Long Running Operation (status: processing) response = client.videos.create( model="veo-3.1-generate-preview", prompt="A cinematic drone shot of a waterfall", ) print(f"Operation ID: {response.id}") print(f"Status: {response.status}") import OpenAI from "openai"; const openai = new OpenAI({ apiKey: "GEMINI_API_KEY", baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/" }); async function main() { // Returns a Long Running Operation (status: processing) const response = await openai.videos.create({ model: "veo-3.1-generate-preview", prompt: "A cinematic drone shot of a waterfall", }); console.log(`Operation ID: ${response.id}`); console.log(`Status: ${response.status}`); } main(); curl "https://generativelanguage.googleapis.com/v1beta/openai/videos" \ -H "Authorization: Bearer $GEMINI_API_KEY" \ -F "model=veo-3.1-generate-preview" \ -F "prompt=A cinematic drone shot of a waterfall"

Check video status

Video generation is asynchronous. Use GET /v1/videos/{id} to poll the status and retrieve the final video URL when complete:

PythonJavaScriptRESTMore

import time from openai import OpenAI client = OpenAI( api_key="GEMINI_API_KEY", base_url="https://generativelanguage.googleapis.com/v1beta/openai/" ) # Poll until video is ready video_id = response.id # From the create call while True: video = client.videos.retrieve(video_id) if video.status == "completed": print(f"Video URL: {video.url}") break elif video.status == "failed": print(f"Generation failed: {video.error}") break print(f"Status: {video.status}. Waiting...") time.sleep(10) import OpenAI from "openai"; const openai = new OpenAI({ apiKey: "GEMINI_API_KEY", baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/" }); async function main() { // Poll until video is ready const videoId = response.id; // From the create call while (true) { const video = await openai.videos.retrieve(videoId); if (video.status === "completed") { console.log(`Video URL: ${video.url}`); break; } else if (video.status === "failed") { console.log(`Generation failed: ${video.error}`); break; } console.log(`Status: ${video.status}. Waiting...`); await new Promise(resolve => setTimeout(resolve, 10000)); } } main(); curl "https://generativelanguage.googleapis.com/v1beta/openai/videos/VIDEO_ID" \ -H "Authorization: Bearer $GEMINI_API_KEY"

Audio understanding

Analyze audio input:

PythonJavaScriptRESTMore

import base64 from openai import OpenAI client = OpenAI( api_key="GEMINI_API_KEY", base_url="https://generativelanguage.googleapis.com/v1beta/openai/" ) with open("/path/to/your/audio/file.wav", "rb") as audio_file: base64_audio = base64.b64encode(audio_file.read()).decode('utf-8') response = client.chat.completions.create( model="gemini-3.5-flash", messages=[\ {\ "role": "user",\ "content": [\ {\ "type": "text",\ "text": "Transcribe this audio",\ },\ {\ "type": "input_audio",\ "input_audio": {\ "data": base64_audio,\ "format": "wav"\ }\ }\ ],\ }\ ], ) print(response.choices[0].message.content) import fs from "fs"; import OpenAI from "openai"; const client = new OpenAI({ apiKey: "GEMINI_API_KEY", baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/", }); const audioFile = fs.readFileSync("/path/to/your/audio/file.wav"); const base64Audio = Buffer.from(audioFile).toString("base64"); async function main() { const response = await client.chat.completions.create({ model: "gemini-3.5-flash", messages: [\ {\ role: "user",\ content: [\ {\ type: "text",\ text: "Transcribe this audio",\ },\ {\ type: "input_audio",\ input_audio: {\ data: base64Audio,\ format: "wav",\ },\ },\ ],\ },\ ], }); console.log(response.choices[0].message.content); } main(); bash -c ' base64_audio=$(base64 -i "/path/to/your/audio/file.wav"); curl "https://generativelanguage.googleapis.com/v1beta/openai/chat/completions" \ -H "Content-Type: application/json" \ -H "Authorization: Bearer GEMINI_API_KEY" \ -d "{ \"model\": \"gemini-3.5-flash\", \"messages\": [\ {\ \"role\": \"user\",\ \"content\": [\ { \"type\": \"text\", \"text\": \"Transcribe this audio file.\" },\ {\ \"type\": \"input_audio\",\ \"input_audio\": {\ \"data\": \"${base64_audio}\",\ \"format\": \"wav\"\ }\ }\ ]\ }\ ] }" '

Structured output

Gemini models can output JSON objects in any structure you define.

PythonJavaScriptMore

from pydantic import BaseModel from openai import OpenAI client = OpenAI( api_key="GEMINI_API_KEY", base_url="https://generativelanguage.googleapis.com/v1beta/openai/" ) class CalendarEvent(BaseModel): name: str date: str participants: list[str] completion = client.beta.chat.completions.parse( model="gemini-3.5-flash", messages=[\ {"role": "system", "content": "Extract the event information."},\ {"role": "user", "content": "John and Susan are going to an AI conference on Friday."},\ ], response_format=CalendarEvent, ) print(completion.choices[0].message.parsed) import OpenAI from "openai"; import { zodResponseFormat } from "openai/helpers/zod"; import { z } from "zod"; const openai = new OpenAI({ apiKey: "GEMINI_API_KEY", baseURL: "https://generativelanguage.googleapis.com/v1beta/openai" }); const CalendarEvent = z.object({ name: z.string(), date: z.string(), participants: z.array(z.string()), }); const completion = await openai.chat.completions.parse({ model: "gemini-3.5-flash", messages: [\ { role: "system", content: "Extract the event information." },\ { role: "user", content: "John and Susan are going to an AI conference on Friday" },\ ], response_format: zodResponseFormat(CalendarEvent, "event"), }); const event = completion.choices[0].message.parsed; console.log(event);

Embeddings

Text embeddings measure the relatedness of text strings and can be generated using the Gemini API. You can use gemini-embedding-2-preview for multimodal embeddings or gemini-embedding-001 for text-only embeddings.

PythonJavaScriptRESTMore

from openai import OpenAI client = OpenAI( api_key="GEMINI_API_KEY", base_url="https://generativelanguage.googleapis.com/v1beta/openai/" ) response = client.embeddings.create( input="Your text string goes here", model="gemini-embedding-2-preview" ) print(response.data[0].embedding) import OpenAI from "openai"; const openai = new OpenAI({ apiKey: "GEMINI_API_KEY", baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/" }); async function main() { const embedding = await openai.embeddings.create({ model: "gemini-embedding-2-preview", input: "Your text string goes here", }); console.log(embedding); } main(); curl "https://generativelanguage.googleapis.com/v1beta/openai/embeddings" \ -H "Content-Type: application/json" \ -H "Authorization: Bearer GEMINI_API_KEY" \ -d '{ "input": "Your text string goes here", "model": "gemini-embedding-2-preview" }'

Batch API

You can create batch jobs, submit them, and check their status using the OpenAI library.

You'll need to prepare the JSONL file in OpenAI input format. For example:

{"custom_id": "request-1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gemini-3.5-flash", "messages": [{"role": "user", "content": "Tell me a one-sentence joke."}]}} {"custom_id": "request-2", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gemini-3.5-flash", "messages": [{"role": "user", "content": "Why is the sky blue?"}]}}

OpenAI compatibility for Batch supports creating a batch, monitoring job status, and viewing batch results.

Compatibility for upload and download is currently not supported. Instead, the following example uses the genai client for uploading and downloading files, the same as when using the Gemini Batch API.

PythonMore

from openai import OpenAI # Regular genai client for uploads & downloads from google import genai client = genai.Client() openai_client = OpenAI( api_key="GEMINI_API_KEY", base_url="https://generativelanguage.googleapis.com/v1beta/openai/" ) # Upload the JSONL file in OpenAI input format, using regular genai SDK uploaded_file = client.files.upload( file='my-batch-requests.jsonl', config=types.UploadFileConfig(display_name='my-batch-requests', mime_type='jsonl') ) # Create batch batch = openai_client.batches.create( input_file_id=batch_input_file_id, endpoint="/v1/chat/completions", completion_window="24h" ) # Wait for batch to finish (up to 24h) while True: batch = client.batches.retrieve(batch.id) if batch.status in ('completed', 'failed', 'cancelled', 'expired'): break print(f"Batch not finished. Current state: {batch.status}. Waiting 30 seconds...") time.sleep(30) print(f"Batch finished: {batch}") # Download results in OpenAI output format, using regular genai SDK file_content = genai_client.files.download(file=batch.output_file_id).decode('utf-8') # See batch_output JSONL in OpenAI output format for line in file_content.splitlines(): print(line)

The OpenAI SDK also supports generating embeddings with the Batch API. To do so, switch out the create method's endpoint field for an embeddings endpoint, as well as the url and model keys in the JSONL file:

# JSONL file using embeddings model and endpoint # {"custom_id": "request-1", "method": "POST", "url": "/v1/embeddings", "body": {"model": "ggemini-embedding-001", "messages": [{"role": "user", "content": "Tell me a one-sentence joke."}]}} # {"custom_id": "request-2", "method": "POST", "url": "/v1/embeddings", "body": {"model": "gemini-embedding-001", "messages": [{"role": "user", "content": "Why is the sky blue?"}]}} # ... # Create batch step with embeddings endpoint batch = openai_client.batches.create( input_file_id=batch_input_file_id, endpoint="/v1/embeddings", completion_window="24h" )

See the Batch embedding generation section of the OpenAI compatibility cookbook for a complete example.

Flex and Priority inference

The Gemini API matches OpenAI's service_tier parameter in name and logic, enforcing limits and gracefully directing traffic for both the Flex and Priority inference tiers.

PythonMore

from openai import OpenAI client = OpenAI( api_key="GEMINI_API_KEY", base_url="https://generativelanguage.googleapis.com/v1beta/openai/" ) completion = client.chat.completions.create( model="gemini-3.5-flash", messages=[\ {"role": "user", "content": "Write a short poem about clouds."}\ ], service_tier="priority" # Or service_tier="flex" ) print(completion)

When not explicitly assigned, service_tier defaults to standard, equivalent to default for OpenAI. Learn more about inference tiers in the Optimization documentation.

Enable Gemini features with extra_body

There are several features supported by Gemini that are not available in OpenAI models but can be enabled using the extra_body field.

ParameterTypeEndpointDescription
cached_contentTextChatCorresponds to Gemini's general content cache.
thinking_configObjectChatCorresponds to Gemini's ThinkingConfig.
aspect_ratioTextImagesOutput aspect ratio (e.g., "16:9", "1:1", "9:16").
generation_configObjectImagesGemini generation config object (e.g., {"responseModalities": ["IMAGE"], "candidateCount": 2}).
safety_settingsListImagesCustom safety threshold filters (e.g., [{"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "BLOCK_NONE"}]).
toolsListImagesEnables grounding (e.g., [{"google_search": {}}]). Only for gemini-3-pro-image-preview.
aspect_ratioTextVideoDimensions of the output video (16:9 for landscape, 9:16 for portrait). Maps from size if not specified.
resolutionTextVideoOutput resolution (720p, 1080p, 4K). Note: 1080p and 4K trigger upsampler pipeline.
duration_secondsIntegerVideoGeneration length (values: 4, 6, 8). Must be 8 when using reference_images, interpolation, or extension.
frame_rateTextVideoFrame rate for video output (e.g., "24").
input_referenceTextVideoReference input for video generation.
extend_video_idTextVideoID of an existing video to extend.
negative_promptTextVideoItems to exclude (e.g., "shaky camera").
seedIntegerVideoInteger for deterministic generation.
styleTextVideoVisual styling (cinematic default, creative for social-media optimized).
person_generationTextVideoControls generation of people (allow_adult, allow_all, dont_allow).
reference_imagesListVideoUp to 3 images for style/character reference (base64 assets).
imageTextVideoBase64-encoded initial input image to condition the video generation.
last_frameObjectVideoFinal image for interpolation (requires image as first frame).

Example using extra_body

Here's an example of using extra_body to set cached_content:

PythonMore

from openai import OpenAI client = OpenAI( api_key=MY_API_KEY, base_url="https://generativelanguage.googleapis.com/v1beta/" ) stream = client.chat.completions.create( model="gemini-3.5-flash", n=1, messages=[\ {\ "role": "user",\ "content": "Summarize the video"\ }\ ], stream=True, stream_options={'include_usage': True}, extra_body={ 'extra_body': { 'google': { 'cached_content': "cachedContents/0000aaaa1111bbbb2222cccc3333dddd4444eeee" } } } ) for chunk in stream: print(chunk) print(chunk.usage.to_dict())

List models

Get a list of available Gemini models:

PythonJavaScriptRESTMore

from openai import OpenAI client = OpenAI( api_key="GEMINI_API_KEY", base_url="https://generativelanguage.googleapis.com/v1beta/openai/" ) models = client.models.list() for model in models: print(model.id) import OpenAI from "openai"; const openai = new OpenAI({ apiKey: "GEMINI_API_KEY", baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/", }); async function main() { const list = await openai.models.list(); for await (const model of list) { console.log(model); } } main(); curl https://generativelanguage.googleapis.com/v1beta/openai/models \ -H "Authorization: Bearer GEMINI_API_KEY"

Retrieve a model

Retrieve a Gemini model:

PythonJavaScriptRESTMore

from openai import OpenAI client = OpenAI( api_key="GEMINI_API_KEY", base_url="https://generativelanguage.googleapis.com/v1beta/openai/" ) model = client.models.retrieve("gemini-3.5-flash") print(model.id) import OpenAI from "openai"; const openai = new OpenAI({ apiKey: "GEMINI_API_KEY", baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/", }); async function main() { const model = await openai.models.retrieve("gemini-3.5-flash"); console.log(model.id); } main(); curl https://generativelanguage.googleapis.com/v1beta/openai/models/gemini-3.5-flash \ -H "Authorization: Bearer GEMINI_API_KEY"

Current limitations

Support for the OpenAI libraries is still in beta while we extend feature support.

If you have questions about supported parameters, upcoming features, or run into any issues getting started with Gemini, join our Developer Forum.

What's next

Try our OpenAI Compatibility Colab to work through more detailed examples.

Was this helpful?

Send feedback

Related Articles