How to Use Gemini 3.5 Flash: Step-by-Step Tutorial
Last updated: May 20, 2026. Read time: 7 minutes. What you’ll learn: 3 ways to access the model today, how to switch from older versions, setting up the API key for dev tools (Cursor/Cline), and the “Hello Agent” test.
Google’s Gemini 3.5 Flash is now the default engine for most of the Google ecosystem. But “default” doesn’t mean you’re always getting the full agent-tier performance. Depending on whether you’re a casual user, a pro creator, or a developer, the way you “use” it is fundamentally different.
This is the step-by-step setup guide to ensure you aren’t stuck on the older (cheaper, slower) Gemini 3.1 or 3 Flash Preview models.
Step 1: The Consumer Path (Gemini App)
If you just want to talk to the model, go to https://gemini.google.com.
- Verify the model: Look at the top left or the bottom of your chat window. It should say “Gemini 3.5 Flash”.
- The Switch: If it says “Gemini 3.1 Pro” or “Gemini 3 Flash-Lite,” click the model name. A dropdown will appear. Select 3.5 Flash.
- Multi-modality: You can drag and drop PDFs, images, audio files, or videos directly into the chat. Check Google’s file-input limits for the current per-file size cap — they change as the model is updated.
Pro Tip: If you are in the US and have an AI Ultra subscription, you can also trigger Gemini Spark from here. Spark uses the same 3.5 Flash model but adds 24/7 autonomous execution.
Step 2: The Pro Playground (Google AI Studio)
For builders who want to tweak parameters like Temperature or System Instructions without writing code, use AI Studio.
- Go to https://aistudio.google.com.
- On the right-hand sidebar, find the Model dropdown.
- Select Gemini 3.5 Flash.
- Enable Reasoning: This is the most important “hidden” feature. In the settings, ensure “Show reasoning tokens” is toggled on. This allows you to see the model’s “thoughts” as it plans its answer—essential for debugging complex prompts.
- Get your API Key: Click the “Get API key” button in the top left. You’ll need this for the next step.
Step 3: The Developer Path (API & Coding Tools)
If you use tools like Cursor, Cline, or Claude Code, you can swap out their default models for Gemini 3.5 Flash to save money (it’s cheaper than Sonnet) or gain speed.
For Cursor Users:
- Open Cursor Settings.
- Go to Models.
- Add a new model with the exact ID:
gemini-3.5-flash. - Paste your Google API key in the “Google” section.
- Toggle off other models and select
gemini-3.5-flashas the default.
For CLI Users (Simon Willison’s llm):
If you have Python installed, this is the fastest way to use the model from your terminal.
pip install llm
llm install llm-gemini
llm keys set gemini # Paste your key from AI Studio
llm -m gemini-3.5-flash "How do I optimize this Dockerfile?"
Step 4: The “Hello Agent” Test
To confirm you are actually on the 3.5 Flash engine and not a cached older version, run this “agentic” test prompt. 3.5 Flash should plan its steps before answering:
“I have a folder of 50 messy receipts in JPG format. Plan a Python script that uses your OCR capabilities to extract the Date, Vendor, and Total into a CSV, handles cases where the currency is not USD, and flags blurry images. Don’t write the code yet, just show me your detailed execution plan.”
What to look for: with reasoning enabled, 3.5 Flash should stream “thinking” tokens before the final plan — you’ll see it consider image resolution, currency conversion libraries, and how to handle blurry or rotated images. If the response comes back as a flat 3-step list with no reasoning trace, double-check that the model dropdown actually shows gemini-3.5-flash and that “Show reasoning tokens” is on.
Troubleshooting Common Issues
- “Model not found” in API calls: Ensure you are using the
v1betaendpoint. The stablev1endpoint sometimes takes 24-48 hours longer to update with the newest model IDs. - Slow responses: 3.5 Flash is 4x faster than 3.1 Pro, but “reasoning” takes time. If you don’t need the agentic depth, you can turn off reasoning tokens in AI Studio to get near-instant text.
- Pricing Surprise: 3.5 Flash is noticeably more expensive than the older 3 Flash ($1.50 / $9.00 per 1M tokens vs the previous-gen Flash’s lower rate). If you’re doing bulk translation or summarization, check the current Google AI pricing page before swapping models in production.
Related Articles
- 7 Minutes to Master Gemini 3.5 Flash — The deep dive into specs and benchmarks.
- 50 Best Gemini 3.5 Flash Prompts — Ready-to-use agentic templates.
- Gemini 3.5 Flash vs Claude Haiku 4.5 — Which one wins for coding?