# 7minAI > Fresh, hands-on tutorials for every new AI release. We test new AI tools and models, screenshot real outputs, and turn each one into a 7-minute guide you can actually use — never a press release rewrite. 7minAI is an English-language SEO site for AI builders. We cover new AI model releases (Gemini, Claude, GPT, Qwen, DeepSeek), AI coding tools (Cursor, Composer), local-LLM setups (llama.cpp, NuExtract, Bonsai), and infrastructure shifts. Each guide is real hands-on testing — every screenshot is captured by us, every benchmark cited is from vendor docs or independently verifiable, and every "we tested X" claim is a thing we actually did. This file follows the [llms.txt standard](https://llmstxt.org/) proposed by Jeremy Howard (September 2024). For the full plain-text content of all pages on one file, see `/llms-full.txt`. ## Tutorials (money pages) - [How to Switch to Claude Fable 5 (API, Claude Code, claude.ai) — and the 3 Things That Change](https://7minai.com/how-to-switch-to-claude-fable-5/): The model string is the easy part. Claude Fable 5 changes three things Opus 4.8 builders aren't expecting: thinking is always-on adaptive (you can't disable it), refusals come back as HTTP 200, and cyber/bio work is handed to Opus 4.8. Here's the complete switch for every surface, the exact model IDs, and the API gotchas — all from Anthropic's own docs. - [How to Make Gemma 4 Run up to 2x Faster Locally: Multi-Token Prediction (MTP) + QAT](https://7minai.com/how-to-speed-up-gemma-4-mtp/): Two free speedups for local Gemma 4 that people keep confusing. QAT cuts memory ~72%; multi-token prediction (MTP) roughly doubles decode throughput — and MTP just landed in mainline llama.cpp (merged June 7, 2026). The exact Ollama, llama.cpp, and Transformers commands, what speedup to actually expect, and which sizes are supported. Sourced from the merged PR and Google's own docs. - [How to Run Gemma 4 12B Locally: Ollama, llama.cpp & Transformers (Text, Image, Audio)](https://7minai.com/how-to-run-gemma-4-12b-locally/): Gemma 4 12B is Google's encoder-free open model that runs text, image, and audio on a 16GB laptop under Apache 2.0. The exact Ollama, llama.cpp/GGUF, and Transformers setup — including how to pass images and native audio — all from the official model card and Unsloth docs. Copy-paste safe. - [How to Run Bonsai Image 4B Locally: On-Device Text-to-Image on Mac & PC](https://7minai.com/how-to-run-bonsai-image-4b-locally/): Bonsai Image 4B is a ternary/1-bit diffusion model that fits in ~1 GB and generates a 512×512 image in ~6 s on an M4 Pro — fully on-device, Apache 2.0, zero per-image cost. The exact setup, which quant to download, the one-shot CLI, and the local Studio server, all from PrismML's official demo repo. - [How to Run NuExtract 3 Locally: vLLM, Templates & Document Extraction](https://7minai.com/how-to-run-nuextract-3-locally/): NuExtract 3 is a 4B open-weight VLM that pulls structured JSON out of any document — invoices, receipts, contracts, PDFs — and runs on a single 16GB GPU. The exact vLLM and Transformers setup, the JSON template language, image and multi-page PDF extraction, and document-to-Markdown — all from the official model card, copy-paste safe. - [Run Qwen3.6-35B-A3B Locally for Coding: llama.cpp, Quants & VRAM](https://7minai.com/qwen-3-6-local-coding/): Qwen3.6-35B-A3B activates just 3B params but scores 73.4% on SWE-bench Verified — and it runs on a 24GB GPU or a 32GB Mac. The exact llama.cpp setup, which quant to pick, the VRAM math, and how to wire it into a coding agent. - [Claude Code as a Daily Driver: CLAUDE.md, Skills, Subagents, Plugins & MCP](https://7minai.com/claude-code-daily-driver/): Most people use Claude Code like a chat box. The five features that turn it into a daily driver — persistent CLAUDE.md memory, reusable Skills, delegated Subagents, Plugins, and MCP servers — with the exact files, commands, and gotchas for each. - [Claude Opus 4.8 is live — the 4 changes builders should actually care about (it is not the benchmark numbers)](https://7minai.com/claude-opus-4-8/): Anthropic shipped Claude Opus 4.8 on May 28, 2026. Same price as 4.7, modest benchmark gains, but underneath are four shifts that change how you wire up agents: prompt cache minimum drops 4× from 4,096 to 1,024 tokens, fast mode drops 3× cheaper, mid-conversation system messages preserve cache, and Dynamic Workflows runs hundreds of parallel subagents for codebase-scale migrations. Here is what changes for builders. - [How to Enable Claude Opus 4.8 in Cursor, Claude Code & the API](https://7minai.com/how-to-switch-to-claude-opus-4-8/): Claude Opus 4.8 not showing up? It doesn't appear everywhere automatically. Here's how to enable it in Cursor, Claude Code, claude.ai, and the Anthropic API — including the stale-CLI fix when it's missing from your model picker, and the model-ID gotcha that breaks API calls. - [Anthropic and OpenAI just found product-market fit — the $2,180/month bill Simon Willison made public, and what builders should do about it](https://7minai.com/openai-anthropic-pmf/): Simon Willison's May 27 post says Anthropic and OpenAI have finally reached PMF — through coding agents. The receipts: his own $1,199.79 Claude Code + $980.37 Codex monthly bill, Anthropic's projected $10.9B Q2 revenue, SpaceX's $1.25B/month contract, 25% of Uber's commits from Claude Code. Here is what builders should actually change about their stack in response. - [Use AI to write better code, more slowly — the multi-agent code review workflow that beats one-shot generation](https://7minai.com/ai-write-code-slowly/): Nolan Lawson's HN-trending essay nails the inversion: most builders use AI to ship low-quality code fast; the better play is using it to ship high-quality code slowly. Here is his actual workflow, why the EURECOM Constraint Decay benchmark says it works, and why DeepSeek-tier pricing makes it affordable to run 4-7 sub-agents per PR. - [Constraint Decay — why AI coding agents nail prototypes but break on production backends (and the 8-framework benchmark that proves it)](https://7minai.com/constraint-decay-llm-coding-agents/): A new EURECOM paper benchmarks GPT-5.2, Kimi K2.5, MiniMax M2.5, Qwen3-Coder-Next and others across 8 web frameworks and 4 constraint levels. Capable models lose 30 points (40% relative) when you add architecture + database + ORM rules on top of an API spec. Here is what the data actually says, and what to do about it as a builder. - [Reasonix — the DeepSeek-native coding agent that proves V4 Pro's cache math (99.82% hit, $12 instead of $61)](https://7minai.com/reasonix-deepseek-coding-agent/): Reasonix is an MIT-licensed terminal AI coding agent engineered around DeepSeek's prefix cache. The author published a real benchmark — 435M input tokens in a day at 99.82% cache hit, costing $12 instead of $61. Here is what it is, how to install it in 2 minutes, and how it compares to Claude Code / Cursor / Aider. - [How to add /llms.txt to your site — the LLM-friendly standard, with a real example](https://7minai.com/llms-txt-guide/): Anthropic, Vercel, Cursor, and now 7minai all have a /llms.txt file. It tells LLMs what your site is about and lists your most useful pages — and unlike robots.txt or sitemap.xml, it is markdown that humans can also read. Here is the full spec, a real example you can copy, and the 10-minute Astro implementation. - [7 Minutes to Master DeepSeek V4 Pro — The 75% Permanent Price Cut Changes the Agent Math](https://7minai.com/deepseek-v4-pro/): DeepSeek made the V4 Pro 75% price discount permanent on May 22, 2026. Input is now $0.435 / 1M tokens (cache hit: $0.003625) and output $0.87 / 1M — roughly 1/2 of Claude Haiku 4.5, 1/10 of Gemini 3.5 Flash output. Here is exactly what changed, how to use it via API and Chat, the cache-hit math, and when it actually wins. - [How to Use Cursor Composer 2.5: Setup, Pricing & Benchmarks](https://7minai.com/cursor-composer-2-5/): How to use Cursor Composer 2.5 in your IDE: step-by-step setup, the real pricing (and the fast-variant trap), and how it compares to Opus 4.7 / GPT-5.5 on coding benchmarks — plus when not to switch. - [How to Use Qwen3.7-Max: Qwen Chat, Pricing & Benchmarks](https://7minai.com/qwen-3-7-max/): How to use Alibaba's Qwen3.7-Max right now: switch to it free in Qwen Chat (no signup), what it actually outputs on a real coding task, the 1,000-tool-call / 35-hour agentic claims, the pricing, and when to pick it over Claude Opus 4.7 or GPT-5. - [Gemini 3.5 Flash: Pricing, Benchmarks & Whether to Upgrade](https://7minai.com/gemini-3-5-flash/): Google's agent-tier Gemini 3.5 Flash skips 3.2/3.3/3.4 and lands at $1.50/M in + $9/M out — 3× pricier than 3 Flash but beating 3.1 Pro on most benchmarks. The real pricing, the benchmark wins, and whether the 3× is worth it over the older Flash. - [50 Best Gemini 3.5 Flash Agent Prompts](https://7minai.com/gemini-3-5-flash-prompts/): Stop using chat prompts for an agent-tier model. 50 prompt templates for Gemini 3.5 Flash, organized by use case, designed to take advantage of the 1M context window and streamed reasoning. - [Gemini 3.5 Flash vs Claude Haiku 4.5: The Agent-Tier Pick in 2026](https://7minai.com/gemini-3-5-flash-vs-claude-haiku-4-5/): Two agent-tier models on paper, two different design philosophies. Compare context windows, pricing, native multimodality, and reasoning behavior — using only published numbers, not made-up benchmarks. - [7 Minutes to Master Gemini Omni Flash](https://7minai.com/gemini-omni-flash/): Gemini Omni Flash is Google's new any-to-video model — drop in text, image, audio, video, or a sketch, get back a 10-second clip with sound. Clips are capped, the API isn't open yet, but pricing leaks at $0.10–0.30 per second. Here's how to use it and how it compares to Veo, Sora, Kling, and Runway. - [7 Minutes to Master Gemini Spark](https://7minai.com/gemini-spark/): Gemini Spark is Google's first proper consumer agent — it runs 24/7 on its own Google Cloud VM, drafts emails by reading your docs, and is powered by Gemini 3.5 + the Antigravity agentic harness. Here's exactly what it does, who can use it today, and how to think about it as a builder. - [How to Use Gemini 3.5 Flash: Step-by-Step Tutorial](https://7minai.com/how-to-use-gemini-3-5-flash/): Everything from your first message in the app to setting up the API for agentic workflows. We cover AI Studio, the model ID string for Cursor/Cline, and how to verify you are actually using the 3.5 engine. ## News (funnel pages with builder lens) - [Apple rebuilt Apple Intelligence on Google Gemini — and opened its frameworks to Claude and MCP](https://7minai.com/news/apple-foundation-models-gemini/): At WWDC 2026 on June 8, Apple revealed that its next-generation Apple Foundation Models are custom-built with Google's Gemini. Apple evaluated OpenAI and Anthropic first and picked Google. The bigger story for builders: a new Foundation Models Swift framework, a language-model protocol that lets you wire in Claude or any provider, and Xcode 27 with built-in agents and Model Context Protocol support. - [Claude Fable 5 lands at $10/$50 and tops SWE-Bench Pro at 80.3%](https://7minai.com/news/claude-fable-5-release/): Anthropic shipped Claude Fable 5 on June 9, 2026 — a frontier coding model with a 1M-token context that scores 80.3% on SWE-Bench Pro versus 58.6% for GPT-5.5. It's the production-safeguarded twin of the restricted Mythos 5, priced at $10/$50 per million tokens. Here's what it means if you build with Claude Code. - [The Miasma worm weaponizes .claude and .cursor config files — 73 Microsoft repos hit](https://7minai.com/news/miasma-worm-ai-coding-supply-chain/): On June 5, 2026, a self-replicating supply-chain worm planted malicious .claude/settings.json, .cursor/rules, and .vscode/tasks.json files in 73 Microsoft GitHub repositories. The payload runs the moment you open the repo in Claude Code, Cursor, Gemini CLI, or VS Code — no command needed. Here's the builder's checklist. - [NotebookLM now runs on Gemini 3.5 with a code-executing 'cloud computer' and 100+ skills](https://7minai.com/news/notebooklm-gemini-3-5-cloud-computer/): On June 8, 2026, Google rebuilt NotebookLM on Gemini 3.5 and Antigravity. Every notebook now gets a secure cloud computer that writes and runs code, plus 100+ built-in skills and outputs in PDF, XLSX, PPTX, CSV, JSON, and charts. Google reports a 78.2% win rate on web research. Here's what it changes if you build research or RAG tools. - [Cohere's Command A+ is open-weight, Apache 2.0, and runs agentic work on 2 H100s](https://7minai.com/news/cohere-command-a-plus/): Cohere released Command A+ as open weights under Apache 2.0 — a 218B sparse MoE with 25B active params that runs on two H100s (or one B200) at W4A4 quantization. It's built for agentic tasks, RAG, and 48 languages, with native citations. Here's why it matters if you need open weights you can actually self-host. - [Microsoft's MAI-Code-1-Flash is now a default coding model in VS Code](https://7minai.com/news/microsoft-mai-code-1-flash/): Microsoft shipped MAI-Code-1-Flash on June 2, 2026 — its first in-house coding model, trained directly with GitHub Copilot's production harnesses. It scores 51.2% on SWE-Bench Pro vs Claude Haiku 4.5's 35.2%, costs less under Copilot billing, and is rolling out as a default model inside VS Code. Here's what changes if you build in Copilot. - [OpenAI's Lockdown Mode: what it actually blocks against prompt injection (and what it doesn't)](https://7minai.com/news/openai-lockdown-mode-prompt-injection/): On June 6, 2026, OpenAI rolled out Lockdown Mode for ChatGPT — an opt-in setting that fights prompt-injection data theft by cutting the exfiltration channel, not the injection itself. It disables internet image fetches, file downloads, Deep Research, and Agent Mode. Here's the trade-off, and the limitation OpenAI states plainly. - [Gemma 4 QAT: Google's quantization-aware checkpoints cut Gemma 4's memory again — E2B now fits in 1GB](https://7minai.com/news/gemma-4-qat-models/): On June 5, 2026, Google released quantization-aware-training (QAT) checkpoints for the whole Gemma 4 family. QAT bakes quantization into training, so a 4-bit model keeps more quality than a normal post-training quant — and a new mobile format shrinks Gemma 4 E2B to about 1GB. Here's what changes for builders running Gemma locally. - [Anthropic open-sourced its vulnerability-discovery harness — you can run Claude's recon → find → patch loop on your own code](https://7minai.com/news/anthropic-defending-code-harness/): Anthropic published defending-code-reference-harness, a reference implementation that runs autonomous vulnerability discovery and patching with Claude. It ships as Claude Code skills plus a sandboxed pipeline that finds C/C++ memory bugs, verifies the crashes, and proposes fixes. Here's what it changes if you ship code — and the catch the README is upfront about. - [Gemma 4 12B: Google's encoder-free open model runs text, image, and audio on a 16GB laptop](https://7minai.com/news/gemma-4-12b/): Google released Gemma 4 12B on June 3, 2026 — an Apache 2.0 multimodal model that drops its vision and audio encoders, handles text, images, and native audio, and fits on a 16GB consumer laptop. On Google's own benchmarks the 12B nearly matches the larger Gemma 4 26B. Here's what the encoder-free design changes for builders running local multimodal. - [Ideogram 4.0 ships open weights: the top open-weight image model, but commercial use needs a license](https://7minai.com/news/ideogram-4-open-weight/): Ideogram released Ideogram 4.0 on June 3, 2026 as a 9.3B open-weight diffusion transformer — best-in-class text rendering, native 2K resolution, and the #1 spot among open-weight models on DesignArena. The catch for builders: the weights are non-commercial. Here's what shipped and what the license actually allows. - [OpenAI's GPT-5.5, GPT-5.4, and Codex are now generally available on Amazon Bedrock](https://7minai.com/news/openai-codex-aws-bedrock/): As of June 1, 2026, OpenAI's GPT-5.5 and GPT-5.4 frontier models — plus the Codex coding agent — are generally available on Amazon Bedrock. Pricing matches OpenAI's first-party rates, usage counts toward your AWS commitments, and Codex inference routes through Bedrock with IAM, VPC isolation, and encryption. Here's what it changes if you build on AWS. - [GitHub Copilot switches to usage-based billing today: premium requests are gone, replaced by AI Credits](https://7minai.com/news/github-copilot-usage-based-billing/): Starting June 1, 2026, every paid GitHub Copilot plan drops premium requests and bills on GitHub AI Credits instead — 1 credit = $0.01 of real token usage. Code completions and chat stay unlimited, but agent-heavy workflows now meter input, output, and cached tokens at each model's API rate. Here's the exact math and what it means if you build with Copilot. - [Anthropic documented exactly how it sandboxes Claude — and open-sourced the tool](https://7minai.com/news/anthropic-claude-sandbox-containment/): On May 25, 2026 Anthropic published 'How we contain Claude across products,' a detailed engineering breakdown of the sandboxes behind claude.ai, Claude Code, and Claude Cowork — plus srt, an Apache-2.0 sandboxing CLI any agent builder can use today. Here are the mechanisms and what to copy. - [Claude Code vs Codex, head-to-head on a real science pipeline: 4× faster, but it silently rewrote the instructions](https://7minai.com/news/claude-code-vs-codex-head-to-head/): A new arXiv preprint runs Claude Code and OpenAI Codex on the same autonomous gravitational-wave analysis task. Claude Code finished in 3.4 minutes vs Codex's 15.9; but when instructions got ambiguous, Claude silently reinterpreted them while Codex followed them literally — a real lesson for anyone running unsupervised coding agents. - [Mistral ships Vibe remote agents on Mistral Medium 3.5 — cloud coding agents at $1.5/$7.5 per 1M tokens](https://7minai.com/news/mistral-vibe-remote-agents/): At its first AI Now Summit (Paris, May 28, 2026), Mistral launched Vibe — cloud coding agents that run async and open PRs — powered by Mistral Medium 3.5, a 128B dense model with a 256k context that scores 77.6% on SWE-Bench Verified at one-tenth the price of frontier US models. - [Anthropic ships Claude Opus 4.8 — same price as 4.7, but the cache and fast-mode math changed underneath](https://7minai.com/news/claude-opus-4-8-release/): Claude Opus 4.8 launched May 28, 2026 at the same $5/$25 list price as 4.7. Benchmark gains are single-digit (SWE-Bench Pro 64.3% → 69.2%), but the real shifts are a 4× lower prompt-cache minimum, 3× cheaper fast mode, mid-conversation system messages, and Dynamic Workflows in Claude Code. - [DuckDuckGo installs +30% in the week after Google forced AI Mode on Search](https://7minai.com/news/duckduckgo-surge-after-google-ai-mode/): After Google I/O 2026 made AI Mode the default search experience, DuckDuckGo US installs spiked 30.5%, iOS downloads 69.9%, and visits to its noai.duckduckgo.com page hit 27.7%. CEO Gabriel Weinberg called it "force-feeding AI." Here is what the numbers mean for SEO-funded builders. - [OpenAI's Warp case study lands — GPT-5.5 is the orchestration brain for a multi-harness coding fleet](https://7minai.com/news/warp-oz-multi-harness-orchestration/): OpenAI's May 27 case study at openai.com/index/warp puts a name on the pattern Warp has been building since April: one control plane (Oz) running Claude Code, Codex, and Warp Agent side by side, with cross-harness Agent Memory so context survives across them. Here is what changes for builders deciding how to wire up their own coding-agent fleet. - [YouTube will now auto-label AI videos — even when creators don't disclose](https://7minai.com/news/youtube-auto-label-ai-videos/): YouTube is rolling out automatic AI-content labels starting this week. Veo and Dream Screen outputs always get permanent labels. C2PA-tagged videos do too. Long-form labels move under the player; Shorts get overlay labels. Here is what creators using Gemini Omni, Sora, or any photorealistic AI tool need to know now. - [Cactus ships hybrid on-device + cloud-fallback inference — Gemma 4 2B runs local, hard requests handoff to Claude/GPT/Gemini](https://7minai.com/news/cactus-hybrid-router-on-device-cloud-fallback/): Cactus Compute (YC S25, 5.2k★ on GitHub) ships a single SDK that runs Gemma 4 2B locally on iOS/Android/macOS/Linux and automatically hands off complex requests to a cloud model of your choice. v1.14 shipped April 18. The bet: builders shouldn't have to pick local vs cloud — the router should. - [Harbor v0.4.19 adds one-command launcher for Claude Code, Codex, and Opencode against local llama.cpp / vLLM](https://7minai.com/news/harbor-0-4-19-local-agent-launcher/): Harbor's new `launch` command boots an agent runner (Claude Code, Codex CLI, Opencode, Copilot, mi) wired straight to a local Ollama / llama.cpp / vLLM backend. The release also ships an Anthropic Messages and OpenAI Responses compat layer in Boost, plus an ik_llama.cpp service — closing the gap between local inference stacks and the agent CLIs builders actually use. - [PrismML's Bonsai Image 4B is the first sub-1 GB diffusion model to run on iPhone — at 1-bit and ternary precision, Apache 2.0](https://7minai.com/news/prismml-bonsai-image-4b-ternary-diffusion/): PrismML released Bonsai Image 4B on May 26 — a 4B-parameter text-to-image diffusion transformer compressed 8.3× via 1-bit quantization (0.93 GB) or 6.4× via ternary (1.21 GB). It generates a 512×512 image in 6 s on an M4 Pro Mac and 9.4 s on an iPhone 17 Pro Max, while keeping ~95% of full-precision quality. Apache 2.0, weights and code on Hugging Face. - [Claude Mythos helped break Apple's M5 kernel in 5 days — the first public exploit bypassing Apple's MIE hardware mitigation](https://7minai.com/news/claude-mythos-apple-m5-kernel-exploit/): Calif (a 3-person security team) chained two macOS bugs into a working privilege-escalation exploit on Apple M5 silicon, bypassing the brand-new Memory Integrity Enforcement (MIE) Apple spent ~5 years building. The catch: they used Claude Mythos Preview, Anthropic's restricted frontier model for vulnerability research, to go from zero to root in 5 days. Apple's macOS Tahoe 26.5 fix credits both Calif and Anthropic Research. - [Microsoft Copilot Cowork can be tricked into exfiltrating M365 files — what builders need to know](https://7minai.com/news/microsoft-copilot-cowork-file-exfiltration/): PromptArmor disclosed on May 25, 2026 that an indirect prompt injection inside a 5-line Cowork skill can silently exfiltrate files from a user's Microsoft 365 tenant. The agent is GA-preview in Frontier and runs on Anthropic Claude. Here is the attack, why it works, and what teams building on Microsoft's agent stack should do today. - [DeepSeek-Reasonix trends on HN — a community coding agent that runs on $12 instead of $61 by treating prefix cache as a hard invariant](https://7minai.com/news/deepseek-reasonix-coding-agent/): esengine's Reasonix is a DeepSeek-native, MIT-licensed terminal coding agent designed entirely around prefix-cache stability. Real case study: 435M input tokens, 99.82% cache hit, ~$12 vs ~$61 without cache on V4 Flash. 7k★ on GitHub, npm install -g reasonix away. - [NuExtract 3 released — a 4B open-weight VLM that beats 9B Qwen at structured extraction (Apache 2.0)](https://7minai.com/news/nuextract-3-4b-vlm-release/): NuMind shipped NuExtract 3, a 4B vision-language model fine-tuned on Qwen3.5-4B for structured extraction + image-to-Markdown. Apache 2.0, 131K context, beats Qwen3.5-9B on NuMind's internal 600-doc benchmark. The best open-weight document-understanding model that fits on a single consumer GPU. - [llama-server --tools all: Built-in Agent Tools for Any Local GGUF Model](https://7minai.com/news/llamacpp-server-builtin-tools/): llama.cpp's server now has built-in execution tools (read_file, write_file, edit_file, exec_shell_command, grep_search, apply_diff). With `--tools all`, any local model becomes a Cursor/Claude-Code-style code agent — no MCP server, no Python wrapper. The catch: it's direct execution on the host, with a 'do not enable in untrusted environments' warning. - [OpenAI's internal model disproves an 80-year-old Erdős conjecture — and it shipped with a Lean-verified proof](https://7minai.com/news/openai-disproves-erdos-conjecture/): On May 20, 2026, OpenAI announced that an internal reasoning model disproved Paul Erdős's 1946 planar unit distance conjecture, with the proof independently verified by Fields Medalist Tim Gowers and Princeton's Will Sawin. The builder takeaway is not 'AI does math' — it's that the proof shipped with Lean formalisation, removing the human-review bottleneck. - [DeepSeek's first $10B+ funding round — Liang Wenfeng commits to keep models open and chase AGI](https://7minai.com/news/deepseek-10b-financing/): Days after making the 75% V4 Pro discount permanent, DeepSeek is raising its first-ever external round — a reported 70B yuan (~$10B+) at a ~$45B pre-money valuation. Founder Liang Wenfeng pledged to investors he'll keep shipping open weights and prioritize AGI research over short-term commercial wins. - [Gemini's Mac app is getting Spark and voice control this summer — what builders should know](https://7minai.com/news/gemini-mac-spark-voice-summer/): Google previewed at I/O 2026 that the Gemini macOS app will gain the Spark agent and a long-press voice-control feature this summer. Spark hits Android/iOS/web in beta next week, Mac later. Here is what shipped, who can use it, and how it changes the Mac builder workflow. - [Microsoft cancels internal Claude Code licenses — what it does (and does not) mean for builders](https://7minai.com/news/microsoft-cancels-internal-claude-code/): Microsoft is pulling Claude Code from thousands of internal employees by June 30, 2026, shifting them to GitHub Copilot CLI. This is an internal license cancellation, not a Claude Code product shutdown — but the strategic signal matters. - [Google's Antigravity 2.0 auto-update gutted the IDE — what builders should do](https://7minai.com/news/google-antigravity-2-0-ide-update/): Google launched Antigravity 2.0 at I/O 2026 and pushed it as a silent auto-update on May 21, 2026. The desktop IDE was replaced by a chat-only prompt box, breaking active workflows. Here is what shipped, what broke, and the rollback / alternative options builders are taking. - [Qwen 3.7 open-weight watch — what is confirmed, what is still rumor, what to actually do now](https://7minai.com/news/qwen-3-7-open-weight-watch/): Qwen 3.7-Max launched on the API two days ago, but the open-weight release the local-LLM crowd is waiting for has no announced date. Here is what is actually known, what is reasonable to project from Alibaba's past cadence, and what to run right now. - [Google brings ads to AI Mode — what changes for builders and SEO](https://7minai.com/news/google-ai-mode-ads/): At Google Marketing Live 2026 Google announced four new Gemini-powered ad formats heading into AI Mode and AI Overviews. The "answer engine" era of Search is now ad-supported. Here is what it means if you build on the Google ecosystem or run an SEO-funded site. - [Andrej Karpathy joins Anthropic — and the bet behind it is that pre-training isn't done](https://7minai.com/news/karpathy-joins-anthropic/): OpenAI co-founder and former Tesla AI director Andrej Karpathy announced on May 19, 2026 that he has joined Anthropic, working on pre-training research. Here is what builders should read into the move beyond the headline. - [YouTube Shorts gets Gemini Omni "Remix" — drop yourself into any Short, free, today](https://7minai.com/news/youtube-shorts-gemini-omni-remix/): YouTube launched AI video remixing for Shorts at Google I/O 2026, powered by Gemini Omni. You can restyle a clip into the 90s, swap identities, or insert yourself next to another creator. Free, today, with watermarks and creator opt-out built in.