3000 tokens/s: Welcome to Alter Fast

17x faster than GPT-5

Samuel ROY & Olivier Legris
October 10, 2025

Hey everyone,

We are launching a new AI router: Alter Fast.

Based on your needs, we intelligently route each request to the right model from a selection of the fastest available. The fastest of them is 17x faster than GPT-5 high.

It’s more than a performance toggle; it’s a re‑imagining of what AI UX can be.

We are so confident people will love it that it is now the default setting for all new Alter users.

🎶 Schwefelgel - Wie ich heiß 🎶

What's New?

We handpicked the fastest models available on the market, without sacrificing quality, with the fastest one peaking at 3000 tokens/s.

Feel free to try that one on its own too: it is Cerebras/gpt-oss-120b.

Be aware, speed is addictive.
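
To get a feel for what these numbers mean in practice, here is a rough back-of-the-envelope sketch. It assumes the 3000 tokens/s peak is sustained for the whole response, reads "17x faster" as a plain throughput ratio, and ignores network latency and time to first token, so treat the result as an order of magnitude rather than a benchmark. The function name is ours, for illustration only.

```python
# Rough estimate only: assumes the peak rate is sustained and ignores
# network latency and time to first token.
PEAK_TOKENS_PER_S = 3000  # peak figure quoted in this post
SPEEDUP = 17              # "17x faster" claim, read as a throughput ratio


def generation_seconds(num_tokens: int, tokens_per_s: float) -> float:
    """Time to stream num_tokens at a constant rate (hypothetical helper)."""
    return num_tokens / tokens_per_s


# Baseline rate implied by the two figures above (about 176 tokens/s).
baseline_rate = PEAK_TOKENS_PER_S / SPEEDUP

fast = generation_seconds(1000, PEAK_TOKENS_PER_S)
slow = generation_seconds(1000, baseline_rate)
print(f"1000 tokens: {fast:.2f}s vs {slow:.2f}s")
```

Under those assumptions, a 1000-token answer streams in about a third of a second instead of several seconds.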

How to enable it

When typing / to display your model list, you can now see Alter Fast; just select it and you can drive on the high‑speed lane of information.

Should you use Fast all the time?

We’ve been testing Fast for a week now, and for the vast majority of tasks, it works like a dream.

If you are a heavy user of tools, you may still want to keep /best, Claude 4.5, or GPT‑5.

Also, don’t forget you can specify which model you want to use at an Alter Action level in the Advanced tab.

❝

If I Had More Time,
I Would Have Written a Shorter Letter

Blaise “3000” Pascal

Your feedback is important!

As we’ve been playing with this for less than a week, I’m sure we have not yet seen the full diversity of tasks and contexts. So let us know if you find a scenario where /fast falls short.

And of course, feel free to share some love too, if you like it.

Cheers

Full Changelog

New Features & Enhancements

Models: Introduced a fast Alter model pipeline with smart auto‑selection and speeds of up to 3000 tokens/second
Notch Cursor: Added proper text field keyboard navigation in the notch
Dictation: Replacements now respect word boundaries
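
For the curious, here is a minimal sketch of what "respecting word boundaries" means for text replacements. It is an illustration in Python, not Alter's actual implementation; the function name and the sample replacement table are hypothetical.

```python
import re


def apply_replacements(text: str, replacements: dict[str, str]) -> str:
    """Replace whole words only, so "cat" does not touch "category".

    Hypothetical helper for illustration; not Alter's real code.
    """
    for spoken, written in replacements.items():
        # \b anchors the match at a word boundary on both sides.
        pattern = re.compile(rf"\b{re.escape(spoken)}\b", flags=re.IGNORECASE)
        # A function replacement avoids backslash escapes being interpreted.
        text = pattern.sub(lambda _m, w=written: w, text)
    return text


print(apply_replacements("the cat sat in a category", {"cat": "dog"}))
```

Without the `\b` anchors, the same replacement would also rewrite the start of "category".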

Bug Fixes & Stability

Notch & Windows: Kept Alter windows in front when closing the notch
Infrastructure: Optimized load balancer configuration and grace periods during streaming on SIGTERM signals
Conversations: Fixed multiple crashes, fixed the invisible draggable area while minimized, and improved support for streaming finish reasons
Models & Endpoints: Fixed custom endpoint where an empty URL produced an empty model list
Audio & Recording: Mitigated a double‑free audio crash and a recording clock issue
Core & Startup: Improved error reporting when Core Data fails and prevented crashes on MCP Client start errors