Yeah, Claude Haiku (don't remember the version) did it first; they claimed it was because "it's smarter now" (it's still dumb). Then OpenAI did it with GPT-5 and Google did the same with Gemini Flash, and now every new model version is at least twice as expensive as the one before it.
IMO the raw Claude CLI is great for one-off interactive sessions, but as soon as you want repeatable multi-step workflows you’re either copy-pasting prompts forever or hacking your own solution manually. That’s exactly the gap these tools fill.
My take on a solution for this is https://ossature.dev — .smd spec markdown files + ossature audit / build that gives you DAG orchestration, SHA-traced increments, and tiny focused contexts.
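To make "SHA-traced" a bit more concrete, here is a rough Python sketch of the general idea: hash each spec's text and only rebuild the ones whose digest changed. This is not Ossature's actual implementation; the function names, the spec names, and the shape of the recorded-digest mapping are all made up for illustration.

```python
import hashlib


def spec_digest(text: str) -> str:
    """Return the SHA-256 hex digest of a spec's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def changed_specs(specs: dict[str, str], recorded: dict[str, str]) -> list[str]:
    """List spec names whose current digest differs from the recorded one.

    `specs` maps a (hypothetical) spec name to its current text;
    `recorded` maps a spec name to the digest stored after the last build.
    Specs with no recorded digest count as changed.
    """
    return sorted(
        name
        for name, text in specs.items()
        if recorded.get(name) != spec_digest(text)
    )


# Hypothetical example data: "auth" is unchanged, "api" was edited, "cli" is new.
recorded = {
    "auth": spec_digest("login with email"),
    "api": spec_digest("v1 endpoints"),
}
specs = {
    "auth": "login with email",
    "api": "v2 endpoints",
    "cli": "wrap the api",
}
stale = changed_specs(specs, recorded)
```

Only the specs in `stale` would need to be re-run, which is what keeps each increment small.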
Yeah, bash scripts start clean, but the sprawl kicks in quick as the workflow and project become more complex. Prompts get copied, deps turn manual, and maintenance of your workflow itself becomes the chore.
Ossature swaps that for structured SMDs and optional AMDs. Multiple specs build a clean DAG that drops into an editable plan.toml so everything stays traceable without the mess.
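For anyone wondering what "multiple specs build a DAG" means in practice, here is a minimal Python sketch using the standard library's `graphlib`. It is not how Ossature does it internally; the spec names and the dependency mapping are hypothetical, and it only shows how a dependency graph turns into a build order.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical specs: each key depends on the specs in its set.
deps = {
    "auth": set(),
    "storage": set(),
    "api": {"auth", "storage"},
    "cli": {"api"},
}


def build_order(deps: dict[str, set[str]]) -> list[str]:
    """Return spec names ordered so every spec comes after its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = build_order(deps)

# A circular dependency is rejected instead of silently producing a bad plan.
try:
    build_order({"a": {"b"}, "b": {"a"}})
    cycle_detected = False
except CycleError:
    cycle_detected = True
```

The resulting order is the kind of thing you would write out to a plan file and then edit by hand if needed.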
I use bash scripts. Both Claude and Vibe support all kinds of arguments if you need a prompt to “become a task”. Bash is also deterministic and easy to read and debug.
Yeah, I did briefly consider front matter, but ended up with inline @ tags because I thought it kept the entire document feeling like one coherent spec instead of header data + body. Front matter felt like config to me, but this is 0.0.1 so things might change :)
To a certain extent, yes it does! For my cases, I'm often running 3 parallel implementations that get 10 to 20 iterations deep, and then Claude has to sort out the pros and cons of the options and also take the best bits of each. It's easy to hit the context window with Claude just running those on its own, so when I give Claude `/cook`, it can offload a bit more to cook and stay higher level.
Claude and Codex can also use the cook command to coordinate runs of other agents. This is similar to describing a subagent workflow to them: they'll try to follow it, but cook gives them a reliable, deterministic way to run those agents. An added benefit of having Claude/Codex/etc. use cook directly is that they are really good at analyzing the traces of what is happening inside cook, both during a run and after the fact.
They bootstrap a workflow with a prompt, then build an orchestrator off that, then prompt it to be converted to an opencode plugin, then prompt a website to be generated advertising it, and then prompt a tool that reviews Hacker News feedback and automatically incorporates it into the next generation of the tool. At the end of the week they go to their manager and complain they are out of tokens for the actual job they are being paid for.