Our team's efforts have shifted toward reviewing plans together.
Before code hits a human's eyes, it's gone through a few independent review passes from our review agents.
If it's low complexity / blast radius, it gets auto-merged.
If it's high complexity / blast radius, it gets flagged for human review.
And funnily enough, our team has agreed that even our human review layer gets the best results when we have an agent create supplementary descriptions of the code and its potential issues, which we then read and apply human judgement to.
The reason is that it can be very pedantic, in both a good and a bad way: either flagging things that are non-concerns, or catching things my lazy human eye wouldn't have caught, e.g. a docstring mis-describing the actual shape of an object.
I'm not sure it's fair to call the Apple Vision Pro a flop in the traditional sense.
While it may not have sold millions of units or become a household staple,
it certainly focused the entire org on manufacturing a suite of chips and hardware on a completely different level than their competitors'. Apple now has a clear advantage in every dimension that matters: compute, power consumption, size, capabilities, etc.
Apple Vision helped create a moat that will be hard for anyone else to cross for at least a decade.
1. Be opinionated on best practices, tools, and libraries
2. Not get in the way of what the developer wants to do
To that end, the core is built on top of Temporal, and our llm package is a thin wrapper around ai-sdk that provides QoL enhancements (prompt files, tracing, cost tracking, etc.).
So for failures in general, and tool calling specifically, there are two levels of retries.
1. ai-sdk-level tool retries: The library handles tool-call failures by default and will retry if the LLM deems it a transient issue, and will never hard-fail if one of its tool calls is unsuccessful (unless perhaps you instruct it to).
2. Temporal-level activity failures: Our workflows and steps are all configured with a baseline affordance to reattempt steps that have failed. As the developer, you can change this: you can make it so a step is never retried, or retried, say, 100 times with exponential backoff.
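The retry semantics in (2) can be sketched generically. The helper below is a hypothetical stand-in that illustrates the kind of per-step policy Temporal lets you configure (max attempts, exponential backoff); it is not Output's or Temporal's actual implementation, and all names are made up:

```javascript
// Minimal sketch: retry an async function with exponential backoff.
// Illustrative only -- real workflows would declare this as a Temporal
// activity retry policy rather than hand-roll it.
async function withRetry(fn, { maxAttempts = 3, baseDelayMs = 100 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn(); // success: stop retrying
    } catch (err) {
      lastError = err;
      if (attempt === maxAttempts) break; // out of attempts
      // Exponential backoff: baseDelayMs, 2x, 4x, ...
      const delay = baseDelayMs * 2 ** (attempt - 1);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError; // all attempts exhausted: surface the last failure
}
```

Setting `maxAttempts: 1` gives the never-retried behaviour, while `maxAttempts: 100` approximates the aggressive configuration mentioned above.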
So the API keys during setup are entirely optional. They're used in the example workflow that evaluates blog posts for clarity and provides feedback on how to improve them.
You're more than free to ignore/delete the example workflow and create your own that doesn't make use of an LLM:
1. Fetching trending HN posts
2. Pulling Reddit posts that match keywords
3. Transforming daily calendar events into an HTML page, etc.
And the Claude Code plugins (which are installed for you) all work with your Anthropic subscription, no problem.
I own a stake in a small brewery in Canada, and this feature just saved me setting up some infrastructure to "productionize" an agent we created to assist with ordering, invoicing, and government document creation.
I get paid in beer and vibes for projects like these, so the more I can ship these projects in the same place I prototype them the better.
(Also, don't worry all, I still have SF income to buy food for my family with.)
I use ComfyUI to generate cartoons for some comedians.
I also have an n8n pipeline (there is an agent here) that mines social media for certain topics and audio.
I also use Gemini to transcribe audio from sets and to generate scripts.
I'm working on a clipmaker because I find Opus Pro and others leave out context and audience reactions too often... I'll just upscale in Topaz anyway.
Hey! Ben here (one of the engineers who built this).
This is one reason why we made our HTTP framework (@outputai/http) a first-class citizen of the greater framework and our Claude Code plugins.
As you pointed out, at this moment in time there's a Cambrian explosion in both new tools/libraries and the willingness to use them, which poses a systemic security threat when combined with how LLMs function.
So while you're free to use any third-party tool or library you want with Output, we encourage you to roll your own as often as possible, both for the security/control it gives you and for the vertical integration it provides (debugging, cost tracking, evals, etc.).
There's an interesting side effect to the current state of the non-technical world.
We have some new tools that increase productivity, and these same tools both lower the barrier to entry to understanding software concepts and building software.
I think the result is that more people who would traditionally have been considered non-technical are going to be onboarding to concepts that would've traditionally been ring-fenced in the developer world.
Granular version control and diffs being one of them.
If this trend is real, and relatively large, I think it will be a good thing.
I can only assume Chuck has decided to relieve the grim reaper of his duties, leaving us all here to meet our own end not with a scythe but a roundhouse kick.
If you're referring to https://en.wikipedia.org/wiki/Under_a_Velvet_Cloak - note that it was written a couple decades after the prior books of the series, for a different publisher, to a different length. Those would be yellow flags with almost any author.
At the same time, it feels like Python is overused.
If I could wave a magic wand and reset any one programming language's adoption at this point, I would choose Python over JavaScript.
I think Python's execution model, deep OO behaviour, and extremely weak guarantees have done a lot of damage to the soundness and performance of the technology world.
JS doesn't either... JS casts numbers to strings when adding them to a string... "2" is not a number, it's a string containing a numeric character... "2" + 2 === "22" because you are appending a number to a string; the cast is implicit and not really surprising if you understand what is going on.
Even more so when you consider how falsy values work in practice (data validation becomes really easy). There are a few gotchas, but in general they're pretty easily avoided in practice. JS is really good at dealing with garbage input in ways that don't blow up the world... sometimes that's a bad thing, but in practice it can also be a very good thing. In the end it's a skill issue of understanding far more than a deep flaw. Not that there aren't flaws in JS... I think Dates in particular can be tough to deal with... a string vs. a String instance is another.
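The coercion and falsy-value behaviour described above is easy to check directly; the `name` variable below is just an illustration:

```javascript
// "+" with a string operand concatenates: the number is cast to a string.
console.assert("2" + 2 === "22");
console.assert(typeof ("2" + 2) === "string");

// "-" has no string meaning, so "2" is cast to a number instead.
console.assert("2" - 2 === 0);

// Falsy values ("", 0, null, undefined, NaN, false) make default-filling terse.
const name = "" || "anonymous"; // empty input falls back to a default
console.assert(name === "anonymous");
```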