On my podcast this week I interviewed Jeff Escalante, Director of Software at Clerk (oh, and former head of Next.js at Vercel). He now spends more than 80% of his time developing software as an IC. Here is his exact process for developing a new feature:
Jeff starts by using Codex (his tool of choice) to implement the feature. Clerk has separate repos for SDKs, backend services, Cloudflare workers, Terraform/config infrastructure, the dashboard, the marketing site, etc. (To me this sounds painful; I generally prefer a monorepo.)
So Jeff built an internal tool that helps Codex orient itself across those repos. The tool gives the agent context like: here is how this workflow moves through the stack, here are the relevant files, and here is how these repos relate to each other in production.
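To make that concrete, here is a minimal Python sketch of what such a context brief could look like. This is not Jeff's actual tool; the repo names, file paths, and the `RepoStep` and `render_context` names are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RepoStep:
    """One hop of a workflow through the stack (all values are hypothetical)."""
    repo: str                                        # repo that handles this hop
    role: str                                        # what the repo does for this workflow
    files: list[str] = field(default_factory=list)   # files worth reading first


def render_context(workflow: str, steps: list[RepoStep]) -> str:
    """Render a plain-text brief an agent could read before touching any code."""
    lines = [
        f"Workflow: {workflow}",
        "Path through the stack: " + " -> ".join(step.repo for step in steps),
        "",
    ]
    for step in steps:
        lines.append(f"[{step.repo}] {step.role}")
        lines.extend(f"  - {path}" for path in step.files)
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical example data, not Clerk's real layout.
    print(render_context(
        "update organization settings",
        [
            RepoStep("dashboard", "UI entry point", ["src/pages/settings.tsx"]),
            RepoStep("backend", "API handler", ["api/settings/handler.go"]),
        ],
    ))
```

The point is that the agent gets the cross-repo picture up front instead of having to discover it by searching each repo separately.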
From there, Codex figures out the implementation. Jeff reviews the output. Then he has other LLMs review the work too, often with models reviewing each other and going back and forth.
Next, Jeff runs the application locally and manually tests the feature himself. If it were a CLI change, for example, he would actually run the CLI and put it through its paces. After that, he usually gives Codex another round of feedback, has it review itself again, and only then asks it to orchestrate the PRs.
That PR orchestration is also automated. His tool figures out what changed in which repos, opens the relevant PRs, writes a higher-level product description explaining why the change is landing, adds repo-specific notes, links all the related PRs together, and updates the descriptions so reviewers can navigate the full change set.
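The cross-linking step is the easiest part to picture in code. Below is a rough Python sketch, assuming the PRs have already been opened and their URLs are known. It is not Jeff's tool: `RepoChange`, `build_descriptions`, and the example URLs are hypothetical names I'm using for illustration.

```python
from dataclasses import dataclass


@dataclass
class RepoChange:
    """One repo's share of a multi-repo change (hypothetical structure)."""
    repo: str      # repo name
    pr_url: str    # URL of the PR already opened for this repo
    notes: str     # repo-specific notes for reviewers


def build_descriptions(summary: str, changes: list[RepoChange]) -> dict[str, str]:
    """Return one PR description per repo, each linking to every sibling PR."""
    descriptions = {}
    for change in changes:
        # Every PR lists the other PRs in the set, but not itself.
        related = [
            f"- {other.repo}: {other.pr_url}"
            for other in changes
            if other.repo != change.repo
        ]
        body = [
            "Why this change is landing",
            summary,
            "",
            f"Notes for {change.repo}",
            change.notes,
            "",
            "Related PRs",
            *(related or ["- none"]),
        ]
        descriptions[change.repo] = "\n".join(body)
    return descriptions


if __name__ == "__main__":
    # Hypothetical example data.
    result = build_descriptions(
        "Let admins rename an organization.",
        [
            RepoChange("sdk", "https://example.com/sdk/pull/1", "Adds the client method."),
            RepoChange("backend", "https://example.com/backend/pull/2", "Adds the endpoint."),
        ],
    )
    print(result["sdk"])
```

Every description shares the same product-level summary and differs only in its repo notes and link list, which is what lets a reviewer land on any one PR and still find the rest of the change set.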
Then another skill monitors the PRs. It checks AI reviewer feedback, checks CI every 10 minutes, fixes merge conflicts, brings branches up to date, and addresses review comments.
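Here is a minimal Python sketch of that kind of polling loop. The 10-minute interval comes from Jeff's description; everything else (the status fields, the action names, the priority order, and the `fetch_status` and `act` callbacks) is my assumption, not how his skill actually works.

```python
import time
from typing import Callable

POLL_SECONDS = 10 * 60  # check every 10 minutes


def next_action(status: dict) -> str:
    """Pick the next step for a PR. Field names and ordering are assumptions."""
    if status.get("has_conflicts"):
        return "resolve_conflicts"
    if status.get("behind_base"):
        return "update_branch"
    if status.get("ci") == "failed":
        return "fix_ci"
    if status.get("open_comments", 0) > 0:
        return "address_comments"
    if status.get("ci") != "passed":
        return "wait"  # CI still running or not yet reported
    return "ready_for_human_review"


def monitor(
    fetch_status: Callable[[], dict],
    act: Callable[[str], None],
    sleep: Callable[[float], None] = time.sleep,
    max_polls: int = 100,
) -> bool:
    """Poll until the PR settles. Returns True once it is ready for a human."""
    for _ in range(max_polls):
        action = next_action(fetch_status())
        if action == "ready_for_human_review":
            return True
        if action != "wait":
            act(action)  # hand the fix to the agent (hypothetical callback)
        sleep(POLL_SECONDS)
    return False  # gave up; the PR never settled


if __name__ == "__main__":
    # Simulated statuses so the sketch runs without any real PR or network call.
    statuses = iter([{"ci": "failed"}, {"ci": "passed", "open_comments": 1}, {"ci": "passed"}])
    print(monitor(lambda: next(statuses), print, sleep=lambda seconds: None))
```

Passing `sleep` in as a parameter keeps the loop testable without waiting 10 minutes per iteration, and `max_polls` keeps a stuck PR from looping forever.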
Only after that entire loop settles does the code go to human review. Clerk still requires a human review for all changes that go to production.
Human reviewers generally focus on the product implications of a change rather than on nitpicks about the code itself or small bugs. The assumption is that those issues will already have been caught by one of the many AI code review tools.
If you found this interesting, you’d enjoy the full episode of my podcast (it’s also the first episode ever!). You can find it anywhere you listen to podcasts under “The Arjay McCandless Show”, or just watch it on YouTube here: https://www.youtube.com/watch?v=_MpCoMq7xOk
I’d really appreciate any feedback as I’m looking to get deeper into podcasting!
Enjoy the rest of your week!
Arjay
