Tools & Workflow
Updated June 2026
This is how I’m building right now. The tools change quickly. The shape of the work changes less: read the context, settle the boundaries, test the model, prototype the remaining interface question, review the plan, build, then mark up the running thing until it holds together.
Frame
Docs first. grill-with-docs defines terms, scope, and the line between product decisions and UI questions. Codex tests the model underneath: data shape, states, ownership, failure cases, and drift risks.
Shape
When the product decisions are settled, paper-prototype creates a brief before touching Paper. The fidelity depends on the question. Rough if I’m testing structure. More realistic if the component treatment or implementation path matters. I use the output as editable options, make changes myself, then feed the chosen direction back into the build.
Review
Before code, I run pre-build-review. It checks for missing states, mobile issues, accessibility gaps, unclear data assumptions, and implementation ambiguity. Product uncertainty goes back to framing. Interface uncertainty goes back to Paper. Build detail goes into the plan.
Build
I'm ideally aiming for an agent-agnostic setup, though the Codex desktop app is too good to leave right now so it remains my main tool. I also regularly test Conductor and cmux running Pi alongside it.
One branch per outcome, one worktree per parallel thread. Separate worktrees keep experiments clean without cluttering the repo.
For interface-heavy work, /ui-craft keeps the implementation crisp at the component, interaction, accessibility and detail level while the build is still taking shape.
Refine
Once there is something running, I mark it up. Agentation or Codex annotations for spacing, hierarchy, states, copy, motion, and edge cases. The agent makes the changes. I check the result in the browser. Repeat until the interface feels intentional.
Ship & Learn
If it needs to be public, I use Vercel or GitHub Pages. Afterwards I look at what changed in the process. Which prompt got sharper. Which skill needs an update. Which step caught a problem early. Which step did not earn its place.