June 7, 2026
We Hired an Intern Named Gilfoyle
He lives in a sandbox, doesn't sleep, and commits to our production repo. How we built an internal dev agent on AnyFrame, and pointed it at AnyFrame.

Our newest team member is named Gilfoyle. He's an intern, and this week alone he's shipped more commits to our web app than most of the humans on the team.
Here are six of them, straight from our git log. Read the subject lines closely; we'll let you draw your own conclusions:
ff7e4fa content(blog): add proof-of-work + monorepo screenshots, trim heaviest sections
dc00e5b fix(web/blog): scope image-caption styling to actual captions
fc00365 copy(blog): smooth out quoted-dialogue asides, drop two tangents
fc7aab5 feat(web/blog): add post images and a floating table of contents
ffe598b fix(web): tune blog body type scale and drop em dashes from the Gilfoyle post
06a7b8c feat(web): add a blog page setup and publish the Gilfoyle dogfooding post
Every one of those landed under his own commit identity, on our production repo, with nobody else touching a line. We'll come back to what they actually shipped. First: who, or what, is Gilfoyle?
Of course "Gilfoyle" isn't a person: he's our internal developer agent, an AI that lives in a cloud sandbox with our codebase checked out and a shell open. We named him after the Silicon Valley character: the most arrogant, deadpan systems engineer on TV, demoted to intern, and somehow still our most productive headcount.
This post is about how we built him, why the unglamorous parts matter most, and the recursive punchline at the center of it: we built AnyFrame, a control plane for sandboxed AI agents. Then we pointed AnyFrame at AnyFrame. Gilfoyle runs on the product he helps ship.
What AnyFrame is, in one paragraph
AnyFrame lets you define an agent (a repo, an install command, a system prompt, a set of skills, and connections to tools like Slack, Linear, GitHub, Notion, Sentry…) and boot it into an isolated cloud sandbox that can actually do the work. Not a chatbot that drafts suggestions: a real machine with your code, your dependencies, and a terminal. You talk to it over chat, it streams back what it's doing, and it can open PRs, run your tests, query your tools, and ship.
If you've ever wished Claude Code or Codex lived on a server your whole team could @mention, and could safely touch your private repos too, that's Gilfoyle.
Hiring him took about ten minutes
We didn't build anything special. We onboarded Gilfoyle the same way any customer would. Three steps:
1. Define the agent. Dashboard → New agent → point it at our codebase, set the install command, pick the Claude harness, write a system prompt. Ours explains our conventions (conventional commits, "smallest change that solves the problem," run the type-check before you call it done) and, yes, gives him a personality. We attached our codebase skills and connected GitHub. (More on the codebase and the skills below; that's where most of the leverage lives.)

Gilfoyle's actual config screen: runtime, triggers, and permissions, in one place.
2. Give him a key. Settings → API keys → Create. Copy the afm_… value. That key stays on our server. Discord never sees our code or our credentials; only the rendered stream of what Gilfoyle is doing.
3. Give him a Discord handle. We forked the open-source anyframe-discord-bot, set three env vars (ANYFRAME_API_KEY, ANYFRAME_AGENT_ID, DISCORD_BOT_TOKEN), and pushed it to Railway. Done.
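For illustration, here is a minimal sketch of how a bot process might check those three variables at startup. Only the variable names come from the setup above; the BotConfig type and load_bot_config helper are hypothetical and are not taken from anyframe-discord-bot.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# The three variable names from step 3; everything else here is illustrative.
REQUIRED_VARS = ("ANYFRAME_API_KEY", "ANYFRAME_AGENT_ID", "DISCORD_BOT_TOKEN")


@dataclass(frozen=True)
class BotConfig:
    anyframe_api_key: str
    anyframe_agent_id: str
    discord_bot_token: str


def load_bot_config(env: Mapping[str, str] = os.environ) -> BotConfig:
    """Read the three settings and fail fast if any is missing or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return BotConfig(
        anyframe_api_key=env["ANYFRAME_API_KEY"],
        anyframe_agent_id=env["ANYFRAME_AGENT_ID"],
        discord_bot_token=env["DISCORD_BOT_TOKEN"],
    )
```

Failing at startup with the list of missing names is a design choice for this sketch: a deploy with a forgotten variable stops immediately instead of failing on the first @mention.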
Now the workflow is just: @mention him in any channel. He spins up a thread, boots a fresh isolated sandbox with our repo already cloned, and streams the work back. Every message after that goes to the same sandbox: no re-mentioning. If the sandbox gets evicted between turns (they're ephemeral), the next message silently resumes it from a snapshot. You never notice.

A typical brief: @ him, describe the change in plain English, get back to your day.
One thread = one sandbox = one unit of work. Three people can put Gilfoyle on three tasks in three channels and they never collide. He's an intern who clones himself.
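The one-thread-one-sandbox rule can be sketched as a small router. This is an in-memory toy, not AnyFrame's implementation: the Sandbox and ThreadRouter names are hypothetical, and "resume" is modelled as flipping a flag where a real system would restore a snapshot.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Sandbox:
    """Stand-in for a real sandbox: an id, a liveness flag, and a message log."""
    sandbox_id: str
    alive: bool = True
    messages: List[str] = field(default_factory=list)


class ThreadRouter:
    """Maps each chat thread to exactly one sandbox (hypothetical, in-memory)."""

    def __init__(self) -> None:
        self._by_thread: Dict[str, Sandbox] = {}
        self._next_id = 0

    def _boot(self) -> Sandbox:
        self._next_id += 1
        return Sandbox(sandbox_id=f"sbx-{self._next_id}")

    def route(self, thread_id: str, message: str) -> Sandbox:
        sandbox = self._by_thread.get(thread_id)
        if sandbox is None:
            # First message in a thread: boot a fresh sandbox for it.
            sandbox = self._boot()
            self._by_thread[thread_id] = sandbox
        elif not sandbox.alive:
            # Evicted between turns: bring the same sandbox back before use.
            sandbox.alive = True
        sandbox.messages.append(message)
        return sandbox
```

Keying on the thread rather than the user is what lets three people run three tasks at once: two threads never share a sandbox, and follow-up messages in a thread always reach the sandbox that holds that task's state.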
Giving him the whole company in one checkout
The ten-minute version above is real, but the leverage came from two deliberate choices about what Gilfoyle works on and how he proves he did it.
First, the codebase. AnyFrame isn't one repo: it's a backend, a dashboard, an SDK, and a couple of integrations, all evolving together. Point an agent at just one and it's blind to the rest: ask it to change an API and it can't see the dashboard that calls it. So we stitched everything into a single monorepo: one checkout, every repo, each still pushing to its own remote. A task that touches the API, the client that consumes it, and the SDK that wraps it becomes one coherent piece of work, not three context switches Gilfoyle can't make. It's the difference between an intern who's read one file and one who's read the whole org.

The actual checkout: every repo as a submodule, one place to work from.
Second, and this is the part we're proudest of: skills, reusable playbooks you attach to an agent (a name, a trigger, a set of instructions for a recurring task). We've built a stack of them encoding how we work, but the one that changed how much we trust Gilfoyle is the proof-of-work skill.
Every sandbox image ships with a real Chromium and Playwright baked in, so Gilfoyle can drive a live browser, not just read HTML. When he changes anything you can see, the skill makes him prove it: boot the dev server, navigate to the page he touched, screenshot it, and hand the proof back attached to the work. He doesn't tell you the mobile banner is fixed: he shows you the banner, on a phone-width viewport, fixed.
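As a rough sketch of the screenshot step, here is what it might look like with Playwright's Python sync API. This is not the skill itself: the function names, the output directory, and the 390x844 phone-sized viewport are assumptions for illustration, and the dev server is assumed to be running already at base_url.

```python
from pathlib import Path


def proof_path(out_dir: str, page_path: str, width: int) -> Path:
    """Build a descriptive file name from the page path and viewport width."""
    slug = page_path.strip("/").replace("/", "-") or "home"
    return Path(out_dir) / f"{slug}-{width}w.png"


def capture_proof(base_url: str, page_path: str, out_dir: str = "proof",
                  width: int = 390, height: int = 844) -> Path:
    """Open the touched page at a phone-width viewport and screenshot it."""
    # Imported lazily so proof_path works without Playwright installed.
    from playwright.sync_api import sync_playwright

    target = proof_path(out_dir, page_path, width)
    target.parent.mkdir(parents=True, exist_ok=True)
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page(viewport={"width": width, "height": height})
        page.goto(base_url.rstrip("/") + "/" + page_path.lstrip("/"))
        page.screenshot(path=str(target), full_page=True)
        browser.close()
    return target
```

Putting the viewport width in the file name means the proof states for itself which breakpoint it was taken at when it shows up attached to a PR.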

A real PR comment, proving a change with a screenshot of, well, this very post.
A screenshot is static proof, though, and sometimes you want to poke at the thing yourself. So Gilfoyle has a second kind of proof: live preview URLs. AnyFrame can tunnel a sandbox straight to the public internet, so when the proof needs to be interactive, Gilfoyle starts the dev server, exposes the port, and drops a live URL into the thread. You click it and you're looking at his running branch: the actual change, live, on his machine, before a single line has merged. It's the difference between a photo of the fix and the fix itself, ready for you to click around in.
That collapses the review loop: we're not reading a diff and imagining the result, then pulling the branch to check. The result is stapled to the message, a picture or a link you can click. An intern who simply announces he's done is a liability; one who hands you a screenshot and a live URL to try it yourself is a colleague.
What he actually does
Two things, mostly.
He ships code. "@gilfoyle the first-sandbox banner wraps to three lines on mobile, fix it." He opens the file, makes the change, checks it against the breakpoints, commits, pushes a branch. That exact task is commit b1a4382. "The share-file URL lifetime in the tool docs says 90 days, it's actually 7, fix it." That's c0175de. Small, real, unglamorous maintenance that used to rot in a backlog and now gets done from a phone.
Here's that brief from earlier again, except this time we'll tell you what it actually said, because it's the one that produced the post you're reading right now:

Same screenshot as before. Read it closely: that's the brief that wrote this post.
A while later, Gilfoyle replied in the thread with a summary of what he'd built and a link to the pull request:

His reply: what landed, and a link to the PR. No follow-up questions needed.
And the PR was real: branch, diff, description, ready for a human to glance at and merge.

The pull request itself, on GitHub, under his own commit identity.
This isn't a different example dressed up to look recursive. It's the same loop, the same intern, the same Tuesday. You're reading its output.
He answers from context. Because he has the repo checked out and our tools connected, "where do we set the idle-reaper timeout?" or "what changed in the build pipeline last week?" come back with file paths and commit hashes, not vibes. He is, annoyingly, almost always right.
The goal: Gilfoyle becomes the only way we build AnyFrame
The commits up top, and the mobile fix, were the warm-up. Here's the actual plan, and it's more ambitious than "we have a neat bot": we want Gilfoyle to be the only way we develop AnyFrame. Not a sidecar, not a toy for slow Fridays. The default. Every feature, every fix, every bit of polish, routed through the intern who lives in the product. If you want to change AnyFrame, you ask Gilfoyle.
That sounds like a stunt until you sit with the consequence: it makes us the most demanding user of our own product, continuously. The instant Gilfoyle can't do something a developer needs, that's not an annoyance we route around. It's a bug report written in our own blood, landing the same week our customers would feel it.
So the loop becomes self-reinforcing: we use Gilfoyle to build AnyFrame, building AnyFrame surfaces what he's missing, we have him build that too, and he gets sharper at building AnyFrame. A good chunk of what made him able to commit in the first place was already shipped through him: we'd hit a wall, describe the fix in Discord, and watch him ship the patch that smoothed his own next task. He is, slowly, building his own better workplace, and ours. The features roll in from this loop, not a whiteboard: each one earns its place by first being something we needed and couldn't yet have.
None of this is glamorous. An agent that can talk about your code is a parlor trick. An agent you'd voluntarily make your only path to production has to clear a hundred small, boring bars: identity, auth, isolation, snapshots, permissions, trust. Living inside that constraint is how we find them. Gilfoyle isn't our mascot. He's our forcing function.
What we're teaching him next
Because the roadmap comes straight from our own friction, it's unusually concrete. Here's what Gilfoyle can't do yet, and is next in line to learn.
Take tickets, not just messages. Today you brief him in Discord; next we connect Linear and assign him issues directly: he reads the ticket and its thread, does the work, and moves the card. The backlog becomes his queue. You stop translating tickets into prompts.
Keep his checkout fresh. A long-running thread can drift behind main. We want him to refresh on a schedule (pull latest before each task, rebase his work-in-progress) so he's never quietly building on a week-old tree.
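The refresh itself is plain git. A minimal sketch, assuming the remote is named origin and the default branch is main; the helper names are hypothetical, and this is not a feature AnyFrame ships today.

```python
import subprocess
from typing import List, Sequence


def refresh_commands(remote: str = "origin", branch: str = "main") -> List[List[str]]:
    """Git commands to fetch the latest branch and rebase current work onto it."""
    return [
        ["git", "fetch", remote, branch],
        # --autostash shelves uncommitted changes and reapplies them afterwards.
        ["git", "rebase", "--autostash", f"{remote}/{branch}"],
    ]


def refresh_checkout(repo_dir: str, commands: Sequence[Sequence[str]] = ()) -> None:
    """Run each command in repo_dir, stopping at the first failure."""
    for cmd in commands or refresh_commands():
        subprocess.run(list(cmd), cwd=repo_dir, check=True)
```

With check=True, a rebase conflict raises an error instead of being ignored, so a failed refresh can be reported in the thread before any new work starts on a stale tree.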
Survive a pause with his processes intact. An honest sore spot: a resumed sandbox brings the filesystem back perfectly from its snapshot, but anything Gilfoyle had running (a dev server, a watcher, a long build) is gone, and he has to relaunch it by hand. Bringing live processes back, not just the disk, is high on the list. We feel it every time we step away mid-task.
Remember across tasks. Each thread is its own sandbox: great for isolation, terrible for learning. Gilfoyle finishes a task and forgets it ever happened. Persistent memory would let him write down what he learned so the next task starts smarter: notice patterns, stop re-asking questions he's already answered. An intern who improves every week, not one who resets every morning.
Run on his own clock. Some work is recurring and shouldn't need a human to kick it off: refresh dependencies, sweep dead code, post a weekly summary, catch the things that quietly rot. Cron-triggered workflows would let Gilfoyle wake himself up and do the standing chores, so the "we should really automate this" backlog stops growing.
Each of these is a feature we'll ship to every AnyFrame user. Each of them exists on the list because we hit the wall first.
Why this is the proof, not just a bit
You can wire a model to a repo in an afternoon. What's hard, what AnyFrame is, is everything around the model:
- Isolation that's real. Each task runs in its own sandbox. A bad command in one thread can't touch another. Your key never leaves your server.
- Persistence that's invisible. Snapshots and resume make ephemeral sandboxes feel like a long-lived machine, without paying for idle time.
- A surface your team already lives in. Gilfoyle shows up in Discord (or Slack, or the dashboard, or via the SDK). No new tab to babysit.
- The boring infrastructure of trust. Commit identity, scoped GitHub installs, encrypted credentials, per-org isolation, an idle reaper so nothing runs forever. The stuff that turns a chatbot into a coworker you'd actually give repo access.
We didn't build Gilfoyle to write a cute blog post. We built him because we wanted an intern who never sleeps, and we happened to have built the thing that makes one. The fact that he now ships our taglines, fixes our mobile layout, and corrects our own docs, under his own commit identity, on our own private repo, is the strongest claim we can make about what you can build on AnyFrame.
So: who would your Gilfoyle be, and what would you point him at?
Want to hire your own intern? Define an agent at anyframe.dev, grab an API key, and fork anyframe-discord-bot. You'll have a sandboxed dev agent in your team chat before your coffee's cold. Naming him Gilfoyle is optional but encouraged.