Falconer agent now speaks git

Many questions about a codebase aren’t about the code as it stands today. They’re about how it got that way. Who owns this auth module and should be on my PR? Which files have churned the most this quarter? When did this config flag actually get flipped on in prod?

Today the Falconer agent can call git directly against your connected GitHub repos through a new agent tool, gitRaw. It’s live in production for every customer with a connected GitHub installation.

What this unlocks

These types of questions used to require GitHub’s web UI, a local clone, or a teammate who happened to remember. Now they’re only a question away in Falconer.

  • Code archaeology: who wrote this, when, and why, via git blame and commit messages.
  • Time travel: inspect any file at any commit, tag, or date. Compare how code evolved across releases.
  • Beyond PRs: surface direct commits, hotfixes, and squash details that PR search alone misses.
  • Authorship and ownership: find the right reviewer or expert for a file from real commit history, not Slack lore.
  • Release diffs: “what shipped between v1.2 and v1.3” across tags.
  • Regression hunting: narrow down when a behavior changed with a bisect-style search over commit history.
  • Churn and hotspots: identify the busiest files and riskiest surfaces from real commit activity.

[Figure: Falconer agent invoking git tools to reconstruct schema.prisma evolution. Caption: Falconer agent reaching for git tools to answer precise questions on repo history.]
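
To make the list above concrete, here is a rough sketch of the kind of git arguments such questions could map to, plus a small parser for the churn case. The file paths, the three-month window, and the helper name count_churn are illustrative assumptions, not Falconer's actual prompts or code; v1.2 and v1.3 are the example tags from the list.

```python
from collections import Counter

# Illustrative argument lists only. Paths and the date window are
# hypothetical; v1.2 and v1.3 stand in for real release tags.
EXAMPLE_QUERIES = {
    "archaeology": ["blame", "-L", "10,40", "--", "prisma/schema.prisma"],
    "time_travel": ["show", "v1.2:prisma/schema.prisma"],
    "release_diff": ["log", "--oneline", "v1.2..v1.3"],
    "ownership": ["shortlog", "-sn", "HEAD", "--", "prisma/schema.prisma"],
    "churn": ["log", "--since=3 months ago", "--name-only", "--format="],
}


def count_churn(name_only_log: str) -> list[tuple[str, int]]:
    """Rank files by how many commits touched them.

    Expects the text printed by the "churn" query above: one path per
    line, possibly with blank lines between commits.
    """
    paths = (line.strip() for line in name_only_log.splitlines())
    return Counter(p for p in paths if p).most_common()


if __name__ == "__main__":
    sample = "src/auth.ts\nprisma/schema.prisma\n\nsrc/auth.ts\n"
    for path, commits in count_churn(sample):
        print(f"{commits:3d}  {path}")
```

Counting paths from `--name-only` output is a blunt measure of churn (it ignores how many lines changed), but it is often enough to spot hotspots.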

How it works under the hood

Every connected repo now lives on a shared NFS filesystem backed by S3 Files, mounted read/write by our ingest service and read-only by Falconer’s UI service. When the agent calls gitRaw, it shells out against the clone at /repos/{orgId}/github/{owner}/{repo} and streams the result back into the agent context. No GitHub API calls, no rate limits, no constrained view of history.
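
A minimal sketch of what "shelling out against the clone" could look like, assuming a Python service for illustration. Only the /repos/{orgId}/github/{owner}/{repo} layout comes from the description above; the function names clone_path, build_argv, and stream_git are hypothetical, and the real tool may be structured quite differently.

```python
import subprocess
from pathlib import PurePosixPath
from typing import Iterator

REPOS_ROOT = PurePosixPath("/repos")  # mount point, per the layout in the post


def clone_path(org_id: str, owner: str, repo: str) -> PurePosixPath:
    """Build /repos/{orgId}/github/{owner}/{repo} for one connected repo."""
    return REPOS_ROOT / org_id / "github" / owner / repo


def build_argv(clone: PurePosixPath, git_args: list[str]) -> list[str]:
    # `git -C <dir>` runs the command as if git had been started in <dir>.
    # Passing a list (no shell=True) avoids shell interpolation of arguments.
    return ["git", "-C", str(clone), *git_args]


def stream_git(clone: PurePosixPath, git_args: list[str]) -> Iterator[str]:
    """Yield git's stdout line by line instead of buffering it all."""
    proc = subprocess.Popen(
        build_argv(clone, git_args), stdout=subprocess.PIPE, text=True
    )
    assert proc.stdout is not None
    try:
        yield from proc.stdout
    finally:
        proc.stdout.close()
        proc.wait()


if __name__ == "__main__":
    clone = clone_path("org_123", "acme", "api")  # hypothetical identifiers
    print(build_argv(clone, ["log", "--oneline", "-5"]))
```

Because the command runs against a local clone, the full history is available to every query without going through the GitHub API.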

gitRaw runs an allowlisted set of read-only git subcommands as a subprocess directly against a customer’s persistent repo clone:

  • log, show, diff, blame, grep, ls-tree, rev-list, shortlog, cat-file, name-rev, describe, merge-base

Anything that writes, hits the network, or falls outside the allowlist is rejected at the tool boundary. Every invocation is scoped to a single {orgId, owner, repo} clone. There is no path traversal between customers.
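
The boundary checks described above might be sketched roughly as follows. The allowlist is the one given in the post; the exception name, the function names, and the path-segment rule are assumptions made for illustration, not Falconer's implementation.

```python
import re
from pathlib import PurePosixPath

# Subcommands listed in the post.
ALLOWED_SUBCOMMANDS = frozenset({
    "log", "show", "diff", "blame", "grep", "ls-tree",
    "rev-list", "shortlog", "cat-file", "name-rev", "describe", "merge-base",
})

# Assumed rule: identifiers may not contain slashes or other separators.
_SAFE_SEGMENT = re.compile(r"[A-Za-z0-9._-]+")


class GitRawRejected(ValueError):
    """Raised when a request is refused before any subprocess starts."""


def check_subcommand(git_args: list[str]) -> str:
    """Accept only a bare, allowlisted subcommand as the first argument."""
    if not git_args:
        raise GitRawRejected("empty command")
    subcommand = git_args[0]
    # Leading options such as -C or -c are not in the allowlist, so they
    # are rejected here too and cannot redirect git to another directory.
    if subcommand not in ALLOWED_SUBCOMMANDS:
        raise GitRawRejected(f"subcommand not allowed: {subcommand!r}")
    return subcommand


def scoped_clone(org_id: str, owner: str, repo: str,
                 root: PurePosixPath = PurePosixPath("/repos")) -> PurePosixPath:
    """Resolve a single {orgId, owner, repo} clone, refusing traversal."""
    for segment in (org_id, owner, repo):
        if segment in {".", ".."} or not _SAFE_SEGMENT.fullmatch(segment):
            raise GitRawRejected(f"unsafe path segment: {segment!r}")
    return root / org_id / "github" / owner / repo


if __name__ == "__main__":
    print(check_subcommand(["blame", "--", "README.md"]))
    print(scoped_clone("org_123", "acme", "api"))
```

A subcommand check alone is not a complete boundary: some allowlisted commands accept flags that write files (for example `--output=<file>` on the diff family), so a real implementation would presumably vet options as well.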

We built the repo sync and storage for reliability: the clone is kept fresh by two parallel paths. For details, see our blog post How Falconer powers agents with AWS S3 Files.

Why one raw tool, not many

The alternative was a handful of dedicated tools (git_blame, git_log, git_diff), each with a tight schema and inline guidance. We picked the raw tool instead.

Real git questions don’t fit one subcommand. Answering “who introduced this bug” might mean a blame, then a log on the suspect commit, then a diff against the parent. The agent needs to compose that chain itself. A raw interface lets it use git the way an engineer does, not through a narrow menu of predefined queries.
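
As a rough illustration of that blame, log, diff chain, the sketch below builds the three argument lists and parses the commit hash out of `git blame --porcelain` output. The helper names are hypothetical, and this is one plausible way to compose the steps, not a record of what the agent actually runs.

```python
import re

# Full object names: 40 hex chars (SHA-1) or 64 (SHA-256 repositories).
_SHA = re.compile(r"[0-9a-f]{40}|[0-9a-f]{64}")


def blame_args(path: str, line: int) -> list[str]:
    """Step 1: ask which commit last touched a single line."""
    return ["blame", "--porcelain", "-L", f"{line},{line}", "--", path]


def suspect_commit(porcelain: str) -> str:
    """Extract the commit hash from the first header line of porcelain blame.

    The header has the form "<sha> <orig-line> <final-line> [<count>]".
    """
    header = porcelain.split("\n", 1)[0].split()
    if not header or not _SHA.fullmatch(header[0]):
        raise ValueError("unexpected blame output")
    return header[0]


def follow_up_args(sha: str, path: str) -> list[list[str]]:
    """Steps 2 and 3: describe the suspect commit, then diff it
    against its first parent for the file in question."""
    return [
        ["log", "-1", "--format=%H %an %ad %s", sha],
        ["diff", f"{sha}^", sha, "--", path],
    ]


if __name__ == "__main__":
    sha = "0123456789abcdef0123456789abcdef01234567"  # made-up hash
    fake_output = f"{sha} 10 12 1\nauthor Jane Doe\n"
    found = suspect_commit(fake_output)
    for args in follow_up_args(found, "src/auth.ts"):
        print(args)
```

Each step depends on the output of the previous one, which is why a fixed menu of single-purpose tools fits this workflow poorly. Note that `<sha>^` does not exist for a root commit, so a real chain would need a fallback such as `show`.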

The tradeoff is that the agent has to know its way around git. In practice that’s table stakes for the model. Flexibility wins.

Try it

If your team has connected GitHub to Falconer, gitRaw is on today. Ask the Falconer agent any question you’d normally answer by digging through git log, git blame, or the GitHub commits view.