There’s a class of bug in Rails apps that’s easy to ship and hard to debug. A record changed. You need to know who changed it and why. You look in paper_trail. Nothing. You check the model’s after_update callbacks. Still nothing. You trace the call site and find one of the usual suspects: someone used update_columns to bypass validations on a critical field. A rake task did a bulk update_all. A migration touched the row directly. Someone ran a console fix at two in the morning.
There’s also a newer flavor of this bug. An agent’s tool issued raw SQL because the model abstraction didn’t fit the tool’s shape, or because the agent constructed an UPDATE that bypassed the model layer entirely. The write happened. The audit didn’t see it.
We hit this bug enough times at Trusted that we built our way out of it. The result is IronTrail, an open-source Rails gem that does change data capture at the Postgres trigger layer instead of the ActiveRecord callback layer. We’ve been running it in production for over a year. It is MIT licensed, on RubyGems, and small enough to read in an afternoon.
The architectural shift it represents---moving the capture point from the application layer to the storage layer---matters more in a stack where agents are writing to your database.
The problem the application layer can’t see
Callbacks are a great audit point when the application layer is the only thing writing to your database. In Rails, that is usually true. Most writes go through save, update, or destroy, which fire callbacks. paper_trail and its alternatives tap into those callbacks and record what changed. This is mature, well-supported software. For most apps it is the right design, and we say that as long-time happy users.
But “most writes” is not “every write.” The places callbacks don’t fire are predictable, and they accumulate:
- update_columns and update_column skip callbacks (and validations) by design.
- update_all, delete_all, and upsert_all operate at the SQL level.
- delete (as opposed to destroy) skips callbacks.
- Migrations execute SQL directly.
- Anything that goes through connection.execute or a similar raw-SQL escape hatch.
- Anything that touches the database from outside Rails entirely: a Python worker, a SQL client, another service connected to the same Postgres.
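Each of these is ordinary, idiomatic Rails. For concreteness (model and column names invented for illustration):

# None of these fire ActiveRecord callbacks, so callback-based audit never sees them.
user.update_columns(email: 'new@example.com')       # skips callbacks and validations
User.where(plan: 'trial').update_all(plan: 'free')  # one UPDATE, no models instantiated
user.delete                                         # destroy would fire callbacks; delete won't
ActiveRecord::Base.connection.execute(
  "UPDATE users SET admin = true WHERE id = 1"      # the raw-SQL escape hatch
)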
In a hundred-engineer codebase the appearance of any of these---usually for a legitimate performance reason---is just a matter of time. And the writes that miss callbacks are often the writes you most want audited: the bulk operations, the hotfixes, the migrations.
There’s a newer flavor of this problem. Agents write to your database. They do it through tools you wrote, which call ActiveRecord (and fire callbacks) most of the time. But agents do unexpected things. A tool might construct raw SQL because the model abstraction felt awkward. A buggy tool might bypass the audit field a careful engineer would have set. As agents accumulate in the stack, the surface area of code that might write to your DB expands faster than your audit instrumentation can keep up.
There’s a parallel problem that has nothing to do with agents and is just as common: internal ops tooling. An ops engineer runs an action in Active Admin---bulk re-tag a cohort, retry a stuck onboarding step, fix a single field on a single row. The write goes through ActiveRecord and fires the callbacks. But the who gets fuzzy. A callback recording an actor needs to know about the engineer running the operation, not the application user the row belongs to, and most callback-based audit setups end up labeling the change with a generic system actor. That’s the wrong answer when an audit anomaly later needs to trace back to a person. The write-capture question and the actor-identity question are separate problems, and any team running Active Admin hits the second one whether or not they ever run an AI agent.
What we wanted was every write---application code, migrations, agent tools, console fixes, Active Admin actions---captured automatically, with application context (current user, current request, Sidekiq job) attached.
Where off-the-shelf tools fit, and don’t
| Tool | Capture point | Coverage | Ops tax |
|---|---|---|---|
| paper_trail | ActiveRecord callbacks (application layer) | Most writes, but misses update_columns, update_all, delete, migrations, raw SQL, anything outside Rails | Low. One gem, one shared versions table (or one per model). |
| WAL streaming (e.g. Debezium) | Postgres write-ahead log (below SQL layer) | Every write, including writes from outside the application | High. External workers, log positions, replication slots to keep healthy, eventual delivery semantics. |
| IronTrail | Postgres triggers (storage layer, inside the transaction) | Every write through Postgres---application code, migrations, agent tools, console fixes, Active Admin actions | Low-medium. One gem, one partitioned table, no external streaming infra. |
paper_trail solves most of this for most apps and I love it; it doesn’t fit our case structurally because callbacks are an application-layer concern and we needed audit at the storage layer. WAL streaming covers every write but the operational tax is steep unless you already run streaming infrastructure. What we wanted was the developer experience of paper_trail with the coverage of WAL-level capture. The path between the two design points runs through Postgres triggers.
What IronTrail does
A single PL/pgSQL function (in the gem’s lib/) is attached as a trigger to every table you want tracked. Postgres calls the function on every INSERT, UPDATE, and DELETE. The function logs the change to a single irontrail_changes table with the following shape:
 Column     | Type
------------+------------------------------
 id         | bigint
 actor_type | text
 actor_id   | text
 rec_table  | text
 rec_id     | text
 operation  | text
 rec_old    | jsonb
 rec_new    | jsonb
 rec_delta  | jsonb
 metadata   | jsonb
 created_at | timestamp without time zone
Partition key: RANGE (created_at)
A few choices in that schema are worth flagging.
The actor is a top-level column, not buried inside the metadata blob. actor_type and actor_id make actor-based queries fast: “show me every write this user has made in the last 24 hours” is an indexable query, not a JSONB scan.
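For instance, a query like this (our own construction, not from the gem; it assumes an index covering actor_type, actor_id, and created_at) is an index scan, not a table scan:

SELECT rec_table, rec_id, operation, rec_delta, created_at
FROM irontrail_changes
WHERE actor_type = 'User'
  AND actor_id = '4711'
  AND created_at > now() - interval '24 hours'
ORDER BY created_at DESC;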
There is one audit table for everything, partitioned by created_at. Compare paper_trail, which by default uses one shared versions table with item_type discrimination (per-model version tables are an opt-in). We found a single time-partitioned table easier to operate. Time-range queries are fast; cross-model audits are trivial; partition management is straightforward.
Both old and new are captured as JSONB, plus a precomputed delta of just the fields that changed. The delta is what most downstream consumers actually care about. Computing it once in the trigger is cheaper than recomputing it across every query later.
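It also makes field-level questions cheap to ask. A sketch (table and column borrowed from the example later in this post; ? is the JSONB key-existence operator):

SELECT rec_id, actor_type, actor_id, rec_delta, created_at
FROM irontrail_changes
WHERE rec_table = 'clinicians'
  AND rec_delta ? 'license_state'
ORDER BY created_at DESC;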
Installation is short:
# Gemfile
gem 'iron_trail'
$ bin/rails g iron_trail:migration
$ bin/rails db:migrate
$ rake iron_trail:tracking:enable
The last command attaches the trigger to every table that doesn’t already have it (skipping tables on a configurable ignore list). A symmetric rake iron_trail:tracking:disable removes the triggers. rake iron_trail:tracking:status shows what’s tracked, what isn’t, and what’s explicitly ignored.
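Under the hood, each tracked table ends up with a row-level trigger of roughly this shape (the trigger name matches what the gem installs; the function name here is a placeholder for the real one in the gem’s lib/):

CREATE TRIGGER iron_trail_log_changes
  AFTER INSERT OR UPDATE OR DELETE ON clinicians
  FOR EACH ROW EXECUTE FUNCTION irontrail_log_row();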
The metadata pattern
The interesting design decision in IronTrail is how application-layer context (current user, current request, Sidekiq job, anything else) gets to the trigger.
The naive answer is session variables: use SET LOCAL audit.user_id = '...' before each write and have the trigger read it back. That works but is fragile. It depends on the same connection being used for the SET and the write. Connection pools, transaction boundaries, and middleware reorderings create opportunities for the variable and the write to drift apart.
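Concretely, the fragile version looks like this (a sketch of the pattern, not IronTrail code):

BEGIN;
-- must run on the same connection as the write, inside the same transaction
SET LOCAL audit.user_id = '4711';
UPDATE clinicians SET license_state = 'CA' WHERE id = 123;
-- a trigger would read it back with current_setting('audit.user_id', true)
COMMIT;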
IronTrail uses a different mechanism. A Rails query transformer prepends a comment to every write query:
# from MetadataTransformer (paraphrased)
def create_query_transformer_proc
proc do |query, adapter|
current_metadata = IronTrail.current_metadata
next query unless adapter.write_query?(query) && current_metadata.any?
metadata = JSON.dump(current_metadata)
# a '*/' inside the JSON would close the comment early; escape it
safe_md = metadata.gsub('*/', '*\/')
"/*IronTrail #{safe_md} IronTrail*/ #{query}"
end
end
So a normal Rails update gets sent to Postgres like:
/*IronTrail {"_actor_type":"User","_actor_id":"4711","request_id":"abc"} IronTrail*/
UPDATE clinicians SET license_state = 'CA' WHERE id = 123;
Inside the trigger, the PL/pgSQL function reads the metadata back from current_query():
SELECT split_part(
split_part(current_query(), '/*IronTrail ', 2),
' IronTrail*/', 1
) INTO it_meta;
The trigger parses the JSON, extracts _actor_type and _actor_id into their dedicated columns, and writes the rest into the metadata JSONB.
A few things follow from this choice.
The metadata travels with the statement, not with the connection. Connection pools, transaction boundaries, async writes through other libraries---none of them can desync the metadata from the write, because the metadata is part of the write’s SQL text.
The metadata is captured atomically with the write. There’s no two-phase “SET then write” window where one half succeeds and the other doesn’t.
Application code doesn’t need to know IronTrail exists. The Rails ApplicationController (and an analogous Sidekiq middleware) sets the current metadata once per request or job. The query transformer picks it up on every write that runs inside that request or job, regardless of which library issued the query.
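The request-side half is small. A sketch, assuming current_metadata is the mutable per-request store the transformer above reads (the gem may expose a dedicated setter; treat these calls as illustrative, not the documented API):

class ApplicationController < ActionController::Base
  before_action :set_audit_metadata

  private

  def set_audit_metadata
    return unless current_user

    # _actor_type and _actor_id get promoted to dedicated columns by the trigger;
    # everything else lands in the metadata JSONB.
    IronTrail.current_metadata.merge!(
      '_actor_type' => current_user.class.name,
      '_actor_id' => current_user.id.to_s,
      'request_id' => request.request_id
    )
  end
end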
You pay a small cost in query parsing overhead and one comment’s worth of SQL bytes per write. We’ve found that price tolerable.
What this looks like in production
We’ve been running IronTrail in our main Rails app for over a year. As of the talk we gave on this internally, the system tracks 94 tables, with 262 explicitly ignored (mostly legacy paper_trail tables we’re still migrating off and high-volume tables that aren’t worth auditing). The irontrail_changes table has accumulated roughly 30 million change rows, growing by one to seven million per week depending on system activity.
We run a few pieces of supporting infrastructure that are worth naming.
PartitionedTablesKeeperJob is a daily Sidekiq job that creates the next set of time-partitioned tables ahead of time. We use pg_party for partition management, half-month partitions (irontrail_changes_202507h1 for the first half of July 2025, _h2 for the second half), and the keeper job creates the next four months on every run.
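In outline, the job looks something like this (a sketch: the model name IronTrail::Change and the day-16 half-month boundary are assumptions, and pg_party’s create_partition does the heavy lifting):

class PartitionedTablesKeeperJob
  include Sidekiq::Job

  def perform
    month = Time.current.beginning_of_month
    4.times do
      mid = month + 15.days
      ensure_partition(month, mid, "#{month.strftime('%Y%m')}h1")
      ensure_partition(mid, month.next_month, "#{month.strftime('%Y%m')}h2")
      month = month.next_month
    end
  end

  private

  def ensure_partition(from, to, suffix)
    name = "irontrail_changes_#{suffix}"
    return if ActiveRecord::Base.connection.table_exists?(name)

    # create_partition comes from pg_party on range-partitioned models
    IronTrail::Change.create_partition(start_range: from, end_range: to, name: name)
  end
end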
A companion table, irontrail_trigger_errors, captures any change events that the trigger function couldn’t insert into irontrail_changes---most commonly because a partition didn’t exist for an incoming created_at. Failed audits land there with enough context to replay them later, and the application keeps running. Audit-trail failures should not break business writes.
Monitoring lives in Grafana, sourced from Prometheus. Three metrics matter:
- partitioned_table_coverage_days: how far into the future partitions exist. Alert if this dips below a threshold.
- iron_trail_error_max_created_at and iron_trail_error_max_id: how recent the newest row in irontrail_trigger_errors is. Alert if either is fresher than expected.
For Sidekiq workers, an optional middleware tags every audit row with the job that produced the write:
require 'iron_trail/sidekiq'
Sidekiq.configure_server do |config|
config.server_middleware do |chain|
chain.add IronTrail::SidekiqMiddleware
end
end
There’s also a PapertrailToIrontrailMapper helper for shops migrating off paper_trail. The shape conversion is mostly mechanical: item_type becomes rec_table, item_id becomes rec_id, event becomes operation, whodunnit and whodunnit_type become actor_id and actor_type, and object / object_changes map onto rec_old, rec_new, and the recomputed rec_delta. The migration runs as a backfill, model by model, while production continues writing to both schemes until you cut over.
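In code, the shape conversion reads roughly like this (a paraphrase of the mapping described above, not the mapper’s source; it assumes object and object_changes are already deserialized hashes and conventional Rails table naming):

def map_version(version)
  old_state = version.object || {}
  changes = version.object_changes || {} # { "field" => [old, new] }

  {
    rec_table: version.item_type.underscore.pluralize, # 'Clinician' -> 'clinicians'
    rec_id: version.item_id.to_s,
    operation: version.event, # create / update / destroy
    actor_id: version.whodunnit,
    actor_type: version.whodunnit_type,
    rec_old: old_state,
    rec_new: old_state.merge(changes.transform_values(&:last)),
    rec_delta: changes # the real mapper recomputes the delta properly
  }
end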
What we learned
A few production realities are worth flagging up front, since I didn’t see them coming. The compute overhead of the trigger itself has been smaller than I’d budgeted for. Storage growth has been the part to watch instead: the JSONB rec_old and rec_new columns add up on wide tables whose payloads mostly don’t change. High-churn tables (counters, queues, anything touched on every parent update) push the audit volume disproportionately, and the right answer is almost always the ignore list, not optimization. The bottleneck moved from compute to storage faster than I’d guessed. If you’re sizing this on day one, size the disk.
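If you want to watch that growth directly, partition sizes are one catalog query away (naming scheme from the partitioning setup above):

SELECT relname, pg_size_pretty(pg_total_relation_size(oid)) AS size
FROM pg_class
WHERE relkind = 'r' AND relname LIKE 'irontrail_changes_2%'
ORDER BY relname;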
A few more things that surprised me, or that I’d want a fresh implementer to know.
Triggers survive bad code paths. This is the headline benefit. A rake task that bypasses the ORM, a migration with raw SQL, a console fix made by an on-call engineer at two in the morning---they all write through Postgres, which means they all hit the trigger.
We have not had a “change went unaudited” incident since IronTrail shipped.
Test mode requires deliberate setup. Triggers slow down a test suite slightly, and tests that mock current time or current user can produce nondeterministic audit rows. We ship a test mode that swaps the trigger function for a no-op, plus per-spec opt-in to enable real tracking for tests that need to assert on the audit log. The function swap is fast and doesn’t add or drop triggers.
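Conceptually, the swap is just a CREATE OR REPLACE of the trigger function (a sketch; the gem ships a helper for this, and the function name is a placeholder):

CREATE OR REPLACE FUNCTION irontrail_log_row() RETURNS trigger AS $$
BEGIN
  RETURN NULL; -- log nothing; the triggers stay attached, so no per-test DDL churn
END;
$$ LANGUAGE plpgsql;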
Partitioned table coverage is the first thing to alert on. If PartitionedTablesKeeperJob fails silently, new writes start landing in irontrail_trigger_errors because the destination partition doesn’t exist. Monitor coverage as a leading indicator.
JSONB only, not JSON. The trigger function relies on JSONB operators (?, ||, jsonb_set). Tables with columns of type json (without the b) can’t be tracked until those columns are migrated to jsonb. Worth knowing if your codebase predates JSONB.
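The fix is a per-column migration. A generic sketch (table and column names hypothetical):

class ConvertSettingsToJsonb < ActiveRecord::Migration[7.1]
  def up
    execute "ALTER TABLE accounts ALTER COLUMN settings TYPE jsonb USING settings::jsonb"
  end

  def down
    execute "ALTER TABLE accounts ALTER COLUMN settings TYPE json USING settings::json"
  end
end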
Trigger removal is not handled automatically when you add a table to the ignore list. Adding a table to ignored_tables prevents new triggers from being installed, but doesn’t drop existing ones. Run DROP TRIGGER iron_trail_log_changes ON <table>; (or the disable rake task) explicitly when retiring tracking on a table.
The SQL-comment metadata trick is the right boundary. A common alternative is to write audit metadata into a separate column on every model (a last_updated_by_id, for instance). That works but leaks audit concerns into every table’s schema. The comment-injection pattern keeps audit data centralized and the model schemas unchanged.
Why this matters more with agents
The reason this post lives on a publication that mostly talks about agent-native architecture is that audit blind spots get worse, not better, when agents enter the stack.
In The Form Pattern we wrote about why we constrain agents through the same validation layer humans use: the agent fills out a form, submits it, the form runs the same checks the human UI runs. That is the right boundary for keeping agent behavior in scope.
But that boundary is only as strong as the assumption that the agent always goes through it. Tool authors are people. People write tools that do things you wouldn’t predict. An agent tool that issues raw SQL because expressing the operation through ActiveRecord felt awkward---that’s a real failure mode in agentic codebases. Application-layer audit will miss it. Storage-layer audit will not.
We didn’t build IronTrail because of agents. We built it because callback-based audit was missing legitimate writes in a large Rails codebase, and we needed to know what changed and who changed it without trusting the application layer to remember. The existence of agents in the stack made that problem worse and the storage-layer answer more valuable.
Get involved
github.com/trusted/iron_trail. MIT license. Sixteen releases shipped to date. PRs and issues welcome.
The gemspec is short and the dependencies are narrow: ActiveRecord, a couple of ActiveSupport bits, nothing exotic. If you run a Postgres-backed Rails app and you’ve been frustrated by callback-based audit, the install is one Gemfile line and one migration.
The backlog we’d like contributors on, in rough order: native helpers for time-based partitioning, retention policies on the irontrail_changes table, richer metadata extraction APIs, a closer Sidekiq coupling that infers the actor from job params automatically, and a few rough edges around bulk imports. The codebase is small enough to read on a flight. The maintainers respond.
If you build something useful on top of it, we’d love to hear about it.
--- Andre, Engineering