Snapshots & rollback for AI agents.
One call captures the full state of your agent's workspace. One call rolls it back. Use it to recover from agent missteps, checkpoint after big tasks, or pin a known-good state before a risky run.
Agents make destructive mistakes.
An LLM with a shell tool will eventually run something it shouldn't — overwrite a file, delete the wrong directory, corrupt a memory note, write garbage into a config. With no rollback, every mistake compounds.
Building this with S3 versioning, git, or homemade backup scripts means custom infrastructure for what should be a single API call. TroveFiles snapshots give you point-in-time recovery without standing any of that up.
Capture. Restore. Audit.
Capture before risky steps
Take a labeled snapshot before a destructive agent step or at the end of a successful task. One call, one ID back, retained for 30 days.
from trove_sdk import TroveClient
trove = TroveClient(api_key="trove-sk-...", namespace="alice")
# Before a risky agent step — snapshot the namespace
snap = trove.create_snapshot(label="before-quarterly-rebuild")
print(snap.snapshot_id, snap.size_bytes, snap.created_at)
# snap-1f3a2c… 4_812_393 2026-04-30T18:50:11Z
Restore in one call
When something goes wrong, restore the namespace to a prior snapshot. Restore covers the whole namespace: the current state is wiped and the snapshot is re-extracted, atomically per file.
# Agent went off the rails — restore the namespace to the snapshot
files_restored = trove.restore_snapshot("snap-1f3a2c…")
print(f"Rolled back. {files_restored} files restored.")
# List the available snapshots, newest first
for s in trove.list_snapshots():
    print(s.created_at, s.label or "(no label)", s.snapshot_id)
Audit via webhooks
Subscribe to snapshot.created and snapshot.restored for a complete audit trail. Every rollback fires a signed event with the namespace and snapshot ID.
from trove_sdk import TroveAdminClient
admin = TroveAdminClient(api_key="trove-admin-...", workspace_id="ws-abc123")
# Subscribe to snapshot lifecycle events for a customer
admin.create_webhook(
    url="https://api.yourapp.com/trove-events",
    events=["snapshot.created", "snapshot.restored"],
    namespace="customer-acme",
)
When teams take snapshots.
Before a destructive operation
About to run a batch rewrite, a mass delete, or a migration script? Snapshot first. If the agent gets it wrong, restore in one call.
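The snapshot-first pattern can be wrapped in a small context manager. The following is a minimal sketch, not part of the SDK: `snapshot_guard` is a hypothetical helper, and `FakeClient` is an in-memory stand-in so the example runs without credentials. It assumes only the `create_snapshot(label=...)` and `restore_snapshot(snapshot_id)` calls shown earlier.

```python
from contextlib import contextmanager


@contextmanager
def snapshot_guard(client, label):
    """Snapshot before the block runs; restore if the block raises."""
    snap = client.create_snapshot(label=label)
    try:
        yield snap
    except Exception:
        # Destructive: wipes the namespace and re-extracts the snapshot.
        client.restore_snapshot(snap.snapshot_id)
        raise


# In-memory stand-in for TroveClient, for demonstration only.
class FakeSnapshot:
    def __init__(self, snapshot_id, label):
        self.snapshot_id = snapshot_id
        self.label = label


class FakeClient:
    def __init__(self):
        self.files = {"notes.md": "v1"}
        self._snaps = {}

    def create_snapshot(self, label=None):
        snap_id = f"snap-{len(self._snaps) + 1}"
        self._snaps[snap_id] = dict(self.files)  # copy of the current state
        return FakeSnapshot(snap_id, label)

    def restore_snapshot(self, snapshot_id):
        self.files = dict(self._snaps[snapshot_id])
        return len(self.files)


client = FakeClient()
try:
    with snapshot_guard(client, "before-batch-rewrite"):
        client.files["notes.md"] = "garbage"  # the agent's bad write
        raise RuntimeError("agent step failed")
except RuntimeError:
    pass

print(client.files["notes.md"])  # prints "v1": the bad write was rolled back
```

With a real client, the same `with snapshot_guard(trove, "before-batch-rewrite"):` block would wrap the agent step. The guard re-raises after restoring, so the caller still sees the failure.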
After a successful task
Pin every successful customer task as a known-good rollback target. The next time something corrupts state, you have a clean baseline to fall back to.
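One way to find that baseline later is to label known-good snapshots with a common prefix and pick the newest match. This is a sketch under assumptions: the `known-good` prefix and the `latest_known_good` helper are hypothetical, the snapshot IDs are made up, and the code relies on `list_snapshots()` returning snapshots newest first, as the listing example above states.

```python
from types import SimpleNamespace


def latest_known_good(snapshots, prefix="known-good"):
    """Return the first snapshot whose label starts with `prefix`, else None.

    Assumes `snapshots` is ordered newest first.
    """
    for snap in snapshots:
        if (snap.label or "").startswith(prefix):
            return snap
    return None


# Stand-in for the result of trove.list_snapshots(), newest first.
snaps = [
    SimpleNamespace(snapshot_id="snap-c", label=None),
    SimpleNamespace(snapshot_id="snap-b", label="known-good/task-42"),
    SimpleNamespace(snapshot_id="snap-a", label="known-good/task-41"),
]

target = latest_known_good(snaps)
print(target.snapshot_id)  # prints "snap-b"
```

Against a live namespace, the result would feed straight into `trove.restore_snapshot(target.snapshot_id)`.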
Before a prompt change
Deploying a new system prompt or model version? Snapshot the workspace so you can revert if the new behavior breaks things.
Compliance & audit
Schedule snapshots on a cadence to satisfy retention requirements. Every snapshot fires a signed webhook so the audit log writes itself.
What you get.
| Scope | Per namespace (whole workspace, atomic per file) |
| Retention | 30 days from creation; re-snapshot to extend |
| Size limit | 1 GB compressed per snapshot |
| Restore type | Destructive — wipes namespace, then re-extracts |
| Path safety | Rejects absolute paths, .., and out-of-root targets |
| Events | snapshot.created, snapshot.restored (signed webhooks) |
| SDKs | Python (sync + async) |
| HTTP | POST/GET/DELETE /v1/snapshots, POST /v1/snapshots/{id}/restore |
Snapshots & rollback, answered.
What does a TroveFiles snapshot capture?
The full state of one namespace at the moment you call create_snapshot — every file, every directory, exactly as it was. Snapshots are scoped to a single namespace, so multi-tenant isolation is preserved.
How long are snapshots retained?
Up to 30 days from creation. After that they expire automatically. To keep state longer, take a fresh snapshot before the 30-day window closes.
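The date arithmetic behind "re-snapshot before the window closes" can be sketched as follows. The helper names and the three-day safety margin are hypothetical, and the code assumes `created_at` is an ISO-8601 UTC string like the one printed in the capture example; if the SDK returns a `datetime` instead, the parsing step can be dropped.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=30)  # retention window stated above


def parse_created_at(value):
    """Parse an ISO-8601 UTC timestamp such as '2026-04-30T18:50:11Z'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def needs_refresh(created_at, now, margin=timedelta(days=3)):
    """True once the snapshot is within `margin` of its 30-day expiry."""
    expires_at = parse_created_at(created_at) + RETENTION
    return now >= expires_at - margin


now = datetime(2026, 5, 28, 12, 0, tzinfo=timezone.utc)
print(needs_refresh("2026-04-30T18:50:11Z", now))  # True: expires 2026-05-30
print(needs_refresh("2026-05-20T09:00:00Z", now))  # False: expires 2026-06-19
```

A scheduled job could run this check against the newest snapshot and call `trove.create_snapshot(...)` when it returns `True`.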
Is there a size limit?
Each snapshot tops out at 1 GB of compressed namespace state. Most agent workspaces are well under that — typical workspaces with markdown notes, PDFs, and intermediate files come in at a few hundred MB.
How does restore work? Is it per-file?
Restore is whole-namespace. Calling restore_snapshot wipes the current namespace and re-extracts the snapshot. It is not a per-file diff or selective revert: think "roll back the workspace" rather than "revert this single file." For per-file edits, your agent can grep, read, and rewrite files directly.
Are snapshot operations safe? What about path traversal?
Yes. The runtime rejects absolute paths, parent traversal (..), and any extraction target outside the namespace root. A malformed or malicious tar cannot escape the namespace.
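To illustrate what those three checks look like, here is a generic sketch of archive-member validation. It is not the TroveFiles implementation, which is not shown in this document; `resolve_inside` and the `/data/alice` root are hypothetical, and the sketch does not cover symlinks.

```python
import posixpath


def resolve_inside(root, member):
    """Return the extraction target for `member`, or raise ValueError."""
    root = posixpath.normpath(root)
    if posixpath.isabs(member):
        raise ValueError("absolute path")
    if ".." in member.split("/"):
        raise ValueError("parent traversal")
    target = posixpath.normpath(posixpath.join(root, member))
    # Defence in depth: the resolved target must still sit under the root.
    if target != root and not target.startswith(root + "/"):
        raise ValueError("target outside root")
    return target


for name in ["notes/a.md", "/etc/passwd", "../bob/secrets.txt", "a/../../escape"]:
    try:
        print(name, "->", resolve_inside("/data/alice", name))
    except ValueError as exc:
        print(name, "-> rejected:", exc)
```

Only the first name resolves; the other three are rejected before anything would be written.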
Do snapshot operations fire webhooks?
Yes — snapshot.created and snapshot.restored fire signed webhook events to subscribed endpoints. This makes it easy to log every rollback to your audit trail or trigger downstream notifications.
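On the receiving end, a handler would typically verify the signature before writing to the audit log. This document does not describe the signing scheme or the payload layout, so the following is a sketch under loud assumptions: HMAC-SHA256 over the raw request body with a shared secret (a common webhook convention, not confirmed for TroveFiles) and hypothetical field names `type`, `namespace`, and `snapshot_id`.

```python
import hashlib
import hmac
import json


def verify_signature(secret: bytes, raw_body: bytes, signature_hex: str) -> bool:
    """Check an assumed HMAC-SHA256 hex signature over the raw body."""
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)  # constant-time compare


def audit_line(raw_body: bytes) -> str:
    """Format one audit-log line from an event body (field names assumed)."""
    event = json.loads(raw_body)
    return f'{event["type"]} namespace={event["namespace"]} snapshot={event["snapshot_id"]}'


secret = b"whsec-example"  # hypothetical shared secret
body = json.dumps({
    "type": "snapshot.restored",
    "namespace": "customer-acme",
    "snapshot_id": "snap-example",
}).encode()
signature = hmac.new(secret, body, hashlib.sha256).hexdigest()

print(verify_signature(secret, body, signature))         # True
print(verify_signature(secret, body + b" ", signature))  # False: body was altered
print(audit_line(body))
```

Verifying against the raw bytes, before any JSON parsing, keeps the check independent of how the payload is re-serialized.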
Can I trigger a snapshot from inside the agent?
Yes. The TroveFiles SDK exposes the same calls inside and outside agent code, so the agent can call trove.create_snapshot() before a risky step and trove.restore_snapshot() to undo it. In practice, most teams take snapshots from their application backend rather than from the agent itself.
When should I take a snapshot?
Common patterns: before deploying a new agent prompt, before a destructive batch operation, at the end of every successful customer task as a "known good" rollback target, and on a schedule for compliance.
Give your agent an undo button.
Two methods to learn. Two lines of code. No backup infrastructure to stand up.