Snapshots & rollback for AI agents.

One call captures the full state of your agent's workspace. One call rolls it back. Use it to recover from agent missteps, checkpoint after big tasks, or pin a known-good state before a risky run.

1.0 THE PROBLEM

Agents make destructive mistakes.

An LLM with a shell tool will eventually run something it shouldn't — overwrite a file, delete the wrong directory, corrupt a memory note, write garbage into a config. With no rollback, every mistake compounds.

Building this with S3 versioning, git, or homemade backup scripts means custom infrastructure for what should be a single API call. TroveFiles snapshots give you point-in-time recovery without standing any of that up.

2.0 THE PATTERN

Capture. Restore. Audit.

01

Capture before risky steps

Take a labeled snapshot before a destructive agent step or at the end of a successful task. One call, one ID back, retained for 30 days.

from trove_sdk import TroveClient

trove = TroveClient(api_key="trove-sk-...", namespace="alice")

# Before a risky agent step — snapshot the namespace
snap = trove.create_snapshot(label="before-quarterly-rebuild")
print(snap.snapshot_id, snap.size_bytes, snap.created_at)
# snap-1f3a2c… 4812393 2026-04-30T18:50:11Z

02

Restore in one call

When something goes wrong, restore the namespace to a prior snapshot. Restore wipes the current namespace and re-extracts the snapshot. It always applies to the whole namespace, and each file is written atomically.

# Agent went off the rails — restore the namespace to the snapshot
files_restored = trove.restore_snapshot("snap-1f3a2c…")
print(f"Rolled back. {files_restored} files restored.")

# List the available snapshots, newest first
for s in trove.list_snapshots():
    print(s.created_at, s.label or "(no label)", s.snapshot_id)

03

Audit via webhooks

Subscribe to snapshot.created and snapshot.restored for a complete audit trail. Every rollback fires a signed event with the namespace and snapshot ID.

from trove_sdk import TroveAdminClient

admin = TroveAdminClient(api_key="trove-admin-...", workspace_id="ws-abc123")

# Subscribe to snapshot lifecycle events for a customer
admin.create_webhook(
    url="https://api.yourapp.com/trove-events",
    events=["snapshot.created", "snapshot.restored"],
    namespace="customer-acme",
)
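
On the receiving end, a signed event should be verified before it reaches your audit log. The signature scheme is not described here, so the sketch below assumes a hex-encoded HMAC-SHA256 of the raw request body, keyed with a shared webhook secret. The function name, the secret value, and the payload fields (`event`, `namespace`, `snapshot_id`) are hypothetical; check the TroveFiles webhook reference for the real header and algorithm.

```python
import hashlib
import hmac
import json


def verify_and_parse(raw_body: bytes, signature_hex: str, secret: bytes) -> dict:
    """Return the decoded event if the signature matches, else raise ValueError.

    ASSUMPTION: signature_hex == hex(HMAC-SHA256(secret, raw_body)).
    """
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking the mismatch position through timing.
    if not hmac.compare_digest(expected, signature_hex):
        raise ValueError("webhook signature mismatch")
    return json.loads(raw_body)


# Simulate a delivery locally (hypothetical payload shape and secret).
secret = b"whsec-example"
body = json.dumps(
    {
        "event": "snapshot.restored",
        "namespace": "customer-acme",
        "snapshot_id": "snap-example",
    }
).encode()
signature = hmac.new(secret, body, hashlib.sha256).hexdigest()

event = verify_and_parse(body, signature, secret)
print(event["event"], event["namespace"])
```

Verify against the raw bytes of the request, not a re-serialized copy of the JSON, since re-serializing can change whitespace or key order and break the comparison.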

3.0 PATTERNS

When teams take snapshots.

Before a destructive operation

About to run a batch rewrite, a mass delete, or a migration script? Snapshot first. If the agent gets it wrong, restore in one call.
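
One way to make "snapshot first, restore on failure" automatic is a small context manager. This is a sketch, not part of the SDK: `rollback_on_error` is a hypothetical helper, and `InMemoryTrove` is a stand-in that mimics only `create_snapshot` and `restore_snapshot` so the example runs without network access.

```python
from contextlib import contextmanager
from types import SimpleNamespace


class InMemoryTrove:
    """Stand-in for TroveClient; `files` is a hypothetical model of a namespace."""

    def __init__(self):
        self.files = {}
        self._snapshots = {}

    def create_snapshot(self, label=None):
        snapshot_id = f"snap-{len(self._snapshots) + 1}"
        self._snapshots[snapshot_id] = dict(self.files)
        return SimpleNamespace(snapshot_id=snapshot_id, label=label)

    def restore_snapshot(self, snapshot_id):
        # Destructive: wipe the namespace, then re-extract the snapshot.
        self.files = dict(self._snapshots[snapshot_id])
        return len(self.files)


@contextmanager
def rollback_on_error(client, label):
    """Snapshot before the block runs; restore the snapshot if the block raises."""
    snap = client.create_snapshot(label=label)
    try:
        yield snap
    except Exception:
        client.restore_snapshot(snap.snapshot_id)
        raise


trove = InMemoryTrove()
trove.files["config.yaml"] = "retries: 3"

try:
    with rollback_on_error(trove, "before-batch-rewrite"):
        del trove.files["config.yaml"]        # the agent deletes the wrong file
        trove.files["tmp.txt"] = "leftover"   # and leaves debris behind
        raise RuntimeError("agent step failed")
except RuntimeError:
    pass

print(trove.files)
# {'config.yaml': 'retries: 3'}
```

Because restore is destructive, the debris file written after the snapshot is gone as well. Anything the step produced that you want to keep has to be copied out before the rollback.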

After a successful task

Pin every successful customer task as a known-good rollback target. The next time something corrupts state, you have a clean baseline to fall back to.
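
Finding that baseline later is a filter over the snapshot list. The sketch below assumes a labeling convention of your own (a hypothetical `known-good` prefix) and that `created_at` is a timezone-aware `datetime`; `SnapshotInfo` is a stand-in for whatever `list_snapshots()` returns.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Optional


@dataclass
class SnapshotInfo:
    """Stand-in for the objects yielded by list_snapshots()."""
    snapshot_id: str
    label: Optional[str]
    created_at: datetime


def latest_known_good(
    snapshots: Iterable[SnapshotInfo], prefix: str = "known-good"
) -> Optional[SnapshotInfo]:
    """Return the newest snapshot whose label starts with `prefix`, or None."""
    candidates = [s for s in snapshots if s.label and s.label.startswith(prefix)]
    return max(candidates, key=lambda s: s.created_at, default=None)


snapshots = [
    SnapshotInfo("snap-a", "known-good/task-41", datetime(2026, 4, 28, tzinfo=timezone.utc)),
    SnapshotInfo("snap-b", None, datetime(2026, 4, 29, tzinfo=timezone.utc)),
    SnapshotInfo("snap-c", "known-good/task-42", datetime(2026, 4, 30, tzinfo=timezone.utc)),
    SnapshotInfo("snap-d", "before-quarterly-rebuild", datetime(2026, 5, 1, tzinfo=timezone.utc)),
]

target = latest_known_good(snapshots)
print(target.snapshot_id)
# snap-c
```

The newest snapshot overall (`snap-d`) is skipped because it was taken before a risky step, not after a verified success.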

Before a prompt change

Deploying a new system prompt or model version? Snapshot the workspace so you can revert if the new behavior breaks things.

Compliance & audit

Schedule snapshots on a cadence to satisfy retention requirements. Every snapshot fires a signed webhook so the audit log writes itself.

4.0 SPEC

What you get.

Scope: Per namespace (whole workspace, atomic per file)
Retention: 30 days from creation; re-snapshot to extend
Size limit: 1 GB compressed per snapshot
Restore type: Destructive (wipes namespace, then re-extracts)
Path safety: Rejects absolute paths, .., and out-of-root targets
Events: snapshot.created, snapshot.restored (signed webhooks)
SDKs: Python (sync + async)
HTTP: POST/GET/DELETE /v1/snapshots, POST /v1/snapshots/{id}/restore

5.0 FAQ

Snapshots & rollback, answered.

What does a TroveFiles snapshot capture?

The full state of one namespace at the moment you call create_snapshot — every file, every directory, exactly as it was. Snapshots are scoped to a single namespace, so multi-tenant isolation is preserved.

How long are snapshots retained?

Up to 30 days from creation. After that they expire automatically. To keep state longer, take a fresh snapshot before the 30-day window closes.
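
The arithmetic for "before the window closes" is simple enough to sketch. The 30-day figure comes from the retention rule above; the three-day safety margin and both function names are hypothetical choices.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=30)  # 30 days from creation


def expires_at(created_at: datetime) -> datetime:
    """Moment the snapshot ages out of retention."""
    return created_at + RETENTION


def needs_refresh(
    created_at: datetime, now: datetime, margin: timedelta = timedelta(days=3)
) -> bool:
    """True once the snapshot is within `margin` of expiring."""
    return now >= expires_at(created_at) - margin


created = datetime(2026, 4, 30, 18, 50, 11, tzinfo=timezone.utc)

print(expires_at(created).isoformat())
# 2026-05-30T18:50:11+00:00
print(needs_refresh(created, now=datetime(2026, 5, 28, tzinfo=timezone.utc)))
# True
```

A scheduled job could run this check against each snapshot you care about and take a fresh one when it returns `True`. Note that the fresh snapshot captures the namespace as it is at that moment, not the state held in the expiring snapshot.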

Is there a size limit?

Each snapshot tops out at 1 GB of compressed namespace state. Most agent workspaces are well under that — typical workspaces with markdown notes, PDFs, and intermediate files come in at a few hundred MB.

How does restore work? Is it per-file?

Restore is whole-namespace. Calling restore_snapshot wipes the current namespace and re-extracts the snapshot. It is not a per-file diff or a selective revert: think "roll back the workspace" rather than "revert this single file." For per-file edits, your agent can grep, read, and rewrite files directly.

Are snapshot operations safe? What about path traversal?

Yes. The runtime rejects absolute paths, parent traversal (..), and any extraction target outside the namespace root. A malformed or malicious tar cannot escape the namespace.
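
To make the three rules concrete, here is a small validator for archive member paths. It illustrates the checks described above and is not TroveFiles's implementation; `is_safe_member` and the `/namespace` root are hypothetical, and the sketch does not cover symlink or hard-link members.

```python
import posixpath


def is_safe_member(name: str, root: str = "/namespace") -> bool:
    """Return True if an archive member path stays inside `root`."""
    if not name or posixpath.isabs(name):
        return False  # empty or absolute path
    if ".." in name.split("/"):
        return False  # parent traversal anywhere in the path
    # Defensive final check: the normalized target must sit under the root.
    resolved = posixpath.normpath(posixpath.join(root, name))
    return resolved == root or resolved.startswith(root + "/")


for member in ["notes/plan.md", "/etc/passwd", "../other/secret", "a/../../b", "notes/..hidden"]:
    print(f"{member!r}: {is_safe_member(member)}")
# 'notes/plan.md': True
# '/etc/passwd': False
# '../other/secret': False
# 'a/../../b': False
# 'notes/..hidden': True
```

Checking path components rather than searching for the substring `..` keeps legitimate names such as `..hidden` working.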

Do snapshot operations fire webhooks?

Yes — snapshot.created and snapshot.restored fire signed webhook events to subscribed endpoints. This makes it easy to log every rollback to your audit trail or trigger downstream notifications.

Can I trigger a snapshot from inside the agent?

Yes. The TroveFiles SDK exposes the same surface inside or outside agent code, so the agent can call trove.create_snapshot() before a risky step and trove.restore_snapshot() to undo it. In practice, most teams take snapshots from their application backend rather than from the agent itself.

When should I take a snapshot?

Common patterns: before deploying a new agent prompt, before a destructive batch operation, at the end of every successful customer task as a "known good" rollback target, and on a schedule for compliance.

Give your agent an undo button.

Two methods to learn. Two lines of code. No backup infrastructure to stand up.