A clean-machine restore is the only restore that tells the truth. Restoring Hermes Agent onto the same laptop proves you can copy files around. Restoring it onto a blank machine proves whether the agent can actually come back after theft, disk failure, a bad OS update, or the classic "I wiped the wrong folder" maneuver.
Keepmyclaw currently supports OpenClaw and Hermes backup and restore only. That scope matters. This is not generic Claude Code state-loss advice, and it is not a universal agent backup plan for every tool with a chat box. The target is Hermes Agent and OpenClaw-style local workspaces where the useful state lives in files, memory, skills, cron jobs, scripts, configs, and client-side-encrypted credentials.
The operator thesis is simple: a Hermes backup is not done until a clean machine can run the same useful agent without archaeology. If the restored agent needs two hours of path hunting, missing skill repair, credential guessing, and cron prompt reconstruction, the backup preserved artifacts. It did not preserve operations.
What has to survive
A Hermes Agent workspace is not one folder with vibes in it. The restore needs enough context for the agent to work, schedule work, call tools, remember rules, and avoid repeating old mistakes.
Protect these classes of state:
- workspace files and project repositories
- Hermes memory and state files
- skills and local persona or playbook files
- cron job definitions, prompts, schedules, attached skills, scripts, and delivery targets
- configured scripts under the Hermes scripts directory
- configs, model preferences, provider settings, and toolset choices
- credentials, encrypted client-side
- dependency manifests and install notes
- restore manifests with checksums, versions, paths, and expected counts
The ugly part is that a partial restore can look fine. The agent starts. It can answer. Then the daily revenue job never runs because the cron definition was not restored. Or the Reddit persona file is missing, so a marketing job starts sounding like a webinar host. Tiny state loss, large blast radius. Computers remain petty.
The baseline restore contract
Use numbers. They keep everyone honest.
- Recovery point objective: 15 minutes for active production workspaces
- Recovery time objective: 45 minutes for one Hermes workspace onto a prepared machine
- Full clean-machine drill cadence: monthly for production agents
- Lightweight restore validation cadence: weekly
- Snapshot cadence: every 15 minutes while active, daily when idle
- Daily retention: 14 snapshots
- Weekly retention: 8 snapshots
- Monthly retention: 12 snapshots
- Maximum unverified snapshot age: 30 days
- Credential rebind window: 60 minutes from restore start
- Cron verification: 100 percent of restored jobs listed, one safe read-only job executed
- Skill verification: expected skill count and checksum manifest match
Those defaults are not sacred. Tighten them if Hermes is running customer workflows, paid lead recovery, deployments, or monitoring. Loosen them for experiments. But write the contract down before the laptop dies. Afterward, everyone becomes a philosopher.
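Writing the contract down can be as literal as storing it as data and checking each drill against it. The following is a minimal sketch, not a Keepmyclaw or Hermes feature: the class, field, and function names are hypothetical, and only the numbers come from the list above. It also assumes "maximum unverified snapshot age" means the time since the last successful verification.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class RestoreContract:
    """Defaults copied from the baseline contract; all names are illustrative."""
    rpo: timedelta = timedelta(minutes=15)
    rto: timedelta = timedelta(minutes=45)
    max_unverified_age: timedelta = timedelta(days=30)


def contract_violations(contract, *, last_snapshot, last_verified,
                        restore_started, restore_finished, now):
    """Return human-readable violations; an empty list means the drill passed."""
    problems = []
    if now - last_snapshot > contract.rpo:
        problems.append("RPO exceeded: newest snapshot is too old")
    if restore_finished - restore_started > contract.rto:
        problems.append("RTO exceeded: restore took too long")
    if now - last_verified > contract.max_unverified_age:
        problems.append("verification overdue: last verified restore is too old")
    return problems
```

Tightening the contract for a customer-facing workspace then means constructing `RestoreContract(rpo=timedelta(minutes=5))` rather than editing a wiki page nobody reads.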
The clean-machine restore path
Start with a blank target. Fresh OS user. Fresh Hermes install. No copied home directory hiding the problem.
First, restore the workspace files and Hermes state from the selected snapshot. Use a snapshot with a known timestamp, manifest checksum, Hermes version, hostname, old workspace root, and encryption metadata. If the manifest cannot tell you what it contains, stop trusting it.
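The "stop trusting it" rule is easy to automate. Here is a small sketch that refuses a manifest missing any of the fields named above. The key names are assumptions for illustration; the real manifest format may spell them differently.

```python
# Hypothetical key names for the fields the restore step relies on.
REQUIRED_MANIFEST_FIELDS = (
    "timestamp",
    "manifest_checksum",
    "hermes_version",
    "hostname",
    "old_workspace_root",
    "encryption",
)


def missing_manifest_fields(manifest: dict) -> list:
    """Return required fields that are absent or empty in the manifest."""
    return [key for key in REQUIRED_MANIFEST_FIELDS if not manifest.get(key)]


def require_trustworthy(manifest: dict) -> None:
    """Raise before any file is restored if the manifest cannot describe itself."""
    missing = missing_manifest_fields(manifest)
    if missing:
        raise ValueError("manifest is missing: " + ", ".join(missing))
```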
Second, restore skills, memory, configs, and scripts into the expected paths. Do not hand-place files from memory. That is how restore drills turn into folk music. Use the manifest to compare expected files, actual files, sizes, and checksums.
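Comparing expected and actual files is mechanical work, so it can be scripted. This sketch assumes the manifest provides a mapping from relative path to size and SHA-256 digest; that shape is an assumption for illustration, not a documented format.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files do not load into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_to_manifest(root: Path, expected: dict) -> dict:
    """expected maps relative POSIX path -> {"size": int, "sha256": str}."""
    actual = {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}
    report = {
        "missing": [],
        "mismatched": [],
        "unexpected": sorted(actual - set(expected)),
    }
    for rel, meta in sorted(expected.items()):
        path = root / rel
        if not path.is_file():
            report["missing"].append(rel)
        elif path.stat().st_size != meta["size"] or sha256_of(path) != meta["sha256"]:
            report["mismatched"].append(rel)
    return report
```

A restore passes this step only when all three lists are empty. Unexpected files matter as much as missing ones, because they usually mean someone hand-placed something.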
Third, restore cron definitions. For Hermes, cron state is operational state. Preserve job ID, name, schedule, prompt, attached skills, enabled toolsets, configured script, context dependencies, delivery target, repeat count, and last useful state when it is needed for dedupe. A job that loses its prompt is not restored. It is a title wearing a hat.
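A field-completeness check catches the title-wearing-a-hat case before anything is scheduled. The sketch below assumes restored jobs can be exported as a list of dictionaries. Both that export shape and the key names are hypothetical, so adapt them to whatever Hermes actually stores.

```python
# Fields that must carry a value for the job to mean anything.
MUST_BE_NONEMPTY = ("id", "name", "schedule", "prompt", "delivery_target")
# Fields that may legitimately be empty but must still be recorded.
MUST_BE_PRESENT = ("attached_skills", "enabled_toolsets", "script")


def job_gaps(job: dict) -> list:
    """Return the fields a restored job is missing."""
    gaps = [name for name in MUST_BE_NONEMPTY if not job.get(name)]
    gaps += [name for name in MUST_BE_PRESENT if name not in job]
    return gaps


def incomplete_jobs(jobs: list) -> dict:
    """Map job ID (or list position when the ID is gone) to its missing fields."""
    result = {}
    for index, job in enumerate(jobs):
        gaps = job_gaps(job)
        if gaps:
            result[job.get("id") or f"#{index}"] = gaps
    return result
```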
Fourth, restore encrypted credentials and rebind provider access with safe read-only checks. Do not print secrets into logs. Verify presence, scope, and length only. The goal is to prove the restored agent can use what it needs without turning the backup report into a credential leak.
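Presence, scope, and length can be reported without the secret ever reaching a log line. This is a rough sketch: the function name, the minimum length, and the way scopes are supplied are all assumptions, and how granted scopes are discovered depends on the provider.

```python
def credential_report(name, value, granted_scopes=(), required_scopes=(),
                      min_length=20):
    """Describe a credential without including any part of its value."""
    present = bool(value)
    return {
        "name": name,
        "present": present,
        # Length is a cheap truncation check; it says nothing about validity.
        "length_ok": present and len(value) >= min_length,
        "missing_scopes": sorted(set(required_scopes) - set(granted_scopes)),
    }
```

The returned dictionary is safe to write into a restore report because it carries the credential's name and properties, never its content.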
Fifth, run a validation pass. List restored jobs. Load at least one attached skill. Compile restored scripts when possible. Check that project repos open cleanly. Run safe read-only API checks for providers that matter. Then execute one harmless scheduled job and confirm the output appears in the expected Hermes cron output path.
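For the "compile restored scripts" step, a syntax-only pass is often enough to catch truncated or corrupted files. The sketch assumes the restored scripts are Python files, which may not hold for every workspace. It parses each file with the built-in `compile` and never executes anything.

```python
from pathlib import Path


def syntax_errors(scripts_dir: Path) -> dict:
    """Map relative script path -> error text for files that fail to parse."""
    errors = {}
    for path in sorted(scripts_dir.rglob("*.py")):
        try:
            source = path.read_text(encoding="utf-8")
            compile(source, str(path), "exec")  # parse only; nothing is run
        except (SyntaxError, ValueError) as exc:
            # ValueError also covers undecodable bytes (UnicodeDecodeError).
            errors[path.relative_to(scripts_dir).as_posix()] = str(exc)
    return errors
```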
Failure scenarios the drill has to catch
The workspace restores but memory points to the old machine
The files are present, but memory references paths from the dead laptop. The agent tries to use an old project root, an old scripts path, or an old credential location.
Mitigation: store the old workspace path and the target path in the restore manifest. During clean-machine validation, scan memory, cron prompts, and scripts for old absolute paths. Rewrite only known-safe path references, then log the changed files.
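The scan half of that mitigation might look like the sketch below. It only reports where the old root appears and leaves the rewrite as a separate, deliberate step. The function name is hypothetical, and the old root is assumed to come from the manifest.

```python
from pathlib import Path


def find_old_path_references(root: Path, old_root: str) -> dict:
    """Map relative file path -> line numbers that mention the old root."""
    hits = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # binary or unreadable files are skipped, not rewritten
        lines = [number for number, line in enumerate(text.splitlines(), 1)
                 if old_root in line]
        if lines:
            hits[path.relative_to(root).as_posix()] = lines
    return hits
```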
Cron jobs exist but do nothing useful
The scheduler shows the jobs, which feels comforting for about five seconds. Then you notice the script path is wrong, the attached skill is missing, or the delivery target did not survive.
Mitigation: verify cron jobs by behavior, not just count. Every production job needs schedule, prompt, attached skills, enabled toolsets, script path, delivery target, and one safe dry run or read-only run. If a public posting job exists, do not live-test posting during the drill. Test the non-public preflight path instead.
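Verifying by behavior starts with checking that everything a job points at actually exists on the new machine. The sketch below assumes jobs are dictionaries, scripts are files under one directory, skills are entries under another, and a hypothetical `public` flag marks jobs that post publicly. None of that is confirmed Hermes layout.

```python
from pathlib import Path


def preflight_job(job: dict, scripts_dir: Path, skills_dir: Path) -> dict:
    """Check a job's references and decide how it may be exercised in a drill."""
    problems = []
    script = job.get("script")
    if script and not (scripts_dir / script).is_file():
        problems.append(f"script not found: {script}")
    for skill in job.get("attached_skills", []):
        if not (skills_dir / skill).exists():
            problems.append(f"skill not found: {skill}")
    # Public posting jobs are never run live during a drill.
    action = "preflight_only" if job.get("public") else "dry_run"
    return {"job": job.get("name"), "problems": problems, "action": action}
```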
Credentials decrypt but fail provider checks
A restored token can be present and still useless. It may be expired. It may be tied to an old machine. It may need an environment variable that was never captured.
Mitigation: define read-only provider checks in the manifest. For GitHub, list the authenticated user or repo access. For Cloudflare-style services, use read-only account or database checks when already authorized. For Reddit, check the current account endpoint if that account is part of the restored workflow. Never mutate auth during a cron-driven drill unless that boundary was explicitly approved.
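If provider checks live in the manifest, the drill can refuse anything that is not read-only before a single request is sent. In this sketch the check format is an assumption. The example URL is GitHub's endpoint for fetching the authenticated user; other providers would need their own read-only endpoints filled in.

```python
READ_ONLY_METHODS = {"GET", "HEAD"}


def unsafe_provider_checks(checks: list) -> list:
    """Return names of checks whose HTTP method could mutate provider state."""
    return [
        check.get("name", "<unnamed>")
        for check in checks
        # A missing method is treated as unsafe rather than guessed.
        if str(check.get("method", "")).upper() not in READ_ONLY_METHODS
    ]


# Hypothetical manifest entries for illustration.
EXAMPLE_CHECKS = [
    {"name": "github_user", "method": "GET", "url": "https://api.github.com/user"},
]
```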
Skills restore with the wrong version
The agent has a skill named correctly, but the content is stale. That is worse than missing. Missing fails loudly. Stale rules make the agent confidently wrong.
Mitigation: checksum skill files and persona files. The restore should compare expected checksums, file counts, and last-modified dates. For marketing or product skills, also check canonical product claims. Keepmyclaw should still say OpenClaw and Hermes only unless the product scope has actually changed.
The repo comes back dirty
A restored project repo may contain uncommitted changes from the moment of failure. Sometimes those changes are real work. Sometimes they are half-applied conflict sludge.
Mitigation: record git branch, HEAD commit, remote URL, status summary, and untracked files at backup time. After restore, run a source-only conflict marker scan and a diff check before letting the agent build, deploy, or publish anything. Dirty state is not automatically bad. Unknown dirty state is.
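Both halves of that mitigation can be scripted. This sketch captures git state with ordinary git commands and scans source files for leftover conflict markers. It assumes git is on the PATH and the remote is named `origin`. The list of source suffixes is an arbitrary starting point.

```python
import subprocess
from pathlib import Path

SOURCE_SUFFIXES = {".py", ".js", ".ts", ".json", ".md", ".sh", ".yaml", ".yml"}


def git_state(repo: Path) -> dict:
    """Record branch, HEAD, remote URL, and porcelain status at backup time."""
    def git(*args: str) -> str:
        done = subprocess.run(["git", "-C", str(repo), *args],
                              capture_output=True, text=True, check=True)
        return done.stdout
    return {
        "branch": git("rev-parse", "--abbrev-ref", "HEAD").strip(),
        "head": git("rev-parse", "HEAD").strip(),
        "remote": git("remote", "get-url", "origin").strip(),
        # Porcelain lines include untracked files, marked with "??".
        "status": git("status", "--porcelain").splitlines(),
    }


def conflict_markers(repo: Path) -> list:
    """Return (relative path, line number) for each conflict start or end marker."""
    hits = []
    for path in sorted(repo.rglob("*")):
        rel = path.relative_to(repo)
        if ".git" in rel.parts or path.suffix not in SOURCE_SUFFIXES:
            continue
        if not path.is_file():
            continue
        try:
            lines = path.read_text(encoding="utf-8").splitlines()
        except (UnicodeDecodeError, OSError):
            continue
        for number, line in enumerate(lines, 1):
            if line.startswith(("<<<<<<< ", ">>>>>>> ")):
                hits.append((rel.as_posix(), number))
    return hits
```

A bare `=======` line is deliberately not flagged, since it also appears as a legitimate underline in some text formats. The start and end markers are enough to find a conflicted file.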
What Keepmyclaw should verify after restore
A useful restore report should be boring and specific.
It should say the snapshot ID, workspace root, Hermes version, file count, skill count, cron job count, restored script count, config count, encrypted credential bundle count, checksum result, and validation time. It should list skipped public actions. It should name any credentials that require manual rebind without exposing their value.
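A report like that is easier to keep boring if it has a fixed shape. The sketch below is one hypothetical structure, not Keepmyclaw's actual report format. The field names are illustrative, and credentials appear by name only.

```python
from dataclasses import dataclass, field


@dataclass
class RestoreReport:
    snapshot_id: str
    workspace_root: str
    hermes_version: str
    counts: dict            # e.g. files, skills, cron_jobs, scripts, configs
    checksums_ok: bool
    validation_seconds: float
    skipped_public_actions: list = field(default_factory=list)
    manual_rebind: list = field(default_factory=list)  # names, never values

    def lines(self) -> list:
        """Render the report as plain lines suitable for a log or a message."""
        out = [
            f"snapshot: {self.snapshot_id}",
            f"workspace root: {self.workspace_root}",
            f"hermes version: {self.hermes_version}",
        ]
        out += [f"{name}: {count}" for name, count in sorted(self.counts.items())]
        out.append(f"checksums: {'ok' if self.checksums_ok else 'FAILED'}")
        out.append(f"validation time: {self.validation_seconds:.0f}s")
        out += [f"skipped public action: {a}" for a in self.skipped_public_actions]
        out += [f"manual rebind needed: {n}" for n in self.manual_rebind]
        return out
```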
For OpenClaw and Hermes operators, the most important proof is the clean-machine drill. Restore onto a new environment. Start Hermes. Load the memory. List skills. List cron jobs. Run a safe job. Open the project repo. Confirm the agent can do useful work without you reconstructing its life story from shell history.
The rule
Do not wait for a disaster to learn whether your Hermes backup includes the boring parts. Boring parts are where the agent lives.
If Keepmyclaw backs up the workspace but not the cron prompts, you lose operations. If it backs up files but not encrypted credentials, you lose tool access. If it backs up memory but not skills, you lose behavior. If it backs up everything but never runs a clean-machine drill, you lose certainty.
A clean-machine restore turns the question from "do we have backups?" into "can this agent work again before lunch?" That is the question that matters.