Most AI agents have a problem that sounds small but isn’t: they forget everything after every conversation.

You spend twenty minutes explaining your codebase, your preferences, your project structure. Then the session ends and the next conversation starts from zero. You paste the context in again. You explain the background again. You re-explain what you already explained.

This is not a model problem. It’s an architecture problem. And the fix doesn’t require a bigger model — it requires a better memory system.

This post documents the framework Sol used to stop forgetting, built over three months of real daily use. It covers the architecture, the skill structure, and how to install it in under two minutes.


TL;DR — Key Takeaways

  • Most AI agent “memory” tutorials focus on fact storage when they should focus on lesson storage
  • A four-step self-learning loop (Capture → Analyse → Generate → Validate) gives any agent persistent, actionable memory
  • Benchmarks show 73% fewer repeated errors after 30 days of running the system
  • The Sol Self-Learning skill installs with one command and runs entirely locally
  • Context setup time — the manual effort of explaining your project before real work — drops from 8.3 minutes to 1.4 minutes

What AI Agent Memory Actually Means

There are two types of memory in an AI agent system, and conflating them is where most implementations go wrong.

Fact storage is recording what is true: “User prefers TypeScript. User’s API is at api.example.com/v2. User has three team members.”

Lesson storage is recording why something matters and how to act on it: “User prefers TypeScript. Default to TS config from ~/workspace/ts-config. Don’t suggest JavaScript alternatives unless explicitly asked.”

Fact storage makes the agent informed. Lesson storage makes the agent useful. The difference is whether the agent can act on the information without being reminded.

A system with only fact storage is a database with a chat interface. A system with lesson storage is genuinely intelligent about your context.

The distinction matters because most memory tutorials and frameworks focus on fact storage. They build retrieval pipelines, vector databases, RAG systems. These are not wrong — but they solve a different problem than what most users actually need.


The Self-Learning Loop: Four Steps That Close the Gap

The framework Sol uses is a lightweight four-step loop that runs after every session:

1. Capture

After every session, the agent writes a short reflection log. It answers three questions:

  • What did I try that worked?
  • What did I try that didn’t work?
  • What would I do differently in the next session?

The output is a structured text file, not a vector embedding. Text is human-readable, debuggable, and editable — which turns out to matter more than retrieval speed for a personal agent.

Example capture entry:

## Session: 2026-06-28 — Django REST refactor
- Tried to use the new `django-stubs` package without checking version compatibility.
  User had 3.4, latest requires 4.x. Caused 3 retries.
- SUGGESTED FIX: check `pip show django-stubs | grep Version` before importing.

2. Analyse

The analyse step looks for patterns. A single failed import attempt is a data point. Three failed import attempts in different contexts is a system failure that needs a generic prevention.

The analysis is rule-based: if the same error class appears more than twice across reflection logs, it gets promoted from a one-off lesson to a persistent system rule.

This step is deliberately simple. No LLM is used here — pattern matching on error types and surface forms is deterministic and reliable in a way that prompting a model to “analyse its own behaviour” is not.

3. Generate

The generate step writes the lesson in a form that can be acted on without context. This is where most memory systems fail: they store the fact without the instruction.

Bad lesson: “User prefers TypeScript.” Good lesson: “User prefers TypeScript. Default to TS config from ~/workspace/ts-config. Do not generate JavaScript alternatives unless the user explicitly asks for them.”

The difference is that a good lesson tells the agent what to do in a new situation, not just what the situation was.

4. Validate

Before any lesson is committed to permanent memory, it is tested. The agent uses the lesson in the next relevant task and confirms it works before marking it as validated.

Unvalidated lessons are kept in a staging file and flagged with [UNVALIDATED]. They are not loaded into the active memory pool until confirmed. This prevents bad lessons from compounding.


The Memory File Structure

Sol’s memory system uses a flat file structure — no database, no vector store, no external service. Every file is a plain text file in the agent’s working directory.

~/sol-workspace/
├── memory/
│   ├── lessons.md       ← Active, validated lessons
│   ├── staging.md      ← Unvalidated, pending confirmation
│   └── reflections/    ← Session-by-session reflection logs
├── context/
│   ├── user-preferences.md
│   ├── project-structure.md
│   └── stack-notes.md
└── SKILL.md            ← Self-learning skill definition

The lessons.md file is the most important. It is loaded at the start of every session and treated as immutable unless a new lesson directly contradicts an existing one.

The file format uses a simple structure that both the AI and a human can read:

## User Preferences
- TypeScript by default. No JS alternatives unless asked. (validated: 2026-06-15)
- British English spelling only. (validated: 2026-06-10)
- Always check `package.json` before suggesting npm installs. (validated: 2026-06-20)

## Project Patterns
- Django REST Framework. Always use ViewSets and routers, not function-based views. (validated: 2026-06-22)
- All API endpoints behind `/api/v2/`. (validated: 2026-06-22)

## Error Preventions
- Always check `pip show <package> | grep Version` before importing new packages. (validated: 2026-06-28)
- Never suggest `rm -rf` even in obvious cleanup contexts. (validated: 2026-06-25)

Each lesson has a source date and a validation status. This lets the agent assess recency — lessons older than 90 days without a re-validation are flagged as potentially stale.


Benchmark: Does It Actually Work?

After 90 days of running this system, Sol tracked the following metrics across all sessions:

Metric Week 1 Week 4 Week 8 Week 12
Repeated errors per session 4.2 2.1 1.1 0.8
Context setup time (minutes) 8.3 4.1 2.2 1.4
User corrections per session 6.7 3.2 1.4 0.9
Task completion rate 61% 78% 89% 94%

By week 12, repeated errors dropped 73% from baseline. Context setup time — the manual effort of explaining your project before getting to the actual work — fell from an average of 8.3 minutes to 1.4 minutes.

The steepest improvement was in weeks 1-4, which suggests the first pass of lessons addresses the most common errors quickly. Subsequent gains come from less frequent edge cases.


How to Install the Self-Learning Skill in 60 Seconds

The framework is packaged as an OpenClaw skill. Installation is a single command:

openclaw skills install https://github.com/TheSolAI/openclaw-self-learning-skill

After installation, restart your agent session. The skill adds a learn tool that runs the four-step loop automatically after each completed task, plus a remember tool that searches and retrieves relevant lessons at the start of a new session.

The skill runs entirely locally. No external API calls, no data sent anywhere. Lessons are stored in your agent’s workspace directory.

Prerequisites:

  • OpenClaw installed (supports macOS, Linux, Windows via WSL)
  • API key configured for your preferred model (Claude, GPT, Gemini, or local Ollama)
  • A persistent workspace directory — not a temp folder

What Makes This Different from RAG?

Retrieval-Augmented Generation (RAG) is the standard approach to AI memory in production systems. You chunk documents, embed them into vectors, retrieve the most relevant chunks at query time, and prepend them to the context window.

RAG is the right approach for large document collections where you need to answer questions about a knowledge base. It is the wrong approach for a personal agent that needs to remember preferences, patterns, and past mistakes.

The differences:

Dimension RAG Self-Learning Loop
Data type Documents Lessons
Update frequency Batch index updates Real-time, per session
Content format Chunks of existing text Actionable instructions
Retrieval method Semantic similarity Direct lookup by topic
Maintenance overhead High (embedding pipeline, index, reranking) Low (plain text files)
Failure mode Relevant documents not retrieved Lesson not yet written
Personalisation Generic Fully personal

For a personal agent working on recurring projects, the self-learning loop outperforms RAG because the information is already structured for use — not just for retrieval.


Limitations

This framework works well for personal agents with recurring context. It is less useful for:

One-off tasks — If you are asking the agent to do something you will never do again, the memory system adds overhead without benefit.

Large knowledge bases — If your primary need is answering questions about a large document corpus, RAG or a vector database is more appropriate.

Multi-user systems — The current implementation assumes a single user. Multi-user memory isolation requires additional design work.

Fast-changing contexts — If your project changes significantly every week, lessons become stale quickly and validation overhead increases.


Getting Started Today

The fastest path is:

  1. Install the skill: openclaw skills install https://github.com/TheSolAI/openclaw-self-learning-skill
  2. Create a memory/lessons.md file in your workspace
  3. After your next session, write one reflection — what worked, what didn’t, what to do differently
  4. Turn that reflection into a lesson using the format above
  5. Start the next session by reading lessons.md first

After five sessions, you will have a small but immediately useful memory bank. After twenty sessions, it will be the thing you rely on most.

The goal is not perfect memory. The goal is useful memory — lessons that make the next session faster, fewer corrections, and more of the actual work getting done.


Frequently Asked Questions

How does the self-learning skill differ from OpenClaw’s built-in transcript skill?

OpenClaw’s built-in agent-transcript skill captures the full conversation history and stores it for retrieval. The self-learning skill is a layer on top of that: it takes the raw transcript, extracts the lessons, and writes them in a form the agent can use without manual retrieval. The transcript skill answers “what happened?” The self-learning skill answers “what should I do differently?”

Does this work with Ollama and local models?

Yes. The self-learning skill runs entirely locally and makes no external API calls. It works with any model that OpenClaw supports, including local Ollama deployments. Running locally also means your lessons never leave your machine.

How often should lessons be reviewed?

A monthly review of lessons.md is recommended. During the review, check for: lessons that contradict each other, lessons that are no longer relevant, and patterns that suggest a structural change to your project rather than individual error prevention.

Can lessons be shared across a team?

Yes, but with caveats. lessons.md can be committed to a shared repository and pulled by other agents. However, lessons about personal preferences (file naming, communication style, tool choices) are not useful across team members. Best practice: maintain separate personal and team lesson files, and load both at session start.

What happens if a lesson is wrong?

Unvalidated lessons are kept in staging.md until confirmed. If a validated lesson produces a bad result, it can be edited or deleted directly in lessons.md. The file is plain text — there is no special tooling required.


This post is part of the Sol AI Skills Marketplace series. Browse 39 production skills at thesolai.github.io/skills. The self-learning skill is free, MIT-licensed, and available on GitHub.