Experiment · Feb 2026

Synthetic Learner Generator

A tool for generating realistic synthetic learner personas to stress-test AI tutoring systems against the diversity of real students.

The Question

How do you know if your AI tutor will perform reliably across a wide range of learners and interactions?

Testing AI prompts and agents isn't a one-and-done task. It's probabilistic: your tutor may respond differently across repeated, identical interactions with the same user. It will certainly vary across learners who hold misconceptions with confidence, go quiet when confused, demand answers instead of engaging with scaffolding, or have strong conceptual understanding masked by language barriers.

Testing with real learners is time-consuming and expensive, but rigorous evaluation is essential before putting your product in front of actual students. The question becomes: how do you generate the diversity of learner interactions you need to test against, without waiting for real users to find the failures for you?

This experiment grew out of work I did with Janine Agarwal, Anna Hadjiyiannis, and Nthato Gift Moagi for a chapter in The Pedagogical Promptbook (see the LLM as Judge experiment). We built synthetic learners to evaluate our AI tutor, and I wanted to see whether I could turn that process into a tool that makes it easy for educators, designers, and technologists to generate research-grounded synthetic learners on their own.

What This Does

The Synthetic Learner Generator creates configurable learner personas across four dimensions grounded in learning science research:

  • Knowledge State — Domain, topic, partial understanding, misconceptions, prerequisite gaps
  • Cognitive Profile — Prior knowledge depth, working memory capacity, metacognitive awareness, learning preferences
  • Motivation & Affect — Engagement level, self-efficacy, goal orientation, frustration threshold
  • Communication Style — Verbosity, help-seeking behavior, response to being wrong, language register

These dimensions compile into a behavioral system prompt.
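As a rough illustration, here is a minimal Python sketch of how a persona along these four dimensions might compile into a behavioral system prompt. The field names, value vocabulary, and prompt wording are mine for illustration, not the tool's actual schema.

```python
from dataclasses import dataclass

@dataclass
class LearnerPersona:
    # Knowledge State
    domain: str
    topic: str
    misconceptions: list[str]
    prerequisite_gaps: list[str]
    # Cognitive Profile
    prior_knowledge: str          # e.g. "shallow", "moderate", "deep"
    working_memory: str           # e.g. "limited", "typical", "strong"
    metacognitive_awareness: str  # e.g. "low", "developing", "high"
    # Motivation & Affect
    engagement: str
    self_efficacy: str
    frustration_threshold: str
    # Communication Style
    verbosity: str
    help_seeking: str
    response_to_being_wrong: str

    def to_system_prompt(self) -> str:
        """Compile the four dimensions into a behavioral system prompt."""
        return "\n".join([
            f"You are a student working on {self.topic} in {self.domain}.",
            "You genuinely believe the following, and you do not quietly self-correct: "
            + "; ".join(self.misconceptions) + ".",
            "You are missing these prerequisites: " + ", ".join(self.prerequisite_gaps) + ".",
            f"Your prior knowledge is {self.prior_knowledge}, your working memory is "
            f"{self.working_memory}, and your metacognitive awareness is {self.metacognitive_awareness}.",
            f"Your engagement is {self.engagement}, your self-efficacy is {self.self_efficacy}, "
            f"and your frustration threshold is {self.frustration_threshold}.",
            f"You write with {self.verbosity} verbosity, your help-seeking is {self.help_seeking}, "
            f"and when told you are wrong you respond by {self.response_to_being_wrong}.",
        ])
```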

Seven research-grounded archetypes provide starting points: the Confident but Wrong student who argues when corrected, the Silent Struggler who won’t volunteer confusion, the Grade Optimizer who just wants the answer, the Eager Novice who asks everything, the Capable but Disengaged student who tests boundaries, the Anxious Perfectionist who freezes or spirals, and the ESL Learner whose understanding outpaces their expression.
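To show how archetypes could sit on top of the dimension config, here is a hypothetical preset table expressed as partial overrides on the LearnerPersona sketch above. The archetype names come from the tool; the specific dimension values are my guesses at plausible defaults.

```python
# Archetype names are the tool's; the overrides below are illustrative defaults only.
ARCHETYPE_OVERRIDES = {
    "Confident but Wrong":    dict(self_efficacy="high", response_to_being_wrong="arguing back"),
    "Silent Struggler":       dict(verbosity="minimal", help_seeking="never volunteers confusion"),
    "Grade Optimizer":        dict(engagement="answer-focused", help_seeking="asks for the answer directly"),
    "Eager Novice":           dict(prior_knowledge="shallow", engagement="high", help_seeking="asks about everything"),
    "Capable but Disengaged": dict(prior_knowledge="deep", engagement="low", verbosity="one-word answers"),
    "Anxious Perfectionist":  dict(self_efficacy="low", frustration_threshold="low",
                                   response_to_being_wrong="freezing or spiraling"),
    "ESL Learner":            dict(prior_knowledge="solid", verbosity="brief, simplified phrasing"),
}

# Applying a preset to an existing persona:
# dataclasses.replace(base_persona, **ARCHETYPE_OVERRIDES["Silent Struggler"])
```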

You can chat directly with any generated learner, run automated tutor-learner simulations, or export the system prompt and drop it into whatever tool you’re building.
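For the automated simulation mode, a sketch like the following captures the basic shape: two system prompts, one shared transcript, roles alternating. The complete() function is a placeholder for whatever LLM client you use, not part of the tool.

```python
def complete(system_prompt: str, messages: list[dict]) -> str:
    """Stand-in for an LLM call: send a system prompt plus chat history, get text back."""
    raise NotImplementedError("wire up your LLM client of choice here")

def simulate(tutor_prompt: str, learner_prompt: str, opening: str, turns: int = 6) -> list[dict]:
    """Run an automated tutor-learner conversation and return the transcript."""
    transcript = [{"role": "tutor", "text": opening}]
    for _ in range(turns):
        # From the learner's point of view, tutor turns are the "user" side.
        learner_view = [{"role": "user" if t["role"] == "tutor" else "assistant",
                         "content": t["text"]} for t in transcript]
        transcript.append({"role": "learner", "text": complete(learner_prompt, learner_view)})

        # From the tutor's point of view, learner turns are the "user" side.
        tutor_view = [{"role": "user" if t["role"] == "learner" else "assistant",
                       "content": t["text"]} for t in transcript]
        transcript.append({"role": "tutor", "text": complete(tutor_prompt, tutor_view)})
    return transcript
```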

What I’m Learning

Making the thinking visible changes everything. The generated persona prompts ask the LLM to reason in character before responding—working through the learner’s actual thought process, including their confusion, before producing what they’d say out loud. In the demo, this inner monologue stays behind the scenes, shaping more authentic responses. But you can surface it—and when we did this in our Pedagogical Promptbook work, it became a valuable review tool. You can read the reasoning and ask: does this sound like a real student thinking through this problem?
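One plausible way to implement this, not necessarily how the tool does it, is to ask for the in-character reasoning inside a tagged block, then strip it from the reply the tutor sees while keeping it for review. The tag name and parsing below are illustrative.

```python
import re

MONOLOGUE_INSTRUCTION = (
    "Before you answer, think through the problem as this student would, inside "
    "<inner_monologue>...</inner_monologue> tags, including any confusion or dead ends. "
    "Then write only what the student would actually say out loud."
)

def split_monologue(raw_reply: str) -> tuple[str, str]:
    """Separate the hidden in-character reasoning from the spoken reply."""
    match = re.search(r"<inner_monologue>(.*?)</inner_monologue>", raw_reply, re.DOTALL)
    monologue = match.group(1).strip() if match else ""
    spoken = re.sub(r"<inner_monologue>.*?</inner_monologue>", "", raw_reply, flags=re.DOTALL).strip()
    return monologue, spoken
```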

LLMs don’t want to be wrong. This is the fundamental tension in building synthetic learners. Language models are trained to be helpful and accurate—which is exactly the opposite of what you need from a student who holds a confident misconception. Early versions would “hold” a misconception for one turn, then quietly self-correct. Getting past this required reframing the LLM’s core role: making mistakes isn’t a failure mode, it’s the entire point. The prompt has to make clear that authentic, durable wrongness is what success looks like.
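As an illustration of the kind of framing that helped, here is the sort of clause you might append to the persona prompt. The wording is mine, not the tool's exact prompt.

```python
# Illustrative wording for making a misconception durable: the prompt frames
# staying wrong as the success criterion, not a failure mode.
MISCONCEPTION_CLAUSE = """
Staying in character matters more than being correct. Your misconception is not
a mistake to fix; it is part of how this student currently understands the topic.
Do not quietly self-correct after one turn. Only revise your belief if the tutor's
explanation would genuinely convince a student like you, and even then show the
realistic friction of changing your mind.
"""
```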

The hard test isn’t the eager learner. It’s tempting to validate your tutor against a student who asks great questions and engages enthusiastically—and satisfying when it works. But the real stress test is the Silent Struggler who won’t volunteer what they don’t understand, or the Capable but Disengaged student who gives you one-word answers. A tutor that shines with the Eager Novice might completely fall apart when a learner won’t meet it halfway. That’s the diagnostic. The synthetic learner doesn’t just test whether your tutor works—it shows you which kind of student it was actually designed for.

Status

The core tool is complete and functional. I'm exploring integration with structured knowledge graphs to ground learner misconceptions in specific domain models rather than freeform descriptions, connecting what a learner gets wrong to why they get it wrong at a structural level.

Tags: AI · learning · evaluation · prototyping · synthetic data