26 February 2026·Updated 12 March 2026·10 min read

How I built a production-grade adaptive study coach on Amazon Nova

Amazon NovaNova SonicMultimodalAgentic AIEdTechAWSIRT

TL;DRI built Testera, a production-grade adaptive study coach on Amazon Nova using all four capabilities: Nova 2 Lite for text generation, Nova 2 Sonic for speech-to-speech, multimodal vision for diagrams and images, and tool use for structured practice. IRT powers difficulty calibration so students focus on what they don't know.

Inspiration

Millions of students sit high-stakes exams every year: GMAT, GRE, SAT, GAMSAT, LSAT, IELTS, TOEFL, UCAT, NCLEX-RN, CFA. Many of them study the wrong things. Not because they're lazy, but because their tools have no idea who they are. They review topics they've already mastered, miss the patterns behind their mistakes, and drift off between questions while static question banks serve the next card.

In a world of infinite tabs and notifications, students sit down to study and lose focus within minutes, not because they're “too distracted” but because nothing in their prep environment responds to them as an individual.

I know this because I lived it. I was a strong student growing up, the kind of kid who picked things up fast and didn't need to try too hard. But as I got older, the world got louder. I loved doing multiple things at once, jumping between interests, chasing whatever felt exciting in the moment.

Somewhere along the way I stopped being able to focus for long stretches. Not because I stopped being capable, but because I didn't know what I wanted, and without that clarity, motivation just evaporates. You sit down to study and twenty minutes later you're three tabs deep in something completely unrelated.

Testera's focus tracker was built for exactly that moment. I wanted to build a coach that actually pays attention, because I needed one myself.

It knows what you're bad at. It remembers what you'll forget. It speaks back.

That's Testera.

What it does

Testera is an adaptive test prep platform powered by Amazon Nova that learns how you think and focuses on what actually hurts your score.

Tera is an agentic AI study coach that adapts your practice in real time using an IRT-based engine and focus tracking. Traditional platforms tell you what you got wrong. Tera tells you why, and makes it less likely you will make the same mistake twice.

Feature	What it means for the student
Adaptive question engine	Questions get harder or easier based on your live ability estimate, no wasted time on topics you've mastered
Tera (AI coach)	Agentic companion powered by Nova 2 Lite with tool use, proactively checks your weak topics, generates calibrated questions, and fetches your study plan before every reply. Tool calls render as pills in the chat so the agentic behaviour is visible
IELTS Speaking Examiner	Speech-to-speech AI examiner powered by Nova 2 Sonic, follows the full Part 1/2/3 format, streams audio back in real time, scores with 4-criterion band feedback
Snap a Problem	Photo any textbook page or whiteboard, Nova multimodal vision identifies the concept and generates an original calibrated practice question in seconds. Multimodal embeddings connect the image to your weak-topic profile
Animated Tera	11 distinct emotion states driven by CSS keyframe animations, bouncing when thinking, sparkling when you nail a question, floating z's when things go quiet
Focus tracker	Detects tab switches, idle time, and session drift, surfaces focus metrics so you can see where your attention actually goes
Writing scorer	IELTS Task 1/2 and GAMSAT Section 2 essays scored by Nova with band-level feedback in seconds
Error diagnosis	Pattern detection across your session, spots recurring traps before you sit the real exam
Live score trajectory	Ability estimate mapped to real scales: GMAT 200–800, GRE 130–170, IELTS 1–9
Active workspace	Scratchpad, calculator, formula sheet in a bottom drawer, mimics a real exam desk
Gamification	XP, streaks, achievements, and micro-celebrations to keep momentum going

Engineering at a Glance

Metric	Value
Exams supported	6 (GMAT, GRE, SAT, GAMSAT, LSAT, IELTS)
Nova models used	Nova 2 Lite · Nova 2 Sonic · Nova Multimodal Embeddings
Nova capabilities	Text generation · streaming · speech-to-speech · multimodal vision · tool use / function calling
Adaptive engine	3PL IRT · Newton–Raphson MLE · Bayesian prior N(0,1)
Spaced repetition	SM-2 algorithm · Ebbinghaus retention curve · per-topic ease factor (min 1.3)
Weak-area trigger	≥ 3 errors on same topic → remediation at θ − 0.8 · SM-2 ease < 1.8
Tera agentic coach	Nova tool use loop · 3 tools · streaming SSE · tool-use pills · auth users only
IELTS Speaking	Nova 2 Sonic bidirectional stream · Part 1/2/3 structure · 4-criterion band scoring · Polly fallback
Snap a Problem	Nova multimodal image → structured question · 6 exam types · multimodal embeddings (384-d)
Study Planner	Nova 2 Lite · IRT θ + SM-2 weak topics + target score → week-by-week schedule · 24 hr cache
Writing scorer	Nova 2 Lite · IELTS Task 1/2 + GAMSAT Section 2 · band-level feedback
Gamification	XP · streaks · achievements · toast notifications
Auth	JWT · Google OAuth · Email OTP (Amazon SES)
Frontend	Next.js 14 App Router · TypeScript · Tailwind · PWA (service worker)
Fallback question bank	53 curated questions across all 6 exams
Test suite	1,202 passing tests
AWS region	eu-west-1 (Ireland), GDPR jurisdiction · Nova Sonic: us-east-1

System Overview

Testera is a full-stack adaptive learning platform. The frontend is a Next.js 14 PWA deployed on Vercel. The backend is a FastAPI service running on AWS ECS Fargate, backed by Amazon Aurora PostgreSQL Serverless v2 and Amazon Bedrock (Nova 2 Lite) for AI inference.

🌐 Student, Browser / PWA

↓ HTTPS

Vercel, Edge CDN

Next.js 14 · App Router · SSR · RSC

↓ REST API

AWS eu-west-1, Ireland

ALB · HTTPS · ACM cert · Health checks

ECS Fargate, FastAPI

irt_engine.py
Bayesian θ estimation

selector.py
Question priority scoring

bedrock_nova.py
Prompt construction

writing_scorer.py
IELTS / GAMSAT scoring

error_analyzer.py
Session error patterns

achievement_manager.py
XP · streaks · badges

companion.py
Tera AI coach

Nova 2 Lite (eu-west-1)

Text gen · streaming · tool use · multimodal vision · embeddings

Nova 2 Sonic (us-east-1)

Speech-to-speech · IELTS Speaking · bidirectional stream

Amazon Polly (fallback)

Neural TTS · PCM stream

Aurora PostgreSQL
Serverless v2 · Multi-AZ

Amazon SES
OTP · transactional

Amazon ECR
Container images · CI/CD

Architecture on AWS

All AWS resources in eu-west-1 (Ireland), close to the primary user base and within GDPR jurisdiction.

Frontend: Next.js 14 App Router, TypeScript, Tailwind CSS, deployed on Vercel. All AI calls go through the FastAPI backend, no Bedrock credentials ever touch the browser.
Backend: FastAPI in Docker on ECS Fargate. Rolling deploys via GitHub Actions → ECR → ecs update-service --force-new-deployment.
Database: Aurora Serverless v2 scales ACUs up and down automatically, including near-zero when idle. Keeps costs low without sacrificing cold-start performance.

Adaptive Engine: IRT in Depth

Each question has three IRT parameters: discrimination a, difficulty b, and guessing c. After every answer, the student's ability estimate θ is updated using Newton–Raphson MLE with a Bayesian prior N(0,1). The prior prevents wild swings from a single lucky or unlucky answer.

θ is then used to pick the next question via a weighted priority score:

P(q) = w₁·W(q) + w₂·D(d_q, θ) + w₃·R(t_q)

W(q), error rate on this topic over recent attempts (surface weak areas)
D(d_q, θ), how close the question's difficulty is to current θ (zone of proximal development)
R(t_q), recency penalty (avoid repeating questions just seen)

Weights are tuned per exam, GMAT gets heavier weakness targeting, IELTS gets more difficulty progression. If a student accumulates errors on the same topic, the selector switches to remediation mode and rebuilds confidence before pushing harder.

Score mapping

Exam	Scale
GMAT	200–800
GRE	130–170
IELTS	1–9
SAT	400–1600
GAMSAT	30–90
LSAT	120–180

Amazon Nova Integration

Testera uses all four major Nova capabilities, text generation, speech-to-speech, multimodal understanding, and tool use.

1. Nova 2 Lite: Core intelligence

Every question, explanation, study plan, and writing score runs through eu.amazon.nova-2-lite-v1:0 via the Bedrock Converse API in eu-west-1 (GDPR jurisdiction). Nova 2 Lite was chosen for three reasons: it handles UK English (colour, favour, analyse) natively, essential for IELTS and GAMSAT users, it returns strict JSON reliably, and at ~€0.17/active user/month it makes affordable test prep sustainable.

Streaming: Tera's chat responses stream token-by-token using converseStream → Server-Sent Events → browser. Zero waiting, natural “thinking” effect.

Student types → POST /companion/stream → converseStream → SSE chunks → browser renders live

Tera is an agentic system. For authenticated users, Tera runs a Nova tool-use loop before every reply. She has three tools:

Tool	What it does
`get_weak_topics`	Pulls the student's SM-2 weak topics in real time
`generate_question`	Generates a calibrated practice question on demand
`get_study_plan`	Fetches or generates the student's personalised Nova study plan

Nova decides autonomously which tool (if any) to call. Each call renders as a pill above Tera's reply, e.g. ✦ Tera checked your weak topics, so the agentic behaviour is visible inside the actual product. Anonymous users get the plain streaming path.

Question generation: Exam-specific system prompts with explicit difficulty calibration (1–10 scale), distractor quality rules, and dual-critic verification. Nova returns strict JSON, no regex hacks. If Nova returns malformed JSON, the circuit breaker falls back to the cached question bank.

DIFFICULTY CALIBRATION:
- Difficulty 7/10 means ~32% of test-takers get it wrong.
- Use advanced concepts with 3+ step reasoning.

QUALITY REQUIREMENTS:
1. Only ONE answer can be correct (unambiguous)
2. All distractors must be plausible common errors
3. Self-contained, no external references needed
4. Consistent numbers and values throughout

Study Planner: POST /users/study-plan sends the student's IRT θ + SM-2 weak topics + target score to Nova and gets a concrete week-by-week daily schedule back. Cached 24 hours per user.

Writing scorer: Students submit IELTS or GAMSAT essays. Nova evaluates them against official band descriptors and returns a structured response: band score, per-criterion breakdown, specific improvements, and a model answer.

Tera's coaching logic

Below is a simplified illustrative version, the production prompt includes exam-specific vocabulary, structured JSON schemas, and additional guardrails.

python

TERA_SYSTEM_PROMPT = """
When a student makes a mistake:
1. Acknowledge what they got right first
2. Identify the specific error pattern, not just "wrong answer"
3. Give one actionable tip, not a lecture
4. Offer a follow-up question on the same concept

Be concise. Never condescending.
Domain: {exam_type}. Use appropriate vocabulary and scoring conventions.
"""

{exam_type} is replaced at runtime with the student's selected exam (e.g. IELTS, GMAT). Domain vocabulary (GMAT critical reasoning traps, GAMSAT Section III patterns, IELTS band descriptors) roughly doubles output quality on the same model.

2. Nova 2 Sonic: IELTS Speaking Examiner

Route: /speaking, the only AI-powered IELTS speaking practice tool that uses real speech-to-speech.

Student holds mic button → browser captures PCM16 audio (16 kHz, mono)
  → POST /api/speaking/turn (multipart)
    → invoke_model_with_bidirectional_stream (amazon.nova-2-sonic-v1:0, us-east-1)
      → Nova Sonic processes speech, generates examiner response
        → LPCM 24 kHz audio streamed back → browser plays immediately

The examiner follows the official IELTS Speaking format across three parts:

Part 1, 5 personal questions on familiar topics
Part 2, Cue card given, 1-minute prep timer, candidate speaks 1–2 minutes
Part 3, Abstract discussion questions related to the Part 2 topic

At the end, Nova 2 Lite evaluates the full conversation transcript and returns a band score breakdown (Fluency & Coherence, Lexical Resource, Grammatical Range, Pronunciation) with specific, actionable feedback.

Fallback: If Nova Sonic is unavailable, the examiner role falls back to Nova 2 Lite text generation + AWS Polly TTS, the student experience is never interrupted. Voice: Ruth (British English neural), appropriate for IELTS.

3. Nova Multimodal: Snap a Problem

Route: /snap, photo any textbook page, whiteboard, or diagram and get a calibrated practice question in seconds.

Snap a Problem, Nova multimodal vision generating a practice question from a photo

Student uploads / cameras an image
  → POST /api/snap/question
    → Nova 2 Lite converse() with image bytes + exam_type
      → Structured JSON: topic, difficulty, question, 4 options, correct_index, explanation
        → Student answers inline, sees full explanation

Nova reads the image, identifies the mathematical concept or logical principle shown, and generates an original practice question inspired by (not copying) that content, calibrated to the chosen exam (GMAT, GRE, SAT, GAMSAT, LSAT, IELTS).

Nova multimodal embeddings (amazon.nova-2-multimodal-embeddings-v1:0, 384 dimensions) also run silently on each image to enable similarity search, connecting a student's photo to the most relevant weak areas in their profile.

Tech Stack

Layer	Technology
Frontend	Next.js 14 App Router, TypeScript, Tailwind CSS, PWA (service worker)
Backend	FastAPI (Python 3.11), ECS Fargate, Docker
Database	Amazon Aurora PostgreSQL Serverless v2
AI (text + tool use)	Amazon Bedrock, Nova 2 Lite (`eu.amazon.nova-2-lite-v1:0`, eu-west-1)
AI (speech-to-speech)	Amazon Nova 2 Sonic (`amazon.nova-2-sonic-v1:0`, us-east-1)
AI (multimodal embeddings)	Nova multimodal embeddings (`amazon.nova-2-multimodal-embeddings-v1:0`, 384-d)
TTS fallback	Amazon Polly (Ruth, British English neural, streaming PCM)
Email	Amazon SES (OTP, transactional)
Container registry	Amazon ECR
Frontend hosting	Vercel (Edge CDN)
Load balancer	AWS ALB + ACM
CI/CD	GitHub Actions → ECR → ECS rolling deploy

Lessons Learned

Here's what actually mattered:

Prompt engineering beats model size. Adding exam-specific vocabulary and structured JSON schemas to the system prompt roughly doubled output quality on the same Amazon Nova 2 Lite model. A smarter prompt is almost always cheaper than a bigger model.
IRT is worth the complexity. A simple difficulty ladder would have been faster to build. But the 3PL model gives you a real ability estimate, one you can map to actual exam scales and explain to a student. That credibility matters.
Speech-to-speech changes what's possible. Nova Sonic's bidirectional streaming, PCM16 in, LPCM 24 kHz out, makes the IELTS Speaking Examiner feel like a real interview. TTS bolt-ons cannot replicate this; the latency difference is immediately noticeable.
Visible agentic behaviour builds trust. Tool-use pills showing ✦ Tera checked your weak topics are a one-line UI change that transforms a black-box AI into something students understand and trust. Transparency is a product feature, not just a compliance checkbox.
Multimodal input removes friction. Students photograph textbook pages they're already looking at. The image → structured question pipeline meets them where they are rather than asking them to reframe the problem into text first.
Circuit breakers are not optional. Bedrock will occasionally be slow or unavailable. Building fallbacks before launch, question bank, Polly TTS for Sonic, cached study plans, meant zero student-facing failures.
Aurora Serverless v2 cold starts are real. Near-zero ACU scaling is great for cost, but the first query after idle can be slow. Warm-up pings on a schedule solved it without giving up the savings.
Focus tracking changes how students feel about the product. It's not the flashiest feature, but students who see their own attention data become more invested in improving it. Behavioural feedback loops are underrated in edtech.

Acknowledgements

Thanks to the IRT research community: the 3PL model and Newton–Raphson ability estimation that power Testera's adaptive engine have decades of rigorous work behind them.

High-stakes exams. All four Nova capabilities. Live at testera.org.

Live Responsible AI Guide Read on AWS Builder Center

Responsible AI

All practice questions and explanations are AI-generated original content, not reproductions of official exam items.

No proprietary or licensed exam questions are fed into Nova.
Questions are clearly presented as AI-generated practice material, not official exam items.
Testera does not imply endorsement by ETS, GMAC, College Board, ACER, or any other exam body.
Authenticated users can flag any question as incorrect or low-quality, every report is reviewed.
Testera is for practice only, not official scoring or academic advice.

IELTS and GAMSAT: Practice questions are AI-generated originals written to reflect the style and cognitive demands of each exam. They are not sourced from, affiliated with, or endorsed by the British Council, IDP, Cambridge Assessment English (IELTS) or ACER (GAMSAT). Testera's IELTS writing scorer evaluates essays against publicly documented band descriptors as a study aid, it does not produce official band scores. See testera.org/terms for full details.

Sources & References

Amazon Bedrock, Amazon Nova model family documentation. Source
Amazon Nova 2 Sonic, Speech-to-speech model documentation. Source
Amazon Polly, Neural TTS documentation (Sonic fallback). Source
Lord, F.M. (1980). Applications of Item Response Theory to Practical Testing Problems. Erlbaum. Source

Share this article:

Anya Chueayen

Founder of Aqta. Before this, I worked on integrity at social media platforms, the unglamorous side of AI where human behaviour, edge cases, and ethics collide at scale. That work convinced me that responsible AI needs infrastructure, not just good intentions. Based in Dublin, closely watching how regulation is reshaping what we build and how.

Connect on LinkedIn

If you're interested in the governance side of AI systems like this, these two pieces go deeper.

The human supply chain behind AI

The invisible labour that powers AI systems, and why it matters for governance.

Who's accountable when healthcare AI makes a mistake?

Ireland's Medical Council says doctors remain responsible for AI decisions. But how can they be confident in tools they don't fully understand?