How I Built a Production-Grade Adaptive Study Coach on Amazon Nova

Testera, AI Study Agent on Amazon Nova
Also published on AWS Builder Center
TL;DRI built Testera, a production-grade adaptive study coach on Amazon Nova using all four capabilities: Nova 2 Lite for text generation, Nova 2 Sonic for speech-to-speech, multimodal vision for diagrams and images, and tool use for structured practice. IRT powers difficulty calibration so students focus on what they don't know.

Inspiration

Millions of students sit high-stakes exams every year: GMAT, GRE, SAT, GAMSAT, LSAT, IELTS, TOEFL, UCAT, NCLEX-RN, CFA. Many of them study the wrong things. Not because they're lazy, but because their tools have no idea who they are. They review topics they've already mastered, miss the patterns behind their mistakes, and drift off between questions while static question banks serve the next card.

In a world of infinite tabs and notifications, students sit down to study and lose focus within minutes, not because they're “too distracted” but because nothing in their prep environment responds to them as an individual.

I know this because I lived it. I was a strong student growing up, the kind of kid who picked things up fast and didn't need to try too hard. But as I got older, the world got louder. I loved doing multiple things at once, jumping between interests, chasing whatever felt exciting in the moment.

Somewhere along the way I stopped being able to focus for long stretches. Not because I stopped being capable, but because I didn't know what I wanted, and without that clarity, motivation just evaporates. You sit down to study and twenty minutes later you're three tabs deep in something completely unrelated.

Testera's focus tracker was built for exactly that moment. I wanted to build a coach that actually pays attention, because I needed one myself.

It knows what you're bad at. It remembers what you'll forget. It speaks back.

That's Testera.

What it does

Testera is an adaptive test prep platform powered by Amazon Nova that learns how you think and focuses on what actually hurts your score.

Tera is an agentic AI study coach that adapts your practice in real time using an IRT-based engine and focus tracking. Traditional platforms tell you what you got wrong. Tera tells you why, and makes it less likely you will make the same mistake twice.

FeatureWhat it means for the student
Adaptive question engineQuestions get harder or easier based on your live ability estimate, no wasted time on topics you've mastered
Tera (AI coach)Agentic companion powered by Nova 2 Lite with tool use, proactively checks your weak topics, generates calibrated questions, and fetches your study plan before every reply. Tool calls render as pills in the chat so the agentic behaviour is visible
IELTS Speaking ExaminerSpeech-to-speech AI examiner powered by Nova 2 Sonic, follows the full Part 1/2/3 format, streams audio back in real time, scores with 4-criterion band feedback
Snap a ProblemPhoto any textbook page or whiteboard, Nova multimodal vision identifies the concept and generates an original calibrated practice question in seconds. Multimodal embeddings connect the image to your weak-topic profile
Animated Tera11 distinct emotion states driven by CSS keyframe animations, bouncing when thinking, sparkling when you nail a question, floating z's when things go quiet
Focus trackerDetects tab switches, idle time, and session drift, surfaces focus metrics so you can see where your attention actually goes
Writing scorerIELTS Task 1/2 and GAMSAT Section 2 essays scored by Nova with band-level feedback in seconds
Error diagnosisPattern detection across your session, spots recurring traps before you sit the real exam
Live score trajectoryAbility estimate mapped to real scales: GMAT 200–800, GRE 130–170, IELTS 1–9
Active workspaceScratchpad, calculator, formula sheet in a bottom drawer, mimics a real exam desk
GamificationXP, streaks, achievements, and micro-celebrations to keep momentum going

Engineering at a Glance

MetricValue
Exams supported6 (GMAT, GRE, SAT, GAMSAT, LSAT, IELTS)
Nova models usedNova 2 Lite · Nova 2 Sonic · Nova Multimodal Embeddings
Nova capabilitiesText generation · streaming · speech-to-speech · multimodal vision · tool use / function calling
Adaptive engine3PL IRT · Newton–Raphson MLE · Bayesian prior N(0,1)
Spaced repetitionSM-2 algorithm · Ebbinghaus retention curve · per-topic ease factor (min 1.3)
Weak-area trigger≥ 3 errors on same topic → remediation at θ − 0.8 · SM-2 ease < 1.8
Tera agentic coachNova tool use loop · 3 tools · streaming SSE · tool-use pills · auth users only
IELTS SpeakingNova 2 Sonic bidirectional stream · Part 1/2/3 structure · 4-criterion band scoring · Polly fallback
Snap a ProblemNova multimodal image → structured question · 6 exam types · multimodal embeddings (384-d)
Study PlannerNova 2 Lite · IRT θ + SM-2 weak topics + target score → week-by-week schedule · 24 hr cache
Writing scorerNova 2 Lite · IELTS Task 1/2 + GAMSAT Section 2 · band-level feedback
GamificationXP · streaks · achievements · toast notifications
AuthJWT · Google OAuth · Email OTP (Amazon SES)
FrontendNext.js 14 App Router · TypeScript · Tailwind · PWA (service worker)
Fallback question bank53 curated questions across all 6 exams
Test suite1,202 passing tests
AWS regioneu-west-1 (Ireland), GDPR jurisdiction · Nova Sonic: us-east-1

System Overview

Testera is a full-stack adaptive learning platform. The frontend is a Next.js 14 PWA deployed on Vercel. The backend is a FastAPI service running on AWS ECS Fargate, backed by Amazon Aurora PostgreSQL Serverless v2 and Amazon Bedrock (Nova 2 Lite) for AI inference.

🌐 Student, Browser / PWA
↓ HTTPS
Vercel, Edge CDN
Next.js 14 · App Router · SSR · RSC
↓ REST API
AWS eu-west-1, Ireland
ALB · HTTPS · ACM cert · Health checks
ECS Fargate, FastAPI
irt_engine.py
Bayesian θ estimation
selector.py
Question priority scoring
bedrock_nova.py
Prompt construction
writing_scorer.py
IELTS / GAMSAT scoring
error_analyzer.py
Session error patterns
achievement_manager.py
XP · streaks · badges
companion.py
Tera AI coach
Nova 2 Lite (eu-west-1)
Text gen · streaming · tool use · multimodal vision · embeddings
Nova 2 Sonic (us-east-1)
Speech-to-speech · IELTS Speaking · bidirectional stream
Amazon Polly (fallback)
Neural TTS · PCM stream
Aurora PostgreSQL
Serverless v2 · Multi-AZ
Amazon SES
OTP · transactional
Amazon ECR
Container images · CI/CD

Architecture on AWS

All AWS resources in eu-west-1 (Ireland), close to the primary user base and within GDPR jurisdiction.

  • Frontend: Next.js 14 App Router, TypeScript, Tailwind CSS, deployed on Vercel. All AI calls go through the FastAPI backend, no Bedrock credentials ever touch the browser.
  • Backend: FastAPI in Docker on ECS Fargate. Rolling deploys via GitHub Actions → ECR → ecs update-service --force-new-deployment.
  • Database: Aurora Serverless v2 scales ACUs up and down automatically, including near-zero when idle. Keeps costs low without sacrificing cold-start performance.

Adaptive Engine: IRT in Depth

Each question has three IRT parameters: discrimination a, difficulty b, and guessing c. After every answer, the student's ability estimate θ is updated using Newton–Raphson MLE with a Bayesian prior N(0,1). The prior prevents wild swings from a single lucky or unlucky answer.

θ is then used to pick the next question via a weighted priority score:

P(q) = w₁·W(q) + w₂·D(dq, θ) + w₃·R(tq)

  • W(q), error rate on this topic over recent attempts (surface weak areas)
  • D(dq, θ), how close the question's difficulty is to current θ (zone of proximal development)
  • R(tq), recency penalty (avoid repeating questions just seen)

Weights are tuned per exam, GMAT gets heavier weakness targeting, IELTS gets more difficulty progression. If a student accumulates errors on the same topic, the selector switches to remediation mode and rebuilds confidence before pushing harder.

Score mapping

ExamScale
GMAT200–800
GRE130–170
IELTS1–9
SAT400–1600
GAMSAT30–90
LSAT120–180

Amazon Nova Integration

Testera uses all four major Nova capabilities, text generation, speech-to-speech, multimodal understanding, and tool use.

1. Nova 2 Lite: Core intelligence

Every question, explanation, study plan, and writing score runs through eu.amazon.nova-2-lite-v1:0 via the Bedrock Converse API in eu-west-1 (GDPR jurisdiction). Nova 2 Lite was chosen for three reasons: it handles UK English (colour, favour, analyse) natively, essential for IELTS and GAMSAT users, it returns strict JSON reliably, and at ~€0.17/active user/month it makes affordable test prep sustainable.

Streaming: Tera's chat responses stream token-by-token using converseStream → Server-Sent Events → browser. Zero waiting, natural “thinking” effect.

Student types → POST /companion/stream → converseStream → SSE chunks → browser renders live

Tera is an agentic system. For authenticated users, Tera runs a Nova tool-use loop before every reply. She has three tools:

ToolWhat it does
get_weak_topicsPulls the student's SM-2 weak topics in real time
generate_questionGenerates a calibrated practice question on demand
get_study_planFetches or generates the student's personalised Nova study plan

Nova decides autonomously which tool (if any) to call. Each call renders as a pill above Tera's reply, e.g. ✦ Tera checked your weak topics, so the agentic behaviour is visible inside the actual product. Anonymous users get the plain streaming path.

Question generation: Exam-specific system prompts with explicit difficulty calibration (1–10 scale), distractor quality rules, and dual-critic verification. Nova returns strict JSON, no regex hacks. If Nova returns malformed JSON, the circuit breaker falls back to the cached question bank.

DIFFICULTY CALIBRATION:
- Difficulty 7/10 means ~32% of test-takers get it wrong.
- Use advanced concepts with 3+ step reasoning.

QUALITY REQUIREMENTS:
1. Only ONE answer can be correct (unambiguous)
2. All distractors must be plausible common errors
3. Self-contained, no external references needed
4. Consistent numbers and values throughout

Study Planner: POST /users/study-plan sends the student's IRT θ + SM-2 weak topics + target score to Nova and gets a concrete week-by-week daily schedule back. Cached 24 hours per user.

Writing scorer: Students submit IELTS or GAMSAT essays. Nova evaluates them against official band descriptors and returns a structured response: band score, per-criterion breakdown, specific improvements, and a model answer.

Tera's coaching logic

Below is a simplified illustrative version, the production prompt includes exam-specific vocabulary, structured JSON schemas, and additional guardrails.

python
TERA_SYSTEM_PROMPT = """
When a student makes a mistake:
1. Acknowledge what they got right first
2. Identify the specific error pattern, not just "wrong answer"
3. Give one actionable tip, not a lecture
4. Offer a follow-up question on the same concept

Be concise. Never condescending.
Domain: {exam_type}. Use appropriate vocabulary and scoring conventions.
"""

{exam_type} is replaced at runtime with the student's selected exam (e.g. IELTS, GMAT). Domain vocabulary (GMAT critical reasoning traps, GAMSAT Section III patterns, IELTS band descriptors) roughly doubles output quality on the same model.

2. Nova 2 Sonic: IELTS Speaking Examiner

Route: /speaking, the only AI-powered IELTS speaking practice tool that uses real speech-to-speech.

Student holds mic button → browser captures PCM16 audio (16 kHz, mono)
  → POST /api/speaking/turn (multipart)
    → invoke_model_with_bidirectional_stream (amazon.nova-2-sonic-v1:0, us-east-1)
      → Nova Sonic processes speech, generates examiner response
        → LPCM 24 kHz audio streamed back → browser plays immediately

The examiner follows the official IELTS Speaking format across three parts:

  • Part 1, 5 personal questions on familiar topics
  • Part 2, Cue card given, 1-minute prep timer, candidate speaks 1–2 minutes
  • Part 3, Abstract discussion questions related to the Part 2 topic

At the end, Nova 2 Lite evaluates the full conversation transcript and returns a band score breakdown (Fluency & Coherence, Lexical Resource, Grammatical Range, Pronunciation) with specific, actionable feedback.

Fallback: If Nova Sonic is unavailable, the examiner role falls back to Nova 2 Lite text generation + AWS Polly TTS, the student experience is never interrupted. Voice: Ruth (British English neural), appropriate for IELTS.

3. Nova Multimodal: Snap a Problem

Route: /snap, photo any textbook page, whiteboard, or diagram and get a calibrated practice question in seconds.

Snap a Problem, Nova multimodal vision generating a practice question from a photo
Snap a Problem, Nova multimodal vision generating a practice question from a photo
Student uploads / cameras an image
  → POST /api/snap/question
    → Nova 2 Lite converse() with image bytes + exam_type
      → Structured JSON: topic, difficulty, question, 4 options, correct_index, explanation
        → Student answers inline, sees full explanation

Nova reads the image, identifies the mathematical concept or logical principle shown, and generates an original practice question inspired by (not copying) that content, calibrated to the chosen exam (GMAT, GRE, SAT, GAMSAT, LSAT, IELTS).

Nova multimodal embeddings (amazon.nova-2-multimodal-embeddings-v1:0, 384 dimensions) also run silently on each image to enable similarity search, connecting a student's photo to the most relevant weak areas in their profile.

Tech Stack

LayerTechnology
FrontendNext.js 14 App Router, TypeScript, Tailwind CSS, PWA (service worker)
BackendFastAPI (Python 3.11), ECS Fargate, Docker
DatabaseAmazon Aurora PostgreSQL Serverless v2
AI (text + tool use)Amazon Bedrock, Nova 2 Lite (eu.amazon.nova-2-lite-v1:0, eu-west-1)
AI (speech-to-speech)Amazon Nova 2 Sonic (amazon.nova-2-sonic-v1:0, us-east-1)
AI (multimodal embeddings)Nova multimodal embeddings (amazon.nova-2-multimodal-embeddings-v1:0, 384-d)
TTS fallbackAmazon Polly (Ruth, British English neural, streaming PCM)
EmailAmazon SES (OTP, transactional)
Container registryAmazon ECR
Frontend hostingVercel (Edge CDN)
Load balancerAWS ALB + ACM
CI/CDGitHub Actions → ECR → ECS rolling deploy

Lessons Learned

Here's what actually mattered:

  • Prompt engineering beats model size. Adding exam-specific vocabulary and structured JSON schemas to the system prompt roughly doubled output quality on the same Amazon Nova 2 Lite model. A smarter prompt is almost always cheaper than a bigger model.
  • IRT is worth the complexity. A simple difficulty ladder would have been faster to build. But the 3PL model gives you a real ability estimate, one you can map to actual exam scales and explain to a student. That credibility matters.
  • Speech-to-speech changes what's possible. Nova Sonic's bidirectional streaming, PCM16 in, LPCM 24 kHz out, makes the IELTS Speaking Examiner feel like a real interview. TTS bolt-ons cannot replicate this; the latency difference is immediately noticeable.
  • Visible agentic behaviour builds trust. Tool-use pills showing ✦ Tera checked your weak topics are a one-line UI change that transforms a black-box AI into something students understand and trust. Transparency is a product feature, not just a compliance checkbox.
  • Multimodal input removes friction. Students photograph textbook pages they're already looking at. The image → structured question pipeline meets them where they are rather than asking them to reframe the problem into text first.
  • Circuit breakers are not optional. Bedrock will occasionally be slow or unavailable. Building fallbacks before launch, question bank, Polly TTS for Sonic, cached study plans, meant zero student-facing failures.
  • Aurora Serverless v2 cold starts are real. Near-zero ACU scaling is great for cost, but the first query after idle can be slow. Warm-up pings on a schedule solved it without giving up the savings.
  • Focus tracking changes how students feel about the product. It's not the flashiest feature, but students who see their own attention data become more invested in improving it. Behavioural feedback loops are underrated in edtech.

Acknowledgements

Thanks to the AWS Nova Hackathon team for the opportunity and to the IRT research community, the 3PL model and Newton–Raphson ability estimation that power Testera's adaptive engine have decades of rigorous work behind them.

High-stakes exams. All four Nova capabilities. Live at testera.org.

Responsible AI

All practice questions and explanations are AI-generated original content, not reproductions of official exam items.

  • No proprietary or licensed exam questions are fed into Nova.
  • Questions are clearly presented as AI-generated practice material, not official exam items.
  • Testera does not imply endorsement by ETS, GMAC, College Board, ACER, or any other exam body.
  • Authenticated users can flag any question as incorrect or low-quality, every report is reviewed.
  • Testera is for practice only, not official scoring or academic advice.

IELTS and GAMSAT: Practice questions are AI-generated originals written to reflect the style and cognitive demands of each exam. They are not sourced from, affiliated with, or endorsed by the British Council, IDP, Cambridge Assessment English (IELTS) or ACER (GAMSAT). Testera's IELTS writing scorer evaluates essays against publicly documented band descriptors as a study aid, it does not produce official band scores. See testera.org/terms for full details.

Sources & References

  1. Amazon Bedrock, Amazon Nova model family documentation. Source
  2. Amazon Nova 2 Sonic, Speech-to-speech model documentation. Source
  3. Amazon Polly, Neural TTS documentation (Sonic fallback). Source
  4. Lord, F.M. (1980). Applications of Item Response Theory to Practical Testing Problems. Erlbaum. Source
Share this article:
Anya Chueayen

Anya Chueayen

Founder of Aqta. Before this, I worked on integrity at social media platforms, the unglamorous side of AI where human behaviour, edge cases, and ethics collide at scale. That work convinced me that responsible AI needs infrastructure, not just good intentions. Based in Dublin, closely watching how regulation is reshaping what we build and how.

If you're interested in the governance side of AI systems like this, these two pieces go deeper.

© 2026 Aqta. All rights reserved.

Request access

Choose your access type and tell us about your use case.

Access type