Skimreader.ai is an AI-powered eReader that summarizes textbooks and long-form articles and helps you learn them while you read, with page-level quizzes that reinforce recall and quick note export for revision. The product proposition: save hours per week and improve retention for students and researchers, directly inside a distraction-free reader.
Objectives & outcomes
The Skimreader.ai build sat at the far end of the data integration pipeline. It required little integration or data modelling; instead, the work centred on how a novel AI system could interact with written content to provide something useful.
Key outcomes targeted:
- “Summarize while you read” flow with page-level concepts, questions and quick reviews.
- A study model based on the SQ3R education model with live generated quizzes to reinforce recall; optional text-to-speech mode.
- Frictionless export of highlights/notes (Markdown) and simple onboarding (email or Google sign-in).
Context, constraints & success criteria
Content varies widely (scanned PDFs, academic PDFs with math, long HTML articles). Constraints included maintaining low latency for page-level analysis, grounding summaries to avoid hallucination, predictable operating cost per page, and accessibility for neurodiverse users (short chunks, quiz cadence, optional audio).
The central challenge was building an entirely original model for interacting with AI. People typically interact with AI through a chatbot, but this project needed a custom interaction model in which the reader engages with the AI as if it were a page in a book.
Approach
Engagement model
Discovery with students/researchers → architecture & prototypes → guided delivery.
We mapped study behaviours (skim → deep-read → review) and built a thin slice: PDF/HTML ingestion → layout-aware chunking → page-level concept extraction → SQ3R-style questions → notes export. We added an evaluation harness for grounding and latency and iterated UI patterns (inline quiz, highlights, and “explain like I’m 5” toggles) to keep the reader focused.
- Automate data-to-insight at the page level: reproducible pipelines and consistent prompts per content type (PDF vs HTML).
- Semantic search across the open document and prior notes, with citations back to the page/section.
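To make the search-with-citations idea concrete, here is a minimal sketch rather than the production implementation. It assumes each page chunk and saved note already has an embedding vector from some embedding model, and ranks them by cosine similarity. The `IndexedChunk` shape, the `searchWithCitations` name and the toy two-dimensional vectors are all hypothetical.

```typescript
// Hypothetical index entry: the real schema is not described in this write-up.
interface IndexedChunk {
  id: string;
  source: "document" | "note";
  page: number;        // page the text came from, used in the citation
  section: string;     // nearest heading, used in the citation
  text: string;
  embedding: number[]; // assumed output of some embedding model
}

interface SearchHit {
  id: string;
  score: number;
  citation: string;
}

function dot(a: number[], b: number[]): number {
  return a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
}

// Cosine similarity; returns 0 when either vector has zero length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("embedding dimensions differ");
  }
  const denominator = Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b));
  return denominator === 0 ? 0 : dot(a, b) / denominator;
}

// Rank every chunk against the query and attach a page/section citation.
function searchWithCitations(query: number[], index: IndexedChunk[], topK = 3): SearchHit[] {
  return index
    .map((chunk) => ({
      id: chunk.id,
      score: cosineSimilarity(query, chunk.embedding),
      citation: `${chunk.source === "note" ? "note, " : ""}p. ${chunk.page}, ${chunk.section}`,
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// Toy two-dimensional embeddings, for illustration only.
const sampleIndex: IndexedChunk[] = [
  { id: "a", source: "document", page: 3, section: "Photosynthesis", text: "Light reactions occur in the thylakoid membrane.", embedding: [1, 0] },
  { id: "b", source: "document", page: 9, section: "Respiration", text: "Glycolysis takes place in the cytoplasm.", embedding: [0, 1] },
  { id: "c", source: "note", page: 3, section: "Photosynthesis", text: "Remember: light reactions come first.", embedding: [0.9, 0.1] },
];
const hits = searchWithCitations([1, 0], sampleIndex, 2);
```

A linear scan like this is only reasonable for a single open document plus a modest set of notes; a larger library would more plausibly use a vector index.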
Build & phases
Three concurrent tracks produced quick wins while laying a scalable foundation.
Discovery
User research, content inventory, grounding & latency targets.
- Identify primary content types (textbooks, research PDFs, long-form articles).
- Define evaluation criteria (precision, recall, grounding, latency, cost).
- Baseline parsing quality: headings, figures, math, footnotes.
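As an illustration of how the precision and recall criteria could be scored for page-level concept extraction, the sketch below compares extracted concepts against a hand-labelled reference list. It assumes exact matching after trimming and lower-casing, which is a simplification; the `scoreConcepts` name and the sample concepts are hypothetical.

```typescript
interface ExtractionScores {
  precision: number; // share of extracted concepts that are in the reference
  recall: number;    // share of reference concepts that were extracted
  f1: number;        // harmonic mean of precision and recall
}

// Normalize and de-duplicate so repeated concepts are not counted twice.
function uniqueNormalized(items: string[]): string[] {
  const normalized = items.map((s) => s.trim().toLowerCase());
  return normalized.filter((s, i) => normalized.indexOf(s) === i);
}

function scoreConcepts(extracted: string[], reference: string[]): ExtractionScores {
  const got = uniqueNormalized(extracted);
  const want = uniqueNormalized(reference);
  const truePositives = got.filter((c) => want.indexOf(c) !== -1).length;
  const precision = got.length === 0 ? 0 : truePositives / got.length;
  const recall = want.length === 0 ? 0 : truePositives / want.length;
  const f1 = precision + recall === 0 ? 0 : (2 * precision * recall) / (precision + recall);
  return { precision, recall, f1 };
}

// Hypothetical page: 4 concepts extracted, 5 in the reference, 3 in common.
const pageScores = scoreConcepts(
  ["Osmosis", "diffusion", "Active transport", "mitosis"],
  ["osmosis", "diffusion", "active transport", "facilitated diffusion", "endocytosis"],
);
```

Here three of the four extracted concepts are correct (precision 3/4) and three of the five reference concepts were found (recall 3/5).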
Design
Reader architecture and study model.
- Layout-aware chunking; embeddings for context windows; prompt contracts per page type.
- SQ3R interaction model: survey prompts, page questions, periodic quick reviews.
- Governance: RBAC for saved docs, testing, lineage of generated notes.
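Layout-aware chunking can be sketched as follows, under the assumption that the parser has already produced a flat list of typed blocks with page numbers. The `LayoutBlock` and `Chunk` shapes, the `chunkByLayout` name and the 1200-character default are hypothetical, not the shipped design. The idea is to start a new chunk at every heading and page boundary, and never split inside a block.

```typescript
type BlockKind = "heading" | "paragraph" | "figure" | "math" | "footnote";

// Hypothetical parser output: one entry per layout element.
interface LayoutBlock {
  kind: BlockKind;
  page: number;
  text: string;
}

interface Chunk {
  page: number;
  section: string; // text of the most recent heading
  text: string;
}

function chunkByLayout(blocks: LayoutBlock[], maxChars = 1200): Chunk[] {
  const chunks: Chunk[] = [];
  let section = "(untitled)";
  let forceNew = true; // set after each heading so sections never share a chunk

  for (const block of blocks) {
    if (block.kind === "heading") {
      section = block.text;
      forceNew = true;
      continue;
    }
    const last = chunks[chunks.length - 1];
    if (
      last !== undefined &&
      !forceNew &&
      last.page === block.page &&
      last.text.length + 1 + block.text.length <= maxChars
    ) {
      // Same section, same page, and still within the size limit: extend.
      last.text += "\n" + block.text;
    } else {
      chunks.push({ page: block.page, section, text: block.text });
      forceNew = false;
    }
  }
  return chunks;
}

const sampleBlocks: LayoutBlock[] = [
  { kind: "heading", page: 1, text: "1 Cells" },
  { kind: "paragraph", page: 1, text: "Cells are the basic unit of life." },
  { kind: "paragraph", page: 1, text: "Most cells are microscopic." },
  { kind: "heading", page: 1, text: "1.1 Membranes" },
  { kind: "paragraph", page: 1, text: "Membranes separate the cell from its surroundings." },
  { kind: "paragraph", page: 2, text: "Transport proteins move molecules across the membrane." },
];
const sampleChunks = chunkByLayout(sampleBlocks);
```

Because every chunk carries its page and section, downstream summaries and questions can link back to a page location. A block longer than `maxChars` is kept whole in this sketch rather than split.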
Delivery
Ship the thin slice, then scale by template.
- Priority features: summaries, quizzes, highlights, Markdown export, Google sign-in.
- Performance budget per page; streaming responses for perceived speed.
- Human-in-the-loop: quick “fix/cite” actions to correct or ground outputs.
Iteration
Scale to more sources and modes.
- Richer PDF heuristics; web article readability; mobile-first interactions.
- Prompt library versioning; red-team evaluation; notes export integrations.
- Text-to-speech mode and spaced-repetition review sets.
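The write-up does not say which scheduling method the spaced-repetition review sets use. As one possible illustration, the sketch below uses a simple Leitner-box scheme: a correct answer moves a quiz question up one box, a wrong answer sends it back to box 1, and higher boxes are reviewed less often. The `ReviewCard` shape, the function names and the interval values are all assumptions.

```typescript
// Hypothetical review card built from a page-level quiz question.
interface ReviewCard {
  question: string;
  page: number;
  box: number;    // Leitner box, 1 (new or forgotten) to 5 (well known)
  dueDay: number; // day index on which the card is next due
}

// Assumed review intervals in days for boxes 1 to 5.
const BOX_INTERVALS_DAYS = [1, 2, 4, 8, 16];

// Returns an updated copy of the card after the learner answers it.
function reviewCard(card: ReviewCard, answeredCorrectly: boolean, today: number): ReviewCard {
  const box = answeredCorrectly ? Math.min(card.box + 1, BOX_INTERVALS_DAYS.length) : 1;
  const interval = BOX_INTERVALS_DAYS[box - 1] ?? 1;
  return { ...card, box, dueDay: today + interval };
}

// Cards whose due day has arrived form the next review set.
function dueCards(cards: ReviewCard[], today: number): ReviewCard[] {
  return cards.filter((card) => card.dueDay <= today);
}

const fresh: ReviewCard = { question: "What drives osmosis?", page: 12, box: 1, dueDay: 0 };
const afterFirst = reviewCard(fresh, true, 0);       // moves to box 2
const afterSecond = reviewCard(afterFirst, true, 2); // moves to box 3
const afterMiss = reviewCard(afterSecond, false, 6); // back to box 1
```

Keeping the scheduler as pure functions over plain objects makes it easy to test and to store alongside exported notes.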
Selected results
- Inline summaries and quizzes reduce context-switching; students stay in the reader.
- Markdown exports plug into Notion/Obsidian; research notes stay portable.
- Users can semantically search within the active document and their saved notes with citations.
- Support for large PDFs with predictable per-page latency and cost controls.
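The Markdown export can be pictured with a small sketch. It assumes highlights are stored with a page number, the quoted text and an optional note; the `Highlight` shape, the `exportToMarkdown` name and the exact heading layout are hypothetical. It uses only headings, block quotes and paragraphs, which common Markdown-based note tools accept.

```typescript
// Hypothetical highlight record.
interface Highlight {
  page: number;
  quote: string;
  note?: string;
}

function exportToMarkdown(title: string, highlights: Highlight[]): string {
  // Sort a copy by page so the caller's array is left untouched.
  const sorted = highlights.slice().sort((a, b) => a.page - b.page);
  const lines: string[] = [`# ${title}`, ""];
  let currentPage: number | null = null;

  for (const h of sorted) {
    if (h.page !== currentPage) {
      lines.push(`## Page ${h.page}`, "");
      currentPage = h.page;
    }
    // Prefix every line of the quote so multi-line quotes stay in one block quote.
    lines.push(h.quote.split("\n").map((line) => `> ${line}`).join("\n"), "");
    if (h.note) {
      lines.push(h.note, "");
    }
  }
  // Trim trailing blank lines and end with a single newline.
  return lines.join("\n").replace(/\s+$/, "") + "\n";
}

const markdown = exportToMarkdown("Biology 101", [
  { page: 7, quote: "ATP is the cell's energy currency." },
  { page: 3, quote: "Osmosis is the diffusion of water.", note: "Compare with active transport." },
  { page: 3, quote: "Membranes are selectively permeable." },
]);
```

Grouping by page keeps each exported quote traceable to its location in the source document.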
Solution overview
Reader platform. Web app as the system of study; ingestion for PDFs/HTML; layout-aware parsing; automated tests; latency/cost monitors. A governed semantic layer indexes pages, notes and citations for retrieval.
AI-enhanced learning. Page-level question generation aligned to SQ3R; templated summaries with links back to page locations.
Custom dashboard. Role-based access control (RBAC) ensures each user only sees their documents and notes. Built with Next.js, the app is fast, secure, and includes a custom library section for storage of books and quizzes.
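As a rough sketch of the access rule described above, the check below combines a role-to-permission table with an ownership test so that a reader only sees their own documents and notes. The role names, the `admin` support role and the `canAccess` and `visibleTo` helpers are assumptions made for illustration; the actual roles are not listed in this write-up.

```typescript
type Role = "reader" | "admin"; // hypothetical roles
type Action = "read" | "write" | "delete";

interface User {
  id: string;
  role: Role;
}

interface Resource {
  id: string;
  ownerId: string;
  kind: "document" | "note" | "quiz";
}

// Assumed permission table: readers manage their own items, admins can
// read or delete any item (for support) but cannot edit it.
const ROLE_PERMISSIONS: Record<Role, Action[]> = {
  reader: ["read", "write", "delete"],
  admin: ["read", "delete"],
};

function canAccess(user: User, resource: Resource, action: Action): boolean {
  const roleAllows = ROLE_PERMISSIONS[user.role].indexOf(action) !== -1;
  if (!roleAllows) {
    return false;
  }
  // Non-admin roles are additionally restricted to resources they own.
  return user.role === "admin" || resource.ownerId === user.id;
}

// Filter a library listing down to what the user is allowed to read.
function visibleTo<T extends Resource>(user: User, resources: T[]): T[] {
  return resources.filter((resource) => canAccess(user, resource, "read"));
}

const alice: User = { id: "u1", role: "reader" };
const bob: User = { id: "u2", role: "reader" };
const library: Resource[] = [
  { id: "d1", ownerId: "u1", kind: "document" },
  { id: "n1", ownerId: "u2", kind: "note" },
];
const bobsView = visibleTo(bob, library);
```

In a Next.js app a check like this would typically run on the server before any document or note is returned, never only in the browser.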
Notable benefits
What this means for students & researchers:
- Trustworthy learning: summaries cite sections; questions tie back to the page.
- Faster comprehension: key concepts are extracted as you read, not after.
- Explainable AI: every answer can be traced to a page span with controls to refine.
- Consistency: a shared study model (SQ3R) across textbooks and articles.
- Accessibility: short chunks, quiz cadence; optional text-to-speech “listen” mode.
- Portability: Markdown exports drop into your note system.
AI is an accelerant, not a substitute. Skimreader keeps you in the loop — seeing more, sooner, with evidence at every step.
The Kali Software team



