Day 47 - Question Answering Modes

Date: 2025-11-11 (Tuesday)
Status: “Done”

Context-Based vs. Closed-Book QA

Two common QA setups can share the same transformer backbone but differ in their inputs, training signal, and evaluation.

Context-Based (Open Book)

  • Input: question + supporting context paragraph(s).
  • Output: span extraction or short generation grounded in context.
  • Training: supervised span labels (e.g., start/end indices) or seq2seq with the context in the input; see the sketch after this list.
  • Failure mode: wrong or missing span when context is noisy.
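
A minimal sketch of the open-book setup, assuming the Hugging Face transformers question-answering pipeline; the model name, context, and question are placeholders rather than real domain data.

```python
# Minimal open-book (extractive) QA sketch using the transformers pipeline.
# Model name and example text are placeholders; swap in your domain data.
from transformers import pipeline

qa = pipeline("question-answering", model="deepset/roberta-base-squad2")

context = (
    "The support portal was migrated to the new ticketing system in March. "
    "Legacy tickets remain read-only in the archive."
)
question = "When was the support portal migrated?"

result = qa(question=question, context=context)
# The pipeline returns the extracted span plus its character offsets and a confidence score.
print(result["answer"], result["start"], result["end"], result["score"])
```

Because the answer is a span of the provided text, the character offsets make it straightforward to cite or highlight the grounding passage.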

Closed-Book

  • Input: question only; the model must rely on internal knowledge.
  • Output: generated answer without explicit context.
  • Training: language modeling style on large corpora; often fine-tuned on QA pairs.
  • Failure mode: hallucination; mitigated by stronger pre-training, fine-tuning on curated QA pairs, or adding retrieval (see the generation sketch after this list).
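
A minimal closed-book sketch, assuming an instruction-tuned seq2seq model served through the transformers text2text-generation pipeline; the model choice (google/flan-t5-base) and the question are illustrative placeholders.

```python
# Minimal closed-book sketch: the model answers from its parameters alone, no context supplied.
# Model choice is a placeholder; any instruction-tuned seq2seq model works the same way here.
from transformers import pipeline

generator = pipeline("text2text-generation", model="google/flan-t5-base")

question = "Answer the question: Who wrote the paper that introduced the Transformer?"
output = generator(question, max_new_tokens=32)

# No grounding text is available, so treat the generation as a hypothesis to verify.
print(output[0]["generated_text"])
```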

Picking the Mode

  • If you can supply documents at runtime -> prefer context-based (more controllable and citable).
  • If latency/storage prevents retrieving context -> closed-book is lighter but riskier.

Evaluation Notes

  • Context-based: exact match / F1 over spans (metric sketch after this list); check grounding to the provided text.
  • Closed-book: exact match or F1 for short answers, ROUGE for longer generations, plus human factuality checks; add retrieval if drift is high.
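
A small pure-Python sketch of SQuAD-style exact match and token-level F1; the normalization (lowercasing, stripping punctuation and articles) follows the common convention, and the example strings are made up.

```python
# SQuAD-style exact match and token-level F1 for span answers (pure Python sketch).
import re
import string
from collections import Counter

def normalize(text: str) -> str:
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction: str, reference: str) -> float:
    return float(normalize(prediction) == normalize(reference))

def token_f1(prediction: str, reference: str) -> float:
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    common = Counter(pred_tokens) & Counter(ref_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

print(exact_match("the March release", "March release"))  # 1.0 after normalization
print(token_f1("migrated in March", "March"))             # 0.5, partial credit
```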

Practice Targets for Today

  • Draft examples for both modes using your domain data.
  • Define metrics you will track (span EM/F1 vs. generative ROUGE/factuality).
  • List retrieval options to upgrade closed-book to open-book if needed (see the retrieval sketch below).
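
A rough sketch of that upgrade path: a simple TF-IDF retriever (scikit-learn) picks the most similar passage, which is then handed to an extractive reader as context. The corpus, question, and model name are placeholders; a dense retriever or a vector store would be a natural next step.

```python
# Sketch of upgrading closed-book to open-book: TF-IDF retrieval feeds an extractive reader.
# Corpus, question, and model name are placeholders for your domain data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from transformers import pipeline

corpus = [
    "The support portal was migrated to the new ticketing system in March.",
    "Refunds are processed within five business days of approval.",
    "The on-call rotation changes every Monday at 09:00 UTC.",
]
question = "When does the on-call rotation change?"

# Rank passages by TF-IDF cosine similarity to the question and keep the best one.
vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(corpus)
question_vector = vectorizer.transform([question])
best_idx = cosine_similarity(question_vector, doc_vectors).argmax()

# Hand the top passage to an extractive reader, turning the setup into open-book QA.
reader = pipeline("question-answering", model="deepset/roberta-base-squad2")
result = reader(question=question, context=corpus[best_idx])
print(corpus[best_idx])
print(result["answer"])
```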