Accepting any correct reading

Reading practice has a fairness problem I kept bumping into. A single kanji often has several valid readings, and the input method a learner types with can quietly produce the character itself instead of the kana you asked for. If the grader is strict, you can be marked wrong for being right. So in Kanji Climb, my pathway-based trainer, reading mode is built to judge the answer rather than the typing.

Where the readings come from

Before any grading happens, the app has to know every legitimate answer for a character. The dataset is built once by a small Python merge step and shipped as a flat JSON file, app/kanji-dataset.json, about 240 KB covering 2,224 kanji. It stitches together two open datasets: a topological ordering of kanji by component dependency (so prerequisites always appear before the characters that need them), and KANJIDIC2-derived metadata for meanings, readings, and JLPT levels. Each entry is deliberately terse:

{ "c": "雨", "m": ["rain"], "on": ["う"], "kun": ["あめ","あま"],
  "deps": ["一","口"], "jlpt": 5, "grade": 1 }

The two reading fields are the heart of this post. on holds the on'yomi, the readings borrowed from Chinese; kun holds the kun'yomi, the native Japanese readings. Most characters carry several of each, and any of them can be the answer a learner has in mind. That is the first design fact I had to respect: there is rarely one right reading, there is a set.
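The ordering guarantee from the merge step (prerequisites before dependents) is easy to check mechanically. The sketch below is illustrative only: it assumes the JSON file is an array of entries in pathway order, and that a dep missing from the dataset (a bare component, say) should be skipped rather than reported. findOrderViolations is a hypothetical name, not part of the trainer.

```javascript
// Assumption: `entries` is the dataset as an array in pathway order,
// each entry shaped like the example above ({ c, deps, ... }).
function findOrderViolations(entries) {
  const known = new Set(entries.map(e => e.c));
  const seen = new Set();
  const violations = [];
  for (const e of entries) {
    for (const d of e.deps) {
      // Only deps that are themselves entries can be out of order;
      // components that are absent from the dataset are skipped.
      if (known.has(d) && !seen.has(d)) {
        violations.push({ kanji: e.c, missing: d });
      }
    }
    seen.add(e.c);
  }
  return violations;
}

const sample = [
  { c: "一", deps: [] },
  { c: "口", deps: [] },
  { c: "雨", deps: ["一", "口"] },
];

console.log(findOrderViolations(sample).length);                // → 0
console.log(findOrderViolations([...sample].reverse()).length); // → 2
```

An empty result means every character appears after the characters it depends on.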

The fairness problem

A grader that only accepts one stored reading is not really testing knowledge; it is testing whether the learner guessed which reading I happened to list first. That is unfair on its own. Then the keyboard makes it worse.

A Japanese IME is a conversion engine. You type kana and it offers to swap them for kanji. A learner who types the correct reading and accepts the suggestion can hand back the kanji glyph itself instead of the hiragana I expected. They knew the reading perfectly. The input method just got in the way. Marking that wrong is the kind of small unfairness that quietly erodes trust in a study tool, and trust is most of what keeps anyone coming back to a flashcard.

What counts as correct

So reading mode accepts two distinct shapes of answer, and the check itself is short enough to read in one sitting:

function checkAnswer(entry, raw) {
  const input = raw.trim();
  if (!input) return { kind: "idk" };
  // ... meaning mode ...
  // Reading: accept the kanji char itself (IME-converted),
  // any on/kun reading, in hiragana or katakana.
  if (input === entry.c) return { kind: "correct" };
  const norm = normalizeReading(input);
  for (const r of [...entry.on, ...entry.kun]) {
    if (normalizeReading(r) === norm) return { kind: "correct" };
  }
  return { kind: "wrong" };
}

Two things matter here. The first line of the reading branch, input === entry.c, is the IME escape hatch: if the learner ends up with the character on screen, that is strong evidence they reached the right reading, even if the path there ran through a conversion. The loop then spreads on and kun into a single list and checks the input against the whole set, not one canonical pick. The principle is easy to state and easy to forget while writing the comparison: know all of the right answers before ever telling someone they were wrong.

Normalizing before comparing

Accepting multiple readings is only half the work. The other half is making the comparison robust to the small ways input drifts. The trap I hit was katakana. A learner, or their keyboard, may produce a reading in katakana when my stored reading is in hiragana. Those are the same sounds in two scripts, and a naive string equality treats them as different strings and fails the answer.

Hiragana and katakana sit in adjacent, parallel blocks in Unicode, so the conversion is just a fixed offset. The katakana range I care about runs from U+30A1 to U+30F6, and the matching hiragana (U+3041 to U+3096) lives exactly 0x60 codepoints lower. Anything outside that range, including the long-vowel mark ー (U+30FC), passes through unchanged. I fold every katakana codepoint in the range down into its hiragana twin, then strip whitespace, so the comparison runs on sound rather than on which kana table the characters came from:

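function kataToHira(s) {
  let out = "";
  // for...of walks the string by codepoint, not by UTF-16 code unit.
  for (const ch of s) {
    const cp = ch.codePointAt(0);
    // Katakana ァ..ヶ (U+30A1..U+30F6) maps to the hiragana 0x60 lower.
    if (cp >= 0x30A1 && cp <= 0x30F6) out += String.fromCodePoint(cp - 0x60);
    else out += ch;
  }
  return out;
}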

function normalizeReading(s) {
  return kataToHira(s.trim()).replace(/\s+/g, "");
}

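The order in checkAnswer is the load-bearing part: trim, test for the kanji glyph, then normalize both the input and each stored reading before comparing, so neither side can differ by script or stray whitespace. Iterating with for...of walks the string by codepoint rather than by UTF-16 code unit, which keeps surrogate pairs intact. Kana all sit in the Basic Multilingual Plane, so this only matters when a rare character or an emoji sneaks into the input, which is the sort of detail that bites you only once. Once this runs, a learner can answer in hiragana, in katakana, or by surfacing the kanji, and a correct answer counts in all three.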
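To see the three accepted shapes side by side, here is a small self-contained sketch. It repeats kataToHira and normalizeReading from above and wraps only the reading branch in a hypothetical helper, checkReading, that returns a plain string instead of the { kind } object. The rain entry is the example from the dataset section.

```javascript
function kataToHira(s) {
  let out = "";
  for (const ch of s) {
    const cp = ch.codePointAt(0);
    if (cp >= 0x30A1 && cp <= 0x30F6) out += String.fromCodePoint(cp - 0x60);
    else out += ch;
  }
  return out;
}

function normalizeReading(s) {
  return kataToHira(s.trim()).replace(/\s+/g, "");
}

// Hypothetical helper: just the reading branch of checkAnswer.
function checkReading(entry, raw) {
  const input = raw.trim();
  if (!input) return "idk";
  if (input === entry.c) return "correct"; // IME handed back the kanji
  const norm = normalizeReading(input);
  const hit = [...entry.on, ...entry.kun].some(r => normalizeReading(r) === norm);
  return hit ? "correct" : "wrong";
}

const rain = { c: "雨", on: ["う"], kun: ["あめ", "あま"] };

// The offset at work: ア is U+30A2, あ is U+3042.
console.log("ア".codePointAt(0).toString(16)); // → 30a2
console.log(kataToHira("アメ"));                // → あめ

console.log(checkReading(rain, "あめ")); // → correct (hiragana)
console.log(checkReading(rain, " ウ ")); // → correct (katakana, padded)
console.log(checkReading(rain, "雨"));   // → correct (kanji glyph)
console.log(checkReading(rain, "ゆき")); // → wrong
console.log(checkReading(rain, "   "));  // → idk
```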

The meaning side gets the same care

Meaning mode has its own version of the same fairness problem, and the noise is English rather than kana. I lowercase, strip surrounding punctuation, and remove a leading article or infinitive marker ("to", "the", "a", "an") before comparing, because it felt unkind to lose a card over the word "to" in front of a verb. The dataset also lists meanings most-common-first, so the first entry is the answer a learner is most likely to reach for.
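The post does not show normalizeMeaning itself, so the following is only a sketch of the steps just described, under my own assumptions: "surrounding punctuation" means any leading or trailing run of characters that are neither letters nor digits, and at most one leading marker is removed.

```javascript
// Illustrative version only; the trainer's real normalizeMeaning may differ.
function normalizeMeaning(s) {
  return s
    .trim()
    .toLowerCase()
    // Strip punctuation at either end (anything that is not a letter or digit).
    .replace(/^[^\p{L}\p{N}]+|[^\p{L}\p{N}]+$/gu, "")
    // Drop one leading article or infinitive marker followed by whitespace.
    .replace(/^(to|the|an|a)\s+/, "");
}

console.log(normalizeMeaning("To see!"));   // → see
console.log(normalizeMeaning("The Rain.")); // → rain
console.log(normalizeMeaning("an apple"));  // → apple
console.log(normalizeMeaning("tower"));     // → tower
```

Requiring whitespace after the marker is what keeps a word like "tower" from losing its first two letters.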

There is one more concession. KANJIDIC glosses are often phrases like "see, look at". If a learner types just "see", a strict equality would fail them even though they clearly know the character. So after the exact check, I split each stored meaning on commas and spaces and accept the input if it appears as a whole token, gated to inputs of at least three characters so a stray "a" or "to" cannot match by accident:

if (norm.length >= 3) {
  for (const m of entry.m) {
    if (normalizeMeaning(m).split(/[, ]/).includes(norm))
      return { kind: "correct" };
  }
}
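A quick way to convince yourself the length gate behaves is to run the two-step check against the "see, look at" gloss. This is a sketch rather than the trainer's code: meaningMatches is a hypothetical wrapper, and normalizeMeaning is replaced by a trivial stand-in (trim and lowercase) so the block runs on its own.

```javascript
// Stand-in for the real normalizer, which is not shown in this post.
const normalizeMeaning = s => s.trim().toLowerCase();

// Hypothetical wrapper: exact match first, then whole-token match.
function meaningMatches(entry, raw) {
  const norm = normalizeMeaning(raw);
  if (entry.m.some(m => normalizeMeaning(m) === norm)) return true;
  if (norm.length < 3) return false; // the gate: short inputs never token-match
  return entry.m.some(m => normalizeMeaning(m).split(/[, ]/).includes(norm));
}

// Illustrative entry using the gloss from the paragraph above.
const entry = { m: ["see, look at"] };

console.log(meaningMatches(entry, "see, look at")); // → true  (exact)
console.log(meaningMatches(entry, "see"));          // → true  (whole token)
console.log(meaningMatches(entry, "look"));         // → true  (whole token)
console.log(meaningMatches(entry, "at"));           // → false (under 3 chars)
console.log(meaningMatches(entry, "seek"));         // → false (not a token)
```

Splitting "see, look at" on /[, ]/ leaves an empty string between the comma and the space, but the length gate means an empty or very short input never reaches the token comparison.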

Keeping the flow unbroken

Lenient grading only helps if the learner can keep typing without reaching for the mouse. One small detail does a lot of quiet work: in study view the answer box auto-refocuses on every click. A document-level click listener sends focus back to the input, with a guard so clicking a real control still works:

document.addEventListener("click", e => {
  if (state.view !== "study") return;
  const tag = (e.target.tagName || "").toLowerCase();
  if (tag === "button" || tag === "select" || tag === "input") return;
  els.answer.focus();
});

The view check matters because the trainer has a second screen, a pathway grid of all 2,224 kanji, where stealing focus would break clicking through the dependency graph. Inside study view, though, you can glance at the card, click anywhere to dismiss it, and the cursor is still waiting. The rhythm of practice never breaks.
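One possible refinement, offered as a sketch rather than as what the trainer does: pull the guard into a pure predicate so it can be tested without a browser, and walk up the ancestors so a click on an icon inside a button still counts as a click on the button. The names shouldRefocus and isInsideControl, the "pathway" view name, and the extra "textarea" and "a" tags are all assumptions for illustration.

```javascript
// Tags treated as real controls. "textarea" and "a" are assumptions;
// the guard in the post lists button, select, and input.
const CONTROL_TAGS = new Set(["button", "select", "input", "textarea", "a"]);

// True if the node, or any ancestor, is a control.
function isInsideControl(node) {
  for (let el = node; el; el = el.parentElement) {
    if (CONTROL_TAGS.has((el.tagName || "").toLowerCase())) return true;
  }
  return false;
}

// Hypothetical predicate: should this click send focus back to the answer box?
function shouldRefocus(view, target) {
  return view === "study" && !isInsideControl(target);
}

// Plain objects stand in for DOM elements so this runs outside a browser.
const button = { tagName: "BUTTON", parentElement: null };
const icon = { tagName: "SPAN", parentElement: button };
const card = { tagName: "DIV", parentElement: null };

console.log(shouldRefocus("study", card));   // → true
console.log(shouldRefocus("study", icon));   // → false (inside a button)
console.log(shouldRefocus("pathway", card)); // → false (not the study view)
```

In the page itself, the listener body would shrink to a single line: if shouldRefocus(state.view, e.target) is true, call els.answer.focus().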

When the answer is revealed, the card shows the full set: every on reading and every kun reading, joined and labeled, so the multiplicity lands at the moment it teaches. The learner sees not just whether they were right, but the other readings they could have given. That is the right time for that lesson, after the attempt, rather than cluttering the prompt before they have even tried.
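For illustration, the reveal line could be assembled along these lines. The label text and separators here are my assumptions, not the trainer's actual output format, and formatReadings is a hypothetical name.

```javascript
// Sketch: build one labeled line from an entry's on and kun readings.
// Empty lists are skipped so a kanji with no kun reading shows only "on".
function formatReadings(entry) {
  const parts = [];
  if (entry.on.length) parts.push(`on: ${entry.on.join(", ")}`);
  if (entry.kun.length) parts.push(`kun: ${entry.kun.join(", ")}`);
  return parts.join(" / ");
}

const rain = { c: "雨", on: ["う"], kun: ["あめ", "あま"] };
console.log(formatReadings(rain)); // → on: う / kun: あめ, あま
```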

The grader should never know fewer answers than the learner does. Anything less turns a memory test into a guessing game about my data.

Takeaways

When a domain has multiple correct answers, store all of them and check against the full set before judging anything wrong.
Normalize both the input and the stored answers into one canonical form before comparing. Folding katakana to hiragana is a fixed 0x60 codepoint shift, and iterating by codepoint rather than by UTF-16 code unit keeps surrogate pairs whole.
Accept the artifacts of the tool the learner uses. The kanji glyph itself is valid evidence of a right reading, so make it the first thing you accept.
Strip the noise that does not carry meaning, like leading articles, and allow whole-token matches on multi-word glosses so people are graded on knowledge, not formatting.
Protect the flow: auto-refocus the input in study view, but guard the focus steal so other screens stay usable. Generosity in grading only pays off if the typing stays smooth.
