Localizing the whole app into 38 languages

I localized the whole of Sequence, app and widgets alike, into 38 App Store locales. That sentence is short. The work behind it took a while to figure out, especially as one developer also trying to keep every store listing as honest as the binary. Here is what made it tractable, and a couple of things I would do differently.

A String Catalog as the single source of truth

The whole effort hangs on one file: Localizable.xcstrings, Apple's String Catalog format. Instead of a loose pile of per-language .strings files that drift out of sync, the catalog is a single JSON document Xcode understands natively. Sequence's app catalog currently holds 855 keys, with sourceLanguage set to en, and the widgets carry their own smaller catalog of 36 keys. Keeping the widget strings in a separate SequenceWidgets/Localizable.xcstrings matters because widgets run in a different extension target; sharing one catalog across targets would mean shipping every app string into the widget bundle.
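Because the catalog is plain JSON, its top-level shape can be inspected outside Xcode. The following is a minimal sketch, not Sequence's tooling: the CatalogSummary type, the summarize function, and the two-key sample document are all illustrative, and the model assumes only the top-level sourceLanguage and strings fields.

```swift
import Foundation

// Hypothetical minimal model: only the top-level fields needed for a count.
struct CatalogSummary: Decodable {
    // Per-key payloads are ignored here; an empty Decodable accepts any JSON object.
    struct Entry: Decodable {}

    let sourceLanguage: String
    let strings: [String: Entry]
}

// Illustrative two-key catalog, not Sequence's real file.
let sampleCatalog = """
{
  "sourceLanguage" : "en",
  "strings" : {
    "Rest" : { "comment" : "Card type name" },
    "%lld rounds" : { "comment" : "Interval card summary in action row" }
  },
  "version" : "1.0"
}
"""

// Decodes the catalog text into the summary model.
func summarize(_ json: String) throws -> CatalogSummary {
    try JSONDecoder().decode(CatalogSummary.self, from: Data(json.utf8))
}

if let summary = try? summarize(sampleCatalog) {
    print("\(summary.strings.count) keys, source language \(summary.sourceLanguage)")
}
```

Pointed at the real app and widget catalogs, the same count would be one way to confirm the 855 and 36 key figures independently of Xcode.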

One detail that confused me early: the number of languages in the catalog is not the same as the number of App Store locales. The catalog stores 36 base languages (de, fr, zh-Hans, and so on), but App Store Connect wants regional variants split out, so the store ships 38 locales by mapping one catalog language to several. English alone fans out to en-US, en-GB, and en-AU; Spanish covers es-ES and es-MX; French covers fr-FR and fr-CA; Portuguese covers pt-BR and pt-PT. The scripts span Latin, Cyrillic, CJK, and right-to-left (Arabic and Hebrew). That range is deliberate, because each script class breaks a different assumption. CJK changed my sense of how short a label can be. Right-to-left flips the whole layout. Picking a wide spread early surfaces those problems while the app is still small enough to fix them cheaply.
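The fan-out from catalog languages to store locales can be written down as a small lookup table. This sketch uses only the splits named above; the function names are hypothetical, and the fallback (a language with no listed split maps to a single locale of the same code) is an assumption rather than App Store Connect's real naming.

```swift
// Fan-out table built only from the splits named in the text.
let storeFanOut: [String: [String]] = [
    "en": ["en-US", "en-GB", "en-AU"],
    "es": ["es-ES", "es-MX"],
    "fr": ["fr-FR", "fr-CA"],
    "pt": ["pt-BR", "pt-PT"],
]

/// Returns the App Store locales that reuse one catalog language's strings.
/// Assumption: languages without a listed split map to one locale of the same code.
func storeLocales(forCatalogLanguage language: String) -> [String] {
    storeFanOut[language] ?? [language]
}

/// Reverse lookup: which catalog language feeds a given store locale.
func catalogLanguage(forStoreLocale locale: String) -> String {
    for (language, locales) in storeFanOut where locales.contains(locale) {
        return language
    }
    return locale
}

print(storeLocales(forCatalogLanguage: "en"))
print(catalogLanguage(forStoreLocale: "fr-CA"))
```

Keeping the table in one place means the screenshot and metadata steps can both ask the same question instead of each hard-coding the regional variants.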

Keeping context attached to every string

The thing that quietly trips up a localization is a translator staring at a bare word with no idea what it does. "Rest" is a noun and a verb. "Play" is a button, a verb, and a noun. A string table that ships only the English word ships that ambiguity to everyone downstream.

So I lean on String(localized:comment:) at the call site, with the comment carrying the intent. The interesting cases are the interpolated ones, where the format string itself has to be translatable and the comment explains the variables:

// Real summaries from SequenceOverviewView.swift
case .interval:
    return String(localized: "\(config.rounds) rounds",
                  comment: "Interval card summary in action row")
case .countUp where goal != nil:
    return String(localized: "Goal: \(formatDuration(goal))",
                  comment: "Count-up goal summary in action row")

// And the count-based key, stored as %lld in the catalog
"%lld×%lld (%lld total)"  // comment: "Repetition card summary: sets×reps (total)"

The comment travels with the key straight into the catalog, so whoever translates "rounds" knows it is a count of interval rounds, not a shape. Swift's string interpolation compiles down to a printf-style format key (%lld for the 64-bit counts), which is exactly what the catalog stores. The payoff is that any translation can reorder the arguments with positional specifiers such as %2$lld, since CJK and several European languages do not keep "sets before reps" in the same order English does.
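To make the reordering concrete, here is a small sketch using Foundation's String(format:) directly. The reordered format string is made up for illustration and is not one of Sequence's translations; the repetitionSummary helper is likewise hypothetical.

```swift
import Foundation

// The key as stored in the catalog. English order: sets, reps, total.
let englishFormat = "%lld×%lld (%lld total)"
// A made-up reordering that leads with the total; %n$ selects the n-th argument.
let reorderedFormat = "%3$lld total (%1$lld×%2$lld)"

// Fills either format with the same three arguments in the same order.
func repetitionSummary(format: String, sets: Int, reps: Int) -> String {
    String(format: format, sets, reps, sets * reps)
}

print(repetitionSummary(format: englishFormat, sets: 3, reps: 10))    // 3×10 (30 total)
print(repetitionSummary(format: reorderedFormat, sets: 3, reps: 10))  // 30 total (3×10)
```

The call site passes the arguments in one fixed order; only the translated format string decides where each one lands.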

Coverage that stays complete as features land

The tricky part of localization is not the first pass. It is the second hundred passes, as features keep shipping and new strings quietly slip in untranslated. An app is never "done" being localized; it is only localized as of the last build.

The String Catalog earns its keep here because every translation carries an explicit state. Across Sequence's catalog the vast majority of entries sit at state: "translated", with a handful at "new" waiting on me, and Xcode also stamps an extractionState on the source key itself. When I rename an English string, the old key is no longer found in code, so the next extraction flips it to extractionState: "stale", which is the catalog's way of saying "the source moved, go re-check the 35 translations that descended from it." That single flag is what stops translations from silently rotting:

"Primary Action" : {
  "extractionState" : "stale",   // source text changed since last extract
  "localizations" : {
    "de" : { "stringUnit" : { "state" : "translated", "value" : "…" } }
  }
}

When I expanded the app from four card types to eight, adding Rest, Interval, Checklist, and Random Duration, the new strings showed up immediately as untranslated gaps rather than as English text leaking into a Japanese build three weeks later. The build also surfaces missing entries, which is how commit "Fix build warnings and add missing localization entries" happened: the compiler treating gaps as something worth flagging is the feedback loop that makes full coverage a habit rather than a one-time sprint.

New strings appear as untranslated in the catalog the moment they are referenced in code.
A changed source string marks the key stale, so dependent translations get re-reviewed.
The whole state is one screen in Xcode with per-language completion, not a spreadsheet reconciliation exercise.
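The same state can also be checked outside Xcode, for example in a script before a release. The sketch below is an assumption-heavy illustration rather than part of Sequence's build: the Catalog model covers only the fields shown above, the audit function and CoverageReport type are hypothetical, and the sample values are placeholders.

```swift
import Foundation

// Minimal model of the catalog fields discussed above (assumed shape).
struct Catalog: Decodable {
    struct StringUnit: Decodable { let state: String; let value: String }
    struct Localization: Decodable { let stringUnit: StringUnit? }
    struct Entry: Decodable {
        let extractionState: String?
        let localizations: [String: Localization]?
    }
    let sourceLanguage: String
    let strings: [String: Entry]
}

struct CoverageReport {
    var staleKeys: [String] = []
    var missing: [String: [String]] = [:]   // language -> keys not yet "translated"
}

// Walks every key once, collecting stale keys and per-language gaps.
func audit(_ catalog: Catalog, languages: [String]) -> CoverageReport {
    var report = CoverageReport()
    for (key, entry) in catalog.strings.sorted(by: { $0.key < $1.key }) {
        if entry.extractionState == "stale" { report.staleKeys.append(key) }
        for language in languages where language != catalog.sourceLanguage {
            let state = entry.localizations?[language]?.stringUnit?.state
            if state != "translated" { report.missing[language, default: []].append(key) }
        }
    }
    return report
}

// Placeholder catalog: one stale key translated into German, one key still "new".
let sample = """
{
  "sourceLanguage" : "en",
  "strings" : {
    "Primary Action" : {
      "extractionState" : "stale",
      "localizations" : {
        "de" : { "stringUnit" : { "state" : "translated", "value" : "DE-PLACEHOLDER" } }
      }
    },
    "Rest" : {
      "localizations" : {
        "de" : { "stringUnit" : { "state" : "new", "value" : "" } }
      }
    }
  }
}
"""

if let catalog = try? JSONDecoder().decode(Catalog.self, from: Data(sample.utf8)) {
    let report = audit(catalog, languages: ["de", "ja"])
    print("stale:", report.staleKeys)
    print("missing:", report.missing)
}
```

A check like this could fail a release build when the missing list is non-empty, which turns the catalog's flags into a gate rather than a dashboard.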

The screenshot pipeline that makes it sustainable

A localized binary is only half the job. The App Store listing has its own screenshots, and a screenshot full of English text inside an otherwise translated store page feels a little off. Multiply that by every device class and every locale and the manual version of this task is just not realistic for one person.

I drive the capture through Fastlane's snapshot, which runs a dedicated UI test target. The SequenceScreenshots XCTestCase walks the app through a fixed lineup of eight hero screens: the sequences list, a sequence overview, the player countdown, a rest card, an interactive checklist, the completion screen, the template picker, and the AI generation flow. Each test taps through real UI using accessibility identifiers, then calls snapshot() at the right moment:

import XCTest

class SequenceScreenshots: XCTestCase {
    let app = XCUIApplication()

    override func setUpWithError() throws {
        continueAfterFailure = false
        app.launchArguments += ["-UITesting"]   // seed demo data, skip onboarding
        setupSnapshot(app)                      // from fastlane's SnapshotHelper.swift
        app.launch()
    }

    func test02_SequenceOverview() {
        app.staticTexts["Morning Routine"].tap()
        let start = app.buttons["StartSequenceButton"]   // a11y id, language-agnostic
        // Fail fast instead of capturing a half-loaded screen.
        XCTAssertTrue(start.waitForExistence(timeout: 5))
        snapshot("02_SequenceOverview")
    }
}

The -UITesting launch argument is the quiet hero. It tells the app to seed a known set of demo sequences ("Morning Routine," "HIIT Workout") and skip onboarding, so every locale renders identical content. The other half is accessibility identifiers: the test taps a button by its identifier rather than its visible label. Querying app.buttons["StartSequenceButton"] instead of app.buttons["Start"] is what keeps the same test script working when the button reads "Iniciar" or "開始."
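On the app side, the flag only needs a small check at startup. This is a rough sketch of one way to gate the demo data, not Sequence's actual code: DemoSequence, isUITesting, and initialSequences are hypothetical names, and only the two sequence names come from the text above.

```swift
import Foundation

// Hypothetical model type; Sequence's real data model is not shown in the post.
struct DemoSequence: Equatable {
    let name: String
}

/// True when the process was launched with the screenshot flag.
func isUITesting(arguments: [String] = ProcessInfo.processInfo.arguments) -> Bool {
    arguments.contains("-UITesting")
}

/// Returns fixed demo content under -UITesting, otherwise whatever the user has stored.
func initialSequences(arguments: [String], stored: [DemoSequence]) -> [DemoSequence] {
    guard isUITesting(arguments: arguments) else { return stored }
    return [DemoSequence(name: "Morning Routine"), DemoSequence(name: "HIIT Workout")]
}

let seeded = initialSequences(arguments: ["Sequence", "-UITesting"], stored: [])
print(seeded.map(\.name))
```

Passing the argument list in as a parameter keeps the gate testable without launching the app, and a normal launch never touches the demo data.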

The Snapfile pins the device matrix to the two sizes Apple actually requires, the 6.9" iPhone 17 Pro Max and the 13" iPad Pro, and lists all 38 locales. Two lanes cover the two jobs:

fastlane screenshots captures every device class across all 38 locales for a release, unattended.
fastlane screenshots_lang lang:en-US captures a single locale for fast iteration, when I am tuning one screen and do not want to wait on the full matrix.

A further lane runs the raw captures through a frame_screenshots.py step that drops each shot into a device bezel with marketing copy. Of the two capture modes, single-language is the one that keeps things manageable day to day. Being able to render just one locale in a couple of minutes means I actually look at the screenshots while building, instead of discovering a clipped German label the night before submission.
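The gap between the two capture modes is easy to put a number on. The arithmetic below assumes exactly one capture per hero screen per device per locale, which the post implies but does not state; the device labels are just names for counting.

```swift
// Counts taken from the text: 8 hero screens, 2 required device sizes, 38 locales.
let heroScreens = 8
let devices = ["iPhone 17 Pro Max", "13-inch iPad Pro"]
let localeCount = 38

// Full release run versus a single-locale iteration run.
let fullRun = heroScreens * devices.count * localeCount   // 8 * 2 * 38 = 608
let singleLocaleRun = heroScreens * devices.count          // 8 * 2 = 16

print("full matrix: \(fullRun) captures, single locale: \(singleLocaleRun)")
```

Under that assumption the single-locale lane does 1/38 of the work of a full run, which is why it is the one worth using while a screen is still changing.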

A fully localized App Store presence is not a feature you bolt on at the end. It is a pipeline you build once so that shipping in 38 locales costs about the same as shipping in one.

What I would tell myself at the start

Adopt the String Catalog from day one. The per-language completion state and the stale flag are what turn localization from a guess into a checklist.
Always pass a comment, especially on interpolated strings. The cost is one argument; the payoff is translators who know what each %lld counts and what a button does, instead of guessing.
Split the widget catalog from the app catalog. They live in different targets, and you do not want every app string riding along in the widget bundle.
Drive screenshots by accessibility identifier, never by visible text, and gate demo data behind a launch argument like -UITesting so every locale renders the same content.
Pick a script range that hurts early: CJK for length, right-to-left for layout. The pain is cheaper to fix while the app is small.

None of these steps is clever on its own. Together, though, they were the difference between an app that happens to have a few translations and one that feels at home in 38 places at once. The thing I keep coming back to: the value is not the 38 languages, it is that adding the 39th would now be almost free.
