HELGE SVERRE  |  All-stack Developer
Bergen, Norway  |  v13.0
Tofu and the missing-glyph box
The blank square that appears when a font has no glyph for a codepoint, where the term 'tofu' comes from, and how it differs from U+FFFD.

What tofu is

When a text renderer needs to draw a character but the font has no glyph for that codepoint, it draws a fallback shape instead. That fallback is most often a small rectangular box, sometimes with a diagonal slash or the hex codepoint inside it. Developers and typographers call this fallback tofu because the empty rectangle looks like a block of tofu.

Tofu is a rendering symptom, not a Unicode character. The underlying codepoint still exists in the byte stream. The rectangle is just what the font (or the renderer) substitutes when it cannot find a glyph to draw.

Where the word comes from

The term spread within type-design and internationalization communities in the 2000s as Unicode coverage expanded and unmapped codepoints became more common. Google's Noto font family takes its name from the phrase "no more tofu". The project's explicit goal is to provide glyphs for every assigned Unicode codepoint so that text set in Noto never has to fall back to the empty rectangle.

The .notdef glyph

Every TrueType and OpenType font is required to define a glyph at index 0, called .notdef (short for "not defined"). This is the glyph the renderer substitutes when it asks the font for a codepoint that has no mapping in the font's character-to-glyph table (the cmap table).
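The lookup described above can be modelled in a few lines. This is a minimal sketch rather than a real font parser: `TOY_CMAP` is a hypothetical stand-in for a font's character-to-glyph table, and `glyph_index_for` and `shape` are illustrative helper names, not part of any library.

```python
# Toy model of a font's character-to-glyph mapping (hypothetical data).
# Real fonts store this mapping in a binary table; a dict is enough to show the rule.
NOTDEF_INDEX = 0  # glyph index 0 is reserved for .notdef

TOY_CMAP = {
    ord("A"): 1,
    ord("B"): 2,
    ord("\u00e9"): 3,  # é
}


def glyph_index_for(codepoint: int, cmap: dict) -> int:
    """Return the glyph index for a codepoint, or 0 (.notdef) if it is unmapped."""
    return cmap.get(codepoint, NOTDEF_INDEX)


def shape(text: str, cmap: dict) -> list:
    """Map every character in the text to a glyph index."""
    return [glyph_index_for(ord(ch), cmap) for ch in text]


if __name__ == "__main__":
    # U+6F22 (漢) is not in the toy table, so it maps to glyph 0.
    print(shape("AB\u6f22", TOY_CMAP))  # [1, 2, 0]
```

Whatever the font designer drew at index 0 is what ends up on screen for the third character, unless the platform steps in with its own fallback.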

Font designers can draw .notdef however they want. Common designs include:

  • An empty rectangle (the classic tofu)
  • A rectangle with a diagonal slash through it
  • A rectangle with a question mark inside
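  • A rectangle containing the hex digits of the missing codepoint (in practice drawn by the renderer or a dedicated fallback font, since a single .notdef glyph cannot change per codepoint)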

The OpenType specification recommends that .notdef be visually distinctive so users can tell at a glance that something is missing. Some operating systems and browsers ignore the font's .notdef entirely and draw their own fallback, which is why the same missing character can render as different boxes on different platforms.

Different renderings

The shape of the box varies across the rendering stack:

  • macOS and iOS typically render tofu as a thin rectangle with a slash
  • Windows often uses a square containing the four hex digits of the codepoint
  • Chromium and Firefox consult a chain of fallback fonts before giving up and drawing their own box
  • PDF engines without font fallback (DomPDF, TCPDF in some configurations) draw whatever the font itself defines as .notdef

This is why a document that looks fine in the browser can render with tofu in a PDF: the browser silently swapped in a fallback font that had the missing glyph, while the PDF engine used only the font that was embedded.

Not the same as U+FFFD

U+FFFD (), the REPLACEMENT CHARACTER, is sometimes confused with tofu but is a different mechanism. U+FFFD is an actual Unicode codepoint used to mark a byte sequence that could not be decoded as valid UTF-8 (or UTF-16). It is inserted into the text stream by the decoder before the renderer ever sees the data.

Tofu, by contrast, happens at the rendering stage. The text is valid Unicode; the font just lacks a glyph for it. A string can show U+FFFD in one place (a decoding error) and tofu in another (a rendering error) for two different reasons.
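The two failure modes can be told apart in code. The sketch below uses Python's built-in UTF-8 decoder for the U+FFFD case. The tofu case is only simulated: `ASCII_ONLY` is a hypothetical coverage set standing in for a real font, and `would_be_tofu` is an illustrative helper name.

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def decode_lossy(raw: bytes) -> str:
    """Decode as UTF-8, substituting U+FFFD for undecodable bytes."""
    return raw.decode("utf-8", errors="replace")


def would_be_tofu(text: str, coverage: set) -> list:
    """Return the characters that a font with this coverage cannot draw."""
    return [ch for ch in text if ord(ch) not in coverage]


# Hypothetical font that only covers printable ASCII.
ASCII_ONLY = set(range(0x20, 0x7F))

# Decoding error: 0xE9 is Latin-1 for "é", but it is not valid UTF-8 on its own.
damaged = decode_lossy(b"caf\xe9")

# Rendering gap: the text is valid Unicode, the toy font just lacks U+00E9.
valid = "caf\u00e9"

if __name__ == "__main__":
    print(REPLACEMENT in damaged)            # True: the decoder inserted U+FFFD
    print(REPLACEMENT in valid)              # False: nothing is wrong with the text
    print(would_be_tofu(valid, ASCII_ONLY))  # ['é']
```

In the first case the original byte is gone and the data has to be re-read with the correct encoding. In the second case the data is intact and only the font needs to change.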

Font fallback

Modern operating systems and browsers reduce tofu through font fallback chains. When the primary font lacks a glyph, the renderer searches a list of secondary fonts and picks the first one that has it. macOS uses Core Text for this, Windows uses DirectWrite, and Linux typically uses fontconfig. The result is that a single visible string can be drawn from glyphs in several different fonts without the user noticing.
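As a rough illustration of the fallback search, the sketch below walks a chain of fonts per character and groups the text into runs by the font that was chosen. The font names and coverage sets in `FALLBACK_CHAIN` are made up for the example. Real platform fallback also weighs language, style, and script, which this toy version ignores.

```python
from itertools import groupby
from typing import Optional

# Hypothetical fonts and coverage sets, for illustration only.
FALLBACK_CHAIN = [
    ("PrimarySans", set(range(0x20, 0x7F))),        # printable ASCII
    ("FallbackCJK", set(range(0x4E00, 0xA000))),    # CJK Unified Ideographs block
    ("FallbackGreek", set(range(0x0370, 0x0400))),  # Greek and Coptic block
]


def pick_font(codepoint: int, chain: list) -> Optional[str]:
    """Return the first font in the chain that covers the codepoint, else None."""
    for name, coverage in chain:
        if codepoint in coverage:
            return name
    return None  # no font has it: this character would render as tofu


def split_into_runs(text: str, chain: list) -> list:
    """Group consecutive characters that resolve to the same font."""
    tagged = ((pick_font(ord(ch), chain), ch) for ch in text)
    return [
        (font, "".join(ch for _, ch in group))
        for font, group in groupby(tagged, key=lambda pair: pair[0])
    ]


if __name__ == "__main__":
    # "Hi" + 漢字 + α + ☃ (U+2603, covered by none of the toy fonts)
    for font, run in split_into_runs("Hi\u6f22\u5b57\u03b1\u2603", FALLBACK_CHAIN):
        print(font, repr(run))
```

The final run comes back with `None` as its font, which is the point at which a renderer would draw its box.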

PDF engines that embed only the fonts referenced in the document, with no system fallback, are the most common place to encounter tofu in 2026. The fix is either to embed fonts with broader coverage (the Noto family covers most living scripts) or to ensure the text reaching the engine only uses codepoints the embedded font supports.
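The second fix can be approached as a pre-flight check that runs before the text is handed to the PDF engine. The sketch below is one possible shape for such a check, under the assumption that the embedded font's coverage is already available as a set of codepoints. Here that set is a hard-coded placeholder; in a real pipeline it would have to be read from the font file itself. `uncovered_report` is an illustrative name.

```python
import unicodedata


def uncovered_report(text: str, coverage: set) -> list:
    """List each distinct unsupported codepoint as 'U+XXXX NAME', in first-seen order."""
    seen = []
    for ch in text:
        cp = ord(ch)
        if cp not in coverage and cp not in seen:
            seen.append(cp)
    return [
        f"U+{cp:04X} {unicodedata.name(chr(cp), '<unnamed>')}"
        for cp in seen
    ]


# Placeholder coverage: printable ASCII only (an assumption for the example).
EMBEDDED_FONT_COVERAGE = set(range(0x20, 0x7F))

if __name__ == "__main__":
    problems = uncovered_report("Price: 5\u20ac \u2713 5\u20ac", EMBEDDED_FONT_COVERAGE)
    for line in problems:
        print(line)
    # U+20AC EURO SIGN
    # U+2713 CHECK MARK
```

A non-empty report is a signal to either swap the embedded font or substitute the offending characters before rendering. Note that this simple version would also flag newlines and other control characters, so real input may need those filtered out first.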



