Web Typography

My notes on best practices for web typography for multilingual content.

Font size

The recommended size for body text is 16px (12pt). But do not set this value in pixels or points. Set it as 100% instead (font-size: 100%;). In most browsers, this resolves to 16 pixels by default. Inheriting the base font size from the browser allows users to set a comfortable font size through their browser preferences.

Then we can use other relative units (em or rem) to set font sizes for other elements. This is crucial because it means that changing the base font size will also change all other font sizes.
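
A minimal sketch of this setup follows. The ratios and the .caption class are illustrative assumptions, not values taken from these notes.

```css
/* Inherit the user's preferred base size (16px in most browsers by default). */
html {
  font-size: 100%;
}

/* Body text follows the base size. */
body {
  font-size: 1rem;
}

/* Headings scale relative to the base size. The ratios are only examples. */
h1 {
  font-size: 2rem;
}

h2 {
  font-size: 1.5rem;
}

/* Hypothetical class for captions; em is relative to the parent's font size. */
.caption {
  font-size: 0.875em;
}
```

With this in place, a user who raises the browser default to 20px gets 20px body text and a 40px h1 without any further changes.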

The body text of the page should respect the browser setting. (Note: Wikipedia does not do that, even though it is recommended in the style guide.)

As per https://accessibility.digital.gov/visual-design/typography/: Use a large enough font size for body text so that people can comfortably read. Use at least an effective size of 16px, but this can vary depending on the design of the font.

Accessibility

  • At age 40, only half as much light reaches the retina as at age 20. For 60-year-olds, it’s just 20%. People in their 60s need three times more light for comfortable reading than those in their 20s. (Reference)

  • Nearly 9% of Americans are visually impaired, meaning their vision cannot be completely corrected with lenses.

  • Most people, when sitting comfortably, are about 20 to 23 inches from their computer screens. In fact, 28 inches is the recommended distance, because this is where vergence is sufficiently low to avoid eye strain. This is much further than the distance at which we read printed text; most people do not hold magazines at arm’s length! (source)

  • Accounting for reading distance, 16-pixel text on a screen is about the same size as text printed in a book or magazine. Because we read books pretty close, often only a few inches away, they are typically set at about 10 points. If you were to read them at arm’s length, you’d want at least 12 points, which is about the same size as 16 pixels on most screens. (source)

  • People don't zoom. The people who most need to increase font size are those aged 65 and over, which is the group least likely to be skilled enough to have adjusted settings. (source)

Script characteristics

Depending on the characteristics of a script, specifically the nature of its glyphs, the Latin defaults for font metrics will need adjustment. The following are two classifications of scripts based on their glyphs.

Tall scripts

Languages that require extra line height to accommodate taller glyphs, including the South and Southeast Asian and Middle Eastern languages listed below (the list is incomplete).

  • ar - Arabic

  • bn - Bengali

  • ml - Malayalam - This script has vertical stacking, so the ratio of line height to font size should be chosen so that no part of a glyph is clipped.

  • fa - Persian

  • gu - Gujarati

  • hi - Hindi

  • mr - Marathi

  • as - Assamese

  • km - Khmer

  • kn - Kannada

  • my - Myanmar

  • ne - Nepali

  • pa - Panjabi

  • si - Sinhala

  • ta - Tamil

  • te - Telugu

  • th - Thai

  • ur - Urdu

  • ko - Korean - The glyph height is mostly the same as Latin, but the glyph density is higher. Slightly more line spacing is expected.

  • vi - Vietnamese - Even though the script used is Latin, it has many diacritic marks that are specific to Vietnamese.

  • bo - Tibetan - This script has multi-level vertical stacking and no word spacing. bo.wikipedia.org increased the default font size to 125% (20px) using custom styling.

Dense scripts

Scripts whose glyphs have more complex strokes than Latin need a larger font size for legibility. Chinese and the Indic scripts are some examples. Most of these scripts are legible at 12pt (16px). Anything smaller makes them hard to read.

  • bn - Bengali - Glyphs are very dense, with strokes joining at acute angles. Needs a minimum font size of 12pt/16px. Script example

  • ml - Malayalam

  • gu - Gujarati

  • hi - Hindi

  • mr - Marathi

  • as - Assamese

  • km - Khmer

  • kn - Kannada

  • my - Myanmar

  • ne - Nepali

  • pa - Panjabi

  • si - Sinhala

  • ta - Tamil

  • te - Telugu

  • th - Thai

  • lo - Lao

  • ur - Urdu

  • vi - Vietnamese

  • bo - Tibetan

  • zh - Chinese - High script density, which requires a larger font size. The Chinese Wikipedia sets a larger font size using custom styling.

  • ja - Japanese - High script density, which requires a larger font size. The Japanese Wikipedia sets a larger font size using custom styling.

A script can be tall and dense at the same time. Such scripts need both a larger font size and larger line spacing.

Google's Material Design guidelines group scripts in a similar way.
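
One way to apply these adjustments is to key them off the page language. The sketch below assumes the root element carries a lang attribute. The line-height and font-size values are illustrative assumptions, apart from the 125% for Tibetan, which mirrors the bo.wikipedia.org figure mentioned above.

```css
/* Tall scripts: extra line height so stacked glyphs are not clipped.
   The value 1.8 is an assumption; tune it per font. */
html:lang(ml),
html:lang(ar),
html:lang(th) {
  line-height: 1.8;
}

/* Dense scripts: a larger base size. Percentages keep the user's
   browser preference as the starting point. */
html:lang(zh),
html:lang(ja) {
  font-size: 112.5%;
}

/* Tall and dense: both adjustments together. */
html:lang(bo) {
  font-size: 125%;
  line-height: 2;
}
```

The rules target html rather than every element matching :lang(), so headings and other elements sized in rem or em keep their relative scale.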

Font selection

Selecting a font stack that works across all operating systems, browsers, and their versions, for all the scripts we want to support, is a hard problem. It is harder because we don't have a clear definition of what works well. I propose two principles for default fonts.

Respect the native fonts set by users. Do not set a specific font for any platform. If the platform provides options to change the default fonts, allow users to use that feature to consume our content. Just set a generic family such as sans-serif as the font stack. Simple and stupid.

OR

Enforce a typographic identity. Use webfonts to enforce a brand typography without relying on platform fonts or allowing users to override them. Choose the best fonts for each script wisely, embed them, and take care of the performance aspects.

Anything in between, such as guessing the availability of fonts and using long font stacks, does not work when targeting a multi-script, multilingual audience. The concept of fonts being universally available on a platform, and relying on them in a font stack, is very fragile for non-English content. To make it worse, the default proprietary fonts that some operating systems ship for some scripts are badly broken and aesthetically ugly. They are ugly because they had the same constraints as Noto (see below) and follow UI metrics guidelines that are designed as one set of metrics for all scripts.
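
The two principles can be sketched as follows. They are alternatives, not meant to be combined. The font name "Brand Sans Malayalam" and the file path are hypothetical placeholders.

```css
/* Option 1: respect the user's native fonts. */
body {
  font-family: sans-serif;
}

/* Option 2: enforce a typographic identity with an embedded webfont.
   The family name and URL below are hypothetical. */
@font-face {
  font-family: "Brand Sans Malayalam";
  src: url("/fonts/brand-sans-malayalam.woff2") format("woff2");
  /* Show fallback text while the font loads, to limit the performance cost. */
  font-display: swap;
  /* Download only when the page contains Malayalam-block characters
     or the zero-width joiners the script relies on. */
  unicode-range: U+0D00-0D7F, U+200C-200D;
}

:lang(ml) {
  font-family: "Brand Sans Malayalam", sans-serif;
}
```

In option 2 the stack still ends in the generic sans-serif family, so text remains readable if the webfont fails to load.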

Here is a quote from John Hudson about the constraints of developing type for Windows and how they affected the Malayalam font Nirmala UI. (Source: "Nirmala Malayalam, another mindless type design", a debate between Hashim PM and John Hudson, the designer of the Nirmala and Kartika fonts, 28 July 2013. https://web.archive.org/web/20190921104444/http://www.typophile.com:80/node/105005)

The fixed restrictions of the UI metrics was the primary, non-negotiable term in the Nirmala UI design brief: whatever we did had to fit within the vertical metrics of the Segoe UI and other UI fonts. The core target size for UI use, despite the increase in screen resolutions on many Win8 devices, is still 9pt at 96ppi, i.e. 12 ppem, with some Office UI items displaying at 8pt (with further restrictions on ppi height through VDMX adjustments at some sizes). At 12 ppem, we have exactly 3 pixels below the baseline before we hit the OS/2 WinDescent limit, beyond which glyphs will be clipped. Many of the Indian writing systems make significant use of the space below the baseline, so we had to employ a number of strategies to squeeze subjoined letters and other descending shapes into the UI metrics. The results are not all pleasant, and some contravene the norms of these writing systems, achieving only a legible decipherability, rather than true readability.

Noto

Noto is a font family comprising over 100 individual fonts, which together are designed to cover all the scripts encoded in the Unicode standard. The Noto family is designed with the goal of achieving visual harmony (e.g., compatible heights and stroke thicknesses) across multiple languages and scripts. This multi-script requirement is hard to achieve, and Noto tried its best. But note that typefaces designed specifically for a single script do not have that constraint, and in general they tend to be more aesthetically true to the script.

  • Noto is the default sans-serif font on Android phones (as shipped by Google), so there is no need to mention it in the font stack for them.

  • Noto is not installed by default on any desktop operating system. Each comes with its own native fonts as the default sans-serif and serif fonts. So adding Noto to the font stack does not help there either.

  • The smartphone market in South Asian countries is dominated by Xiaomi and others, who ship their own theming and allow users to customize the default UI fonts.

  • Noto is designed as a graceful final fallback for all scripts. Using it as the "choice" is a lazy solution. Not mentioning it in the font stack gives the same result.

Hyphenation

If text is justified, it is important to hyphenate it to avoid large gaps between words.

Recent versions of Chromium-based browsers have built-in hyphenation for many languages. The HTML elements should declare their language using the lang attribute. Then use the following CSS:

text-align: justify;
hyphens: auto;

Recommendations

  • It is better not to justify text if hyphenation support is not available. For other browsers or older versions, https://github.com/mnater/Hyphenopoly can be used. This library uses the hyphenation patterns I authored for Indic languages.

  • By default, a visible hyphen is rendered at the point where a word is broken, but that is not always optimal. Several scripts prefer no visible hyphenation character at the word break. This can be controlled with the hyphenate-character CSS property, but browser support for it has historically been uneven. -webkit-hyphenate-character: ''; works in WebKit browsers.

  • Refer to https://www.w3.org/TR/typography/#hyphenation (to which I contributed content and examples).

Android: Android comes with hyphenation support. For Indic languages, it uses the hyphenation patterns I authored.
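
The recommendations above can be combined into a sketch like the following. It assumes paragraphs are the justified elements and uses Malayalam as an example of a script that prefers no visible hyphen. Both choices are assumptions made for illustration.

```css
/* Default: do not justify. */
p {
  text-align: start;
}

/* Justify only where the browser understands hyphens: auto.
   Note: this checks that the property is supported, not that a
   hyphenation dictionary exists for the content language. */
@supports (hyphens: auto) {
  p {
    text-align: justify;
    hyphens: auto;
  }
}

/* Break words without drawing a hyphen. The prefixed form is kept
   for WebKit browsers that lack the unprefixed property. */
:lang(ml) {
  -webkit-hyphenate-character: "";
  hyphenate-character: "";
}
```

The :lang() selector only matches when the markup declares the language, for example with lang="ml" on the element or one of its ancestors.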

Counters

There are predefined counter styles for lists, but that does not mean they are the preferred counter styles for the script. For example, there is a predefined counter style malayalam that uses Malayalam numerals, but they are almost never used. Devanagari counter styles are not used by all languages written in Devanagari. For example, Hindi prefers the default 1, 2, 3 style. This is not without debate: the Hindi Wikipedia has a prominent dropdown at the top to choose the preferred number style (which makes the interface quite bad, in my opinion).
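
A minimal sketch of this preference: keep decimal counters as the default and make the script-specific styles opt-in. The .native-numerals class is a hypothetical name.

```css
/* Default to decimal counters (1, 2, 3) even though script-specific
   predefined counter styles exist. */
ol:lang(ml),
ol:lang(hi) {
  list-style-type: decimal;
}

/* Hypothetical opt-in class for content that wants native numerals. */
ol.native-numerals:lang(ml) {
  list-style-type: malayalam;
}

ol.native-numerals:lang(hi) {
  list-style-type: devanagari;
}
```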

OpenType features

Tabular numbers

Fixed-width numbers are useful for tabular data, where comparing columns across rows is desired. In CSS, this can be done by adding the style font-feature-settings: "tnum";. Enabling this for tables is recommended. A good-quality typeface will have this implemented, and it helps a lot with quickly scanning and analysing data.
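
As a sketch, the feature can be enabled for all tables. The font-variant-numeric form shown first is the higher-level CSS property for the same OpenType feature, and the low-level form from the note above is shown as a fallback.

```css
/* Low-level form, used only where the high-level property is unsupported. */
@supports not (font-variant-numeric: tabular-nums) {
  table {
    font-feature-settings: "tnum";
  }
}

/* High-level form: fixed-width digits so columns of numbers line up. */
table {
  font-variant-numeric: tabular-nums;
}
```

Either form has an effect only if the font in use actually implements the tnum feature.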

Fractions

This feature is contextually sensitive and converts "words" of numbers separated by a forward slash into proper fractions. Good-quality fonts have these OpenType rules. In CSS, this can be done by adding the style font-feature-settings: "frac";
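
A small sketch, assuming a hypothetical .fraction class wrapped around text such as 1/2 or 3/4. Scoping the feature to a class keeps it from touching other slash-separated numbers on the page.

```css
/* Hypothetical class applied to spans that contain fractions. */
.fraction {
  font-feature-settings: "frac";
}
```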

Avoid using stylistic sets such as ss01, which are font-specific. Applying them across a font fallback chain results in unexpected behaviour. Example

Smart quotes

CSS properties that should not be used

  • font-smoothing - Badly specified, and it gives unexpected results depending on the background and foreground colors. Pick a good font instead.

  • font-stretch - Don't override the designer of the font. The results won't be pretty, especially for complex scripts.

  • font-size-adjust - Don't override the designer of the font. The results won't be pretty, especially for complex scripts.

  • :dir() - The :dir() CSS pseudo-class matches elements based on the directionality of the text contained in them. Be aware that the behavior of the :dir() pseudo-class is not equivalent to the [dir=…] attribute selectors. The latter match the HTML dir attribute and ignore elements that lack it, even if they inherit a direction from their parent. (Similarly, [dir=rtl] and [dir=ltr] won't match the auto value.) In contrast, :dir() matches the value calculated by the user agent, even if inherited. It doesn’t have great support and is considered experimental.

  • direction - The direction CSS property sets the direction of text, table columns, and horizontal overflow. Use rtl for languages written from right to left (like Hebrew or Arabic), and ltr for those written from left to right (like English and most other languages). Note that text direction is usually defined within a document (e.g., with HTML's dir attribute) rather than through direct use of the direction property.

  • letter-spacing - Don't change this to support complex scripts. Just use the defaults.

  • font-kerning - Kerning is enabled by default in browsers. Don't play with it.
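
To make the difference between the dir attribute selectors and :dir() from the list above concrete, here is a small comparison. The .note class is a hypothetical example.

```css
/* Attribute selectors: match .note only when it carries, or sits inside
   an element that carries, an explicit dir="rtl" attribute in the markup. */
[dir="rtl"] .note,
.note[dir="rtl"] {
  border-right: 4px solid currentColor;
}

/* For comparison only: :dir() matches on the direction the browser
   computes, including a direction inherited from an ancestor or
   resolved from dir="auto". */
.note:dir(rtl) {
  border-right: 4px solid currentColor;
}
```

The attribute form depends on direction being declared in the document, which is where the list above says it usually belongs.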

Text decoration

Italic

Italic, or even slanted, text is not universal. Due to Latin influence, some non-Latin scripts have started using it, but such styles are mostly artificial. Typefaces for non-Latin scripts do not always come with italic variants. The issue is that browsers and editors create a synthetic italic in those cases, also known as faux italic. It is a forced 20 degree skew applied to the glyphs, and it does not suit the script.
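
One way to stop the browser from synthesizing italics is the font-synthesis property. It is not mentioned in these notes, so treat this as an assumption about a suitable remedy. Malayalam is used as the example language.

```css
/* Allow only synthetic bold; synthetic italic (and small caps) are turned off.
   If the font has no real italic, the text stays upright instead of skewed. */
:lang(ml) {
  font-synthesis: weight;
}
```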

Underline

For emphasis, avoid using underlines. If an underline is required, be aware that glyph heights vary by script, and the text may look as if it has been struck through. ഇതുപോലെ is a Malayalam example; the script has bottom tails that already look like underlines. An additional underline would cut through the glyphs. Because of this, browsers now detect these boundaries and avoid drawing link underlines over glyphs. The CSS property text-decoration-skip-ink: auto; (originally drafted as text-decoration-skip: ink;) can be used to skip glyphs that cross the underline. MDN documentation
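
A minimal sketch for link underlines follows. The offset value and the choice of Malayalam are illustrative assumptions, and text-underline-offset is an additional property not covered in these notes.

```css
/* Let the underline break where glyphs cross it. */
a {
  text-decoration-line: underline;
  text-decoration-skip-ink: auto;
}

/* For scripts with deep bottom tails, move the underline further down.
   The 0.25em offset is only an example value. */
:lang(ml) a {
  text-underline-offset: 0.25em;
}
```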

Vertical Rhythm

Resources & tools for vertical rhythm

Here’s a list of useful tools and resources for rhythm in web typography.

  • Syncope - A WYSIWYG tool for establishing vertical rhythm on websites.

  • Archetype - Create beautiful web typography designs in the browser.

  • Grid Lover - Establish a typographic system with modular scale and vertical rhythm.

  • Gutenberg - A meaningful web typography starter kit.

Good examples

Some examples of good typography applied to long-form content. The content is Latin-only, though.

Typesetting in web
