Web Typography

My notes on best practices for web typography for multilingual content.

Last updated 1 year ago

Font size

16px or 12pt is the recommended size for body text, but do not set this value in pixels or points. Set it as a percentage instead: font-size: 100%;. In most browsers this resolves to 16 pixels. Inheriting the base font size from the browser allows users to set a comfortable font size through their browser preferences.

Then we can use other relative units (em or rem) to set font sizes for other elements. This is crucial because it means that changing the base font size will also change all other font sizes.
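
A minimal sketch of this setup. The element choices, the rem values, and the line height are illustrative assumptions, not recommendations from these notes:

```css
/* Inherit the user's preferred base size (commonly 16px by default). */
html {
  font-size: 100%;
}

/* Everything else is expressed relative to the root size. */
body {
  font-size: 1rem;
  line-height: 1.5; /* assumed value for Latin body text */
}

h1 { font-size: 2rem; }         /* 32px when the root is 16px */
h2 { font-size: 1.5rem; }       /* 24px when the root is 16px */
small { font-size: 0.875rem; }  /* 14px when the root is 16px */
```

If a user raises the browser's default font size to 20px, every rem value above scales with it and the proportions between elements stay the same.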

Accessibility

Links

Script characteristics

Depending on the script characteristics, specifically the nature of the glyphs in the script, Latin defaults for font metrics will need adjustments. The following are two script classifications based on glyphs.

Tall scripts

Languages that require extra line height to accommodate larger glyphs, including the South Asian, Southeast Asian, and Middle Eastern languages listed below (the list is incomplete).

  • ar - Arabic

  • bn - Bengali

  • ml - Malayalam - This script has vertical stacking, and the ratio of line height to font size should be chosen so that no part of a glyph is chopped off.

  • fa - Persian

  • gu - Gujarati

  • hi - Hindi

  • mr - Marathi

  • as - Assamese

  • km - Khmer

  • kn - Kannada

  • my - Myanmar

  • ne - Nepali

  • pa - Panjabi

  • si - Sinhala

  • ta - Tamil

  • te - Telugu

  • th - Thai

  • ur - Urdu

  • ko - Korean - The height is mostly the same as Latin, but the density of glyphs is higher. Slightly higher line spacing is expected.

  • vi - Vietnamese - Even though the script used is Latin, it has many diacritic marks that are specific to Vietnamese.

Dense scripts

Scripts whose glyphs have more complex strokes than Latin need a larger font size for legibility. Chinese and Indic scripts are some examples. Most of these scripts are legible at 12pt or 16px; anything smaller makes them hard to read.

  • ml - Malayalam

  • gu - Gujarati

  • hi - Hindi

  • mr - Marathi

  • as - Assamese

  • km - Khmer

  • kn - Kannada

  • my - Myanmar

  • ne - Nepali

  • pa - Panjabi

  • si - Sinhala

  • ta - Tamil

  • te - Telugu

  • th - Thai

  • lo - Lao

  • ur - Urdu

  • vi - Vietnamese

  • bo - Tibetan

A script can be tall and dense at the same time. Such scripts need both a larger font size and larger line spacing.
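
One way to express these adjustments is with the :lang() pseudo-class, which relies on the markup declaring its language through the lang attribute. This is only a sketch: the grouping of languages follows the lists above, but the specific sizes and line heights are assumptions that should be tuned against the actual fonts in use.

```css
/* Latin baseline */
body {
  font-size: 1rem;
  line-height: 1.5;
}

/* Tall scripts: extra line height (assumed value) */
:lang(ar), :lang(fa), :lang(bn) {
  line-height: 1.8;
}

/* Dense scripts: larger font size (assumed value) */
:lang(lo), :lang(bo) {
  font-size: 1.125rem;
}

/* Tall and dense scripts: both adjustments together */
:lang(ml), :lang(km), :lang(my) {
  font-size: 1.125rem;
  line-height: 1.8;
}
```

Because the sizes are in rem, they still follow the reader's base font size preference.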

Font selection

Selecting a font stack that works across all operating systems, browsers, and their versions, for every script we want to support, is a hard problem. It is harder because we don't have a clear definition of what works well. I provide two principles about default fonts.

Respect the native fonts set by users: Do not set a specific font for any platform. If the platform provides options to change the default fonts, allow users to use that feature and consume our content. Just set sans-serif, or sans, as the font stack. Simple and stupid.

OR

Enforce a typographic identity: Use webfonts to enforce a brand typography without relying on platform fonts and without allowing users to override them. Choose the best fonts for each script wisely and embed them, taking care of performance aspects.
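
The two principles could look roughly like this in CSS. The family name "Brand Malayalam", the file path, and the article selector are hypothetical placeholders; only one of the two options would be used on a given site.

```css
/* Option 1: respect the platform and user defaults. */
body {
  font-family: sans-serif;
}

/* Option 2: enforce an identity with an embedded webfont.
   The family name and URL below are hypothetical. */
@font-face {
  font-family: "Brand Malayalam";
  src: url("/fonts/brand-malayalam.woff2") format("woff2");
  font-display: swap;          /* show fallback text while the font loads */
  unicode-range: U+0D00-0D7F;  /* Malayalam block: fetched only when needed */
}

article:lang(ml) {
  font-family: "Brand Malayalam", sans-serif;
}
```

The unicode-range descriptor is one way to address the performance concern: the browser downloads the file only when the page actually contains characters from that range.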

Anything in between, such as guessing the availability of fonts and using long font stacks, does not work when targeting a multi-script, multilingual audience. The concept of fonts being universally available on a platform, and relying on that in a font stack, is very fragile for non-English content. To make it worse, the default proprietary fonts that some operating systems ship for some scripts are very broken and aesthetically ugly. They are ugly because they had the same constraints as Noto (see below) and follow UI metrics guidelines that were designed as one set of metrics for all scripts.

The fixed restrictions of the UI metrics was the primary, non-negotiable term in the Nirmala UI design brief: whatever we did had to fit within the vertical metrics of the Segoe UI and other UI fonts. The core target size for UI use, despite the increase in screen resolutions on many Win8 devices, is still 9pt at 96ppi, i.e. 12 ppem, with some Office UI items displaying at 8pt (with further restrictions on ppi height through VDMX adjustments at some sizes). At 12 ppem, we have exactly 3 pixels below the baseline before we hit the OS/2 WinDescent limit, beyond which glyphs will be clipped. Many of the Indian writing systems make significant use of the space below the baseline, so we had to employ a number of strategies to squeeze subjoined letters and other descending shapes into the UI metrics. The results are not all pleasant, and some contravene the norms of these writing systems, achieving only a legible decipherability, rather than true readability.

Noto

Noto is a font family comprising over 100 individual fonts, which together are designed to cover all the scripts encoded in the Unicode standard. The Noto family is designed with the goal of achieving visual harmony (e.g., compatible heights and stroke thicknesses) across multiple languages and scripts. This multi-script requirement is hard to achieve, and Noto tried its best. But note that typefaces designed specifically for a single script do not have that constraint, and in general they tend to be more aesthetically true to the script.

  • Noto is the default sans-serif font on Android phones (as shipped by Google), so there is no need to mention it in the font stack for them.

  • Noto is not installed by default on any desktop operating system. Each comes with its own native fonts as the default sans-serif and serif fonts, so adding Noto to the font stack does not help there either.

  • The smartphone market in South Asian countries is dominated by Xiaomi and others, who ship their own theming and allow users to customize the default UI fonts.

  • Noto is designed as a graceful final fallback for all scripts. Using it as the "choice" is a lazy solution; not mentioning it in the font stack gives the same result.

Links

Hyphenation

If text is justified, it is important to hyphenate to avoid large gaps between words.

Recent versions of Chromium-based browsers have built-in hyphenation for many languages. The HTML elements should declare their language using the lang attribute. Then use the following CSS:

text-align: justify;
hyphens: auto;

Recommendations

  • It is better not to justify if hyphenation support is not available. For other browsers or old versions, https://github.com/mnater/Hyphenopoly can be used. This library uses the hyphenation patterns I authored for Indic languages.

  • The hyphenation character defaults to Soft Hyphen (U+00AD), but that is not always optimal. Several scripts prefer a non-visible hyphenation character at the place of the word break. This can be controlled with the hyphenate-character CSS property, but it is not widely implemented. -webkit-hyphenate-character: ''; works in WebKit browsers.
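
The recommendations above can be sketched as follows, assuming paragraphs marked up like `<p lang="ml">…</p>`. Note that @supports only checks whether the browser understands the declaration; it cannot tell whether hyphenation patterns exist for a particular language, so this is a partial safeguard at best.

```css
/* Default: no justification, so missing hyphenation causes no gaps. */
p {
  text-align: start;
  hyphens: manual;
}

/* Justify only where automatic hyphenation is understood. */
@supports (hyphens: auto) {
  p {
    text-align: justify;
    hyphens: auto;
  }
}

/* For a script that prefers no visible hyphen at the break
   (Malayalam is used here as an example). */
:lang(ml) {
  -webkit-hyphenate-character: "";
  hyphenate-character: "";
}
```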

Counters
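
CSS has predefined counter styles for many scripts, including one named malayalam that uses Malayalam numerals, but a predefined style is not necessarily what readers of that script expect. A small sketch, in which the traditional class name is a hypothetical opt-in hook:

```css
/* Usual expectation for Malayalam and Hindi lists: Western digits. */
ol:lang(ml),
ol:lang(hi) {
  list-style-type: decimal;
}

/* Opt in to native numerals only where they are explicitly wanted. */
ol.traditional:lang(ml) {
  list-style-type: malayalam;
}
```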

Links

Opentype features

Tabular numbers

Fixed-width numbers are useful for tabular data, where comparing columns across rows is desired. In CSS this can be done with the style font-feature-settings: "tnum";. Enabling this for tables is recommended. A good quality typeface will have this implemented, and it helps a lot with fast scanning and analysis of data.
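
A sketch of enabling this for tables. It uses the higher-level font-variant-numeric property, which maps to the same OpenType feature, and falls back to font-feature-settings where that property is not understood; applying it to every table is an assumption about the site.

```css
/* Preferred: the high-level property for the "tnum" feature. */
table {
  font-variant-numeric: tabular-nums;
}

/* Fallback for browsers that do not understand font-variant-numeric. */
@supports not (font-variant-numeric: tabular-nums) {
  table {
    font-feature-settings: "tnum";
  }
}
```

The high-level property is easier to combine with other features, because a later font-feature-settings declaration replaces the whole list rather than adding to it.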

Fractions

This feature is contextually sensitive and converts "words" of numbers separated by a forward slash into proper fractions. Good quality fonts will have these OpenType rules. In CSS this can be done with the style font-feature-settings: "frac";.
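
A sketch of applying the feature to selected text only. The fraction class is a hypothetical hook for markup such as `<span class="fraction">3/4</span>`; scoping it this way is a cautious choice so that other digit-slash-digit text, such as dates, is left alone.

```css
/* High-level property for the "frac" feature. */
.fraction {
  font-variant-numeric: diagonal-fractions;
}

/* Fallback for browsers that do not understand font-variant-numeric. */
@supports not (font-variant-numeric: diagonal-fractions) {
  .fraction {
    font-feature-settings: "frac";
  }
}
```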

Smart quotes

CSS properties that should not be used

  • font-smoothing - Badly specified, with unexpected results depending on the background and foreground colors. Pick a good font instead.

  • font-stretch - Don't override the designer of the font. The results won't be pretty, especially for complex scripts.

  • font-size-adjust - Don't override the designer of the font. The results won't be pretty, especially for complex scripts.

  • letter-spacing - Don't change this to support complex scripts. Just use the defaults.

  • font-kerning - Kerning is enabled by default in browsers. Don't play with it.
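
Text direction is a related case: it is usually declared in the document with HTML's dir attribute rather than with the CSS direction property. A sketch of styles that follow the declared direction, assuming markup such as `<html dir="rtl" lang="ar">`; the note and icon-arrow class names are hypothetical.

```css
/* Logical properties follow the text direction automatically,
   so no separate left-to-right and right-to-left rules are needed. */
.note {
  margin-inline-start: 1rem;
  padding-inline-start: 0.5rem;
  border-inline-start: 2px solid currentColor;
  text-align: start;
}

/* Where a direction-specific rule is unavoidable, key it off the
   attribute declared in the markup. */
[dir="rtl"] .icon-arrow {
  transform: scaleX(-1); /* mirror a directional icon */
}
```

The attribute selector matches any descendant of an element carrying dir="rtl", including content inside a nested dir="ltr" block, so mixed-direction pages need extra care.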

Text decoration

Italic

Italic, or even slanted, text decoration is not universal. Due to Latin influence some non-Latin scripts started using it, but the results are mostly artificial. Typefaces for non-Latin scripts do not always come with italic variants. The issue is that browsers and editors will create a synthetic italic in those cases, a forced 20 degree skew of the glyphs, and it is not kind to the script. This is also known as faux italic.
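
One way to avoid faux italic is the font-synthesis property. The sketch below uses Malayalam as the example language and assumes that emphasis can be carried by weight instead of slant; both choices are illustrative.

```css
/* Allow the browser to synthesize bold if needed, but never italic. */
:lang(ml) {
  font-synthesis: weight;
}

/* Without synthesis, emphasized text would simply render upright,
   so give it another visual cue. */
:lang(ml) em,
:lang(ml) i {
  font-style: normal;
  font-weight: bold;
}
```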

Underline
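
Underlines can collide with the parts of glyphs that extend below the baseline, which is common in scripts such as Malayalam. A sketch of settings that reduce the collision for link underlines; the offset value is an assumption and should be checked against the font in use.

```css
a {
  text-decoration-line: underline;
  text-decoration-skip-ink: auto; /* break the line where glyphs cross it */
  text-underline-offset: 0.2em;   /* assumed value: push the line lower */
}

/* For a script with deep descenders, place the line below them. */
:lang(ml) a {
  text-underline-position: under;
}
```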

Vertical Rhythm

Resources & tools for vertical rhythm

Here’s a list of useful tools and resources for rhythm in web typography.

Good examples

Some examples of good typography applied to long-form content. The content is Latin-only, though.

Typesetting in web

Links

Font size and accessibility notes

The body text of the page should respect the browser setting. (Note: Wikipedia does not do that, even though it is recommended in style guide.)

As per https://accessibility.digital.gov/visual-design/typography/: Use a large enough font size for body text so that people can comfortably read. Use at least an effective size of 16px, but this can vary depending on the design of the font.

At age 40, only half the light gets through to the retina as it did at age 20. For 60-year-olds, it’s just 20%. People in their 60s need three times more light for comfortable reading than those in their 20s. (Reference)

Nearly 9% of Americans are visually impaired, meaning their vision cannot be completely corrected with lenses.

Most people, when sitting comfortably, are about 20 to 23 inches from their computer screens. In fact, 28 inches is the recommended distance, because this is where vergence is sufficiently low to avoid eye strain. This is much further than the distance at which we read printed text; most people do not hold magazines at arm’s length! (source)

16-pixel text on a screen is about the same size as text printed in a book or magazine; this is accounting for reading distance. Because we read books pretty close, often only a few inches away, they are typically set at about 10 points. If you were to read them at arm’s length, you’d want at least 12 points, which is about the same size as 16 pixels on most screens. (source)

People don't zoom. The people who most need to increase font size are people 65+, which is the group least likely to be skilled enough to have adjusted settings. (source)

Links on type scales

  • A tool to try and choose a scale rhythm (ratio of typeface sizes used for headings and body content): https://type-scale.com/

  • Tim Brown’s modular typeface scales concept

  • Modular Scale

  • A set of js, css, sass libraries to implement this - https://github.com/modularscale

  • Typesetting body text - Talk by Tim Brown: https://vimeo.com/156203722

  • Composing the New Canon: Music, Harmony, Proportion - using musical scales for better scale harmony - Owen Gregory

  • Fluid Type Scale: https://www.fluid-type-scale.com/calculate

Script notes

  • bo - Tibetan - This script has multi-level vertical stacking and no word spacing. bo.wikipedia.org increased the default font size to 125% (20px) using custom styling.

  • bn - Bengali - Glyphs are very dense, with strokes joining at acute angles. Needs a minimum font size of 12pt/16px. (Script example)

  • zh - Chinese - High script density, requiring a larger font size. The Chinese Wikipedia sets a larger font size using custom styling.

  • ja - Japanese - High script density, requiring a larger font size. The Japanese Wikipedia sets a larger font size using custom styling.

Google material guidelines does a similar grouping of scripts.

Font selection notes and links

The quoted passage on Nirmala UI under Font selection is from John Hudson, about constraints in developing types for Windows and how they affected the Malayalam font Nirmala UI. (Nirmala Malayalam, another mindless type design - debate between Hashim PM and John Hudson, the designer of Nirmala and Kartika fonts. 28 July 2013. https://web.archive.org/web/20190921104444/http://www.typophile.com:80/node/105005)

  • Selecting Typefaces For Body Text

  • Five Principles For Choosing And Using Typefaces

  • Best Practices For Combining Typefaces

  • This page provides a (not exhaustive) list of fonts, grouped by script, that are available via the Windows 10 and Mac OS X operating systems, as well as Google's Noto fonts and SIL fonts.

  • Font styles & font fallback - w3.org internationalization

Hyphenation notes

Refer https://www.w3.org/TR/typography/#hyphenation (which I contributed with content and examples).

Android: Android comes with hyphenation support. For Indic languages it uses the hyphenation patterns I authored.

See also the Web page notes under Hyphenation.

Counters notes

There are predefined counter styles for lists. But that does not mean they are the preferred counter styles for the script. For example, there is a predefined counter style malayalam that uses Malayalam numerals, but they are almost never used. Devanagari counter styles are not used by all languages that use Devanagari. For example, Hindi prefers the default 1, 2, 3 style. This is not without debate. For example, Hindi Wikipedia has a prominent dropdown at the top to choose the preferred number style (this makes the interface quite bad in my opinion).

  • https://www.w3.org/TR/typography/#lists

OpenType feature notes

Avoid using stylistic alternates such as ss01 that are font specific. Applying them on a font fallback chain will result in unexpected behaviour. (Example)

CSS properties that should not be used (continued)

  • :dir() - The :dir() CSS pseudo-class matches elements based on the directionality of the text contained in them. Be aware that the behavior of the :dir() pseudo-class is not equivalent to the [dir=…] attribute selectors. The latter match the HTML dir attribute and ignore elements that lack it, even if they inherit a direction from their parent. (Similarly, [dir=rtl] and [dir=ltr] won't match the auto value.) In contrast, :dir() will match the value calculated by the user agent, even if inherited. It doesn’t have great support and is considered experimental.

  • direction - The direction CSS property sets the direction of text, table columns, and horizontal overflow. Use rtl for languages written from right to left (like Hebrew or Arabic), and ltr for those written from left to right (like English and most other languages). Note that text direction is usually defined within a document (e.g., with HTML's dir attribute) rather than through direct use of the direction property.

Underline notes

For emphasis, avoid using underlines. If an underline is required at all, be aware that glyph heights vary depending on the script, and the text may look as if it has been struck through. ഇതുപോലെ ("like this") is a Malayalam example: the script has bottom tails that already look like underlines, and an additional underline would cut through the glyphs. Because of this, browsers now detect these boundaries and avoid drawing link underlines over glyphs. The CSS property text-decoration-skip-ink: auto; (earlier drafted as text-decoration-skip: ink;) can be used to skip glyphs that cross the underline. (MDN documentation)

Vertical rhythm tools

  • Syncope - A WYSIWYG tool for establishing vertical rhythm on websites.

  • Archetype - Create beautiful web typography designs, in the browser.

  • Grid Lover - Establish a typographic system with modular scale & vertical rhythm.

  • Gutenberg - A meaningful web typography starter kit.

Good examples

  • https://thereader.mitpress.mit.edu/habits-of-expert-software-designers/

  • https://distill.pub/2021/multimodal-neurons/

Typesetting in web

  • Can you typeset a book with CSS? https://www.w3.org/Talks/2013/0604-CSS-Tokyo/

Further reading

  • All you need to know about hyphenation in CSS (clagnut)

  • Smart Quotes for Smart People

  • Rhythm in Web Typography (Better Web Type)

Figure captions

  • Font size: Firefox settings for default font size.

  • Tabular numbers: from https://rsms.me/inter/#features/tnum

  • Fractions: source https://rsms.me/inter/#features/frac