Work

UTF-8

standard · 1992

Character Encoding · Internationalization · Web Standards

UTF-8 is a variable-width character encoding that can represent every character in the Unicode standard, using one to four bytes per character. Created by Ken Thompson and Rob Pike, it became the dominant character encoding on the web, so text in any writing system can be stored and exchanged in a single encoding.

Origins

In 1992, Thompson and Pike designed UTF-8 over dinner at a New Jersey diner. They wanted an encoding that could represent every Unicode character, stay backward-compatible with ASCII, and never produce a null byte inside the encoding of another character.

Design Elegance

UTF-8’s brilliance lies in its properties:
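A few of these properties can be checked directly. The sketch below uses Python's built-in `str.encode` to illustrate the two goals named under Origins (ASCII compatibility and no embedded null bytes). It also shows self-synchronization, a well-known UTF-8 property that this text does not state explicitly. The sample strings and the helper names `is_continuation` and `next_boundary` are illustrative assumptions, not part of any standard API.

```python
# Sample text mixing ASCII, accented Latin, and Japanese (illustrative only).
text = "naïve café 日本語"
encoded = text.encode("utf-8")

# ASCII compatibility: pure-ASCII text yields identical bytes in both encodings.
ascii_text = "hello"
same_as_ascii = ascii_text.encode("utf-8") == ascii_text.encode("ascii")

# No embedded null bytes: 0x00 appears only when U+0000 itself is encoded,
# so this sample, which contains no U+0000, produces no zero byte.
has_null = 0 in encoded


def is_continuation(byte: int) -> bool:
    """Continuation bytes always have the bit pattern 10xxxxxx."""
    return (byte & 0xC0) == 0x80


def next_boundary(data: bytes, index: int) -> int:
    """Hypothetical helper: skip continuation bytes to reach the next
    character boundary at or after `index` (self-synchronization)."""
    while index < len(data) and is_continuation(data[index]):
        index += 1
    return index
```

Because lead bytes and continuation bytes never share a bit pattern, a reader that lands in the middle of a multi-byte character can resynchronize by scanning forward a few bytes at most, as `next_boundary` does.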

How It Works

UTF-8 uses variable-length encoding:
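As a rough illustration of the variable-length scheme, the sketch below hand-encodes a single code point. It assumes the modern four-byte limit (code points up to U+10FFFF); the 1992 design allowed longer sequences. The function name `utf8_encode` is hypothetical, and in practice Python's built-in `str.encode("utf-8")` does this work.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one Unicode code point as UTF-8 (illustrative, not optimized)."""
    if code_point < 0 or code_point > 0x10FFFF:
        raise ValueError("code point outside the Unicode range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded in UTF-8")

    if code_point < 0x80:
        # 1 byte: 0xxxxxxx (identical to ASCII)
        return bytes([code_point])
    if code_point < 0x800:
        # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([
            0xC0 | (code_point >> 6),
            0x80 | (code_point & 0x3F),
        ])
    if code_point < 0x10000:
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([
            0xE0 | (code_point >> 12),
            0x80 | ((code_point >> 6) & 0x3F),
            0x80 | (code_point & 0x3F),
        ])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([
        0xF0 | (code_point >> 18),
        0x80 | ((code_point >> 12) & 0x3F),
        0x80 | ((code_point >> 6) & 0x3F),
        0x80 | (code_point & 0x3F),
    ])
```

The lead byte's high bits announce how many bytes the character occupies, and every following byte starts with `10`, so the payload bits of the code point are simply spread across the `x` positions. Comparing `utf8_encode(ord(ch))` with `ch.encode("utf-8")` for a few characters is a quick sanity check.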

Impact

UTF-8 transformed computing: