Unicode Character Inspector

Inspect any text character by character. See code points, UTF-8 and UTF-16 bytes, escapes, HTML entities, and reveal hidden zero-width characters. No signup.

Code points

Counted per Unicode scalar value, so a surrogate pair counts once

UTF-16 code units

What .length returns in JS

UTF-8 bytes

When encoded as UTF-8

Unique code points

Distinct characters

Invisible / format

Zero-width, format, or non-ASCII space

Control

ASCII control characters

ASCII

Code points below U+0080

Non-ASCII

Code points U+0080 and above
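
The summary figures above can be reproduced with standard JavaScript APIs. The following is a minimal sketch, not the tool's actual source: the function name summarize and the shape of the returned object are assumptions made for illustration.

```javascript
// Hypothetical helper (not the tool's code): compute the summary counts.
function summarize(text) {
  // Array.from iterates by code point, so a surrogate pair yields one entry.
  const codePoints = Array.from(text, (ch) => ch.codePointAt(0));
  const ascii = codePoints.filter((cp) => cp < 0x80).length;
  return {
    codePoints: codePoints.length,
    utf16Units: text.length, // what .length returns in JS
    utf8Bytes: new TextEncoder().encode(text).length,
    unique: new Set(codePoints).size,
    ascii,
    nonAscii: codePoints.length - ascii,
  };
}

// "héllo 😀" written with escapes so the input is unambiguous.
console.log(summarize("h\u00e9llo \u{1F600}"));
// codePoints: 7, utf16Units: 8, utf8Bytes: 11, unique: 6, ascii: 5, nonAscii: 2
```

The three lengths differ because é takes 2 bytes in UTF-8 but 1 UTF-16 code unit, while the emoji takes 4 bytes in UTF-8 and 2 UTF-16 code units.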

Per-character breakdown

What the columns mean

Code point

Unicode scalar value, written U+XXXX in hex

UTF-8

Bytes when the character is encoded in UTF-8 (1 to 4)

UTF-16

Code units in UTF-16. Two units mean a surrogate pair
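
To make the three columns concrete, here is a rough sketch of how one table row could be derived for a single character. The names encodeRow and hex are hypothetical, and the output formatting (uppercase hex, space-separated) is an assumption rather than the tool's documented format.

```javascript
// Hypothetical helpers (not the tool's code).
const hex = (n, width) => n.toString(16).toUpperCase().padStart(width, "0");

// Expects a string holding exactly one code point, e.g. an item of Array.from(text).
function encodeRow(ch) {
  const cp = ch.codePointAt(0);
  // UTF-8: 1 to 4 bytes per code point.
  const utf8 = Array.from(new TextEncoder().encode(ch), (b) => hex(b, 2));
  // UTF-16: charCodeAt reads individual code units, so a supplementary
  // character shows up as two units (a surrogate pair).
  const utf16 = [];
  for (let i = 0; i < ch.length; i++) utf16.push(hex(ch.charCodeAt(i), 4));
  return {
    codePoint: "U+" + hex(cp, 4),
    utf8: utf8.join(" "),
    utf16: utf16.join(" "),
  };
}

console.log(encodeRow("A"));         // U+0041,  utf8 "41",          utf16 "0041"
console.log(encodeRow("\u00e9"));    // U+00E9,  utf8 "C3 A9",       utf16 "00E9"
console.log(encodeRow("\u{1F600}")); // U+1F600, utf8 "F0 9F 98 80", utf16 "D83D DE00"
```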

How to use

Open Inspect text mode, then paste any text you want to break down into the input area. Try a sample like Smart quotes or Hidden zero-width to see the table in action.
Read the summary panel for total code points, UTF-16 length, UTF-8 byte size, and the count of invisible, control, ASCII, and non-ASCII characters.
Scan the per-character table. Rows in amber are invisible or control characters with a friendly name, so you can spot the zero-width space or BOM that is breaking your input.
Switch to Lookup code point mode to convert in the other direction. Enter U+1F600, 128512, \u00e9, &mdash;, &copy;, or a name like 'zero width space' and the tool resolves it to the matching character with full byte and escape data.
Use Copy character or Copy input to grab any value, or click an example shortcut to load a known code point quickly.
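
The reverse lookup described in the steps above amounts to parsing several notations into one code point. The sketch below is illustrative only: parseCodePoint is a hypothetical name, it covers just the U+, 0x, decimal, \uXXXX, and \u{...} forms, and it treats an all-digit input as decimal, which is an assumption about how the ambiguity might be resolved. Named characters and HTML entities would need lookup tables that are not shown here.

```javascript
// Hypothetical parser (not the tool's code). Returns a code point or null.
function parseCodePoint(input) {
  const s = input.trim();
  let m;
  let cp;
  if ((m = /^(?:U\+|0x)([0-9a-f]{1,6})$/i.exec(s))) {
    cp = parseInt(m[1], 16); // U+1F600 or 0x1F600
  } else if ((m = /^\\u\{([0-9a-f]{1,6})\}$/i.exec(s))) {
    cp = parseInt(m[1], 16); // \u{1F600}
  } else if ((m = /^\\u([0-9a-f]{4})$/i.exec(s))) {
    cp = parseInt(m[1], 16); // \u00e9
  } else if (/^[0-9]+$/.test(s)) {
    cp = parseInt(s, 10); // assumption: digits only means decimal
  } else {
    return null; // names, entities, and bare hex are not handled here
  }
  // Reject values beyond the last Unicode code point.
  return cp <= 0x10ffff ? cp : null;
}

console.log(parseCodePoint("U+1F600"));  // 128512
console.log(parseCodePoint("128512"));   // 128512
console.log(parseCodePoint("\\u00e9"));  // 233
console.log(String.fromCodePoint(parseCodePoint("0x1F600"))); // the emoji itself
```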

About this tool

Unicode Character Inspector breaks any text down to its individual Unicode characters and reveals exactly what is inside a string. Paste a block of text and the tool iterates code point by code point (collapsing surrogate pairs the way Array.from(string) does) and produces a row per character with the rendered glyph, its code point in U+XXXX hex notation and decimal, the Unicode general category (Lu uppercase letter, Ll lowercase letter, Nd decimal digit, Cf format, Cc control, Zs space separator, and the rest), the UTF-8 byte sequence (1 to 4 bytes), the UTF-16 code units (1 unit for a BMP character, 2 for the surrogate pair that encodes a supplementary-plane character), the JavaScript escape sequence (\uXXXX or \u{XXXXX}), and the HTML numeric entity.
Rows for invisible and control characters (zero-width space U+200B, zero-width joiner, zero-width non-joiner, byte order mark U+FEFF, no-break space U+00A0, soft hyphen, left-to-right and right-to-left marks, paragraph and line separators, and other format characters) are highlighted in amber and labelled with a friendly name so you can see at a glance which copy-paste glitch is breaking your form validation, your CSV import, your URL slug, or your typography.
A summary panel at the top reports the total number of code points, the UTF-16 length (the value JavaScript .length returns), the UTF-8 byte length (the size when encoded as UTF-8), the unique code point count, and the count of invisible, control, ASCII, and non-ASCII characters in the input.
Lookup mode flips the tool into a reverse search: type any code point in U+1F600, 0x1F600, 1F600, decimal, \u00e9, \u{1F600}, &euro;, &mdash;, or &copy; form, or a friendly name like 'zero width space', 'em dash', 'BOM', 'NBSP', or 'replacement character', and the tool resolves it back to the matching character with the same full breakdown table.
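
As an illustration of the category, escape, and entity columns, the sketch below derives them for one character. It is not the tool's implementation: describe and CATEGORIES are hypothetical names, only the six categories named above are checked, and the hexadecimal entity form (&#x...;) is an assumption, since a decimal entity would be equally valid.

```javascript
// Hypothetical helper (not the tool's code). Only a few categories are probed.
const CATEGORIES = ["Lu", "Ll", "Nd", "Cf", "Cc", "Zs"];

// Expects a string holding exactly one code point.
function describe(ch) {
  const cp = ch.codePointAt(0);
  const h = cp.toString(16).toUpperCase();
  // Unicode property escapes need the "u" flag.
  const category =
    CATEGORIES.find((c) => new RegExp(`^\\p{${c}}$`, "u").test(ch)) ?? "other";
  return {
    category,
    // BMP characters fit \uXXXX; supplementary ones need the braced form.
    jsEscape: cp <= 0xffff ? "\\u" + h.padStart(4, "0") : `\\u{${h}}`,
    htmlEntity: `&#x${h};`,
  };
}

console.log(describe("A"));         // category "Lu", \u0041, &#x41;
console.log(describe("\u200B"));    // category "Cf", \u200B, &#x200B;
console.log(describe("\u{1F600}")); // category "other", \u{1F600}, &#x1F600;
```
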
Useful for debugging encoding bugs, fixing smart-quote and ligature glitches, finding hidden zero-width separators in pasted code, sanity-checking emoji ZWJ sequences, computing the real byte size of a tweet or push notification, generating JavaScript and HTML escape sequences for tricky characters, exploring combining marks and surrogate pairs while you learn UTF-16, and answering 'what character is this and why is my regex not matching it' in seconds. Everything happens in your browser using native Unicode property regex (\p{...}) and JavaScript code point APIs, so the strings you paste here never leave your device.
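
For readers who want to hunt hidden characters in their own scripts, a property-escape regex along these lines can locate them. This is a sketch under stated assumptions: findInvisible and INVISIBLE are hypothetical names, and the chosen set (format characters, line and paragraph separators, and any space separator other than the plain ASCII space) is a guess at what "invisible" covers, not the tool's actual rule. Control characters (Cc) are deliberately left out because the summary counts them separately.

```javascript
// Hypothetical pattern (not the tool's code): Cf, Zl, Zp, or a Zs that is not U+0020.
const INVISIBLE = /[\p{Cf}\p{Zl}\p{Zp}]|(?! )\p{Zs}/gu;

function findInvisible(text) {
  return Array.from(text.matchAll(INVISIBLE), (m) => ({
    index: m.index, // position in UTF-16 code units
    codePoint:
      "U+" + m[0].codePointAt(0).toString(16).toUpperCase().padStart(4, "0"),
  }));
}

// A zero-width space after "a" and a no-break space after "b".
console.log(findInvisible("a\u200Bb\u00A0c d"));
// [ { index: 1, codePoint: "U+200B" }, { index: 3, codePoint: "U+00A0" } ]
```

The ordinary space before "d" is not reported, which keeps normal prose from flooding the results.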

Free to use. Works in your browser. No signup, no login.
