Unicode Character Inspector
Inspect any text character by character. See code points, UTF-8 bytes, UTF-16 code units, escapes, and HTML entities, and reveal hidden zero-width characters. No signup.
Paste text and inspect every character, or try a sample.
Code points
Iterated by Unicode scalar
UTF-16 code units
What .length returns in JS
UTF-8 bytes
When encoded as UTF-8
Unique code points
Distinct characters
Invisible / format
Zero-width, format, or non-ASCII space
Control
ASCII control characters
ASCII
Code points below U+0080
Non-ASCII
Code points U+0080 and above
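The summary figures above can be approximated in a few lines of JavaScript. This is a minimal sketch, not the tool's actual source: the function name `summarize` and the shape of the returned object are assumptions made for illustration, and "control" is taken to mean U+0000 to U+001F plus U+007F.

```javascript
// Hypothetical helper (not the tool's code): computes summary counts for a string.
function summarize(text) {
  // Array.from iterates by code point, so a surrogate pair becomes one entry.
  const codePoints = Array.from(text, (ch) => ch.codePointAt(0));

  let ascii = 0;
  let control = 0;
  for (const cp of codePoints) {
    if (cp < 0x80) ascii++;
    // Assumption: "ASCII control" means C0 controls and DEL.
    if (cp < 0x20 || cp === 0x7f) control++;
  }

  return {
    codePoints: codePoints.length,
    utf16Units: text.length, // what .length returns in JS
    utf8Bytes: new TextEncoder().encode(text).length,
    uniqueCodePoints: new Set(codePoints).size,
    control,
    ascii,
    nonAscii: codePoints.length - ascii,
  };
}
```

For the three-character string "a\u{1F600}\u00e9" this reports 3 code points, 4 UTF-16 code units, and 7 UTF-8 bytes, which shows why the three lengths should not be used interchangeably.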
Per-character breakdown
Type or paste text above to see every character broken down.
What the columns mean
Code point
Unicode scalar value, written U+XXXX in hex
UTF-8
Bytes when the character is encoded in UTF-8 (1 to 4)
UTF-16
Code units in UTF-16. Two units mean a surrogate pair
Category
Unicode general category, for example Lu (uppercase letter)
Escape
JavaScript escape sequence (\uXXXX or \u{XXXXX})
HTML
HTML numeric character reference, for example &#169; for ©
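The columns above could be produced for a single character roughly as follows. This is an illustrative sketch under stated assumptions: `describeChar` is a hypothetical name, the category check only probes the handful of general categories listed in the array and reports everything else as "other", and the HTML column is emitted in decimal form.

```javascript
// Hypothetical helper (not the tool's code): builds one table row for one character.
function describeChar(input) {
  const cp = input.codePointAt(0);
  const ch = String.fromCodePoint(cp); // keep only the first code point
  const hex = cp.toString(16).toUpperCase();
  const toHex = (n, width) => n.toString(16).toUpperCase().padStart(width, "0");

  // UTF-8 bytes as two-digit hex strings (1 to 4 entries).
  const utf8 = Array.from(new TextEncoder().encode(ch), (b) => toHex(b, 2));

  // UTF-16 code units as four-digit hex strings (2 entries for a surrogate pair).
  const utf16 = [];
  for (let i = 0; i < ch.length; i++) utf16.push(toHex(ch.charCodeAt(i), 4));

  // Assumption: only a few general categories are checked in this sketch.
  const known = ["Lu", "Ll", "Nd", "Zs", "Cc", "Cf"];
  const category =
    known.find((gc) => new RegExp(`\\p{${gc}}`, "u").test(ch)) ?? "other";

  return {
    codePoint: "U+" + hex.padStart(4, "0"),
    utf8,
    utf16,
    category,
    escape: cp <= 0xffff ? "\\u" + hex.padStart(4, "0") : "\\u{" + hex + "}",
    html: "&#" + cp + ";",
  };
}
```

Calling `describeChar("\u00e9")` yields code point U+00E9, UTF-8 bytes C3 A9, one UTF-16 unit 00E9, category Ll, and the reference &#233;. For U+1F600 the UTF-16 column holds the surrogate pair D83D DE00.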
How to use
- Open Inspect text mode, then paste any text you want to break down into the input area. Try a sample like Smart quotes or Hidden zero-width to see the table in action.
- Read the summary panel for total code points, UTF-16 length, UTF-8 byte size, and the count of invisible, control, ASCII, and non-ASCII characters.
- Scan the per-character table. Rows in amber are invisible or control characters with a friendly name, so you can spot the zero-width space or BOM that is breaking your input.
- Switch to Lookup code point mode to convert in the other direction. Enter U+1F600, 128512, \u00e9, an HTML character reference for — or ©, or a name like 'zero width space', and the tool resolves it to the matching character with full byte and escape data.
- Use Copy character or Copy input to grab any value, or click an example shortcut to load a known code point quickly.
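The lookup step described in the list above can be sketched as a small parser. It is an assumption-laden illustration rather than the tool's real logic: `parseCodePoint` is a hypothetical name, and the sketch handles only the U+, 0x, decimal, \uXXXX, and \u{...} forms, leaving out friendly names and HTML references.

```javascript
// Hypothetical helper (not the tool's code): turns a typed code point into a character.
// Returns null when the input is not recognized or is out of Unicode range.
function parseCodePoint(input) {
  const s = input.trim();
  let cp = NaN;
  let m;

  if ((m = /^(?:U\+|0x)([0-9a-f]{1,6})$/i.exec(s))) {
    cp = parseInt(m[1], 16); // U+1F600 or 0x1F600
  } else if ((m = /^\\u\{([0-9a-f]{1,6})\}$/i.exec(s))) {
    cp = parseInt(m[1], 16); // \u{1F600}
  } else if ((m = /^\\u([0-9a-f]{4})$/i.exec(s))) {
    cp = parseInt(m[1], 16); // \u00e9
  } else if (/^[0-9]+$/.test(s)) {
    cp = parseInt(s, 10); // 128512
  }

  // Unicode code points end at U+10FFFF.
  if (!Number.isInteger(cp) || cp > 0x10ffff) return null;
  return String.fromCodePoint(cp);
}
```

With this sketch, "U+1F600", "0x1F600", "128512", and the literal text \u{1F600} all resolve to the same character, while unrecognized input returns null instead of throwing.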
About this tool
Unicode Character Inspector breaks any text down into its individual Unicode characters and shows exactly what is inside a string. Paste a block of text and the tool iterates code point by code point, collapsing surrogate pairs the way Array.from(string) does, and produces one row per character. Each row shows the rendered glyph; the code point in U+XXXX hex notation and in decimal; the Unicode general category (Lu uppercase letter, Ll lowercase letter, Nd decimal digit, Cf format, Cc control, Zs space separator, and the rest); the UTF-8 byte sequence (1 to 4 bytes); the UTF-16 code units (1 unit for a character in the BMP, 2 for the surrogate pair of a supplementary-plane character); the JavaScript escape sequence (\uXXXX or \u{XXXXX}); and the HTML numeric character reference.
Rows for invisible and control characters (zero-width space U+200B, zero-width joiner, zero-width non-joiner, byte order mark U+FEFF, no-break space U+00A0, soft hyphen, left-to-right and right-to-left marks, paragraph and line separators, and other format characters) are highlighted in amber and labelled with a friendly name, so you can see at a glance which copy-paste glitch is breaking your form validation, your CSV import, your URL slug, or your typography.
A summary panel at the top reports the total number of code points, the UTF-16 length (the value JavaScript .length returns), the UTF-8 byte length (the size when encoded as UTF-8), the unique code point count, and the count of invisible, control, ASCII, and non-ASCII characters in the input.
Lookup mode flips the tool into a reverse search. Type a code point as U+1F600, 0x1F600, 1F600, decimal, \u00e9, \u{1F600}, or an HTML character reference for €, —, or ©, or type a friendly name like 'zero width space', 'em dash', 'BOM', 'NBSP', or 'replacement character', and the tool resolves it back to the matching character with the same full breakdown table.
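One way to flag invisible characters like those named above is a Unicode property escape combined with a small name table. The sketch below is only an approximation of the idea: `findInvisible`, the name table, and the choice of categories (Cf, Zl, Zp, and Zs other than the plain space) are assumptions, not the tool's documented rules.

```javascript
// Hypothetical name table (assumption): a few well-known invisible characters.
const FRIENDLY_NAMES = new Map([
  [0x00a0, "no-break space"],
  [0x00ad, "soft hyphen"],
  [0x200b, "zero-width space"],
  [0x200c, "zero-width non-joiner"],
  [0x200d, "zero-width joiner"],
  [0x200e, "left-to-right mark"],
  [0x200f, "right-to-left mark"],
  [0x2028, "line separator"],
  [0x2029, "paragraph separator"],
  [0xfeff, "byte order mark"],
]);

// Format characters, line and paragraph separators, and space separators.
const INVISIBLE = /[\p{Cf}\p{Zl}\p{Zp}\p{Zs}]/u;

// Hypothetical helper: lists invisible characters with their UTF-16 offsets.
function findInvisible(text) {
  const hits = [];
  let index = 0; // offset in UTF-16 code units
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    // The ordinary space U+0020 is in Zs but is not treated as a glitch here.
    if (cp !== 0x20 && INVISIBLE.test(ch)) {
      hits.push({
        index,
        codePoint: "U+" + cp.toString(16).toUpperCase().padStart(4, "0"),
        name: FRIENDLY_NAMES.get(cp) ?? "format or space character",
      });
    }
    index += ch.length;
  }
  return hits;
}
```

Running `findInvisible("a\u200Bb\u00A0c")` reports a zero-width space at offset 1 and a no-break space at offset 3, while ordinary text with plain spaces produces an empty list.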
Useful for debugging encoding bugs, fixing smart-quote and ligature glitches, finding hidden zero-width separators in pasted code, sanity-checking emoji ZWJ sequences, computing the real byte size of a tweet or push notification, generating JavaScript and HTML escape sequences for tricky characters, exploring combining marks and surrogate pairs while you learn UTF-16, and answering 'what character is this and why is my regex not matching it' in seconds. Everything runs in your browser using native Unicode property escapes in regular expressions (\p{...}) and JavaScript code point APIs, so the strings you paste here never leave your device.
Free to use. Works in your browser. No signup, no login.