Invisible Character Detector

How to use

Paste any text into the input area on the left. The scanner runs instantly using your browser's Unicode-aware iterator.
Read the per-character breakdown to see which codepoints are present in your input, sorted by frequency. Each row lists the codepoint, its official Unicode name, and the family it belongs to.
Use the eight removal toggles to control which families of invisible characters get stripped. Defaults handle the common offenders; enable variation selectors and non-standard whitespace only when you want them gone.
Optionally enable "Preserve zero-width joiners inside emoji" to keep family and skin-tone emoji sequences intact, or "Replace non-standard whitespace with a regular space" to normalize NBSP and ideographic space to U+0020 instead of removing them.
Copy the cleaned output, or use "Replace input with cleaned text" to keep iterating on the result without juggling two panels.
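
The toggle behaviour described in the steps above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation: the family tables, the parameter names (`strip_zero_width`, `keep_emoji_zwj`, `spaces_to_regular`, and so on), and the rough `is_emojiish` range check are all assumptions made for the example, and only some of the eight toggles are modeled.

```python
# Illustrative family tables (assumed for this sketch, not the tool's own data).
ZERO_WIDTH = {0x200B, 0x200C, 0x200D, 0x2060, 0x180E}
BOM = {0xFEFF}
SOFT_HYPHEN = {0x00AD}
BIDI = set(range(0x202A, 0x202F)) | set(range(0x2066, 0x206A)) | {0x200E, 0x200F}
ODD_SPACES = {0x00A0, 0x202F, 0x205F, 0x3000} | set(range(0x2000, 0x200B))
ZWJ = "\u200d"


def is_emojiish(ch: str) -> bool:
    """Rough heuristic: common emoji blocks only, not the full Unicode emoji property."""
    cp = ord(ch)
    return 0x1F000 <= cp <= 0x1FAFF or 0x2600 <= cp <= 0x27BF


def _joins_emoji(chars: list, i: int) -> bool:
    """True when the ZWJ at index i sits between two emoji-like codepoints."""
    j = i - 1
    if j >= 0 and chars[j] == "\ufe0f":  # skip an emoji variation selector
        j -= 1
    before = j >= 0 and is_emojiish(chars[j])
    after = i + 1 < len(chars) and is_emojiish(chars[i + 1])
    return before and after


def clean(text: str, *, strip_zero_width=True, strip_bom=True,
          strip_soft_hyphen=True, strip_bidi=True, strip_odd_spaces=False,
          keep_emoji_zwj=True, spaces_to_regular=False) -> str:
    chars = list(text)  # iterating a Python str yields whole codepoints
    out = []
    for i, ch in enumerate(chars):
        cp = ord(ch)
        if ch == ZWJ and keep_emoji_zwj and _joins_emoji(chars, i):
            out.append(ch)  # keep joiners that hold an emoji sequence together
        elif cp in ODD_SPACES:
            if spaces_to_regular:
                out.append(" ")  # normalize to U+0020 instead of deleting
            elif not strip_odd_spaces:
                out.append(ch)
        elif not (
            (strip_zero_width and cp in ZERO_WIDTH)
            or (strip_bom and cp in BOM)
            or (strip_soft_hyphen and cp in SOFT_HYPHEN)
            or (strip_bidi and cp in BIDI)
        ):
            out.append(ch)
    return "".join(out)
```

With the defaults assumed here, `clean("a\u200bb")` returns `"ab"`, while `clean("a\u00a0b")` leaves the no-break space alone until `spaces_to_regular=True` or `strip_odd_spaces=True` is passed.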

About this tool

Invisible Character Detector scans any text for codepoints that produce no visible glyph but still travel through your editor, your forms, your databases, and your APIs. It catches the byte order mark (U+FEFF); the zero-width family (U+200B zero-width space, U+200C non-joiner, U+200D joiner, U+2060 word joiner, U+180E Mongolian vowel separator); the soft hyphen (U+00AD) and other format characters; the full bidirectional control set used in Trojan Source attacks (U+202A through U+202E and U+2066 through U+2069, plus the LTR and RTL marks U+200E and U+200F); every variation selector (U+FE00 through U+FE0F and the supplementary block U+E0100 through U+E01EF); tag characters at U+E0000 through U+E007F; the C0 and C1 control characters, the latter typically appearing when Windows-1252 text is mis-decoded as ISO-8859-1 (Latin-1); and the non-standard whitespace that looks identical to a regular space but breaks equality checks (no-break space, narrow no-break space, ideographic space, the en, em, thin, hair, figure, punctuation, three-per-em, four-per-em, six-per-em, and medium mathematical spaces, the line and paragraph separators, and the Ogham space mark).

Each occurrence is reported with its codepoint, official Unicode name, category, line, and column position so you can find it in the source. A per-character breakdown table groups occurrences of the same codepoint with a count, so you can see at a glance which two or three offenders are causing the problem.

Eight removal toggles let you strip categories selectively: keep variation selectors so emoji presentation stays intact, keep the zero-width joiner inside emoji sequences so family, profession, and skin-tone emoji survive, or convert non-standard whitespace to regular spaces instead of removing it entirely so the layout still reads naturally. Sensible defaults strip the common offenders (zero-width characters except inside emoji, BOM, soft hyphens, bidi controls, tag characters, and unusual control characters) and leave the rest alone unless you opt in.
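
As a rough illustration of the per-occurrence report and the breakdown table described above, the sketch below uses Python's standard `unicodedata` module. The function names `scan` and `breakdown`, the dictionary keys, and the rule for what counts as "invisible" (general categories Cf, Cc, Zl, Zp, non-ASCII Zs, plus the variation selector ranges) are assumptions for the example. Columns are counted in codepoints, which may differ from how the tool counts them.

```python
import unicodedata
from collections import Counter


def is_invisible(ch: str) -> bool:
    """Assumed detection rule for this sketch; the tool's own rule may differ."""
    cp = ord(ch)
    cat = unicodedata.category(ch)
    if ch in "\t\r\n":
        return False  # ordinary layout controls are not flagged here
    if cat in ("Cf", "Cc", "Zl", "Zp"):
        return True
    if cat == "Zs" and ch != " ":
        return True  # spaces other than U+0020
    # Variation selectors have category Mn, so they need explicit ranges.
    return 0xFE00 <= cp <= 0xFE0F or 0xE0100 <= cp <= 0xE01EF


def scan(text: str) -> list:
    """Return one record per invisible codepoint, with 1-based line and column."""
    findings = []
    line, col = 1, 1
    for ch in text:
        if ch == "\n":
            line, col = line + 1, 1
            continue
        if is_invisible(ch):
            findings.append({
                "codepoint": f"U+{ord(ch):04X}",
                # Control characters have no formal name, hence the default.
                "name": unicodedata.name(ch, "<unnamed>"),
                "category": unicodedata.category(ch),
                "line": line,
                "column": col,
            })
        col += 1
    return findings


def breakdown(findings: list) -> list:
    """Group findings by codepoint, most frequent first."""
    return Counter(f["codepoint"] for f in findings).most_common()
```

Calling `breakdown(scan(text))` gives a list of `(codepoint, count)` pairs, which mirrors the idea of a frequency-sorted table without claiming to match the tool's exact output.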
An inline preview replaces each invisible codepoint with a labeled badge so you can see exactly where the offenders live before you decide what to do with them. The tool is useful for cleaning text pasted from rich editors and SaaS forms (Notion, Confluence, Google Docs, Slack, email), preparing data for CSV import and database lookup, debugging form validation failures, auditing source files for Trojan Source attacks, normalizing names and addresses that arrived through localized keyboards, and turning a string that mysteriously fails an equality check into one that just works. Everything runs locally in your browser using grapheme-aware iteration; the text you inspect is never uploaded to a server.
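
The badge preview idea can be approximated in plain text by swapping each invisible codepoint for a bracketed label. This is only a sketch under assumptions: the `LABELS` table, the bracket style, and the fallback to a `U+XXXX` tag for other format characters are invented for the example and do not describe how the tool renders its badges.

```python
import unicodedata

# Hypothetical short labels for a few common offenders.
LABELS = {
    "\u200b": "ZWSP",
    "\u200c": "ZWNJ",
    "\u200d": "ZWJ",
    "\ufeff": "BOM",
    "\u00a0": "NBSP",
    "\u00ad": "SHY",
}


def preview(text: str) -> str:
    """Replace invisible codepoints with visible bracketed labels."""
    out = []
    for ch in text:
        if ch in LABELS:
            out.append(f"[{LABELS[ch]}]")
        elif unicodedata.category(ch) == "Cf":
            # Any other format character gets a generic codepoint label.
            out.append(f"[U+{ord(ch):04X}]")
        else:
            out.append(ch)
    return "".join(out)
```

For example, `preview("a\u200bb\u00a0c")` yields `a[ZWSP]b[NBSP]c`, which makes the hidden characters easy to spot in a terminal or log.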

Free to use. Works in your browser. No signup, no login.

You may also like

Text Cleaner

Unicode Character Inspector

Find and Replace

Character Counter

UTF-8 Byte Counter

Letter Frequency Counter