Why are hidden characters dangerous?

They can alter text rendering, bypass security filters, break parsers, or enable phishing attacks with look-alike URLs.

What characters are detected?

20+ types: ZWSP, ZWNJ, ZWJ, BOM, soft hyphen, RTL/LTR overrides, NBSP, En/Em/Thin/Hair spaces, line/paragraph separators, and more.

Can I remove hidden characters?

Yes. Click Clean to strip all invisible characters, then Copy Clean to get sanitized text.

How does the RTL Override attack work?

U+202E forces the characters that follow it to render right-to-left. A file named 'invoice_' + U+202E + 'fdp.exe' displays as 'invoice_exe.pdf', so users believe an executable is a harmless PDF.

Is my text sent to a server?

No. All analysis runs in your browser. No data is uploaded or stored.

Why does my JSON fail to parse?

An invisible BOM (U+FEFF) at the start of the file is a common cause. Paste your JSON here to detect it.

Regular space vs non-breaking space?

Regular space (U+0020) allows line breaks; NBSP (U+00A0) prevents them. They look identical but are different characters.

Can hidden characters affect SEO?

Yes. Zero-width characters in titles, meta descriptions, or URLs make strings that look identical differ at the character level, which can lead to indexing issues and duplicate content problems.

Character Diagnostic Lab

Hidden Character Detector

Paste text → Scan → See hidden chars highlighted with color labels. Free, instant, private.

Reference

Hidden Character Reference

Name	Code	Risk	Purpose
Zero-Width Space	U+200B	high	Break long words without visible space
Zero-Width Non-Joiner	U+200C	high	Prevent ligature joining in Arabic/Indic scripts
Zero-Width Joiner	U+200D	high	Force ligature or join emoji sequences
Byte Order Mark	U+FEFF	medium	Signal encoding and byte order at the very start of a file
Soft Hyphen	U+00AD	medium	Suggest hyphenation point for word breaks
RTL Override	U+202E	high	Force right-to-left text direction
Non-Breaking Space	U+00A0	low	Prevent line break between words
Line Separator	U+2028	medium	Unicode line separator (a syntax error inside JS string literals before ES2019)

Scenarios

When to Check for Hidden Characters

Suspicious URLs or file names that look normal but lead elsewhere: zero-width characters can disguise a link, and RTL override attacks reverse part of the displayed file name.

Text copied from PDFs or web pages often carries invisible formatting characters that break search and comparison.

Source code containing zero-width spaces can fail to compile with errors that are very hard to spot visually.

JSON/CSV data with a BOM or invisible joiners can fail to parse, or parse into keys and header names that do not match, even when the content looks valid.

Usernames and passwords with hidden characters can bypass security filters or cause authentication failures.

Full Guide

What Are Hidden Unicode Characters?

Hidden characters are Unicode characters that have no visible glyph but still exist within text. They consume data storage, influence text layout and rendering, and can cause hard-to-diagnose bugs in programming, data processing, and security systems.

Unicode defines hundreds of control and formatting characters, but the most commonly encountered fall into four groups: zero-width characters (invisible spacers and joiners), the byte order mark, bidirectional (bidi) control characters used for right-to-left scripts, and special whitespace variants.

Common Hidden Character Types

1. Zero-Width Characters

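U+200B (Zero-Width Space), U+200C (ZWNJ), and U+200D (ZWJ) are among the most commonly abused invisible characters. They take up zero visual space but alter text processing behavior: two strings can look identical on screen and still fail an equality check. ZWJ is legitimately used in compound emoji sequences (family groups, profession emoji, the rainbow flag), and ZWNJ is needed in scripts such as Persian, so they cannot simply be banned everywhere. The same characters are also exploited in phishing, where they are inserted into URLs and link text so that a forged string appears identical to a legitimate one.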
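A minimal sketch of how zero-width characters can be made visible, assuming plain JavaScript. The names ZERO_WIDTH and revealZeroWidth are hypothetical and are not part of this tool; the sample string is made up for illustration.

```javascript
// Zero-width code points discussed above, mapped to short labels.
const ZERO_WIDTH = new Map([
  [0x200b, "ZWSP"],
  [0x200c, "ZWNJ"],
  [0x200d, "ZWJ"],
]);

// Replace each zero-width character with a visible marker such as [ZWSP U+200B].
function revealZeroWidth(text) {
  let out = "";
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    const label = ZERO_WIDTH.get(cp);
    out += label
      ? `[${label} U+${cp.toString(16).toUpperCase().padStart(4, "0")}]`
      : ch;
  }
  return out;
}

// Hypothetical sample: a ZWSP hidden inside a domain-like string.
console.log(revealZeroWidth("pay\u200Bpal.com")); // pay[ZWSP U+200B]pal.com
```

Marking the characters instead of deleting them keeps legitimate uses (emoji sequences, Persian text) intact while still showing exactly where the invisible code points sit.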

2. Byte Order Mark (BOM)

U+FEFF was originally designed to signal byte order in UTF-16 files. Today, it most often appears accidentally at the start of UTF-8 files (as the bytes EF BB BF) saved by certain Windows text editors. A BOM in a JSON or CSV file can cause parser errors or a mangled first column name, and these are extremely confusing because the first character of the file is invisible rather than the expected { or header text.
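The failure described above can be reproduced in a few lines. This is a sketch, assuming the file content has already been decoded into a JavaScript string; stripBom and the sample input are hypothetical names chosen for illustration.

```javascript
// Remove a single leading U+FEFF, if present, and leave the rest untouched.
function stripBom(text) {
  return text.charCodeAt(0) === 0xfeff ? text.slice(1) : text;
}

// Hypothetical input: a JSON document that was saved with a BOM.
const raw = "\uFEFF{\"ok\": true}";

// JSON.parse rejects the BOM because U+FEFF is not JSON whitespace.
let failedWithBom = false;
try {
  JSON.parse(raw);
} catch (err) {
  failedWithBom = err instanceof SyntaxError;
}

// After stripping the BOM the same text parses normally.
const parsed = JSON.parse(stripBom(raw));
console.log(failedWithBom, parsed.ok); // true true
```

Only the first character is checked on purpose: a U+FEFF in the middle of a string is a different problem and is better reported than silently removed.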

3. Bidirectional (Bidi) Characters

RTL Override (U+202E) forces the text that follows it to render right-to-left. This is a well-known attack vector: a file whose name is invoice_ followed by U+202E and then fdp.exe displays as invoice_exe.pdf on many operating systems, tricking users into executing a malicious binary they believe is a harmless PDF.
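One way to defend against this is to inspect the stored characters rather than the rendered name. The sketch below assumes plain JavaScript; BIDI_CONTROLS, hasBidiControl, and realExtension are hypothetical helpers, and the set of code points is a reasonable selection, not an exhaustive policy.

```javascript
// Explicit bidi controls: embeddings and overrides (U+202A-U+202E),
// isolates (U+2066-U+2069), and the marks U+200E, U+200F, U+061C.
const BIDI_CONTROLS = /[\u202A-\u202E\u2066-\u2069\u200E\u200F\u061C]/;

// Return true when a file name contains any bidi control character.
function hasBidiControl(fileName) {
  return BIDI_CONTROLS.test(fileName);
}

// Return the extension taken from the stored characters, not from what is displayed.
function realExtension(fileName) {
  const dot = fileName.lastIndexOf(".");
  return dot === -1 ? "" : fileName.slice(dot + 1);
}

// The spoofed name from the text, written with an escape so it stays visible.
const spoofed = "invoice_\u202Efdp.exe";
console.log(hasBidiControl(spoofed), realExtension(spoofed)); // true exe
```

Rejecting or flagging any file name that contains a bidi control is usually simpler than trying to predict how a given file manager will render it.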

4. Special Whitespace

Non-Breaking Space (U+00A0), En Space, Em Space, Thin Space, and Hair Space look like regular spaces but have different widths and break behaviors. They commonly appear when copying text from PDFs, Word documents, or richly formatted web pages. String comparison fails silently: "hello world" with NBSP is not equal to "hello world" with a regular space, despite looking identical on screen.
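The silent comparison failure is easy to demonstrate. This sketch assumes plain JavaScript; SPECIAL_SPACES and normalizeSpaces are hypothetical names, and the character class covers NBSP plus the U+2000 to U+200A block, which includes the En, Em, Thin, and Hair spaces named above.

```javascript
// NBSP plus the fixed-width spaces U+2000-U+200A, written as escapes.
const SPECIAL_SPACES = /[\u00A0\u2000-\u200A]/g;

// Replace every special space with a regular U+0020 space.
function normalizeSpaces(text) {
  return text.replace(SPECIAL_SPACES, " ");
}

const withNbsp = "hello\u00A0world";
const withSpace = "hello world";

console.log(withNbsp === withSpace);                  // false
console.log(normalizeSpaces(withNbsp) === withSpace); // true
```

Whether to fold these characters is a design choice: for search and comparison it is usually safe, but in typeset output an NBSP may be intentional.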

Hidden Characters in Security

Invisible characters are a favored weapon in phishing and social engineering attacks. Common techniques include:

Homograph attacks: Combining hidden characters with Unicode look-alikes of Latin letters to forge domain names.
RTL override: Reversing the displayed file extension to hide the true format (.exe masquerading as .pdf).
Zero-width injection: Inserting ZWSP into usernames or passwords to bypass blacklists or security filters.
Steganographic watermarking: Embedding a unique combination of invisible characters in documents to trace the source of leaks.
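The zero-width injection technique listed above can be sketched as follows, assuming a simple user-name blocklist. RESERVED, INVISIBLE, isReservedNaive, and isReservedSafe are hypothetical names, and the blocklist contents are made up for illustration.

```javascript
// Hypothetical blocklist of reserved user names.
const RESERVED = new Set(["admin", "root"]);

// Zero-width characters, soft hyphen, and BOM: none belong in a user name.
const INVISIBLE = /[\u200B-\u200D\u00AD\uFEFF]/g;

// Naive check: compares the raw input against the blocklist.
function isReservedNaive(name) {
  return RESERVED.has(name.toLowerCase());
}

// Safer check: strip invisible characters, then apply NFKC, before comparing.
function isReservedSafe(name) {
  const cleaned = name.replace(INVISIBLE, "").normalize("NFKC").toLowerCase();
  return RESERVED.has(cleaned);
}

const sneaky = "ad\u200Bmin"; // renders as "admin"
console.log(isReservedNaive(sneaky)); // false: the filter is bypassed
console.log(isReservedSafe(sneaky));  // true
```

The stripping step is needed in addition to normalize() because Unicode normalization leaves zero-width characters in place.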

Prevention in Code

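Normalize input with String.prototype.normalize() (NFC, or NFKC when compatibility characters such as NBSP should fold to their plain equivalents) before comparing or storing it. Normalization does not remove zero-width or bidi control characters, so strip or reject those separately.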
Use a regex written with explicit escapes to strip invisible characters, for example: /[\u200B-\u200D\u00AD\u202E\uFEFF]/g
Check for a BOM before parsing JSON/CSV and strip it from the start of the input (the bytes EF BB BF in UTF-8, or a leading U+FEFF once decoded).
Display codepoints (U+XXXX) alongside each character when debugging text processing.
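Use an editor that can reveal whitespace and invisible characters (VS Code: Toggle Render Whitespace for spaces and tabs, plus the editor.unicodeHighlight.invisibleCharacters setting for zero-width characters).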
Add a CI/CD pipeline step to check for hidden characters in source code and config files.
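The CI check and the code point display suggested above could be built around a function like the one below. It is a sketch under stated assumptions: HIDDEN and findHiddenChars are hypothetical names, the character set is one reasonable choice, and wiring it to real files (reading each file and failing the build when findings are returned) is left to the pipeline.

```javascript
// Characters to flag, written as escapes so the pattern itself stays visible.
const HIDDEN = /[\u200B-\u200D\u00AD\u2028\u2029\u202A-\u202E\u2066-\u2069\uFEFF]/g;

// Return one finding per hidden character, with 1-based line and column.
function findHiddenChars(source) {
  const findings = [];
  const lines = source.split(/\r?\n/);
  lines.forEach((line, index) => {
    for (const match of line.matchAll(HIDDEN)) {
      const cp = match[0].codePointAt(0);
      findings.push({
        line: index + 1,
        column: match.index + 1,
        codePoint: "U+" + cp.toString(16).toUpperCase().padStart(4, "0"),
      });
    }
  });
  return findings;
}

// Hypothetical source snippet with a ZWSP after the identifier on line 2.
const sample = "const a = 1;\nconst b\u200B = 2;\n";
console.log(findHiddenChars(sample));
// [ { line: 2, column: 8, codePoint: 'U+200B' } ]
```

Reporting positions instead of silently cleaning the file lets a reviewer decide whether a character is an accident or a legitimate part of localized text.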

FAQ

Frequently Asked Questions

About Text Tools

Text tools handle the daily grind of working with strings, paragraphs, and documents: counting words, reversing characters, transforming case, generating slugs, splitting long text, previewing Markdown. These replace separate desktop apps and complex CLI commands with a single URL you can bookmark and use without setup.

Why it matters

Writers, editors, and content teams work with text constraints everywhere — Twitter's 280-char limit, LinkedIn's 1,300-char optimal post, academic abstracts of 250 words, SEO meta descriptions capped at 155. A word counter that shows characters (with and without spaces), words, sentences, paragraphs, and reading time lets you hit platform specs without switching between tools.

Privacy and safety

Text tools process input entirely in your browser. Your blog draft, legal contract, or confidential email never leaves your device. Even the word counter doesn't transmit your text: it runs a simple counting function locally, which is all the task requires. If a text tool sends your text to a server for processing, it creates a data-leak risk that is rarely justified for operations this simple.

Best practices

For SEO titles, aim for 50-60 characters including spaces (Google truncates longer titles)
Meta descriptions work best at 150-155 characters — Google has been showing ~160 on desktop, ~120 on mobile
When generating slugs, keep them short (3-5 words) and all lowercase, use hyphens rather than underscores, and avoid stop words
Markdown preview is useful BEFORE publishing to verify headings, links, and lists render correctly on the target platform