Byte Counter — UTF-8, UTF-16 & UTF-32 Byte Calculator

Need the exact byte size of a string? Paste any text below to see its length in UTF-8, UTF-16, and UTF-32 bytes alongside character and code-point counts.

Byte counter tool: enter text to see its UTF-8 bytes, UTF-16 bytes, UTF-32 bytes, characters, and code points.

UTF-8 uses 1–4 bytes per code point: ASCII is 1 byte, accented Latin 2, most CJK 3, and most emoji 4.

How many bytes is your text?

Characters and bytes are not the same thing. A string’s byte size depends on the encoding: UTF-8 uses one byte for ASCII and up to four for emoji, UTF-16 uses two bytes for code points up to U+FFFF and four for everything above, and UTF-32 always uses four per code point. Paste your text above and this tool reports all three at once, so you know exactly how much space a string occupies.
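As a rough illustration of the same arithmetic outside the browser, here is a minimal Python sketch. It is not the tool's own code; the function name `byte_sizes` is made up for this example, and the byte-order mark is deliberately left out of the UTF-16 and UTF-32 figures.

```python
def byte_sizes(text: str) -> dict[str, int]:
    """Return the encoded size of `text` in three encodings, plus its code-point count."""
    return {
        "utf8": len(text.encode("utf-8")),
        # The "-le" codecs write no byte-order mark, so only the text itself is counted.
        "utf16": len(text.encode("utf-16-le")),
        "utf32": len(text.encode("utf-32-le")),
        # len() on a Python str counts code points, not grapheme clusters.
        "code_points": len(text),
    }


# "héllo 👍" written with escapes so the source file's encoding does not matter.
sample = "h\u00e9llo \U0001F44D"
print(byte_sizes(sample))
# {'utf8': 11, 'utf16': 16, 'utf32': 28, 'code_points': 7}
```

The seven code points cost 11 bytes in UTF-8 because é takes two bytes and 👍 takes four, while UTF-32 is always four times the code-point count.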

Why byte size matters

Byte limits show up everywhere developers work: database column sizes (some databases measure VARCHAR length in bytes, others in characters), HTTP headers and cookies, JSON payload budgets, SMS segments, QR codes, and API field caps. In UTF-8, a string of 20 code points can take anywhere from 20 bytes (pure ASCII) to 80 bytes (all emoji), so counting characters alone can silently blow a limit. UTF-8 is the default for the web, JSON, and most databases.
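The 20-versus-80 spread is easy to reproduce. The following Python sketch builds four 20-code-point strings from different scripts and prints their UTF-8 sizes; the sample characters and the helper name `utf8_len` are chosen only for illustration.

```python
def utf8_len(text: str) -> int:
    """Number of bytes `text` occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


# Four strings of 20 code points each, one per script family.
SAMPLES = {
    "ASCII": "a" * 20,
    "accented Latin": "\u00e9" * 20,  # é
    "CJK": "\u4e2d" * 20,             # 中
    "emoji": "\U0001F600" * 20,       # 😀
}

for label, text in SAMPLES.items():
    print(f"{label}: {len(text)} code points, {utf8_len(text)} UTF-8 bytes")
# ASCII: 20 code points, 20 UTF-8 bytes
# accented Latin: 20 code points, 40 UTF-8 bytes
# CJK: 20 code points, 60 UTF-8 bytes
# emoji: 20 code points, 80 UTF-8 bytes
```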

Characters, code points, and bytes

This tool separates three different counts. Characters are grapheme clusters — what a reader perceives as one symbol, including emoji built from several code points. Code points are individual Unicode scalar values. Bytes are the encoded storage size. An emoji like 👍 is one character, one code point, and four UTF-8 bytes; a flag emoji is one character but two code points and eight UTF-8 bytes.
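The code-point and byte figures quoted above can be checked with a short Python sketch. It stops at code points on purpose: counting grapheme clusters needs the Unicode text-segmentation rules, which Python's standard library does not implement. The helper name `describe` is hypothetical.

```python
def describe(text: str) -> tuple[int, int]:
    """Return (code points, UTF-8 bytes) for `text`."""
    return len(text), len(text.encode("utf-8"))


thumbs_up = "\U0001F44D"          # 👍: a single code point
flag_us = "\U0001F1FA\U0001F1F8"  # 🇺🇸: two regional-indicator code points
e_acute = "e\u0301"               # e followed by a combining acute accent

print(describe(thumbs_up))  # (1, 4)
print(describe(flag_us))    # (2, 8)
print(describe(e_acute))    # (2, 3)
```

All three strings display as a single symbol, yet they differ in both code points and bytes, which is why the three counts are reported separately.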

Private and instant

Everything is computed in your browser using the standard TextEncoder, so the UTF-8 count matches what your server or database will store, provided the text is stored as UTF-8 and is not normalized or otherwise transformed first. Nothing you paste is uploaded, which makes it safe for tokens, keys, and other sensitive strings. Edit the text and every figure updates live.

FAQs

How many bytes is one character in UTF-8?

Between 1 and 4. ASCII letters and digits are 1 byte, accented Latin and Greek/Cyrillic are 2, most Chinese/Japanese/Korean characters are 3, and most emoji and other code points above U+FFFF are 4. A few older emoji, such as ❤ (U+2764), sit below U+FFFF and take 3.
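The 1-to-4 rule follows directly from the code point's numeric range. As a small Python sketch (the function name `utf8_bytes_for` is invented here), the thresholds look like this:

```python
def utf8_bytes_for(code_point: int) -> int:
    """Return how many bytes UTF-8 needs for one code point, based on its range."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    if code_point < 0x80:      # U+0000..U+007F: ASCII
        return 1
    if code_point < 0x800:     # U+0080..U+07FF: accented Latin, Greek, Cyrillic, ...
        return 2
    if code_point < 0x10000:   # U+0800..U+FFFF: most CJK, some symbols
        return 3
    return 4                   # U+10000..U+10FFFF: most emoji, rarer scripts


for ch in ["A", "\u00e9", "\u042f", "\u4e2d", "\u2764", "\U0001F600"]:
    print(f"U+{ord(ch):04X}: {utf8_bytes_for(ord(ch))} byte(s)")
# U+0041: 1 byte(s)
# U+00E9: 2 byte(s)
# U+042F: 2 byte(s)
# U+4E2D: 3 byte(s)
# U+2764: 3 byte(s)
# U+1F600: 4 byte(s)
```

The sketch only classifies by range; it does not reject surrogate values (U+D800 to U+DFFF), which cannot be encoded as UTF-8 on their own.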

What is the difference between characters, code points, and bytes?

A character (grapheme) is what you see as one symbol; a code point is one Unicode scalar value; bytes are the encoded size. A single emoji can be 1 character, 1 code point, and 4 UTF-8 bytes — and emoji built from several code points use even more.

Which encoding should I count for a database?

Usually UTF-8, the default for the web, JSON, and most modern databases. Check whether your column length is defined in bytes or characters: a string that fits a character limit can still exceed a byte limit once it contains multibyte UTF-8 characters.
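A Python sketch of that check follows, under two assumptions: the column stores UTF-8, and a character-based limit counts code points. The names `fits` and `truncate_utf8` and the `unit` parameter are hypothetical, not part of any database API.

```python
def fits(text: str, limit: int, unit: str = "bytes") -> bool:
    """Check `text` against a limit measured in UTF-8 bytes or in code points."""
    if unit == "bytes":
        size = len(text.encode("utf-8"))
    elif unit == "chars":
        size = len(text)  # code points, assumed here to be what the column counts
    else:
        raise ValueError("unit must be 'bytes' or 'chars'")
    return size <= limit


def truncate_utf8(text: str, max_bytes: int) -> str:
    """Cut `text` to at most `max_bytes` UTF-8 bytes without splitting a code point."""
    encoded = text.encode("utf-8")[:max_bytes]
    # The slice may end mid-sequence; errors="ignore" drops the incomplete tail.
    return encoded.decode("utf-8", errors="ignore")


name = "\u4e2d\u6587\u5b57"        # 中文字: 3 code points, 9 UTF-8 bytes
print(fits(name, 5, "chars"))      # True
print(fits(name, 5, "bytes"))      # False
print(truncate_utf8(name, 7))      # 中文
```

Truncating this way keeps every code point whole, but it can still cut through a multi-code-point emoji or an accented letter built from combining marks.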

Is my text sent to a server?

No. Byte counting runs entirely in your browser with the standard TextEncoder, so it is safe for API keys, tokens, and other sensitive strings.

Last updated: June 14, 2026
