GSM-7 vs UCS-2
SMS has two encodings:

| Encoding | Character set | Per-segment cap (single) | Per-segment cap (concatenated) |
|---|---|---|---|
| GSM-7 | Latin letters, digits, basic punctuation | 160 chars | 153 chars |
| UCS-2 | Unicode (emoji, Arabic, Chinese, …) | 70 chars | 67 chars |
Examples
| Body | Length | Encoding | Segments |
|---|---|---|---|
| "Hello Chidi, your ride arrives in 3 min." | 40 | GSM-7 | 1 |
| 160 chars of Latin text | 160 | GSM-7 | 1 |
| 161 chars of Latin text | 161 | GSM-7 | 2 (153 + 8) |
| "Your ride is here! 🚗" | 20 | UCS-2 (because of 🚗) | 1 |
| "Your ride is here! 🚗 " + repeated 60 chars | 80 | UCS-2 | 2 |
To check a body’s segment count before sending, call POST /v1/sms-templates/:id/render for templates, or compute it locally.
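A local count can be sketched as follows. This is a minimal illustration, not the platform's own implementation: `count_segments` and `SegmentInfo` are hypothetical names, and the caps (160/153 and 70/67) come from the table above. It assumes the standard GSM 7-bit alphabet, where extension characters such as `€` or `[` cost two units each, and it counts UCS-2 bodies in 16-bit code units, so an emoji outside the Basic Multilingual Plane counts as two. It does not model the edge case where a two-unit character would straddle a segment boundary.

```python
import math
from typing import NamedTuple

# GSM 7-bit basic alphabet: each character costs one 7-bit unit (septet).
GSM7_BASIC = frozenset(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Extension-table characters: still GSM-7, but each costs two septets.
GSM7_EXTENDED = frozenset("^{}\\[~]|€\f")


class SegmentInfo(NamedTuple):
    encoding: str   # "GSM-7" or "UCS-2"
    units: int      # septets for GSM-7, 16-bit code units for UCS-2
    segments: int


def count_segments(body: str) -> SegmentInfo:
    """Estimate encoding and segment count for an SMS body (hypothetical helper)."""
    if all(ch in GSM7_BASIC or ch in GSM7_EXTENDED for ch in body):
        units = sum(2 if ch in GSM7_EXTENDED else 1 for ch in body)
        encoding, single_cap, multi_cap = "GSM-7", 160, 153
    else:
        # One non-GSM character switches the whole body to UCS-2.
        # Count 16-bit code units: characters outside the BMP take two.
        units = len(body.encode("utf-16-be")) // 2
        encoding, single_cap, multi_cap = "UCS-2", 70, 67
    # A body that fits the single cap needs no concatenation header.
    segments = 1 if units <= single_cap else math.ceil(units / multi_cap)
    return SegmentInfo(encoding, units, segments)
```

For instance, `count_segments("a" * 161).segments` evaluates to 2, while `count_segments("a" * 150 + "🚗").segments` evaluates to 3 because the emoji moves the whole body to the 67-unit cap.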
Why the count drops on concatenation
Multi-segment SMS uses a small header on each part to tell the phone how to reassemble the message. That header takes up the space of 7 characters in GSM-7 and 3 characters in UCS-2, so a “160-char” segment really only has 153 of your characters to work with once it’s concatenated.

Cost implications
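A friendly emoji can double or triple your cost if you weren’t expecting UCS-2. For example, a 150-character GSM-7 body fits in one segment, but adding a single emoji switches the whole body to UCS-2 and splits it into three segments of up to 67 characters each.

Best practices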
Keep transactional messages within 160 GSM-7 characters
Tight, no emoji, no fancy quotes. One segment per send.
ASCII-fy URLs
Don’t use URL shorteners that put non-Latin characters in the link; those force UCS-2.
Watch smart quotes
Word / Notes replace " with “ or ” and ' with ’, and both replacements force UCS-2.

Preview campaign copy
Use the template render endpoint in CI to assert segment count ≤ 1.
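A CI check along these lines might look like the sketch below. The source does not describe the render endpoint's request or response format, so the HTTP call is deliberately left out: `render_segments` is a hypothetical callable that you would implement to call POST /v1/sms-templates/:id/render and return the reported segment count. `templates_over_budget` and the template IDs are likewise illustrative names.

```python
from typing import Callable, Dict, Iterable


def templates_over_budget(
    template_ids: Iterable[str],
    render_segments: Callable[[str], int],
    max_segments: int = 1,
) -> Dict[str, int]:
    """Return {template_id: segments} for every template above the budget."""
    over: Dict[str, int] = {}
    for template_id in template_ids:
        # Assumption: render_segments wraps the template render endpoint.
        segments = render_segments(template_id)
        if segments > max_segments:
            over[template_id] = segments
    return over


# Stand-in data instead of real endpoint calls, for illustration only.
fake_counts = {"ride-arriving": 1, "promo-emoji": 3}
offenders = templates_over_budget(fake_counts, fake_counts.__getitem__)
print(offenders)
```

In a real pipeline the job would fail whenever `offenders` is non-empty, for example with `assert not offenders, offenders`.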
Handy bytes worth avoiding
These tiny Unicode characters force the whole body to UCS-2:
- “ (U+201C) and ” (U+201D): curly quotes
- ’ (U+2019): typographic apostrophe
- … (U+2026): ellipsis
- – (U+2013) and — (U+2014): en/em dashes
- Non-breaking space (U+00A0)

Replace them with ", ', ..., -, --, and a plain space, respectively.
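The substitutions listed above can be applied automatically before sending. The following is a small sketch using Python's built-in `str.translate`; `asciify` and `REPLACEMENTS` are hypothetical names, and the mapping covers only the characters named in this section, so extend it if your copy contains others.

```python
# Map each UCS-2-forcing character from the list above to its plain equivalent.
REPLACEMENTS = {
    "\u201c": '"',    # left curly double quote
    "\u201d": '"',    # right curly double quote
    "\u2019": "'",    # typographic apostrophe
    "\u2026": "...",  # ellipsis
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
    "\u00a0": " ",    # non-breaking space
}
_TABLE = str.maketrans(REPLACEMENTS)


def asciify(body: str) -> str:
    """Replace the listed typographic characters with GSM-7-safe ones."""
    return body.translate(_TABLE)
```

Running the body through a helper like this before counting segments keeps accidental smart punctuation from switching the message to UCS-2. Characters that are not in the mapping, such as emoji, pass through unchanged.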