String Split That Does Not Butcher Emojis Using Intl.Segmenter Precision Boxing

Hi, everyone. Let's meet Intl.Segmenter. Hello!

(Intl.Segmenter) G'day, mate.

Say, what does Intl mean?

(Intl.Segmenter) Inter-nasho, mate.

Fascinating. But it has "l".

(Intl.Segmenter) You bet, I fascinate... the area. Shorten it or bin it. You know mate, I've been here since April 2024. I'm the Ecky Nasho Thingo.

(Sips coconut water.)

Ecky... Nasho Thingo. What is that?

(Intl.Segmenter) Ecky, E-C-M-A. Nasho, Inter-nasho. Thingo is a thing. Yeah nah.

(Another sip of coconut water.)

What a... bit. And my word, you're quite the modern feature! So... what exactly are you for?

(Intl.Segmenter) I chop text like a beaut, mate. Whether it's splitting by what folks see as a single character, like an emoji or an accented letter, or by whole words even when there's no space (lookin' at you, Thai and Lao 🧐), or by full-on sentences for counting, parsing, or translating. All while minding Unicode rules and cultural quirks, like Chinese where one squiggle can mean a whole word, or those fancy emoji family reunions, or even Arabic running right-to-left like a kangaroo backwards.

(Cracks open a stubby.)

Kangaroo backwards. 🤔 Sir, you drank coconut water, then opened a small bottle of lager. Do forgive me — is that spoken for?

(Intl.Segmenter) Deadset. I speak of a stubby every time. Or, as you, mate, called it "a small bottle of larger". Make up your mind, small or larger? What am I, a...

Right. It's lager... with the actual "g", never mind. But... isn't it we, the left-to-right lot, who are actually the ones going "backwards"? You know, right-handed, but writing from left to right?

(Intl.Segmenter) Mate, I saw a kangaroo walking backwards. That doesn't mean...

(Slips on the boxing gloves.)

Yes, yes. Let's try... having you split a text into its characters!

(Intl.Segmenter) Reckon you're good to go.


Snippet

This is the snippet.
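
For anyone following along in a console, here is a minimal sketch of what a grapheme-aware split could look like. The function name `splitGraphemes` and its optional `locale` parameter are assumptions made for illustration, and may differ from the snippet this post refers to.

```javascript
/**
 * Split text into user-perceived characters (grapheme clusters), so an emoji
 * or a letter plus its combining accent stays in one piece.
 * NOTE: `splitGraphemes` and `locale` are hypothetical names for this sketch.
 */
function splitGraphemes(text, locale) {
  // An undefined locale lets the runtime fall back to its default locale.
  const segmenter = new Intl.Segmenter(locale, { granularity: "grapheme" });
  // segment() returns an iterable of { segment, index, input } objects;
  // keep only the text of each piece.
  return Array.from(segmenter.segment(text), (part) => part.segment);
}

console.log(splitGraphemes("G'day 🧐"));
// ["G", "'", "d", "a", "y", " ", "🧐"]
```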

(Intl.Segmenter) By crikey! That's a bloody mouthful comment, yeah? 👀

What comment? Oh, all right. Here it is without the comment:

ℹ️ More details about Intl.Segmenter constructor on MDN.
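
The constructor's `granularity` option is what switches between the three kinds of chopping mentioned earlier: `"grapheme"` (the default), `"word"`, and `"sentence"`. A rough sketch of the last two follows, where `chop` is a made-up helper name and the `"en"` default locale is an assumption chosen for the English sample text.

```javascript
// Hypothetical helper: returns the raw segment objects for a given granularity.
function chop(text, granularity, locale = "en") {
  const segmenter = new Intl.Segmenter(locale, { granularity });
  return [...segmenter.segment(text)];
}

const sampleText = "Hello world. How are you?";

// Word granularity also yields spaces and punctuation as segments;
// isWordLike (only present for "word" granularity) filters those out.
const words = chop(sampleText, "word")
  .filter((part) => part.isWordLike)
  .map((part) => part.segment);
console.log(words); // ["Hello", "world", "How", "are", "you"]

// Sentence granularity keeps trailing whitespace with the preceding sentence.
const sentences = chop(sampleText, "sentence").map((part) => part.segment);
console.log(sentences); // ["Hello world. ", "How are you?"]
```

For languages written without spaces, such as Thai, word boundaries come from the runtime's own dictionary data, so results can vary between environments.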


Demonstration

We can compare the output of the function above (labelled Grapheme in the UI) with that of the regular JavaScript .split("") (labelled Normal).

Enter an emoji or any non-alphabetic characters to see the difference.

Sample: Lá Tí DØ 🧐 哎呀
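
The same comparison can also be run in a console. This is only a sketch: `compareSplits` is a made-up helper name, and the escape sequences are used so the exact code points are unambiguous.

```javascript
// Hypothetical helper for this comparison; the name is an assumption.
function compareSplits(text) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  return {
    // .split("") cuts between UTF-16 code units, so characters outside the
    // Basic Multilingual Plane (most emoji) are torn into two lone surrogates.
    normal: text.split(""),
    grapheme: [...segmenter.segment(text)].map((part) => part.segment),
  };
}

const emoji = compareSplits("\u{1F9D0}"); // 🧐
console.log(emoji.normal.length);   // 2
console.log(emoji.grapheme.length); // 1

// "e" followed by a combining acute accent (U+0301) renders as one letter.
const accented = compareSplits("e\u0301");
console.log(accented.normal.length);   // 2
console.log(accented.grapheme.length); // 1

// A family emoji: three people joined by zero-width joiners (U+200D).
const family = compareSplits("\u{1F468}\u200D\u{1F469}\u200D\u{1F467}");
console.log(family.normal.length);   // 8
console.log(family.grapheme.length); // 1
```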


Other

And this... is the snapshot when we pass undefined (empty argument), null, and so on.
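
One way to reproduce this in a console is sketched below. It relies on `segment()` converting its input to a string before chopping, which is why nothing throws here; the helper name `segmentAnything` is hypothetical, and the function used in this post may treat these values differently.

```javascript
// Hypothetical helper name, for illustration only.
function segmentAnything(input) {
  // No arguments: default locale and the default "grapheme" granularity.
  const segmenter = new Intl.Segmenter();
  // segment() applies string conversion to whatever it receives.
  return [...segmenter.segment(input)].map((part) => part.segment);
}

console.log(segmentAnything(undefined).join("")); // "undefined"
console.log(segmentAnything(null).join(""));      // "null"
console.log(segmentAnything(42));                 // ["4", "2"]
console.log(segmentAnything(""));                 // []
```

By contrast, calling `.split("")` directly on `undefined` or `null` throws a TypeError, since neither value has that method.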

(Intl.Segmenter) Yeah nah, I'm accommodating. Look how lovely my face is.

(Intl.Segmenter) Smoko run, mate? Macca's or servo sanga?

I don't... servo... motor? Mecca? Thank you, good sir. I believe that might be a wonderful... meaning.

(Intl.Segmenter) Bikkie?

Bickie?

(Intl.Segmenter) Barbie then?

Are you talking about machinery or an action figure?

(Intl.Segmenter) HAHAHA. Yeah nah, I'm right.


Thanks for visiting. See you next time! 😃
