> I believe this is also the Unicode recommendation when context doesn't determi... | Hacker News


zinekeller on Nov 22, 2021 | on: TIL the assumption that string length does not cha...

> I believe this is also the Unicode recommendation when context doesn't determine a different algorithm to read.

Except that emojis are universally two "characters", even those that are encoded as several codepoints. Also, non-composite Korean jamo versus composited jamo.

garmaine on Nov 22, 2021

Like this: “:)” ?

Japanese kana also count as two characters, which they largely are, on average, when romanized. Korean isn’t identical, but the information density is approximately the same. That's good enough to approximate as such and have a consistent rule.
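For illustration, a "wide characters count as two" rule could be sketched roughly like this. This is only a minimal sketch, not any platform's actual counting algorithm: the name `weighted_length` is made up here, and it assumes the weight is decided per code point from the Unicode East Asian Width property (Wide or Fullwidth counts as 2, everything else as 1).

```python
import unicodedata


def weighted_length(text: str) -> int:
    """Hypothetical counting rule: each code point counts as 1,
    or as 2 if its East Asian Width is Wide ("W") or Fullwidth ("F")."""
    total = 0
    for ch in text:
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            total += 2  # kana, precomposed Hangul syllables, CJK ideographs, etc.
        else:
            total += 1  # ASCII and other narrow/neutral code points
    return total


print(weighted_length(":)"))    # two ASCII code points
print(weighted_length("かな"))  # two hiragana code points, each weighted 2
```

Because this works per code point, an emoji encoded as several code points would be counted once per code point rather than as a single two-"character" unit, and decomposed jamo would be weighted differently from a precomposed syllable. A real rule would have to handle those cases separately, which is the caveat raised above.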
