> I believe this is also the Unicode recommendation when context doesn't determine a different algorithm to read.
Except that emojis universally count as two "characters", even those that are encoded as several codepoints. There's also the question of decomposed Korean jamo versus precomposed syllables.
Japanese kana also count as two characters, which is roughly what they come to when romanized, on average. Korean isn’t identical, but the information density is approximately the same. That's good enough to treat it the same way and keep one consistent rule.
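
For what it's worth, here is a rough sketch of that kind of "wide characters count as two" rule in Python. It is an approximation, not any particular platform's actual algorithm: the name `weighted_length` is made up for illustration, and it assumes that the East Asian Width property (Wide or Fullwidth) is a fair stand-in for "counts as two". It normalizes to NFC first so that decomposed jamo and precomposed syllables get the same count. It works per codepoint, so emoji built from several codepoints (ZWJ sequences, flags) will come out higher than two; handling those would need real grapheme cluster segmentation.

```python
import unicodedata


def weighted_length(text: str) -> int:
    """Approximate length where East Asian wide characters count as 2.

    Hypothetical helper for illustration only.
    """
    # NFC composes decomposed Hangul jamo into precomposed syllables,
    # so both spellings of the same text get the same count.
    normalized = unicodedata.normalize("NFC", text)
    total = 0
    for ch in normalized:
        # Skip combining marks left over after normalization.
        if unicodedata.combining(ch):
            continue
        # "W" (Wide) and "F" (Fullwidth) count double; everything else single.
        total += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return total
```

With this, kana and single-codepoint emoji each count as two, and a Hangul syllable counts as two whether it arrives precomposed or as separate jamo.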