The text-icu package provides Haskell bindings for the ICU library. The library author notes that there is some memory overhead, because a Haskell Text value has to be copied into a fixed memory area before it can be passed to ICU through the FFI. text-icu manages these buffers automatically, so you don't need to mess around with Ptr and raw heap memory allocation, but the copying overhead is to be expected.

kyagrd@kyahp:~/cscs/text-icu-ko$ ghci -XOverloadedStrings
GHCi, version 7.4.1: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Prelude Data.Text.ICU> :m + Data.Text.ICU.Normalize
Prelude Data.Text.ICU Data.Text.ICU.Normalize>

Under the compatibility normalization forms NFKD (compatibility decomposition) and NFKC (compatibility composition), Hangul compatibility jamo are equated with the conjoining choseong (initial consonant) and jungseong (vowel) jamo, as follows:

…> (normalize NFD "나",normalize NFD "ㄴㅏ")
("\4354\4449","\12596\12623")
…> (normalize NFKD "나",normalize NFKD "ㄴㅏ")
("\4354\4449","\4354\4449")
…> normalize NFKD "나" == normalize NFKD "ㄴㅏ"
True

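However, the Unicode standard does not seem to relate compatibility jamo to jongseong (final consonant) jamo in the way needed here, so the ICU library cannot be expected to provide such a facility. In the session below, the trailing ㄴ normalizes to the choseong \4354 rather than the jongseong \4523: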

…> ("난","ㄴㅏㄴ")
("\45212","\12596\12623\12596")
…> (normalize NFD "난",normalize NFD "ㄴㅏㄴ")
("\4354\4449\4523","\12596\12623\12596")
…> (normalize NFKD "난",normalize NFKD "ㄴㅏㄴ")
("\4354\4449\4523","\4354\4449\4354")
…> normalize NFKD "난" == normalize NFKD "ㄴㅏㄴ"
False

So, to define an operator such that "난" =:= "ㄴㅏㄴ" is True, one has to implement it oneself, referring to the Hangul-related Unicode code charts.
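
One possible way to do this is sketched below. It is only an illustration, not something text-icu provides: it works on plain String with nothing but base, and the names toCompat, choseongCompat and jongseongCompat are made up for this example. The idea is to rewrite every precomposed syllable (U+AC00 to U+D7A3) and every modern conjoining jamo as compatibility jamo, using the syllable index arithmetic from the Unicode standard, and then compare the results.

```haskell
import Data.Char (chr, ord)

-- Compatibility jamo for the 19 initial consonants (choseong), in index order.
choseongCompat :: [Char]
choseongCompat = map chr
  [ 0x3131, 0x3132, 0x3134, 0x3137, 0x3138, 0x3139, 0x3141, 0x3142, 0x3143
  , 0x3145, 0x3146, 0x3147, 0x3148, 0x3149, 0x314A, 0x314B, 0x314C, 0x314D
  , 0x314E ]

-- Compatibility jamo for the 27 final consonants (jongseong), in index order.
jongseongCompat :: [Char]
jongseongCompat = map chr
  [ 0x3131, 0x3132, 0x3133, 0x3134, 0x3135, 0x3136, 0x3137, 0x3139, 0x313A
  , 0x313B, 0x313C, 0x313D, 0x313E, 0x313F, 0x3140, 0x3141, 0x3142, 0x3144
  , 0x3145, 0x3146, 0x3147, 0x3148, 0x314A, 0x314B, 0x314C, 0x314D, 0x314E ]

-- Rewrite one character as compatibility jamo (hypothetical helper).
-- The 21 vowels need no table: U+1161..U+1175 and U+314F..U+3163 are
-- laid out in the same order.
toCompat :: Char -> String
toCompat c
  | n >= 0xAC00 && n <= 0xD7A3 =            -- precomposed syllable
      let s = n - 0xAC00
          l = s `div` 588                   -- 588 = 21 vowels * 28 finals
          v = (s `mod` 588) `div` 28
          t = s `mod` 28                    -- 0 means "no final consonant"
      in  choseongCompat !! l : chr (0x314F + v)
            : [jongseongCompat !! (t - 1) | t > 0]
  | n >= 0x1100 && n <= 0x1112 = [choseongCompat !! (n - 0x1100)]
  | n >= 0x1161 && n <= 0x1175 = [chr (0x314F + n - 0x1161)]
  | n >= 0x11A8 && n <= 0x11C2 = [jongseongCompat !! (n - 0x11A8)]
  | otherwise                  = [c]        -- everything else is left alone
  where n = ord c

-- Loose Hangul equality: equal once both sides are spelled in compatibility jamo.
infix 4 =:=
(=:=) :: String -> String -> Bool
a =:= b = concatMap toCompat a == concatMap toCompat b
```

Loaded into GHCi, "난" =:= "ㄴㅏㄴ" evaluates to True with this definition. Note that the relation is deliberately loose: it ignores syllable boundaries, so "나ㄴ" is also considered equal to "난". To use it with Text, one could go through Data.Text.unpack.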

References:

#langdev channel on the Ozinger IRC network

Decomposing UTF-8 Hangul strings into initial/medial/final letters (jamo) (original title: UTF8 한글 문자열을 첫가끝 낱자(자소)로 분해하기)

http://hackage.haskell.org/package/text-icu