Re: question about IRI spec

* Jeremy Carroll wrote:
>For B, my code does an initial pass of the characters in each component, 
>looking for problematic characters e.g. "--" in the host, or "/./" in 
>the path. If it finds such problematic characters it may trigger more 
>expensive processing (e.g. IDNA syntax checking). What are the 
>characters I should be looking for in the component? i.e. please suggest 
>a set of characters such that if none of these characters is in the 
>IRI then it is necessarily in NFKC? An example would be the set 
>[^\x20-\x7F] which would at least allow me to avoid NFKC checking for 
>URIs. Again I am expecting an answer in terms of some table from 
>unicode.org. e.g. if each character is neither a compatibility character 
>nor a composing character then the component is in NFKC.

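http://www.unicode.org/unicode/reports/tr15/ describes a quickCheck
function for that. It is driven by the NFKC_Quick_Check property, which
is listed per character in DerivedNormalizationProps.txt on unicode.org.
Libraries like ICU already offer a quick check of this kind.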
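
A minimal sketch of such a two-stage check, in Python rather than ICU, in
case it helps. The function name component_is_nfkc is made up for
illustration. It assumes Python 3.8 or later, where
unicodedata.is_normalized is available, and it relies on the fact that
pure-ASCII text is always in NFKC, so the cheap first pass can skip the
normalization check entirely for plain URIs.

```python
import unicodedata


def component_is_nfkc(component: str) -> bool:
    """Return True if the IRI component is already in NFKC.

    Hypothetical helper, not taken from any IRI library.
    """
    # Cheap first pass: ASCII-only text is unaffected by NFKC,
    # so ordinary URIs never reach the more expensive check.
    if component.isascii():
        return True
    # Second pass: let the standard library decide. This is the
    # stdlib normalization test, used here in place of a hand-written
    # quickCheck over the Unicode property tables.
    return unicodedata.is_normalized("NFKC", component)
```

For example, a precomposed "caf\u00e9" passes, while "cafe\u0301"
(combining acute accent) and "\ufb01le" (fi ligature, a compatibility
character) do not.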
