agenda+ guidance on identifiers [I18N Action 207]

All,

I have an action item [1] to check our guidelines for clarity on Unicode 
subsets for identifiers. This is a quick summary of what I found and 
what I propose to do about it.

Specdev added section 8.3 relatively recently. This contains guidance 
for specs on "application internal identifiers" [2] and for identifiers 
that are not "application internal identifiers" (effectively everything 
else). Note that charmod-norm ("String Matching") is the source for 
section 8.3's guidance [3].

Application internal identifiers are meant to be "never shown to users 
and are always used for matching or processing within an application or 
protocol" and our guidance is that these should be case-insensitive 
printable ASCII. The 2119 keyword here is SHOULD.
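
As a rough sketch of what this regime implies for an implementation 
(the exact printable subset, U+0021 to U+007E here, and the function 
names are my assumptions for illustration, not text from specdev):

```python
# Assumption: "printable ASCII" means U+0021..U+007E (space excluded).
# The guidelines do not pin down the exact subset.
PRINTABLE_ASCII = frozenset(chr(c) for c in range(0x21, 0x7F))


def is_internal_identifier(value: str) -> bool:
    """True if value is non-empty and uses only the assumed ASCII subset."""
    return bool(value) and all(ch in PRINTABLE_ASCII for ch in value)


def ascii_lower(value: str) -> str:
    """Fold only A-Z to a-z; every other code point is left untouched.

    str.lower() is avoided because it also folds non-ASCII letters.
    """
    return "".join(
        chr(ord(ch) + 0x20) if "A" <= ch <= "Z" else ch for ch in value
    )


def internal_ids_match(a: str, b: str) -> bool:
    """ASCII case-insensitive comparison of two identifiers."""
    return ascii_lower(a) == ascii_lower(b)
```

With this, "Content-Type" and "content-type" match, while a value 
containing a space or any non-ASCII character is rejected up front.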

For other identifiers, we recommend (SHOULD) allowing Unicode characters 
and making matching case- and normalization-sensitive.
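
To illustrate what case- and normalization-sensitive matching means in 
practice, here is a minimal sketch (the function name and sample values 
are hypothetical): identifiers are compared code point by code point, 
so canonically equivalent strings remain distinct.

```python
import unicodedata


def user_ids_match(a: str, b: str) -> bool:
    """Case- and normalization-sensitive: plain code point comparison."""
    return a == b


composed = "caf\u00E9"      # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "cafe\u0301"   # "e" followed by U+0301 COMBINING ACUTE ACCENT

# The two spellings look identical but do not match as identifiers.
different_forms_match = user_ids_match(composed, decomposed)

# They are canonically equivalent: NFC maps one onto the other.
equivalent_under_nfc = unicodedata.normalize("NFC", decomposed) == composed

# Case differences are also significant.
case_variants_match = user_ids_match("Name", "name")
```

The consequence is that content authors, not the matching algorithm, 
are responsible for using a consistent form.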

We have some additional guidance that I won't go into here, mostly to do 
with non-character code points.

We are missing clear guidance on how spec authors should decide between 
these two regimes. We also don't provide guidance on how best to subset 
ASCII for identifiers. Note that identifiers in this context include the 
spec-local syntax or domain-specific languages (DSLs), as well as 
user-defined values (such as variable names).

We should probably develop guidelines to cover the decision tree. Very 
few DSLs are *never* shown to end users. The other factor is whether the 
syntax is machine-generated from business objects ("data") that might 
include non-ASCII values. We want to avoid ending up where CSS is, with 
two regimes in effect simultaneously. We also want to document best 
practices for choosing start and part characters and at least mention 
bidi controls (which present spoofing problems if not addressed).
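
For discussion purposes, here is a sketch of the kind of check such 
best practices might describe. The start and part sets are a rough 
approximation by general category, loosely in the spirit of UAX #31, 
and are an assumption on my part rather than proposed guideline text. 
The bidi set is the list of code points with the Bidi_Control property.

```python
import unicodedata

# Code points with the Unicode Bidi_Control property.
BIDI_CONTROLS = frozenset(
    [0x061C, 0x200E, 0x200F]
    + list(range(0x202A, 0x202F))   # U+202A..U+202E
    + list(range(0x2066, 0x206A))   # U+2066..U+2069
)

# Assumption: approximate start/part characters by general category.
START_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}
PART_CATEGORIES = START_CATEGORIES | {"Mn", "Mc", "Nd", "Pc"}


def has_bidi_control(value: str) -> bool:
    """True if value contains any bidi control character."""
    return any(ord(ch) in BIDI_CONTROLS for ch in value)


def is_valid_identifier(value: str) -> bool:
    """Check start/part characters and reject bidi controls."""
    if not value or has_bidi_control(value):
        return False
    if unicodedata.category(value[0]) not in START_CATEGORIES:
        return False
    return all(
        unicodedata.category(ch) in PART_CATEGORIES for ch in value[1:]
    )
```

The separate bidi check is redundant under these particular category 
sets, but it matters as soon as a syntax admits any format characters 
(for example joiners), which is where the spoofing risk comes in.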

I think the next step is to open a pull request outlining the proposed 
additional guidelines. That PR would also add internal pointers from 
other parts of specdev that talk about identifiers, notably section 6.2, 
to make navigation easier. I do not propose to edit charmod-norm at this 
time, although we probably could add a pointer back to specdev.

Look forward to discussing,

Addison

[1] https://github.com/w3c/i18n-actions/issues/207
[2] https://www.w3.org/TR/charmod-norm/#dfn-application-internal-identifier
[3] https://www.w3.org/TR/charmod-norm/#specifying-content-restrictions

-- 
Internationalization is not a feature.
It is an architecture.

Received on Wednesday, 11 March 2026 00:25:29 UTC