Re: [community-group] [Format] Token name character restrictions (#60)

In the spirit of the DTCG's "inclusive" principle, my vote would be to _not_ impose any restrictions. Folks around the world may well want to name tokens and groups using their native languages and scripts and those won't always be A - Z. And, if someone wants to throw some emoji or Klingon in there too, then who am I to say no :-)

This approach does introduce some potential challenges though:

## Name clashes after conversions
Export tools like Style Dictionary and Theo will typically take a source token name and convert it into a variable name that follows the rules and conventions of the programming language they're outputting. E.g. a token named "Acid Green" might get turned into camel case `acidGreen` for JavaScript or kebab case `$acid-green` for SASS.

These conversions will often trim off leading and trailing whitespace in the source name. Also, the case may be ignored (e.g. when outputting to kebab case all letters are converted to lowercase). So there can be situations where a token file has different tokens with names that differ only in things like whitespace or case. Their converted names in the output might therefore be identical and cause clashes. For example:

```jsonc
{
   "      Acid Green": { "type": "color", "value": "#82ff4d" },
   "acid    green   ": { "type": "color", "value": "#abcdef" }
}
```

Might get exported as SASS variables like this:

```scss
$acid-green: #82ff4d;
$acid-green: #abcdef;
```
...which is probably not what you want.

That being said, these scenarios are probably rare. Also, it might only be certain kinds of tools (e.g. export tools) for which this is even a problem - a GUI tool might just display the token name as is and not need to convert it in any way. I therefore think it's reasonable to just let tools deal with this as they see fit. E.g. report a warning or error to the user, or tweak the output name to keep it unique (e.g. append a number to it or something like that).
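
To make that concrete, here is a rough Python sketch of how an export tool might spot such clashes and, if it prefers, de-duplicate the output names by appending a number. The `to_kebab` conversion and both helper functions are hypothetical and only for illustration; real tools like Style Dictionary or Theo have their own conversion rules.

```python
import re
from collections import defaultdict


def to_kebab(name: str) -> str:
    """Hypothetical conversion: trim, lowercase, turn whitespace runs into '-'."""
    return re.sub(r"\s+", "-", name.strip().lower())


def find_clashes(names: list[str]) -> dict[str, list[str]]:
    """Group source names by converted name and keep only the groups that clash."""
    by_output = defaultdict(list)
    for name in names:
        by_output[to_kebab(name)].append(name)
    return {out: srcs for out, srcs in by_output.items() if len(srcs) > 1}


def unique_names(names: list[str]) -> dict[str, str]:
    """Give every source name a unique output name, appending a counter on clashes."""
    used: set[str] = set()
    result: dict[str, str] = {}
    for name in names:
        base = to_kebab(name)
        candidate, n = base, 1
        while candidate in used:
            n += 1
            candidate = f"{base}-{n}"
        used.add(candidate)
        result[name] = candidate
    return result


names = ["      Acid Green", "acid    green   "]
print(find_clashes(names))   # could be reported as a warning or an error
print(unique_names(names))   # or the tool could rename instead
```

With the two names from the example above, both convert to `acid-green`, so `find_clashes` reports them as one clashing group, while `unique_names` keeps the first as `acid-green` and emits the second as `acid-green-2`.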


## Invalid names after conversions
Some programming languages may impose restrictions on their variable names (e.g. in JavaScript a variable name cannot begin with a digit). It's therefore likely that, without restrictions, some token names might not be easily converted to variable names in code.

However, as with the previous point, I think this is something tools can deal with. They could warn the user, omit characters that aren't allowed in the syntax of the target language, prepend or append things as needed, etc.
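
As a sketch of what that could look like, the hypothetical Python function below turns a token name into a camel case identifier, omits characters that are not allowed, prepends an underscore when the result would start with a digit, and collects warnings for the user. It only approximates JavaScript's identifier rules (Python's `\w` stands in for the Unicode identifier classes, and reserved words are not checked), so treat it as an illustration rather than a complete implementation.

```python
import re


def to_js_identifier(name: str) -> tuple[str, list[str]]:
    """Return (identifier, warnings) for a token name. Approximate rules only."""
    warnings: list[str] = []

    # Runs of letters, digits, '_' and '$' become words; everything else separates them.
    words = re.findall(r"[\w$]+", name)

    # Anything that is neither a word character nor whitespace gets omitted.
    if re.search(r"[^\w$\s]", name):
        warnings.append("omitted characters that are not allowed in an identifier")

    if not words:
        warnings.append("name contains no usable characters")
        return "_", warnings

    # camelCase: lowercase the first letter, capitalise the start of later words.
    first = words[0][0].lower() + words[0][1:]
    rest = "".join(w[0].upper() + w[1:] for w in words[1:])
    ident = first + rest

    # A JavaScript variable name cannot begin with a digit.
    if ident[0].isdigit():
        ident = "_" + ident
        warnings.append("prepended '_' because the name starts with a digit")

    return ident, warnings


for token_name in ["Acid Green", "2x large!"]:
    print(token_name, "->", to_js_identifier(token_name))
```

Here "Acid Green" becomes `acidGreen` with no warnings, while "2x large!" becomes `_2xLarge` with two warnings (one for the omitted `!`, one for the prepended underscore). Whether a tool renames silently, warns, or fails outright is a choice I'd leave to the tool.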

If a design system team is facing issues because a name they've given a token doesn't work in one or more platforms they need to support, then perhaps they need to choose a different name for their token.


## Implications for aliases
Perhaps not an issue per se, but just wanted to point this out for completeness. If we do not restrict the characters that can be used for token and group names, then we need to cater for that in our reference syntax:

* Characters that are part of the reference syntax (`{`, `.` and `}`) need to be escaped somehow if they occur in a token or group name that is used in the reference.
* Each segment of the reference needs to be an _exact_ match to the corresponding token or group name (so whitespace, case, etc. all need to match exactly).

For example:

```jsonc
{
  "group 1": {
    "token 1": {
      "value": 42
    },
    "   Awkward.to{REFERENCE}  ": {
      "value": 99
    }
  },

  "alias 1": {
    "value": "{group 1.token 1}"
    // This alias token's value will resolve to 42
  },

  "alias 2": {
    // Perhaps we can use backslashes to escape special characters in a reference?
    // (Inside a JSON string, each backslash itself has to be written as \\)
    "value": "{group 1.   Awkward\\.to\\{REFERENCE\\}  }"
    // This alias token's value will resolve to 99
  }
}
```
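
To illustrate the two points above, here is a minimal Python sketch of a reference parser and resolver. The backslash escaping is only the idea floated in this comment, not something the format defines, and `parse_reference` and `resolve` are hypothetical names. The sketch works on the reference string as it exists after JSON decoding, i.e. with a single backslash before each escaped character.

```python
def parse_reference(ref: str) -> list[str]:
    """Split '{a.b}' into ['a', 'b'], honouring backslash escapes (assumed syntax)."""
    if len(ref) < 2 or ref[0] != "{" or ref[-1] != "}":
        raise ValueError("a reference must be wrapped in { and }")
    body = ref[1:-1]
    segments: list[str] = []
    current = ""
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            # Escaped character: take the next character literally.
            if i + 1 >= len(body):
                raise ValueError("dangling backslash at end of reference")
            current += body[i + 1]
            i += 2
        elif ch == ".":
            # Unescaped dot: end of the current segment.
            segments.append(current)
            current = ""
            i += 1
        elif ch in "{}":
            raise ValueError("unescaped brace inside reference")
        else:
            current += ch
            i += 1
    segments.append(current)
    return segments


def resolve(tokens: dict, ref: str):
    """Walk the token tree; each segment must match a name exactly."""
    node = tokens
    for segment in parse_reference(ref):
        node = node[segment]  # KeyError if whitespace or case differs
    return node["value"]


tokens = {
    "group 1": {
        "token 1": {"value": 42},
        "   Awkward.to{REFERENCE}  ": {"value": 99},
    }
}

print(resolve(tokens, "{group 1.token 1}"))
print(resolve(tokens, r"{group 1.   Awkward\.to\{REFERENCE\}  }"))
```

The first reference resolves to 42 and the second to 99. Because the lookup is an exact match, something like `{Group 1.token 1}` fails with a `KeyError` rather than quietly resolving to a similarly named token.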

## Summary
As I mentioned at the start, I don't think the format needs to or should restrict what characters are allowed to be used in names. However, I do think we need to:

* Update our alias section to include a way of escaping special characters inside a reference, and to explicitly state that the segments must be exact matches to the names they reference
* Provide advice for authors to help them choose names that avoid the issues I've highlighted here

-- 
GitHub Notification of comment by c1rrus
Please view or discuss this issue at https://github.com/design-tokens/community-group/issues/60#issuecomment-921322536 using your GitHub account


Received on Thursday, 16 September 2021 23:08:53 UTC