Re: If not JSON, what then ?

I don't know when Poul-Henning will catch a train the next time...

But I was interested in whether CDDL can be used as a specification
language here, because it is always good to look at more use cases.
So let's see (mostly untested at this point because I have no idea
whether anybody wants to use CDDL in this context, but examples should
work with the CDDL tool today):

> Schemas
> =======
> 
> There needs to be an "ABNF"-parallel to specify what is mandatory and
> allowed for these headers in "common structure".

Indeed, CDDL is essentially ABNF ported to tree grammars.

The top-level data model of the proposed format could be expressed as:

header-value = [* dict-element]
dict-element = [name, value-map]
name = text                          ; possibly restricted
value-map = {* name => value}        ; empty by default
value = text / bytes / number / time / value-map
                                     ; add as needed
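
If names do need restricting, a sketch (untested; the rule name and
the pattern are made up for illustration, not part of the proposal)
might be:

restricted-name = text .regexp "[a-z][a-z0-9]*"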

> Ideally this should be in machine-readable format, so that
> validation tools and parser-code can be produced without
> (too much) human intervention.  I'm tempted to say we should
> make the schemas JSON, but then we need to write JSON schemas
> for our schemas :-/

For -09, we are discussing adding a separate machine-readable (JSON)
encoding to be used by tools, in addition to the human-readable format
to be used by humans.  (There is no intention to make the two the same;
that would be a classic mistake.)

> Since schemas basically restrict what you are allowed to
> express, we need to examine and think about what restrictions
> we want to be able to impose, before we design the schema.
> 
> This is the least thought about part of this document, since
> the train is now in Lund:

OK, let's see what restrictions CDDL offers today (they are called
"annotations" there, a not-so-bright name that should be changed):

> Unicode strings:
> ----------------
> 
> * Limit by (UTF-8) encoded length.
>  Ie: a resource restriction, not a typographical restriction.

That would be .size:

dns-label = text .size (1..63)

> * Limit by codepoints
>  Example: Allow only "0-9" and "a-f"
>  The specification of code-points should be list of codepoint
>  ranges.  (Ascii strings could be defined this way)

Today this is generally done via regexps.
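
For the hex example above, a sketch (untested; the rule name is made
up, and this assumes .regexp matches the whole string):

hex-string = text .regexp "[0-9a-f]+"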

> * Limit by allowed strings
>  ie: Allow only "North", "South", "East" and "West"

Those are typically done constructively:

direction = "North" / "South" / "East" / "West"

Of course, regexps can do that, too, if needed.
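
As a sketch (untested; rule name made up for illustration), the regexp
variant could look like:

direction-re = text .regexp "North|South|East|West"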

> Tokens
> ------
> 
> * Limit by codepoints
>  Example: Allow only "A-Z"

token1 = text .regexp "[A-Z]+"

> * Limit by length
>  Example: Max 7 characters

Right now, CDDL can count characters (as opposed to bytes) only by
employing regexps.  Another extension may be needed if there indeed is
a good use case for counting characters.
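
A regexp-based sketch of the 7-character limit (untested; the rule name
and the A-Z alphabet are assumptions chosen for illustration):

token7 = text .regexp "[A-Z]{1,7}"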

> * Limit by pattern
>  Example: "A-Z" "a-z" "-" "0-9" "0-9"
>  (use ABNF to specify ?)

(Regexps, again)
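
For the pattern quoted above, a sketch (untested; rule name made up)
could be:

token2 = text .regexp "[A-Z][a-z]-[0-9][0-9]"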

> * Limit by well known set
>  Example: Token must be ISO3166-1 country code
>  Example: Token must be in IANA FooBar registry

Interesting.  There currently is no formal interface to IANA (or ISO)
registries, and we don't have an informal escape like the prose-val
that ABNF has.  New "annotations" could be added as needed, and they
would come close.

> Qualified Tokens
> ----------------
> 
> * Limit each of the two component tokens as above.
>  
> Binary Blob
> -----------
> 
> * Limit by length in bytes
>  Example: 128 bytes
>  Example: 16-64 or 80 bytes

blob1 = bytes .size (16..64 / 80)

> Number
> ------
> 
> * Limit resolution
>  Example: exactly 3 decimal digits

Would need a new CDDL "annotation", say

q = number .decimals 3

> * Limit range
>  Example: [2.716 ... 3.1415]

etopi = 2.718..3.1415

> Integer
> -------
> 
> * Limit range
>  Example [0 ... 65535]

ex1 = 0..65535
ex2 = uint .size 2   ; fits in two bytes, so also 0..65535

> Timestamp
> ---------
> 
> (I can't think of usable restrictions here)

ts = uint   ; (or whatever type of timestamp you want;
            ;  `time` as a POSIX time and RFC3339 dates are built in)

Grüße, Carsten

Received on Tuesday, 2 August 2016 13:52:35 UTC