Re: Getting Off The Ground

On 9/18/2015 3:32 PM, Andrew Hankinson wrote:

Thanks for the pointers to the various projects and their goals. Will 
make interesting reading.

> I have experimented (successfully) with round-tripping MEI XML to JSON and back to XML, so there is no reason why this is not possible -- a hierarchical markup language is a hierarchical markup language. There's also no reason it could not go to YAML or any other representation.

Agreed, and I can't presently think of a case where XML is 
insufficiently expressive, just that it is excessively wordy and, for 
web/JavaScript use, requires an extra layer of parsing to obtain 
something useful to JavaScript. But the mental model you describe below 
makes the JSON equivalent of XML appear more complex than it needs to be. 
We don't need to be encumbered by misperceptions of that sort.

The point I was trying to make is that if the standard wants to be 
web-friendly, maybe it needs to have its primary expression be 
web-friendly JSON.

That doesn't preclude having a standard translation between JSON and XML 
so that one can borrow and benefit from schema tools and validation, if 
those are more available and more robust than JSON schema tools, 
validators, and parsers.

I have seen what appear to be iron-clad JSON parsers for nearly every 
language, so XML doesn't seem to have an advantage there. While I have 
seen schema tools and validators "advertised" for JSON, I have no 
experience with them, and I don't know how they compare to their XML 
equivalents or in how many different languages they are available.

> The simplicity of JSON is somewhat deceptive, though. To get the same functionality as XML you need special treatment of attributes, text, and nested tags which are concepts that are not in many "markup" languages. So, an XML snippet like:
>
> <foo bar="baz">This is some <b>bold</b>text</foo>
>
> Would need to expand to something like:
>
> {
>    "foo": [{
>      "@attributes": { "bar": "baz" },
>      "@value": "This is some",
>      "@tail": "text",
>      "b": [{
>         "@value": "bold"
>      }]
>    }]
> }
>
> Which quickly becomes rather unreadable.

But that is because it isn't done "correctly": the structure isn't 
normalized properly. As written, it has no obvious place for text that 
falls between two child tags, so it wouldn't easily support

<foo bar="baz">This <snicklefritze>is</snicklefritze> some 
<b>bold</b>text</foo>

This is a rather deeper topic than I expected in this discussion, but 
the devil is in the details, so if there are misperceptions about 
details, it is better to clear them up early than to let them influence 
design choices.

An XML nested tag consists of three subitems: a tag name, a collection 
of attribute values indexed by key, and an array of content.

A simple convention is to represent an XML tag by an array, where the 
tag name and attributes are the first and second items, and the remaining 
items are content, each of which is either a string or a child tag.

    [
        "foo",
        {"bar": "baz"},
        "This is some",
        [
            "b",
            {},
            "bold"
        ],
        "text"
    ]
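
To make the array convention concrete, here is a minimal sketch in Python of converting XML to that form and back, using the standard library's xml.etree.ElementTree and json modules. The function names element_to_array and array_to_element are made up for illustration, and the sketch ignores namespaces, comments, and processing instructions. It uses the two-child example from earlier to show that text between child tags keeps its position.

```python
import json
import xml.etree.ElementTree as ET


def element_to_array(elem):
    """Return [tag, attributes, *content] for an ElementTree element."""
    node = [elem.tag, dict(elem.attrib)]
    if elem.text:                      # text before the first child tag
        node.append(elem.text)
    for child in elem:
        node.append(element_to_array(child))
        if child.tail:                 # text following this child tag
            node.append(child.tail)
    return node


def array_to_element(node):
    """Rebuild an ElementTree element from [tag, attributes, *content]."""
    tag, attrs, *content = node
    elem = ET.Element(tag, attrs)
    last_child = None
    for item in content:
        if isinstance(item, str):
            if last_child is None:     # string before any child tag
                elem.text = (elem.text or "") + item
            else:                      # string after a child tag
                last_child.tail = (last_child.tail or "") + item
        else:
            last_child = array_to_element(item)
            elem.append(last_child)
    return elem


xml_text = ('<foo bar="baz">This <snicklefritze>is</snicklefritze>'
            ' some <b>bold</b>text</foo>')
as_array = element_to_array(ET.fromstring(xml_text))
as_json = json.dumps(as_array)         # plain JSON, no special keys needed
round_tripped = ET.tostring(array_to_element(json.loads(as_json)),
                            encoding="unicode")
print(as_json)
print(round_tripped)
```

Note that whitespace inside the strings is significant: the sketch keeps "This " and " some " exactly as they appear in the XML so that the round trip reproduces the input.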


The above convention probably produces the most compact representation 
of XML in JSON, but it may or may not be perceived as friendly to access 
from programs, or to keep track of when mentally interpreting it. A 
different convention would be to add noise words to name the three items 
associatively, with content still being either string or subtag:

    {
        "tag": "foo",
        "attr": {"bar": "baz"},
        "data": [
            "This is some",
            {
                "tag": "b",
                "attr": {},
                "data": [
                    "bold"
                ]
            },
            "text"
        ]
    }
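
The two conventions carry the same information, so converting between them is mechanical. The following is only a sketch in Python; the helper names array_to_named, named_to_array, and text_of are hypothetical, and the "tag", "attr", and "data" keys are the noise words used above. It also shows what access from a program might look like in the named form.

```python
def array_to_named(node):
    """[tag, attrs, *content] -> {"tag": ..., "attr": ..., "data": [...]}."""
    tag, attrs, *content = node
    return {
        "tag": tag,
        "attr": dict(attrs),
        "data": [c if isinstance(c, str) else array_to_named(c)
                 for c in content],
    }


def named_to_array(node):
    """{"tag": ..., "attr": ..., "data": [...]} -> [tag, attrs, *content]."""
    return [
        node["tag"],
        dict(node["attr"]),
        *(c if isinstance(c, str) else named_to_array(c)
          for c in node["data"]),
    ]


def text_of(node):
    """Concatenate all string content of a named-form node, in order."""
    return "".join(c if isinstance(c, str) else text_of(c)
                   for c in node["data"])


compact = ["foo", {"bar": "baz"}, "This is some", ["b", {}, "bold"], "text"]
named = array_to_named(compact)

print(named["attr"]["bar"])            # attribute lookup by name
print(named["data"][1]["tag"])         # tag name of the second content item
print(text_of(named))                  # all text, with tags stripped
```

In the array form the same lookups would be compact[1]["bar"] and compact[3][0], which is shorter to write but relies on remembering the positions.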

> -Andrew

Glenn

Received on Friday, 18 September 2015 23:30:24 UTC