RE: DOM in _event.data and KVPs

I want to return to a problem that Stefan raised earlier: how does the processor determine whether data consists of key-value pairs (KVPs) or not?  This is particularly a problem with external events, which can come from any source, possibly over an event i/o processor that hasn't been defined yet.  The cases of XML and JSON aren't too difficult, because they have well-defined formats, but KVPs do not.  However, there are two cases where data is unambiguously KVPs:  a) when it is specified via <param> in <donedata>, and b) when it is encoded as POST parameters over the basic http event i/o processor.

I think we need to separate out the heuristics for determining what the data is from the statements about how to represent a specific format.  Furthermore these heuristics have to be specific to the datamodel (it doesn't make sense to tell the XPath datamodel to interpret something as JSON, for example.)  I think that we could say something along the following lines for the ECMAScript datamodel:


_event.data is populated with data provided by an external event or by <param> or <content>.  The source of the data may have provided some indication of its type, but is not required to do so.  The Processor MUST apply the following heuristics to determine the type of the data:
1. The Processor MUST interpret the data as KVPs if it was specified by <param> inside <donedata> or if it is provided as encoded POST parameters in an event delivered by the basic http event i/o processor.
2. Otherwise, if the Processor supports JSON and is able to interpret the data as JSON, it SHOULD do so.
3. Otherwise, if the Processor is able to interpret the data as a valid XML document, it SHOULD do so.
4. Otherwise, if the Processor is able to interpret the data as KVPs, it SHOULD do so.
5. Otherwise, the Processor MUST interpret the data as a space-normalized string.
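
To make the ordering of the five clauses concrete, here is a rough Python sketch of the cascade.  It is only an illustration under stated assumptions, not proposed spec text: the function name interpret_event_data, the source labels "param" and "basichttp-post", and the supports_json flag are all hypothetical, and it assumes that KVPs in clause 4 arrive form-encoded (name=value&name=value), which the clauses above do not actually say.  Python's standard-library parsers stand in for whatever the Processor really uses.

```python
import json
import xml.etree.ElementTree as ET
from urllib.parse import parse_qsl


def interpret_event_data(raw, source="unknown", supports_json=True):
    """Return a (kind, value) pair following clauses 1-5 in order."""
    # Clause 1: sources that unambiguously deliver key-value pairs.
    if source == "param":
        # Assumption: <param> data is already a sequence of (name, value) pairs.
        return "kvps", dict(raw)
    if source == "basichttp-post":
        return "kvps", dict(parse_qsl(raw, keep_blank_values=True))

    # Clause 2: try JSON if the Processor supports it.
    if supports_json:
        try:
            return "json", json.loads(raw)
        except ValueError:  # json.JSONDecodeError is a subclass of ValueError
            pass

    # Clause 3: try a well-formed, single-rooted XML document.
    try:
        return "xml", ET.fromstring(raw)
    except ET.ParseError:
        pass

    # Clause 4: try form-encoded key-value pairs (assumed encoding).
    try:
        pairs = parse_qsl(raw, strict_parsing=True)
    except ValueError:
        pairs = []
    if pairs:
        return "kvps", dict(pairs)

    # Clause 5: fall back to a space-normalized string.
    return "string", " ".join(raw.split())
```

One thing the sketch makes visible: a bare scalar such as 42 is valid JSON to this parser, so it is caught by clause 2 and never reaches the string fallback in clause 5.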

It would be possible to combine clauses 1 and 4, but with the current ordering the most specific and easy-to-detect cases come first.  (If we just look at what's in the current spec, we could drop clause 4 altogether, but then the basic http event i/o processor would be the only one that could deliver KVPs.)

- Jim
-----Original Message-----
From: Stefan Radomski [mailto:radomski@tk.informatik.tu-darmstadt.de] 
Sent: Monday, April 01, 2013 4:41 PM
To: David Junger
Cc: VBWG Public (www-voice@w3.org)
Subject: Re: DOM in _event.data and KVPs

On Apr 1, 2013, at 9:54 PM, David Junger <tffy@free.fr> wrote:

> I agree that it's nice for authors to know there's always going to be a Document, but using <content> as the root doesn't work semantically and is inconsistent with what you'd get if you used 'src' instead of <content> (the spec is quite clear that those should be equivalent).

That's a good point.

> Jim's solution is not the most flexible but it's not too hard to wrap a list of nodes into a root element, and it could be done as in HTML parsing: if the parser is told (using the 'type' attribute on <content>), or guesses, the document type, then it can insert the missing root. Then we'd have problems only for node lists where the type is unspecified (or unsupported) and heuristics fail.
> 

Well, let's look at our options here:

1. Introduce an additional root element
- Inapplicable, because application developers would expect the same DOM whether they specify it via content.src or inline. That's your argument from above and I agree.

2. Realize that multi-rooted XML in content is itself not an XML document and treat it as a space-normalized string
- Applicable but somewhat unintuitive for application developers. This is Jim's suggestion.

3. Always represent _event.data as a DocumentFragment
- This penalizes application developers who use only a single root element, as all the methods from Document are unavailable on a DocumentFragment.

4. Have some heuristics together with content.type for clarification
- I am not sure whether this is suited to resolve the issue. If you specify content.type as XML and the content is multi-rooted, you'd still need the interpreter to introduce a root element, which is something I'd prefer the user to do.

5. Treat multi-rooted XML in <content> as an error
- Not sure whether this is preferable to any other variant, but it's one way to make it explicit for application developers. The implied assumption is that no one wants space-normalized XML fragments.

6. Only take the first element of content
- This still might leave application developers wondering what happened to the rest of the document but provides a "path to a solution" for application developers as they will see a DOM in _event.data to start with.

Personally, I'd go for:
1. Multi-rooted XML in content?
2. Log a message and take first element as root of new document.

This is in line with what our interpreter does if we encounter multiple elements where only a single one is allowed (e.g. <content> itself).
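
A minimal Python sketch of the "log a message and take the first element" behaviour described above.  It is not taken from any actual interpreter: the function name first_root_document, the logger name, and the synthetic <wrapper> element are hypothetical, and it assumes the content fragment carries no XML declaration or DOCTYPE (either would break the wrapping trick).

```python
import logging
import xml.etree.ElementTree as ET

log = logging.getLogger("scxml.content")  # hypothetical logger name


def first_root_document(fragment: str) -> ET.ElementTree:
    """Build a document from the first top-level element of an XML fragment."""
    # Wrap the fragment in a synthetic element so multi-rooted input still parses.
    wrapper = ET.fromstring(f"<wrapper>{fragment}</wrapper>")
    roots = list(wrapper)
    if not roots:
        raise ValueError("content contains no element")
    if len(roots) > 1:
        # Make the data loss visible instead of dropping siblings silently.
        log.warning(
            "multi-rooted XML in <content>: keeping <%s>, dropping %d sibling(s)",
            roots[0].tag,
            len(roots) - 1,
        )
    first = roots[0]
    first.tail = None  # discard text that followed the element inside the wrapper
    return ET.ElementTree(first)
```

For example, first_root_document("<a><c/></a><b/>") yields a document rooted at <a> and logs a warning about the dropped <b/>, while single-rooted content passes through without any message.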

> 
>> As the expr attribute of content is already subject to evaluation by the datamodel and it's somewhat obvious to return JSON for nested ECMAScript data structures in expr
> 
> ECMAScript objects don't map to JSON in general. And passing actual ECMAScript references to local targets is often preferable even when the structure can be stringified.

Huh? I was under the impression that e.g. an invoked SCXML session does not share the datamodel with its parent? In fact it could run on a completely different host. So passing "references to local targets" would just end up as a potentially undeclared identifier in the other component, if I understood you correctly. It might work if you cannot separate datamodels due to running in the same JS context, but it is nothing the standard allows as far as I read it.

Regards
Stefan

Received on Tuesday, 2 April 2013 20:54:06 UTC