Re: ISSUE-29: Proposal to resolve DOM origin generalization from Nathan on 2010-08-12 (public-rdfa-wg@w3.org from August 2010)

From: Nathan <nathan@webr3.org>
Date: Thu, 12 Aug 2010 15:46:36 +0100
To: Manu Sporny <msporny@digitalbazaar.com>
CC: RDFa WG <public-rdfa-wg@w3.org>
Message-ID: <4C64094C.4020701@webr3.org>

Hi Manu,

Overall I'm keen to see the source/origin retrieved by methods rather 
than being held in properties, primarily to sidestep any comparison 
issues and align with traditional RDF.

The only proposal I have to put forward is to create a new method which 
matches store.filter() but returns Elements/Nodes rather than Triples - 
namely:
   store.filterElements(s,p,o,filter);
or
   document.getElementsByFilter(s,p,o,filter);

or, perhaps more simply, add a 'returnElements' parameter to the 
store.filter() method (where appropriate), thus allowing developers to 
choose whether they want RDFTriples or DOMNodes returned.

This allows for the same desired functionality to be implemented simply, 
whilst avoiding any property issues - afaict.
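
A rough sketch of the first option, to make the idea concrete. Everything 
here other than the filter()/filterElements() names is an assumption for 
illustration: the TripleStore class, the add() method, the side Map of 
sources, and the plain objects standing in for DOM nodes are all 
hypothetical and not taken from the draft.

```javascript
// Hypothetical sketch: TripleStore, add() and the `sources` Map are
// illustrative assumptions, not interfaces from the RDFa API draft.
class TripleStore {
  constructor() {
    this.triples = [];
    // Source elements are kept beside the triples, not on them,
    // so the triples themselves stay plain { s, p, o } data.
    this.sources = new Map();
  }

  add(s, p, o, element) {
    const triple = { s, p, o };
    this.triples.push(triple);
    if (element !== undefined) this.sources.set(triple, element);
    return triple;
  }

  // null/undefined for s, p or o acts as a wildcard; `filter` is an
  // optional callback that receives each candidate triple.
  filter(s, p, o, filter) {
    const matches = (want, got) => want == null || want === got;
    return this.triples.filter(
      (t) =>
        matches(s, t.s) &&
        matches(p, t.p) &&
        matches(o, t.o) &&
        (!filter || filter(t))
    );
  }

  // Same matching as filter(), but returns the source elements of the
  // matching triples; triples without a recorded source are skipped.
  filterElements(s, p, o, filter) {
    return this.filter(s, p, o, filter)
      .filter((t) => this.sources.has(t))
      .map((t) => this.sources.get(t));
  }
}

// Stand-ins for DOM nodes so the sketch runs outside a browser.
const span1 = { tagName: "SPAN", id: "name1" };
const span2 = { tagName: "SPAN", id: "name2" };

const store = new TripleStore();
store.add("_:a", "foaf:name", "Alice", span1);
store.add("_:b", "foaf:name", "Bob", span2);
store.add("_:a", "foaf:age", "30"); // no source recorded

const triples = store.filter(null, "foaf:name", null);
const elements = store.filterElements(null, "foaf:name", null);
```

Because the elements live in the store rather than on the triples, a 
non-DOM implementation could simply never record a source and 
filterElements() would return an empty array.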

Further comments in line:

Manu Sporny wrote:
> On 08/09/2010 12:17 AM, Manu Sporny wrote:
>> 3) ISSUE-29: DOM origin generalization (on Manu)
>>    http://www.w3.org/2010/02/rdfa/track/issues/29
>>
>> We need to re-think how .origin is exposed via the RDFa API as well
>> as what it means for a triple to have an origin. .origin also doesn't
>> make sense if the RDFa API is not implemented in a DOM environment.
> 
> In a previous spec, the .origin property was specified for the subject,
> predicate, and object for a triple. This had three problems:
> 
> 1. .origin didn't make sense in a non-DOM environment.
> 2. Just specifying the .origin wasn't very flexible when it came to
>    carrying other data that may be important to developers.
> 3. The name "origin" could be confused with the Origin property in
>    HTTP.
> 
> This resulted in several changes to the editors draft spec:
> 
> 1. .origin was renamed to .source and placed in a dictionary called
>    .info for each subject, predicate and object in a triple.
> 2. .source was optional, so non-DOM environments would still be able
>    to be compliant with the RDFa API.
> 3. .info could be used to carry any arbitrary developer information.
> 4. .source would not be easily confused with other DOM/HTTP concepts.
> 
> Both Ivan and Nathan raised concerns that comparing triples would cause
> confusion because of the .info property on each subject, predicate,
> object in a triple. Toby mentioned that adding the .info property to the
> triple would solve that issue. Ivan and Nathan stated that it would
> still make triple-to-triple comparison difficult.
> 
> I had mentioned that we can state that default comparisons should ignore
> the .info property altogether unless the developer specifically overrode
> the comparison operator to take the .info property into account.

Sadly though, you can't modify how the == comparison operator works in 
many languages, and the only way (afaik) to compensate for this would be 
to introduce an object.equals(o) method.
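
To illustrate, here is a minimal sketch of such an equals() method in 
JavaScript, where == and === on objects compare identity and can't be 
overridden. The IRI and Triple classes are hypothetical stand-ins rather 
than the draft's interfaces, and for brevity the object position is also 
an IRI.

```javascript
// Hypothetical classes for illustration; not the draft's interfaces.
class IRI {
  constructor(value, info = {}) {
    this.value = value;
    this.info = info; // e.g. { source: someElement }; ignored by equals()
  }

  equals(other) {
    return other instanceof IRI && other.value === this.value;
  }
}

class Triple {
  constructor(subject, predicate, object) {
    this.subject = subject;
    this.predicate = predicate;
    this.object = object;
  }

  // Compares the three terms only; any .info they carry plays no part.
  equals(other) {
    return (
      other instanceof Triple &&
      this.subject.equals(other.subject) &&
      this.predicate.equals(other.predicate) &&
      this.object.equals(other.object)
    );
  }
}

// Same IRI found in two different places in the document.
const me1 = new IRI("http://example.org/#me", { source: { id: "el1" } });
const me2 = new IRI("http://example.org/#me", { source: { id: "el2" } });
const knows = new IRI("http://xmlns.com/foaf/0.1/knows");
const bob = new IRI("http://example.org/#bob");

const a = new Triple(me1, knows, bob);
const b = new Triple(me2, knows, bob);

const identical = a === b;      // false: two distinct objects
const equivalent = a.equals(b); // true: .info is not consulted
```

The catch is that end users have to know to call equals() rather than 
reach for ==, which is why keeping source data off the objects entirely 
still looks cleaner to me.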

> Mark suggested removing the .info property from subjects, predicates,
> objects and triples and migrating it to the Property Group object. This
> way, comparison wouldn't be affected and the developer could specify
> whether or not they wanted to retrieve the DOM node associated with a
> particular part of a Property Group. I forgot that we had already done
> this, but hadn't explained how the .info gets populated and what type of
> information it can store. However, we may want to remove .info entirely
> and replace it with a method and leave the implementation of how to
> track .info up to developers. More on this below...

agree with 'replace it with a method' rather than using properties on 
the objects.

> Mark also mentioned modifying the Data Query interface such that a
> developer could specify whether or not they wanted to include DOM
> information or not. This is the current select() interface on DataQuery:
> 
>    Sequence<PropertyGroup> select (in Object? query,
>                                    in optional Object template);
> 
> We would change that interface to this:
> 
>    Sequence<PropertyGroup> select (in Object? query,
>                                    in optional Object template,
>                                    in optional array options);
> 
> 'options' could be an array of string options that should be used to
> build the resulting Property Groups when performing a query.

I like this, but would perhaps prefer a simple boolean to include source 
nodes or not.

> So, a query might look like the following:
> 
> pgs = query.select(..., ..., ["source",]);
> 
> would return an array of property groups that include the source
> information for the PG and all properties of the PG. We would have to
> extend the PropertyGroup interface by adding one of the two following
> methods:
> 
>     Sequence<any> source (in string predicate);
> 
> OR
> 
>     Sequence<any> info (in optional string predicate,
>                         in optional string name);
> 
> I'm leaning toward the latter because it allows the developer more
> flexibility in having many more informational items associated with a
> PropertyGroup. For example, with the latter all of these are possible:
> 
> // get the Property Group's subject declaration elements:
> subjectElements = pg.info(None, "source");
> 
> // get the object declaration elements for all "foaf:name" properties
> objectElements = pg.info("foaf:name", "source");
> 
> One drawback to this approach is that you don't know where predicate's
> sources are, but I couldn't think of a use case where you'd care as most
> predicates are found in @rel/@property/@typeof properties and I couldn't
> think of a case where you'd want to retrieve those.
> 
> I guess we could do something like this:
> 
> subjectElements = pg.info("foaf:name", "subjectSource");
> predicateElements = pg.info("foaf:name", "predicateSource");
> objectElements = pg.info("foaf:name", "objectSource");
> 
> A non-DOM-based RDFa API would return an empty array for each of these
> queries, which would be fine.

As mentioned above, my personal preference, if taking this route, would 
be to go with a simple 'includeSources' param (or similar):
pg.info("foaf:name");
pg.info("foaf:name", true);
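
One possible reading of those two calls, sketched below. The return 
shapes are purely my assumption for illustration (plain values without 
the flag, value/source pairs with it), as are the PropertyGroup 
internals and the add() method; none of this is specified anywhere yet.

```javascript
// Hypothetical PropertyGroup; the storage layout, add() and the return
// shapes of info() are assumptions made for this sketch.
class PropertyGroup {
  constructor() {
    this.entries = new Map(); // predicate -> [{ value, source }]
  }

  add(predicate, value, source = null) {
    if (!this.entries.has(predicate)) this.entries.set(predicate, []);
    this.entries.get(predicate).push({ value, source });
  }

  // includeSources defaults to false, so callers who never ask for
  // sources only ever see plain values.
  info(predicate, includeSources = false) {
    const found = this.entries.get(predicate) || [];
    return includeSources
      ? found.map(({ value, source }) => ({ value, source }))
      : found.map(({ value }) => value);
  }
}

const pg = new PropertyGroup();
// A plain object stands in for the DOM element that declared the value.
pg.add("foaf:name", "Alice", { tagName: "SPAN", id: "name1" });

const values = pg.info("foaf:name");
const withSources = pg.info("foaf:name", true);
```

In a non-DOM implementation the source would just stay null, so the 
same calls keep working.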

> Now, as far as implementations go, I'd expect that each subject,
> predicate, object of a triple would store the source element (which is
> exactly what the current draft text states). So, Ivan, Nathan and Toby's
> concerns are still there because developers may not know that we intend
> that comparisons should be done with only the raw triple data, excluding
> source. So we will have to state something about doing comparisons
> between core RDF types.
> 
> This is all preliminary of course and is open to debate and general
> discussion. I haven't spent a great deal of time thinking through every
> detail of this proposal, so there may be holes in it or someone may have
> a different approach that we haven't heard about yet. Thoughts?

So long as any complexity in the final solution is on the implementers' 
side rather than the end user's, and common operations such as 
comparing triples/iris/literals work as an end user expects them to, I'm 
easy as to whatever the ultimate solution is.

Best,

Nathan
Received on Thursday, 12 August 2010 14:47:41 UTC