- From: Shane McCarron <shane@aptest.com>
- Date: Thu, 14 Jul 2011 14:08:03 -0500
- To: public-rdfa-wg@w3.org
- Message-ID: <4E1F3E93.3060202@aptest.com>
Forgive me. I missed the meeting and I am still reeling from this. My comments are inline and are not as well thought out as they should be, but I wanted to get this out while it is still fresh in everyone's mind. Oh... and -10 to removing @profile.

On 7/14/2011 12:44 PM, Gregg Kellogg wrote:

> On today's RDF Web Apps call [1], there was some discussion of @profile. ISSUE-96 [2] relates to document ready. I encourage people with an opinion on the use of @profile in RDFa to voice their opinions.
>
> Basically, until all @profile documents are loaded and processed, an application cannot reliably access the RDFa API, because the URI resolution of types and properties cannot be reliably resolved until all term and prefix definitions have completed. Also, the failure to load a profile can mean that an entire sub-tree of the document must be ignored.

The RDFa processing model requires that the document be evaluated from beginning to end, and that @profile attributes are processed as they are encountered. RDFa doesn't allow for ad hoc evaluation of *parts* of a document / DOM tree. Assuming I am correct in this, my response to the above has to be "so?" I can envision an RDFa API implementor wanting to break the processing model by evaluating the document piecemeal as requests for triples are made, but I don't think that is feasible REGARDLESS of whether or not @profile is supported. For example, a given triple in the middle of a document might reference a bnode that is from some other part of the document. You wouldn't know if that reference made sense unless you had processed the whole document, right? (Note: that might be a red herring, but it seems important to me right now.) I think it is a fundamental requirement of the processing model that the entire document is processed and its triples identified before the RDFa API can perform any operations on the document. If this is the case...
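To make the evaluation-order point concrete, here is a toy sketch in Python. This is not a real RDFa processor; the `Node` structure, the `PROFILES` table, and the URIs are all invented for illustration. It only models the two behaviors under discussion: @profile is dereferenced when encountered during a top-down walk, and a failed profile load causes the entire subtree to be ignored.

```python
# Toy model of RDFa's top-down evaluation order. Node, PROFILES, and all
# URIs are illustrative only - this is not the real processing-rule set.

PROFILES = {"http://example.org/ok": {"name": "http://xmlns.com/foaf/0.1/name"}}

class Node:
    def __init__(self, profile=None, term=None, children=()):
        self.profile, self.term, self.children = profile, term, children

def process(node, terms, triples):
    """Evaluate a node and its subtree in document order."""
    terms = dict(terms)                      # evaluation context is scoped
    if node.profile is not None:
        defs = PROFILES.get(node.profile)    # simulate dereferencing @profile
        if defs is None:
            return                           # load failed: skip whole subtree
        terms.update(defs)
    if node.term in terms:
        triples.append(terms[node.term])     # term resolves against context
    for child in node.children:
        process(child, terms, triples)

doc = Node(children=(
    Node(profile="http://example.org/ok", children=(Node(term="name"),)),
    Node(profile="http://example.org/missing", children=(Node(term="name"),)),
))
triples = []
process(doc, {}, triples)
# Only the subtree whose profile loaded contributes anything; the other
# subtree produces no triples at all, exactly as Gregg describes.
```

The point of the sketch: the walk is strictly document-order and whole-document, so "the profile loaded or it didn't" is settled by the time processing finishes.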
then surely the issue of whether a profile is accessible is resolved long before the RDFa API has to do anything? All it would need to do is query the triple store returned from the RDFa Processor and return the triples that match the query. (I know the RDFa API has other capabilities, but they all come down to manipulating triples or finding data related to certain components of triples.) If there are no triples because a profile was unavailable... okay.

> Loading a profile is problematic due to HTTP same-origin restrictions [3]. This can be alleviated by ensuring that profile documents are all CORS enabled, but it remains a complicating factor. Not all authors will be in a position to set CORS policy for served profiles.

Do you honestly believe that there will be that many important profiles? I agree that Bob's Auto Shop might find setting up CORS and a profile challenging, but they aren't going to do it anyway. There will be a handful of important profiles, and those will work right. Almost by definition... since if they do not work right, you won't get any triples. Organizations like Dublin Core and Creative Commons will ensure their profiles work right, their CORS is set up right, etc. How could they not? It's in their own interest, it's not actually hard if you can read and use an editor, and it is REQUIRED for their content to be used on the semantic web via RDFa. If you are building a profile, you are building it for use in RDFa via @profile. q.e.d.

> A profile may fail to load because of network connectivity issues, meaning that the same document may be interpreted differently depending on environmental conditions.

Yes, but we have always said this is such a vanishingly small issue that it is not worth worrying about, other than to define what a conforming processor should do when it occurs.
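In that world, the API's job after processing reduces to filtering a completed triple store. A minimal sketch of what "query the triple store and return the matching triples" means; the tuple layout and the `query` helper are mine for illustration, not the RDFa API's actual interface:

```python
# Minimal triple-store query: once the processor has produced all triples,
# answering API requests is just pattern-matching over that list.
# The (subject, predicate, object) tuple layout is an assumption.

def query(store, s=None, p=None, o=None):
    """Return triples matching the given subject/predicate/object pattern;
    None acts as a wildcard for that position."""
    return [t for t in store
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

store = [
    ("_:a", "http://xmlns.com/foaf/0.1/name", "Shane"),
    ("_:a", "http://xmlns.com/foaf/0.1/mbox", "mailto:shane@aptest.com"),
]
query(store, p="http://xmlns.com/foaf/0.1/name")   # the one name triple
query(store, s="_:b")                              # empty list: no triples, okay
```

And "no triples because a profile was unavailable" falls out naturally: the query simply returns an empty result.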
> Multiple profiles may define the same term differently, which could lead to confusion (on the part of the user; the behavior is well-specified within the processing rules).

Nothing we can possibly do here will change this. Hell, people interpret the meaning of the existing @rel values differently in HTML4. And there was only ONE profile for that! Even if there were no @profile, @vocab does the same thing in an even more flagrant manner (since a processor doesn't dereference @vocab to infer meaning - it just trusts that there are terms).

> Note that the default profile does not present the same problems, since it is assumed that RDFa processors will internally cache the default profile. Concerns were raised about the relatively closed nature of relying on the default profile for prefix definitions, as frequent changes to the profile place a burden on processor developers, and even with a simple registration form, it places a barrier to entry and is generally not in the open nature of RDF.

I guess I agree that there might be a barrier. However, we never envisioned that the default profile definition would change frequently. Surely the collection of interesting terms will not expand quickly. And it is even less likely that new, interesting vocabularies that are expected to be included in a default profile will arise daily! The vision was that, once a year or so, there might be a reason to revise the profile. At least, that was my vision. Moreover, since a profile has an explicit URI, I can reference it via @profile and KNOW that I am getting the collection I wanted at the time I wrote my document (or set up my web site / CMS, or whatever).

> Personally, I really see the advantage of a profile mechanism. In addition to declaring prefixes, it allows an author to establish a number of terms for use within their document.
> CURIEs have been criticized (rightly or wrongly) as being too complex and error prone, but terms are consistent with similar usage in Microdata and Microformats. However, it's not feasible to include a large number of terms in a default profile, where term definitions may overlap between different vocabularies.

Right. But if I were an *author* who cared about a collection of terms, I would use @vocab. @profile wasn't really targeted at authors. It was targeted at taxonomy creators (e.g. Microformats, that news thingy, Dublin Core, Facebook, even schema.org) to make it easier for authors to rely upon the set of terms they need to express their content.

> However, the use of profiles is a substantially complicating factor in RDFa. Removing it still makes RDFa authoring much simpler than other RDF variations, as for most developers, almost all of the prefix definitions they would need to use can be included in the default profile. Also, the use of @vocab provides much of the benefits of a custom profile when a single vocabulary is being used (e.g., http://www.w3.org/2006/03/hcard or http://schema.org/). Also, custom prefix definitions may still be introduced into a document using the @prefix definition.

However, it means that organizations outside of our own have no way to define the collections of prefixes and terms that are relevant to their content developers (schema.org, Facebook, the news people). The major reason to support @profile was to permit those organizations and others to override the defaults. Without such a mechanism, there is no way, for example, for me to ensure that MY authors are restricted to using the terms I want and prefixes that map to what I mean. Instead, they are going to map to whatever the implementation that happens to be parsing the content means. Frankly, that's worse than useless to me.
In the absence of @profile, the only way I, as a content author or publisher with captive authors, can be confident that the semantics are EXACTLY what I want is to... what? Define all my prefixes explicitly on the document element and require that my authors only use scoped terms? I could probably define my own vocabulary via @vocab... but as currently defined, @vocab doesn't clear the terms from the default profile out of my context. So I have no good way to control what is available in my context. If my vocabulary defines "nert", my users will use that. If one accidentally types @rel="next", it will turn into a triple, because I have no way to turn that off. And if two years down the road the default profile learns some new term (e.g., 'security') and one of my authors mistakenly uses that term in a document, I suddenly have a new triple with some weird meaning that the author and I never intended. And I have no way to prevent it! At least with @profile we have a way to explicitly reference the default profile and know what is getting imported into the context space (although there is no way to perform a 'reset' as there is with @vocab - I still think that's a mistake).

> This would also have the benefit that the RDFa API would not have profile load latency issues:
>
> * Potential same-origin problems in loading profiles,
> * Profile loading relies on network connectivity,
> * Processing complication introduced due to profile load failure,
> * Latency introduced by having to load profiles before proceeding with application processing,
> * Need to add notification mechanisms to know when profile processing has completed,
> * Potential difference in CURIE/term meaning based on multiple overlapping profile definitions,
> * No clear community requirement for profiles other than the default. (Sophisticated authors have other syntactic mechanisms to draw on.)

I feel very strongly that letting the RDFa API drive the definition of RDFa is a mistake.
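The "no way to turn it off" complaint can be sketched as a toy term resolver. This models the behavior Shane describes (default-profile terms surviving @vocab), not the actual RDFa processing rules; the default-profile contents and the example vocabulary URI are invented:

```python
# Toy term resolution, modeling the complaint above: @vocab adds a fallback
# mapping but does not clear terms inherited from the default profile.
# DEFAULT_PROFILE_TERMS and the vocab URI are illustrative assumptions.

DEFAULT_PROFILE_TERMS = {"next": "http://www.w3.org/1999/xhtml/vocab#next"}

def resolve(term, vocab=None, terms=DEFAULT_PROFILE_TERMS):
    if term in terms:              # inherited default-profile terms still apply
        return terms[term]
    if vocab is not None:
        return vocab + term        # otherwise fall back to @vocab
    return None                    # unknown term: no triple generated

my_vocab = "http://example.org/vocab#"
resolve("nert", vocab=my_vocab)   # my own term, as intended
resolve("next", vocab=my_vocab)   # still yields a default-profile triple -
                                  # there is no way to reset the context
```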
The API is a nice feature, and it works regardless of the issues above. Latency, notifications, mutation, etc. are all issues in ALL client-side APIs. Rare edge cases - like the network working fine to load the initial page, but no longer working half a second later when I want to load a profile - are no reason to throw out the feature. It is trivial to implement the RDFa processor such that it calls back to the RDFa API when it is done processing... if that's how you want to implement it. Until that callback has occurred, the RDFa API isn't ready. I don't know what the RDFa API says about this today, but no matter what happens with @profile there will always be a gap between DOMReady and when the RDFa API can do something with triples... so I am sure you handle this already (or will do).

And I disagree that there are no community requirements for this. I am sure that Ivan has an opinion about this. Others will chime in. I am in the community, and I have a requirement for it. Mark Birbeck is in the community; he has a requirement for it.

Here's my requirement: I need my triples to be deterministic (modulo the network failing). And I don't want to create an infinite number of cargo-cult programmers in order to achieve this. It is insufficient to say authors can declare all the prefixes they want at the top of their documents. That's how we got into this mess in the first place. I want to be able to tell my authors "use this profile and you will be able to embed all the semantics that we need in our environment", and that's it. Moreover, since it IS a profile, and the profile is in RDFa, my authors can LOOK AT IT! It is an HTML document. It defines terms and prefixes. It is self-documenting. My authors can know IMMEDIATELY what they can use in documents they write for me.

If the API is hard to implement or might be laggy, change the API. As written today, the processing model is straightforward to implement. Many of us have done it.
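The "processor calls back when done" pattern is simple to express. A sketch under the assumption that the API simply refuses requests until the processor's completion callback has fired; the class and method names here are mine, not from the RDFa API draft:

```python
# Sketch of gating the API on the processor's completion callback.
# RdfaApi, _on_processing_complete, and get_triples are illustrative names.

class RdfaApi:
    def __init__(self):
        self._ready = False
        self._store = []

    def _on_processing_complete(self, triples):
        """Callback invoked by the processor after the full document
        (including any profiles) has been processed."""
        self._store = list(triples)
        self._ready = True

    def get_triples(self):
        if not self._ready:
            raise RuntimeError("RDFa processing not complete yet")
        return self._store

api = RdfaApi()
# Calling api.get_triples() here would raise: the gap between DOMReady and
# "triples available" exists with or without @profile.
api._on_processing_complete([("_:a", "ex:p", "o")])
api.get_triples()   # now safe
```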
You don't actually need any new event notifications - the existing model is fine. The RDFa API needs to know when the document has been parsed. Once that is done, everything about the document is readily available. Could that take a little time? Sure. But it isn't "overhead" on every HTML document. It is overhead on HTML documents where a script author has made an RDFa API request to retrieve semantic data. And there WILL be overhead to retrieve that data. Will there be MORE overhead on a document that references a profile that has not yet been cached by the user agent? Yep! Once. Then it will be cached. Your implementation doesn't cache profiles it has retrieved? Why not? Fix it.

> At the time profiles were introduced, there was no mechanism for importing standard prefix definitions in bulk. For almost all cases, a built-in default profile definition addresses this issue. Going further to allow for arbitrary profile introduction may be going too far, at least for this particular version of the standard.

See above. I strongly disagree. The time to do this is now. There won't be a next time. We are not requiring that everyone use it. And there won't be a million little profiles out there. But there will be *some*. And we can't envision what those will be or which will succeed. In the absence of this mechanism I will be forced to declare all of the prefixes I care about every time. I cannot rely upon the default profile, because IT CAN CHANGE UNDERNEATH ME and I have no announcement mechanism! (I know, there goes Shane, bitching about announcement mechanisms again.)

If the working group decides to go this way, there's nothing I can do about it. But it is short-sighted. If I wanted to be short-sighted, I would have worked on HTML5.

-- 
Shane McCarron
Managing Director, Applied Testing and Technology, Inc.
+1 763 786 8160 x120
Received on Thursday, 14 July 2011 19:08:42 UTC