- From: Mark Birbeck <mark.birbeck@webbackplane.com>
- Date: Tue, 17 Feb 2009 09:54:47 +0000
- To: Kjetil Kjernsmo <kjetil@kjernsmo.net>
- Cc: RDFa <public-rdf-in-xhtml-tf@w3.org>

Hi Kjetil,

My apologies for the delay this time. :)

I probably didn't make it clear enough where the distinctions were between our two approaches, for which I apologise.

So let's put Fresnel to one side -- I agree it's complicated, and if it were to be part of a solution it would need to be greatly simplified. Also, put aside my jSPARQL technique -- that's just a way for one lot of people to create 'formatters' that other people can use (the HTML author wouldn't need to get involved in that, but that's by the by).

So let's simply say that there is _some_ templating language or technique, but we don't yet know what it is... and in fact there may be many.

Now, that leaves us with the core of our disagreement, which is that in your technique, you would like to use a 'named graph' to identify the templating rules, whilst I don't think that is necessary.

I do think that named graphs are an important concept and that RDFa has the power to get it right. But I don't think that an RDFa document should have the power to place triples into _any_ named graph, for reasons of provenance (which I've mentioned).

But anyway, we can return to named graphs later, and for this post I want to stress that I believe that the templating language that is used (whatever it might be) can be accessed by using 'well-known predicates', rather than a named graph. In other words, we don't actually _need_ to use named graphs as the solution.
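
To make the 'well-known predicates' idea concrete, here is a minimal sketch of a processor step that scans the triples from an ordinary RDFa parse and picks out template definitions. It is only an illustration under stated assumptions: the predicate names follow the rat: examples used later in this message, the namespace URI is a placeholder (the message never says what rat: expands to), and the triples are made-up data rather than the output of any real parser.

```python
# Hypothetical namespace URI: the message never says what "rat:" expands to.
RAT = "http://example.org/rat#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def find_templates(triples):
    """Return {template_subject: {"endpoint": ..., "pattern": ...}}."""
    # Pass 1: every subject typed as rat:Template is a template.
    templates = {s: {} for s, p, o in triples
                 if p == RDF_TYPE and o == RAT + "Template"}
    # Pass 2: collect the well-known predicates hanging off each template.
    for s, p, o in triples:
        if s in templates and p in (RAT + "endpoint", RAT + "pattern"):
            templates[s][p[len(RAT):]] = o
    return templates


# Triples as a plain RDFa parser might report them for one page (made-up data).
triples = [
    ("_:a", RDF_TYPE, RAT + "Template"),
    ("_:a", RAT + "endpoint", "http://dbpedia.org/sparql"),
    ("_:a", RAT + "pattern", "<div>...</div>"),
    ("http://example.org/doc", "http://purl.org/dc/terms/title", "A page"),
]
print(find_templates(triples))
```

Nothing here depends on named graphs: a parser that knows nothing about templates yields the same triples, and only the template-aware step acts on them.
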

To illustrate, in your example, you do this:

  <rat:graph xml:id="query1" endpoint="http://dbpedia.org/sparql">
    <div about="sub:resource">
      <div property="rdfs:label">Resource Description Framework</div>
      <div property="rdfs:comment" datatype="rdf:XMLLiteral">
        <rat:variable name="sub:comment"/>
      </div>
    </div>
  </rat:graph>

You want to overload the use of @xml:id (which you'll notice that the RDFa spec was *very* careful to avoid using, for reasons I won't go into here), you want to add some extra elements and attributes to XHTML, and you want to parse @about, etc., in different ways in different contexts.

On the use of @xml:id for named graphs, as I said, I think we should discuss that separately.

I think the biggest problem here is the use of an extra element, and the fact that it seems unnecessary. The use of an extra element means that a 'version 1' RDFa parser would parse your markup differently to a 'rat-enabled' parser. The latter would realise that it has a template, but the former would just parse @about as normal.

My main point, though, is that I don't think we need to do this at all, because if we say that the core of your proposal is the notion of expressing both a query and substitution template in one go, why not just do this:

  <div typeof="rat:Template">
    <span rel="rat:endpoint" href="http://dbpedia.org/sparql"></span>
    <div rel="rat:pattern" datatype="rdf:XMLLiteral">
      ...
    </div>
  </div>

Then all your processor has to do is to run through the triples obtained from parsing the entire document, find any item of type 'rat:Template', and then begin processing the templates. (That's what my processor does, except it looks for Fresnel rules, but the principle is the main point here.)

You could even go further and indicate on an element which template to apply:

  <div about="[_:a]" typeof="rat:Template">
    ...
  </div>

  <div about="[_:b]" typeof="rat:Template">
    ...
  </div>

  <div rel="rat:applyTemplate" resource="[_:a]">
    ...
  </div>

  <div rel="rat:applyTemplate" resource="[_:b]">
    ...
  </div>

(And this would allow you to load templates from a library.)

Of course, these triples would be part of the default graph, but in my view that is correct, since it is this document that contains the templates, i.e., they are part of the named graph that the query/substitution belongs to. (To put it a different way, the provenance of the templates is clear.)

With all of that as the context, I'll address some of your points directly.

Note one rider, though; when I read your template proposal I mentally parsed the named graph support as if it was in line with the document you wrote with Toby:

  <http://buzzword.org.uk/2009/rdfa4/spec>

The latter proposes that a named graph can be 'any URI', which is what I am very much arguing against, for the reasons I've outlined below. However, I realise that the template proposal uses a kind of 'graph collection' approach, which doesn't suffer from the same problems. I'd like to discuss that separately, so as I've said, this email is mainly about the fact that I don't think we need any new features in RDFa to handle templates.

> Hi Mark! Thanks for the insightful comments, and sorry for taking so
> long to respond.

Not at all... now it's my turn to apologise. :)

>> I'm not convinced that named graphs are required to support the
>> use-case that you describe, and I'd like to show another approach to
>> templating that doesn't require them.
>
> I'm all ears!
>
>> But the only way this is manifest at the RDFa document level is that
>> the URI of the document becomes the named graph.
>
> Right!
>
> I think this is valuable too, and so I finally got around to actually
> read the excellent spec and found that it mandates a single default
> graph, and I would not suggest that this is changed, as it would break
> both this useful feature and backwards compatibility. Thus, we are
> suggesting that the triples are in the named graphs in *addition* to
> the to the default graph.

The ability to support additional graphs is, of course, by design. This allows RDFa processors to do other things, including adding their own processing rules, whilst still retaining the ability to say in the specification that 'a conforming processor must produce these triples'.

For example, in my opinion the 'alt' attribute on the (X)HTML 'img' tag could be regarded as an rdfs:label for the image:

  <img alt="Me on holiday" src="holiday.png" />

However, in past discussions there has been no agreement on that, and so we were left with a tricky situation; if I were to add that feature to my RDFa processor, then I would be non-conformant because my processor does not produce the set of triples that the RDFa spec says should be produced.

But on the other hand, we don't want to stifle innovation, and stop people generating additional triples that they can do clever things with.

So by allowing me to add an rdfs:label to some separate graph in my processor, I can still achieve 100% conformance -- because my default graph matches the one described by the spec -- at the same time as allowing me to experiment and try out new things.

And of course, if one day some feature that someone has been experimenting with in their processor gets incorporated into the spec, then you can simply move the processing in your processor, so that the triples get stored in the default graph.

But I don't think your templates fall into this category. I don't see anything wrong with you storing your template rules directly in the main graph -- after all, from the point of view of 'provenance', it really is the case that the origin of the triples about the template is the HTML document currently being parsed.

> The main argument against this approach is duplication of data, but that
> is a minor thing compared to the potential usefulness of the approach.

:) I know that people always say that, but for me, if I end up with duplication, alarm bells start ringing and I try to find a more efficient way. But anyway, that's not an argument for, or against, so let's put this to one side.

>> So I'm keen to see us preserve a one-to-one mapping between an
>> HTML+RDFa document and a named graph.
>
> Sure, but that we say they need to be in both graphs takes care of that,
> right?

[These comments are mainly relevant to your named graph proposal, in the situation where the graph attribute has @about-like properties.]

No, it doesn't. I'm talking about a one-to-one mapping which means the identity works in both directions; you are talking about a one-to-many mapping, in that each graph could contain triples that come from many different sources.

I'll try to explain.

Say I have a document with the URI of A, and it contains some triples. I also have a document with the URI of B, and that also contains some triples. If I store them in my triple store, in two named graphs, then it's very easy to keep the triples apart. If I visit A again at some point in the future, I can simply delete all of the triples, and reinsert them again, without having to worry that named graph A has been 'polluted' by triples from somewhere else (i.e., that I am deleting too many triples).

All the triples are still usable of course, because I can run a SPARQL query across all graphs at the same time. And of course, I can also query just one graph. (And since a named graph is just a URI, I can also query for 'all graphs created by Kjetil', before then querying those graphs.)

Now, what if any HTML document that contains RDFa (which is essentially a named graph) can add triples to any other named graph? That is the model you are describing, and that allows document A to contain triples that will end up in *both* named graphs A and B (the same goes for document B).
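
As a rough illustration of the two models, the following toy quad store files each triple under a graph URI and refreshes a document by replacing its whole graph. It is a sketch only: the class, its method names, and the example URIs are invented for this purpose and do not correspond to any real triple store's API.

```python
from collections import defaultdict


class QuadStore:
    """Toy store: each triple is filed under the URI of the graph it belongs to."""

    def __init__(self):
        self.graphs = defaultdict(set)

    def refresh(self, doc_uri, triples):
        """Re-crawl a document: drop everything in its graph, then reinsert."""
        self.graphs[doc_uri] = set(triples)

    def add(self, graph_uri, triple):
        """Write a single triple into an arbitrary graph."""
        self.graphs[graph_uri].add(triple)


store = QuadStore()
A, B = "http://example.org/A", "http://example.org/B"

# One-to-one model: each document only ever writes to its own graph.
store.refresh(A, [("ex:s1", "ex:p", "ex:o1")])
store.refresh(B, [("ex:s2", "ex:p", "ex:o2")])

# Any-graph model: document B is allowed to place a triple in graph A.
store.add(A, ("ex:s3", "ex:p", "ex:o3"))

# Revisiting A replaces graph A wholesale, so B's contribution disappears.
store.refresh(A, [("ex:s1", "ex:p", "ex:o1")])
lost = ("ex:s3", "ex:p", "ex:o3") not in store.graphs[A]
print(lost)  # True
```

With only (triple, graph) recorded, the store has no way to tell which entries in graph A came from which document, so a refresh either deletes too much or too little.
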
At the end we still have two named graphs, just like in my scenario, but the problem now is that we can't separate the triples that are in graph A that came from your document, from the triples in graph A that came from my document.

You could argue that this is a mere implementation detail, and that you could keep track of the origin of each triple; to some extent that is true, but there are two important issues here. The first is that whilst most people are talking about using 'quads' to keep track of their triples (i.e., the triple plus named graph, which is equivalent to origin), your solution would require 'pents' (i.e., triple, plus named graph, plus origin).

The second issue is that we already have a mechanism for querying named graphs in SPARQL, and that mechanism could be used when dealing with things like 'give me all statements that came from document A'. But we don't have any way of querying across 'pents', so that would need to be invented.

>> TEMPLATING
>>
>> I'm really excited to see the proposals you've made on templating,
>> but perhaps I can explain the approach I've taken to the questions
>> that you have raised, to show how I don't think you need named graphs
>> to do what you want.
>>
>> In the library I mentioned, I've taken an approach to templating that
>> is based on Fresnel [2]. To be brutally honest, I think Fresnel is a
>> bit over-complicated :), but I felt that since it already existed, it
>> would probably make sense to start with that, and then add things as
>> it became clear what else was needed.
>
> Yeah, we also looked at Fresnel, and we came to the opposite
> conclusion. :-)

Which conclusion? I said that it's over-complicated, which I think you agree with. ;)

> To explain where I come from: We do mostly ontology engineering, big-O
> and little-o, reasoning, SKOS thesauri, search and that kind of stuff.
> The web work we do is currently trivial, thus it is not where we'd like
> to spend time, and moreover, we'd like to give the styling to someone
> else in the company, who might be good at CSS and know a little XSLT,
> but we're not there yet.
>
> I suspect that we might have this in common with some web developers who
> only wants to use a bit of data of the Semantic Web with their
> relatively simple web pages.
>
> So, if visualization was important to me, I'd certainly go with Fresnel,
> and I think I might find use for the full complexity of it. I've been
> advocating that we pick up Fresnel for a long time, but it was hard to
> sell. Not the least because it meant that a designer, who we'd might
> use for styling the site, would need to learn it in addition to CSS.
> So, I was thinking in terms of "as simple as possible, but not simpler"
> (I'm no fan of KISS, because it tends to result in things that doesn't
> do the job).
>
> A solution that could let me write the HTML and the designer CSS, would
> be the right tool for the job right now. Again, if I was writing
> something more advanced, where the designer should control the HTML too
> would require a Separation of Concerns regime that would make my RDFa
> Template proposal the wrong tool for the job.

I don't necessarily disagree, but I apologise again that I've 'hidden' my main point behind a discussion about what the templating language should look like. The solution you have proposed is certainly workable, and my only criticism is that I don't believe that it requires anything more than is already available in XHTML/HTML with embedded RDFa.

>> As you're probably aware, the Fresnel format contains a set of RDF
>> that describes rules such as 'given an item of this type, add this
>> CSS class'. This works quite nicely with RDFa because any triples
>> that are queried from an RDFa document have a definite location. For
>> example, if you have:
>>
>>   <div typeof="foaf:Person">
>>     ...
>>   </div>
>>
>> then querying for all items of type 'foaf:Person' leads naturally to
>> the div that contains the RDFa, making it easy to set a CSS class on
>> it.
>
> Right, but we have some examples that different foaf:Person's should be
> treated very differently in our apps.

Hence the use of a SPARQL derivative, to get finer-grained access to the data in the page. But as I said... that's a digression. :)

>> So, the Fresnel example I just gave would be expressed in RDFa (and
>> jSPARQL), like this:
>>
>>   <div
>>    xmlns:fresnel="http://www.w3.org/2004/09/fresnel#"
>>    typeof="fresnel:Group"
>>    style="display: none;"
>>   >
>>     <div rev="fresnel:group">
>>       <div typeof="fresnel:Format">
>>         <div property="fresnel:instanceFormatDomain">
>>           select: [ "s", "item" ],
>>           where:
>>             [
>>               { pattern: [ "?s", "http://ebay.com/item", "?item" ],
>>                 setUserData: true }
>>             ]
>>         </div>
>>
>>         <span property="fresnel:resourceStyle"
>>          datatype="fresnel:styleClass">ebay-item</span>
>>       </div>
>>     </div>
>>   </div>
>
> I see! But I feel that this is a lot further from RDFa than my proposal.

Mmm... with respect, I'm *only* using RDFa. :)

Your proposal, on the other hand, has additional elements and attributes in your own namespace, has devised a use for @xml:id where currently one doesn't exist, has added the ability to support named graphs, and seeks to suspend normal processing of @about and other attributes under certain circumstances. ;)

But as I say, how the templating language looks is not the key thing; I'm just stressing for now that a templating language can be achieved using current RDFa.

> You'd have to understand SPARQL and jSPARQL much deeper to actually use
> it, than just use a bit of XML and there you go. Also, it has a lot
> more implementation infrastructure behind it. I was also thinking along
> those lines for an XSLT-like RDF transformation language, but I
> rejected it.
> If I required that much knowledge about the data, I'd use
> some kind of ontology class-OO class mapper and do the work in the
> application View.
>
> But again, that's the thing I would do in an application that had
> complex requirements for the Web interface. My current use is for the
> applications that only requires a very simple Web interface.
>

No problem -- it's the named graph v. using current RDFa that we're really talking about.

>> Note that this is in the same document as the data itself, and for
>> the reasons I gave in the first part of this email, I think that it
>> is correct that the formats and the data end up in the same named
>> graph.
>
> Sure! As stated, there is no conflict with the proposal.

Except I mean that there is no need for additional named graphs.

>> Anyway, the key point I'm driving at here is that there is no need to
>> keep the templating rules separate from the main document's graph,
>> since they are part and parcel of it. They are much like CSS rules,
>> in that they operate on the DOM, but they use semantic selectors,
>> rather than DOM selectors. All that is required is to use various
>> predicates as the trigger for what to do, rather than segmenting
>> things with named graphs.
>
> Hmmmm, I don't feel you quite demonstrated this...
>
> Importantly, the same predicates in different parts of the document
> could be used in very different ways. So, I'd have to at least use
> a "triple fingerprinting" to resolve such problems, I without having
> tried, I think that too would fail. For example, in some uses, we have
> a foaf:Person that in one case is an author and in one case is the
> audience. They are known as different to the page author, thus
> identifying them with different named graphs would be trivial, but
> their data structure is identical. And the implementation complexity
> would be much larger, I fear, and I'm out for something that's really
> simple to implement.

By all means implement away, but I think you should avoid adding things that would not be conformant with other RDFa parsers. The solution I am suggesting -- not the use of Fresnel, but the general algorithm that you create some 'well-known predicates' which your parser then finds after loading -- would create exactly the same triples in both your parser, and a parser that is unaware of your templating rules.

However, the solution that you've outlined in your document would actually produce different triples in your parser. For example:

  <rat:graph xml:id="query2" endpoint="http://dbpedia.org/sparql">
    <tr about="sub:resource">
      <td property="foaf:name" datatype="rdf:XMLLiteral"><rat:variable name="sub:name"/></td>
      <td property="dbo:produced">1973</td>
      <td property="dbp:firstFlight" datatype="rdf:XMLLiteral"><rat:variable name="sub:first"/></td>
    </tr>
  </rat:graph>

In this example, in your templating language, @about is not a literal subject, but a placeholder for a subject that will be substituted. This means that in your parser there would be no triple generated by this, but in another parser there would.

> Well, to sum up, my key point is that at this point, it is important to
> have several different approaches flourish. I can certainly see that
> yours have a very important use (though I would do more in the View and
> not inline (j)SPARQL in the page)...

But don't forget, having inline rules is merely the simplest example. Since the templates and queries are defined using RDF, any mechanism that can be used to import triples can be used to import template rules. As I said before, this might be something as simple as @rel="owl:imports".

>... but I still feel that it is not the
> right solution for us, and also too complex for many web developers
> just out to get a little data from the Semantic Web into his
> application.

Sure... no problem. :) As I said, I blurred the issue by discussing templating solutions, when in fact my problem is with the use of named graphs.
Here I'm afraid I cannot be so laissez-faire; using @xml:id and named graphs is something that has to be designed right for all of RDFa, not just one use case. (And as I've tried to show, in this particular use-case I don't believe it is even needed.)

> Certainly, one day, all lenses that will ever be needed are written,
> which will change the picture, but up to then, I think several
> directions should be left open.

But by using RDF to define 'lenses' you don't need to imagine that there will be a finite set of lenses. And in fact, the beauty of using RDF to define these kinds of rules is that we can even use reasoning to decide which lenses to deliver to you.

CONCLUSION

With such a lengthy email -- sorry about that! -- it might not be clear what exactly my conclusions are.

The key thing is that I'm all for having a templating language, but would urge that it is done using 'normal' RDFa, rather than adding new features.

I don't disagree that there *is* a discussion to be had about named graphs, but I think that should be had separately.

Regards,

Mark

--
Mark Birbeck, webBackplane

mark.birbeck@webBackplane.com

http://webBackplane.com/mark-birbeck

webBackplane is a trading name of Backplane Ltd. (company number 05972288, registered office: 2nd Floor, 69/85 Tabernacle Street, London, EC2A 4RR)

Received on Tuesday, 17 February 2009 09:55:32 UTC