- From: Sandro Hawke <sandro@w3.org>
- Date: Sun, 09 Feb 2014 21:30:31 -0500
- To: RDF WG <public-rdf-wg@w3.org>
I've been working on what to put at the rdf: and rdfs: namespaces.

This is a long email, with three sections:

  1. goals for the project
  2. what my software will look for in RDF data
  3. specific issues with the two RDF namespaces

Perhaps the most interesting/novel bit is about HTML and language tags. Plus my idea for handling rdf:_100. :-)

This is progress on ACTION-98, my last remaining W3C action item (!).

=== PART 1: Goals ===

1. If someone puts the IRI of a term in the rdf: or rdfs: namespaces into their browser, they'll get some nice documentation on that term. The URL field will continue to show that term (it won't redirect).

2. That documentation will link to other terms. When it does so, clicking will repeat the experience as above: the IRI of the term will be in the URL field of the browser, and the user will see decent documentation.

3. The documentation will be available in multiple languages. We don't need this on day one, but we currently have the dcat schema in English, Spanish, Arabic, Greek, French, and Japanese (thanks to Phil Archer's pushing on that). I'm still learning how to do a multilingual webapp: there's an early version at http://www.w3.org/2013/vocabspec/examples/dcat.html -- in that version, you use the gear in the upper right to change the language. I'm in the process of changing it to use the browser's language setting as a starting point, then allow a simpler selection control.

4. The software will be available so other people can do this with their namespace documents easily enough.

5. The documentation will be entirely driven by the RDF triples that one gets by dereferencing the namespace documents while asking for text/turtle (via the Accept header). In the dcat example above, view source shows the .html file is just a shell; the content is generated at browse-time from the turtle at the dcat namespace.

6. In the real deployment, there will also be content-negotiated static versions at the namespace URL so that search engines and non-javascript browsers can see the content as well. Folks hosting namespace documents, if they want this, will have to run a node.js program to re-generate the static files whenever the turtle is changed.

7. Over time, I want to evolve the code to include social features like crowdsourced translations, stars (aka "like", "+1", "endorsement", "bookmarks"), and links to code and public data sources that use the term. Obviously that will have to be done carefully to avoid detracting from the official documentation. (This part is a research project.)

I think that's it.

=== PART 2: What my software will look for in RDF data ===

One challenge is that for expressing the documentation in RDF, I don't know of any consensus around a vocabulary or how to use it. Here's my best guess, but I'm making some of this up. Feel free to correct me (but soon, please).

The basic predicates:

* rdfs:label - a name, usually one or two words; the English version will usually be the same as the end part of the IRI.

* rdfs:comment - a descriptive phrase, usually 5-10 words; might be in rdf:HTML, especially if it needs to link to other terms in the vocabulary.

* dc:description - a longer, definitional description, usually 1-5 paragraphs (using rdf:HTML for formatting). For rdf and rdfs, I plan to copy the HTML out of RDF Schema 1.1. For example, for rdf:type it'll be the stuff at https://dvcs.w3.org/hg/rdf/raw-file/default/rdf-schema/index.html#ch_type

* vann:usageNote - arbitrary length, more practical and less definitional than dc:description. (I don't plan to use this for rdf: and rdfs:, but it's used by dcat and others.)

* dc:title - for the title of the namespace document; ignored on terms.

It'll also show the rdfs:domain, rdfs:range, rdf:type, rdfs:subClassOf, rdfs:subPropertyOf, and any other bits I can figure out how to include without making things too complicated.
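As a rough illustration of the predicate list above, here is a minimal sketch of gathering the documentation values for one term. It is written in Python for brevity and is not the actual node.js code; the triple representation (tuples of prefixed names) and the names DOC_PREDICATES and term_docs are assumptions made up for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical representation: (subject, predicate, object) as plain strings.
Triple = Tuple[str, str, str]

# The per-term documentation predicates listed above.
# dc:title is left out on purpose: it is only used for the namespace document.
DOC_PREDICATES = (
    "rdfs:label",
    "rdfs:comment",
    "dc:description",
    "vann:usageNote",
)


def term_docs(triples: Iterable[Triple], term: str) -> Dict[str, List[str]]:
    """Group the documentation values of `term` by predicate."""
    docs: Dict[str, List[str]] = defaultdict(list)
    for subject, predicate, obj in triples:
        if subject == term and predicate in DOC_PREDICATES:
            docs[predicate].append(obj)
    return dict(docs)


if __name__ == "__main__":
    sample = [
        ("rdf:type", "rdfs:label", "type"),
        ("rdf:type", "rdfs:comment", "The subject is an instance of a class."),
        ("rdf:type", "dc:title", "ignored on terms"),
        ("rdfs:label", "rdfs:label", "label"),
    ]
    print(term_docs(sample, "rdf:type"))
```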
It'll also show rdfs:isDefinedBy, linking to the right section of the spec.

On language/HTML handling, I think this is how to do it:

    <some term> dc:description
        'description of that term in plaintext English',
        'description of that term in plaintext English'@en,
        '<div lang="en">description of that term in HTML English</div>'^^rdf:HTML,
        'description of that term in plaintext French'@fr,
        '<div lang="fr">description of that term in HTML French</div>'^^rdf:HTML,
        ... etc

The xs:string is there for non-multilingual apps, and to use as the fallback (with a warning) if no matching languages are found.

This approach implies that predicates with natural language expressions as their range MUST be conceptually single-valued. You can't do this:

    { <s> rdfs:comment "some comment"@en, "some other comment"@en. }

I expect I'll have my software display a warning if this kind of thing (two values with the same language) occurs in the data. See [1] for some more discussion of this.

I guess I'll treat an HTML literal without a lang attr on the first element like the xs:string literal -- a fallback for when no available values lang-match the user's preferences.

I plan to ignore triples giving domain, range, or subClassOf as rdfs:Resource, since they're meaningless.

=== PART 3: Specific issues with the two RDF namespaces ===

* Should we include any dct:creator or dct:contributor triples? It's hard to make that helpful and fair given all the people who've been involved with these namespaces over the years.

* Should we leave out the meaningless triples giving domain, range, or subClassOf as rdfs:Resource? There's some pretty odd stuff there now.

* What should we do about rdf:_1, etc? I'd think having the first few in the namespace document would make sense, maybe rdf:_1 through rdf:_20. I *could* put in special javascript for arbitrary ones, but that seems kind of goofy.

* Can we say *anything* about how investing in implementing reification systems might not be your best bet? Pretty, pretty please? Or do we have to let that wait for the commenting mechanism?

* What's the title of the rdf: namespace document? I propose "The Core RDF Vocabulary".

* What formats do we serve the schemas in? They've been just RDF/XML so far. Left to myself, I'd just do Turtle. I'm okay with including any other format for which there's a serializer available for node.js, so I can generate them out of the same system. If someone wants json-ld, could they please write a @context that makes the schema look nice in json?

So... thoughts?

    -- Sandro

[1] http://lists.w3.org/Archives/Public/public-rdf-wg/2013Dec/0151.html
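For the language/HTML handling described in Part 2, here is a minimal sketch of the selection rule: pick the value matching the user's preferred languages, fall back (with a warning) to an untagged value, and warn when two values share a language. It is in Python rather than the actual node.js code, and the names Literal, effective_lang, and choose_literal are hypothetical. It also assumes exact matching of language tags and reads the lang attribute of the first element with a simple regular expression, both of which are simplifications.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Literal:
    value: str
    lang: Optional[str] = None  # language tag of a plain literal, if any
    is_html: bool = False       # True for an rdf:HTML literal


# Matches a lang="..." attribute inside the first tag only (simplified).
_LANG_ATTR = re.compile(r'^\s*<[^>]*?\blang="([^"]+)"')


def effective_lang(lit: Literal) -> Optional[str]:
    """Language of a literal; for HTML, taken from the first element."""
    if lit.is_html:
        match = _LANG_ATTR.match(lit.value)
        return match.group(1).lower() if match else None
    return lit.lang.lower() if lit.lang else None


def choose_literal(
    values: Sequence[Literal],
    preferred: Sequence[str],
    want_html: bool = True,
) -> Tuple[Optional[Literal], List[str]]:
    """Return (chosen literal or None, list of warnings)."""
    warnings: List[str] = []

    # The predicate should be single-valued per language and per form.
    seen = set()
    for v in values:
        key = (v.is_html, effective_lang(v))
        if key in seen:
            warnings.append(f"more than one value for language {key[1]!r}")
        seen.add(key)

    # Try the user's languages in order of preference.
    for lang in preferred:
        matches = [v for v in values if effective_lang(v) == lang.lower()]
        if matches:
            # Put the requested form (HTML or plain text) first.
            matches.sort(key=lambda v: v.is_html != want_html)
            return matches[0], warnings

    # Fallback: an untagged string, or HTML with no lang on its first element.
    untagged = [v for v in values if effective_lang(v) is None]
    if untagged:
        warnings.append("no language match; using untagged fallback")
        return untagged[0], warnings
    return None, warnings
```

A caller would pass the dc:description (or rdfs:comment) values for a term plus the browser's language list, for example choose_literal(values, ["fr", "en"]), and show any returned warnings alongside the chosen text.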
Received on Monday, 10 February 2014 02:30:41 UTC