- From: Martin Hepp <martin.hepp@unibw.de>
- Date: Tue, 24 Mar 2015 09:57:43 +0100
- To: Guha Guha <guha@google.com>, W3C Web Schemas Task Force <public-vocabs@w3.org>
- Cc: John Walker <john.walker@semaku.com>, Sandro Hawke <sandro@w3.org>, Ralph Swick <swick@w3.org>, Tim Berners-Lee <timbl@w3.org>
I would also like to support the proposal by looking at the resources involved: Most people assume that developing Web ontologies / shared schemas is relatively little work; after all, how much time does it take to write the definitions for a couple of types and properties? But the reality is that building Web ontologies requires

1. a very well-chosen conceptual model that represents "sweet spots" of data structures that
   - are non-trivial to reconstruct by a client from unstructured data,
   - are easy to grasp by human developers from different cultural and professional backgrounds from the label and short description alone,
   - are "markup friendly", i.e. as simple as possible in RDFa and Microdata, and
   - can be reliably populated from existing data structures (e.g. match typical distinctions / structures in back-end databases), and then
2. aligning that model well with the existing elements in other vocabularies, namely schema.org (e.g. avoiding the growth of redundant branches of functionality that already exist and avoiding name clashes), and
3. writing documentation and examples.

Just an estimate: An extension proposal for schema.org of ca. 10-15 types plus 15-25 properties takes me, roughly, a year to design. Maybe not full-time, but due to the many discussions and stakeholders, it is really some effort over that period of time.

Now this is only the initial proposal. On the side of the sponsors of schema.org and other consuming clients, you have to review #1 - #3. In particular, checking that an extension proposal is technically sound with regard to all subtleties of the schema.org ecosystem, and that it masters #1 well, is really time-consuming. And it is difficult for domain experts who do it for the first time to get #1 and #2 right.

Extensions beyond ca. 10-20 new types require a really substantial amount of resources from both the external proposers and the sponsors of schema.org.
This is why I think we urgently need such a mechanism to tap the potential of the many, many interesting schemas and standards out there for publishing more structured data on the Web without the need to channel those through the social and technical process of getting into schema.org core.

Martin
--------------------------------------------------------
martin hepp
e-business & web science research group
universitaet der bundeswehr muenchen

e-mail:  martin.hepp@unibw.de
phone:   +49-(0)89-6004-4217
fax:     +49-(0)89-6004-4620
www:     http://www.unibw.de/ebusiness/ (group)
         http://www.heppnetz.de/ (personal)
skype:   mfhepp
twitter: mfhepp

On 20 Mar 2015, at 16:40, Guha <guha@google.com> wrote:

> This is a very important point you bring up, one that goes to the heart of a lot of schema.org decisions.
>
> I do agree that it is much 'cleaner' for a term to have a single namespace. The cost of this is that the webmaster needs to keep track of namespaces. We estimate on the order of 100s or 1000s of vocabulary creators. We already have millions of webmasters using schema.org. Most applications will use only a single extension, which means that under the proposed scheme, they don't have to worry about namespaces.
>
> Mixing and matching is of course always open (and welcome). More technical webmasters will do that. We just don't want it to be a requirement to start participating ...
>
> Thanks for the comments.
>
> guha
>
> On Fri, Mar 20, 2015 at 1:21 AM, John Walker <john.walker@semaku.com> wrote:
> Hi Guha,
>
> I have a few questions/thoughts around the proposal that every item in the core would also be in every extension.
>
> Would this apply only to the reviewed extensions, or also to external extensions?
>
> I can understand that only using terms from a single prefix lowers the bar for getting started, but I don't think it's too tricky to get your head round using multiple prefixes in any of the syntaxes (although some are easier than others).
> IMHO it would be simpler and more understandable to have a single identifier (URL/URI/IRI) for each term/item rather than multiple aliases.
> (This, however, would not preclude two different extensions from having different terms/items for a very similar concept, each with its own identifier.)
>
> Also, I expect many practical use cases where users need to mix'n'match terms from different extensions.
> For example, the GS1 extension would have many terms for general use and hopefully enough to cover some specific domains like food and beverage, but may not fully cover other domains like consumer electronics.
> Admittedly this will not be needed in all cases, but I think there are enough to warrant giving it some deep thought (i.e. it is far from a corner case).
>
> Regards,
>
> John Walker
> Principal Consultant & co-founder
> Semaku B.V.
> SFJ 4.009, Torenallee 20, 5617 BC Eindhoven
> Mobile: +31 6 475 22030
> Email: john.walker@semaku.com
> Skype: jaw111
>
> KvK: 58031405
> BTW: NL852842156B01
> IBAN: NL94 INGB 0008 3219 95
>
>> On March 20, 2015 at 12:36 AM Guha <guha@google.com> wrote:
>>
>> The various discussions around this extension proposal seem to have reached quiescence. I am hoping this is more because the questions were answered than because of boredom.
>>
>> We would like to proceed with the implementation of this proposal. If there are strong objections, now would be the right time to raise them.
>>
>> guha
>>
>> On Fri, Feb 13, 2015 at 1:34 PM, Guha <guha@google.com> wrote:
>>
>> Schema.org extension mechanism
>>
>> Motivation
>>
>> As schema.org adoption has grown, a number of groups with more specialized vocabularies have expressed interest in extending schema.org with their terms. The most prominent example of this is GS1 with product vocabularies. Other examples include real estate, medical and bibliographic information. Even in something as common as human names, there are groups interested in creating the vocabulary for representing all the intricacies of names.
>>
>> Outline of solution
>>
>> There are two kinds of extensions: reviewed extensions and external extensions. Both kinds of extensions typically add subclasses and properties to the core. Properties may be added to existing and/or new classes. More generally, they are an overlay on top of the core, and so they may add domains/ranges, superclasses, etc. as well. Extensions have to be consistent with the core schema.org. Every item in the core (i.e., www.schema.org) is also in every extension. Extensions might overlap with each other in concepts (e.g., two extensions defining terms for financial institutions, one calling it FinancialBank and the other calling it FinancialInstitution), but we should not have the same term being reused to mean something completely different (e.g., we should not have two extensions, one using Bank to mean river bank and the other using Bank to mean financial institution).
>>
>> Reviewed Extensions
>>
>> Each reviewed extension (say, e1) gets its own chunk of schema.org namespace: e1.schema.org. The items in that extension are created and maintained by the creators of that extension. Reviewed extensions are very different from proposals. A proposal, if accepted (possibly with modifications), could either go into the core or become a reviewed extension.
>>
>> A reviewed extension is something that has been looked at and discussed by the community, albeit not as much as something in the core. We also expect a reviewed extension to have strong community support, preferably in the form of a few deployments.
>>
>> External Extensions
>>
>> Sometimes there might be a need for a third party (such as an app developer) to create extensions specific to their application. For example, Pinterest might want to extend the schema.org concept of ‘Sharing’ with ‘Pinning’. In such a case, they can create schema.pinterest.com and put up their extensions, specifying how it links with core schema.org. We will refer to these as external extensions.
>>
>> How it works for webmasters
>>
>> All of Schema.org core and all of the reviewed extensions will be available from the schema.org website. Each extension will be linked to from each of the touch points it has with the core. So, if an extension (say, having to do with Legal stuff) creates legal.schema.org/LegalPerson which is a subclass of schema.org/Person, the Person will link to LegalPerson. Typically, a webpage / email will use only a single extension (e.g., legal), in which case, instead of ‘schema.org’ they say ‘legal.schema.org’ and use all of the vocabulary in legal.schema.org and schema.org.
>>
>> As appropriate, the main schema.org site will also link to relevant external extensions. With external extensions, the use of multiple namespaces is unavoidable.
>>
>> What does someone creating an extension need to do
>>
>> We would like extension creators to not have to worry about running a website for their extension. Once the extension is approved, they simply upload a file with their extension into a certain directory on GitHub. Changes are made through the same mechanism.
>>
>> Since the source code for schema.org is publicly available, we encourage creators of external extensions to use the same application.
>>
>> Examples
>>
>> Archives example in RDFa
>>
>> This example uses a type that makes sense for archival and bibliographic applications but which is not currently in the schema.org core: Microform, defined as "Any form, either film or paper, containing microreproductions of documents for transmission, storage, reading, and printing. (Microfilm, microfiche, microcards, etc.)"
>>
>> The extension type is taken from http://bibliograph.net/Microform (which on this proposed model would move to bib.schema.org). bibliograph.net is a version of the open-source schema.org codebase that overlays bibliographic extras onto the core schema.org types. The example is adapted from http://schema.org/workExample.
>>
>> <div vocab="http://bib.schema.org/">
>>   <p typeof="Book" resource="http://www.freebase.com/m/0h35m">
>>     <em property="name">The Fellowship of the Rings</em> was written by
>>     <span property="author">J.R.R Tolkien</span> and was originally published
>>     in the <span property="publisher" typeof="Organization">
>>       <span property="location">United Kingdom</span> by
>>       <span property="name">George Allen &amp; Unwin</span>
>>     </span> in <time property="datePublished">1954</time>.
>>     The book has been republished many times, including editions by
>>     <span property="workExample" typeof="Book">
>>       <span property="publisher" typeof="Organization">
>>         <span property="name">HarperCollins</span>
>>       </span> in <time property="datePublished">1974</time>
>>       (ISBN: <span property="isbn">0007149212</span>)
>>     </span> and by
>>     <span property="workExample" typeof="Book Microform">
>>       <span property="publisher" typeof="Organization">
>>         <span property="name">Microfiche Press</span>
>>       </span> in <time property="datePublished">2016</time>
>>       (ISBN: <span property="isbn">12341234</span>).
>>     </span>
>>   </p>
>> </div>
>>
>> Alternative RDFa:
>>
>> The example above puts all data into the extension namespace. Although this can be mapped back into normal schema.org, it puts more work onto consumers. Here is how it would look using multiple vocabularies:
>>
>> <div vocab="http://schema.org/" prefix="bib: http://bib.schema.org/">
>>   <p typeof="Book" resource="http://www.freebase.com/m/0h35m">
>>     <em property="name">The Fellowship of the Rings</em> was written by
>>     <span property="author">J.R.R Tolkien</span> and was originally published
>>     in the <span property="publisher" typeof="Organization">
>>       <span property="location">United Kingdom</span> by
>>       <span property="name">George Allen &amp; Unwin</span>
>>     </span> in <time property="datePublished">1954</time>.
>>     The book has been republished many times, including editions by
>>     <span property="workExample" typeof="Book">
>>       <span property="publisher" typeof="Organization">
>>         <span property="name">HarperCollins</span>
>>       </span> in <time property="datePublished">1974</time>
>>       (ISBN: <span property="isbn">0007149212</span>)
>>     </span> and by
>>     <span property="workExample" typeof="Book bib:Microform">
>>       <span property="publisher" typeof="Organization">
>>         <span property="name">Microfiche Press</span>
>>       </span> in <time property="datePublished">2016</time>
>>       (ISBN: <span property="isbn">12341234</span>).
>>     </span>
>>   </p>
>> </div>
>>
>> Here is that last approach written in JSON-LD (it works today, but would be even more concise if the schema.org JSON-LD context file was updated to declare the 'bib' extension):
>>
>> <script type="application/ld+json">
>> {
>>   "@context": [ "http://schema.org/",
>>                 { "bib": "http://bib.schema.org/" } ],
>>   "@id": "http://www.freebase.com/m/0h35m",
>>   "@type": "Book",
>>   "name": "The Fellowship of the Rings",
>>   "author": "J.R.R Tolkien",
>>   "publisher": {
>>     "@type": "Organization",
>>     "location": "United Kingdom",
>>     "name": "George Allen & Unwin"
>>   },
>>   "datePublished": "1954",
>>   "workExample": [
>>     {
>>       "@type": "Book",
>>       "name": "Harper Collins",
>>       "datePublished": "1974",
>>       "isbn": "0007149212"
>>     },
>>     {
>>       "@type": ["Book", "bib:Microform"],
>>       "name": "Microfiche Press",
>>       "datePublished": "2016",
>>       "isbn": "12341234"
>>     }
>>   ]
>> }
>> </script>
>>
>> GS1 Example
>>
>> <script type="application/ld+json">
>> {
>>   "@context": "http://schema.org/",
>>   "@vocab": "http://gs1.schema.org/",
>>   "@id": "http://id.manufacturer.com/gtin/05011476100885",
>>   "gtin13": "5011476100885",
>>   "@type": "TradeItem",
>>   "tradeItemDescription": "Deliciously crunchy Os, packed with 4 whole grains. Say Yes to Cheerios",
>>   "healthClaimDescription": "8 Vitamins & Iron, Source of Calcium & High in Fibre",
>>   "hasAllergenRelatedInformation": {
>>     "@type": "gs1:AllergenRelatedInformation",
>>     "allergenStatement": "May contain nut traces"
>>   },
>>   "hasIngredients": {
>>     "@type": "gs1:FoodAndBeverageIngredient",
>>     "hasIngredientDetail": [
>>       {
>>         "@type": "Ingredient",
>>         "ingredientseq": "1",
>>         "ingredientname": "Cereal Grains",
>>         "ingredientpercentage": "77.5"
>>       },
>>       {
>>         "@type": "Ingredient",
>>         "ingredientseq": "2",
>>         "ingredientname": "Whole Grain OATS",
>>         "ingredientpercentage": "38.0"
>>       }
>>     ]
>>   },
>>   "nutrientBasisQuantity": {
>>     "@type": "Measurement",
>>     "value": "100",
>>     "unit": "GRM"
>>   },
>>   "energyPerNutrientBasis": [
>>     {
>>       "@type": "Measurement",
>>       "value": "1615",
>>       "unit": "KJO"
>>     },
>>     {
>>       "@type": "Measurement",
>>       "value": "382",
>>       "unit": "E14"
>>     }
>>   ],
>>   "proteinPerNutrientBasis": {
>>     "@type": "Measurement",
>>     "value": "8.6",
>>     "unit": "GRM"
>>   }
>> }
>> </script>
>>
>> This example shows a possible encoding of the GS1 schemas overlaid onto schema.org. It uses JSON-LD syntax, which would support several variations on this approach. It is based on examples from GS1's proposal circulated to the schema.org community recently (https://lists.w3.org/Archives/Public/public-vocabs/2015Jan/0069.html). Instead of writing "@context": "http://schema.org/", "@vocab": "http://gs1.schema.org/", it would be possible to simply write "@context": "http://gs1.schema.org/".
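Editorial illustration (not part of the original thread): the quoted proposal notes that data published entirely in an extension namespace "can be mapped back into normal schema.org" but that this "puts more work onto consumers". The following is a minimal Python sketch of what that consumer-side mapping might look like. The function name `canonical_iri` and the tiny `CORE_TERMS` set are hypothetical and chosen only for illustration; a real consumer would presumably load the full list of core terms from schema.org, and the proposal itself does not specify any such algorithm.

```python
from urllib.parse import urlsplit

CORE_HOST = "schema.org"

# Hypothetical, deliberately tiny subset of core term names (assumption:
# a real consumer would obtain the complete list from schema.org).
CORE_TERMS = {
    "Book", "Organization", "name", "author", "publisher",
    "location", "datePublished", "isbn", "workExample",
}


def canonical_iri(iri, core_terms=CORE_TERMS):
    """Map an IRI in a *.schema.org extension namespace back to the core
    namespace when its local name is a core term; otherwise return the
    IRI unchanged."""
    parts = urlsplit(iri)
    host = parts.hostname or ""
    term = parts.path.lstrip("/")
    # Treat any subdomain of schema.org (e.g. bib.schema.org) as an
    # extension namespace under the proposed scheme.
    in_extension = host.endswith("." + CORE_HOST)
    if in_extension and term in core_terms:
        return "http://" + CORE_HOST + "/" + term
    return iri


# Core term used via the extension namespace: mapped back to the core.
print(canonical_iri("http://bib.schema.org/Book"))
# Extension-only term: left in the extension namespace.
print(canonical_iri("http://bib.schema.org/Microform"))
```

Under these assumptions, a page that uses only `http://bib.schema.org/` as its vocabulary yields the same core IRIs as one that mixes `http://schema.org/` with a `bib:` prefix, which is the trade-off John Walker and Guha discuss above: one extra normalization step for consumers in exchange for a single namespace for webmasters.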
Received on Tuesday, 24 March 2015 08:58:15 UTC