
Re: Indicating main entity / primaryTopic - proposal to use 'schema.org/about'

From: Karen Coyle <kcoyle@kcoyle.net>
Date: Tue, 20 May 2014 13:44:34 -0700
Message-ID: <537BBEB2.2040509@kcoyle.net>
To: public-vocabs@w3.org


On 5/20/14, 9:16 AM, Dan Brickley wrote:


> 2.
> If we want a more SKOS-like, bibliographic and nuanced notion of
> 'subject', I suggest we adopt something like Dublin Core's 'subject'
> to do that work.
>
> (DC has "The topic of the resource."/ "Typically, the subject will be
> represented using keywords, key phrases, or classification codes.
> Recommended best practice is to use a controlled vocabulary.", from
> http://purl.org/dc/terms/ )

Dan,

I would advise against adding subject. We have two different properties 
that cover the waterfront:

about = the subject of the CreativeWork (or Thing, as proposed)
keywords = terms assigned to someThing.

Keywords do not necessarily denote "aboutness" - cf. the heavy use of 
the keywords "read" and "to read" in LibraryThing. (Not to mention the 
wonderful ambiguity of "read" in English.) Cf. also the Flickr "Second 
try" tag.

Therefore, schema.org/about denotes, well, aboutness, the subject of, 
and schema.org/keywords means simply "terms assigned to this Thing" 
without defining the nature of the relationship between the terms and 
the Thing other than "assigned to".
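
A minimal sketch of that distinction, as a Python dict serialized to JSON-LD. The book name and the tag values are illustrative assumptions, not taken from any real record:

```python
import json

# Hypothetical CreativeWork description; the name and values are made up
# for illustration only.
work = {
    "@context": "http://schema.org",
    "@type": "Book",
    "name": "An Example Book",
    # about: the subject of the work ("aboutness").
    "about": {
        "@type": "Country",
        "name": "Sweden",
        "sameAs": "http://en.wikipedia.org/wiki/Sweden",
    },
    # keywords: terms assigned to the work. Nothing is claimed about how
    # they relate to it; "to read" says nothing about the subject.
    "keywords": "to read, second try",
}

print(json.dumps(work, indent=2))
```

A consumer that wants the subject would look only at "about"; anything under "keywords" is just an assigned term.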

I agree with Simon that "about" is itself problematic, in a very human 
way, but it is also a great source of knowledge. I think that people 
run inference over it at their own risk, with possibly interesting returns.

kc


>
> The distinction:
>
> if we want to say "This document is about the entity Sweden" (i.e. the
> thing that is sameAs http://en.wikipedia.org/wiki/Sweden,
> http://www.freebase.com/m/0d0vqn), we would use
> http://schema.org/about   ... i.e. this tells us the main thing that
> the page is about.
>
> but
>
> If we want to say, "This document's topic is 'environmental impact of
> the decline of tin mining in Sweden in the 20th century'", we'd be
> going beyond "about" and would want a more bibliographic subject
> description, e.g. using DDC or UDC subject classification codes, SKOS
> etc.
>
> (fictional example, I know nothing about tin mining in Sweden)
>
> My proposal then is that we break out these two use cases, and target
> the 'about' more explicitly on the 'main entity' use case.
>
> 3. Tweak http://schema.org/mentions
>
> We should note that http://schema.org/mentions is a very similar
> notion to http://schema.org/about except that it allows multiple
> different entities to be referenced.
>
> "Indicates that the CreativeWork contains a reference to, but is not
> necessarily about a concept."
>
> I suggest rewording this in terms of entities/things, since we don't
> use 'concept' elsewhere:
>
> "Indicates that the CreativeWork contains a reference to, but is not
> necessarily about some particular thing."
>
> 4. http://schema.org/mainContentOfPage
>
> We already have this strange-looking property. It addresses a
> different use case:
>
> it relates a WebPage to a part of that WebPage,
> "Indicates if this web page element is the main subject of the page."
>
> The wording is awkward. It should be something like "Indicates the
> main element within some Web page." since the expected type is
> WebPageElement.
>
> I'm not convinced that the various types we have under WebPageElement
> ("A web page element, like a table or an image") really work, but the
> important point here is that they address a different scenario. A
> WebPageElement is a piece of markup, like SiteNavigationElement,
> Table, WPAdBlock, WPFooter, WPHeader, WPSideBar. This is a different
> idea to the problem of finding the main *entity* that all this markup
> is describing.
>
> HTML already has a <main> element, see
> https://developer.mozilla.org/en-US/docs/Web/HTML/Element/main
>
> "The HTML <main> element represents the main content of the <body> of
> a document or application. The main content area consists of content
> that is directly related to, or expands upon the central topic of a
> document or the central functionality of an application. This content
> should be unique to the document, excluding any content that is
> repeated across a set of documents such as sidebars, navigation links,
> copyright information, site logos, and search forms (unless, of
> course, the document's main function is as a search form)."
>
> I believe most of the use cases for mainContentOfPage are better
> addressed by <main>.
>
> However <main> does not help us pick out a single highlighted entity:
> the main section of a Web page could still contain microdata/rdfa or
> json-ld mentioning lots of different entities.
>
> It is useful sometimes to know that structured data markup comes from
> footers or boilerplate rather than the <main> section of a page, and
> it is probably worth including some examples of this on the schema.org
> site.
>
>
> 5. Avoiding ratholes
>
> If we can please discuss this without slipping into discussion of
> http://www.w3.org/2001/tag/group/track/issues/14 I'd be happy. There
> are places in schema.org usage where we tolerate a URL for a WebPage
> being used in place of a URL that is more explicitly for the
> real-world entity itself. For example in http://schema.org/Person we
> write "<a href="http://www.xyz.edu/students/alicejones.html"
> itemprop="colleague">Alice Jones</a>".
>
> Clarifying the use of 'about' as above could help such pages clarify
> which real world entity they are 'about'. This won't solve every issue
> around entity disambiguation, but it will improve the basic support we
> have within schema.org for stating such distinctions when we want to.
>
> (Sorry this was such a long mail...)
>
> Finally, let's also try not to get stuck on syntax issues at this
> stage. We'll have to find the best patterns in Microdata/RDFa and
> JSON-LD that we can for this, and it may sometimes be tricky. Here's
> an attempt at amending the MusicEvent example by adding a WebPage and
> 'about' - https://gist.github.com/anonymous/cf7e24f6378b176aa010 . We
> might want to discuss a reverse property that could be expressed on
> the entity rather than the page, for example.
>
> cheers,
>
> Dan
>
>
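
The "main entity" reading of about, as opposed to mentions, could be sketched roughly like this in Python. The page URL, the entity names, and the helper main_entity are all hypothetical, and this is not the content of the gist linked above:

```python
# Hypothetical WebPage description, loosely in the spirit of the proposal.
page = {
    "@context": "http://schema.org",
    "@type": "WebPage",
    "url": "http://example.org/events/1",
    # about: the one main entity the page describes.
    "about": {"@type": "MusicEvent", "name": "Example Concert"},
    # mentions: other entities the page refers to.
    "mentions": [
        {"@type": "Place", "name": "Example Hall"},
        {"@type": "Person", "name": "Example Performer"},
    ],
}


def main_entity(doc):
    """Return the entity the page is about, or None if none is stated.

    Only 'about' is consulted; entities under 'mentions' are referenced
    by the page but are not treated as its main subject.
    """
    return doc.get("about")


print(main_entity(page)["name"])  # Example Concert
```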

-- 
Karen Coyle
kcoyle@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet
Received on Tuesday, 20 May 2014 20:45:05 UTC
