Anyone can say anything about anything

I really like Chris Webber's inquiry about using multiple class types in 
AS here:

http://socialwg.indiewebcamp.com/irc/social/2015-08-13#t1439480142472

I think James Snell addressed it eloquently and got Chris going.

I would like to elaborate here in case Chris or others would like to 
learn more. Please pardon me if you are all familiar with the 
primer-like information. What follows is some background and real-world 
examples which may help explain why AS is essentially doing what it is 
doing.

It starts with the idea/fact that "anyone can say anything about 
anything" on the Web, and how it complements the open-world assumption.

Things (people, places, ... real or imaginary) can be instances of 
multiple classes. That is, a resource can be described by using multiple 
independently developed vocabularies across the Web.

The following example (in RDF Turtle, skipping the prefixes) states that 
http://example.org/foo is an ActivityStreams note as well as an 
OpenAnnotation annotation.

<http://example.org/foo>
   a as:Note , oa:Annotation .

In JSON-LD (skipping the context):

{
   "@id": "http://example.org/foo",
   "@type": ["Note", "oa:Annotation"]
}
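
As a rough sketch of what this means for a consumer (Python, standard 
library only; the helper name types_of is made up for illustration, and 
no JSON-LD context processing is done here), the "@type" value can be 
read as a list and the consumer simply picks out the classes it knows:

```python
import json

def types_of(node):
    """Return the node's @type values as a list.

    JSON-LD allows @type to be a single string or an array of strings.
    """
    value = node.get("@type", [])
    return [value] if isinstance(value, str) else list(value)

doc = json.loads("""
{
   "@id": "http://example.org/foo",
   "@type": ["Note", "oa:Annotation"]
}
""")

types = types_of(doc)
print(types)

# A consumer that only understands OA can still recognise the resource.
print("oa:Annotation" in types)
```

A real consumer would expand the terms against the context first, so 
that "Note" and "oa:Annotation" become full IRIs before comparing.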

An example of this is here in the threaded interactions:

http://csarven.ca/webmention#interactions

You can see other examples there where you can mix vocabularies to 
describe different characteristics of a resource. This is because 
presumably no single vocabulary could define all human knowledge.

Similarly, we can of course use multiple classes from the same 
vocabulary, for instance a class together with its superclass (e.g., 
person is a subclass of agent):

<http://example.org/person>
   a foaf:Person , foaf:Agent .

or from different vocabularies:

<http://example.org/area>
   a as:Place , gr:Location .

as:Place and gr:Location may or may not have any relation between them. 
They could both conceptually define an "area", or they may not. One of 
them may inherit its semantics from the other, or not. To find out, both 
humans and machines simply dereference as:Place or gr:Location to 
discover its definition. (This sort of behaviour allows us to traverse 
the Web of data [including vocabularies] using the exact same 
mechanism; "follow-your-nose" type of exploration.)

So, there may be a mapping between them (using any vocabulary), e.g.:

as:Place skos:exactMatch gr:Location .

"skos:exactMatch is used to link two concepts, indicating a high degree 
of confidence that the concepts can be used interchangeably across a 
wide range of information retrieval applications."
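
To make that a bit more concrete, here is a minimal sketch of how a 
consumer might use such a mapping when looking for data (Python, 
standard library only). Triples are modelled as plain tuples with 
prefixed names kept as strings; http://example.org/park and the function 
names are made up for illustration, and only direct matches are 
followed, not chains of them. Treating the mapping as "either class will 
do" is the consumer's own decision:

```python
# Triples as (subject, predicate, object) tuples.
triples = {
    ("http://example.org/area", "rdf:type", "gr:Location"),
    ("http://example.org/park", "rdf:type", "as:Place"),  # hypothetical
    ("as:Place", "skos:exactMatch", "gr:Location"),
}

def equivalents(cls, triples):
    """Classes linked to cls by skos:exactMatch in either direction, plus cls."""
    found = {cls}
    for s, p, o in triples:
        if p == "skos:exactMatch":
            if s == cls:
                found.add(o)
            elif o == cls:
                found.add(s)
    return found

def instances_of(cls, triples):
    """Subjects typed with cls or with any class mapped to it."""
    wanted = equivalents(cls, triples)
    return sorted(s for s, p, o in triples if p == "rdf:type" and o in wanted)

# A consumer that only knows as:Place also finds the gr:Location resource.
print(instances_of("as:Place", triples))
```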

Another example:

<http://worldbank.270a.info/classification/country/CA>
   skos:exactMatch <http://ecb.270a.info/code/1.0/CL_AREA_EE/CA> .

The concept for Canada from the World Bank is stated to be pretty much 
interchangeable with the concept for Canada from the European Central 
Bank. Of course it is permitted to say that one of them is a subclass of 
the other, but I won't get into data and concept comparability here.

(I don't want to side-track here so I'll drop an example from the wild 
showcasing what something like that contributes towards: 
http://stats.270a.info/analysis/worldbank:SP.DYN.IMRT.IN/transparency:CPI2009/year:2009 
is a human and machine-processable federated statistical analysis. Now 
look at http://lod-cloud.net/ to quench your thirst. BTW, note the 
giant chunk of FOAF profiles from StatusNet ;))

Whether mixing and matching vocabularies is needed or not is entirely up 
to the publisher. The publisher decides how to best offer or make their 
data useful. Using our as:Place and gr:Location from earlier, this plays 
an important part in data discovery, e.g., a consumer may only know, or 
only want to get hold of, as:Place but not gr:Location, or vice versa. 
Or the publisher may be using the AS vocabulary to describe some aspects 
of http://example.org/area and OA to describe other aspects. AS and OA 
may have some overlap, or not. That's not really a concern.

As the publisher can't conceivably know every possible way in which 
their data may be used (and we know that data ends up being consumed in 
new and creative ways), they try to strike a balance, since they know 
their data best: who or what type of things are likely to consume this? 
What are my costs for doing that? And so on. This plays along with the 
open-world assumption because missing is not wrong. Describe things as 
you see fit; the "pay-as-you-go" concept works. Build and extend the 
graph of things on the Web as you see fit.

This fundamentally allows people to remix "social" data with data from 
other domains, e.g., geolocation, demographics, life sciences, research 
data, linguistics, finances. There is nothing hypothetical here. It is 
already happening. We are merely trying to make the relations more 
explicit so that both humans and machines can unambiguously discover and 
put them to use.

Having said all of that, it is good practice to follow the definitions 
in the ontologies/vocabularies (e.g., pay attention to what the expected 
domains and ranges are) in order to play along with how other people are 
publishing and consuming data. For example:

<http://example.org/bar>
   a foaf:Person , gr:Location .

doesn't make much sense if you want machines to figure things out on 
their own. Not to mention it is pretty awkward to come up with a 
sensible UI for humans. However, making that statement in and of itself 
is not wrong. Re: anyone can say anything about anything on the Web. We 
try to be nice, and do what we can to have interoperability but there is 
no policing on what we get to say. The trust layer is outside of all 
this. If you trust example.org's statements, or find them useful, that's 
up to you. For example, example.net may make statements about 
example.org. It is up to the consumer to decide (based on its own 
decision-making process) how to handle it.
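
Just to illustrate where that decision sits, a small sketch (Python, 
standard library only). The statements, the idea of recording the 
publishing site next to each triple, and the function name accepted are 
all assumptions for the example, not something AS or RDF prescribes:

```python
# Each statement is kept together with the site that published it.
# (subject, predicate, object, source) - hypothetical data.
statements = [
    ("http://example.org/bar", "rdf:type", "foaf:Person", "example.org"),
    ("http://example.org/bar", "rdf:type", "gr:Location", "example.net"),
]

def accepted(statements, trusted_sources):
    """Keep only the triples whose publisher the consumer chose to trust."""
    return [(s, p, o) for s, p, o, source in statements
            if source in trusted_sources]

# This consumer trusts what example.org says and ignores example.net.
print(accepted(statements, {"example.org"}))
```

Both statements exist on the Web regardless; the policy only affects 
what this particular consumer chooses to act on.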

It is as simple as it gets. Triple statements (containing the atomic 
parts: subject, property, object) are made to describe everything around 
us (data and vocabularies). We walk through the Web / graphs all the 
same. The RDF *language* is essentially the EAV model for the Web. 
Everything else which tries to re-invent this wheel tends to fall short.
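
For the curious, a minimal sketch of that idea (Python, standard library 
only; this is not any real RDF library, and the function name match is 
made up). Triples are plain tuples, and the same pattern lookup works 
whether the statement is about data or about a vocabulary:

```python
def match(triples, s=None, p=None, o=None):
    """Return the triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

graph = {
    # Statements about data.
    ("http://example.org/foo", "rdf:type", "as:Note"),
    ("http://example.org/foo", "rdf:type", "oa:Annotation"),
    # A statement about vocabulary terms, stored the same way.
    ("as:Place", "skos:exactMatch", "gr:Location"),
}

# Everything said about a resource.
print(match(graph, s="http://example.org/foo"))

# Every mapping, found with the exact same mechanism.
print(match(graph, p="skos:exactMatch"))
```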

Hope the above was of some interest to some of you :)

-Sarven
http://csarven.ca/#i

Received on Friday, 14 August 2015 08:35:15 UTC