Blank nodes for unknown values: was Re: [External] The Joy of NULLs (not)

Another way to think about the “extra structure”, at least conceptually, is from a property graph POV where edges can contain data beyond the predicate label. The subtle namespace difference lets the data consumer decide whether they want to peek inside the edge or not. It seems like a nice compromise for representing property graphs using RDF.

Here are some other Wikibase properties that have at least one occurrence where the property value is blank/unknown:

https://w.wiki/76m


I agree that the statement URI is strange because it only dumps you on the page. It’s especially odd since there is a hash URI variant being used in the HTML that positions browsers directly on the claim itself:

https://www.wikidata.org/wiki/Q8747#Q8747$49454632-42cf-dca0-2ec4-47ece3472edf
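
As a rough illustration, the relationship between the statement URI and the hash URI can be sketched in Python. The mapping is inferred only from the two URIs quoted in this thread (the first "-" after the entity ID becomes "$" in the fragment), so treat it as an assumption rather than documented Wikidata behaviour. The function name is hypothetical.

```python
# Sketch: derive the HTML page anchor from a statement URI.
# The pattern is inferred from the two example URIs in this thread only.
STATEMENT_NS = "http://www.wikidata.org/entity/statement/"
PAGE_BASE = "https://www.wikidata.org/wiki/"


def statement_uri_to_page_anchor(statement_uri: str) -> str:
    """Map .../statement/Q8747-<guid> to .../wiki/Q8747#Q8747$<guid>."""
    if not statement_uri.startswith(STATEMENT_NS):
        raise ValueError("not a statement URI: " + statement_uri)
    local = statement_uri[len(STATEMENT_NS):]
    # Split on the first hyphen only: entity ID before it, claim GUID after it.
    entity_id, sep, guid = local.partition("-")
    if not sep or not guid:
        raise ValueError("unexpected statement identifier: " + local)
    return f"{PAGE_BASE}{entity_id}#{entity_id}${guid}"


if __name__ == "__main__":
    print(statement_uri_to_page_anchor(
        STATEMENT_NS + "Q8747-49454632-42cf-dca0-2ec4-47ece3472edf"))
```

For the Q8747 statement discussed here, this reproduces the hash URI shown above.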


Jeff


From: Hugh Glaser <hugh@glasers.org>
Date: Wednesday, August 14, 2019 at 9:52 AM
To: Daniel Hernandez <daniel@degu.cl>
Cc: "semantic-web@w3.org" <semantic-web@w3.org>
Subject: Re: [External] The Joy of NULLs (not)
Resent-From: <semantic-web@w3.org>
Resent-Date: Wednesday, August 14, 2019 at 9:45 AM

Yeah, so I girded my loins and curled up :-)
curl -L -H "Accept:application/rdf+xml" http://www.wikidata.org/entity/Q8747

So you get
<wdt:P19 rdf:nodeID="genid1"/>
and
<p:P19 rdf:resource="http://www.wikidata.org/entity/statement/Q8747-49454632-42cf-dca0-2ec4-47ece3472edf"/>

That is, predicates
http://www.wikidata.org/prop/direct/P19
and
http://www.wikidata.org/prop/P19
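
To make the difference between the two triples concrete, here is a minimal Python sketch that reads a small RDF/XML fragment and reports whether each object is a blank node (rdf:nodeID) or a URI (rdf:resource). Only the two property elements and the rdf:Description subject come from this thread; the rdf:RDF wrapper and namespace declarations are assumed, and a real RDF parser would be the better tool for anything beyond a quick look.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Assumed wrapper and namespace declarations around the elements quoted above.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:wdt="http://www.wikidata.org/prop/direct/"
         xmlns:p="http://www.wikidata.org/prop/">
  <rdf:Description rdf:about="http://www.wikidata.org/entity/Q8747">
    <wdt:P19 rdf:nodeID="genid1"/>
    <p:P19 rdf:resource="http://www.wikidata.org/entity/statement/Q8747-49454632-42cf-dca0-2ec4-47ece3472edf"/>
  </rdf:Description>
</rdf:RDF>"""


def classify_objects(rdf_xml: str):
    """Return (predicate URI, 'blank' or 'uri', value) for each property element."""
    root = ET.fromstring(rdf_xml)
    results = []
    for desc in root.iter(f"{{{RDF}}}Description"):
        for prop in desc:
            # ElementTree tags look like "{namespace}local"; join them into a URI.
            predicate = prop.tag[1:].replace("}", "", 1)
            node_id = prop.get(f"{{{RDF}}}nodeID")
            resource = prop.get(f"{{{RDF}}}resource")
            if node_id is not None:
                results.append((predicate, "blank", node_id))
            elif resource is not None:
                results.append((predicate, "uri", resource))
    return results


if __name__ == "__main__":
    for row in classify_objects(SAMPLE):
        print(row)
```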

It is possibly more interesting to look at P20 (death), as there are two p:P20 triples.

As far as I know, the extra structure is all about a sort of reification, so that you can make ranking and possibly other statements about the facts.
But you want to be able to access the knowledge easily in a triple, so they have the "direct" predicate to take you to chosen value(s) (this they also call the "truthy" property).
I think the browser renderer always uses the reified property, so that it can also give the other data.
And perhaps also confusing is that the URIs of statements (such as "http://www.wikidata.org/entity/statement/Q8747-49454632-42cf-dca0-2ec4-47ece3472edf" above) on a page all redirect to the main entity, which makes it very, very hard to work out what is going on. You have to SPARQL, as even curling doesn't help.
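
For anyone wanting to try the SPARQL route, a minimal Python sketch of building such a query follows. It only constructs the query text and a request URL and does not send anything. The endpoint address is an assumption (the public Wikidata Query Service, which is not named in this thread), and both function names are hypothetical.

```python
import urllib.parse

# Assumed endpoint: the public Wikidata Query Service (not named in this thread).
ENDPOINT = "https://query.wikidata.org/sparql"


def statement_query(statement_uri: str) -> str:
    """Build a SPARQL query listing every predicate/object pair of a statement node."""
    return f"SELECT ?p ?o WHERE {{ <{statement_uri}> ?p ?o . }}"


def query_url(query: str) -> str:
    """Return a GET URL carrying the query as a URL-encoded 'query' parameter."""
    return ENDPOINT + "?" + urllib.parse.urlencode({"query": query})


if __name__ == "__main__":
    uri = ("http://www.wikidata.org/entity/statement/"
           "Q8747-49454632-42cf-dca0-2ec4-47ece3472edf")
    print(statement_query(uri))
    print(query_url(statement_query(uri)))
```

The full statement URI is written in angle brackets rather than with a wds: prefix, so the query does not depend on any prefixes the endpoint may or may not predefine.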

Back to the use of blank nodes.
I'm not sure why they have them at all.
The only information they convey is an opinion that the subject was born or died, possibly with some other metadata about the statement.
Apparently http://www.wikidata.org/entity/Q1068229 was born, but has never died.
And http://www.wikidata.org/entity/Q302 seems to have died, but some people would dispute that.
In the case of Euclid's death, I don't get why they would keep the blank version while having the Alexandria statement too.
I'm sure there are good reasons ;-) By the way, it seems that Wikidata only does this "unknown value" for birth and death stuff, but I haven't dug deep.

Basically, this use of blank nodes just doesn't seem good practice to me, which I guess is why I am worrying at it.

Sorry to have got off the original fun NULL licence plate topic, Mike.

Best
Hugh

PS
If anyone from Wikidata is reading this, did you know that none of the prov:wasDerivedFrom URIs resolve to anything useful, but all go to the Help page?

> On 14 Aug 2019, at 00:19, Daniel Hernandez <daniel@degu.cl> wrote:
>
> It is strange. The problem is the way Wikidata presents blank nodes in tabular output.
> I transformed the query into a CONSTRUCT one (https://w.wiki/74o) to see the blank
> node in a standard RDF syntax. Also, I changed the "Accept" parameter in the request:
>
> Accept: text/turtle, application/rdf+xml
>
> The output is an RDF/XML file including:
>
> <rdf:Description rdf:about="http://www.wikidata.org/entity/Q8747">
> <wdt:P19 rdf:nodeID="t517246030"/>
> </rdf:Description>
>
> The blank is presented as a blank node.
>
> --
> Daniel
>
> ---- On Tue, 13 Aug 2019 18:25:58 -0400 Hugh Glaser <hugh@glasers.org> wrote ----
>> Hmmm. Wikidata can be strange.
>> There is a lot of indirection around.
>> And the redesign pages can be very misleading.
>> I'm not sure I see any actual blank nodes there, or at least none getting exposed.
>>
>> I see ?o gets a text value in the SPARQL output - of the form "tnnnnnnn".
>> Whereas for a person for whom more is known about the PoB, it is a URI of a place.
>>
>> If I dig further (still using the SPARQL engine, the underlying real RDF may be different again!):
>> wd:Q8747 p:P19 ?o .
>> gives
>> http://www.wikidata.org/entity/statement/Q8747-49454632-42cf-dca0-2ec4-47ece3472edf
>> which doesn't look like a blank node to me.
>> And in fact
>> wds:Q8747-49454632-42cf-dca0-2ec4-47ece3472edf ?p ?o .
>> leads you to "t517245985" via ps:P19
>>
>> So, clearly the ISBLANK does something, so internally it is probably doing what you say, but that is not being exposed.
>>
>>
>> Cheers
>>
>>> On 12 Aug 2019, at 15:05, Young,Jeff (OR) <jyoung@oclc.org> wrote:
>>>
>>> Here’s an example showing blank nodes being used to declare the place of birth is unknown in Wikidata:
>>>
>>> https://w.wiki/6$y
>>>
>>> In the UI, it is rendered like this:
>>>
>>> <image001.png>
>>>
>>> Jeff
>>>
>>> From: Daniel Hernandez <daniel@degu.cl>
>>> Date: Monday, August 12, 2019 at 9:42 AM
>>> To: "semantic-web@w3.org" <semantic-web@w3.org>
>>> Subject: [External] Re: The Joy of NULLs (not)
>>> Resent-From: <semantic-web@w3.org>
>>> Resent-Date: Monday, August 12, 2019 at 9:37 AM
>>>
>>> As Enrico pointed out, blank nodes can be used to represent unknown values.
>>> An example of this use is Wikidata. I don't know another example.
>>>
>>> --
>>> Daniel
>>>
>>> On Mon, 12 Aug 2019 07:36:41 +0000
>>> Franconi Enrico <franconi@inf.unibz.it> wrote:
>>>
>>>> Mike, this could easily happen in an RDF world if you register a
>>>> vanity licence plate with anything starting with "_". Indeed, bnodes
>>>> would be the right way to represent unknown but existing plates. --e.
>>>>
>>>> On 11 Aug 2019, at 23:10, Michael F Uschold
>>>> <uschold@gmail.com> wrote:
>>>>
>>>>> This is hilarious. It could never happen in an RDF world! No value,
>>>>> no triple.
>>>>>
>>>>> He tried to prank the DMV. Then his vanity license plate backfired
>>>>> big time.
>>>>> https://mashable.com/article/dmv-vanity-license-plate-def-con-backfire/ <http://flip.it/NIk7FD>
>>
>>
>>
>> --
>> Hugh
>> 023 8061 5652
>>
>>
>>
>
>

--
Hugh
023 8061 5652

Received on Wednesday, 14 August 2019 20:18:16 UTC