Re: Disambiguating Things

On 3 July 2017 at 17:12, Thad Guidry <> wrote:

> No more examples.
> LOL, Dan... My Rabbit hole is not that deep.  This is about comparing lots
> of property values...but holding all of them as a Bitmap that is then
> checksum-ed and stored as a non-unique identifier.
> It's simply that I amalgamate lots of property values from my virtual IoT
> Things and create a long-lived identifier from those (virtual IoT Things
> are spun up dynamically and output values of measurements over their
> lifespan) and so the actual identifier is not held for very long; in fact,
> I don't store it after everything is amalgamated (only a few hours)...but
> my Things Category type and all their properties and their values are
> stored on a DB.  The need is to compare those virtual IoT Things for weeks,
> months, years.  Imagine a VM... but instead... it's an IoT VM (skunkworks).
> These kinds of scenarios are even present now in Industrial domains.
> For now, most folks are just pitching over the fence with
> "text"... but knowing a bit more about textual identifiers themselves, like
> how they are formed, the algorithm used, and other metadata about
> identifiers, etc., is a present need.  We currently know when we have
> identical virtual IoT Things because we can compare those checksums or
> fingerprints.
> Probably best if I changed the subject of the email thread to
> "Disambiguating 1000's of properties for virtual IoT Things"

Ah, ok I think I'm getting the picture now. Probably-maybe. It might be
related to the habit in RDF/RDFS/OWL of saying that some properties are
"inverse functional"; for any particular value of the property there's at
most one thing in the world that can have that property/value combination.
If that's about it, could you file a GitHub issue?
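
A minimal sketch of how that reading might look in practice, assuming the
fingerprint is a hash over a canonical serialization of a Thing's
property/value pairs. The names fingerprint, group_by_fingerprint and the
sample properties are hypothetical, and SHA-1 over sorted JSON is only one
possible choice, not necessarily what Thad's system does. Things that share a
fingerprint are treated as candidates for being the same thing, not as proof.

```python
import hashlib
import json
from collections import defaultdict


def fingerprint(properties: dict) -> str:
    """Hash a canonical (key-sorted) serialization of property/value pairs."""
    canonical = json.dumps(properties, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


def group_by_fingerprint(things: dict) -> dict:
    """Map each fingerprint to the ids of the Things that produce it."""
    groups = defaultdict(list)
    for thing_id, props in things.items():
        groups[fingerprint(props)].append(thing_id)
    return dict(groups)


# Hypothetical virtual IoT Things; vm-a and vm-b differ only in key order.
things = {
    "vm-a": {"category": "TemperatureSensor", "unit": "CEL", "interval_s": 60},
    "vm-b": {"interval_s": 60, "unit": "CEL", "category": "TemperatureSensor"},
    "vm-c": {"category": "TemperatureSensor", "unit": "FAH", "interval_s": 60},
}

# Any fingerprint shared by more than one Thing flags candidate duplicates.
duplicates = [ids for ids in group_by_fingerprint(things).values() if len(ids) > 1]
print(duplicates)
```

Sorting the keys before hashing is what makes the fingerprint independent of
the order in which property values were collected.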


> -Thad
> +ThadGuidry <>
> On Mon, Jul 3, 2017 at 10:50 AM Dan Brickley <> wrote:
>> I suspect you'll find this gets complex quite quickly as you run into
>> for_Bibliographic_Records -like distinctions.
>> Back in the FOAF project we experimented with a sha1sum property. It
>> turns out two entirely different entities (on one conceptualization) can
>> have the same character/byte content and hence hash.
>> e.g. in a unix-y environment:
>> touch hello_world.txt
>> ls -l hello_world.txt
>> ... gives something like -rw-rw-r-- 1 ubuntu ubuntu 0 Jul  3 15:21
>> hello_world.txt
>> shasum hello_world.txt
>> da39a3ee5e6b4b0d3255bfef95601890afd80709  hello_world.txt
>> ... on a different machine you'll have a different datestamp and username
>> but the same empty file. Various parts of have some approach
>> to describing these kinds of distinction for CreativeWorks (e.g.
>> "workExample"; or "encoding" on MediaObject; or "distribution" vs
>> "contentUrl" on Dataset -> DataDownload, ...). And the further you get from
>> bytes, the more tenuous the link back to checksum maths (cf.
>> dnaChecksum...).
>> I can see value in having enough clarity around MediaObject and nearby
>> that we can talk about checksums more cleanly, but I'm not sure how far
>> that'll get us. It would be interesting for datasets and software
>> applications and so on to have this capability, so that you can look up the
>> right metadata to go with a concrete download / media file. Do you have
>> some more examples we can work through?
>> cheers,
>> Dan
>> On 3 July 2017 at 16:32, Thad Guidry <> wrote:
>>> I now have the need to disambiguate between Things at a deeper level
>>> than just property comparisons.
>>> I'd like to see the use or way of telling my apps that a checksum is the
>>> disambiguatingDescription or identifier property on my Things.
>>> Currently, we have no Type of "Checksum" under Intangible.  That might
>>> be worth considering in the future.
>>> But we do have PropertyValue available, but it loses the
>>> This need arose from the recent introduction of the "checksum" property in
>>> Wikidata as well...hence my Apps can take advantage of that now but not
>>> without uplifts, since my Apps depend on valid properties in
>>> (insert skunkworks stuff here)
>>> I guess this is an alternative way to do what I need:
>>> {
>>>   "@context": "",
>>>   "@type": "Thing",
>>>   "name": "Some IoT Thing",
>>>   "url": "",
>>>   "identifier": {
>>>     "@type": "PropertyValue",
>>>     "alternateName": "checksum",
>>>     "additionalType": "",
>>>     "value": "8044d756b7f00b695ab8dce07dce43e5",
>>>     "unitCode": ""
>>>   }
>>> }
>>> Thoughts or ideas or anything that I am missing?
>>> If the above is actually a really good example, then we should probably
>>> add it as a 3rd example on  ?
>>> -Thad
>>> +ThadGuidry <>
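
For reference, markup shaped like the quoted example could be generated along
these lines. This is only a sketch under stated assumptions: the helper name
checksum_identifier, the sample properties, the use of SHA-256, and the
"https://schema.org" context value are all illustrative choices, not taken
from the thread.

```python
import hashlib
import json


def checksum_identifier(properties: dict) -> dict:
    """Build a PropertyValue-style identifier from a hash of the properties."""
    canonical = json.dumps(properties, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return {
        "@type": "PropertyValue",
        "alternateName": "checksum",
        "value": digest,
    }


# Hypothetical property values collected from one virtual IoT Thing.
properties = {"category": "TemperatureSensor", "unit": "CEL", "interval_s": 60}

thing = {
    "@context": "https://schema.org",  # assumed; the quoted example leaves it empty
    "@type": "Thing",
    "name": "Some IoT Thing",
    "identifier": checksum_identifier(properties),
}

print(json.dumps(thing, indent=2))
```

The markup by itself still does not say which algorithm produced the value,
which is the gap the thread is discussing.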

Received on Monday, 3 July 2017 17:00:02 UTC