Re: Disambiguating Things

On 3 July 2017 at 17:12, Thad Guidry <thadguidry@gmail.com> wrote:

> No more examples.
>
> LOL, Dan... My Rabbit hole is not that deep.  This is about comparing lots
> of property values...but holding all of them as a Bitmap that is then
> checksum-ed and stored as a non-unique identifier.
>
> It's simply that I amalgamate lots of property values from my virtual IoT
> Things and create a long-lived identifier from those. Virtual IoT Things
> are spun up dynamically and output values of measurements over their
> lifespan, so the actual identifier is not held for very long; in fact,
> I don't store it after everything is amalgamated (only a few hours). But
> my Things' Category type and all their properties and their values are
> stored in a DB.  The need is to compare those virtual IoT Things for weeks,
> months, years.  Imagine a VM... but instead... it's an IoT VM (skunkworks).
>
> These kinds of scenarios are even present now in Industrial domains.
> For now, most folks are just pitching over the Schema.org fence with
> "text"... but knowing a bit more about textual identifiers themselves, like
> how they are formed, the algorithm used, and other metadata about
> identifiers, etc., is a present need.  We currently know when we have
> identical virtual IoT Things because we can compare those checksums or
> fingerprints.
>
> Probably best if I changed the subject of the email thread to
> "Disambiguating 1000's of properties for virtual IoT Things"
>
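
One way to read the description quoted above, as a minimal sketch only: the
property values of a Thing are serialized in a canonical order and hashed, so
that two Things with the same values get the same fingerprint. The sorted-key
JSON encoding (standing in for the "bitmap"), the use of MD5, and the property
names are all assumptions made for illustration; the thread does not say how
the real system encodes or hashes its values.

```python
import hashlib
import json


def fingerprint(properties: dict) -> str:
    """Return an MD5 hex digest over a canonical serialization of the properties.

    Assumption: sorted-key JSON stands in for the "bitmap" of property
    values; the real encoding is not described in the thread.
    """
    canonical = json.dumps(properties, sort_keys=True, separators=(",", ":"))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


# Two hypothetical virtual Things with the same property values, listed in
# a different order, and a third that differs in one measurement.
thing_a = {"category": "sensor", "unit": "CEL", "reading": 21.5}
thing_b = {"reading": 21.5, "unit": "CEL", "category": "sensor"}
thing_c = {"category": "sensor", "unit": "CEL", "reading": 22.0}

same = fingerprint(thing_a) == fingerprint(thing_b)
different = fingerprint(thing_a) != fingerprint(thing_c)
```

Because the fingerprint depends only on the values, it says "same values",
not "same Thing", which is why it works as a non-unique identifier.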

Ah, ok I think I'm getting the picture now. Probably-maybe. It might be
related to the habit in RDF/RDFS/OWL of saying that some properties are
"inverse functional"; for any particular value of the property there's at
most one thing in the world that can have that property/value combination.
If that's about it, could you file a GitHub issue?
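
To make the "inverse functional" idea concrete, here is a small sketch under
stated assumptions: records that share a value for such a property are taken
to describe the same thing. The function name, the record layout and the
second checksum value are hypothetical, and whether a checksum can really be
treated this way is exactly the open question.

```python
from collections import defaultdict


def group_by_inverse_functional(things, prop):
    """Group record names by their value for an inverse functional property.

    If `prop` is inverse functional, every name in one group refers to the
    same underlying thing.
    """
    groups = defaultdict(list)
    for thing in things:
        value = thing.get(prop)
        if value is not None:  # records without the property stay ungrouped
            groups[value].append(thing["name"])
    return dict(groups)


# Hypothetical records; the first checksum is the value from the JSON-LD
# example in this thread, the other is made up.
records = [
    {"name": "thing-1", "checksum": "8044d756b7f00b695ab8dce07dce43e5"},
    {"name": "thing-2", "checksum": "8044d756b7f00b695ab8dce07dce43e5"},
    {"name": "thing-3", "checksum": "0cc175b9c0f1b6a831c399e269772661"},
]
merged = group_by_inverse_functional(records, "checksum")
```

Here thing-1 and thing-2 end up in one group and thing-3 in another.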

Dan



> -Thad
> +ThadGuidry <https://www.google.com/+ThadGuidry>
>
> On Mon, Jul 3, 2017 at 10:50 AM Dan Brickley <danbri@google.com> wrote:
>
>>
>> I suspect you'll find this gets complex quite quickly as you run into
>> https://en.wikipedia.org/wiki/Functional_Requirements_for_Bibliographic_Records
>> -like distinctions.
>>
>> Back in the FOAF project we experimented with a sha1sum property. It
>> turns out two entirely different entities (on one conceptualization) can
>> have the same character/byte content and hence hash.
>>
>> e.g. in a unix-y environment:
>>
>> touch hello_world.txt
>>
>> ls -l hello_world.txt
>>
>> ... gives something like
>>
>> -rw-rw-r-- 1 ubuntu ubuntu 0 Jul  3 15:21 hello_world.txt
>>
>> shasum hello_world.txt
>>
>> da39a3ee5e6b4b0d3255bfef95601890afd80709  hello_world.txt
>>
>> ... on a different machine you'll have a different datestamp and username
>> but the same empty file. Various parts of schema.org have some approach
>> to describing these kinds of distinction for CreativeWorks (e.g.
>> "workExample"; or "encoding" on MediaObject; or "distribution" vs
>> "contentUrl" on Dataset -> DataDownload, ...). And the further you get from
>> bytes, the more tenuous the link back to checksum maths; cf.
>> dnaChecksum...
>>
>> I can see value in having enough clarity around MediaObject and nearby
>> that we can talk about checksums more cleanly, but I'm not sure how far
>> that'll get us. It would be interesting for datasets and software
>> applications and so on to have this capability, so that you can look up the
>> right metadata to go with a concrete download / media file. Do you have
>> some more examples we can work through?
>>
>> cheers,
>>
>> Dan
>>
>> On 3 July 2017 at 16:32, Thad Guidry <thadguidry@gmail.com> wrote:
>>
>>> I now have the need to disambiguate between Things at a deeper level
>>> than just property comparisons.
>>>
>>> I'd like to see a way of telling my apps that a checksum is the
>>> disambiguatingDescription or identifier property on my Things.
>>>
>>> Currently, we have no Type of "Checksum" under Intangible.  That might
>>> be worth considering in the future.
>>>
>>> But we do have PropertyValue available, but it loses the
>>>
>>> This need arose from the recent introduction of the "checksum" property
>>> in Wikidata as well...hence my Apps can take advantage of that now but not
>>> without Schema.org uplifts, since my Apps depend on valid properties in
>>> Schema.org... (insert skunkworks stuff here)
>>>
>>> I guess this is an alternative way to do what I need:
>>>
>>> {
>>>   "@context": "http://schema.org/",
>>>   "@type": "Thing",
>>>   "name": "Some IoT Thing",
>>>   "url": "http://www.example.com/Some+IoT+Thing",
>>>   "identifier": {
>>>     "@type": "PropertyValue",
>>>     "alternateName": "checksum",
>>>     "additionalType": "https://www.wikidata.org/wiki/Q218341",
>>>     "value": "8044d756b7f00b695ab8dce07dce43e5",
>>>     "unitCode": "https://www.wikidata.org/wiki/Q185235"
>>>   }
>>> }
>>>
>>> Thoughts or ideas or anything that I am missing?
>>>
>>> If the above is actually a really good example, then we should probably
>>> add it as a 3rd example on http://schema.org/identifier ?
>>>
>>> -Thad
>>> +ThadGuidry <https://www.google.com/+ThadGuidry>
>>>
>>>
>>
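
The empty-file point quoted earlier in this thread can be checked with a few
lines of Python. This is a sketch only; the two "machine" records, with their
owners and timestamps, are hypothetical stand-ins for the same empty file
created in two places.

```python
import hashlib

# SHA-1 over zero bytes: what shasum reports for a freshly touched, empty file.
empty_digest = hashlib.sha1(b"").hexdigest()

# Hypothetical per-machine metadata; none of it feeds into the hash.
file_on_machine_1 = {"owner": "ubuntu", "mtime": "Jul  3 15:21", "content": b""}
file_on_machine_2 = {"owner": "alice", "mtime": "Jul  4 09:02", "content": b""}

# Hash only the content of each file and collect the distinct digests.
digests = {
    hashlib.sha1(f["content"]).hexdigest()
    for f in (file_on_machine_1, file_on_machine_2)
}
```

Both records hash to the digest shown in the shasum output above, even though
their owners and timestamps differ, so the hash alone cannot tell the two
files apart.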

Received on Monday, 3 July 2017 17:00:02 UTC