- From: <hans.teijgeler@quicknet.nl>
- Date: Fri, 18 Feb 2022 19:22:41 +0100
- To: "'Hugh Glaser'" <hugh@glasers.org>
- Cc: "'Matthew Lange'" <matthew@ic-foods.org>, "'Semantic Web'" <semantic-web@w3.org>
- Message-ID: <013901d824f4$84afe610$8e0fb230$@quicknet.nl>
Hugh,

[HG] ......you manage all the data during ingestion by generating a brand new UUID
[HT] Only for new template instances, since these represent new information.

[HG] ..... you create your own representations of knowledge coming in, and people consuming have to use your IDs.
[HT] The users aren't using those UUIDs; they keep using their own identifiers, and the fetched input information is also UUID-ignorant. The use of "our" IDs is limited to the reference data, because of the adherence to the upper ontology. Users can set up their local RDL extension, using their own identifiers, as long as these are made specializations of the 15926 RDL. CFIHOS is using theirs, and these are subclasses of RDL classes, for example:

transmitter
id http://data.15926.org/cfihos/30000661
rdfs:label transmitter
rdf:type ClassOfFunctionalObject # this is an ISO 15926-2 entity type
rdfs:subClassOf instrument equipment # this is a CFIHOS superclass
rdfs:subClassOf TRANSMITTER # this is an RDL class
skos:definition A physical object that is an element that receives a process variable signal from a sensor and converts it into an output signal

The system is wide open because everybody does his/her thing and is unaware of ISO 15926, and yet they can share information via the, oftentimes federated, triple stores.

And yes, a public "sameAs store", that you mentioned earlier, is a good idea. When using UUIDs, duplicates in that store will, for all intents and purposes, be non-existent. But put some strong protection on that store to prevent hackers from creating havoc.

[HG] Especially (National) Digital Twin stuff - you just can't rely on everyone using the same IDs for everything; and you can't expect to put all the data in a single, curated, store, certainly not at a national scale.
[HT] They don't have to, but they need to use an upper ontology that overarches all participating domains. Without that you keep writing and maintaining interfaces till Doomsday.
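
The remark that UUID duplicates in a shared store are practically non-existent can be checked with a birthday-bound estimate. The sketch below is only an illustration and assumes version 4 (random) UUIDs, which carry 122 random bits; the thread does not say which UUID version is used, and the function names are made up for this example.

```python
import math
import uuid

# Assumption: version 4 UUIDs, where 6 of the 128 bits are fixed
# (version and variant), leaving 122 random bits.
RANDOM_BITS = 122


def collision_probability(n, bits=RANDOM_BITS):
    """Birthday-bound approximation of P(at least one duplicate) among n random IDs."""
    space = 2 ** bits
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-n * (n - 1) / (2 * space))


def ids_for_probability(p, bits=RANDOM_BITS):
    """Approximate number of random IDs needed to reach collision probability p."""
    return math.sqrt(2 * (2 ** bits) * math.log(1 / (1 - p)))


if __name__ == "__main__":
    print("example id:", uuid.uuid4())
    print("P(duplicate) for 10**12 ids:", collision_probability(10 ** 12))
    print("ids needed for a 50% chance:", ids_for_probability(0.5))
```

Under that assumption, a store would need on the order of 2.7 x 10^18 identifiers before a duplicate becomes as likely as not, which is why no central UUID registry is needed.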
Regards, Hans
__________________________________________________________________

-----Original Message-----
From: Hugh Glaser <hugh@glasers.org>
Sent: Friday 18 February 2022 13:32
To: hans.teijgeler@quicknet.nl
Cc: rowen.rathling@gmail.com; Matthew Lange <matthew@ic-foods.org>; Semantic Web <semantic-web@w3.org>
Subject: Re: IDs (was EasierRDF)

Thanks Hans,

It seems to me that your approach is that roughly the UUID is the only strong ID, and any other "ID"s are simply labels to you.
And you manage all the data during ingestion by generating a brand new UUID (https://xkcd.com/927/ :-) ).
So you have a relatively closed system: you create your own representations of knowledge coming in, and people consuming have to use your IDs.

Not saying that's bad - this stuff is hard enough without getting involved in the problems I am asking about.

BTW, I am aware of ISO15926, DT & CDBB, and somewhat involved in the developing 4D stuff.
And I think that Matthew & AlI have a gentle agreement that the standards don't really address the ID management problems, which will need something to do so in due course.
Especially (National) Digital Twin stuff - you just can't rely on everyone using the same IDs for everything; and you can't expect to put all the data in a single, curated, store, certainly not at a national scale.

Best
Hugh

> On 17 Feb 2022, at 23:07, hans.teijgeler@quicknet.nl wrote:
>
> Hi Hugh,
>
> First of all our domain is a process plant, and all plant items are declared and get their UUID and a label.
> These are stored in the consolidating triple store of the plant.
>
> Then comes a project to revamp, or to extend the plant with a new unit. In most cases that is handled by a so-called EPC contractor (EPC = Engineering, Procurement, Construction).
> In the case of a revamp the plant owner must share the relevant data about the existing situation.
> Ideally this is done by federation, leaving that information under the control of the plant owner since that plant is still in operation for the, say, two years that such a revamp takes (which makes it an update of a moving target).
>
> In either case the EPC contractor starts to design the new situation, as the design part of a Digital Twin (to be: DTs are making inroads in this field; Messrs Aveva, for instance, are designing a new one based on ISO 15926. At least that is what they told us). Once the design is finished, the procurement has been done, and the plant has been constructed, the contractor hands over the design & engineering information to the plant owner. It is this rather complex use case that triggered the development of ISO 15926, also because plant owners and EPC contractors have a 'promiscuous' relationship in most cases, causing endless interfacing.
>
> Now to your questions:
>
> [HG] How do you manage the UUIDs?
> [HT] The issue of UUIDs is not managed because the chance of a UUID occurring twice is, for all practical purposes, zero. I read: '128-bits is big enough and the generation algorithm is unique enough that if 1,000,000,000 GUIDs per second were generated for 1 year the probability of a duplicate would be only 50%.'. On top of that they are residing in an endpoint of which there are also a gazillion. I use this generator, but undoubtedly there are more of those.
>
> [HG] For example, when you start to use data from a new DB, do you modify it to re-write the primary keys (or whatever) to the UUID?
> [HT] In the treatise I sent to you on Feb. 15th you can read that we intend to map the data of all applications that are used in the context of a plant to ISO 15926-8 in Turtle. In those apps the identifiers (tag numbers) are those dictated by the plant owner. When mapped, a check is made by the software to see if that identifier already exists. If not, the technical discipline involved must decide what to do.
> But that is exceptional, unless the users of the app spelled the identifier incorrectly. In the little diagram I sent, an in-between triple store is shown where the responsible discipline can correct or ignore or transfer the triples to the consolidating triple store. Because of the uniqueness of the UUIDs that transfer can basically be done by changing the endpoint of it during the transfer.
>
> [HG] Or do you add the UUID to the DB, with an internal mapping table to the keys, and then modify everyone's existing queries to include the UUIDs?
> [HT] The mapping between identifier and UUID is in the declaration, e.g.:
>
> ex:847931fd-eade-4beb-b07d-a9e889611c19
>     rdf:type lci:InanimatePhysicalObject, dm:WholeLifeIndividual, dm:ActualIndividual, rdl:RDS414674 ; # VESSEL
>     rdfs:label "HG-ey37" ;
>     meta:valEffectiveDate "2021-04-13T15:29:00Z"^^xsd:dateTime .
>
> That mapping is used to fetch the UUID for a human-readable label. When an object has more labels, such as serial number, asset number, maintenance number, etc., each identification is covered by a template instance, for example:
>
> ex:763c75da-97c1-4b4e-b699-cf616c7c7a5d
>     rdf:type tpl:ClassifiedIdentificationOfIndividual ;
>     rdfs:label "[VESSEL] individual [HG-ey37] has an [IDENTIFICATION BY ASSET NUMBER] [AN-45348832]"@en ; # storage of this label is optional - it could be generated on the fly
>     tpl:hasIdentified ex:847931fd-eade-4beb-b07d-a9e889611c19 ; # HG-ey37
>     tpl:hasIdentifier "AN-45348832" ;
>     tpl:hasIdentificationType rdl:RDS2221102 ; # IDENTIFICATION BY ASSET NUMBER
>     meta:valEffectiveDate "2021-09-21T10:24:00Z"^^xsd:dateTime .
>
> Please note that templates represent elementary, autonomous information chunks, not the object(s) that the information is about. Actually an elementary KG.
>
> [HG] Or is there some wrapping layer around it somehow?
> [HT] No
>
> [HG] How do you make your UUIDs discoverable, in particular in relation to external IDs that come from different DBs?
> [HT] In addition to what I wrote above, our Reference Data Library has extensions for a number of standardization bodies, like ASTM, ASME, DIN, BS, IEC, etc. What we do is assign our own number and make reference to a particular standard class. For instance:
>
> Transmitter
> id http://data.15926.org/iec/ABA880
> rdfs:label Transmitter
> skos:definition A <Transmitter> is a <Measuring instrument component> and a <PROCESS VARIABLE TRANSMITTER> that accepts a process variable and converts it according to a definite law into a standardized output signal.
> owl:sameAs https://cdd.iec.ch/cdd/iec61987/iec61987.nsf/TU0/0112-2---61987%23ABA880
> meta:valEffectiveDate 2021-10-03Z
> rdf:type ClassOfFunctionalObject
> rdfs:subClassOf Measuring instrument component
> rdfs:subClassOf PROCESS VARIABLE TRANSMITTER
>
> In other cases, where the standardization body has no endpoint, we just refer to a standard, such as:
>
> FLANGED END RING JOINT ASME B16.5 CLASS 2500 NPS 10
> id http://data.15926.org/asme/RDS730304
> rdfs:label FLANGED END RING JOINT ASME B16.5 CLASS 2500 NPS 10
> rdf:type ClassOfFeature
> rdfs:subClassOf FLANGED END RING JOINT ASME B16.5
> rdfs:subClassOf FLANGED END ASME B16.5 CLASS 2500 NPS 10
> skos:definition A <FLANGED END RING JOINT ASME B16.5 CLASS 2500 NPS 10> is a <FLANGED END ASME B16.5 CLASS 2500 NPS 10> and a <FLANGED END RING JOINT ASME B16.5> conforming to the specification for Class 2500, NPS 10
>
> You see: small questions - large answers.
>
> Regards, Hans
> 15926.org
>
> PS This presentation may interest you:
> https://www.youtube.com/watch?v=tRGHBYsz2KM
> It describes the next step after ISO 15926: the CDBB Project in the UK:
> https://www.cdbb.cam.ac.uk/
> ______________________________________________________________________
>
> From: Hugh Glaser <hugh@glasers.org>
> Sent: Thursday 17 February 2022 20:09
> To: hans.teijgeler@quicknet.nl
> Cc: Matthew Lange <matthew@ic-foods.org>; Semantic Web <semantic-web@w3.org>
> Subject: Re: EasierRDF
>
> By the way, how do you manage the UUIDs?
> For example, when you start to use data from a new DB, do you modify it to re-write the primary keys (or whatever) to the UUID?
> Or do you add the UUID to the DB, with an internal mapping table to the keys, and then modify everyone's existing queries to include the UUIDs?
> Or is there some wrapping layer around it somehow?
> How do you make your UUIDs discoverable, in particular in relation to external IDs that come from different DBs?
>
> Or maybe I am thinking of a different world.
> Cheers
>
> > On 17 Feb 2022, at 14:03, <hans.teijgeler@quicknet.nl> wrote:
> >
> > Hi Hugh,
> >
> > We use UUIDs, because of the long period in time and the many contributors of life-cycle information.
> > Next to the UUID we use rdfs:label for easy access. The label can change; the UUID stays lifelong.
> >
> > Regards, Hans
Received on Friday, 18 February 2022 18:22:58 UTC