Re: [dxwg] Improving profile guidance intro

Here's what the introduction looks like substituting Peter's top paragraphs for the previous first paragraph:

A profile is generally understood as the outline of something when seen from a specific point of view. Profiling is also the task of distilling the essential aspects or character of something, such as a person, from a specific angle. In the craft domain, a profile is taken with a tool that conforms in detail to the contours of a three-dimensional object and returns an accurate two-dimensional representation. Other formable materials can then be shaped to that profile, or matched against it to determine how accurately they reproduce the original object from which the profile was taken.

In the same sense, information entities can be viewed from different perspectives. To prepare them for specific uses, they are frequently tested for goodness of fit to some pattern. Alternatively, the pattern can be provided before the information is gathered, as a constraint that ensures the information asset is adequate and appropriate to the job in hand.

Communities create and use data standards to ensure interoperability for information exchange. Although members of a community may use the same basic standard schema, it is very common for different subsets within the larger community to need some further specification of the data they create to meet their own needs. To continue to support interoperability of their data with others, these community members need to express the specifics of their implementation of the data schema. Profiles serve this purpose. Profiles enumerate vocabulary terms, cardinality, and validation rules, and can also include descriptions of the rules used by creators to make decisions regarding their data elements.
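As a rough illustration of "vocabulary terms, cardinality, and validation rules", a profile could be modelled as a list of per-term constraints that a record is checked against. This is only a sketch: the class names `PropertyRule` and `Profile`, the profile name, and the choice of terms and cardinalities are all assumptions made for the example, not anything defined by DCAT or by this draft.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PropertyRule:
    """One constraint in a hypothetical profile: a vocabulary term and its cardinality."""
    term: str
    min_count: int = 0
    max_count: Optional[int] = None  # None means no upper bound


@dataclass
class Profile:
    """A hypothetical profile: a named set of terms with cardinality rules."""
    name: str
    rules: List[PropertyRule] = field(default_factory=list)

    def validate(self, record: Dict[str, List[str]]) -> List[str]:
        """Return a list of violations; an empty list means the record conforms."""
        problems = []
        allowed = {rule.term for rule in self.rules}
        # Terms outside the profile's enumerated vocabulary are reported.
        for term in record:
            if term not in allowed:
                problems.append(f"{term}: not a term in this profile")
        # Each enumerated term is checked against its cardinality.
        for rule in self.rules:
            count = len(record.get(rule.term, []))
            if count < rule.min_count:
                problems.append(f"{rule.term}: expected at least {rule.min_count}, found {count}")
            if rule.max_count is not None and count > rule.max_count:
                problems.append(f"{rule.term}: expected at most {rule.max_count}, found {count}")
        return problems


# Illustrative profile: exactly one title, at least one theme, optional description.
profile = Profile(
    name="example-catalog-profile",
    rules=[
        PropertyRule("dct:title", min_count=1, max_count=1),
        PropertyRule("dcat:theme", min_count=1),
        PropertyRule("dct:description"),
    ],
)

record = {"dct:title": ["First title", "Second title"], "foaf:name": ["Someone"]}
problems = profile.validate(record)
for problem in problems:
    print(problem)
```

In practice a community would more likely express such rules in a dedicated constraint language than in application code; the sketch only shows the kind of information a profile carries.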

Good metadata practice begins with the builders of vocabularies and ontologies. They are encouraged to make their work as broadly applicable as possible so as to maximize future adoption. As a result, vocabularies and ontologies typically define a data model using minimal semantics. For example, DCAT [vocab-dcat-2] defines the concept of a dataset as an abstract entity with distributions and data services as means of accessing data. It is silent on whether a distribution should be in a particular serialization, or set of serializations. It is also silent on how data services should be configured. While it states that the value of dcat:theme should be a SKOS concept, it does not specify a particular SKOS [skos-reference] concept scheme, and so on. Other vocabularies such as Dublin Core Terms [DCTERMS] are equally parsimonious in their prescriptions of how they should be used. This means that data models and methods of working can be applied in circumstances different from those in which the original definition work was carried out and, in that sense, these promote broad interoperability.

In addition to addressing the needs of a specific community, a profile may also apply to a single system. Any individual system will be designed to meet a specific set of needs; that is, it will operate in a specific context. It is that context, and the individual choices made by the engineers working within it, that will determine how a vocabulary or set of vocabularies will be used. For example, a system ingesting data may require that a specific subset of properties from a range of vocabularies is used and that only terms from a defined code list are used as values for specified properties. In other words, where the 'base vocabulary' might say "the value of this property SHOULD be a value from a managed code list", a specialized profile will say "the value of this property MUST be from this specific code list".
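The difference between the base vocabulary's SHOULD and the profile's MUST can be sketched as two checks on the same value: the first only warns, the second rejects. The code lists, URIs, and function names below are hypothetical and chosen purely for illustration.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical managed code lists; the URIs are placeholders, not real schemes.
MANAGED_CODE_LISTS: Dict[str, Set[str]] = {
    "themes-a": {"http://example.org/themes-a/transport", "http://example.org/themes-a/health"},
    "themes-b": {"http://example.org/themes-b/energy"},
}

# The specialized profile picks one specific list and makes it mandatory.
PROFILE_CODE_LIST = MANAGED_CODE_LISTS["themes-a"]


def check_against_base(value: str) -> Tuple[bool, Optional[str]]:
    """Base vocabulary (SHOULD): never rejects, at most returns a warning."""
    if any(value in codes for codes in MANAGED_CODE_LISTS.values()):
        return True, None
    return True, f"warning: {value!r} is not from a managed code list"


def check_against_profile(value: str) -> Tuple[bool, Optional[str]]:
    """Specialized profile (MUST): rejects anything outside the chosen list."""
    if value in PROFILE_CODE_LIST:
        return True, None
    return False, f"error: {value!r} is not from the required code list"


# A value from a managed list that is not the one the profile requires.
value = "http://example.org/themes-b/energy"
base_result = check_against_base(value)
profile_result = check_against_profile(value)
print(base_result)
print(profile_result)
```

The same value satisfies the base vocabulary's recommendation yet fails the profile, which is the narrowing described above: the profile changes both the strength of the requirement and the set of acceptable values.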

This document is about how to formulate and communicate profiles.

-- 
GitHub Notification of comment by kcoyle
Please view or discuss this issue at https://github.com/w3c/dxwg/issues/417#issuecomment-427553889 using your GitHub account

Received on Saturday, 6 October 2018 07:38:43 UTC