
RE: Attempted example CSV metadata document and template

From: Tandy, Jeremy <jeremy.tandy@metoffice.gov.uk>
Date: Tue, 24 Jun 2014 10:34:28 +0000
To: Ivan Herman <ivan@w3.org>
CC: Dan Brickley <danbri@google.com>, W3C CSV on the Web Working Group <public-csv-wg@w3.org>
Message-ID: <2624871D9A05174691BD59F8EFD68AE20884C7C4@EXXCMPD1DAG3.cmpd1.metoffice.gov.uk>
> -----Original Message-----
> From: Ivan Herman [mailto:ivan@w3.org]
> Sent: 23 June 2014 17:35
> To: Tandy, Jeremy
> Cc: Dan Brickley; W3C CSV on the Web Working Group
> Subject: Re: Attempted example CSV metadata document and template
> 
> 
> On 23 Jun 2014, at 18:03 , Tandy, Jeremy
> <jeremy.tandy@metoffice.gov.uk> wrote:
> 
> >> -----Original Message-----
> >> From: Ivan Herman [mailto:ivan@w3.org]
> >> Sent: 21 June 2014 08:38
> >> To: Tandy, Jeremy
> >> Cc: Dan Brickley; W3C CSV on the Web Working Group
> >> Subject: Re: Attempted example CSV metadata document and template
> >>
> >> Jeremy,
> >>
> >> one thing that I was wondering about was that the simple naming
> >> mechanism for the various microsyntaxes may not work out. Consider
> >>
> >> 	"columns" : [
> >> 		{ "name" : "datetime",
> >> 		  ...
> >>                  "microsyntax": [
> >> 			{ "name" : N1,
> >> 			  "regexp" : "...."
> >> 			},
> >> 			.....
> >>                  ]
> >> 		},
> >> 		{ "name" : "anothercolumn",
> >> 		  ...
> >> 		  "microsyntax": [
> >> 			{ "name" : N1,
> >> 			  "regexp" : "...."
> >> 			},
> >> 			.....
> >> 		  ]
> >> 		}
> >>
> >> 	]
> >>
> >>
> >> When working through the cells in a row, what would 'N1' refer to?
> >> Unless we want to require the unicity of the microsyntax names, we
> >> may hit an issue. And I do not think requiring a unique name is a
> >> good idea; if the metadata becomes big, this may become a nuisance.
> >
> > Agreed. I made the assumption that all instances of "name" within a
> > given metadata document would need to be unique. I had not considered
> > any mechanisms to make this easy for users; e.g. using the "name" from
> > an enclosing object to automatically _namespace_ sub-names.
> >
> > We could leave it to the user to ensure uniqueness (easy for us; adds
> > load to the end user which is less good); in which case the example
> > above would fail to validate.
> >
> > Alternatively, we could apply a form of name-spacing; e.g.
> > "datetime/N1" and "anothercolumn/N1" within your example above.
> >
> >>
> >> What this means is that the syntax becomes more complicated.
> >> Something like {datetime:N1} or something similar (which raises the
> >> issue of escape characters, too:-(
> >
> > Agreed! I chose a different separator character to you, but the same
> > issue applies.
> >
> >>
> >> As for the conditionals: mustache has some syntax for this which is a
> >> bit different
> >>
> >> {{#bla}}
> >>   .. any template here
> >> {{/bla}}
> >>
> >> although the mustache semantics is a bit different (afaik it relies
> >> on the existence or not of a key in an object). We could use the
> >> mustache semantics but we probably need something more, too, like "if
> >> 'bla' is a microsyntax name and is true if the value of the cell
> >> matches the regexp then it is true".
> >
> > Syntax-wise, we want our metadata document to be valid JSON, so we
> > would need something different to mustache. However, I agree that our
> > use cases call for similar semantics. Perhaps the syntax might be
> > something like:
> >
> > "condition": {
> >    "operator": "if ({bla})",
> >    "template": {
> >        "name": "2010_Occupations-csv-to-ttl",
> >        "description": "Template converting CSV content to SKOS/RDF (expressed in Turtle syntax).",
> >        "type": "template",
> >        "path": "2010_Occupations-csv-to-ttl.ttl",
> >        "hasFormat": "text/turtle"
> >    }
> > }
> >
> > In this case, I'm trying to say that the template will be triggered
> > if the value of {bla} is true / not null etc. ... the value of {bla} is
> > taken by evaluating the column (or microsyntax element) with "name" =
> > "bla" for the row being processed. Like you say: """it relies on the
> > existence or not of a key in an object"""
> >
> > (I don't really like the syntax; I guess that others can come up with
> > better.)
> 
> Ouch, you are right, I forgot about the fact that we want templates for
> conditionals:-(
> 
> But before getting into the boring issue of syntax we have to decide
> whether we need them...

Syntax, boring ... no never! FWIW, it occurs to me that the conditional match might do better inside the "template" object, but more on that below.

> 
> >
> >>
> >> But I agree that the conditional complicates the templates a lot.
> >> Here is where our use cases may have to switch in: do our use cases
> >> justify the need for conditionals (remembering that, though we are
> >> discussing turtle here, I do not see any difference between
> >> generating turtle and generating XML or JSON through the same
> >> mechanism).
> >
> > The requirement is ["R-ConditionalProcessingBasedOnCellValues"][1],
> > motivated by the ExpressingHierarchyWithinOccupationalListings use
> > case. This use case gives us two requirements:
> >
> > i) triggering a template if a value of a cell is not null; e.g. to
> > generate the SKOS concept scheme from the SOC structure ...
> >
> > 15-0000,,,,Computer and Mathematical Occupations,,,,,
> > ,15-1100,,,Computer Occupations,,,,,
> > ,,15-1110,,Computer and Information Research Scientists,,,,,
> > ,,,15-1111,Computer and Information Research Scientists,,,,,
> >
> > Here we can see that I only want an ex:SOC-MajorGroup entity created
> > on the first row shown above (where col 1 is populated).
> >
> > ii) triggering a template if a value of a cell equates to a
> > particular string (or the opposite); e.g. when the value of
> > "onetsoc-occupation" = "00" as shown in the example shown [earlier in
> > this email thread][3]. ...
> >
> > "operator": "if ({onetsoc-occupation} == '00')"
> >
> > Perhaps there are cases for more complex operations? I don't know.
> > Perhaps this is where call-back functions or promises could be used to
> > parse a row and provide a Boolean response as to whether the template
> > should be triggered? Again, I don't know ... and some considerable
> > thought would be required to work out the details of such.
> 
> For me these seem to be convincing that we need something. My
> preference would be, though, to avoid all the issues about defining
> 'if'-s and 'else'-s and comparison operators, etc, etc, and fall back
> on regular expressions ('match'-'not match') simply because regular
> expressions are used elsewhere already. Would that be enough?

I think that this would provide sufficient functionality for the two example requirements I listed.

Below, I've tried to provide worked examples for each of these requirements showing how such regexp conditional matching might be implemented ...

1) triggering a template if a column in the row being processed is not empty (or null):

data snippet (from [soc_structure_2010.csv][1]):
---
Major Group,Minor Group,Broad Group,Detailed Occupation,,,,,,
,,,,,,,,,
{snip}
15-0000,,,,Computer and Mathematical Occupations,,,,,
,15-1100,,,Computer Occupations,,,,,
{snip}
,,15-1190,,Miscellaneous Computer Occupations,,,,,
,,,15-1199,"Computer Occupations, All Other",,,,,
{snip}
---

[1]: http://w3c.github.io/csvw/use-cases-and-requirements/soc_structure_2010.csv 

Let's assume that I want to trigger a template to create "Detailed Occupation" entities - I only want to trigger this when the 4th column is populated. Note that I have used "conditional-match" within the template blocks to provide a REGEXP that is assessed against the _ENTIRE_ row to determine if the template is triggered. Again, I'm not wedded to the names or syntax - just trying to express the idea. 

(Aside 1: in creating this example, I have blundered into the challenges of wanting to repeatedly use the same "name" within microsyntax blocks ... I got around the need for uniqueness using "/" as a pseudo path separator, but it feels clunky and ends up with long names!)
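
A minimal Python sketch of how a processor might apply that pseudo-path namespacing automatically, so that authors could keep short element names. The function name `qualify_microsyntax_names` and the shortened element names are hypothetical, not part of any draft:

```python
def qualify_microsyntax_names(columns, separator="/"):
    """Map 'column-name/element-name' to the element's regexp.

    Hypothetical helper: prefixes each microsyntax element name with the
    name of its enclosing column so that short names can be reused.
    """
    qualified = {}
    for column in columns:
        for element in column.get("microsyntax", []):
            key = f'{column["name"]}{separator}{element["name"]}'
            if key in qualified:
                raise ValueError(f"duplicate microsyntax name: {key}")
            qualified[key] = element["regexp"]
    return qualified


columns = [
    {"name": "soc-broad-group-code", "microsyntax": [
        {"name": "major-group-element", "regexp": r"^(\d{2})-\d{4}$"},
        {"name": "minor-group-element", "regexp": r"^\d{2}-(\d{2})\d{2}$"},
    ]},
    {"name": "soc-detailed-occupation-code", "microsyntax": [
        {"name": "major-group-element", "regexp": r"^(\d{2})-\d{4}$"},
    ]},
]

names = qualify_microsyntax_names(columns)
for key in sorted(names):
    print(key)
```

The same short name ("major-group-element") appears under two columns without clashing; a clash within one column would still be reported.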

(Aside 2: I also noticed that my REGEXP weren't valid when embedding them in JSON as the "\" character needed escaping - hence the use of "\\" below ... I am assuming that any JSON processor will parse the literal _before_ trying to process the REGEXP)
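
To illustrate the escaping point, here is a small Python sketch using only the standard `json` and `re` modules. The fragment is an illustrative one-element object, not the full metadata document:

```python
import json
import re

# Metadata fragment as it would sit in the file: each backslash is doubled
# so that the JSON string literal is valid.
fragment = r'{"name": "major-group-element", "regexp": "^(\\d{2})-\\d{4}$"}'

entry = json.loads(fragment)
pattern = entry["regexp"]  # JSON parsing collapses "\\d" to "\d"

match = re.match(pattern, "15-1199")
major_group = match.group(1)

# With single backslashes the JSON parser rejects the document before any
# regexp processing can happen.
try:
    json.loads(r'{"regexp": "^(\d{2})-\d{4}$"}')
    single_backslash_ok = True
except json.JSONDecodeError:
    single_backslash_ok = False

print(pattern, major_group, single_backslash_ok)
```

So the assumption holds for this parser at least: the literal is decoded first, and the regexp engine only ever sees the single-backslash form.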

Here's the metadata description for the resource:

---
{
    "name": "soc-2010",
    "title": "Standard Occupational Classification (2010)",
    "publisher": [{
        "name": "US Bureau of Labor Statistics",
        "web": "http://www.bls.gov/"
    }],
    "resources": [{
        "name": "soc-2010-csv",
        "path": "soc_structure_2010.csv",
        "schema": {"columns": [
            {
                "name": "soc-major-group-code",
                "title": "Major Group",
                "type": "string"
            },
            {
                "name": "soc-minor-group-code",
                "title": "Minor Group",
                "type": "string",
                "microsyntax": [{
                    "name": "soc-minor-group-code/major-group-element",
                    "regexp": "^(\\d{2})-\\d{4}$"
                }]
            },
            {
                "name": "soc-broad-group-code",
                "title": "Broad Group",
                "type": "string",
                "microsyntax": [
                    {
                        "name": "soc-broad-group-code/major-group-element",
                        "regexp": "^(\\d{2})-\\d{4}$"
                    },
                    {
                        "name": "soc-broad-group-code/minor-group-element",
                        "regexp": "^\\d{2}-(\\d{2})\\d{2}$"
                    }
                ]
            },
            {
                "name": "soc-detailed-occupation-code",
                "title": "Detailed Occupation",
                "type": "string",
                "microsyntax": [
                    {
                        "name": "soc-detailed-occupation-code/major-group-element",
                        "regexp": "^(\\d{2})-\\d{4}$"
                    },
                    {
                        "name": "soc-detailed-occupation-code/minor-group-element",
                        "regexp": "^\\d{2}-(\\d{2})\\d{2}$"
                    },
                    {
                        "name": "soc-detailed-occupation-code/broad-group-element",
                        "regexp": "^\\d{2}-\\d{2}(\\d)\\d$"
                    }
                ]
            },
            {
                "name": "soc-title",
                "title": "",
                "type": "string"
            },
            {"name": "empty(1)"},
            {"name": "empty(2)"},
            {"name": "empty(3)"},
            {"name": "empty(4)"},
            {"name": "empty(5)"}
        ]},
        "template": [
            {
                "conditional-match": "^\\d{2}-0{4},{4}.*",
                "name": "major-group-template-ttl",
                "description": "Template converting Major Group content from SOC structure CSV content to SKOS/RDF (expressed in Turtle syntax).",
                "type": "template",
                "path": "major-group-csv-to-ttl-template.ttl",
                "hasFormat": "text/turtle"
            },
            {
                "conditional-match": "^,\\d{2}-\\d{2}0{2},{3}.*",
                "name": "minor-group-template-ttl",
                "description": "Template converting Minor Group content from SOC structure CSV content to SKOS/RDF (expressed in Turtle syntax).",
                "type": "template",
                "path": "minor-group-csv-to-ttl-template.ttl",
                "hasFormat": "text/turtle"
            },
            {
                "conditional-match": "^,{2}\\d{2}-\\d{3}0,{2}.*",
                "name": "broad-group-template-ttl",
                "description": "Template converting Broad Group content from SOC structure CSV content to SKOS/RDF (expressed in Turtle syntax).",
                "type": "template",
                "path": "broad-group-csv-to-ttl-template.ttl",
                "hasFormat": "text/turtle"
            },
            {
                "conditional-match": "^,{3}\\d{2}-\\d{4},.*",
                "name": "detailed-occupation-template-ttl",
                "description": "Template converting Detailed Occupation content from SOC structure CSV content to SKOS/RDF (expressed in Turtle syntax).",
                "type": "template",
                "path": "detailed-occupation-csv-to-ttl-template.ttl",
                "hasFormat": "text/turtle"
            }
        ]
    }]
}
---

(Apologies if the REGEXP has errors - not one of my strengths!)

My "detailed-occupation-csv-to-ttl-template.ttl" would be:
---
ex:{soc-detailed-occupation-code} a ex:SOC-DetailedOccupation ;
    skos:notation "{soc-detailed-occupation-code}" ;
    skos:prefLabel "{soc-title}" ;
    skos:broader ex:{soc-detailed-occupation-code/major-group-element}-0000, 
                 ex:{soc-detailed-occupation-code/major-group-element}-{soc-detailed-occupation-code/minor-group-element}00, 
                 ex:{soc-detailed-occupation-code/major-group-element}-{soc-detailed-occupation-code/minor-group-element}{soc-detailed-occupation-code/broad-group-element}0 .
---

Thus, given the input row below:
---
,,,15-1199,"Computer Occupations, All Other",,,,,
---

... the "detailed-occupation-template-ttl" should be triggered, based on the conditional match REGEXP, and provide the following TTL snippet:
---
ex:15-1199 a ex:SOC-DetailedOccupation ;
    skos:notation "15-1199" ;
    skos:prefLabel "Computer Occupations, All Other" ;
    skos:broader ex:15-0000, 
                 ex:15-1100, 
                 ex:15-1190 .
---
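
For anyone wanting to try this, here is a rough Python sketch of the processing I have in mind for the row above. It is only one possible reading: the function name `render` is hypothetical, the conditional match is given as a prefix only (Python's `re.match` anchors at the start of the row, so no trailing wildcard is needed), and the microsyntax elements are evaluated against the cell of their enclosing column.

```python
import csv
import re

# Prefix of the conditional match for the detailed-occupation template.
CONDITIONAL_MATCH = r"^,{3}\d{2}-\d{4},"

COLUMN_NAMES = [
    "soc-major-group-code", "soc-minor-group-code", "soc-broad-group-code",
    "soc-detailed-occupation-code", "soc-title",
]

# Microsyntax elements of the "soc-detailed-occupation-code" column.
MICROSYNTAX = {
    "soc-detailed-occupation-code/major-group-element": r"^(\d{2})-\d{4}$",
    "soc-detailed-occupation-code/minor-group-element": r"^\d{2}-(\d{2})\d{2}$",
    "soc-detailed-occupation-code/broad-group-element": r"^\d{2}-\d{2}(\d)\d$",
}

TEMPLATE = """ex:{soc-detailed-occupation-code} a ex:SOC-DetailedOccupation ;
    skos:notation "{soc-detailed-occupation-code}" ;
    skos:prefLabel "{soc-title}" ;
    skos:broader ex:{soc-detailed-occupation-code/major-group-element}-0000,
                 ex:{soc-detailed-occupation-code/major-group-element}-{soc-detailed-occupation-code/minor-group-element}00,
                 ex:{soc-detailed-occupation-code/major-group-element}-{soc-detailed-occupation-code/minor-group-element}{soc-detailed-occupation-code/broad-group-element}0 .
"""


def render(row_text):
    """Return the filled template, or None if the row does not trigger it."""
    if not re.match(CONDITIONAL_MATCH, row_text):
        return None
    cells = next(csv.reader([row_text]))       # handles the quoted title
    values = dict(zip(COLUMN_NAMES, cells))    # trailing empty columns ignored
    code = values["soc-detailed-occupation-code"]
    for name, regexp in MICROSYNTAX.items():
        values[name] = re.match(regexp, code).group(1)
    # Replace each {name} placeholder with the corresponding value.
    return re.sub(r"\{([^{}]+)\}", lambda m: values[m.group(1)], TEMPLATE)


row = ',,,15-1199,"Computer Occupations, All Other",,,,,'
output = render(row)
print(output)
```

A Major Group row such as `15-0000,,,,Computer and Mathematical Occupations,,,,,` does not match the prefix, so `render` returns `None` for it.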

2) triggering a template given a specific value within a microsyntax element:

data snippet (from [2010_Occupations.csv][2]):
---
O*NET-SOC 2010 Code,O*NET-SOC 2010 Title,O*NET-SOC 2010 Description
{snip}
15-1199.00,"Computer Occupations, All Other",All computer occupations not listed separately.
{snip}
15-1199.03,Web Administrators,"Manage web environment design, deployment, development and maintenance activities.[...]"
{snip}
---

[2]: http://w3c.github.io/csvw/use-cases-and-requirements/2010_Occupations.csv  

This time I want to trigger one template if the Occupation is a main category (e.g. Code = "15-1199.00"), else I want to trigger a different template. A main category is denoted by the final two digits of the code being "00".

(Aside 3: of course, as these two files are likely to be packaged together, I could have had just a single metadata description describing _both_ resources!)

(Aside 4: I've assumed that the conditional match is assessed against the entire row; whilst it's not impossible to deal with, I note that the need to potentially escape fields to count the columns is an added complexity!)
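
To make that complexity concrete, here is a short Python comparison of counting columns by splitting on commas against using a CSV parser. The alternative shown at the end (testing the condition against a parsed cell rather than the raw row) is only a possibility I am floating, not something the metadata above specifies:

```python
import csv
import re

row = ('15-1199.00,"Computer Occupations, All Other",'
       'All computer occupations not listed separately.')

# Splitting on commas miscounts because the quoted title contains a comma.
naive_cells = row.split(",")

# A CSV parser respects the quoting and yields the three real columns.
parsed_cells = next(csv.reader([row]))

print(len(naive_cells), len(parsed_cells))

# Possible alternative: test the condition against a single parsed cell.
is_main_category = re.fullmatch(r"\d{2}-\d{4}\.00", parsed_cells[0]) is not None
print(is_main_category)
```

A row-level REGEXP that needs to skip over the title column would have to account for the quoted comma itself, which is the added complexity I mean.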

Here's the metadata description for the resource:

---
{
    "name": "2010_Occupations",
    "title": "O*NET-SOC Occupational listing for 2010",
    "publisher": [{
        "name": "O*Net Resource Center",
        "web": "http://www.onetcenter.org/"
    }],
    "resources": [{
        "name": "2010_Occupations-csv",
        "path": "2010_Occupations.csv",
        "schema": {"columns": [
            {
                "name": "onet-soc-2010-code",
                "title": "O*NET-SOC 2010 Code",
                "description": "O*NET Standard Occupational Classification Code (2010).",
                "type": "string",
                "required": true,
                "unique": true,
                "microsyntax": [
                    {
                        "name": "soc-major-group",
                        "regexp": "^(\\d{2})-\\d{4}\\.\\d{2}$"
                    },
                    {
                        "name": "soc-minor-group",
                        "regexp": "^\\d{2}-(\\d{2})\\d{2}\\.\\d{2}$"
                    },
                    {
                        "name": "soc-broad-group",
                        "regexp": "^\\d{2}-\\d{2}(\\d)\\d\\.\\d{2}$"
                    },
                    {
                        "name": "soc-detailed-occupation",
                        "regexp": "^\\d{2}-\\d{3}(\\d)\\.\\d{2}$"
                    }
                ]
            },
            {
                "name": "title",
                "title": "O*NET-SOC 2010 Title",
                "description": "Title of occupational classification.",
                "type": "string",
                "required": true
            },
            {
                "name": "description",
                "title": "O*NET-SOC 2010 Description",
                "description": "Description of occupational classification.",
                "type": "string",
                "required": true
            }
        ]},
        "template": [
            {
                "conditional-match": "^\\d{2}-\\d{4}\\.00,.*",
                "name": "soc-occupation-category-template-ttl",
                "description": "Template converting SOC occupation category CSV content to SKOS/RDF (expressed in Turtle syntax).",
                "type": "template",
                "path": "soc-occupation-category-csv-to-ttl-template.ttl",
                "hasFormat": "text/turtle"
            },
            {
                "conditional-match": "^\\d{2}-\\d{4}\\.(?!00)\\d{2},.*",
                "name": "onet-soc-occupation-subcategory-template-ttl",
                "description": "Template converting O*NET SOC occupation sub-category CSV content to SKOS/RDF (expressed in Turtle syntax).",
                "type": "template",
                "path": "onet-soc-occupation-subcategory-csv-to-ttl-template.ttl",
                "hasFormat": "text/turtle"
            }
        ]
    }]
}
---

My TTL templates would be:
---soc-occupation-category-csv-to-ttl-template.ttl
ex:{onet-soc-2010-code} a ex:SOC-DetailedOccupation ;
    skos:notation "{onet-soc-2010-code}" ;
    skos:prefLabel "{title}" ;
    dct:description "{description}" ;
    skos:exactMatch ex:{soc-major-group}-{soc-minor-group}{soc-broad-group}{soc-detailed-occupation} ;
    skos:broader ex:{soc-major-group}-0000, 
                 ex:{soc-major-group}-{soc-minor-group}00, 
                 ex:{soc-major-group}-{soc-minor-group}{soc-broad-group}0 .
---

---onet-soc-occupation-subcategory-csv-to-ttl-template.ttl
ex:{onet-soc-2010-code} a ex:ONETSOC-Occupation ;
    skos:notation "{onet-soc-2010-code}" ;
    skos:prefLabel "{title}" ;
    dct:description "{description}" ;
    skos:broader ex:{soc-major-group}-0000, 
                 ex:{soc-major-group}-{soc-minor-group}00, 
                 ex:{soc-major-group}-{soc-minor-group}{soc-broad-group}0,
                 ex:{soc-major-group}-{soc-minor-group}{soc-broad-group}{soc-detailed-occupation} .
---

Thus, the input row below:
---
15-1199.00,"Computer Occupations, All Other",All computer occupations not listed separately.
---

... would generate the following TTL snippet:
---
ex:15-1199.00 a ex:SOC-DetailedOccupation ;
    skos:notation "15-1199.00" ;
    skos:prefLabel "Computer Occupations, All Other" ;
    dct:description "All computer occupations not listed separately." ;
    skos:exactMatch ex:15-1199 ;
    skos:broader ex:15-0000, 
                 ex:15-1100, 
                 ex:15-1190 .
---

And this row:
---
15-1199.03,Web Administrators,"Manage web environment design, deployment, development and maintenance activities.[...]"
---

... would generate this TTL snippet:
---
ex:15-1199.03 a ex:ONETSOC-Occupation ;
    skos:notation "15-1199.03" ;
    skos:prefLabel "Web Administrators" ;
    dct:description "Manage web environment design, deployment, development and maintenance activities.[...]" ;
    skos:broader ex:15-0000, 
                 ex:15-1100, 
                 ex:15-1190,
                 ex:15-1199 .
---
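
A quick Python check of the template selection for these two rows. This is a sketch under my own assumptions: the patterns are written as prefixes with the "." escaped and with the two trailing digits matched explicitly after the negative lookahead (a lookahead consumes no characters), and `select_templates` is a hypothetical name.

```python
import re

# (template name, conditional-match prefix) pairs.
TEMPLATES = [
    ("soc-occupation-category-template-ttl",
     r"^\d{2}-\d{4}\.00,"),
    ("onet-soc-occupation-subcategory-template-ttl",
     r"^\d{2}-\d{4}\.(?!00)\d{2},"),
]


def select_templates(row_text):
    """Return the names of all templates whose condition matches the row."""
    return [name for name, pattern in TEMPLATES if re.match(pattern, row_text)]


main_row = ('15-1199.00,"Computer Occupations, All Other",'
            'All computer occupations not listed separately.')
sub_row = ('15-1199.03,Web Administrators,'
           '"Manage web environment design, deployment, development '
           'and maintenance activities.[...]"')

print(select_templates(main_row))
print(select_templates(sub_row))

# A lookahead followed directly by the comma never matches a real code,
# because the two digits after the "." are left unconsumed.
lookahead_only = re.match(r"^\d{2}-\d{4}\.(?!00),", sub_row)
print(lookahead_only)
```

Each row triggers exactly one template, which is the either/or behaviour wanted here.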

And I think that just about wraps it up.

Jeremy

> 
> Ivan
> 
> >
> > Jeremy
> >
> >
> >
> > [1]: http://w3c.github.io/csvw/use-cases-and-requirements/index.html#R-ConditionalProcessingBasedOnCellValues
> > [2]: http://w3c.github.io/csvw/use-cases-and-requirements/index.html#UC-ExpressingHierarchyWithinOccupationalListings
> > [3]: http://lists.w3.org/Archives/Public/public-csv-wg/2014Jun/0127.html
> >
> >>
> >> My 2 cents...
> >>
> >> Ivan
> >>
> >>
> >>
> >>
> >> On 19 Jun 2014, at 14:36 , Tandy, Jeremy
> >> <jeremy.tandy@metoffice.gov.uk> wrote:
> >>
> >>>> -----Original Message-----
> >>>> From: Dan Brickley [mailto:danbri@google.com]
> >>>> Sent: 18 June 2014 12:46
> >>>> To: Tandy, Jeremy
> >>>> Cc: CSV on the Web Working Group
> >>>> Subject: Re: Attempted example CSV metadata document and template
> >>>>
> >>>> On 12 June 2014 12:57, Tandy, Jeremy
> >>>> <jeremy.tandy@metoffice.gov.uk>
> >>>> wrote:
> >>>>> All -
> >>>>>
> >>>>> I've just uploaded to [GitHub][1] a rework of the "Simple Weather
> >>>> Observation" example. I've tried to create a CSV metadata document
> >>>> following the rules in the [Metadata Vocabulary for Tabular
> >>>> Data][2] and [Generating RDF from Tabular Data on the Web][3]
> documents.
> >>>>>
> >>>>> I would be particularly interested in:
> >>>>>
> >>>>> - corrections to errors!
> >>>>> - comments on additional proposed properties in the metadata
> >>>>> document ("short-name", "template", "microsyntax")
> >>>>> - use of "hasFormat" to specify the Content-Type associated with
> a
> >>>>> Template
> >>>>> - use of a REGEXP within a URI Template to convert ISO 8601
> syntax
> >>>>> to a simplified form
> >>>>
> >>>> I don't completely understand this mechanism yet, but do you think
> >> it
> >>>> could be stretched to address the SKOS/codes issue in
> >>>> http://w3c.github.io/csvw/use-cases-and-requirements/#UC-
> >>>> ExpressingHierarchyWithinOccupationalListings
> >>>> where we'd want to explode strings like "15-1199.00", "15-1199.01"
> >>>> and emit triples like 'broader' when certain patterns matched?
> >>>>
> >>>> Dan
> >>>>
> >>>
> >>> OK ... let's have a go.
> >>>
> >>> Here's the header and a line of data:
> >>>
> >>> ---
> >>> O*NET-SOC 2010 Code,O*NET-SOC 2010 Title,O*NET-SOC 2010 Description
> >>> 15-1199.03,Web Administrators,"Manage web environment design,
> >> deployment, development and maintenance activities. [...]"
> >>> ---
> >>>
> >>> Here's a guess at the CSV metadata description in which I am using
> >> the ["multiple regexp each extracting a single value" pattern][1]:
> >>>
> >>> ---
> >>> {
> >>>  "name": "2010_Occupations",
> >>>  "title": "O*NET-SOC Occupational listing for 2010",
> >>>  "publisher": [{
> >>>      "name": "O*Net Resource Center",
> >>>      "web": " http://www.onetcenter.org/ "
> >>>  }],
> >>>  "resources": [{
> >>>      "name": "2010_Occupations-csv",
> >>>      "path": "2010_Occupations.csv",
> >>>      "schema": {"columns": [
> >>>          {
> >>>              "name": "onet-soc-2010-code",
> >>>              "title": "O*NET-SOC 2010 Code",
> >>>              "description": "O*NET Standard Occupational
> >> Classification Code (2010).",
> >>>              "type": "string",
> >>>              "required": true,
> >>>              "unique": true,
> >>>              "microsyntax": [{
> >>>                      "name": "soc-major-group",
> >>>                      "regexp": "/^(\d{2})-\d{4}.\d{2}$/"
> >>>                  },{
> >>>                      "name": "soc-minor-group",
> >>>                      "regexp": "/^\d{2}-(\d{2})\d{2}.\d{2}$/"
> >>>                  },{
> >>>                      "name": "soc-broad-group",
> >>>                      "regexp": "/^\d{2}-\d{2}(\d)\d.\d{2}$/"
> >>>                  },{
> >>>                      "name": "soc-detailed-occupation",
> >>>                      "regexp": "/^\d{2}-\d{3}(\d).\d{2}$/"
> >>>                  },{
> >>>                      "name": "onetsoc-occupation",
> >>>                      "regexp": "/^\d{2}-\d{4}.(\d{2})$/"
> >>>                  }
> >>>
> >>>              ]
> >>>          },
> >>>          {
> >>>              "name": "title",
> >>>              "title": "O*NET-SOC 2010 Title",
> >>>              "description": "Title of occupational
> classification.",
> >>>              "type": "string",
> >>>              "required": true
> >>>          },
> >>>          {
> >>>              "name": "description",
> >>>              "title": "O*NET-SOC 2010 Description",
> >>>              "description": "Description of occupational classification.",
> >>>              "type": "string",
> >>>              "required": true
> >>>          }
> >>>      ]},
> >>>      "template": {
> >>>          "name": "2010_Occupations-csv-to-ttl",
> >>>          "description": "Template converting CSV content to
> SKOS/RDF
> >> (expressed in Turtle syntax).",
> >>>          "type": "template",
> >>>          "path": "2010_Occupations-csv-to-ttl.ttl",
> >>>          "hasFormat": "text/turtle"
> >>>      }
> >>>  }]
> >>> }
> >>> ---
> >>>
> >>> You can see that I've used the `microsyntax` object to capture the
> 5
> >> independent elements of the O*NET-SOC code each with its own regexp:
> >> "soc-major-group", "soc-minor-group", "soc-broad-group",
> >> "soc-detailed- occupation" and "onetsoc-occupation". Whether this is
> >> the _best_ way to do, I don't know ... it's just an idea to get us
> >> talking about possibilities and options!
> >>>
> >>> The template (prefixes etc. intentionally left out) might then be:
> >>>
> >>> ---
> >>> ex:{onet-soc-2010-code} a ex:ONETSOC-Occupation ;
> >>>   skos:notation "{onet-soc-2010-code}" ;
> >>>   skos:prefLabel "{title}" ;
> >>>   dct:description "{description}" ;
> >>>   skos:broader ex:{soc-major-group}-0000,
> >>>                ex:{soc-major-group}-{soc-minor-group}00,
> >>>                ex:{soc-major-group}-{soc-minor-group}{soc-broad-
> >> group}0,
> >>>                ex:{soc-major-group}-{soc-minor-group}{soc-broad-
> >> group}{soc-detailed-occupation} .
> >>> ---
> >>>
> >>> However, this does not help when we look at the required
> >>> _conditional
> >>> behaviour_: when the value of "onetsoc-occupation" = "00" this is
> >>> identical to the term from the SOC taxonomy, and the template
> should
> >>> be more like
> >>>
> >>> ---
> >>> ex:{soc-major-group}-{soc-minor-group}{soc-broad-group}{soc-
> detailed
> >>> -
> >> occupation} a ex:SOC-DetailedOccupation ;
> >>>   skos:notation "{soc-major-group}-{soc-minor-group}{soc-broad-
> >> group}{soc-detailed-occupation}" ;
> >>>   skos:prefLabel "{title}" ;
> >>>   dct:description "{description}" ;
> >>>   skos:broader ex:{soc-major-group}-0000,
> >>>                ex:{soc-major-group}-{soc-minor-group}00,
> >>>                ex:{soc-major-group}-{soc-minor-group}{soc-broad-
> >> group}0 .
> >>> ---
> >>>
> >>> It occurs to me that we may wish to trigger different templates
> >>> based on a conditional response - or even whether we wish to trigger
> >>> a template at all for a given line!
> >>>
> >>> Thinking out of the box (is that a euphemism for "making it up as I
> >> go along"?), it would seem that each "template" block in the CSV
> >> metadata might have a "condition" statement that tells it when to
> >> fire
> >> - using values of column names or microsyntax element names? e.g.
> >>>
> >>> ---
> >>>      "template": {
> >>>          "name": "2010_Occupations-csv-to-ttl",
> >>>          "description": "Template converting CSV content to
> SKOS/RDF
> >> (expressed in Turtle syntax).",
> >>>          "type": "template",
> >>>          "path": "2010_Occupations-csv-to-ttl.ttl",
> >>>          "hasFormat": "text/turtle",
> >>>          "condition": "if {soc-detailed-occupation} != '00'"
> >>>      }
> >>> ---
> >>>
> >>> Default behaviour (if no "condition" statement included) would be
> >> _always_ to trigger the template for each row.
> >>>
> >>> However, looking at this, I am immediately concerned that including
> >> if-then-else blocks and comparison operators hugely increases the
> >> complexity of our work. Perhaps this is a good point to "bug out" to
> >> some external agent (e.g. call-back function or promise).
> >>>
> >>> Jeremy
> >>>
> >>> [1]:
> >>> https://github.com/w3c/csvw/blob/gh-pages/examples/csv-metadata-
> and-
> >> te
> >>> mplate-for-simple-weather-obs-example.md#multiple-regexp-each-
> >> extracti
> >>> ng-single-value
> >>>
> >>>>
> >>>>> - thoughts about a way to describe that microsyntax format within
> >>>>> the
> >>>> metadata document (see CellMicrosyntax requirement][4]), e.g. to
> >>>> define the sub-elements within the microsyntax that may be
> >>>> extracted for use later - see [Parsing cell microsyntax][5].
> >>>>>
> >>>>> Comments welcome.
> >>>>>
> >>>>> Jeremy
> >>>>>
> >>>>>
> >>>>> [1]:
> >>>>> https://github.com/w3c/csvw/blob/gh-pages/examples/csv-metadata-
> >> and-
> >>>> te
> >>>>> mplate-for-simple-weather-obs-example.md
> >>>>> [2]: http://w3c.github.io/csvw/metadata/index.html
> >>>>> [3]: http://w3c.github.io/csvw/csv2rdf/
> >>>>> [4]:
> >>>>> http://w3c.github.io/csvw/use-cases-and-requirements/#R-
> >>>> CellMicrosynta
> >>>>> x
> >>>>> [5]:
> >>>>> https://github.com/w3c/csvw/blob/gh-pages/examples/csv-metadata-
> >> and-
> >>>> te
> >>>>> mplate-for-simple-weather-obs-example.md#parsing-cell-microsyntax
> >>
> >>
> >> ----
> >> Ivan Herman, W3C
> >> Digital Publishing Activity Lead
> >> Home: http://www.w3.org/People/Ivan/
> >> mobile: +31-641044153
> >> GPG: 0x343F1A3D
> >> WebID: http://www.ivan-herman.net/foaf#me
> 
> 
> ----
> Ivan Herman, W3C
> Digital Publishing Activity Lead
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
> GPG: 0x343F1A3D
> WebID: http://www.ivan-herman.net/foaf#me
> 
> 
> 
> 
Received on Tuesday, 24 June 2014 10:35:02 UTC
