A uniform approach to modelling similar/repeating events

Hi,

Imagine the following scenario:

We want to monitor a group of people every day for a certain period (let’s say 3 months) and measure the following on a daily basis:

  *   Weight (with a unit e.g. kg, st, or …)
  *   Height (with a unit e.g. cm, ft, or …)
  *   Waist circumference (with a unit e.g. cm, …)
  *   Amount of exercise (with a unit e.g. minutes)
  *   Units of fruit and vegetables (possibly no unit as it is a numeric value)

And this list could go on. The main idea is that we could think of them as events that:

  *   always have a type
  *   always have an amount
  *   sometimes have a unit

The other constraint is that the data comes in at various points during the day and complements the previously added data. For example, we measure weight and height in the morning, while the amount of exercise and the fruit and veg consumed are added at the end of the day.

To create a model that is easy to understand and query and that also performs well, I can see the following approaches (you might see more, and I would appreciate your thoughts):

Approach 1 (in pseudo turtle):
Study123 hasSubject Subject123 (defining a person belonging to a study)
Subject123 hasProcess Hash234 (a uniquely generated number based on the date and the id of the person)
Hash234 hasDate "2017-08-21T00:00:00Z"^^xsd:dateTime
Hash234 hasHeight hash_height_subject123_day1
hash_height_subject123_day1 hasAmount 165
hash_height_subject123_day1 hasUnit cm
Hash234 hasWeight hash_weight_subject123_day1
hash_weight_subject123_day1 hasAmount 60
hash_weight_subject123_day1 hasUnit kg
…

This can go on for every item I defined in the first set of bullet points.

Approach 2:
Study123 hasSubject Subject123
Subject123 hasProcess Hash234
Hash234 hasDate "2017-08-21T00:00:00Z"^^xsd:dateTime
Hash234 hasEvent hash_height_subject123_day1
hash_height_subject123_day1 rdf:type Height
hash_height_subject123_day1 hasAmount 165
hash_height_subject123_day1 hasUnit cm
Hash234 hasEvent hash_weight_subject123_day1
hash_weight_subject123_day1 rdf:type Weight
hash_weight_subject123_day1 hasAmount 60
hash_weight_subject123_day1 hasUnit kg
….
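
To make the querying difference concrete, here is a minimal sketch in Python that holds the example triples as plain tuples. It is an in-memory illustration only, not a triple store, and the helper names (`value`, `measurements_approach1`, `measurements_approach2`) are hypothetical. Under those assumptions, Approach 1 needs the caller to list every measurement property, while Approach 2 needs a single generic pattern.

```python
# Triples as (subject, predicate, object) tuples, mirroring the pseudo-Turtle.
approach1 = [
    ("Hash234", "hasHeight", "hash_height_subject123_day1"),
    ("hash_height_subject123_day1", "hasAmount", 165),
    ("hash_height_subject123_day1", "hasUnit", "cm"),
    ("Hash234", "hasWeight", "hash_weight_subject123_day1"),
    ("hash_weight_subject123_day1", "hasAmount", 60),
    ("hash_weight_subject123_day1", "hasUnit", "kg"),
]

approach2 = [
    ("Hash234", "hasEvent", "hash_height_subject123_day1"),
    ("hash_height_subject123_day1", "rdf:type", "Height"),
    ("hash_height_subject123_day1", "hasAmount", 165),
    ("hash_height_subject123_day1", "hasUnit", "cm"),
    ("Hash234", "hasEvent", "hash_weight_subject123_day1"),
    ("hash_weight_subject123_day1", "rdf:type", "Weight"),
    ("hash_weight_subject123_day1", "hasAmount", 60),
    ("hash_weight_subject123_day1", "hasUnit", "kg"),
]


def value(triples, subject, predicate):
    """Return the first object found for (subject, predicate), or None."""
    for s, p, o in triples:
        if s == subject and p == predicate:
            return o
    return None


def measurements_approach1(triples, process, predicates):
    """Approach 1: every measurement property has to be named up front."""
    result = {}
    for s, p, o in triples:
        if s == process and p in predicates:
            result[p] = (value(triples, o, "hasAmount"), value(triples, o, "hasUnit"))
    return result


def measurements_approach2(triples, process):
    """Approach 2: one generic hasEvent pattern covers every measurement type."""
    result = {}
    for s, p, o in triples:
        if s == process and p == "hasEvent":
            kind = value(triples, o, "rdf:type")
            result[kind] = (value(triples, o, "hasAmount"), value(triples, o, "hasUnit"))
    return result


by_property = measurements_approach1(approach1, "Hash234", {"hasHeight", "hasWeight"})
by_type = measurements_approach2(approach2, "Hash234")
```

Adding a new measurement type changes the set of properties passed to `measurements_approach1`, whereas `measurements_approach2` would pick it up unchanged.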

One of the reasons I am not using blank nodes (actually the most important one) is that we receive supplementary data later, and blank nodes would not let us attach it to the nodes we have already created.
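
The idea of a reproducible identifier can be sketched as follows. This is a minimal Python illustration, assuming the identifier is a truncated SHA-256 of the subject id and the calendar day; the actual hashing scheme is not specified here, and `process_id`, `add_measurement`, the in-memory `store`, and the 30 minutes of exercise are all made up for the example. Because the name is derived from the data rather than generated at load time, a later batch for the same person and day resolves to the same node.

```python
import hashlib
from datetime import date

store = set()  # in-memory stand-in for a triple store


def process_id(subject_id: str, day: date) -> str:
    """Derive a stable node name from the subject id and the calendar day."""
    key = f"{subject_id}|{day.isoformat()}"
    return "process_" + hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]


def add_measurement(subject_id, day, kind, amount, unit=None):
    """Attach one measurement to the subject's process node for that day."""
    proc = process_id(subject_id, day)
    event = f"{proc}_{kind.lower()}"
    store.add((subject_id, "hasProcess", proc))
    store.add((proc, "hasEvent", event))
    store.add((event, "rdf:type", kind))
    store.add((event, "hasAmount", amount))
    if unit is not None:  # the unit is optional
        store.add((event, "hasUnit", unit))
    return proc


day = date(2017, 8, 21)
# Morning batch, then a separate evening batch for the same person and day.
morning = add_measurement("Subject123", day, "Weight", 60, "kg")
evening = add_measurement("Subject123", day, "Exercise", 30, "minutes")

processes = {o for s, p, o in store if p == "hasProcess"}
events = {o for s, p, o in store if p == "hasEvent"}
```

Both batches end up under a single process node, which is the behaviour that blank nodes, whose labels are local to each load, would not give.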

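The two approaches differ in how a measurement is attached to the daily process: Approach 1 uses a dedicated property per measurement (hasHeight, hasWeight), while Approach 2 uses a generic hasEvent property and gives each measurement an rdf:type. So my questions are: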

  1.  Is there any standard out there that attempts to solve this issue?
  2.  Which of these approaches is better (more efficient, easier to query, more uniform)?
  3.  Do you have a better solution than the ones above?

Many thanks,
Artemis

Received on Tuesday, 5 June 2018 10:54:48 UTC