
Minutes from 29 Jul 2021 Meeting

From: Joseph Scheuhammer <clown@alum.mit.edu>
Date: Wed, 11 Aug 2021 10:31:32 -0400
To: public-prtbl-prsnl-prefs@w3.org
Message-ID: <df904871-3c13-bfe9-c371-a509423428e6@alum.mit.edu>
Here are the minutes from the 29 Jul 2021 meeting.

We used a Google Doc for taking minutes. The participants had an 
opportunity to amend those minutes and correct any errors after the 
meeting.  As a result, the sense of a back-and-forth dialogue was 
reduced and the amended minutes are closer to notes.

Present: Jutta Treviranus, Alison Paprica, Joseph Scheuhammer.



  * several different ways to go about this

  * there are a confusing number of preference interfaces which all
    differ in the choices given and the language used to express
    preferences -- it is the wild west

  * the better approach is to standardize this and have one platform and
    one protocol

  * this includes a better (accessible, easy to understand) user
    interface for the user to express their preferences

  * another way: capture the data

      o what are the types of health data that we want to capture?

      o what are the abuses?

      o what are the uses of data related to health?

  * What is the health data we need to capture?

  * What are the dangers of sharing that data?


  * Complicated but not complex: from a tech perspective, it should be
    possible to have our information follow us and be respected and
    acted on by data-collecting organizations (e.g., a set of flags on
    my data file that allow, prohibit, or set conditions around specific
    data uses)

  * Practically, the aim should be to start with a small-scale
    implementation that spreads

  * The idea of essential vs non-essential data from marketing probably
    doesn’t apply to health data in the same way

     a. not binary; almost all data can be useful in some way

     b. Researchers and health system planners work with whatever they
        have and could do more if more data were accessible

     c. Interestingly, the data that people talk most about missing from
        existing health data repositories is income data not health data
        (income is a key social determinant of health)

  * A desired future state could therefore be one in which data users
    understand that the more sensitive the data they request, the fewer
    people will allow access, hopefully resulting in data users
    tailoring their requests to be acceptable to more people and within
    social licence

  * In terms of categories of health data that people focus on, some
    to consider are:

     a. Admin data - population-wide data obtained through mechanisms
        like OHIP billing that has a little bit of data about the whole
        population. By its nature we have the most information about
        hospital/institutional care and other acute services, and
        relatively little information about people who are well or who
        are living while managing chronic conditions like diabetes.
        These data are far from perfect (e.g., might limit you to having
        information about only one condition per visit for a person that
        has multiple chronic conditions) but have the advantage of
        covering the entire population and being longitudinal

     b. Primary care data - i.e., data from your family doctor’s office.
        There is a push to get more of this information because it will
        allow us to understand what is happening BEFORE people need
        hospital or institutional care and develop and plan policies and
        programs that are pro-active. For example, instead of basic
        knowledge about someone with diabetes (e.g., whether they are
        taking government funded diabetes drugs, whether they are
        participating in wound care programs, whether they had to have
        an amputation), we can know much more (e.g., how outcomes are
        related to blood sugar levels, height, weight, smoking). There is
        variation in terms of what kinds of data people would work with
        and have access to. In some cases, datasets contain basic
        information about a small number of variables. Other datasets
        can include detailed doctor’s notes, but then extra processing
        is required to ensure identifying information isn’t disclosed,
        e.g., redact the name “Mrs. Parkinson” if it occurs in doctor’s
        notes, but leave in Parkinson’s as a condition.

     c. Wearables data - there is a lot of interest in this because (i)
        it would allow us to have continuous, or at least more frequent
        data, compared to what is captured when people see their family
        doctor every few months and (ii) with AI we can do things with
        this data that we expect will lead to new discoveries. Some
        private firms are already doing interesting things with
        wearables data and also a few researchers. Demand for this data
        is likely to increase and there is a good opportunity to set
        some guardrails around use now before the data are out all over
        the place (maybe this is one of the places that you start)

     d. Genomics data - I didn’t mention this during the call, but many
        people think that a combination of genomics data with admin and
        primary care data will launch the next wave of genomics
        discovery. Basically the idea is that we’ve already learned a
        lot about genetic conditions that we can recognize immediately,
        e.g., hereditary cancers, but there are many other phenotypes
        that we don’t even know have a genetic origin. Large linked
        datasets that include genetic information and information about
        chronic conditions, longevity, physical and mental health, and
        responsiveness to medications can open up the possibility of new
        discoveries and treatment plans, particularly when AI/ML is used to

     e. Deep clinical data - I didn’t mention this during the call, but
        the MIMIC database is an example. Essentially it focuses on
        consented (or waived consent in the case of MIMIC) data about
        patients that are in environments like the ICU where a ton of
        data is collected. Similar to wearables and genomics, ML is
        opening up new possibilities in terms of the ways these data can
        be analysed.

     f. Social determinants of health aka health-related data -
        has good information about why information about income, gender,
        language, employment, education, geography etc. are important to
        understanding and supporting health and well-being

     g. Other data - just noting here that there are all kinds of data
        and the list above is not exhaustive. For example, data about
        prescription medications taken, home care services received,
        data about healthcare providers and institutions, and the list
        goes on.
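The redaction step mentioned under primary care data (remove the name
"Mrs. Parkinson" from doctor's notes, but leave Parkinson's as a
condition) could be sketched roughly as below. This is a minimal
illustration only: the function name redact_names, the TITLE_NAME
pattern and the short list of honorifics are assumptions made for the
example, and real de-identification of clinical notes needs far more
than a single pattern.

```python
import re

# Hypothetical pattern: an honorific followed by one capitalized surname.
# The list of honorifics is an assumption made for this sketch.
TITLE_NAME = re.compile(r"\b(?:Mr|Mrs|Ms|Miss|Dr)\.?\s+[A-Z][a-z]+")


def redact_names(note: str) -> str:
    """Replace 'honorific + surname' with a placeholder.

    A surname that appears without an honorific (for example as part of
    a condition name such as "Parkinson's") is left untouched.
    """
    return TITLE_NAME.sub("[REDACTED]", note)


example_note = "Mrs. Parkinson has Parkinson's."
redacted_note = redact_names(example_note)
```

The sketch keys on the honorific rather than on the surname itself,
which is why the condition name survives. It would miss names written
without a title.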

  * Considerations for establishing and implementing an operability standard

     a. Do not give the false impression that data are only used when
        people consent (there are special cases when data are used
        without consent - see

     b. In the health field, the first step before data sharing is
        removing or coding any identifying information, which means that
        people using the data are studying groups, populations, and
        sub-populations rather than individuals. You probably want to
        incorporate that practice (or at least have it be one of the
        options, see e) below)

     c. Consider the steps you’ll take to develop a standard that aligns
        with what members of the public are willing to do or specify -
        this is partly about having different user interfaces, but also
        meant to get at the fact that early adopters (who are also the
        people most likely to be involved in any pilot projects) may
        have more knowledge and more interest in the details than most
        other people

     d. Alison personally recommends that the work focus on unpacking
        the vague “third-party” clause in privacy agreements

     e. Overall the standard and way forward should aim to establish a
        virtuous cycle where members of the public can easily set out
        their preference (perhaps very high-level categories to start,
        e.g., unlimited, unlimited as long as data are de-identified,
        non-commercial use only, commercial use with notification,
        etc., with the option to refine generally or for specific
        studies) and the organizations that respect and align with
        social licence are the ones that flourish because they get more
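
The idea of flags that allow, prohibit, or set conditions around
specific data uses, combined with the high-level categories in e)
above, could be sketched roughly as follows. This is a minimal
illustration under stated assumptions, not a proposed standard: the
names Rule, DataRequest, Preferences and permits, the purpose strings,
and the condition labels "deidentified" and "notification" are all
hypothetical. The sketch also assumes that any use without an explicit
flag is prohibited.

```python
from dataclasses import dataclass, field
from enum import Enum


class Rule(Enum):
    ALLOW = "allow"              # use is permitted without conditions
    PROHIBIT = "prohibit"        # use is never permitted
    CONDITIONAL = "conditional"  # use is permitted if all conditions are met


@dataclass
class DataRequest:
    purpose: str               # hypothetical labels, e.g. "research"
    deidentified: bool         # identifying information removed or coded
    will_notify: bool = False  # requester commits to notifying the person


@dataclass
class Preferences:
    # Maps a purpose to (Rule, set of condition labels).
    flags: dict = field(default_factory=dict)

    def permits(self, request: DataRequest) -> bool:
        # Assumption: a purpose with no flag is treated as prohibited.
        rule, conditions = self.flags.get(
            request.purpose, (Rule.PROHIBIT, set())
        )
        if rule is Rule.ALLOW:
            return True
        if rule is Rule.PROHIBIT:
            return False
        # CONDITIONAL: every required condition must be met by the request.
        met = set()
        if request.deidentified:
            met.add("deidentified")
        if request.will_notify:
            met.add("notification")
        return conditions <= met


# Example: research only if de-identified, commercial use only with
# notification, marketing never.
prefs = Preferences(flags={
    "research": (Rule.CONDITIONAL, {"deidentified"}),
    "commercial": (Rule.CONDITIONAL, {"notification"}),
    "marketing": (Rule.PROHIBIT, set()),
})

research_ok = prefs.permits(DataRequest("research", deidentified=True))
```

Defaulting to "prohibit" for unlisted purposes is one possible design
choice. A person could then start with a few coarse flags and refine
them later, generally or for specific studies.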


Received on Wednesday, 11 August 2021 14:32:06 UTC
