[minutes] eGov IG F2F2, Day 2, 13 March 2009

All,

Minutes of second day of F2F2 in Washington DC are available at:

  http://www.w3.org/2009/03/13-egov-minutes

and text version below.

Thanks again Karen for outstanding scribing.

Summary report coming up early next week.

-- Jose


--------------------

    [1]W3C

       [1] http://www.w3.org/

                 eGovernment Interest Group F2F2 - Day 2

13 Mar 2009

    See also: [2]IRC log

       [2] http://www.w3.org/2009/03/13-egov-irc

Attendees

    Present
           see [3]list of registered participants

       [3] http://www.w3.org/2007/eGov/IG/wiki/F2F2#Participants

    Regrets
    Chair
           kevin, john

    Scribe
           karen, josema

Contents

      * [4]Topics
          1. [5]Multi Channel
          2. [6]Identification and Authentication
          3. [7]Long Term Data Management
          4. [8]Future
      * [9]Summary of Action Items
      _________________________________________________________

    <josema> scribe:Karen

Multi Channel

    [Kevin welcomes and recaps agenda for today]

    [Jose Alonso opens discussion on Multi-channel delivery]

    Jose: Governments are starting to think about mobile access to
    government services more seriously
    ... challenges around integrating all these different systems from
    home computer or mobile device
    ... a number of different interfaces that people use: voice
    interface, Web interface, etc.
    ... every different way of delivery of info has its own challenges
    ... if you start using eGov services using a given channel
    ... and then you lose connection
    ... the information you already entered should be stored in the
    system somewhere
    ... there are a lot of pieces for this topic
    ... front end is about mobile phones
    ... W3C has a Mobile Web Initiative that has been developing best
    practices
    ... to develop applications for mobile phones
    ... on back end, many activities in XML, Semantic Web, data
    integration
    ... so that's my personal view

    [10]http://www.w3.org/2008/06/MWBP-WG-charter.html

      [10] http://www.w3.org/2008/06/MWBP-WG-charter.html

    ChrisJ: you also want to consider bandwidth differences

    Kevin: also want to mention mobile devices in developing countries

    Oleg Petrov, World Bank

    Oleg: mobile Web is an important topic
    ... I worked on infrastructure development in Russia
    ... at The World Bank we haven't done work on mobile yet
    ... have organized some events
    ... we are looking at events in specific countries around topics
    like health
    ... a month ago, this topic was emphasized
    ... we are being pushed to explore different applications
    ... single window centers like Canada
    ... we should not just focus on mobile for multi-channel
    ... although mobile is most powerful and recent
    ... in Spain there was a seminar
    ... mobile is the latest 'baby'
    ... in the tool kit
    ... not forget others, but mobile does require a special focus
    ... it will need funding as well as focus
    ... Mobile Health Alliance joint event on mobile health, maybe in
    September
    ... we have to do more work; welcome W3C partnership on
    multi-channel delivery topic

    Kevin: thanks for summary
    ... I agree there are multi-channels and multiple opportunities for
    access
    ... mobile is taking highest position in people's conversations
    ... outside US, that's the main interface point for consumers
    ... interesting challenge
    ... I was working with Telecom Italia and Vodafone
    ... one of biggest challenges is the competitive space
    ... browser and device independence not as high on their lists
    ... need to contend with commercial interests to sell their devices

    Jose: We ran workshops at W3C
    ... we started group on Mobile Web for Social Development
    ... they are finding issues about how to develop applications for
    those countries
    ... the interfaces from developed world don't often work in
    developing world
    ... the most important services in those countries seem to be public
    services
    ... some of unsolved challenges are that we have not yet identified
    what those services are
    ... the ones to provide the most value
    ... there are things like microfinancing that's important
    ... this other group is working on this
    ... another channel in Europe is Digital TV

    [11]http://www.w3.org/2008/10/MW4D_WS/

      [11] http://www.w3.org/2008/10/MW4D_WS/

    Jose: Digital TV requirements
    ... but it's not interactive
    ... in the TV boxes for their homes
    ... when there is interaction, the interfaces are poor
    ... not a lot of things you can do
    ... so it's another channel to consider; W3C is not doing anything
    here at the moment
    ... In South America, in Paraguay, there is very little Internet
    connectivity except through mobile
    ... so how global a view should we have

    Chris Testa, Holocaust Museum

    Chris: I would encourage this group to take global perspective
    ... we are interested in sharing data across int'l and cultural
    boundaries
    ... would be applicability in Federal gov't

    Karen: In a conversation with State Dept rep yesterday, I learned of
    her interest in international communications

    Jose: do we agree to include this in the issues paper?

    Kevin: perhaps in the multi-channel section
    ... we talk about focus on mobile first

    Ken: closed captioning is also of concern to gov't
    ... big on video
    ... so this is another area related to multi-media

    Jose: there are some other groups at W3C working on this
    ... we could put something there and build a connection with this
    group

    Kevin: 508 requires anything that cannot be classified can be
    grandfathered in
    ... it's a huge issue and a cost issue
    ... lack of specs for user presentation of things
    ... affects policies and funding in agencies

    Suzanne Acar, FBI/DAS

    Suzanne: big legacy environment, not designed

    Ken: if gov't uses YouTube, are they selecting one company over
    another
    ... is there a standard way to have a repository
    ... a big issue in terms of gov't use of social media
    ... for example, here's a list of videos in a standard format

    John: from UK perspective, gov't's use of YouTube
    ... do you support this well-known big service
    ... element of reach
    ... but aspects in terms of service you don't like
    ... you just out-sourced another service to an arm of Google
    ... not a lot of great choices out there right now
    ... especially if you want to reach young people
    ... I have heard it said that there is a generation that wants to
    receive video and not text
    ... they don't want to get back a text Web page
    ... if YouTube is the only mass way to achieve reach
    ... this raises some important issues around interoperability
    ... also looking at location-aware devices
    ... and how this changes the game
    ... I am recently using an iPhone
    ... it's great technology
    ... what's significant
    ... with browser devices
    ... Yahoo and FireEagle
    ... how your experience of Web completely changes when people know
    where you are
    ... Twitter people who are on the same train with me
    ... it's transformative in how we envisage service delivery
    ... but if I'm in an area, I want to know how safe the locale is
    ... where public conveniences are
    ... having access to information
    ... a whole bundle of issues to explore in this section of the
    document
    ... not sure how much we can say

    Kevin: dotMobi has the cities initiative
    ... Google has an initiative on mobile
    ... Yahoo as well, focused more on European market
    ... maybe it's a reference point for perspective and information

    Suzanne: I heard someone in gov't say
    ... there is or will be a policy to geo-enable their databases

    Kevin: back to YouTube
    ... we successfully negotiated a partner channel agreement for gov't
    needs
    ... rationale was that we could not have done that without a
    significant investment
    ... yes, there are multiple vendors and sites

    Diane: there are different embeds for these vendor offerings
    ... would be good to have one embed statement that would work for
    any platform
    ... that would be huge for me

    David Brunton, LOC

    David: I put old newspapers online
    ... two competing trends
    ... one is proliferation of channels
    ... delivering via the Internet
    ... we are delivering via three or four channels
    ... Twitter, etc
    ... so treating Internet as one is dubious
    ... proliferation of channels
    ... no work on our part, people can get huge images of old
    newspapers
    ... even on mobile devices
    ... becomes most meaningful to talk about proliferation of channels
    in terms of what access looks like

    Kevin: so what would be helpful from a standards perspective?
    ... standardization of APIs?

    David: the sheer number of APIs that we have to support
    ... and the poor job we're doing getting data into multiple channels
    ... open search has done it for search and text, but that's a small
    drop in big ocean

    JohnS: difficult to pick a vendor's product
    ... have to make a choice to support a particular vendor's way of
    doing stuff
    ... do we back Google, iPhone? That's always a problem

    Chris: it's easy to define as syndication of services
    ... if you produce video and make available on Web site
    ... syndicate out to multiple places
    ... so you expose audiences to multiple platforms
    ... you can address accessibility concerns

    Kevin: an eGov focus can provide value outside of 508
    ... or reviewing work with another W3C work
    ... what else do we want to bubble up?
    ... are there other particular government needs?
    ... We may or may not have communicated special requirements

    Diane: from a financial reporting, there is "as reported data"
    ... when a gov't entity submits a filing, they prepare report to
    look a certain way
    ... which paragraphs come first
    ... so when you repurpose data across multi-channels
    ... and I want to show a snip on mobile, etc.
    ... then how do you deal with this is how it was when I reported it
    ... how do you authenticate it was the "as reported" view
    ... put some seal on it

    Greg Elin, Sunlight Foundation

    Greg: Access to information in a timely fashion in its original form
    is critically important
    ... you have to trace back to that
    ... photo journalism industry dealt with whether images were
    manipulated or not

    Diane: we can embed provenance URIs into documents

    Greg: what is role 5-10 years down the road
    ... with notion of the document being more electronic in the first
    place
    ... I was reflecting about electronic contracts in the first place
    ... rather than digitizing other documents

    Daniel: I studied film and video
    ... it's weird to talk about this in light of work of [] Morris
    ... and his most recent documentaries about Abu Ghraib
    ... He talks about what photographs show as proof
    ... hard to get meaning
    ... you can imbue meaning into something to change what you are
    seeing
    ... captions for example
    ... become more important to give context
    ... in terms of verisimilitude
    ... cameras often have GPS devices
    ... time codes, date, GPS
    ... you can preserve the angle
    ... but it doesn't tell you a lot
    ... you do certain things
    ... that make it seem as if something is happening, when it is not
    ... if we did a video of what is happening here
    ... a videographer would probably do b-roll of the building
    ... in movies, they shoot different exteriors from the interiors
    ... this is a mind path
    ... Should encourage people who do this to preserve the time codes

    Kevin: perhaps something we provide as a validator

    Diane: on financial reporting, there are legal implications
    ... for the order in which items appear in report, the presentation
    of material
    ... the accountants world
    ... we recognize that the data is what's really important in this
    ... so people can manipulate it
    ... the accountants want to say that this is the canonical rendering
    of the document stored on SEC Web site, etc.

    Joe: when we move files to any device
    ... the performance is equal
    ... if Internet is used as a platform
    ... like Google docs editing
    ... like on your computer
    ... then these issues may become difficult
    ... can I really compare census data with something else on my
    mobile phone
    ... other challenges with multichannel as Internet evolves to
    another phase
    ... from file server to using applications
    ... I'm Joe Carmel from the phone yesterday
    ... retired federal employee
    ... did project for bills in XML in US House of Representatives
    ... and Senate

    Greg: I would like to clarify
    ... there is not the canonical version
    ... there is a context that it is relevant to
    ... I have footnotes and filing
    ... as we look at Recovery.gov
    ... you get into this recursive model
    ... what's original info I have; how far down chain does it go
    ... the accurate representation in one medium is different in
    another medium
    ... like accountants' rules
    ... so set up in metadata

    Diane: yes, we have this debate all the time
    ... trying to explain to authors that their documents may be viewed
    in multiple ways
    ... with a financial doc submitted to SEC
    ... they are allowed to submit a rendered version
    ... interesting dialogue with regulators as well
    ... may need to be a use case or standard approach
    ... I'm Diane Mueller, JustSystems and XBRL Int'l Consortium

    Kevin: anything more on this topic?

    Joe: this is a bit off
    ... there are a lot of PDF files
    ... government doesn't realize they can embed within PDF
    ... would be richer than pulling out directly
    ... a way to embed in Acrobat
    ... which is pretty good
    ... page number, specific point in that process
    ... it would be nice if agencies would include the source
    ... then you could package it together so long-term know that source
    document
    ... so you don't have to figure it all out

    Greg: I did not know that
    ... thinking in terms of conventions and practices
    ... not just standards
    ... promulgating ideas
    ... by convention
    ... also breaking down from document version to documents
    ... Recovery.gov has links to graphs and maps
    ... a simple pattern
    ... like if you have a visual rep of tabular data, have a link
    ... or provide link to download in CSV or other format
    ... particular practices or conventions below the document level in
    clustered elements
    ... to drive document toward a set of practices

    Diane: in XBRL we have a best practices board
    ... for reporting
    ... I have personally experienced
    ... URI back to source, which is embedded in the document
    ... you can get right back to the canonical
    ... the source file
    ... gets back to conversation of Internet as source file

    Joe: Look at Homeland Security
    ... their domain has changed; URIs are fine as long as they continue
    to be there
    ... I think LOC has adopted a handle based system
    ... to mitigate that

    Daniel: one of processes in signing something
    ... is the ceremony around it
    ... preserving that in context of the record is interesting idea
    ... how electronic signatures should be done
    ... Where we see it is on click-through contracts
    ... show small print contracts
    ... you scroll through it
    ... then you go to the next step
    ... not sure how they know you did that step
    ... record that you signed
    ... so that you knowingly signed it

    Kevin: so we seem to be moving into authentication and
    identification space
    ... not sure we drew conclusion to multi-channel other than focus on
    mobile
    ... Let's take a short break

    <josema> [back from break]

Identification and Authentication

    Daniel: I worked for member of Congress and addressed issue of
    electronic signatures
    ... Electronic signatures vs. digital signatures

    <josema> [Daniel reviews the paper
    [12]http://www.w3.org/TR/egov-improving/#idauth]

      [12] http://www.w3.org/TR/egov-improving/#idauth

    Daniel: I'd like to focus on role for gov'ts
    ... other aspect is nature of what we could do
    ... once we tried to map it into the digital form
    ... think about how we authenticate people, or could
    ... using GPS, etc.
    ... Let me go through list
    ... mapping onto digital world, here are some main problems
    ... privacy concerns, costs -- what we should be doing as we map
    into digital world
    ... keep same types of easy physical things that become barriers
    ... if someone gets a fishing license in person may not need ID
    ... but may need a digital token or do online; may be an unnecessary
    burden
    ... avoid unnecessary levels of authentication
    ... when to pre-authenticate people
    ... authentication often done by what you have
    ... you have to give them something prior to the action they take
    ... a form, for example
    ... avoiding forcing identity to be divulged when unnecessary or
    counter to the purpose
    ... such as voting
    ... keep that in mind
    ... and avoiding reliance on outside parties to supply
    authenticating credentials as the sole means of authentication
    ... for example, use SSL
    ... there are a handful of providers the major browsers have made
    standard
    ... mostly commercial
    ... government, in order to get digital certificate, was going to
    get from a commercial service
    ... I wasn't sure what that meant in terms of authentication
    ... to rely upon an authentication source in another country, for
    example
    ... consider when making decision about who supplies credentials;
    root authority
    ... SSL is big in Web; but we should not gloss over
    ... that security is provided by non-governmental agencies
    ... Whole bunch of uses for this
    ... understand what forms that can take; see list
    ... assertion and assumption
    ... when I met you I put on a name tag, for example, without an ID
    ... assertion is an aspect of authentication
    ... assumption is another aspect
    ... this happens frequently on the Web
    ... and more traditional things, about what you know
    ... such as in the financial world; your PIN, mother's maiden name
    ... what you are; biometric devices being embedded
    ... what you have; those are tokens
    ... I have an RSA security card, for example
    ... what time it is
    ... in the US there are services provided
    ... that provide time information; can also be used for identity
    ... the person or event
    ... who knows you
    ... an idea circulating about the Web of Trust
    ... idea where people ID based on who their friends are
    ... a growing possibility, such as friends on Facebook
    ... quality and quantity of attempts
    ... a person may mistake once about their identity but not 10K times
    ... off-line response or vouching
    ... hard to get arms around
    ... things take place and then afterwards we meet up with them
    ... in terms of privacy release, the threshold for ID should be low
    because that was beginning process
    ... at another point, there would be chance for out of band; call up
    the person
    ... based on number they provided
    ... this is used by credit card companies to authenticate
    ... Not trying to make obvious answers, but think W3C group should
    let people know about some of the standards they offer in these
    areas
    ... Use of XML which stores information
    ... URIs and URLs are important
    ... Mail to or HTTP addresses for saying who they are
    ... being used within databases
    ... rely on domain system for big piece of that
    ... Joe mentioned earlier when one department becomes another
    department
    ... I like open id; it uses an HTTP URL for identification
    ... you can get a v-card and microformatted information about who I
    am
    ... but does not nec. supply third-party information
    ... that is also being used for wiki
    ... XForms is one of main standard set from W3C for allowing for
    transactions
    ... also consider in work about preserving ceremony and protocols of
    signing and authenticating documents and contracts
    ... XForms used to preserve
    ... Then weirdness in XML
    ... of not having something with tags that goes beyond what's inside
    the tags
    ... important in context of validity and non-repudiation
    ... putting in the hash in the XML document; a fingerprint of the
    whole document
    ... describes what's outside of it
    ... seems to be valid
    ... to preserve non-repudiation
    ... consider when a person signs a contract or form
    ... happened at a certain time; but if no way to preserve in a way
    using a hash, then it makes it less dependable
    ... Can put something up on a Web site
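
    [Illustration added to the minutes, not part of the discussion: a
    minimal Python sketch of the "fingerprint of the whole document"
    idea Daniel describes above. It hashes the raw bytes with SHA-256.
    The sample form and the function names are hypothetical. A real
    deployment would more likely canonicalize the XML and use a full
    signature scheme such as W3C XML Signature, which this sketch does
    not attempt.]

```python
import hashlib


def fingerprint(document_bytes: bytes) -> str:
    """Return the SHA-256 hex digest of the exact bytes of a document."""
    return hashlib.sha256(document_bytes).hexdigest()


def matches(document_bytes: bytes, recorded_fingerprint: str) -> bool:
    """Check a document against a fingerprint recorded at signing time."""
    return fingerprint(document_bytes) == recorded_fingerprint


# Hypothetical signed form; the content is invented for illustration.
signed_form = b"<form><name>Example Person</name><date>2009-03-13</date></form>"
recorded = fingerprint(signed_form)  # stored or published when the form is signed

# A later copy with one altered field no longer matches the record.
altered = signed_form.replace(b"2009-03-13", b"2009-03-14")

print(matches(signed_form, recorded))  # True
print(matches(altered, recorded))      # False
```

    [Any change to the bytes yields a different fingerprint, so a copy
    altered after signing no longer matches the recorded value. The
    fingerprint alone does not say who signed or when; that still
    needs a signature or a trusted record of the fingerprint.]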

    Greg: Is there a possibility to follow pattern like Internet
    archive?
    ... like a third party authenticating?

    Daniel: an interesting question
    ... if you can go back to the thing you trust
    ... if you put up URL and someone later repudiates it with a false
    copy, you can go back to that one
    ... a question of preserving in the Web over time
    ... I don't see a problem but it's a consideration
    ... doesn't address privacy issues when you don't want things on the
    net

    Greg: but could a pattern be in gov't
    ... that the archive or other authority is actually managing that
    digital archive and certification service

    Daniel: yes

    Ken: identify who you know; social interactions are ancient
    ... if you go to India, you sign documents that say, "I am Ken, son
    of so-and-so"
    ... it's an ancient way of identifying people

    Daniel: I don't disagree
    ... when PKI was on the scene
    ... there was that other form of securing email
    ... based on Web of Trust
    ... PGP, thanks
    ... that has been used a bit
    ... in context of gov't interesting when you do things online
    ... move toward providing a registering authority
    ... different from who you know
    ... in terms of contracts, where two parties trust each other
    ... I didn't mean to say it's unusual in world
    ... but about how things are being mapped to digital world

    Mark Thomas

    MarkT: OpenID suffers from same problem
    ... myopenid.com suffers from similar problem

    Daniel: OpenID is not a third party, it's a standard
    ... they have open source software you can download

    MarkT: My open id used by W3C wiki...

    Daniel: you can use any open id providers for W3C wiki
    ... separate from what open id is

    MarkT: So it doesn't rule out idea of OpenID.gov?

    Daniel: I believe that who served it was really important
    ... but that's counter to idea of open id
    ... may be used as third party authentication
    ... because URL is a unique identifier, another place can store
    ... and point as a verification
    ... this has been done by VeriSign and Better Business Bureau
    ... If you click a link from a Web page and go to the VeriSign page,
    they will say that the URL you came from has been verified through
    us
    ... so you can trust they are who they say they are
    ... so transaction should be valid, to avoid spoofing
    ... will specify the URL on their page
    ... that is a simple thing to do in terms of open id
    ... government could verify people based on their OpenID
    ... maybe there would be a time limit for renewal or response
    ... to ensure you are still in control of it
    ... then that would supply that third party without supplying that
    source
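
    [Illustration added to the minutes, not part of the discussion: a
    minimal Python sketch of the pattern Daniel outlines, where a
    verifying party records that an identifier URL was checked and
    requires renewal after a time limit. The record type, the
    one-year window and the example URL are all assumptions; this is
    not the OpenID protocol itself, only the bookkeeping around it.]

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumed renewal period; the discussion only says "maybe a time limit".
RENEWAL_WINDOW = timedelta(days=365)


@dataclass
class VerifiedIdentifier:
    """A hypothetical record that an identifier URL was verified."""
    url: str
    verified_at: datetime


def is_current(record: VerifiedIdentifier, now: datetime) -> bool:
    """True while the verification is still inside the renewal window."""
    return now - record.verified_at <= RENEWAL_WINDOW


# Hypothetical identifier verified on the day of the meeting.
record = VerifiedIdentifier(
    url="http://example.org/people/jane",
    verified_at=datetime(2009, 3, 13),
)

print(is_current(record, datetime(2009, 6, 1)))  # True
print(is_current(record, datetime(2010, 6, 1)))  # False
```

    [The verifier stores only the URL and the verification date, so it
    can vouch for the identifier without holding the credentials
    behind it.]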

    Peter Alderman, from Govt

    Peter: I don't know why I want to do OpenID.
    ... why we want to rely on other people's credentials

    Daniel: I agree
    ... I have been dealing with Congress
    ... there is a question of authenticating people when they bring
    things to Congress
    ... I don't think that's needed
    ... county government is thinking of open id for nonrepudiation, not
    authentication
    ... so only you can come back to get that information
    ... So no one else can come back into that system
    ... which is different from authentication, but it's about the
    series of transaction
    ... perhaps ID on-going transaction

    Peter: open id is just another technology
    ... the appropriateness of using it for any transaction depends upon
    the risk level, assurance of identity
    ... what concerns me is reasoning from the technology backwards
    rather
    ... I wrote the legislation

    Peter: And I had to make it work

    JohnS: In the UK, it cited authentication need, so we set up
    government gateway
    ... to access public services, you have to have a gov't gateway
    number, PIN
    ... you don't have that same system in US
    ... I wanted to get some sense
    ... what is the shape of gov't int'l; are states bound to construct
    systems
    ... for example authenticate who is filing tax return

    Jose: My opinion based on what I have seen
    ... In Spain we have an electronic ID card I use to access a
    government service
    ... I can comment on a gov't blog, I don't need it
    ... Strong cultural difference in many countries
    ... In Singapore, which is eGov savvy
    ... they do opposite and don't ID their citizens online
    ... most of time, they don't care
    ... when they get a lot of complaints or anonymous emails, they
    focus on the substance
    ... they don't care who is saying what
    ... they do have id cards, but used for specific cases
    ... In Latin America, there are few id systems

    Daniel: the idea of identity is a flexible thing
    ... nature of identity is different culturally
    ... we have strong dispositions in US

    Kevin: is there an opportunity for us to take a position on
    standards, or do we need to be general
    ... recognizing issues, but not drawing conclusions
    ... It's not going to be one-size-fits-all

    Daniel: yes, pattern of everything; we should point to what W3C
    offers

    Peter: there are many other organizations in this space; so you may
    want to base line what others are doing

    Greg: I look at technology in context of its practice
    ... open id is a distributed system at its core
    ... allows people to offload identity to place that has done another
    level of authentication
    ... looking at a distributed system rather than an authority

    <OAmbur> For .gov agencies, the most meaningful aspects of their
    "IDs" are their missions, visions, values, goals, objectives, and
    stakeholders, i.e., their strategic plans. Depending upon one's
    point of view, the same might be said of individuals.

    Peter: but it's not the only one by a long shot
    ... depends upon the need
    ... as said earlier, one size does not fit all

    Joe: I'd like to step back
    ... this area involves two aspects
    ... individual coming to gov't to authenticate themselves
    ... and the government authenticating its official documents
    ... I like the assertion concept
    ... government asserts that document has not changed since they
    published it
    ... they distinguish between authentic and official
    ... recognize once they publish on Web, another entity can publish;
    so what's the provenance
    ... how does gov't publish data if it gets replicated
    ... And then when I as indiv does business with gov't...
    ... we had federal PKI initiative
    ... I was pleased we could piggyback onto federal PKI program
    ... so we could use their facilities
    ... made things simpler
    ... for lobby disclosure
    ... that's one technology, but federal gov't has to some extent
    ... where interactions are at that level, the DoT has that service

    Daniel: I think they gave people certificate

    Peter: yes, four certificates
    ... at DoD

    Daniel: DoD has been leader in authentication adoption
    ... not get into real ID

    Ken: we will see a lot of aggregated data that claims to be
    government
    ... it would be good to have link to open source to verify data
    ... we do semantic web
    ... data.gov and recovery.gov will have a number of mash-ups that
    will be unclear
    ... we could produce a standard about how we aggregated the data
    ... that could provide a level of trust
    ... Second comment on ID and data
    ... I wrote about subject... can we have a privacy wall
    ... so gov't cannot see your transactions in other parts of
    government

    Peter: depends upon the gov't application area
    ... if they care about an attribute, then your id can be masked, but
    it's situational

    Ken: May be case with IRS that needs to know
    ... but agent may only need to know your soc sec number was verified
    ... I am suggesting a system that a person in gov't cannot query
    every other transaction and where they have been
    ... knowing there is a hidden layer
    ... so you can only be looked up based on that relevant transaction

    Peter: two comments
    ... the need to know your identity
    ... is situational
    ... depends upon application and purpose; we are agreed that our
    policy statements say you should only ask and keep info you
    absolutely need
    ... and if you ask for ID, we have to protect and ensure
    ... our systems are audited annually for compliance
    ... that is not well known because it's geek stuff
    ... Secondly, the identity architecture we have deployed
    ... does not aggregate data
    ... there is no capability or interest on the civilian side to
    aggregate data
    ... I work on civilian side
    ... we are not interested in tracking you across cyberspace

    Daniel: I would like to go back to Ken's first point
    ... trusting the information in a mash-up and exposing the methods
    that a mash-up used
    ... if there are links to the originating documents
    ... that idea I have been working on with repository schema
    ... idea is either with XQuery or SPARQL, you can preserve method by
    which something was pulled out in an auditable way
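
    [Illustration added to the minutes, not part of the discussion: a
    minimal Python sketch of keeping an auditable record of how a
    mash-up pulled its data, as Daniel suggests. The record layout,
    the source URL and the query text are hypothetical; the query is
    stored as text only and is not executed here.]

```python
import hashlib
import json


def provenance_record(source_url: str, query: str, result_rows: list) -> dict:
    """Bundle the source, the query used, and a hash of the extracted rows."""
    serialized = json.dumps(result_rows, sort_keys=True).encode("utf-8")
    return {
        "source": source_url,
        "query": query,
        "result_sha256": hashlib.sha256(serialized).hexdigest(),
    }


# Hypothetical source document and SPARQL query text.
source = "http://example.gov/data/spending.rdf"
query = "SELECT ?agency ?amount WHERE { ?agency <http://example.gov/ns#spent> ?amount }"
rows = [{"agency": "Example Agency", "amount": 100}]

record = provenance_record(source, query, rows)
print(json.dumps(record, indent=2))
```

    [Anyone who re-runs the stored query against the stored source can
    recompute the hash and compare it with the published one, which
    gives a reader a way to check that the mash-up shows what the
    source actually contained.]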

    Ken: I am saying something more general; not everyone will use
    XQuery or SPARQL
    ... when you put view source and show how your JavaScript compiles
    that application
    ... it will avert problems
    ... if people try to put up fraudulent aggregated data
    ... If we have a standard, that would negate the problem since they
    are not following the standard

    Daniel: I'm just saying that the W3C already has some standards

    Joe: I feel like replication is a problem, period.
    ... That's a tough statement; there are lots of mash-up sites
    ... I would claim that I cannot trust any mash-up site
    ... If I find an error in a GPO page, I can contact them
    ... and they can correct that error
    ... But all of the mash-up sites that used this data, will have that
    error
    ... the assumption on mash-up site is that was historical data that
    will never change
    ... so using data by reference rather than by replication.
    ... You end up with a better quality of data

    Ken: My concern is that change.gov or change.org
    ... showing what people want to have happen; for example marijuana
    came up to top of list
    ... people who disagree with Obama on will push issues
    ... Web site purports to be the voice of the American people
    ... in transparency, we are trying to get more people in public
    sector to make the data more relevant
    ... as we launch this, I would like to see some standard adopted so
    we don't hurt the brand of open data
    ... so people won't believe anything that says open gov't data

    Peter: That's why GPO approach is the right way; if gov't signs
    those documents
    ... anyone can go click to see if the certificate is valid

    Daniel: Let's move on
    ... all important points

    Greg: So put this on white board
    ... identification of organizations, company parent structures,
    understand who the contractor is
    ... have that be consistent
    ... fits in with identity, but identity of an organization
    ... it's a real need we have
    ... we don't have a way to get the corporate ownership tree from the
    government

    Daniel: important point
    ... especially people going after recovery and stimulus funds

    Joe: gov't identification of their data, and how citizens
    authenticate themselves

    Kevin: good conversation that brought forth some issues
    ... I'd like to ask Chris Testa to lead long-term data conversation
    now
    ... then continue after lunch

Long Term Data Management

    <josema> scribe:josema

    [scribing will suffer a bit from new scribe ability, more
    summarized, please jump in as you like]

    [back to conversation this morning, how can you point to a You Tube
    source vs. authoritative source]

    [discussed yesterday, published once and syndicate]

    Suzanne: if goal is to make data discoverable, available,
    accessible, we need to think differently
    ...how we manage our data

    <OAmbur> One issue related to records management is the importance
    of presentation, i.e., exactly what did the individual see when he
    or she did what he or she did.

    Suzanne: there are some experimental practices
    ... Virtual Alabama is one of them
    ... whole idea is to make whole info available in the event of a
    crisis
    ... so others could have an idea on what the situation is and plan
    accordingly
    ... they decided to take responsibility of information down to the
    source
    ... when asking what's the quality of the information
    ... in practice they have SLAs
    ... one example of what we might be able to surface in general

    Diane: access to data in crisis management is very important
    ... EPA gave a presentation on [] project recently on managing this
    information

    [scribe missed a comment]

    Joe: I think it's practice in gov agencies not to keep a trail
    ... gov is made of people, people make mistakes
    ... sometimes errors are significant
    ... this is a problem, long term
    ... once it's published and it's distributed, there's no trail
    ... error is replicated

    Kevin: there's pressure to keep all the copies, even when only one
    character has changed
    ... it has legal implications

    Diane: we had to create methodologies of all our taxonomies, i.e.
    for versioning
    ... what was the version in force at a given time
    ... this related to what Joe and Daniel showed yesterday
    ... how you archive the versions, maintain differences, etc is a big
    problem for us

    Chris: at LOC we looked at a document holistically
    ... was difficult to reference to just a single piece that was being
    changed

    David: I don't think anybody here has published so much wrong data
    as I did
    ... OCRing newspaper information, posting to Flickr makes it better
    ... the non auth version is better than the auth one
    ... data reference is an important piece of the big picture
    ... but any single point of failure is the enemy
    ... you can't ever be sure everybody is using the same copy
    ... we are putting data online that in cases has been hidden for
    hundreds of years
    ... someone can download info from our site and even if the site
    disappears, info will survive somewhere else
    ... there's also need of useful pointers to pieces of information

    Daniel: wild ideas you brought up
    ... when a site goes down, people go to Google cache to find the
    info
    ... this is an example that just by viewing information there is
    fingerprint
    ... another view of preservation, just by viewing the info
    ... the more people view it, the more it is an act of preservation

    ChrisD: if you already published data and you find error on it two
    years later
    ... should you fix the version there or preserve it as is and
    release a revised one?
    ... in that case you may want that permanent URI to be a pointer to
    list of versions
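
    A minimal sketch of the idea above, assuming a permanent URI that
    resolves to a list of dated versions rather than to one mutable
    document. The class names, the example.gov URIs, and the dates are
    hypothetical and were not presented at the meeting; corrections are
    published as new versions, and earlier ones are never overwritten.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Version:
    published: str  # ISO 8601 date on which this version was released
    uri: str        # address of this frozen version
    note: str = ""  # reason for the release


@dataclass
class VersionRegistry:
    """Maps each permanent URI to the list of versions published under it."""
    versions_by_uri: Dict[str, List[Version]] = field(default_factory=dict)

    def publish(self, permanent_uri: str, published: str, note: str = "") -> Version:
        # A correction is appended as a new version; nothing is edited in place.
        history = self.versions_by_uri.setdefault(permanent_uri, [])
        version = Version(published, f"{permanent_uri}/{published}", note)
        history.append(version)
        return version

    def versions(self, permanent_uri: str) -> List[Version]:
        # What the permanent URI would resolve to: the full version list.
        return list(self.versions_by_uri.get(permanent_uri, []))

    def in_force(self, permanent_uri: str, on_date: str) -> Optional[Version]:
        # ISO 8601 dates sort correctly as plain strings.
        candidates = [v for v in self.versions(permanent_uri) if v.published <= on_date]
        return max(candidates, key=lambda v: v.published) if candidates else None


registry = VersionRegistry()
registry.publish("http://example.gov/id/report-42", "2007-03-01", "original")
registry.publish("http://example.gov/id/report-42", "2009-03-13", "corrected error")
```

    Under these assumptions, in_force() answers the question of which
    version applied at a given time, and the original with its error
    remains retrievable.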

    Joe: I tend to think that stuff that goes to National Archives won't
    be touched
    ... you don't want a bill there to be changed
    ... imagine if you got a facsimile of a bill, you were going to
    check it before the original and you'd find differences
    ... there's a reason why the President signs it
    ... according to US code, the printed version by GPO is the law

    David: most of information dated before 1895? is wrong, misspelled,
    based on rumours
    ... you digitize it, OCR doesn't work perfectly, etc., in our case
    we wanted to preserve what was present in the original
    ... if there was a mistake originally you want it to be there
    ... it's a record of what people saw

    [Joe gives example on biographies that changed over time because of
    mistakes and the language used was different from era to era]

    Joe: the way old historians wrote some of those may be not
    politically correct today
    ... but I don't want to see them changed

    [scribe missed comment from Kevin on history of bills]

    Joe: what we are doing today is to put what we have in paper in
    electronic form
    ... but the electronic universe is different, do we really need the
    paper form? the paper law version?
    ... there are people and organizations devoted to that paper version
    but may or may not be needed
    ... in the end, we all are on this together

    Diane: what we need to figure out is what we can do from a W3C POV
    to improve current situation
    ... maybe a BP mechanism to connect the documents (OCR, original,
    other versions)?

    Daniel: in 10,000 years from now, how could we reconstruct what is
    being electronically archived, e.g. in XML
    ... is this an adequate way? is the metadata helpful to preserve
    meanings?
    ... is CSS a method of preservation?
    ... versioning issues and authenticity are important, but these
    ones, too
    ... will these documents be easily reproducible?

    [John presents a piece on "referencing legislation"]

    <OAmbur> With reference to the need to preserve the presentation of
    the data, XFDL is a candidate XML specification:
    [13]http://en.wikipedia.org/wiki/Extensible_Forms_Description_Language

      [13] http://en.wikipedia.org/wiki/Extensible_Forms_Description_Language

    John: in ?? the online legislation in the UK had already the same
    legal status as the paper version
    ... all over EU this is happening, about half countries already have
    same status

    <OAmbur> XPS is another candidate XML specification addressing the
    presentation of data:
    [14]http://en.wikipedia.org/wiki/XML_Paper_Specification

      [14] http://en.wikipedia.org/wiki/XML_Paper_Specification

    [John explains the features and the URI scheme in place]

    [URIs allow us to tackle many of the problems we have discussed]

    Daniel: so you have resolved how to cite a piece of legislation that
    exists as a whole document
    ... we've been thinking in terms of smaller pieces, too, and using
    XPointer to go deeper

    John: exactly what we are also enabling, pieces have anchors, too
    ... for us, these are identifiers, what they resolve is another
    question
    ... we don't consider the XPointer as part of the URI
    ... we built the whole scheme
    ... from a SW perspective we rather might have the # anchor, but we
    are talking here on how to identify the piece
    ... forget the document for a moment, it's all about an identifier
    scheme

    Greg: I think this is terrific
    ... I really want an identifier to anything I might want to link to

    Daniel: it's an interesting solution but I'm surprised at not seeing
    XPointer there

    Joe: I can understand this being used as an identifier, e.g.
    protects the system if something does not exist in the future??

    Greg: I understand this might not qualify as a URI, but as a Web
    description mechanism

    [scribe missed comment from Joe]

    Greg: people who invented software didn't give thought to search
    ... then the people who invented search, had to build on what was
    available
    ... building a bridge between both worlds is an interesting and
    useful thing to do

    Daniel: you need a schema describing how the URIs are constructed

    Greg: if there were a machine readable version of that, for me
    that's a step towards the SW
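
    One way to picture a machine-readable description of how the URIs
    are constructed is sketched below. The template, field names, and
    example.gov host are invented for illustration; they are not the UK
    scheme John presented, whose details are not recorded in these
    minutes.

```python
import re

# Hypothetical scheme description: a template plus a pattern for each field.
SCHEME = {
    "template": "http://example.gov/id/{type}/{year}/{number}/section/{section}",
    "fields": {
        "type": r"[a-z]+",
        "year": r"\d{4}",
        "number": r"\d+",
        "section": r"\d+[A-Z]?",
    },
}


def build(scheme, **parts):
    """Construct an identifier, rejecting values that break the scheme."""
    for name, pattern in scheme["fields"].items():
        if not re.fullmatch(pattern, str(parts[name])):
            raise ValueError(f"{name}={parts[name]!r} does not match {pattern}")
    return scheme["template"].format(**parts)


def parse(scheme, uri):
    """Recover the fields from an identifier, or None if it does not fit."""
    # Escape the template, then swap each placeholder for a named group.
    pattern = re.escape(scheme["template"])
    for name, field_pattern in scheme["fields"].items():
        placeholder = re.escape("{" + name + "}")
        pattern = pattern.replace(placeholder, f"(?P<{name}>{field_pattern})")
    match = re.fullmatch(pattern, uri)
    return match.groupdict() if match else None


example = build(SCHEME, type="act", year=2009, number=12, section="3A")
```

    Because the same description drives both build() and parse(), a
    third party could cite a piece of legislation, or check a citation,
    without asking the publisher how its identifiers are formed.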

    [breaking for lunch]

    [still waiting for people to come back from lunch]

    <mib_9cb9y50b> karen myers is teaching diane mueller irc chatting

    [starting in 5 minutes]

    <Karen> Kevin: So let's get started

    <scribe> scribe:Karen

Future

    Kevin: We have had some good discussions up until lunch
    ... now let's look at what we have done so far
    ... and where we want to focus and prioritize
    ... So looking back to yesterday's discussion on Web 2.0
    ... One of issues is how gov't employees can take advantage of
    social media
    ... Second, reference models...(see agenda)
    ... Are there some tangible take-aways to refocus that conversation
    ... Reflecting upon George Thomas' comments
    ... related to social networking; lots of groups have been beating
    that horse
    ... so what does W3C contribute to the discussion, or should we let
    it go

    Daniel: I think W3C should be part of the discussion
    ... rules in US and UK are different
    ... people have different hats they wear
    ... So a member of Congress can say certain things under one hat,
    such as fund raising vs. as the rep
    ... how their speech would be identified
    ... So the idea of providing different identification for people
    ... they can have a public persona
    ... they speak on their thing on the record
    ... or if they identified themselves differently it would be another
    identity

    Kevin: So are we enabling social media networking

    Daniel: I think it's a piece

    Joe: Policies drive technologies
    ... as a former federal manager, this is a tough area
    ... I can already sense what is going on
    ... the gov't is concerned about who speaks for the gov't and what
    is said
    ... the media does not distinguish between the hats that they wear
    ... So I can say this now because I'm not employed by gov't
    ... So it's a cultural issue
    ... I get a lot out of Twitter
    ... like to participate in the discussions
    ... It would be useful to them
    ... They may want to watch and observe rather than be participative
    ... I understand why I don't see gov't folks with their full names
    on twitter
    ... It's sad but that may be the reality for now
    ... Not sure how to overcome that

    Kevin: Are there standards to use if and when the culture is ready?

    Daniel: When you make a statement on a form and sign your name
    ... you are under legal recourse
    ... if you go to your Facebook page and say something that's untrue,
    under another identity
    ... except under the terms of service contract
    ... technology is hard to map
    ... difficult but doesn't mean we cannot deal with it

    ChrisJ: If it is difficult to take a stance, maybe we list the known
    issues in order for gov't to incorporate
    ... we could contribute this much

    Diane: If gov't isn't ready for social networking
    ... maybe we need to look at multichannel delivery
    ... and social networking is platform for that content
    ... so maybe W3C looks at standardizing content across the platform

    Joe: That's great
    ... but start talking about gov't employees
    ... So maybe gov't has its own twitter

    Daniel: There is a standard that Twitter uses
    ... could recreate the same technology

    Suzanne: I know there are early adopters in the US gov't
    ... and there is also an inertia to overcome
    ... and some employees don't understand the tech, or what policies
    they will violate
    ... some employees don't care or don't understand legal implications
    ... there have been people arrested; stuff they put on the Web
    was used as evidence against them
    ... people have to take time to think
    ... We are social animals, not machines
    ... I am puzzled why we aren't socializing with the tools that allow
    that capability
    ... So should W3C build some scenarios to educate government and
    help transform
    ... Issue of culture
    ... in US gov't we are a culture of "need to know"
    ... I will share if you need to know and I trust you
    ... So it's a new day with new admin and leadership
    ... and our performance will be measured by how well we engage
    ... we will be forced to shift our culture
    ... we may be more creative or innovative
    ... and start to think pro-actively and to create our future
    ... So the issue of trust and a strategy to find those early
    adopters
    ... Show what was done with little money
    ... make it ok to rethink our policies
    ... Agencies are bounded by their missions
    ... they have their chain of command, buildings, money
    ... In a connected world
    ... the boundaries are blurred
    ... forced to join up with other folks to solve same problems
    ... their roles and authority are blurred
    ... Some advice is not to come up with a single standard
    ... but form a community around a common concern
    ... so that formation of community will stand the test of time
    ...new world in ways our wars are fought
    ... for example, combatants dressed in non-combat uniforms
    ... so they formed a community and partnership with non-combatant
    organizations in the waterworld
    ... shipping industry, port authority
    ... so the common problem is about identifying ships, crews, and
    cargo

    <josema> I tend to believe once again, everything starts from
    Government Data, and will try to make my point

    Suzanne: So that's an example of a common problem beyond the usual
    enterprise
    ... They are empowered now about what information to share
    ... That's an example of a practice
    ... The W3C could look at all these communities
    ... and ask if there are standards to put in place to normalize
    these communities
    ... The other thing is metadata is always going to be an issue
    ... Can W3C enable something around metadata standards
    ... forgive my ignorance if they exist
    ... can we come up with principles that communities adopt when they
    are formed

    Jose: When we started preparing charter we discussed this
    ... and back in October F2F in France
    ... we discovered it was hard to scope the work
    ... so many "e-*", e-Health, e-Democracy, e-Voting...
    ... hard to find what to do in so many areas
    ... My idea was having government data as the core of everything we
    are working on
    ... every time we talk about whatever issue
    ... we end up talking about the data
    ... all sorts of ramifications that lead us to topics of common
    interest
    ... So if the core is about gov't data
    ... then you have to focus on some core things
    ... make it representable to humans and machines
    ... usable for people with disabilities, etc.
    ... and devices
    ... And ensure the data has provenance attached to it
    ... and authentication
    ... so there are issues that cut across everything
    ... So if we focus here, that would be enough for us to work on
    ... The discussions are very interesting
    ... We have limited resources so far
    ... That is my main concern so far

    JohnS: the issue of participation and engagement we need to discuss
    to play back a degree of relevance
    ... and pick out a number one issue where we can make a real
    contribution
    ... For me, that issue for W3C is open data
    ... it's an issue that is well aligned
    ... gov'ts will look to W3C for how to do this
    ... allows us to explore some other areas that policy makers are
    interested in
    ... Data is the closest issue to W3C standards and specifications

    Kevin: When you have decision makers needing to communicate in a
    higher level way
    ... and now they are tasked with decision-making

    Suzanne: In the data world
    ... projects need money to thrive and move forward
    ... there are a number of opportunities for data projects
    ... but there is fear among decision-makers about data projects
    ... I have gotten some money
    ... especially in areas of data quality best practice

    Joe: the maritime motivation was probably not alone

    Suzanne: yes, when Navy realized that its partners were not just
    military

    Joe: So all these mechanisms are about bringing data together
    ... open data so it can be interoperable and then obtain the benefits
    ... So back to maritime, was there an ROI at all, or was it
    crisis-driven

    Suzanne: In this case, it's an argument of what we avoid: loss of
    life, property, prevent terrorist activities
    ... It was recognized that everyone had pieces to the puzzle
    ... create a forum where people can bring their pieces together

    Joe: I think the gov't gets much of this
    ... I guess I agree with John that the open data helps
    ... good way to publish this data

    Suzanne: So create scenarios around that?

    Joe: yes
    ... The Web today, amazing number of 'proof of concept' experiments
    ... no real decided way to stitch stuff together
    ... Library of Congress has preservation of metadata initiative; how
    does this fit into other areas?
    ... So many standards and efforts
    ... I created this page and this gov't person asked me how to put it
    up in XML

    Suzanne: yes, there are a lot of competing standards
    ... I don't think gov't should make a decision around a single
    standard, rather let the communities decide
    ... and have W3C help to enable the types of standards to advocate
    ... right now it's an issue of discovery
    ... Maybe dialogue will switch to access
    ... Can't get people to know I exist unless I show up somewhere; same
    with my agency
    ... Let's solve discovery, then hone in on access

    Jose: One thing we have discussed is how to find a given taxonomy or
    ontology
    ... Semantic Interoperability Center in Europe is an example
    ... people are sending their schemas for gov't projects
    ... to this repository
    ... then a clearing process
    ... you get the schema, and they propose some changes, document, and
    publish it to benefit others
    ... Usually a package for XML schema, documentation, license you can
    download free
    ... There still are problems
    ... I am not a SemWeb expert, but we are compiling use cases based
    on RDFa
    ... first thing I thought was that we need a small ontology to
    describe those use cases
    ... no idea where to find it, how to look for it

    Diane: One issue we have had to resolve is a taxonomy recognition
    process
    ... People go out and build different taxonomies
    ... We have various groups
    ... doing conformance checking
    ... to XBRL Int'l and we have an in-house process for conformance to
    specifications
    ... So we have become a de facto clearing house

    <josema> and again, I'd love W3C could go ahead with Ontaria
    sometime in the near future [15]http://www.w3.org/2004/ontaria/

      [15] http://www.w3.org/2004/ontaria/

    Diane: Maybe somehow incorporating best practices for ontology
    recognition
    ... maybe not so broad

    [heads nod]

    Daniel: We are in US, so there is term ROI
    ... I think we should steer clear of it
    ... idea in Canada and other places of good government
    ... We can confront issue of limited resources; we are looking for
    ways to maximize resources
    ... and we can explain the costs
    ... A lot of gov'ts value good government
    ... We are using term open government data
    ... looking at best practices and new technologies and putting in
    use cases
    ... We are also helping to build the tools and software
    ... One thing we can think about is not just the data but the
    techniques, best practices, and tools we are putting out there

    Kevin: Yes, this brings us back to mission of group

    Ken: One concern I have, is at what point do the standards become a
    barrier to adoption?
    ... According to Yashir Bashir, he has suffered from making the
    adoption of standards to be a barrier
    ... So this has to be adopted by developers
    ... We should voice a concern that standards could become a barrier
    ... the most important thing is the start of adoption; perfect
    should not be enemy of the good

    Suzanne: To the point of not focusing on ROI, but there is a focus
    on agility
    ... private industry is about agility and willingness to adapt to
    change
    ... I'm trying to bring this into gov't; part of it is educating
    gov't leaders
    ... If you don't change, the citizens are going to solve their own
    problems
    ... their solutions will put you out of a role

    Daniel: Are you saying that gov't could be co-opted successfully?

    Suzanne: I read an article about a small township in the UK; they
    used social media to solve a common problem that everyone wanted
    solved
    ... Then gov't realized they needed to make themselves relevant
    ... My job is to advise senior leaders; to use emotional
    intelligence to motivate them to learn

    Kevin: Beth stressed yesterday the role of e-rulemaking
    ... you are opening yourself up to comment
    ... now you can comment so quickly
    ... that may create some tension
    ... will put different spins on things

    Daniel: you have joined W3C eGov to save your government [laughs]

    Joe: So what does W3C do for gov't agencies
    ... There is a Web with file server; we're talking about open data

    <Daniel_Bennett> thanks Karen for that

    Joe: would be great to use a standard for gov't agencies
    ... so a single file that pointed to these data files or data
    repositories
    ... how do I find this stuff
    ... I go to Google, file type colon, XML
    ... that's really not fair
    ... that works
    ... I would love it if W3C would say, here are the XML files, or the
    open data we are publishing

    Diane: So here is another example
    ... SEC put up RSS feeds for new exhibits
    ... for filings
    ... other thing is that at 3:00am they post to an FTP site
    ... the previous days' actions
    ... they are working on some Web services architecture
    ... but there are some really simple things you can do today
    so that we can find it

    Daniel: FTP uses file discovery; URL discovery is being done to some
    extent through RSS
    ... by use it's restricted
    ... Something called site map
    ... idea of repository schema
    ... to allow people to do full discovery of gov't data
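
    As a rough illustration of the single pointer file Joe asks for and
    the site map Daniel mentions, the sketch below writes and reads a
    small index in the Sitemaps XML format. The namespace is the
    published Sitemaps one; the example.gov URLs and the function names
    are assumptions, and the repository schema idea itself is not
    specified in these minutes.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_data_index(entries):
    """Serialize (url, last-modified date) pairs as a sitemap-style index."""
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


def list_data_files(index_xml):
    """What a crawler would do: read the index and return the data URLs."""
    root = ET.fromstring(index_xml)
    return [loc.text for loc in root.findall("sm:url/sm:loc", {"sm": SITEMAP_NS})]


# Hypothetical agency data files.
index = build_data_index([
    ("http://example.gov/data/filings-2009-03-12.xml", "2009-03-12"),
    ("http://example.gov/data/senators.xml", "2009-01-20"),
])
data_files = list_data_files(index)
```

    A file like this could sit at a well-known location on an agency
    site, so that machines find the open data without going through the
    home page.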

    Joe: This is a need
    ... we go to a Web site and we are confronted with the home page
    ... terrific for me as a human
    ... but I cannot find what gov't is posting for open government data
    ... whether we use this standard or not

    Kevin: So are there any examples out there yet?

    JohnS: So thinking about history
    ... this is the kind of question that we can put to the rest of W3C
    ... we are busy trying to figure out how to do open gov't data
    ... particularly this discovery piece
    ... It would be helpful if W3C could say something useful
    ... becomes a question for the SemWeb guys
    ... that's exactly the question we could pitch

    Joe: So maybe there is a hierarchy of this
    ... We already know what's coming when it comes to the SemWeb folks
    ... Maybe there should be three ways to do it
    ... Level one

    Diane: We call that an adoption model

    Joe: Gov't folks are really busy
    ... so find a way to have them be conforming without spending six
    months

    Kevin: I have expressed my thoughts on why RDF is not an ideal
    solution for all gov't data

    Joe: Idea of transforming XML into RDF
    ... that's an important thing to think about, but it's not the holy
    grail
    ... gov't just wants to get data out on the Web

    Diane: So that may be a use case, how to get data out on the Web

    Jose: Something that happens all the time?
    ... Why do you think W3C is selling Semantic Web alone? We are
    working on other things, too.
    ... We have been discussing within W3C a long time. It's what is the
    right tool for the job
    ... maybe we need best practices to identify some cases

    Daniel: So this brings me back to discovery of things
    ... the ICANN
    ... there was a point that was an rfc
    ... general rules that domain should be given out
    ... that was an issue; it moved to GSA to make rules
    ... dotgov is a US domain
    ... a lot of issues around that
    ... in US we have a crazy point where .gov is non-federal
    ... in terms of discovery, if we assume a lot about the domains
    ... we are investing a lot in it
    ... it has not been brought up in the conversation and it's not a
    small thing
    ... we are just now questioning it
    ... In US there is a semi-official use of .org
    ... quasi type organizations
    ... go to USPS.org and USPS.gov
    ... we should talk about it

    Kevin: Most localities were stuck with .us
    ... we were buying .orgs to differentiate the Web sites between gov
    and commerce
    ... working with GSA they did a federal register process ruling
    ... there needed to be an identity

    Daniel: We should talk about a best practice
    ... people come to trust these things; some regulated and some less
    regulated more domains, top-level and unrelated (unfortunate in my
    opinion)

    ChrisJ: I know this group is chartered by this May
    ... so do we need to focus realistically?

    Kevin: True
    ... There are certain strategies and issues in the paper
    ... If we can get solutions and best practices, wonderful
    ... but there may be year two and three work

    ChrisJ: I would be hesitant to know whether things we ID for May are
    really best practices
    ... maybe we look at what things can best satisfy; what are the
    criteria

    Suzanne: I think there are pockets of knowledge we can leverage
    ... have not reached the tipping point yet, but there are early
    explorers who have gained some success
    ... tangible examples others are looking at and can take back to
    agencies

    Diane: So maybe surface those in US gov't
    ... like Andrew [] at EPA
    ... some good use cases and can work to substantiate within other
    agencies

    Kevin: Yes, like the open site map initiative
    ... it's not a crown jewel, but is working well
    ... when we talk about architecture
    ... the World Bank initiative is about helping to build an
    architecture
    ... but they cannot invest money in big architectures
    ... keep limited solutions in context

    Suzanne: examples at state and federal levels like
    virtualalabama.gov
    ... whatever TSA stands for under DHS
    ... they engage their citizenry on the policies for screening at the
    airports
    ... they were early explorers of engaging citizens on how to improve
    screening at the airports
    ... story is that there was a lot of vulgarity at the start
    ... lesson learned is to let people vent and screen the responses
    for the good ideas
    ... ignore the bad and hone in on the good
    ... they identified good suggestions and tested at pilot sites
    ... now there is more of a partnership approach and the citizens
    responded favorably

    Ken: NAPA collaboration project has examples of social media

    Suzanne: I am an agency member to that group
    ... I learned about it by sitting in on collaboration project meetings

    Kevin: So going back under the open gov't data section to bring this
    to closure and review

    [Kevin re-reads document]

    scribe: What are the thoughts?

    Jose: We compiled a number of issues we thought should be discussed
    here

    Diane: I would keep XBRL in there, but it's a use case under
    multichannel and versioning, much like London Gazette
    ... as illustrations of how one standard has done it
    ... then look at legal XML

    Ray: What do you mean by metadata standardization?

    Jose: This came up when talking about if there is a need to have an
    open sitemap for metadata protocol
    ... idea is that it might be some standard
    ... point people to sources that are for machines and not humans
    ... help discover data sources

    Ray: so not standardizing the metadata but the protocol

    Joe: But there are a number of metadata standards
    ... I used XMP for example
    ... a wrapper for RDF, a wrapper for Dublin Core
    ... XBRL is same; have own metadata standards that map to Dublin
    Core
    ... but don't use dc title; don't replicate it
    ... a lot of metadata standards at low level
    ... even data has different formats for date
    ... would be nice to have an adoption model to help people figure
    out
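
    The point about differing date formats can be made concrete with a
    small normalizer. This is only a sketch: the list of accepted
    formats is an assumption, and reading 03/13/2009 as month/day/year
    is a US convention that a real adoption model would have to state
    explicitly.

```python
from datetime import datetime

# Assumed input formats; a real profile would publish its own list.
KNOWN_FORMATS = ["%Y-%m-%d", "%d %b %Y", "%B %d, %Y", "%m/%d/%Y"]


def to_iso_date(text):
    """Return the date as ISO 8601 (YYYY-MM-DD), or None if unrecognized."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue  # try the next known format
    return None


normalized = [to_iso_date(d) for d in ["13 Mar 2009", "March 13, 2009", "03/13/2009"]]
```

    Returning None rather than guessing keeps unrecognized values
    visible, so they can be fixed at the source.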

    Ray: Hasn't ISO been trying to do this for 15 years?
    ... maybe that gets to how that statement is problematic
    ... when I was dealing with standards at OASIS
    ... so how do we standardize those things
    ... like Dublin Core doesn't talk about what you put in it
    ... I would break it into standards for how to show you are using
    metadata and how you fill it
    ... break those two apart

    Ray: W3C joined LOC ten years ago because of standardization of
    metadata
    ... but it means nothing more than RDF to the W3C

    Joe: LOC preservation of metadata goes beyond Dublin Core
    ... why wouldn't we coordinate this with you?

    Ray: yes, when you brought it up
    ... I wanted to mention that premise does represent...LOC does
    coordinate it...

    <mib_9cb9y50b> PREMIS

    [PREMIS committee]

    Ray: if standardizing on preservation metadata, that's a good start
    ... good example of a mature data dictionary
    ... could take to W3C

    Joe: yes, hundreds of people who are experts

    Ray: 2.0 is latest version

    Joe: Yes, we should look at that
    ... XMP doesn't know anything about PREMIS
    ... so why not build up front rather than after thought

    Daniel: I remember another conversation
    ... that had XSLT
    ... between mods and compel report
    ... my point is that W3C provides standard for XSL

    Ray: but mapping from mods to Dublin Core is useless
    ... what are you trying to describe?
    ... catalogue a book or describe something?

    Daniel: Pointing out you are using XML for the metadata standard,
    which is a W3C standard

    Ray: I'm not seeing that

    Daniel: LOC moved to XML standard

    Joe: We want to get more specific than just say use XML

    Ray: LOC invented basis character format
    ... that didn't conform to any syntax
    ... existed for 40 years
    ... ten years ago we converted it to XML

    <mib_9cb9y50b> diane is using mibbit

    Joe: Are there search engines that operate against Dublin Core
    ... Google ignores metadata
    ... people don't do it because it's not searchable for people

    Ray: You can search by Dublin Core on LOC data
    ... using search protocols

    Joe: I would like a URL; that would be helpful
    ... a lot of people don't know that
    ... if people saw value of putting in data, they will do it
    ... obvious long-term value to have info available

    Kevin: What is happening with [?]

    Ray: not sure

    Daniel: bring use cases
    ... and be higher level
    ... doesn't mean we should not show the weeds
    ... wonder about the tool sets
    ... are we building standards or pointing them out

    Diane: one of things because of popularity of RDF
    ... and Search Monkey and Semantic searches
    ... I am working on XBRL
    ... trying to somehow create a use case of generating RDF from your
    content

    Joe: GRDDL

    Diane: We are using some of that
    ... to generate XBRL triples
    ... at the bleeding edge
    ... so we can make XBRL repositories searchable
    ... but for data.gov content, it's not searchable
    ... in a way anyone has an engine
    ... I would like to see some recommendations around that model
    ... and use RDF

    Daniel: that was issue of the first question
    ... should things be in raw data

    Diane: that is a good standing question
    ... if you have raw data, how do you parse through it
    ... yes, please put up raw data, but what's the next step
    ... notification is first step

    Daniel: XSL and GRDDL point to the answers
    ... you get to everything else you want
    ... someone else can lay over whatever semantic they want
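
    [A minimal sketch of the step Diane and Daniel discuss, turning
    published XML into RDF triples that others can layer semantics on.
    GRDDL itself points to an XSLT transformation; plain Python stands in
    for that here. The record layout, the "about" attribute, and the
    example.gov URL are assumptions made for illustration only.]

```python
import xml.etree.ElementTree as ET

# Dublin Core element namespace.
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical record: a wrapper element whose children are Dublin Core terms.
SAMPLE = """<record xmlns:dc="http://purl.org/dc/elements/1.1/"
        about="http://example.gov/data/budget-2009">
  <dc:title>Budget 2009</dc:title>
  <dc:creator>Example Agency</dc:creator>
</record>"""


def escape_literal(text):
    """Escape backslashes, quotes and newlines for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def record_to_ntriples(xml_text):
    """Return one N-Triples line per Dublin Core child of the record."""
    root = ET.fromstring(xml_text)
    subject = root.get("about")
    lines = []
    for child in root:
        # ElementTree reports namespaced tags as "{namespace}localname".
        if child.tag.startswith("{" + DC + "}"):
            local = child.tag.split("}", 1)[1]
            value = escape_literal((child.text or "").strip())
            lines.append('<%s> <%s%s> "%s" .' % (subject, DC, local, value))
    return lines


if __name__ == "__main__":
    for line in record_to_ntriples(SAMPLE):
        print(line)
```

    [The sketch only restates what the XML already says; deciding what
    the triples mean is the open question Diane raises next.]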

    Diane: Who gets the semantic layer on top of it?
    ... the gov't or outside vendors?
    ... a big question

    Joe: If we had one file that was a pointer to all the data on your
    Web site
    ... then maybe some of metadata needs to be on the Web site
    ... There is data out there
    ... file out there for Senators
    ... container for each Senator
    ... no metadata, no schema, in XML
    ... they don't want to change that file
    ... could have this other file
    ... where real metadata could live
    ... go back to that adoption model
    ... I would like a title for that, vs PREMIS data at high end
    ... have stuff running
    ... long term get to what we really want
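
    [A rough sketch of the "other file" Joe describes: a separate
    catalogue that points at existing data files and carries their
    metadata, so the original files stay untouched. The element names
    (catalogue, dataset, title, format) and the example.gov URL are
    hypothetical, not an existing standard.]

```python
import xml.etree.ElementTree as ET


def build_catalogue(entries):
    """Build a sidecar catalogue document describing existing data files.

    Each entry is a dict with "href", "title" and "format" keys.
    The data files themselves are never opened or modified.
    """
    root = ET.Element("catalogue")
    for entry in entries:
        item = ET.SubElement(root, "dataset", href=entry["href"])
        ET.SubElement(item, "title").text = entry["title"]
        ET.SubElement(item, "format").text = entry["format"]
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_catalogue([
        {
            "href": "http://example.gov/senators.xml",  # hypothetical location
            "title": "Senators",
            "format": "application/xml",
        },
    ]))
```

    [Starting with a title and a format per file matches the low-end
    adoption model; richer metadata such as PREMIS could be added to
    the same file later.]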

    Kevin: two more comments

    Diane: Who generates the semantics
    ... as an outside vendor, who says my triples are right?
    ... there is a missing layer

    Joe: human brains
    ... you cannot just automate that understanding

    JohnS: my comment is that gov'ts spend a lot of time defining what
    is meant by certain things
    ... a public notice is a legislatively defined concept
    ... a parking place is a legislatively defined concept
    ... placed onto the physical world is the road
    ... I don't see any way legislative bodies can escape
    ... the responsibility to describe the semantic concepts they use
    ... governments govern
    ... legislators legislate
    ... lawyers disagree
    ... question of figuring out what to do
    ... put to this group
    ... useful work is finding these design patterns
    ... try to tailor some things
    ... if you are trying to achieve this, and are starting from here,
    then here is some stuff that could work out
    ... Provide standards, tools you can try
    ... Surface what some of those patterns look like
    ... We have five or six already
    ... XBRL, RDFa; a range using W3C specs
    ... starting from here to get there
    ... that would be a good process to do

    ChrisJ: There are so many types of gov't data
    ... environmental, financial, legislative, books, etc
    ... as a group do we want to weigh into domain-specific
    ... or non-domain specific
    ... that's what I was wondering

    Suzanne: I would be thrilled if an agency would cough up a data set
    ... with data.gov just around the corner
    ... let's see what they do with it
    ... let activities inform how W3C can facilitate
    ... an opportunity of timing
    ... data.gov
    ... Vivek wants this thing done yesterday
    ... we are not going to wait a long time
    ... to see what agencies do
    ... Regarding data reference model
    ... I don't think comment fits within the scope
    ... number four [?]
    ... When DRM first came out, the agencies didn't understand it right
    away
    ... but they weren't sure how to implement
    ... we came up with a framework
    ... explain business problem you are trying to solve
    ... understand why you exist
    ... this will inform information for the community
    ... that informs the standards
    ... moving from structured sources to unstructured sources
    ... then push onto engineering those services
    ... a three-legged implementation strategy
    ... OMB didn't publish the implementation strategy
    ... Agencies asked all these implementation questions
    ... So I started talking about it
    ... and we published the draft doc on the wiki
    ... once they got that guidance
    ... Everyone then understood what the DRM was about after they
    understood implementation strategy
    ... Now latest talk is all those reference models
    ... so I can build my business cases
    ... Taxonomies for data, services
    ... not quite the same things
    ... around context for information
    ... XBRL won't work for medical community
    ... can't have a single taxonomy for gov't
    ... It's a framework for how to plan for and execute it

    Jose: Moving on
    ... Coming up with a common schema
    ... so gov't agencies can get information
    ... see if they can feed some of their needs
    ... my opinion is that this one is very difficult to achieve
    ... I don't think it will happen soon
    ... Issue is whether you think this could be useful

    Ray: It's difficult to achieve, and is a bad idea
    ... would make standards more difficult to understand
    ... and I say that with 25 years of standards-making
    ... all have different templates for standards

    Daniel: germ of idea that is important
    ... points out another piece this gets to
    ... documentation and specifications
    ... even if you cannot come to an agreement
    ... at least provide the documentation to explain what you have done
    ... tell them that if you put stuff up, try to use standard ways to
    document what you have done
    ... at least make that attempt
    ... Point of order
    ... Someone brought up data.gov
    ... my understanding is that this is a Web site
    ... Is there anyone from data.gov here?

    Suzanne: My name is on the internal letter that went out to internal
    agencies
    ... It happened quickly
    ... my name is on something I don't quite understand
    ... there will be an open meeting hosted by Federal CIO council
    ... coming up soon

    Daniel: Would they like W3C to participate?

    Kevin: They are meeting with a variety of organizations about what
    happens next
    ... contact information being given, separate site
    ... we will post standards for them to evaluate
    ... I understand that they are putting up data sets
    ... I think they will use standards

    Suzanne: We were given a Web link from DC.gov to look at
    ... like create a catalogue of info for sharing

    <josema> [16]http://data.octo.dc.gov/

      [16] http://data.octo.dc.gov/

    [Kevin pulls up data.gov]

    Greg: this is to a number of data feeds

    Joe: Seems like a bad practice

    Kevin: It's only a portal

    Greg: The data catalogue is just contents
    ... what I like is the tabular representation of the data sets
    ... it's not semantic yet
    ... So I can see where the metadata is
    ... and here are some formats
    ... looks like a version 1.0 of what a data catalogue could be
    ... Next issues about getting live streams, feeds, not worked out
    yet
    ... pagination of data
    ... developers add by default

    Joe: Probably thinking they don't want to give you five megs of data
    ... If I want the year, you aren't going to get unless you go day by
    day
    ... that is not opening raw data

    <OAmbur> On the issue of whether anyone is using metadata to support
    queries, AIIM's Interoperable Enterprise Content Management (iECM)
    Committee plans to conduct a demo of the Content Management
    Interoperability Services (CMIS) protocol at the AIIM conference in
    Philadelphia.

    Daniel: the 800 pound gorilla is not in the room

    Suzanne: one of strategies we are recommending
    ... we are ok with idea of being messy up front
    ... also want the opportunity to learn
    ... look at the UK
    ... they are engaging the consumers by asking them for feedback
    ... I don't have data for a year
    ... We want to enable that feedback so we can meet an unmet
    expectation
    ... we won't put out a perfect catalogue at first go
    ... put out what you know today
    ... gain some experience and grow it incrementally

    Diane: Encourage you to talk with SEC and their EDGAR system
    ... how to build a data repository
    ... and impact on servers, strategies for modernizing their systems
    ... there is an office of interactive data

    Kevin: Time check
    ... Talk about interoperability
    ... I think I have a fair grasp of what to document

    [JohnS head nods]

    Kevin: let's continue on to multichannel

    Joe: One point on interoperability
    ... I really think we need to use the Web for individuals to
    interoperate data sets
    ... rather than mash-ups guessing what they want to see
    ... that's the critical issue with interoperability
    ... personal data
    ... I don't think it exists much today
    ... Census Bureau, you cannot get to raw data

    Kevin: Multichannel delivery
    ... What we decided this morning is to focus on mobile delivery
    ... as most important first channel
    ... and cite some of the other channels such as TV
    ... any particular standards that exist or don't exist?

    Daniel: I assume that would be XSL and CSS

    Diane: Maybe SVG

    Daniel: Good point
    ... brings up issue of Flash and PDF
    ... downloadable fonts
    ... my Google phone has downloadable fonts
    ... I can see how things are supposed to look
    ... because it's CSS I can change on the fly

    Diane: Other thing is what is the status of XHTML 2.0 and where that
    is going

    Jose: What I mentioned this morning also is the mobile Web best
    practices

    Diane: Yesterday we talked about HTML and valid HTML
    ... where the gov't stands on putting up valid pages
    ... encourage well-formed HTML, XML, and name spaces
    ... possibly XHTML

    Kevin: Yes, I strongly agree
    ... putting our best face forward
    ... is a necessity

    Ray: Anyone else who is on the wwwTAG list serve?
    ... you may want to note that concept of well-formed HTML is not an
    agreed-upon goal
    ... within W3C at large
    ... very controversial
    ... it's an uphill battle

    Daniel: I disagree with one aspect
    ... what we don't want is people lying about what they use
    ... if they say it's strict XHTML, that it needs to be that thing
    ... some untruthful gov't sites out there
    ... be truthful about what you are saying

    Diane: I bridge between wanting content and wanting well-formed
    stuff
    ... intention of getting it to the next outlet

    Kevin: Absolutely

    Joe: May be dangerous to ask browser manufacturers
    ... to not load pages

    Diane: Then you get Google's home page

    Daniel: They have error things on it

    Joe: I don't think they would do it
    ... not in their interest
    ... over next 50-10,000 years
    ... that's why I like XML

    Ray: Have you seen XML 5 proposal?
    ... It was put forth, but not well received at W3C

    Joe: I would like to use document object model based tools
    ... so we need to have the file well formed
    ... otherwise we are back to screen scraping
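
    [A small sketch of Joe's point, assuming pages are served as XHTML:
    a standard XML parser either accepts the markup, after which
    DOM-style tools can work on it, or rejects it outright. The function
    name is made up for this example, and the check covers
    well-formedness only, not validity against a DTD or schema.]

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    # Every element is closed, so an XML parser accepts this.
    print(is_well_formed("<p>Hello<br/></p>"))
    # The unclosed <br> is common in HTML but is rejected as XML.
    print(is_well_formed("<p>Hello<br></p>"))
```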

    JohnS: So we could make a case for having well formed XHTML
    ... so instead of saying this is an end in themselves
    ... talk about them being a means to an end

    <josema> I agree, it's not a matter of MUST but of SHOULD and the
    benefits it brings

    JohnS: use examples
    ... take care over these things
    ... so people know quality to expect
    ... argument and case need to be presented against an end goal they
    are trying to achieve

    Kevin: you are better prepared to do more if you follow this
    approach
    ... Identification and Authentication
    ... I think we are good here
    ... what else?
    ... anything else to cite in the paper?

    Joe: DITA
    ... Darwin Information Typing Architecture

    Diane: good example of an open tool kit

    Daniel: If we go down that road, we should mention other things
    ... say what's out there

    Joe: Whole issue of W3C vs OASIS standards

    Kevin: We should cite work that has already been done

    Jose: Unless it's proprietary or the licensing model is in conflict
    with what we propose

    ChrisJ: On Thursday I mentioned issue of authentication
    ... verifying that someone is a person vs a machine
    ... are there any best practices for dealing with spam comments?

    <josema> I think it's not OASIS vs. W3C but OASIS + W3C + IETF +...

    Daniel: One of most troublesome areas
    ... what happened in Congress
    ... They said that idea of what makes humans better than computers
    is pattern recognition
    ... but not all humans can detect (if you have disabilities)
    ... because section 508 deals with that
    ... they decided for plain text logic models
    ... kind of funny
    ... they were pretty simple
    ... but it's a huge issue
    ... when I was coming up with legislation
    ... there was issue of whether you want agents to authenticate
    themselves
    ... we don't have a good answer
    ... a lot of people are using OpenID
    ... one of points of it
    ... domain key issues bring it up
    ... hard to id who is human or not
    ... but rather audit back where they came from
    ... and be sure they are who they say they are
    ... so then you can block in future
    ... so get to it afterwards and have an audit trail
    ... a huge issue

    Kevin: We covered a lot
    ... we talked about cookie privacy thing within authentication
    ... still continue the non-use of cookies
    ... to understand who is using and viewing

    Daniel: the Web is stateless; so session id
    ... client, server, IP
    ... saying we are opening ourselves up to a lot of problems

    Kevin: any tangible focus on long-term data management?
    ... We cited a lot of stuff there

    Jose: I was still thinking about repository schemas
    ... whether a need to have a mechanism for machine to understand
    ... how Web site URLs are being built
    ... goes in line with idea of a site map protocol for metadata
    ... I'm brain storming right now

    Diane: Some back reference to Tim's article on cool URLs
    ... as another use case of how they are referencing in the UK
    ... JohnS's example
    ... help inform people
    ... structure things better

    Ray: You are talking about URL templates?

    Daniel: Weird thing
    ... a difference between URL templating and URL discovery
    ... hard to discover based on template
    ... cannot always describe a repository of documents
    ... purely by URL template

    Ray: Are you referring to IETF work?

    Daniel: A little bit

    Ray: May want to list that
    ... the IETF
    ... they have a standard in development on URL templates
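
    [A minimal sketch of URL templating as discussed above, covering
    only simple "{name}" substitution rather than the full IETF
    syntax. The template, the variable names, and the example.gov host
    are assumptions for illustration.]

```python
import re
from urllib.parse import quote


def expand(template, values):
    """Replace each {name} in the template with its percent-encoded value."""
    def substitute(match):
        name = match.group(1)
        # Undefined variables expand to the empty string in this sketch.
        return quote(str(values.get(name, "")), safe="")
    return re.sub(r"\{([A-Za-z0-9_]+)\}", substitute, template)


if __name__ == "__main__":
    print(expand("http://example.gov/bills/{congress}/{number}",
                 {"congress": 111, "number": "hr 1"}))
```

    [This illustrates Daniel's distinction: a template says how a URL
    is built from known values, but it does not say which values
    exist, so it cannot by itself enumerate a repository.]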

    Kevin: anything else?

    ChrisJ: We didn't talk about multimedia
    ... I think there may be different issues that arise
    ... the formats are all proprietary

    Kevin: A bit of an action plan
    ... is the future year two
    ... What I committed to do is a 2-3 page summary
    ... of action items and next steps
    ... which will be posted on W3C site
    ... Some people volunteered at dinner to take some sections
    ... Any sub-committees that want to take ownership of these pieces
    ... and try to draft something
    ... Our comment close date is 26 April
    ... publish date in May
    ... So bring to closure by third week in April
    ... if any group or individual wants to run with it

    Diane: I don't have time to do a whole section
    ... maybe break out a use case
    ... maybe on the multichannel delivery and on long-term data
    management

    Suzanne: I can contribute
    ... but not write a whole section
    ... contribute to the long-term data management
    ... and global management
    ... changing way we think about data as a global asset
    ... who owns the data
    ... change of thought
    ... maybe part of the shift is from ownership to stewardship
    ... look at people who steward the data as being responsible
    for sharing
    ... Keep sharing in mind up front
    ... for consumers known and unknown

    Kevin: This is a blank slate right now
    ... Chris Testa has drafted some work in long-term data-management
    ... This morning's discussion embellished upon the thinking
    ... So that could be a collaborative effort

    Jose: Process wise
    ... So for those who want to contribute
    ... you should formally join the group
    ... and you need to go through two to three steps
    ... fill out a couple forms

    Diane: Please send the links

    Kevin: It's off the wiki
    ... So I'll get that out by Monday

    Ken: I am not a member, but would be interested in the social media
    aspects
    ... not sure what next steps

    Jose: So it's easy to do
    ... please send me an email since I will be traveling
    ... remind me, those of you who want to contribute
    ... and I will send you the Invited Expert (IE) form
    ... group participation is open

    Kevin: This has been a great two days
    ... We have talked about a lot of different things

    <Daniel_Bennett> link to the Repository Schema info
    [17]http://advocatehope.org/tech-tidbits/repository-schema

      [17] http://advocatehope.org/tech-tidbits/repository-schema

    Kevin: We have addressed many areas
    ... of what is accomplishable short-term
    ... What about long term

    Suzanne: Grow the group

    Karen: Yes, we have some recruiting and outreach work to do
    ... some of attendees are at policy levels, not as technical
    ... we will follow-up to invite the right people from those agencies
    to participate

    Kevin: yes, and we have our day jobs

    ChrisJ: In the second year
    ... I could see us formulating more specific technical questions
    ... and then we could bring people in from the outside
    ... some more targeted efforts to develop best practices

    Ken: Could I suggest invite someone else
    ... Josh Tolber

    Kevin: He is on the list

    Jose: So process wise
    ... how W3C works
    ... the plan forward
    ... is that Kevin will produce a summary report
    ... after that we will work on sections of the report
    ... through 26 April
    ... we are also getting comments on the public mailing list
    ... The group is chartered through end of May 2009
    ... Then we need to decide if we are done (which we are not)
    ... We need to propose a charter for the next one or two years
    ... Membership approves the charter
    ... that's how things work
    ... I know there are members in the room
    ... I don't expect there to be a problem
    ... I don't work for W3C personally
    ... My position is funded by CTIC, I am a fellow
    ... and I need to renew my fellowship
    ... and I don't envision a problem either
    ... We are challenged to resource the work
    ... more people
    ... more funding
    ... so we can organize events, travel

    Kevin: I spoke with Tim in January
    ... he understands importance of linked gov't data
    ... feel that we will get his strong support
    ... Also with data.gov and other initiatives
    ... they know they cannot do it themselves
    ... reference to possible funding to do some work
    ... to help groups get to their goals
    ... that is a possibility for this group and others
    ... Other thing to throw out
    ... We had a collaboration with World Bank and OASIS
    ... a one-day workshop on April 17th
    ... Oleg Petrov introduced himself
    ... they have invited anyone here to participate
    ... Please let me know and I'll get you the invite
    ... It is going to be broadcast
    ... and streamed on demand
    ... given the focus of the World Bank
    ... will be a variety of locations
    ... and then they will build their events
    ... It has been a bit bumpy
    ... OASIS resurrected its eGov activity after we started
    ... and World Bank is getting its footing here
    ... So I look forward to a productive year

    Jose: Anything else?

    Kevin: Give you my personal thanks for your commitment and
    participation
    ... Great energy in the room yesterday and all day today
    ... Intellectually stimulating and productive
    ... moving to a really great product to help governments

    Jose: Lastly but importantly, I want to thank Karen

    <josema> thank you so much, Karen! :)

    <josema> [ADJOURNED]

Summary of Action Items

    [End of minutes]
      _________________________________________________________


     Minutes formatted by David Booth's [18]scribe.perl version 1.133
     ([19]CVS log)
     $Date: 2009/03/13 23:28:07 $

      [18] http://dev.w3.org/cvsweb/~checkout~/2002/scribe/scribedoc.htm
      [19] http://dev.w3.org/cvsweb/2002/scribe/

Received on Friday, 13 March 2009 23:38:00 UTC