Re: New BP (?) - Using standards for distributing datasets for specific domains or applications

This seems as much in scope as standard vocabularies. What is the 
argument for considering formats out of scope? Item 2 in our mission is 
to provide *guidance to publishers* that will improve consistency in the 
way data is managed, thus promoting the re-use of data.
If we can suggest that they use standardized data models and formats, we 
do exactly that. If some of us feel that this is too much like telling 
people how to build their dataset, I would argue that we are doing the 
same thing by telling them what terms to use. Neither thing is 
constrained to the job of publishing the data on the web, but both are 
important for that and are good promoters of reuse.
-Annette

On 3/4/16 7:44 AM, Laufer wrote:
>
> Hi All,
> I do not know if this should be a new BP, if it could be incorporated 
> into the BP about standardized terms, or if it should be thought of as an 
> extension included in a BP document of another group. Or none of them.
>
> The inspiration came from GTFS 
> (https://developers.google.com/transit/gtfs/), a standard way of 
> defining timetables.
>
> Here are some extractions from the GTFS site:
>
> “The General Transit Feed Specification (GTFS) defines a common format 
> for public transportation schedules and associated geographic 
> information. GTFS "feeds" allow public transit agencies to publish 
> their transit data and developers to write applications that consume 
> that data in an interoperable way.”
>
> “A GTFS feed is composed of a series of text (csv) files collected in 
> a ZIP file. Each file models a particular aspect of transit 
> information: stops, routes, trips, and other schedule data. A transit 
> agency can produce a GTFS feed to share their public transit 
> information with developers, who write tools that consume GTFS feeds 
> to incorporate public transit information into their applications. 
> GTFS can be used to power trip planners, time table publishers, and a 
> variety of applications, too diverse to list here, that use public 
> transit information in some way.”
>
> It is more than the vocabulary used. It is also a specific way of 
> distributing the dataset. Could we call this a kind of standard 
> dataset type?
>
> Does it make sense?
>
> Cheers, Laufer
>
> -- 
>
> .  .  .  .. .  .
> .        .   . ..
> .     ..       .
>

-- 
Annette Greiner
NERSC Data and Analytics Services
Lawrence Berkeley National Laboratory

Received on Friday, 4 March 2016 17:40:54 UTC