
Re: Archive Collection and Archived Item

From: Jane Stevenson <Jane.Stevenson@jisc.ac.uk>
Date: Tue, 11 Apr 2017 15:21:33 +0000
To: Richard Wallis <richard.wallis@dataliberate.com>
CC: Giovanni Michetti <giovanni.michetti@ubc.ca>, public-architypes <public-architypes@w3.org>
Message-ID: <CCD4EFF6-3A84-4A2C-994D-42080E1E242D@jisc.ac.uk>
I do hark back to our Locah work, which Pete Johnston led on the data modelling and RDF output side. (The blog is still being used!) I think one reason our approach worked is that we were pragmatic in using existing vocabularies wherever possible. We tried not to go out on an archival limb, so to speak. This is sometimes ignored in linked data work, but it is an important part of the linked data ethos of, well, linking by using common vocabularies.

I guess I’m very used to compromise and being pragmatic because when you run an aggregator you have to be,  and I’m also very keen to share/use things in common. So I think there needs to be a good reason to go ‘bespoke’. I’m thinking of schema.org for SEO and potentially for linking, so I want the search engines to slurp it up, and I think that will work better if we use the standard, widely used terms, as much as possible. I assume Richard would concur with this.  

Giovanni, you say:

> I don't see why we can't try and create an Extension that would fit better our (archival) perspective

But I’m not convinced we want to do that. I think this is about discoverability across the Web, where people don’t have an archival perspective. We should keep that perspective in terms of our own descriptive standards, but I suggest we should think about modifying it when we want to be part of the global information community.  The key thing here, I think, is what are ‘our needs’? We need to be very clear what schema.org is for, and what its benefits are. I remain convinced that we get the benefits most by sharing common vocabularies. 


> On 11 Apr 2017, at 15:59, Richard Wallis <richard.wallis@dataliberate.com> wrote:
> It’s great to see my straw man proposal has created some activity.
> I agree; I think we are moving towards updating the proposal a little.
> Jane your description of “the concept of a part of an archive, kind of thing” demonstrates wonderfully how difficult the naming of types and properties can be, especially when they need to make sense to data consumers in the world outside archives.
> I am all for improving the proposal.  Whilst doing that however we always need to take into account that the goal is to get the proposal adopted as an extension to the large generic vocabulary, by the Schema.org community.  To this end we need to wherever possible reuse terms that are already in the vocabulary; follow similar modelling patterns; and make it simple to understand by those who do not understand the inner workings of archives.
> I have sympathy with Owen for example that ArchivedItem ideally would be better subtyped from Thing.  Pragmatically however the chances of such a proposal being accepted would be virtually zero.  New subtypes and properties for Thing are rare occurrences as the community try to maintain a clean structure at the very top of the vocabulary structure.
> ~Richard
> Richard Wallis
> Founder, Data Liberate
> http://dataliberate.com

> Linkedin: http://www.linkedin.com/in/richardwallis

> Twitter: @rjw
> On 11 April 2017 at 15:49, Giovanni Michetti <giovanni.michetti@ubc.ca> wrote:
> Jane,
> I appreciate your practical approach, but I don't see why we can't try and create an Extension that would fit better our (archival) perspective. Like you, I'll use what is there, but I see here an opportunity to change things, and accommodate the model to our needs. In my opinion, it's too early to say "let's go for the best fit". After all, there's been very little discussion about this extension, so we are still in time to amend it, integrate it or even re-think it. I guess we are all looking for the best solution. The Extension as it is now is just a straw-man proposal, as Richard wrote. Let's see if we can improve it.
> So, back to your final statement, I'm not sure I agree. Please don't misunderstand me, I'm not saying I want to create "our own higher level classes just for archives". I'm just saying I don't know; I'd like to think about it a bit and see whether a better Extension can be designed. If this - from my point of view - would mean creating higher level classes just for archives, I'll make a proposal, and try to explain why I think that would be a better solution. Perhaps together we may find a third, different option that would satisfy us all.
> Giovanni
> On 11/04/2017 16:29, Jane Stevenson wrote:
> I did worry about this to start with, but I don’t think it matters that it is classed under ‘intangible’, as I don’t think that has an impact in terms of the role of schema.org.  I think it’s a case of trying to use what is there, and that’s the best fit, even if it’s not an entire fit. We don’t want to create our own higher level classes just for archives.
> the definition reads "an item in an archival collection", so it is an item indeed.
> I think maybe that’s misleading. I don’t think it’s an item; I think it’s the concept of a part of an archive, kind of thing :-)
> I would definitely vote for changing the name and the definition, to avoid confusion.
> cheers
> Jane
> On 11 Apr 2017, at 15:08, Giovanni Michetti <giovanni.michetti@ubc.ca> wrote:
> I don't know, Jane. Somehow I get the vague idea behind this solution, anyway it doesn't seem just a problem of names--the definition reads "an item in an archival collection", so it is an item indeed. Since it is under Intangible, it is intangible too. Which leads to a further doubt--where should we put the tangible archival item?
> The overall picture is a bit confusing...
> Giovanni
> On 11/04/2017 15:55, Jane Stevenson wrote:
> I think I do get ‘ArchivedItem’ now - I just didn’t for a while because I kept equating it to a real archival item.
> If I forget the name and just think of it as ‘X’ then I can see that it’s just something to hang properties from that we think might be specific to archives. I think it’s kind of as simple as that….?  But that’s why it’s probably best to drop ‘item’ - I know the name doesn’t matter, but I think it’s confusing.
> Jane.
> On 11 Apr 2017, at 14:46, Giovanni Michetti <giovanni.michetti@ubc.ca> wrote:
> Hi Richard,
> thank you for further explanation.
> I'm sorry, but I still don't get your point.
> ArchivedItem is "an item in an archival collection", so it is included in an archival collection by definition. Putting ArchiveCollection as a sub-class of ArchivedItem means that ArchiveCollection is a type of ArchivedItem, which is not consistent with the definition of ArchiveCollection ("A collection and/or archive of physical or digital items").
> From your words, I understand that your choice was driven by the need for specific properties. If that's the case, I wonder why we can't simply extend the properties of Thing, or find some other solution.
> Giovanni
> On 11/04/2017 14:45, Richard Wallis wrote:
> Hi Giovanni,
> Your view of the generic nature of ArchiveCollection ("Therefore, a fonds, a series, a subseries, a collection, a set of sparsed objects may all be subsumed under ArchiveCollection according to its definition.") is what I had in mind when I made the original proposal.
> Both Jane and you express confusion as to why ArchiveCollection is a sub-class of ArchivedItem, which is initially understandable.  The reason I proposed it that way is to make pragmatic use of the way Schema.org is constructed.
> ArchivedItem <http://archive.sdo-archive.appspot.com/ArchivedItem>, when added as an additionalType of any other Thing (CreativeWork, Product, whatever), effectively makes available properties to describe attributes of its membership in an archive (provenance, accessAndUse, itemCondition, location, transfer, etc.). If the Type of Thing is unknown, ArchivedItem could potentially be used as the only Schema Type.
> When looking to describe an ArchiveCollection, the majority of those properties would also be of use in its description.  To achieve this the proposal could have either individually added these properties to ArchiveCollection or, as I proposed, just make it a subtype of ArchivedItem.
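The inheritance argument above can be sketched in a few lines of Python. This is only an illustrative model, not Schema.org tooling: the `TYPES` table and the `available_properties` helper are hypothetical, the hierarchy is reduced to the types named in this thread, and the ArchivedItem property names are the ones listed in the message above as they stood in the draft proposal.

```python
# Hypothetical, heavily simplified model of the hierarchy discussed in this thread.
# Only a handful of types and properties are listed; the real vocabulary is far larger.
TYPES = {
    "Thing": {"supertypes": [], "properties": ["name", "description"]},
    "CreativeWork": {"supertypes": ["Thing"], "properties": ["creator"]},
    "Intangible": {"supertypes": ["Thing"], "properties": []},
    "ArchivedItem": {
        "supertypes": ["Intangible"],
        # Property names as quoted in the thread (draft proposal, subject to change).
        "properties": ["provenance", "accessAndUse", "itemCondition",
                       "location", "transfer"],
    },
    # The draft makes ArchiveCollection a subtype of both CreativeWork and ArchivedItem.
    "ArchiveCollection": {"supertypes": ["CreativeWork", "ArchivedItem"],
                          "properties": []},
}


def available_properties(type_name):
    """Return the properties usable on a type: inherited ones first, then its own."""
    collected = []

    def visit(name):
        node = TYPES[name]
        for parent in node["supertypes"]:
            visit(parent)
        for prop in node["properties"]:
            if prop not in collected:
                collected.append(prop)

    visit(type_name)
    return collected


if __name__ == "__main__":
    # ArchiveCollection declares no properties of its own in this sketch,
    # yet it picks up the archive-specific ones through ArchivedItem.
    print(available_properties("ArchiveCollection"))
```

In this sketch, adding the five properties to ArchiveCollection one by one would give the same result; subtyping just avoids repeating the list.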
> ~Richard.
> On 11 April 2017 at 13:06, Giovanni Michetti <giovanni.michetti@ubc.ca> wrote:
>     Hi Jane,
>     I would stick to the definition of ArchiveCollection, which is "A
>     collection and/or archive of physical or digital items."
>     (http://archive.sdo-archive.appspot.com/ArchiveCollection).
>     The Archival Extension doesn't define what an archive is (as a set
>     of objects--an archive is either an institution or an organization,
>     according to the definition of Archive). However, it is quite clear
>     that the definition of ArchiveCollection intends to cover any
>     aggregation of items, that is, the term 'archive' in the definition
>     is used in a very generic sense. Therefore, a fonds, a series, a
>     subseries, a collection, a set of sparsed objects may all be
>     subsumed under ArchiveCollection according to its definition.
>     Using a single class to identify any type of aggregation (including
>     no aggregation at all) is consistent with the most relevant archival
>     standards: ISAD uses "Unit of description" and EAD uses "Component".
>     Recently, ICA proposed a draft model (RiC) where they identified two
>     classes, Record and RecordSet (along with RecordComponent), which is
>     a bit different from the other models, yet is based on a single
>     class identifying any aggregation--that is, no need for fonds,
>     series, etc.
>     We can discuss whether we need to distinguish between the single
>     item and its aggregations, or it is better to just stick to a
>     simpler model, i.e. "Component" as in EAD. However, turning to your
>     questions, I don't see any problem in considering both your examples
>     as being instantiated under ArchiveCollection. The same for the
>     properties.
>     I don't understand very well why ArchiveCollection is a sub-class of
>     ArchivedItem in the Extension, so I share your doubts.
>     As I wrote in an earlier message, I have many doubts about this
>     model. For this reason, I started investigating it further with some
>     colleagues of InterPARES Trust, in order to provide some systematic
>     comments on the Archival Extension. My aim is to share the comments
>     in a month.
>     Regards
>     Giovanni
>     On 11/04/2017 11:16, Jane Stevenson wrote:
>         Hi there,
>         I had a huge email written as I was working this out, but I’ve
>         tried my best to distill it down to one essential question…..
>         There is a type ‘ArchiveCollection’, which has ‘super types’ of
>         ‘CreativeWork’ and ‘ArchivedItem’ with properties we can use to
>         describe our thing(s).
>         To take an example, let’s say I wanted to have schema.org
>         markup attached to:
>         A collection or ‘top level’ description:
>         https://archiveshub.jisc.ac.uk/data/gb2607-ec/1-12
>         A lower level description:
>         https://archiveshub.jisc.ac.uk/data/gb2607-ec/1-12/ec/7
>         All I know about these are that one is ‘top level’ so that there
>         are no parent levels above it, but there may be child levels.
>         The other is lower level, so it has at least one parent level.
>         Can I just treat the lower level ‘thing(s)’ as
>         type=ArchiveCollection? So, can I use the properties from
>         CreativeWork and ArchivedItem for both the top level and lower
>         level group of stuff?
>         I don’t want to distinguish between collection and item actually
>         within the archive; I just want to apply schema.org
>         markup using the appropriate types and
>         associated properties.
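One possible reading of this question, as a minimal sketch: both descriptions carry the same type, and the lower level simply points at its parent. The `archive_markup` helper and the placeholder names are hypothetical. `isPartOf` is an existing CreativeWork property, but using it to link levels is an assumption here, not something the draft proposal prescribes. That the second URL describes a child of the first is also assumed from the URL paths.

```python
import json


def archive_markup(url, name, parent_url=None):
    """Build a JSON-LD description typed as ArchiveCollection (hypothetical helper).

    The same type is used whether the description is top level or lower level;
    only the optional link to a parent differs.
    """
    doc = {
        "@context": "http://schema.org",
        "@type": "ArchiveCollection",
        "@id": url,
        "url": url,
        "name": name,
    }
    if parent_url is not None:
        # Assumption: hierarchy expressed with the generic CreativeWork property isPartOf.
        doc["isPartOf"] = {"@type": "ArchiveCollection", "@id": parent_url}
    return doc


# URLs are the two examples from this message; the names are placeholders.
top_level = archive_markup(
    "https://archiveshub.jisc.ac.uk/data/gb2607-ec/1-12",
    "Placeholder name for the top level description",
)
lower_level = archive_markup(
    "https://archiveshub.jisc.ac.uk/data/gb2607-ec/1-12/ec/7",
    "Placeholder name for the lower level description",
    parent_url=top_level["@id"],
)

if __name__ == "__main__":
    print(json.dumps([top_level, lower_level], indent=2))
```

Nothing in the markup says whether a description is a fonds, a series or a single item, which matches the wish not to distinguish levels within the archive.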
>         Richard defined Collection:
>         “ArchiveCollection: The collection/grouping/assemblage of
>         archived items. Descriptive properties reference the collection
>         as a whole.”
>         I want to separate this out from what archivists think of as an
>         archive collection, and treat it simply as a ‘group of things’
>         or even just one thing if that represents a stand-alone
>         collection. Is this correct?
>         archive.schema.org defines
>         ‘ArchivedItem’ as ‘an item in an archive collection’. But I
>         thought it was a ‘type’ that is applied to ArchiveCollection? I
>         didn’t think it actually related to ‘item’ meaning a single thing.
>         I think there is some confusion in the documentation between the
>         term ‘ArchivedItem’, which I understand to be a type that can be
>         applied to an ArchiveCollection, with properties of
>         ‘archive-ness’,  and an actual item in a collection (and we
>         don’t usually describe single items anyway). It maybe doesn’t
>         help that the properties within ArchivedItem are ‘item’ - e.g.
>         itemDescription, itemLocation, itemProvenance. Can I see them as
>         archiveunitDescription, archiveunitLocation, archiveunitProvenance?
>         NB - that’s why in EAD we use ‘unit’ and not anything like
>         ‘item’  - because we can only know that it is a unit within a whole.
>         cheers
>         Jane

Received on Tuesday, 11 April 2017 15:22:09 UTC
