- From: Jacob Jett <jjett2@illinois.edu>
- Date: Tue, 23 Jun 2015 15:24:20 -0400
- To: Doug Schepers <schepers@w3.org>
- Cc: Paolo Ciccarese <paolo.ciccarese@gmail.com>, Robert Sanderson <azaroth42@gmail.com>, Ivan Herman <ivan@w3.org>, Web Annotation <public-annotation@w3.org>
- Message-ID: <CABzPtB+w06SL7u_d=CpqV9oWLwKa4Q4n7XJGXF9WfhdPSP+Czw@mail.gmail.com>
Hi Doug,

Let me see if I can offer an intelligible counter-point.

On Tue, Jun 23, 2015 at 12:55 PM, Doug Schepers <schepers@w3.org> wrote:

> Hi, folks–
>
> Forgive me for (still) not understanding some of the subtleties of the issues here; I'll try to make a cogent argument anyway.
>
> I'm strongly against the notion of restricting the number of bodies (or targets) in an annotation.
>
> I look at it from the perspective of an annotator (that is, the end-user):
>
> Abby selects some text (the word "Julie"); she selects the "annotate" option from some menu (e.g. context-menu, sidebar, popup, menu-bar, keyboard shortcut, whatever). A dialog pops up, giving her the option of leaving a comment, offering a suggested change, adding tags, and so on. She types the comment, "Julie should be Julia, as mentioned in paragraph 2"; she types the suggested change, "Julia"; she adds the tags "#typo", "#personalname", and "#sigh".
>
> The resulting annotation has a single target (the word "Julie") and 3 bodies (the comment, the replacement text, and the tags).
>
> The thing is, what happens behind the scenes, in the depths of the annotation tool, should be completely opaque to the end user. A machine thinks that all these bodies apply to the target; it knows that the replacement text is meant to substitute for the selection text (the target); it knows that each of the tags should somehow be indexed for search with this target and body. But it doesn't know what any of the content /means/.

I'm not sure I understand how the machine knows that the replacement text is meant to be substituted for the selection text, especially if it doesn't know what the content *means*. That content could be anything (I think that is the point you were going for), but somehow the replacement text is an exception. I don't really follow how this can be...
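For concreteness, here is a rough JSON-LD sketch of the single annotation described above (one target, three kinds of body). The @id, source URI, selector details, and choice of oa:editing as the motivation are all invented for illustration; property names follow the Open Annotation CG vocabulary of the time, and this is a sketch, not a normative serialization:

```json
{
  "@context": "http://www.w3.org/ns/oa-context-20130208.json",
  "@id": "http://example.org/annos/1",
  "@type": "oa:Annotation",
  "motivatedBy": "oa:editing",
  "hasTarget": {
    "@type": "oa:SpecificResource",
    "hasSource": "http://example.org/document1",
    "hasSelector": { "@type": "oa:TextQuoteSelector", "exact": "Julie" }
  },
  "hasBody": [
    { "@type": ["cnt:ContentAsText", "dctypes:Text"],
      "chars": "Julie should be Julia, as mentioned in paragraph 2" },
    { "@type": ["cnt:ContentAsText", "dctypes:Text"], "chars": "Julia" },
    { "@type": ["oa:Tag", "cnt:ContentAsText"], "chars": "#typo" },
    { "@type": ["oa:Tag", "cnt:ContentAsText"], "chars": "#personalname" },
    { "@type": ["oa:Tag", "cnt:ContentAsText"], "chars": "#sigh" }
  ]
}
```

Note that beyond the oa:Tag typing, nothing in this structure says which body is the comment and which is the replacement text; that ambiguity is exactly what this thread is about.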
> The machine doesn't know that Abby referred both to the target and to the instance of "Julia" in paragraph 2; it only knows about the explicit link to the target, "Julie"; a human can use the information in the content body, but the machine can't (unless it's a smarter machine than we're talking about today).

It could (and arguably should) know this. The entire concept of specific resources and selectors is predicated on making this kind of functionality possible. It seems a bit odd that the tool in the example somehow handles a use case the model isn't designed for (recognizing that some content is replacement text) but then doesn't exercise the specific resource + selector structure that is relatively basic to the model...

> The machine doesn't know that Abby added the tag "#typo" as a signal for the kind of correction she was offering, or that she added the tag "#personalname" as a note for herself for a different project she's working on, or why she added the tag "#sigh"; in fact, another human probably wouldn't know what the tag "#sigh" means… was she bored? is she irritated at all the typos, in which case the tag "#sigh" is actually kind of an annotation on the tag "#typo"? was it a wistful sigh because she loves Julia?

I get that it doesn't know what to make of these hashtags, but I'm still stuck on how it does know what replacement text is. It seems unreasonable that our engineers have figured out how to identify and make machine-actionable some random content meant to be replacement text but then can't figure out what to do with hashtags...

> None of this matters to the machine, which only needs to perform a set of tasks:
> 1) present the human reader/editor with the information, and let the human decide if they want to accept the change;
> 2) provide an affordance (say, a button) to replace the selection text with the replacement text;
> 3) if the human decides to make the change, perform the change operation.
> That's it. There are other ancillary tasks, like letting users do whole-text searches or tagged-index searches, and so on, but for the core task we're talking about, the machine doesn't need to be any smarter than that.

Right, the existing model can do this when applied correctly (i.e., by separating the superfluous annotations from the editorial annotation). If I'm understanding correctly, you basically want to construct an annotation system that identifies body content conforming to the pattern of "replacement text" but does nothing with any other kind of body content. This raises the question of what purpose the other body content serves. Now, in the example you've illustrated, the data entry system has a way to separate the different content types into separate bodies. However, if we only care about the replacement-text content type, then why not just pitch the other bodies into the big circular file in the sky? They have no role in the devised annotation system, so why waste any system resources on them?

Presumably, we might want to actually preserve this other annotation content and (re)serialize it for consumption by exterior contexts. The best, most interoperable way to do this would be to represent the hashtags as distinct annotations targeting the original editorial annotation. Nothing about this use case actually requires support for annotations with multiple bodies (possessing multiple motivations).

> The idea of separating out this annotation into its constituent parts seems like overkill. I think it would surprise Abby to find that once she's published what she saw as a single annotation, it's broken up into multiple annotations that have to be shared or used separately, and she can't find her suggested change because the tag body wasn't indexed with the replacement-text body or the comment body, and so on.
> To her, it was a single act of creation, and it should be modeled that way; the only thing we know about her intent was that she made a single annotation, and that should be preserved.

This isn't overkill at all. This is simply the dichotomy between (user) perceptions and system design. Let's take your digestive tract as an analogy. You don't typically think about it, but it breaks your food up into different kinds of things and pumps them through to the various places best suited for their digestion. The user never really thinks about this. They simply consume the food and some time later they poop. They don't think about what happens in between; the entire process is opaque to them.

We engineers don't have this luxury. We have to figure out how the food is going to be broken down and processed so that nutrients can be absorbed. First it will need to be ripped up and crushed, so we design the jaws and teeth a certain way. Then it needs some chemical bathing to break down carbohydrates into sugars. Before it can get to the chemical bath, though, it has to be transferred to the place where the bathing is going to happen -- the stomach -- so we design a muscular tube, the esophagus, for this task.

By now you see where I'm going with this analogy. The user is going to deal with the interface that we design for them. How the annotation data is processed inside is a completely separate matter. Whether their annotation is functionally treated as four different annotations or a single annotation makes no difference to them; they don't care what is being digested where. We must be careful not to conflate presentational issues with data / process / workflow issues. The model doesn't constrain how the front end should look or how it should present information to the end user. It constrains how the data is to be interpreted internally.
If an edit is accompanied by some hashtags that provide context for human users, then there are two ways of dealing with the situation. Either the entire blob is a single document (an annotation with one body), or we have a way to distinguish between content types. If we can distinguish between content types, then we should represent the user's annotation as multiple annotations internally, because that lets us get the most mileage out of the information gained from being able to distinguish between the content types. At no time would (or should) the end user have to cope with this directly.

> Maybe another annotation interface might offer different, discrete options that elicit a different act of creation from the user, but the data model shouldn't constrain that.

The data model doesn't.

> As argued before, there is ambiguity in this kind of annotation…
>
> The ambiguity arises in part because we have made a data structure that is easy to generate and manipulate, so it is "lossy" with regards to all the expressiveness and inter- and intra-linkages it could have, but those would come at the price of complexity of format and stringent requirements on the user to disambiguate intent via the UI.
>
> The ambiguity mainly arises because of the nature of humans, who generate and detect complex patterns of behavior, and who have limited means to express their thoughts or intents.
>
> We can't solve either of these issues. We can only decrease the ambiguity a bit.

I'm not really following. Sure, the meaning of the hashtags may be ambiguous, but you already found a way to separate the replacement text from the rest of the document. The replacement text is unambiguous. There's nothing to gain and everything to lose by mushing the replacement text back into a single annotation with the ambiguous hashtags. Far better to annotate the replacement text with the ambiguous hashtags. One could even aggregate all of the hashtags into one tagging annotation.
Then this becomes a two-annotation solution that preserves the unambiguous replacement text. We have an existing pattern for this in the model. Surely this would be a better way to proceed than to warp the model to accommodate this one implementation-dependent use case. Shifting motivation to the individual bodies is going to seriously complicate the implementation of a large number of other use cases.

> Maybe another annotator, Beth, is far more precise in her annotations, such that there is almost no ambiguity; she separates out her annotations and is always exactly on point; she replies to her own annotations if there is any potential ambiguity; that's even easier for machines to "understand". But maybe another annotator, Chuck, is far more ambiguous in his annotations, suggesting irrelevant and irreverent changes, and adding comments and tags that are unhelpful or even contradictory.
>
> Web Annotations should allow for this full range of expression, even at the expense of machine comprehension.
>
> Please, let's try to keep the model simple by default, and slightly more complex for more complicated scenarios, and limit the concessions we make for machines when humans are the real end-users.

I agree, but the humans you're describing are developers and not the real end-users. Real end-users don't actually care how we make the sausage. They only care that it tastes delicious. The alternate pattern is only marginally more verbose, already exists in the model, and doesn't needlessly complicate the other existing use cases.

> To Paolo's points about motivations vs roles, or how we structure the annotations, or having different serializations for JSON and JSON-LD, I'm open to any of these suggestions; I suggested "motivation" because it seemed like it met a need, but if it has to be modeled a different way, that's okay, too.
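For comparison, the two-annotation pattern sketched above (an editorial annotation, plus a tagging annotation that targets it) might be serialized roughly like this. The URIs are invented for illustration, and the free-text comment would become a third annotation (motivated by oa:commenting) in the same way:

```json
[
  {
    "@context": "http://www.w3.org/ns/oa-context-20130208.json",
    "@id": "http://example.org/annos/edit-1",
    "@type": "oa:Annotation",
    "motivatedBy": "oa:editing",
    "hasBody": { "@type": ["cnt:ContentAsText", "dctypes:Text"], "chars": "Julia" },
    "hasTarget": {
      "@type": "oa:SpecificResource",
      "hasSource": "http://example.org/document1",
      "hasSelector": { "@type": "oa:TextQuoteSelector", "exact": "Julie" }
    }
  },
  {
    "@context": "http://www.w3.org/ns/oa-context-20130208.json",
    "@id": "http://example.org/annos/tags-1",
    "@type": "oa:Annotation",
    "motivatedBy": "oa:tagging",
    "hasBody": [
      { "@type": ["oa:Tag", "cnt:ContentAsText"], "chars": "#typo" },
      { "@type": ["oa:Tag", "cnt:ContentAsText"], "chars": "#personalname" },
      { "@type": ["oa:Tag", "cnt:ContentAsText"], "chars": "#sigh" }
    ],
    "hasTarget": "http://example.org/annos/edit-1"
  }
]
```

Each annotation here keeps a single, unambiguous motivation, each can be addressed (moderated, indexed, deleted) independently, and every body can still be reliably expected to be "about" its target.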
> Finally, I want to suggest that if we go down a path of architectural purity and complexity, the data model is far less likely to be adopted by authoring tools, so let's keep that in mind.
>
> Regards–
> –Doug

Again, I'm not sure I follow. Sure, we could reinvent things like Twitter using the annotation model, but at the end of the day the real purpose of the model is to link resource A to resource B. It's as simple as that. Everything else, like determining the content types in the resources, is extra. The editorial workflow use case is extremely complicated because it has more than the average number of extras. Not only do I have to determine the content in A and B, but I also have to figure out what to do with the content in A (and once I've done that, then what?). That's a lot more functionality than an annotation should be expected to have. It requires a lot more effort and a more robust architecture to accommodate. There's no easy way to accommodate the editorial use case.

Ultimately this proposal is a -1 from me. Adopting it will completely throw away the core feature of the model -- that we may *reliably expect bodies to be "about" targets*.

Regards,

Jacob

_____________________________________________________
Jacob Jett
Research Assistant
Center for Informatics Research in Science and Scholarship
The Graduate School of Library and Information Science
University of Illinois at Urbana-Champaign
501 E. Daniel Street, MC-493, Champaign, IL 61820-6211 USA
(217) 244-2164
jjett2@illinois.edu

> On 6/21/15 9:17 PM, Paolo Ciccarese wrote:
>
>> I personally think the problem originates in the overloaded meaning that ‘motivatedBy’ gained with time. Originally we were using types, and we were subclassing Annotation to specify the desired annotation type (for instance Comment). To avoid type proliferation and potential incompatibility, we moved away from that construct and introduced ‘motivatedBy’.
>> "The Motivation for an Annotation is a reason for its creation”; why we created an annotation does not necessarily describe how the annotation is shaped. The ‘motivatedBy’ for an edit is “oa:editing” whether or not one or more descriptions, tags, or links to existing documents are provided. I always thought that assuming that, given a ‘motivatedBy’, I should know exactly how to ‘read' the annotation was a bit of a stretch… it never worked for me and, as the current discussion proves, it does not work for other use cases.
>>
>> I’ve always considered the bookmark in Firefox a good example. A bookmark consists of a URL, a description, and tags. The motivation is still ‘bookmarking’, and the multiple bodies allow all of that to be connected in one single annotation. It is true, though, that in this specific case we don’t have interpretation issues, as the Tags are modeled with a specific construct and we have only one textual body.
>>
>> Any time I needed to model something more complex, in Domeo, I resorted to structured bodies and named graphs, as I get all the flexibility I need by defining precisely the role of each item of the body. However, that increases the complexity of the resulting artifact.
>>
>> If we had to play with the current rules and introduce a role for each body of the annotation, one way would be to add a node like we did for Semantic Tags. But that would be verbose.
>>
>> Another way would be to change the rules and have a JSON format that is a compact version of the JSON-LD format, so that what Doug proposed - using something like hasRole in place of motivatedBy - makes sense in JSON and would be shaped with an intermediate node in JSON-LD. I am not sure whether somebody has mentioned this already (many threads of emails went by on this topic), and I am not sure this would be a good idea for interoperability reasons.
>> Yet another way I could think of, forgetting for a second JSON-LD, is to create a map of bodies so that in simple cases I would just look at the values of the map… and when I need to define roles I could attach that to the keys. Like a "bodies map".
>>
>> Paolo
>>
>> On Sun, Jun 21, 2015 at 6:02 PM, Robert Sanderson <azaroth42@gmail.com> wrote:
>>
>> Ivan, Jacob,
>>
>> Yes, the pre-CG models only allowed for one body and multiple targets. The discussion in the CG was similar to the current one (one comment with several tags, edit text with reason, etc.) and the desire to keep them as a single annotation, which led to multiple bodies and multiple targets.
>>
>> While it would be a departure from the CG's model, if there's a consistent, acceptable and simpler model that supports the same use cases, it would be good to go with that :)
>>
>> Rob
>>
>> On Sun, Jun 21, 2015 at 2:52 PM, Jacob Jett <jjett2@illinois.edu> wrote:
>>
>> Hi Ivan,
>>
>> As memory serves, multiple bodies and multiple targets were never restricted by the CG. In fact, as I recall, it was designed to allow a number of bodies that apply equally to a number of targets within the context of the same motivation. This might have been a variety of the tagging use case that got spun out as a "needed" alternative to choices and composites.
>>
>> Regards,
>>
>> Jacob
>>
>> _____________________________________________________
>> Jacob Jett
>> Research Assistant
>> Center for Informatics Research in Science and Scholarship
>> The Graduate School of Library and Information Science
>> University of Illinois at Urbana-Champaign
>> 501 E.
>> Daniel Street, MC-493, Champaign, IL 61820-6211 USA
>> (217) 244-2164
>> jjett2@illinois.edu
>>
>> On Sun, Jun 21, 2015 at 8:47 AM, Ivan Herman <ivan@w3.org> wrote:
>>
>> Rob,
>>
>> I am sympathetic to your proposal. However, we owe it to ourselves to look at the reasons why we departed from the restriction of the Annotation CG's document and introduced multiple bodies. Shame on me, but I do not remember the reasons we made the change, and I did not find the traces in the mailing list. Can you remind me/us (or point at the relevant mails) of the issues we thought we were solving by allowing multiple bodies?
>>
>> Thanks
>>
>> Ivan
>>
>> On Fri, June 19, 2015 4:16 pm, Robert Sanderson wrote:
>> > Tim, all,
>> >
>> > On Fri, Jun 19, 2015 at 9:06 AM, Timothy Cole <t-cole3@illinois.edu> wrote:
>> >
>> >> In my mind, allowing body-level motivations, at least for the use cases so far proposed, is simply a way to conflate what should be separate annotation graphs.
>> >
>> >> For example, should the protocol have a way of allowing posting of multiple (related or chained) annotations in a single transaction? (Does it already?)
>> >
>> > It does not. LDP does not have a notion of transactions at all. And (as you know) we don't have a notion of sets/lists of annotations beyond the unordered containership.
>> >
>> >> Anyway, I don’t want to flog a dead horse, but since Doug asked directly about slippery slopes, I did want to elaborate on the trouble we might get ourselves into if we allow multiple bodies that relate to multiple targets and to each other in substantively different ways. I still do think there is a slippery slope potential here.
>> > This seems like a good opportunity to re-evaluate multiple bodies as a feature at all. To my knowledge, all multiple-body use cases have been for different motivations. Most frequently it has been a comment plus tags that are all really about the same target. If we went to a multiple-annotation model for edit + comment, we could more reliably also go to a multiple-annotation model for tag(s) + comment as well. Then the individual annotations could be addressed individually, for example to moderate a tag without at the same time moderating the comment, or vice versa.
>> >
>> > Rob
>> >
>> > --
>> > Rob Sanderson
>> > Information Standards Advocate
>> > Digital Library Systems and Services
>> > Stanford, CA 94305
>>
>> --
>> Ivan Herman, W3C Team
>> URL: http://www.w3.org/People/Ivan/
>> FOAF: http://www.ivan-herman.net/foaf.rdf
>>
>> --
>> Rob Sanderson
>> Information Standards Advocate
>> Digital Library Systems and Services
>> Stanford, CA 94305
>>
>> --
>> Dr. Paolo Ciccarese
>> Principal Knowledge and Software Engineer at PerkinElmer Innovation Lab
>> Assistant Professor in Neurology at Harvard Medical School
>> Assistant in Neuroscience at Mass General Hospital
>> ORCID: http://orcid.org/0000-0002-5156-2703
Received on Tuesday, 23 June 2015 19:25:33 UTC