Re: [model] Modeling a Tweet: Tags from Jacob Jett on 2015-06-18 (public-annotation@w3.org from June 2015)

From: Jacob Jett <jjett2@illinois.edu>
Date: Thu, 18 Jun 2015 13:47:56 -0500
To: Randall Leeds <randall@bleeds.info>
Cc: Robert Sanderson <azaroth42@gmail.com>, Doug Schepers <schepers@w3.org>, TB Dinesh <dinesh@servelots.com>, W3C Public Annotation List <public-annotation@w3.org>
Message-ID: <CABzPtBLnQyjW6Z3cjxpjLvwONPgjK3q_PNTFPnD4WFV8-aJ-Kg@mail.gmail.com>
Hmmm, this looks a bit problematic.

On Thu, Jun 18, 2015 at 12:42 PM, Randall Leeds <randall@bleeds.info> wrote:

> In this case, #love means "this tweet expresses love" with the implication
> that it is the user who loves whatever else the tweet references.
>
This is an assumption. Even if there's a basis for it, it may still be
wrong.

> Although, you may be right that someone is trying to say, "look at this
> #love over here (link to some other resource that depicts love)", so I'm
> willing to bet one can construct cases where I'm wrong.
>
> However, I'm not sure it hurts to suggest an interpretation since the
> hashtag is a folksonomic tag that hasn't got a definitive interpretation to
> begin with, yet the fact that it's used heavily in absence of any other
> explicit referents suggests to me that the most natural interpretation is
> as a reflexive tagging of the tweet that contains it.
>
I would say that it probably does hurt to suggest an interpretation. It
weirdly interjects developers into conversations that they aren't actually
party to. If it were me I wouldn't want someone assuming that my tweet
means X and then telling the world that I meant X when I really meant Y.

Assumptions are quite dangerous things. They're one of the leading problems
for statistical (or computational if you prefer) science today. Even stats
textbooks get it wrong (there's a great racial profiling example of this in
Pearson's 3rd ed stats textbook).

The moral of the story is that the risks of getting this wrong outweigh
the benefits that we can realize by making the assumptions in the first
place. So it's better if we are very cautious and conservative with any
assumptions we do make, and best if we can make none or as few as possible.

> In practice, this is what it does as well: causes that tweet to show up in
> a search for that tag. It categorizes the tweet, and not any links it
> contains to other resources, within the Twitter system.
>
> On Thu, Jun 18, 2015, 10:37 Randall Leeds <randall@bleeds.info> wrote:
>
>> I tentatively disagree, Rob. When one expresses love about a thing by
>> using #love they generally also include a link to the thing (either by
>> explicitly including it in their tweet or by virtue of using a reply
>> function) but in either case the new tweet is an expression of love (for
>> whatever resources the tweet targets).
>>
>>
Actually this turns out not to be the case. This is because tweets aren't
annotations, they're discourse carriers, and very human discourse carriers
at that. Unless you've been tracking the whole line of discourse (which
could be hundreds or even thousands of tweets long) then you have no
reliable means of understanding the contextual clues that let human beings
resolve the tweet's "aboutness" (and indeed if we consider only the one
tweet, then we have to admit that there are virtually no contextual clues).

For instance, it could easily be the case that #love refers to the tweet
being replied to, or some portion of the content in the tweet being replied
to, or entities named in the tweet being replied to; the list goes on. It
could even be the case that it refers to some abstract thing (i.e., a
non-resource).

We have broached this topic (discourse annotations) in the past (use cases
from Digital Emblematica, digital courseware apps, etc.) but never really
considered how best to handle it. E.g., if I reply to a comment, am I
targeting the comment, the thing that the comment is about, or both? If, in
the context of our two annotations, we're discussing a particular thing
(using a simple remark-respond-respond pattern), then I'm inclined to
suspect that we are really targeting a composite target that consists of
the initial (or preceding) comment annotation and the actual target of that
comment annotation. Note though that at no time could the reply be part of
the initial comment.
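
To make the composite-target idea a little more concrete, here is a rough
Python sketch. It assumes that the oa:Composite multiplicity construct (with
"item" as its JSON-LD key) is an acceptable way to group the two targets,
which is not something we have agreed on. The helper name
reply_with_composite_target and all of the example.org URIs are made up for
illustration.

```python
import json


def reply_with_composite_target(reply_id, reply_text, comment_annotation):
    """Build a reply that targets the preceding comment AND that comment's target.

    Hypothetical helper: grouping the two targets with oa:Composite is an
    assumption made for this sketch, not an agreed modeling decision.
    """
    return {
        "@id": reply_id,
        "@type": "oa:Annotation",
        "motivation": "oa:replying",
        "body": {"value": reply_text},
        "target": {
            "@type": "oa:Composite",
            "item": [
                comment_annotation["@id"],     # the comment being replied to
                comment_annotation["target"],  # the thing that comment is about
            ],
        },
    }


# Illustrative data only; none of these URIs are real.
comment = {
    "@id": "http://example.org/anno/comment1",
    "@type": "oa:Annotation",
    "motivation": "oa:commenting",
    "body": {"value": "This emblem looks misdated."},
    "target": "http://example.org/emblem/42",
}

reply = reply_with_composite_target(
    "http://example.org/anno/reply1", "Agreed, the motto is later.", comment
)
print(json.dumps(reply, indent=2))
```

The reply is a separate annotation that points at the comment; the comment
annotation itself is left untouched, so the reply never becomes part of it.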

Going back to the inclusion of hashtags within the comment text, it does
seem to me that the hashtag will be "about" the same target as the comment
text itself, and so we might be able to safely get away with having an
annotation with multiple bodies. We could probably even get away with
having a different motivation for each body (if it were a specific
resource), but only if they are independently "about" the same target. If
the homogeneous "aboutness" factor is lost, then we have two very different
annotations.
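
As a minimal sketch of that rule (in Python, with made-up URIs), the
hypothetical helper below keeps bodies together in one annotation only while
they share a target, and splits them into separate annotations as soon as the
"aboutness" differs. Putting a motivation on each body is itself an
assumption here, as noted elsewhere in this thread.

```python
def group_bodies_by_target(bodies):
    """Group (motivation, value, target) triples into one annotation per target.

    Bodies that are about the same target become multiple bodies of a single
    annotation; a body about something else gets an annotation of its own.
    """
    annotations = {}
    for motivation, value, target in bodies:
        anno = annotations.setdefault(
            target, {"@type": "oa:Annotation", "target": target, "body": []}
        )
        anno["body"].append({"motivation": motivation, "value": value})
    return list(annotations.values())


# Comment and tag are both about the thesis: one annotation, two bodies.
same_target = group_bodies_by_target([
    ("oa:commenting", "Indexing my thesis as #openannotations",
     "http://example.org/thesis"),
    ("oa:tagging", "openannotations", "http://example.org/thesis"),
])

# The tag is about the tweet instead: two very different annotations.
mixed_targets = group_bodies_by_target([
    ("oa:commenting", "Indexing my thesis as #openannotations",
     "http://example.org/thesis"),
    ("oa:tagging", "openannotations", "http://example.org/tweet/1"),
])

print(len(same_target), len(mixed_targets))
```

The hard part, of course, is knowing the target of each body in the first
place; the sketch simply takes that as given input.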

I'm with Rob. We need to be very careful about suggesting particular usages
in this space. Both because we frequently have no clue about what a user's
intention was and because we sometimes don't even know what the target of
the tweet is.

Regards,

Jacob



_____________________________________________________
Jacob Jett
Research Assistant
Center for Informatics Research in Science and Scholarship
The Graduate School of Library and Information Science
University of Illinois at Urbana-Champaign
501 E. Daniel Street, MC-493, Champaign, IL 61820-6211 USA
(217) 244-2164
jjett2@illinois.edu

On Thu, Jun 18, 2015, 10:35 Robert Sanderson <azaroth42@gmail.com> wrote:
>>
>>>
>>> The conflation of tags about the comment and tags about the target is
>>> something we should consider carefully when recommending any particular
>>> usage of the model in this space.  Translating from 140 characters and
>>> guessing the user's intent of the #hashtag convention seems unreliable at
>>> best.
>>>
>>> There are many examples of each in the Twitter space.  A tweet that
>>> expresses #love for something is clearly about the reference, not that the
>>> user loves their own tweet :)
>>>
>>> R
>>>
>>>
>>> On Thu, Jun 18, 2015 at 10:27 AM, Randall Leeds <randall@bleeds.info>
>>> wrote:
>>>
>>>> I have long believed there is a substantive difference between what
>>>> systems commonly call tags and hashtags, and believe that this is
>>>> demonstrated in common practice.
>>>>
>>>> A tag is often applied to a separate entity.
>>>>
>>>> A hashtag is applied to its bearing entity.
>>>>
>>>> The example provided, to my mind, should have the tweet text as the
>>>> target for some annotation.
>>>>
>>>> One can get fancier with this and say that each hashtag should be
>>>> rendered as a separate annotation on the text with positional selectors
>>>> describing the offsets of the hashtag texts in the tweet text, with the
>>>> motivation to link these to the full URI expansion of that hashtag (such as
>>>> the URL of that hashtag's collection/search page).
>>>>
>>>> For an example of this in the wild, see the app.net API where
>>>> mentions, links, hashtags, etc are described by a property of the status
>>>> update called "annotations".
>>>>
>>>> A hashtag annotates the tweet.
>>>>
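
The "one annotation per hashtag, with positional selectors" suggestion quoted
above could be sketched roughly as follows. This is only an illustration: the
regular expression is a simplification of Twitter's real hashtag rules, the
choice of oa:linking as the motivation is an assumption, and the function name
hashtag_annotations is hypothetical. The hashtag URL pattern follows the
https://twitter.com/hashtag/iiif link cited in the thread.

```python
import re

# Simplified pattern; Twitter's actual hashtag tokenization is more involved.
HASHTAG = re.compile(r"#(\w+)")


def hashtag_annotations(tweet_uri, tweet_text):
    """Return one annotation per inline hashtag, targeting its character offsets."""
    annotations = []
    for match in HASHTAG.finditer(tweet_text):
        annotations.append({
            "@type": "oa:Annotation",
            "motivation": "oa:linking",  # assumed motivation for this sketch
            "body": "https://twitter.com/hashtag/" + match.group(1),
            "target": {
                "@type": "oa:SpecificResource",
                "source": tweet_uri,
                "selector": {
                    "@type": "oa:TextPositionSelector",
                    "start": match.start(),  # offset of the '#'
                    "end": match.end(),      # offset just past the tag text
                },
            },
        })
    return annotations


text = ("Been a while. Indexing my phd thesis transcription as "
        "#openannotations towards #iiif search demo implementation")
annos = hashtag_annotations(
    "https://twitter.com/azaroth42/status/607727122975739905", text
)
for anno in annos:
    selector = anno["target"]["selector"]
    print(anno["body"], text[selector["start"]:selector["end"]])
```

Note that this records only where each hashtag sits in the tweet text and what
it links to; it deliberately says nothing about what the hashtag is "about".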
>>>> On Thu, Jun 18, 2015, 09:28 Doug Schepers <schepers@w3.org> wrote:
>>>>
>>>>> Hi, Dinesh–
>>>>>
>>>>> Thanks for tweaking this.
>>>>>
>>>>> The issue I was trying to resolve was not the distributed part, which
>>>>> you picked up on, but rather the conflict introduced by the data model
>>>>> having a strict separation between tagging and comment bodies.
>>>>>
>>>>> If anyone has any thoughts on that, please let us know. I'm not
>>>>> completely satisfied with my solution, but I can't think how else to
>>>>> do it.
>>>>>
>>>>> Regards–
>>>>> –Doug
>>>>>
>>>>> On 6/18/15 7:09 AM, TB Dinesh wrote:
>>>>> > Thanks Doug for this example.
>>>>> >
>>>>> > The way we have been thinking about this (in the swtr.us framework) is
>>>>> > that annotations will lead to 3rd party services that understand that
>>>>> > this (@id and t1 below) annotation maps to a tweet model and can
>>>>> > assist in tweeting it for you, provided the service is permitted to
>>>>> > look through your annotation repo.
>>>>> >
>>>>> > I will first try to repurpose your JSON-LD example so it reads a bit
>>>>> > differently.
>>>>> > First, the annotation is in a repo somewhere (rewriting the @id just to
>>>>> > drive home that this object (t1) is identified by another creator --
>>>>> > and not Twitter).
>>>>> > Also, I am using body1, body2 and body3 as local ids (with effective ids
>>>>> > being t1.body1, t1.body2, t1.body3) and don't know what the right
>>>>> > syntax is to do this.
>>>>> > Note that I changed the motivation to tweeting (from commenting) so as
>>>>> > to make it easy for the 3rd party service to pick this up for tweeting.
>>>>> >
>>>>> > t1:
>>>>> >
>>>>> > {
>>>>> >    "@id": "https://annotation.repo/azaroth42/607727122975739905",
>>>>> >    "@type": "oa:Annotation",
>>>>> >    "annotatedBy": "https://twitter.com/azaroth42/",
>>>>> >    "annotatedAt": "2015-06-07T12:00:00Z",
>>>>> >    "serializedAt": "2013-02-04T17:53:00Z-8",
>>>>> >    "body": [
>>>>> >      {
>>>>> >        "@id": "body1",
>>>>> >        "motivation": "oa:tweeting",
>>>>> >        "value": "Been a while. Indexing my phd thesis transcription as #openannotations towards #iiif search demo implementation"
>>>>> >      },
>>>>> >      {
>>>>> >        "@id": "body2",
>>>>> >        "motivation": "oa:tagging",
>>>>> >        "value": "openannotations"
>>>>> >      },
>>>>> >      {
>>>>> >        "@id": "body3",
>>>>> >        "motivation": "oa:tagging",
>>>>> >        "value": "iiif"
>>>>> >      }
>>>>> >    ]
>>>>> > }
>>>>> >
>>>>> > Now #openannotations and tag "openannotations" will get different
>>>>> > services to pick up the intent. Twitter would know what to do with
>>>>> > #openannotations, and t1's tags are not very useful for Twitter, while
>>>>> > another service can indeed help azaroth42 connect to other meanings of
>>>>> > 42, if any, using these tags.
>>>>> >
>>>>> > -d
>>>>> >
>>>>> >
>>>>> > On Thu, Jun 18, 2015 at 12:55 PM, Doug Schepers <schepers@w3.org> wrote:
>>>>> >> Hi, folks–
>>>>> >>
>>>>> >> We've talked before about how different kinds of popular social media,
>>>>> >> like Twitter tweets or Facebook posts, could be modeled as annotations.
>>>>> >>
>>>>> >> Tim Cole put together a diagram of this [1], and I made a slide inspired
>>>>> >> by Tim's diagram [2] (use the down arrow to step through the slide).
>>>>> >>
>>>>> >> But all the recent talk of multiple bodies and motivations made me
>>>>> >> realize that there may be something hard to represent in the data model:
>>>>> >> inline hashtags in a tweet.
>>>>> >>
>>>>> >> As an example, here's the text from a tweet by Rob Sanderson, from 7 June
>>>>> >> [3], which contains two inline hashtags:
>>>>> >> "Been a while. Indexing my phd thesis transcription as #openannotations
>>>>> >> towards #iiif search demo implementation"
>>>>> >>
>>>>> >> Inline hashtags are pretty common, and they blend tags and comment into a
>>>>> >> single common body. You can't remove the tags from the comment body,
>>>>> >> because they're part of the sentence structure; you can't only represent
>>>>> >> the tags as part of the comment body, because they have special status as
>>>>> >> search terms [4].
>>>>> >>
>>>>> >> How can we model this?
>>>>> >>
>>>>> >> The best I could come up with is to duplicate the hashtags in both the
>>>>> >> comment body and in their own bodies. Here's some example JSON-LD (please
>>>>> >> excuse the imprecise/incorrect inclusion of motivation on each body, it's
>>>>> >> just illustrative.):
>>>>> >>
>>>>> >> {
>>>>> >>    "@id": "https://twitter.com/azaroth42/status/607727122975739905",
>>>>> >>    "@type": "oa:Annotation",
>>>>> >>    "annotatedBy": "https://twitter.com/azaroth42/",
>>>>> >>    "annotatedAt": "2015-06-07T12:00:00Z",
>>>>> >>    "serializedAt": "2013-02-04T17:53:00Z-8",
>>>>> >>    "body": [
>>>>> >>      {
>>>>> >>        "@id": "http://example.org/body1",
>>>>> >>        "motivation": "oa:commenting",
>>>>> >>        "value": "Been a while. Indexing my phd thesis transcription as #openannotations towards #iiif search demo implementation"
>>>>> >>      },
>>>>> >>      {
>>>>> >>        "@id": "http://example.org/body2",
>>>>> >>        "motivation": "oa:tagging",
>>>>> >>        "value": "openannotations"
>>>>> >>      },
>>>>> >>      {
>>>>> >>        "@id": "http://example.org/body3",
>>>>> >>        "motivation": "oa:tagging",
>>>>> >>        "value": "iiif"
>>>>> >>      }
>>>>> >>    ]
>>>>> >> }
>>>>> >>
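
The duplication approach described above (hashtags kept inline in the comment
body and repeated as tag bodies of their own) can be generated mechanically. A
minimal Python sketch, assuming a simplified hashtag pattern and keeping the
per-body motivation that the original message already flags as imprecise; the
function name tweet_to_annotation is hypothetical:

```python
import json
import re


def tweet_to_annotation(tweet_uri, tweet_text):
    """Build one annotation with a comment body plus one tag body per hashtag."""
    # Simplified extraction; real hashtag rules are more involved.
    tags = re.findall(r"#(\w+)", tweet_text)
    bodies = [{"motivation": "oa:commenting", "value": tweet_text}]
    bodies += [{"motivation": "oa:tagging", "value": tag} for tag in tags]
    return {"@id": tweet_uri, "@type": "oa:Annotation", "body": bodies}


text = ("Been a while. Indexing my phd thesis transcription as "
        "#openannotations towards #iiif search demo implementation")
anno = tweet_to_annotation(
    "https://twitter.com/azaroth42/status/607727122975739905", text
)
print(json.dumps(anno, indent=2))
```

The sketch leaves the target out entirely, since what the tweet and its tags
are actually about is exactly the open question in this thread.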
>>>>> >>
>>>>> >> Another solution might be to allow nested bodies, but that seems like it
>>>>> >> could get complicated.
>>>>> >>
>>>>> >> Thoughts?
>>>>> >>
>>>>> >>
>>>>> >> [1] http://www.w3.org/Talks/2015/schepers-annotation-journalism/data-model-anatomy.png
>>>>> >> [2] http://www.w3.org/Talks/2015/schepers-annotation-journalism/data-model-anatomy.svg#showall
>>>>> >> [3] https://twitter.com/azaroth42/status/607727122975739905
>>>>> >> [4] https://twitter.com/hashtag/iiif
>>>>> >>
>>>>> >> Regards–
>>>>> >> –Doug
>>>>> >>
>>>>>
>>>>>
>>>
>>>
>>> --
>>> Rob Sanderson
>>> Information Standards Advocate
>>> Digital Library Systems and Services
>>> Stanford, CA 94305
>>>
>>
Received on Thursday, 18 June 2015 18:49:08 UTC