Re: Enrichment document

Dear Phil and other participants of the DWBP,

Sorry for the delay in replying. I was in Australia last week, and this
week has been quite busy. Only today were we, at UFMG, able to discuss
your message.

On 07-08-2015 09:59, Phil Archer wrote:
> Wagner, everyone,
>
> I've been reading through the Enrichment doc this morning and have 
> some comments/suggestions aimed at making use of the obvious expertise 
> at InWeb within our current framework.
Great!
>
> The sections on categorisation and segmentation seem, in my reading, 
> to be about specialisations of the more general topic of 'providing 
> metadata.' Would it be possible to look at the BPs in the metadata 
> section (http://www.w3.org/TR/dwbp/#metadata) and add to them?
>
Yes, that makes sense. We should evaluate such an addition.
> Whether we're talking about machine readable formats or human readable 
> stuff, metadata is important of course and, yes, machines are getting 
> better at extracting useful information from all kinds of sources.
>
Agreed.
> I find the section on Imputation particularly interesting. The details 
> of the techniques for doing this are outwith the scope of this WG but 
> recording that imputation techniques have been used would, I think, be 
> something to record using the DQV?

Ok. It makes sense.

>
> Entity Recognition, Data Disambiguation and Fusion - I think we could 
> derive discrete BPs about these, but I'd phrase it as "re-use other 
> people's identifiers," e.g. if you're providing data about a chemical 
> compound, use the same ID as everyone else (actually there are several 
> competing ID sets for chemical compounds). That helps disambiguation 
> and fusion.
>

Fusion may be a bit broader, but we could define a priori the context
that makes the joining reasonable.

> NB. Freebase is being shut down by Google. I'll forward a separate 
> mail that got into my inbox this week.
>
Ok.
> Another BP (I don't think we've covered this yet) is 'be consistent' 
> in your naming. So, if you refer to a country by its name, use the 
> same spelling and capitalisation: Brazil, Brasil, brasil, BR etc. are all 
> different.
>
Ok.
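
To make the 'be consistent' point concrete, here is a minimal sketch in
Python of mapping name variants to a single label. The alias table, the
function name canonical_country, and the choice of "Brazil" as the
canonical form are all hypothetical, chosen only for illustration; a real
dataset would more likely reuse an existing identifier set.

```python
# Hypothetical alias table: every known variant maps to one canonical label.
# The entries and the canonical form "Brazil" are illustrative assumptions.
COUNTRY_ALIASES = {
    "brazil": "Brazil",
    "brasil": "Brazil",
    "br": "Brazil",
}


def canonical_country(raw: str) -> str:
    """Return the canonical label for a country name, or raise if unknown."""
    # Trim whitespace and fold case so "brasil" and "Brasil" share one key.
    key = raw.strip().casefold()
    try:
        return COUNTRY_ALIASES[key]
    except KeyError:
        # Fail loudly rather than silently keeping an inconsistent label.
        raise ValueError(f"unknown country label: {raw!r}") from None


variants = ["Brazil", "Brasil", "brasil", "BR"]
normalised = {canonical_country(v) for v in variants}
print(normalised)  # all four variants collapse to one label
```

Raising on unknown labels is a design choice in this sketch: it surfaces
new variants for review instead of letting them through unchanged.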
> Might that sort of approach work? i.e. enriching the current BPs 
> (pardon the pun).
>
Yes, but we are not sure about the proper format for it. We also have
some doubts about your suggestions. Adriano will detail both points in
today's chat.

Another point is that the list of technique types may still see
additions or removals, which we can discuss at the F2F in 4 weeks.

Best,

Wagner
> Phil.
>

Received on Friday, 21 August 2015 12:48:10 UTC