Re: [MMSEM-UC] Music Use Case status from Giovanni Tummarello on 2007-04-26 (public-xg-mmsem@w3.org from April 2007)

From: Giovanni Tummarello <g.tummarello@gmail.com>
Date: Thu, 26 Apr 2007 02:06:57 +0100
To: MMSem-XG Public List <public-xg-mmsem@w3.org>
Message-ID: <462FFB31.90202@deri.org>
Hi guys,

here are some contributions on the music use case, first of all
mentioning "lyrics" and "low level metadata" in the "complete
description of a popular song". Lyrics are definitely needed if the
description is to be "complete". Though not too much can be done with
RDF, not all is lost, I suggest.
The commuter playlist is also described in more detail in its
components; more details can be added, I guess, and of course plenty of
fixes (I will try to do that tomorrow, a bit before the telecon).
Talk to you tomorrow
Giovanni
 

----------------------------

''Lyrics as metadata''

For a complete description of a song, lyrics must be considered as well.
While lyrics could in a sense be regarded as "acoustic metadata", they
are per se actual information entities which themselves have annotation
needs.
Lyrics share many similarities with metadata, e.g. they usually refer
directly to a well specified song, but exceptions exist, as different
artists might sing the same lyrics, sometimes even with different musical
bases and styles. Most notably, lyrics often have different authors than
the music and the voice that interprets them, and might be composed at a
different time.
Lyrics are not simple text; they often have a structure which is
similar to that of the song (e.g. a chorus), so they justify the use
of a markup language with well specified semantics.
Unlike the previous types of metadata, however, they are not well suited
to be expressed using the W3C Semantic Web initiative languages, e.g. in
RDF. While RDF has been suggested instead of XML for representing
texts in situations where advanced and multilayered markup is wanted [Ref
RDFTEI], the markup needs of music lyrics are usually limited to
indicating particular sections of the song (e.g. intro, outro, chorus)
and possibly the performing character (e.g. in duets).
While there is no widespread standard for machine encoded lyrics, some
have been proposed [LML][4ML] which in general fit the need for
formatting and differentiating the main parts.
An encoding of lyrics in RDF would be of limited use but is still
possible. RDF based queries would rely only on the text search operators
of the query language (and are therefore likely to be limited to "lyrics
that contain word X"). More complex queries could be possible if several
characters are performing in the lyrics and each is denoted by an RDF
entity which has other metadata attached to it (e.g. the metadata
described in the examples above).
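
A minimal sketch of the two kinds of query just mentioned, using plain
Python tuples in place of a real RDF store. The "ex:" property names, the
song, the sections and the characters are all made up for illustration
and do not come from any existing ontology.

```python
import re

# Hypothetical triples: (subject, predicate, object).
TRIPLES = [
    ("song:duet1", "ex:hasSection", "sec:1"),
    ("sec:1", "ex:sectionType", "verse"),
    ("sec:1", "ex:performedBy", "char:anna"),
    ("sec:1", "ex:text", "I walk alone along the river"),
    ("song:duet1", "ex:hasSection", "sec:2"),
    ("sec:2", "ex:sectionType", "chorus"),
    ("sec:2", "ex:performedBy", "char:marco"),
    ("sec:2", "ex:text", "Come home, come home tonight"),
    ("char:anna", "ex:voice", "soprano"),
    ("char:marco", "ex:voice", "tenor"),
]


def sections_containing(word):
    """Plain text search: the 'lyrics that contain word X' query."""
    word = word.lower()
    return sorted(
        s for s, p, o in TRIPLES
        if p == "ex:text" and word in re.findall(r"[a-z']+", o.lower())
    )


def sections_by_voice(voice):
    """Join on the performer entity: sections sung by a given voice type."""
    performers = {s for s, p, o in TRIPLES if p == "ex:voice" and o == voice}
    return sorted(
        s for s, p, o in TRIPLES
        if p == "ex:performedBy" and o in performers
    )
```

The first function needs nothing but string matching; only the second one
actually benefits from the characters being entities with their own
metadata.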

It should be noted, however, that an RDF encoding would have the
disadvantage of complexity. In general it would require supporting
software (for example http://rdftef.sourceforge.net/) to be encoded, as
RDF/XML is difficult to write by hand. Also, contrary to an XML
based encoding, it could not be easily visualized in a human readable
way by, e.g., a simple XSLT transformation.

In both the RDF and the XML case, interesting processing and queries
(e.g. conceptual similarities between texts, moods etc.) would
require advanced textual analysis algorithms well outside the scope
of the XML or RDF languages.
Interestingly, however, it might be possible to use RDF descriptions to
encode the results of such advanced processing. Keyword extraction
algorithms (usually a combination of statistical analysis, stemming and
linguistic processing, e.g. using WordNet) can be successfully employed
on lyrics. The resulting representative "terms" can be encoded as
metadata of the lyrics or of the related song itself.
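
As a rough illustration of that last step, the sketch below counts word
frequencies after removing stopwords and stripping a few suffixes, then
emits the top terms as triples. The suffix stripping is only a crude
stand-in for real stemming and WordNet based processing, and the
"ex:hasTerm" property and the stopword list are assumptions made for the
example.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "and", "i", "you", "my", "in", "of", "to", "is",
             "me", "on"}


def crude_stem(word):
    """Strip a few common English suffixes (not a real stemmer)."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def extract_terms(lyrics, top_n=3):
    """Return the top_n most frequent stems, ties broken alphabetically."""
    words = re.findall(r"[a-z']+", lyrics.lower())
    counts = Counter(crude_stem(w) for w in words if w not in STOPWORDS)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_n]]


def terms_as_triples(song_id, terms):
    """Attach each extracted term to the song as a (hypothetical) triple."""
    return [(song_id, "ex:hasTerm", term) for term in terms]
```

Once the terms are stored this way, "songs about rain" becomes an
ordinary query on "ex:hasTerm" rather than a text search over the lyrics.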

''Lower Level Acoustic metadata''

"Acoustic metadata" is a broad term which can encompass both features 
which have an immediate use in higher level use cases (e.g. those
presented in the above examples, such as tempo, key, keyMode etc.) and
those that can only be interpreted by data analysis (e.g. a full or
simplified representation of the spectrum, or the average power sliced
every 10 ms). As we have seen, semantic technologies are suitable for
representing the higher level acoustic metadata. These are in fact both
concise and directly usable in semantic queries using, e.g.,
SPARQL. Lower level metadata, however, e.g. the MPEG-7 features extracted
by extractors like [Ref MPEG7AUDIODB], is very ill suited to be
represented in RDF and is better kept in MPEG-7/XML format for
serialization and interchange.

Semantic technologies could be of use in describing such "chunks" of low
level metadata, e.g. stating which features are contained and at which
quality. While this would duplicate information already encoded in the
MPEG-7/XML, it might be of use in semantic queries which select tracks
also based on the availability of rich low level metadata.
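
One possible shape for such a description, sketched in Python rather than
RDF: each record only says which feature exists for a track, at which time
resolution, and where the MPEG-7/XML file is, without copying the data
itself. The field names, the track ids and the file paths are
hypothetical; the two feature names are used purely as examples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureChunk:
    track: str      # track identifier
    feature: str    # name of the low level feature in the chunk
    hop_ms: int     # time resolution; smaller means finer
    location: str   # where the MPEG-7/XML data is kept (made-up paths)


CHUNKS = [
    FeatureChunk("track:a", "AudioPower", 10, "a/power.xml"),
    FeatureChunk("track:a", "AudioSpectrumEnvelope", 10, "a/spectrum.xml"),
    FeatureChunk("track:b", "AudioPower", 10, "b/power.xml"),
    FeatureChunk("track:c", "AudioPower", 50, "c/power.xml"),
    FeatureChunk("track:c", "AudioSpectrumEnvelope", 50, "c/spectrum.xml"),
]


def tracks_with(features, max_hop_ms):
    """Tracks that have every requested feature at the required resolution."""
    needed = set(features)
    available = {}
    for chunk in CHUNKS:
        if chunk.feature in needed and chunk.hop_ms <= max_hop_ms:
            available.setdefault(chunk.track, set()).add(chunk.feature)
    return sorted(t for t, found in available.items() if found == needed)
```

A query like tracks_with(["AudioPower", "AudioSpectrumEnvelope"], 10)
then selects only the tracks that are worth handing to a transition or
similarity algorithm.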

 
=== 2.- A commuter playlist ===


Commuting is a big issue in any modern society. Semantically
Personalized Playlists might provide both relief and actual benefit in
time that cannot be devoted to actively productive activities.
Filippo commutes every morning for an average of 50 +/- 10 minutes.
Before leaving he connects his USB stick/mp3 player to have it "filled"
with his morning playlist. The process is completed in 10 seconds;
after all, it is just 50 MB he is downloading.
During his commute, Filippo will be offered a smooth flow of
news, personal daily items, entertainment, and cultural snippets from
audiobooks and classes.

Musical content comes from Filippo's personal music collection or is
possibly obtained from a provider, e.g. at low cost thanks to a one time
play licence. Further audio content comes from podcasts but also from
text to speech reading of blog posts, emails, calendar items etc.

Behind the scenes the system works by a combination of semantic queries 
and ad hoc algorithms.
Semantic queries operate on an RDF database collecting the semantic
representation of music metadata as explained above, as well as
annotations on podcasts, news items, audiobooks, and "semantic desktop
items", that is, representations of Filippo's personal desktop information
such as emails and calendar entries.

Ad hoc algorithms operate on low level metadata to provide e.g. smooth
transitions between tracks. Textual analysis algorithms provide further
links among songs, and links between songs, pieces of news, emails etc.

At a high level, a global optimization algorithm takes care of the final
playlist creation. This is done by balancing the need to have high
priority items played first (e.g. emails from addresses considered
important) with the overall goal of providing a smooth and entertaining
experience (e.g. interleaving news with music etc.).
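
To make the trade-off concrete, here is a small sketch that uses a greedy
heuristic instead of a true global optimization: at each step it picks the
highest priority item that still fits in the commute time, but penalizes
items of the same kind as the one just played, so that content gets
interleaved. The Item fields, the penalty value and the sample data are
all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Item:
    title: str
    kind: str        # e.g. "music", "news", "email"
    minutes: float   # playing time
    priority: int    # higher means it should be heard earlier


def build_playlist(items, budget_minutes, repeat_penalty=5):
    """Greedy selection: priority first, minus a penalty for repeating a kind."""
    remaining = list(items)
    playlist, used, last_kind = [], 0.0, None
    while True:
        fitting = [it for it in remaining
                   if used + it.minutes <= budget_minutes]
        if not fitting:
            break

        def score(it):
            penalty = repeat_penalty if it.kind == last_kind else 0
            return it.priority - penalty

        best = max(fitting, key=score)
        playlist.append(best)
        remaining.remove(best)
        used += best.minutes
        last_kind = best.kind
    return playlist


SAMPLE = [
    Item("Mail from boss", "email", 2, 9),
    Item("Mail from list", "email", 1, 6),
    Item("Headline news", "news", 5, 5),
    Item("Song A", "music", 4, 4),
    Item("Song B", "music", 4, 3),
]
```

With a 12 minute budget, build_playlist(SAMPLE, 12) plays the important
email first, then the news, and only then the second email, instead of
reading the two emails back to back. A real system would also weigh
transition smoothness and semantic relatedness between adjacent items.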

Semantics can help in providing "related information or content" which
can be placed adjacent to the actual core content. This can be done with
relative freedom, since the content can be skipped at any time by the
user simply using the forward button.
------------------

Oscar Celma wrote:
> Dear all,
>
> Raphaël Troncy wrote:
>> Dear Giovanni,
>>> [...]
>>>
>>> I have a master student at Universita Politecnica delle Marche working
>>> on this now. There are works at deri involved with semantic music
>>> podcasts, obviously Oscar work would get into it. For this XG we just
>>> care about the semantic interoperability so that will come for the
>>> next one : ACTION Giovanni to complete the descriptio nand the
>>> solution of the Semantic Playlist generation
>>
>> What is the status of this action? We will have our last telecon this
> Good question... Giovanni any news? :-)
>> Thursday 26th of April.
> BTW, I hope that for the 26th, April, the Use Case is completely 
> written in the Wiki. I'm sorry for the delay.
>
> Regards,
>
> Oscar
>
Received on Thursday, 26 April 2007 01:07:24 UTC