Featured Dataset: BBC Music

Posted on 08/02/2011 by


Our next developer event for Kasabi will be based on culture, so I thought it’d be a good idea to feature a musical dataset today.

BBC Music is a dataset based on the BBC Music website. It contains a comprehensive collection of information about the musical talent covered by BBC programs on radio and TV. The BBC provides linked data on artists, records and a series of music reviews. The dataset is cross-linked with other sources, including DBpedia and MusicBrainz, and can be accessed here via Kasabi’s APIs.

If we take a look at the Developer Documentation (tip: adding /guide to the end of a Kasabi dataset URI will take you to the dev docs), we can see how the dataset has been modelled. It’s pretty rich in the vocabularies it uses to describe the data, ranging from the BIO ontology (for biographical information) to the Review Ontology (for expressing reviews and ratings). Its primary structure is based on the Music Ontology, which:

is an attempt to provide a vocabulary for linking a wide range music-related information, and to provide a democratic mechanism for doing so.

We can take a look at the kind of data available here: http://data.kasabi.com/dataset/bbc-music/1296959358956. This is the resource “Between the Minds,” which we can see is a “Record,” which has a MusicBrainz ID: 1a2a074f-d484-4b01-8c6c-9cac345684c9. This entity also has a review, which points to the actual, written review by Chris Jones of the album.
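The URIs and identifiers mentioned so far can be handled with a few lines of Python. This is only a sketch: it assumes that the last path segment of a resource URI is the resource id and that everything before it is the dataset URI, and the helper names are made up for illustration. The MusicBrainz ID check relies on the fact that MusicBrainz IDs are UUIDs.

```python
import uuid

# Values quoted in the post.
RESOURCE_URI = "http://data.kasabi.com/dataset/bbc-music/1296959358956"
MBID = "1a2a074f-d484-4b01-8c6c-9cac345684c9"


def split_resource_uri(uri):
    """Split a resource URI into (dataset URI, resource id).

    Assumption: the id is the last path segment of the URI.
    """
    dataset_uri, _, resource_id = uri.rpartition("/")
    return dataset_uri, resource_id


def guide_url(dataset_uri):
    """Apply the tip from the post: append /guide to the dataset URI."""
    return dataset_uri.rstrip("/") + "/guide"


def is_valid_mbid(value):
    """Return True if the value is a canonically formatted UUID."""
    try:
        return str(uuid.UUID(value)) == value.lower()
    except ValueError:
        return False


dataset_uri, resource_id = split_resource_uri(RESOURCE_URI)
print(dataset_uri)
print(guide_url(dataset_uri))
print(is_valid_mbid(MBID))
```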

I wanted to know a bit more about Chris Jones, so I did a search (using the set’s Search API) for “Chris Jones”, and got back a list of related items:

0. Chris Jones (score: 1.0)
1. Chris Jones (score: 1.0)
2. Chris Jones (score: 1.0)
3. Description of the artist Chris Jones (score: 0.7247112)
4. Reviews by Chris Jones (score: 0.7247112)
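
The listing is numbered from zero, which matters when picking an item by position. A minimal sketch, with the results above hard-coded as a plain Python list (the tuple layout is an assumption for illustration, not the Search API’s actual response format):

```python
# Search results copied from the listing above as (title, score) pairs.
results = [
    ("Chris Jones", 1.0),
    ("Chris Jones", 1.0),
    ("Chris Jones", 1.0),
    ("Description of the artist Chris Jones", 0.7247112),
    ("Reviews by Chris Jones", 0.7247112),
]

# enumerate() starts at 0, so the output matches the numbering above.
for position, (title, score) in enumerate(results):
    print(f"{position}. {title} (score: {score})")

# The fifth item in the list lives at index 4.
fifth_title, fifth_score = results[4]
print(fifth_title)
```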

Using the Pytassium command line tool, I asked Kasabi to “Describe” the fifth item on the list (which, as Python geeks know, is number 4, thanks to 0-indexed lists):

@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

<http://www.bbc.co.uk/music/reviewers/8ff9.rdf> rdfs:label "Reviews by Chris Jones" ;
    foaf:primaryTopic <http://www.bbc.co.uk/music/reviewers/8ff9#reviewer> .

So, I get back a link to the BBC’s home for all the reviews by Chris Jones.
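
The description above boils down to two triples about one document. As a rough sketch of how to read it, the snippet below expands the prefixed names into full URIs and looks up the document’s primary topic. It holds the triples in a plain list rather than parsing the output, and the helper names are made up for illustration; a real application would more likely use an RDF library.

```python
# Prefixes declared in the description returned by Kasabi.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def expand(prefixed_name):
    """Turn a prefixed name such as foaf:primaryTopic into a full URI."""
    prefix, _, local_name = prefixed_name.partition(":")
    return PREFIXES[prefix] + local_name


DOCUMENT = "http://www.bbc.co.uk/music/reviewers/8ff9.rdf"

# The two statements from the description, as (subject, predicate, object).
triples = [
    (DOCUMENT, expand("rdfs:label"), "Reviews by Chris Jones"),
    (
        DOCUMENT,
        expand("foaf:primaryTopic"),
        "http://www.bbc.co.uk/music/reviewers/8ff9#reviewer",
    ),
]


def objects(subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


reviewer = objects(DOCUMENT, expand("foaf:primaryTopic"))[0]
print(reviewer)
```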

For SPARQLers, there is also a set of example queries, which can be found at its SPARQL API.
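
To give a flavour of what such a query might look like, here is a sketch that lists a few artists and their names and then builds a GET request URL for it. The query is my own illustration, not one of the published examples. The class and property names come from the Music Ontology and FOAF, but whether the dataset types its artists exactly this way is an assumption. The endpoint address is a placeholder and the apikey parameter name is also an assumption, so check the dataset’s API documentation before relying on either.

```python
from urllib.parse import urlencode

# Placeholder values: replace with the real endpoint and your own key.
SPARQL_ENDPOINT = "http://example.org/bbc-music/sparql"  # hypothetical
API_KEY = "YOUR_API_KEY"  # hypothetical

# Illustrative query: ten music artists and their names.
QUERY = """\
PREFIX mo: <http://purl.org/ontology/mo/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?artist ?name
WHERE {
  ?artist a mo:MusicArtist ;
          foaf:name ?name .
}
LIMIT 10
"""


def build_request_url(endpoint, query, api_key):
    """Build a GET URL carrying the query in the standard 'query' parameter.

    The 'apikey' parameter name is an assumption for illustration.
    """
    return endpoint + "?" + urlencode({"query": query, "apikey": api_key})


print(build_request_url(SPARQL_ENDPOINT, QUERY, API_KEY))
```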

Attached to the BBC Music set is a customised API from John Goodwin which lets you: “Find out which music artists know each other through mutual collaboration in bands.” I think this is a particularly interesting take on the set, and would like it in several musical applications I use all the time (here’s looking at Spotify, for starters!).
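
The idea behind that API can be sketched in a few lines: two artists “know” each other if they have been members of the same band. This is not how the API itself is implemented, which I haven’t seen; the band and artist names below are made-up placeholders, not data from the set.

```python
from itertools import combinations

# Placeholder data: band -> set of members (all names are hypothetical).
memberships = {
    "band_x": {"artist_a", "artist_b", "artist_c"},
    "band_y": {"artist_c", "artist_d"},
}


def collaborators(memberships):
    """Map each artist to the artists they share at least one band with."""
    known = {}
    for members in memberships.values():
        # Every pair of members within one band counts as a collaboration.
        for first, second in combinations(sorted(members), 2):
            known.setdefault(first, set()).add(second)
            known.setdefault(second, set()).add(first)
    return known


for artist, others in sorted(collaborators(memberships).items()):
    print(artist, "knows", sorted(others))
```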

So, I’d be interested in a couple of things related to this set:

  1. Some ideas on what could be built on top of it, such as hacks and applications
  2. Related datasets which would complement it

Any ideas? (drop a line to the Kasabi developer network)

Posted in: Datasets