Live Music Archive Linked Data

17/02/23: This site is currently being migrated to a new system, so some services may be unavailable or give errors. We are working to fix this.

This server provides access to Linked Data that describes the audio held in the Internet Archive’s Live Music Archive (also sometimes known as "etree"). The metadata from etree has been converted to RDF and is exposed through a SPARQL endpoint along with browsable pages.

The dataset describes over 100,000 performances by 4,000 artists, comprising 1,600,000 individual tracks, each of which may be available in a number of formats.

Sample Resources

Vocabularies

The dataset makes use of a number of vocabularies including:

Artists are linked to entries in MusicBrainz, while locations of performances are linked to entries in Geonames and last.fm using a number of methods. Any such alignments are represented explicitly as similarities.

SPARQL Endpoint

A SPARQL endpoint for the data is available at http://etree.linkedmusic.org/sparql. Some sample queries that run against this endpoint are given below. Note that as this is currently an experimental service, queries are subject to limited timeouts.
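As a rough illustration of how a client might call the endpoint, the Python sketch below builds a GET request in the style of the SPARQL 1.1 Protocol. It assumes the endpoint accepts a "query" parameter and honours an Accept header of application/sparql-results+json; this page does not confirm either, and the helper names build_request and run_query are illustrative only.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Endpoint URL as given on this page.
ENDPOINT = "http://etree.linkedmusic.org/sparql"


def build_request(query, endpoint=ENDPOINT):
    """Build a GET request carrying the query in the 'query' parameter.

    Assumption: the endpoint follows the standard SPARQL 1.1 Protocol and
    can return results as application/sparql-results+json.
    """
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def run_query(query, timeout=30):
    """Send the query and decode the JSON response body.

    The timeout is a client-side guess; the server may cut queries off sooner.
    """
    with urlopen(build_request(query), timeout=timeout) as response:
        return json.load(response)
```

Calling run_query("SELECT * { ?s ?p ?o } LIMIT 1") would then return the decoded result document, provided the service is reachable and behaves as assumed.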

SPARQL Examples

The following query retrieves a set of commonly performed songs by artists with between 200 and 1000 performances in the Live Music Archive.
PREFIX etree:<http://etree.linkedmusic.org/vocab/>
PREFIX mo:<http://purl.org/ontology/mo/>
PREFIX event:<http://purl.org/NET/c4dm/event.owl#>
PREFIX skos:<http://www.w3.org/2004/02/skos/core#>
PREFIX timeline:<http://purl.org/NET/c4dm/timeline.owl#>
PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX sim: <http://purl.org/ontology/similarity/>

# Get artist, trackname and number of occurrences for those
# tracks that are played more than 100 times.
SELECT ?artist ?artistname ?trackname ?trackPerformances {
  # Q2. Get the tracks from artists and count occurrences, assuming
  # the same name is the same track
  {
    SELECT ?artist ?trackname (COUNT(?track) AS ?trackPerformances) {
      # Q1. Find artists with between 200 and 1000 performances. This
      # weeds out a lot of the "jam" bands and the Grateful Dead
      {
        SELECT ?artist {
          FILTER (?performances > 200 && ?performances < 1000)
          {
            SELECT ?artist (COUNT(?perf) AS ?performances)
            {
              ?perf mo:performer ?artist.
              ?perf rdf:type etree:Concert.
            } GROUP BY ?artist
          }
        } # LIMIT 100
      } # End Q1
      ?track mo:performer ?artist.
      ?track rdf:type etree:Track.
      ?track skos:prefLabel ?trackname.
    } GROUP BY ?artist ?trackname 
  } # End Q2
  ?artist skos:prefLabel ?artistname
  # Inspect track names to remove noise (non-songs)
  FILTER (?trackPerformances > 100)
  FILTER (?trackname != "")
  FILTER (?trackname != "tmp")
  FILTER (!regex(?trackname,"tuning","i"))
  FILTER (!regex(?trackname,"intro","i"))
  FILTER (!regex(?trackname,"banter","i"))
  FILTER (!regex(?trackname,"jam","i"))
  FILTER (!regex(?trackname,"encore","i"))
} ORDER BY ?artistname ?trackname 
 
This example query demonstrates how to obtain links to the raw audio recordings associated with performances of a specified song:
PREFIX etree:<http://etree.linkedmusic.org/vocab/>
PREFIX mo:<http://purl.org/ontology/mo/>
PREFIX event:<http://purl.org/NET/c4dm/event.owl#>
PREFIX skos:<http://www.w3.org/2004/02/skos/core#>
PREFIX timeline:<http://purl.org/NET/c4dm/timeline.owl#>
PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX sim: <http://purl.org/ontology/similarity/>


# Get audio files for multiple performances of a song. 
SELECT ?audio {
  ?track mo:performer <http://etree.linkedmusic.org/artist/42300f90-4aac-012f-19e9-00254bd44c28>.
  ?track rdf:type etree:Track.
  ?track skos:prefLabel "15 Miles".
  ?track etree:audio ?audio.
  ?audio etree:audioDerivationStatus etree:originalAudio.
} 
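If the results of the query above are requested in the standard SPARQL 1.1 JSON results format, the audio links can be pulled out of the bindings as sketched below. The function name audio_urls and the example.org URLs in the sample document are hypothetical placeholders, not values returned by this service.

```python
def audio_urls(results, variable="audio"):
    """Collect the values bound to one variable in a SPARQL JSON result.

    Each binding maps variable names to {"type": ..., "value": ...} objects;
    rows in which the variable is unbound are skipped.
    """
    return [
        row[variable]["value"]
        for row in results["results"]["bindings"]
        if variable in row
    ]


# Hand-written sample in the SPARQL JSON results layout (placeholder URLs).
SAMPLE = {
    "head": {"vars": ["audio"]},
    "results": {
        "bindings": [
            {"audio": {"type": "uri", "value": "http://example.org/audio/track1.flac"}},
            {"audio": {"type": "uri", "value": "http://example.org/audio/track2.flac"}},
        ]
    },
}

print(audio_urls(SAMPLE))
```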

Licensing

The Live Music Archive metadata is being made available under the CC0 1.0 Universal - "Creative Commons public domain waiver" license. Note that this only applies to the metadata supplied here. Links are given to audio materials for which other conditions may apply.

Sample Visualisations

The figures below show some sample overview visualisations produced using Cytoscape. The first shows relationships between artists (pink) and the geographical locations (green) in which they have been recorded. The second shows relationships between artists (pink) and the mapped last.fm venues (green) in which they have been recorded.

  

Publications

The Computational Analysis of the Live Music Archive (CALMA) project aims to extend the etree data with computational analyses over the recordings through feature extraction, clustering, and classification.

Contact

This dataset was produced by

For more information, please contact Sean Bechhofer. We would also like to thank the Internet Archive for granting access to the source data.