Live Music Archive Linked Data

This server provides access to Linked Data that describes the audio held in the Internet Archive’s Live Music Archive (also sometimes known as "etree"). The metadata from etree has been converted to RDF and is exposed through a SPARQL endpoint along with browsable pages.

The dataset describes over 100,000 performances by 4,000 artists, comprising 1,600,000 individual tracks, each of which may be available in several formats.

Sample Resources


The dataset makes use of a number of vocabularies including:

Artists are linked to entries in MusicBrainz, while locations of performances are linked to entries in Geonames and using a number of methods. Any such alignments are represented explicitly as similarities.

SPARQL Endpoint

A SPARQL endpoint for the data is available at Some sample queries are shown below. Note that as this is currently an experimental service, query timeouts are limited.
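As a rough illustration of how a client might talk to such an endpoint, the sketch below builds a standard SPARQL 1.1 Protocol GET request and flattens the JSON results. The endpoint address is a placeholder, not the real one, and the helper names (build_request, parse_bindings, run_query) are made up for this example. The query is deliberately generic so that it does not assume any particular vocabulary. The short client-side timeout is an assumption intended to match the limited timeouts of an experimental service.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical placeholder: substitute the actual endpoint address.
ENDPOINT = "http://example.org/sparql"

# Generic query that assumes nothing about the vocabularies in use.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"


def build_request(endpoint, query):
    """Build a SPARQL 1.1 Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def parse_bindings(payload):
    """Flatten SPARQL JSON results into a list of {variable: value} dicts."""
    doc = json.loads(payload)
    return [
        {name: cell["value"] for name, cell in row.items()}
        for row in doc["results"]["bindings"]
    ]


def run_query(endpoint, query, timeout=10):
    """Send the query over HTTP and return the flattened rows."""
    request = build_request(endpoint, query)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_bindings(response.read().decode("utf-8"))
```

Calling run_query(ENDPOINT, QUERY) against a real endpoint would return one dictionary per result row. Keeping a small LIMIT in exploratory queries is a sensible way to stay within the timeout.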


The Live Music Archive metadata is being made available under the CC0 1.0 Universal - "Creative Commons public domain waiver" license. Note that this only applies to the metadata supplied here. Links are given to audio materials for which other conditions may apply.

Sample Visualisations

The figures below show some sample overview visualisations produced using Cytoscape. The first shows relationships between artists (pink) and the geographical locations (green) in which they have been recorded. The second shows relationships between artists (pink) and mapped venues (green) in which they have been recorded.



The Computational Analysis of the Live Music Archive (CALMA) project aims to extend the etree data with computational analyses over the recordings through feature extraction, clustering, and classification.


This dataset was produced by

For more information, please contact Sean Bechhofer. We would also like to thank the Internet Archive for granting access to the source data.