RFS1k is an audio content analysis dataset consisting of 1000 heterogeneous sound files encoded in 44.1kHz/16bit PCM WAV format. The sounds were sampled at random from Freesound by querying its API for random integer sound IDs. Only one sound per unique Freesound user is included, to discourage overrepresentation of any single source material, recording chain, etc. The dataset was gathered as part of Chris Donahue's master's thesis; more information is available in Chapter 5 of that thesis.
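The sampling procedure described above can be sketched roughly as follows. This is an illustrative reconstruction, not the code used for the thesis: the function `sample_one_per_user`, the `fetch_sound` callback, the `max_id` bound, and the `"username"` key are all hypothetical names chosen for the example. Here `fetch_sound` stands in for a Freesound API lookup and is assumed to return `None` when an ID does not exist or the sound does not qualify.

```python
import random
from typing import Callable, Optional


def sample_one_per_user(
    fetch_sound: Callable[[int], Optional[dict]],
    target: int,
    max_id: int,
    rng: random.Random,
) -> list:
    """Draw random integer IDs until `target` sounds from distinct users are kept.

    Stops early if every ID in 1..max_id has been tried.
    """
    kept = []
    seen_users = set()
    tried_ids = set()
    while len(kept) < target and len(tried_ids) < max_id:
        sound_id = rng.randint(1, max_id)  # inclusive on both ends
        if sound_id in tried_ids:
            continue  # never look up the same ID twice
        tried_ids.add(sound_id)
        sound = fetch_sound(sound_id)  # hypothetical lookup; None means "skip"
        if sound is None or sound["username"] in seen_users:
            continue  # missing sound, or this user is already represented
        seen_users.add(sound["username"])
        kept.append(sound)
    return kept


# Demo against an in-memory stand-in for the API (made-up data, five users).
fake_catalog = {
    i: {"id": i, "username": f"user{i % 5}"} for i in range(1, 51) if i % 3
}
sounds = sample_one_per_user(
    fake_catalog.get, target=5, max_id=50, rng=random.Random(0)
)
```

Tracking the set of seen usernames is what enforces the one-sound-per-user rule; a real crawler would also need rate limiting and error handling, which are omitted here.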


The dataset consists of 1000 sounds originally encoded and uploaded to Freesound as 44.1kHz/16bit PCM WAV files. For each WAV file, the dataset includes a JSON file of metadata and two preview Ogg Vorbis encodings, one of low quality and one of high quality. It is useful for unsupervised learning tasks, some supervised tasks (the metadata contains user-provided tags and descriptions), and black-box analysis of sound manipulation techniques. The dataset can be acquired here:


Other Downloads


RFS1k is not distributed under any single license. Instead, all sound files retain their original license as determined by the Freesound.org user. The license for each sound file can be found in its associated JSON metadata file and will be one of the following three (in ascending order of strictness):

For more information on these licenses, see the Freesound.org FAQ.
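Because each sound keeps its own license, it can help to tally the licenses before redistributing or filtering the data. The sketch below is a minimal example under stated assumptions: it assumes the JSON metadata files sit together in one directory and that the license is stored under a top-level `"license"` key. Neither the directory layout nor the key name is confirmed here, and `tally_licenses` is a hypothetical helper, so inspect one metadata file and adjust accordingly.

```python
import json
from collections import Counter
from pathlib import Path


def tally_licenses(metadata_dir, license_key="license"):
    """Count how many metadata files declare each license value.

    `license_key` is an assumed key name; files lacking it are
    counted under "<missing>" rather than raising an error.
    """
    counts = Counter()
    for path in sorted(Path(metadata_dir).glob("*.json")):
        with path.open(encoding="utf-8") as f:
            meta = json.load(f)
        counts[meta.get(license_key, "<missing>")] += 1
    return counts
```

For example, `tally_licenses("path/to/metadata").most_common()` would list the license values from most to least frequent, where the path is a placeholder for wherever the metadata was unpacked.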


Thanks to Freesound for their wonderful service and API. This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number ACI-1053575. Thanks to the University of California, San Diego Department of Music for additional funding and web hosting for this project.


If you use this data in a research project or paper, please cite Chris Donahue's master's thesis. A BibTeX entry appears below:

  author={Christopher Donahue},
  title={Extensions to Convolution for Generalized Cross-Synthesis},
  school={University of California, San Diego},