MiDataSets - Browse /MiDataSets/V1 - prerelease at SourceForge.net


Name	Modified	Size	Downloads / Week
MiDataSets_V1_office.tar.gz	2007-05-29	32.3 MB	0
MiDataSets_V1_telecom2_au_gsm.tar.gz	2007-03-18	23.9 MB	0
MiDataSets_V1_telecom1_adpcm_pcm.tar.gz	2007-03-18	25.7 MB	0
MiDataSets_V1_network.tar.gz	2007-03-18	2.4 MB	0
MiDataSets_V1_consumer5_tiff_notcompressed.tar.gz	2007-03-18	40.1 MB	0
MiDataSets_V1_consumer4_tiff_compressed.tar.gz	2007-03-18	40.0 MB	0
MiDataSets_V1_consumer3_ppm.tar.gz	2007-03-18	38.5 MB	0
MiDataSets_V1_consumer2_jpg.tar.gz	2007-03-18	3.0 MB	0
MiDataSets_V1_consumer1_mp3_wav.tar.gz	2007-03-18	28.7 MB	0
MiDataSets_V1_automotive.tar.gz	2007-03-18	8.6 MB	0
MiDataSets_V1__README.txt	2007-03-18	4.5 kB	0
MiBench_for_MiDataSets_V1.tar.gz	2007-03-18	9.1 MB	0
Totals: 12 Items		252.4 MB	0

MiDataSets for MiBench (GPL license)

Developers:
 Grigori Fursin (*), http://fursin.net/research
 John Cavazos (**), http://homepages.inf.ed.ac.uk/jcavazos
 Michael O'Boyle (**), http://www.dcs.ed.ac.uk/~mob
 Olivier Temam (*), http://www.lri.fr/~temam

 (*) INRIA Futurs, France
 (**) University of Edinburgh, UK

 Started in February, 2006
 
Development website:
 http://midatasets.sourceforge.net

Remarks:
 Though we made an effort to include only copyright-free datasets 
 from the Internet, mistakes are possible. If you find one, please 
 contact Grigori Fursin (grigori.fursin@inria.fr) as soon as possible 
 and we will try to resolve the issue.

********************************************************************************

Iterative optimization is now a popular technique for obtaining performance 
or code size improvements over a compiler's default settings. However, 
in most research projects the best configuration is found for one 
arbitrary dataset, and it is assumed that this configuration will work well 
with any other dataset the program uses. We created 20 different datasets 
per program for the free MiBench benchmark suite (http://www.eecs.umich.edu/mibench)
to evaluate this assumption and to analyze the behavior of various programs 
with multiple datasets. We hope that this will enable more realistic 
benchmarking and more practical iterative optimization.
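
As an illustration of the assumption being tested, the following minimal
sketch picks the configuration that is fastest on one dataset and reports how
much slower it is than the per-dataset best on every other dataset. The
function names, the flag names and the timing table are hypothetical and are
not measurements or tools from MiDataSets.

```python
def best_config(times, dataset):
    """Return the configuration with the lowest time on one dataset."""
    return min(times, key=lambda cfg: times[cfg][dataset])


def slowdown_of_tuned_config(times, tuned_on):
    """Ratio, per dataset, of the time of the configuration tuned on
    `tuned_on` to the time of that dataset's own best configuration."""
    chosen = best_config(times, tuned_on)
    return {
        d: times[chosen][d] / times[best_config(times, d)][d]
        for d in times[chosen]
    }


# Hypothetical execution times in seconds: times[config][dataset number].
example_times = {
    "-O2": {1: 1.00, 2: 2.00, 3: 4.00},
    "-O3": {1: 0.80, 2: 2.50, 3: 3.00},
}

ratios = slowdown_of_tuned_config(example_times, tuned_on=1)
```

A ratio of 1.0 means the configuration tuned on the chosen dataset is also the
best for that dataset; a ratio above 1.0 means a different configuration would
have been faster there.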

This work has been presented at HiPEAC'07:

Grigori Fursin, John Cavazos, Michael O'Boyle and Olivier Temam. 
MiDataSets: Creating The Conditions For A More Realistic Evaluation 
of Iterative Optimization. Proceedings of the International Conference 
on High Performance Embedded Architectures and Compilers (HiPEAC 2007), 
Ghent, Belgium, January 2007

********************************************************************************

Datasets:

automotive_qsort_data
 20 datasets, random numbers, different size

automotive_susan_data
 20 datasets, pnm images, different size, different scenery

consumer_data
 20 datasets, mp3 audio, different size, different bit-rate, different genres
 20 datasets, wav audio converted from original mp3 datasets

consumer_jpeg_data
 20 datasets, jpeg images, different size, different scenery
 20 datasets, ppm images converted from original jpeg datasets

consumer_tiff_data
 30 datasets, tiff images converted from original jpeg datasets
 30 datasets, b&w tiff images converted from original jpeg datasets
 30 datasets, tiff images without compression converted from original jpeg datasets

network_dijkstra_data
 20 datasets, random numbers, random size
          
network_patricia_data
 20 datasets, random numbers, random size

office_data
 20 datasets, text files, different size, different genres
 20 datasets, ps converted from original text datasets
 20 datasets, pgp converted from original text datasets
 20 datasets, enc converted from original text datasets
 20 datasets, benc converted from original text datasets
 20 datasets, text small files with random words in each line, different size

telecom_data
 20 datasets, pcm audio converted from mp3 datasets
 20 datasets, adpcm audio converted from mp3 datasets

telecom_gsm_data
 20 datasets, au audio converted from mp3 datasets
 20 datasets, gsm audio converted from mp3 datasets

********************************************************************************

Most of the source files have been slightly modified by Grigori Fursin
to simplify and automate iterative optimization. A loop wrapper has been 
added around the main procedure to make some benchmarks run longer when 
real execution time is measured instead of using a simulator 
(we do not yet take cache effects into account; this is future work). 
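
The idea of the loop wrapper can be sketched as follows. This is only an
illustration in Python, not the code in the modified benchmark sources; the
name run_with_loop_wrapper and the stand-in main procedure are hypothetical.

```python
import time


def run_with_loop_wrapper(main_procedure, upper_bound):
    """Call main_procedure upper_bound times; return total wall-clock seconds."""
    start = time.perf_counter()
    for _ in range(upper_bound):
        main_procedure()
    return time.perf_counter() - start


# Stand-in for a benchmark's main procedure (hypothetical).
calls = []
elapsed = run_with_loop_wrapper(lambda: calls.append(1), upper_bound=5)
```

Repeating the main procedure makes a very short run long enough to time
reliably, but as the note above says, repeated runs over the same data may
benefit from warm caches.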

Each directory has 3 Makefiles: one for GCC, one for the Intel compilers and
one for the PathScale compilers. 
Each directory also has a "__run" batch file that executes the benchmark. The first
parameter is the dataset number, and the optional second parameter is the 
upper bound of the loop wrapper around the main procedure. 
If the second parameter is omitted, the loop wrapper upper bound 
is taken from the file _run/_finfo_dataset.<dataset_number>.
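
The parameter rule above can be sketched as follows. The format of the
_finfo_dataset files is not described here, so this sketch assumes that the
first whitespace-separated token in the file is the upper bound; the function
name resolve_upper_bound is hypothetical.

```python
from pathlib import Path


def resolve_upper_bound(dataset_number, upper_bound=None, run_dir="_run"):
    """Return the loop wrapper upper bound for one dataset."""
    # An explicit second parameter takes precedence.
    if upper_bound is not None:
        return int(upper_bound)
    info_file = Path(run_dir) / f"_finfo_dataset.{dataset_number}"
    # Assumption: the first token in the file is the upper bound.
    return int(info_file.read_text().split()[0])
```

For example, resolve_upper_bound(7) would read _run/_finfo_dataset.7, while
resolve_upper_bound(7, 3) would ignore that file and return 3.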

Several batch files are included as examples of how to automate iterative optimization:
 all__create_work_dirs - creates temporary work directories for each benchmark
 all__delete_work_dirs - deletes all temporary work directories
 all_compile - compiles all benchmarks in the temporary work directories 
 all_run - runs all benchmarks with all datasets in the temporary work directories
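
A driver in the same spirit could be sketched in Python as shown below. This
is not the content of the included batch files. It assumes that datasets are
numbered from 1 to 20, that "__run" is invoked as ./__run from inside a
benchmark work directory on a POSIX system, and the function names are
hypothetical.

```python
import subprocess


def run_commands(first=1, last=20, upper_bound=None):
    """Build one "__run" command line per dataset number."""
    commands = []
    for n in range(first, last + 1):
        cmd = ["./__run", str(n)]
        # The optional second parameter overrides the stored upper bound.
        if upper_bound is not None:
            cmd.append(str(upper_bound))
        commands.append(cmd)
    return commands


def run_all(work_dir, **kwargs):
    """Execute every command inside one benchmark work directory."""
    for cmd in run_commands(**kwargs):
        subprocess.run(cmd, cwd=work_dir, check=True)
```

Building the command lists separately from executing them makes it possible
to inspect or log the exact invocations before a long run is started.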

Source: MiDataSets_V1__README.txt, updated 2007-03-18
