Update of /cvsroot/jboost/jboost/scripts
In directory sc8-pr-cvs17.sourceforge.net:/tmp/cvs-serv30612/scripts
Added Files:
AddRandomIndex.py
Log Message:
A script for adding a random index to a data file.
--- NEW FILE: AddRandomIndex.py ---
"""
Add an INDEX field to a jboost data file. Each example receives a distinct INDEX
between 1 and the number of examples in the data file, assigned in random order.
The script also takes care of altering the spec file. This pre-processing step makes
it possible to track examples through an n-fold cross-validation experiment.
"""
filename = "/Users/yoavfreund/projects/jboost/demo/spambase"

# Read every example (one per line) from the data file.
datafile = open(filename+".data",'r')
lines = datafile.readlines()
datafile.close()
length = len(lines)

# Shuffle the examples so that the sequential INDEX written below is a random permutation.
from random import shuffle
shuffle(lines)

# Prefix each example with its INDEX, counting from 1 to length.
newdatafile = open(filename+"I.data",'w')
for i in range(length):
    # Normalize the line ending so a final line without "\n" cannot merge with the next one.
    newdatafile.write(("%d," % (i+1))+lines[i].rstrip("\n")+"\n")
newdatafile.close()
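
The indexing step can also be sketched as a reusable function, which makes it easier to test in isolation. This sketch is not part of the committed script: the names `add_random_index` and `fold_of`, the `seed` parameter, and the modulo-based fold assignment are assumptions made for illustration, and the sketch does not touch the spec file.

```python
import random


def add_random_index(lines, seed=None):
    """Return the examples in random order, each prefixed with a 1-based INDEX.

    Hypothetical helper for illustration; not part of AddRandomIndex.py.
    """
    rng = random.Random(seed)
    # Strip line endings first so a final line without "\n" cannot merge with another.
    shuffled = [line.rstrip("\n") for line in lines]
    rng.shuffle(shuffled)
    return ["%d,%s\n" % (i, line) for i, line in enumerate(shuffled, start=1)]


def fold_of(index, n_folds):
    """Map a 1-based INDEX to a fold number in 0..n_folds-1 (assumed scheme)."""
    return (index - 1) % n_folds


if __name__ == "__main__":
    examples = ["a,1\n", "b,0\n", "c,1\n", "d,0"]
    for row in add_random_index(examples, seed=0):
        index = int(row.split(",", 1)[0])
        # Print the assumed fold next to the indexed example.
        print(fold_of(index, 2), row.rstrip("\n"))
```

Because the INDEX travels with each example, a fold assignment derived from it stays stable however the file is later reordered or split.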