From: Yoav F. <yf...@us...> - 2008-09-11 19:58:06
Update of /cvsroot/jboost/jboost/scripts
In directory sc8-pr-cvs17.sourceforge.net:/tmp/cvs-serv30612/scripts

Added Files:
    AddRandomIndex.py
Log Message:
A script for adding a random index to a data file.

--- NEW FILE: AddRandomIndex.py ---
"""
add an INDEX field to a jboost data file.
INDEX is a randomly permuted number ranging between 1 and the number
of examples in the data file.
The script also takes care of altering the spec file.

This pre-processing step makes it possible to track examples through an
n-fold cross validation experiment.
"""

filename = "/Users/yoavfreund/projects/jboost/demo/spambase"

datafile = open(filename+".data",'r')
lines=[]
morelines = datafile.readlines(100000)
while len(morelines)>0:
    lines.extend(morelines)
    morelines = datafile.readlines(100000)
datafile.close()

length = len(lines)

from random import shuffle
shuffle(lines)

newdatafile = open(filename+"I.data",'w')
for i in range(length):
    newdatafile.write(("%d," % i)+lines[i])
newdatafile.close()
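
For readers who want to reuse the idea, here is a minimal sketch of the same shuffle-and-number step, written as functions rather than a script with a hard-coded path. It is not the committed file. The names `add_random_index`, `add_random_index_to_file`, and the `seed` parameter are assumptions introduced for illustration. Two details differ from the committed code on purpose: the index starts at 1, as the docstring describes (the committed loop writes values starting at 0), and a newline is appended to a final record that lacks one so it cannot run into the next line after shuffling. The spec-file update mentioned in the docstring does not appear in the code shown in this message, so it is left out here as well.

```python
import random


def add_random_index(lines, seed=None):
    """Return a shuffled copy of `lines`, each prefixed with a 1-based INDEX field.

    `seed` is a hypothetical convenience for reproducible shuffles;
    the committed script does not take one.
    """
    rng = random.Random(seed)
    shuffled = list(lines)  # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)

    indexed = []
    for i, line in enumerate(shuffled, start=1):
        # A last record without a trailing newline would otherwise merge
        # with the following record once the order is shuffled.
        if not line.endswith("\n"):
            line += "\n"
        indexed.append("%d,%s" % (i, line))
    return indexed


def add_random_index_to_file(stem, seed=None):
    """Read `stem + ".data"`, write `stem + "I.data"`, return the record count."""
    with open(stem + ".data", "r") as datafile:
        lines = datafile.readlines()
    indexed = add_random_index(lines, seed=seed)
    with open(stem + "I.data", "w") as newdatafile:
        newdatafile.writelines(indexed)
    return len(indexed)
```

Calling `add_random_index_to_file("spambase", seed=0)` would read `spambase.data` and write `spambaseI.data` in the current directory. As in the committed script, the records are shuffled and then numbered in output order, so the output file is itself in random order.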