From: james w. <jbu...@ya...> - 2015-05-01 17:47:59
I now get the drift. I was referring to your earlier email:
"... Instances must have a single nominal attribute (excluding the class). This attribute must be the first attribute in the file and its values are used to reference rows/columns in the kernel matrix ..."
My question is: Can we use the string attribute as the first attribute in the training set arff file? This is the only attribute I have other than the class attribute, and I have a pre-computed matrix of similarities.
Best regards,
James.
On Friday, May 1, 2015 6:09 PM, Ralf <ralfjack673(a)gmail.com> wrote:
<[hidden email]> wrote:
Thanks again for your detailed reply, and sorry for my lack of clarity :-). I have an example in which my feature vector is a string attribute. I already have a pre-computed 51 x 51 kernel matrix, properly loaded. Now my arff file has attribute 1 as an index from 0 to 50 and attribute 2 as yes or no. I observe a new feature (a string) and I want to classify it. My problem is that the original training set has no string attribute in it. How will I use this string information during training and testing?
Which file has attribute 1 as an index from 0 to 50 and attribute 2 as yes or no? Is it your test set file?
To perform prediction with two separate files (one for training and another for testing), both files must have the same number and types of attributes. The training file holds the labeled text (PDF, Twitter, etc.) on which the learning algorithm is trained. The test file holds the instances whose class is unlabeled and which you want to predict. If your original training set has no string attribute in it, then what attributes does it have? Could you please be more precise?
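A minimal sketch of the "same attributes" requirement above, written in plain Python rather than with Weka itself. The function names are hypothetical, and the parser assumes attribute names are unquoted and contain no spaces.

```python
def arff_attributes(arff_text):
    """Return [(name, type), ...] from the @attribute lines of an ARFF header."""
    attrs = []
    for line in arff_text.splitlines():
        stripped = line.strip()
        if stripped.lower().startswith("@data"):
            break  # the header ends where the data section starts
        if stripped.lower().startswith("@attribute"):
            # Assumption: "@attribute <name> <type>" with an unquoted name.
            _, name, attr_type = stripped.split(None, 2)
            attrs.append((name, attr_type))
    return attrs


def same_structure(train_text, test_text):
    """True when both files declare the same attributes in the same order."""
    return arff_attributes(train_text) == arff_attributes(test_text)


train = "@relation t\n@attribute text string\n@attribute class {yes,no}\n@data\n'a b',yes\n"
test = "@relation t\n@attribute text string\n@attribute class {yes,no}\n@data\n'c d',?\n"
print(same_structure(train, test))
```

Running such a check before loading the files can catch a mismatch between the training and test headers early.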
I performed a classification (please see the attached files: training, test, and result) to give you a better idea of the dataset structure. In my example, I used "weka.classifiers.meta.FilteredClassifier" with SMO as its base classifier (default settings, except that the "buildLogisticModels" parameter is set to True). In addition, I selected "weka.filters.unsupervised.attribute.StringToWordVector" as the filter of the FilteredClassifier.
HTH.
Ralf
On Friday, May 1, 2015 3:10 PM, Ralf <[hidden email]> wrote:
SMO implements John Platt's sequential minimal optimization algorithm for training a support vector classifier. This implementation globally replaces all missing values and transforms nominal attributes into binary ones. Moreover, it normalizes all attributes by default.
However, in WEKA, if your test set is stored in a separate file from the training set, you have to replace the value of the last attribute (the class) with "?" in the test file. To run prediction with, e.g., SMO, go to the Classify panel and, under "Test options", choose the "Supplied test set" option to load your test set file.
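A minimal sketch of preparing such a test file, in plain Python. The helper name is hypothetical, and it assumes dense (non-sparse) data rows with at least two attributes, the class stored last, and no comma inside the class value.

```python
def blank_class_column(arff_text):
    """Replace the last field of every data row with '?' (unknown class)."""
    out, in_data = [], False
    for line in arff_text.splitlines():
        stripped = line.strip()
        if in_data and stripped and not stripped.startswith("%"):
            # Split on the last comma only, so commas inside quoted
            # text earlier in the row are left untouched.
            line = stripped.rsplit(",", 1)[0] + ",?"
        elif stripped.lower() == "@data":
            in_data = True
        out.append(line)
    return "\n".join(out) + "\n"


labeled = (
    "@relation demo\n"
    "@attribute text string\n"
    "@attribute class {yes,no}\n"
    "@data\n"
    "'good movie',yes\n"
    "'bad, really bad',no\n"
)
test_text = blank_class_column(labeled)
print(test_text)
```

The header is copied unchanged, so the resulting file keeps the same attributes as the file it was derived from.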
PS: Regarding the "PrecomputedKernelMatrixKernel", the kernel matrix needs to be stored in a separate file, which you load through the kernel's "kernelMatrixFile" option. On the other hand, to get probability estimates with SMO, you have to fit logistic models to the outputs by setting the "buildLogisticModels" option to "True".
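A minimal sketch for sanity-checking a kernel matrix file before pointing "kernelMatrixFile" at it. It is plain Python, not Weka code, and the file layout is an assumption: lines starting with "%" are comments, the first remaining line gives the number of rows and columns, and each following line holds one whitespace-separated row. The function name is hypothetical.

```python
def check_kernel_matrix(text):
    """Parse the assumed 'rows cols' + data layout; return the matrix or raise ValueError."""
    rows = [ln.split() for ln in text.splitlines()
            if ln.strip() and not ln.startswith("%")]
    if not rows or len(rows[0]) != 2:
        raise ValueError("first line must hold the row and column counts")
    n_rows, n_cols = int(rows[0][0]), int(rows[0][1])
    data = [[float(v) for v in r] for r in rows[1:]]
    if len(data) != n_rows or any(len(r) != n_cols for r in data):
        raise ValueError("matrix body does not match the declared size")
    if n_rows != n_cols:
        raise ValueError("a kernel matrix must be square")
    # A kernel matrix should be symmetric: K[i][j] == K[j][i].
    for i in range(n_rows):
        for j in range(i):
            if abs(data[i][j] - data[j][i]) > 1e-9:
                raise ValueError(f"not symmetric at ({i}, {j})")
    return data


sample = "% toy 2 x 2 kernel\n2 2\n1.0 0.5\n0.5 1.0\n"
matrix = check_kernel_matrix(sample)
print(matrix)
```

A check like this can help narrow down whether a read error comes from the file contents (wrong size, stray characters) or from the file path.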
Kind regards,
Ralf
On Fri, May 1, 2015 at 8:00 PM, james wafula [via WEKA] <[hidden email]> wrote:
Many thanks Ralf, it is now OK. But I wish to know how this will be useful when I have a new dataset, i.e. a test set, and want to predict the class for a new observation. If all I am providing is the index attribute that refers to the rows/columns of the kernel matrix, how will this information be sufficient to label a new observation appropriately?
Best regards,
James.
On Friday, May 1, 2015 9:40 AM, Ralf <[hidden email]> wrote:
I believe that the kernel you provided cannot deal with the data that you loaded. This kernel is based on a static kernel matrix that is read from a file. Instances must have a single nominal attribute (excluding the class). This attribute must be the first attribute in the file and its values are used to reference rows/columns in the kernel matrix. The second attribute must be the class attribute.
Hope this helps.
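A minimal sketch of an ARFF file in the layout described above, generated with plain Python. The relation name, the attribute names "id" and "class", and the helper name are hypothetical; only the ordering (nominal index first, class second) comes from the description.

```python
def build_index_arff(labels, relation="precomputed_kernel_train"):
    """Return ARFF text: attribute 1 = nominal row/column index, attribute 2 = class."""
    n = len(labels)
    index_values = ",".join(str(i) for i in range(n))
    class_values = ",".join(sorted(set(labels)))
    lines = [
        f"@relation {relation}",
        f"@attribute id {{{index_values}}}",     # nominal index, must come first
        f"@attribute class {{{class_values}}}",  # class attribute, second
        "@data",
    ]
    # Row i of the data section refers to row/column i of the kernel matrix.
    lines += [f"{i},{label}" for i, label in enumerate(labels)]
    return "\n".join(lines) + "\n"


arff_text = build_index_arff(["yes", "no", "yes"])
print(arff_text)
```

For a 51 x 51 kernel matrix, the same call with 51 labels would produce index values 0 to 50.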
Ralf
On Fri, May 1, 2015 at 4:09 AM, james wafula [via WEKA] <[hidden email]> wrote:
Hi all,
I am very new to Weka. I have played around with using a pre-computed kernel matrix and all was well until recently. Now I get the following error:
Problem reading matrix from kernelMatrix.matrix
What could be the problem? I have neither altered the kernel matrix nor the arff file at all.
Best regards,
James.
On Thursday, April 30, 2015 6:57 PM, jason roger <[hidden email]> wrote:
Dear Weka user,
Does anybody know how Weka selects the "wordsToKeep" parameter of the StringToWordVector filter?
Thanks a lot.
Jason
_______________________________________________
Wekalist mailing list
Send posts to: [hidden email]
List info and subscription status: http://list.waikato.ac.nz/mailman/listinfo/wekalist
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html
View this message in context: Re: Problem reading matrix from kernelMatrix.matrix
Attachments:
results.txt (349K)
training set.arff (10K)
test set.arff (4K)