|
From: Ralf <ral...@gm...> - 2015-05-01 18:14:55
|
Yes, you can have a training set whose first attribute is a string attribute. However, to perform classification later on, you need to convert the string attribute into numeric attributes, using FilteredClassifier in conjunction with, e.g., SMO and StringToWordVector. This is what I did in my previous example.

Regards,
Ralf

On Sat, May 2, 2015 at 1:31 AM, james wafula [via WEKA] <ml-node+s8497n34423h89(a)n7.nabble.com> wrote:

> I now get the drift. I was referring to your earlier email:
>
> "... Instances must have a single nominal attribute (excluding the class).
> This attribute must be the first attribute in the file and its values are
> used to reference rows/columns in the kernel matrix ..."
>
> My question is: Can we use the string attribute as the first attribute in
> the training set arff file? This is the only attribute I have other than
> the class attribute, and I have a pre-computed matrix of similarities.
>
> Best regards,
>
> James.
>
> On Friday, May 1, 2015 6:09 PM, Ralf <[hidden email]> wrote:
>
> <[hidden email]> wrote:
>
> Thanks again for your detailed reply, and sorry for my lack of clarity :-).
> I have an example in which my feature vector is a string attribute. I have
> a pre-computed 51 x 51 kernel matrix already, properly loaded. Now my arff
> file has attribute 1 as an index from 0 to 50 and attribute 2 being yes or
> no. I observe a new feature (a string) and I want to classify it. My
> problem is that the original training set has no string attribute in it.
> How will I use this string information during training and testing?
>
> Which file has attribute 1 as an index from 0 to 50 and attribute 2 as yes
> or no? Is it your test set file?
>
> To perform prediction when you have two separate files (one for training
> and another for testing), both must have the same number and types of
> attributes. The training file holds the text (PDF, Twitter, etc.) that you
> want to classify (the learning algorithm is trained on it). The test file,
> however, holds the instances whose class is unlabeled and which you are
> going to predict. If your original training set has no string attribute in
> it, then what attributes does it have? Could you please be more precise?
>
> I performed a classification run (please see the attached files: training,
> test, and result) to give you a better idea of the dataset structure. In my
> example, I used "weka.classifiers.meta.FilteredClassifier" with SMO as its
> base classifier (default settings, except that I set the
> "buildLogisticModels" parameter to True). In addition, I selected
> "weka.filters.unsupervised.attribute.StringToWordVector" as the filter of
> FilteredClassifier.
>
> HTH.
> Ralf
>
> On Friday, May 1, 2015 3:10 PM, Ralf <[hidden email]> wrote:
>
> SMO implements John Platt's sequential minimal optimization algorithm for
> training a support vector classifier. This implementation globally replaces
> all missing values and transforms nominal attributes into binary ones.
> Moreover, it normalizes all attributes by default.
>
> However, in WEKA, if you have, in addition to the training set file, a test
> set stored in a different file, you have to replace the value of the last
> attribute (the class) with "?" in that test file. To perform prediction
> with, e.g., SMO, go to the Classify panel and, under "Test options", choose
> the "Supplied test set" option to load your test set file.
>
> PS: Regarding "PrecomputedKernelMatrixKernel", the kernel matrix needs to
> be stored in a separate file, which you load through the "kernelMatrixFile"
> option of PrecomputedKernelMatrixKernel. On the other hand, in order to get
> probability estimates with SMO, you have to fit logistic models to the
> outputs by setting the option "buildLogisticModels" to "True".
>
> Kind regards,
> Ralf
>
> On Fri, May 1, 2015 at 8:00 PM, james wafula [via WEKA] <[hidden email]>
> wrote:
>
> Many thanks Ralf, it is now OK. But I wish to know how this will be useful
> when I have a new dataset (a test set), i.e. when I want to predict the
> class for a new observation. If all I am providing is the index attribute
> that refers to the rows/columns of the kernel matrix, how will this
> information be sufficient to label a new observation appropriately?
>
> Best regards,
>
> James.
>
> On Friday, May 1, 2015 9:40 AM, Ralf <[hidden email]> wrote:
>
> I believe that the kernel you provided cannot deal with the data that you
> loaded. This kernel is based on a static kernel matrix that is read from a
> file. Instances must have a single nominal attribute (excluding the class).
> This attribute must be the first attribute in the file and its values are
> used to reference rows/columns in the kernel matrix. The second attribute
> must be the class attribute.
>
> Hope this helps.
> Ralf
>
> On Fri, May 1, 2015 at 4:09 AM, james wafula [via WEKA] <[hidden email]>
> wrote:
>
> Hi all,
>
> I am very new to Weka. I have played around with using a pre-computed
> kernel matrix and all was well until recently. Now I get the following
> error:
>
> Problem reading matrix from kernelMatrix.matrix
>
> What could be the problem? I have altered neither the kernel matrix nor
> the arff file.
>
> Best regards,
>
> James.
>
> On Thursday, April 30, 2015 6:57 PM, jason roger <[hidden email]> wrote:
>
> Dear Weka user,
>
> Does anybody know how Weka is able to select the "wordsToKeep" parameter of
> the StringToWordVector filter?
>
> Thanks a lot.
>
> Jason
>
> _______________________________________________
> Wekalist mailing list
> Send posts to: [hidden email]
> List info and subscription status:
> http://list.waikato.ac.nz/mailman/listinfo/wekalist
> List etiquette:
> http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html
>
> ------------------------------
> View this message in context: Re: Problem reading matrix from
> kernelMatrix.matrix
> <http://weka.8497.n7.nabble.com/no-subject-tp34407p34416.html>
> Sent from the WEKA mailing list archive <http://weka.8497.n7.nabble.com/>
> at Nabble.com.
>
> Attachments:
> *results.txt* (349K)
> <http://weka.8497.n7.nabble.com/attachment/34422/0/results.txt>
> *training set.arff* (10K)
> <http://weka.8497.n7.nabble.com/attachment/34422/1/training%20set.arff>
> *test set.arff* (4K)
> <http://weka.8497.n7.nabble.com/attachment/34422/2/test%20set.arff>

--
View this message in context: http://weka.8497.n7.nabble.com/no-subject-tp34407p34424.html
Sent from the WEKA mailing list archive at Nabble.com. |
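
On the question of how an index attribute alone can be enough to label a new observation: the sketch below is plain Java and uses no Weka classes. It only illustrates the lookup idea described in the thread, where the value of the first attribute selects a row/column of a static kernel matrix. The class name, method names, and the 3 x 3 matrix are hypothetical, and the sketch assumes that a new observation can be scored only if its similarities to the training instances were precomputed and stored in the same matrix.

```java
import java.util.Arrays;
import java.util.List;

/**
 * Illustration only (not Weka's implementation): a "kernel" that never looks
 * at feature values and only reads entries of a precomputed matrix.
 */
final class PrecomputedKernelSketch {

    /** Labels of the ID attribute; position i owns row/column i of the matrix. */
    private final List<String> ids;
    /** Square matrix of precomputed similarities. */
    private final double[][] matrix;

    public PrecomputedKernelSketch(List<String> ids, double[][] matrix) {
        if (matrix.length != ids.size()) {
            throw new IllegalArgumentException("Matrix must have one row per ID");
        }
        for (double[] row : matrix) {
            if (row.length != ids.size()) {
                throw new IllegalArgumentException("Matrix must be square");
            }
        }
        this.ids = ids;
        this.matrix = matrix;
    }

    /** Kernel value between two instances, identified only by their ID label. */
    public double eval(String idA, String idB) {
        return matrix[indexOf(idA)][indexOf(idB)];
    }

    private int indexOf(String id) {
        int i = ids.indexOf(id);
        if (i < 0) {
            throw new IllegalArgumentException("No row/column in the kernel matrix for ID: " + id);
        }
        return i;
    }

    public static void main(String[] args) {
        // Hypothetical data: IDs "0" and "1" are training instances,
        // ID "2" is the new observation we want to classify.
        List<String> ids = Arrays.asList("0", "1", "2");
        double[][] k = {
            {1.0, 0.2, 0.7},
            {0.2, 1.0, 0.1},
            {0.7, 0.1, 1.0}
        };
        PrecomputedKernelSketch kernel = new PrecomputedKernelSketch(ids, k);

        // Similarities of the new observation to each training instance are
        // available only because "2" has its own row in the matrix.
        System.out.println(kernel.eval("2", "0"));
        System.out.println(kernel.eval("2", "1"));

        // An ID that was not part of the precomputed matrix cannot be scored.
        try {
            kernel.eval("3", "0");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under that assumption, the similarity matrix would have to be recomputed to include a row and column for every new string before it can be classified this way. The FilteredClassifier with StringToWordVector route from the reply above avoids this, because the string is turned into numeric attributes at prediction time.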