Hi,
I have a training dataset of web pages related to a particular domain (one class). I want a similarity score that tells me how closely a new document (web page) is related to that training domain when I feed it to the classifier/clusterer. How do I do this in MinorThird?
Thank You.
MinorThird does not come with clustering methods (though you can write one using the API). Most of the classifiers do not compare instance-instance similarity directly; KnnClassifier/Learner is one that does, and by default it uses a cosine-related similarity function so you can readily apply it to documents. You can modify the KnnClassifier class to output or log the similarity scores you want when classifying a new document.
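To make the idea concrete, here is a rough standalone sketch of the kind of score you could compute or log. It is plain Java and does not use MinorThird's API: the class `DomainSimilarity` and its methods are hypothetical names, the tokenizer is a naive split on non-alphanumeric characters, and raw term counts stand in for whatever feature weights your own pipeline produces. It scores a new document by its average cosine similarity to its k most similar training documents.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of MinorThird: scores how close a new
// document is to a one-class training set using cosine similarity.
class DomainSimilarity {

    // Build a bag-of-words vector of raw term counts (assumption: a naive
    // lowercase tokenizer is good enough for illustration).
    static Map<String, Double> termFrequencies(String text) {
        Map<String, Double> tf = new HashMap<>();
        for (String token : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                tf.merge(token, 1.0, Double::sum);
            }
        }
        return tf;
    }

    // Euclidean length of a sparse vector.
    static double norm(Map<String, Double> v) {
        double sumSquares = 0.0;
        for (double w : v.values()) {
            sumSquares += w * w;
        }
        return Math.sqrt(sumSquares);
    }

    // Cosine similarity between two sparse vectors; 0.0 if either is empty.
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            Double w = b.get(e.getKey());
            if (w != null) {
                dot += e.getValue() * w;
            }
        }
        double na = norm(a);
        double nb = norm(b);
        return (na == 0.0 || nb == 0.0) ? 0.0 : dot / (na * nb);
    }

    // Average cosine similarity to the k most similar training documents.
    static double domainScore(String newDoc, List<String> trainingDocs, int k) {
        Map<String, Double> query = termFrequencies(newDoc);
        List<Double> sims = new ArrayList<>();
        for (String doc : trainingDocs) {
            sims.add(cosine(query, termFrequencies(doc)));
        }
        sims.sort(Collections.reverseOrder());
        int n = Math.min(k, sims.size());
        if (n <= 0) {
            return 0.0;
        }
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += sims.get(i);
        }
        return sum / n;
    }
}
```

With term counts the score falls between 0 and 1, so you can pick a threshold on held-out pages from your domain to decide whether a new page counts as "related".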