Document similarity in MinorThird?

Help
sameendra
2012-05-18
2013-04-26
  • sameendra

    sameendra - 2012-05-18

    Hi,
    I have a training dataset of web pages related to a particular domain (one class). I want a similarity score that tells me how closely a new document (web page) is related to my training domain when I feed it to the classifier/clusterer. How do I do this in MinorThird?
    Thank You.

     
  • Frank Lin

    Frank Lin - 2012-05-23

    MinorThird does not come with clustering methods (though you can write one using the API). Most of the classifiers do not compare instances to each other directly; KnnClassifier/Learner is one that does, and by default it uses a cosine-related similarity function, so you can readily apply it to documents. You can modify the KnnClassifier class to output or log the similarity scores you want when classifying a new document.
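
    To illustrate the kind of score involved, here is a minimal sketch in plain Java of cosine similarity between a new document and its nearest training document. It does not use MinorThird at all: the class DocSimilarity, its method names, and the simple term-count features are assumptions made for illustration, and the similarity function inside KnnClassifier may differ in weighting and tokenization.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration only; none of this is MinorThird API.
final class DocSimilarity {

    // Build a raw term-count vector from lowercased alphanumeric tokens.
    static Map<String, Double> termFrequencies(String text) {
        Map<String, Double> tf = new HashMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                tf.merge(token, 1.0, Double::sum);
            }
        }
        return tf;
    }

    // Euclidean length of a sparse vector.
    static double norm(Map<String, Double> v) {
        double sum = 0.0;
        for (double x : v.values()) {
            sum += x * x;
        }
        return Math.sqrt(sum);
    }

    // Cosine similarity of two sparse vectors; 0 if either vector is empty.
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;
            }
        }
        double normA = norm(a);
        double normB = norm(b);
        return (normA == 0.0 || normB == 0.0) ? 0.0 : dot / (normA * normB);
    }

    // Score a new document by its most similar training document
    // (the nearest-neighbour similarity).
    static double maxSimilarity(List<String> trainingDocs, String newDoc) {
        Map<String, Double> query = termFrequencies(newDoc);
        double best = 0.0;
        for (String doc : trainingDocs) {
            best = Math.max(best, cosine(termFrequencies(doc), query));
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> training = Arrays.asList(
                "java machine learning toolkit",
                "text classification in java");
        System.out.println(maxSimilarity(training, "a java toolkit for text learning"));
        System.out.println(maxSimilarity(training, "banana bread recipe"));
    }
}
```

    With term counts as features the score falls between 0 (no shared terms) and 1 (same term proportions), so a threshold on it can serve as a rough "related to my domain" test for the one-class case.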

     
