<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Recent changes to Modules</title><link>https://sourceforge.net/p/macsy/wiki/Modules/</link><description>Recent changes to Modules</description><atom:link href="https://sourceforge.net/p/macsy/wiki/Modules/feed" rel="self"/><language>en</language><lastBuildDate>Mon, 12 Nov 2012 20:32:21 -0000</lastBuildDate><atom:link href="https://sourceforge.net/p/macsy/wiki/Modules/feed" rel="self" type="application/rss+xml"/><item><title>WikiPage Modules modified by Panagiota Antonakaki</title><link>https://sourceforge.net/p/macsy/wiki/Modules/</link><description>&lt;pre&gt;--- v1
+++ v2
@@ -1,35 +1,47 @@
-1. WordCount : Add a field in each document between desired dates, that specifies the number of  
-    words of the desired field of input.
-2. TimeLineTagsCount : This module calculates the number of the documents in a specified period of 
-    time that have a Tag name of interest. The result is the distribution of this specific tag per day, 
-    and can be displayed on the screen if necessary.
-3. TimeLineFieldAverage : Calculates the average of a field's values of interest to the dates in a 
-    specified period of days.
-4. InnerProduct : Computes the inner product or the cosine similarity of a text of interest w (provided 
-    as an external file) and documents' field x in the BB and stores the result in specified output   
-    field name.
-5. ReplaceTags : Queries the input BB for docs having all input Tags. It then replaces all the input 
-    tags with all output tags.
-6. LanguageDetector : Queries the input BB for docs having input Tag. Then it classifies the language of 
-    the specified fields. The input tag then is being replaced by output tag that includes the txt's 
+These modules collect information from the stored documents and perform tasks such as extracting document features (for example, TF/IDF) or classifying the documents.
+
+***MODULES***
+&gt; 1. **FeaturesExtractorTFIDF**
+&gt; &gt; Module for creating TF/IDF features of a text field.
+
+&gt; * **SVMTagger**
+&gt; &gt; Module for classifying docs based on LibSVM.
+
+&gt; * **WordCount**
+&gt; &gt; Module that adds a field to each document between the desired dates, specifying the number of words in the desired input field.
+
+&gt; * **InnerProduct**
+&gt; &gt; This module computes the inner product or the cosine similarity between a text of interest w (provided as an external file) and a document field x in the BB, and stores the result under the specified output field name.
+
+&gt; * **InnerProductWithWeights**
+&gt; &gt; Computes the weighted inner product of a text of interest (as an input vocabulary) and documents in the database with a predefined Tag name and Field name, within a period of time. The module writes the result to each document's registry.
+
+&gt; * **UrlFeedFinder**
+&gt; &gt; Module for classifying docs based on LibSVM.
+
+&gt; * **ReplaceTags**
+&gt; &gt; Queries the input BB for docs having all input Tags. It then replaces all the input tags with all output tags.
+
+&gt; * **LanguageDetector**
+&gt; &gt; Queries the input BB for docs having the input Tag, then classifies the language of the specified fields. The input tag is then replaced by an output tag that includes the text's 
     language.
-7. BinaryRepresentation : Queries the input BB for all docs in a specific period of dates. Then it 
-    checks if the words from the INPUT_VOCABULARY_FILENAME is present to the doc's field specified by 
-    the user and it adds a tag if the number of words are greater than a threshold again specified by 
-    the user.
-8. OnLineLearningPerceptronOnWords : Implements online learning using Perceptron algorithm. It adjust 
-    the weight vector w according to the INPUT_LEARN_TAGS, and the learning information is printed on 
-    the screen but also on the txt file STATISTICAL. It also updates the documents in the database by 
-    adding:
-         a) a new tag to all processed docs (positive or negative according to the predicted output).
-         b) a field with y_hat value.
- 
-             y_hat(t) = &lt;w(t).x(t)&gt;
- 
-    The module works  on every document that carries all the tags in INPUT_TAG field.
-9. InnerProductWithWeights : Computes the weighted inner product of a text of interest (as an input 
-    vocabulary) and documents in the database with a predefined Tag name and Field name, within a period 
-    of time. The module writes the result on each document's registry. (I haven't test it with 
-    cosine...).
-10. DistributionField : Exports the distribution of a given Field and exports a histogram (optional) 
-    (I'm still working on the visualization of the histogram).
+
+&gt; * **BinaryRepresentation**
+&gt; &gt; Queries the input BB for all docs in a specific period of dates. It then checks whether the words from INPUT_VOCABULARY_FILENAME are present in the doc's field specified by the user, and adds a tag if the number of such words is greater than a threshold also specified by the user.
+
+&gt; * **OnLineLearningPerceptronOnWords**
+&gt; &gt; Implements online learning using the Perceptron algorithm. It adjusts the weight vector w according to the INPUT_LEARN_TAGS, and the learning information is printed on the screen and also written to the text file STATISTICAL. It also updates the documents in the database by adding:
+&gt; &gt; &gt; a) a new tag to all processed docs (positive or negative according to the predicted output).
+&gt; &gt; &gt; b) a field with y_hat value.
+&gt; &gt; &gt; &gt; y_hat(t) = &lt;w(t).x(t)\&gt;
+&gt; &gt; The module works on every document that carries all the tags in the INPUT_TAG field.
+
+&gt; * **OnLineLearningPerceptronOnFeatures**
+&gt; &gt; The only difference from the previous module is that it takes the already calculated features as input.
+
+&gt; * **OnLineLearningWinnowOnWords**
+&gt; &gt; Implements online learning using the Winnow algorithm. It adjusts the weight vector w according to the INPUT_LEARN_TAGS, and the learning information is printed on the screen and also written to the text file STATISTICAL. It also updates the documents in the database by adding:
+&gt; &gt; &gt; a) a new tag to all processed docs (positive or negative according to the predicted output).
+&gt; &gt; &gt; b) a field with y_hat value.
+&gt; &gt; &gt; &gt; y_hat(t) = &lt;w(t).x(t)\&gt;
+&gt; &gt; The module works on every document that carries all the tags in the INPUT_TAG field.
&lt;/pre&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Panagiota Antonakaki</dc:creator><pubDate>Mon, 12 Nov 2012 20:32:21 -0000</pubDate><guid>https://sourceforge.net34c714875edc3cbb806c424375f16debf185cacc</guid></item><item><title>WikiPage Modules modified by Panagiota Antonakaki</title><link>https://sourceforge.net/p/macsy/wiki/Modules/</link><description>1. WordCount : Add a field in each document between desired dates, that specifies the number of  
    words of the desired field of input.
2. TimeLineTagsCount : Counts the documents in a specified period of time that carry a Tag name of 
    interest. The result is the per-day distribution of that tag, which can be displayed on the 
    screen if necessary.
3. TimeLineFieldAverage : Calculates the average value of a field of interest over the dates in a 
    specified period of days.
4. InnerProduct : Computes the inner product or the cosine similarity between a text of interest w 
    (provided as an external file) and a document field x in the BB, and stores the result under the 
    specified output field name.
5. ReplaceTags : Queries the input BB for docs having all input Tags. It then replaces all the input 
    tags with all output tags.
6. LanguageDetector : Queries the input BB for docs having the input Tag, then classifies the language 
    of the specified fields. The input tag is then replaced by an output tag that includes the text's 
    language.
7. BinaryRepresentation : Queries the input BB for all docs in a specific period of dates. It then 
    checks whether the words from INPUT_VOCABULARY_FILENAME are present in the doc's field specified by 
    the user, and adds a tag if the number of such words is greater than a threshold also specified 
    by the user.
8. OnLineLearningPerceptronOnWords : Implements online learning using the Perceptron algorithm. It 
    adjusts the weight vector w according to the INPUT_LEARN_TAGS, and the learning information is 
    printed on the screen and also written to the text file STATISTICAL. It also updates the documents 
    in the database by adding:
         a) a new tag to all processed docs (positive or negative according to the predicted output).
         b) a field with y_hat value.
 
             y_hat(t) = &lt;w(t).x(t)&gt;
 
    The module works on every document that carries all the tags in the INPUT_TAG field.
9. InnerProductWithWeights : Computes the weighted inner product of a text of interest (as an input 
    vocabulary) and documents in the database with a predefined Tag name and Field name, within a period 
    of time. The module writes the result to each document's registry. (I haven't tested it with 
    cosine...).
10. DistributionField : Exports the distribution of a given Field and, optionally, a histogram 
    (I'm still working on the visualization of the histogram).</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Panagiota Antonakaki</dc:creator><pubDate>Sat, 10 Nov 2012 02:56:22 -0000</pubDate><guid>https://sourceforge.net974e694f19035744a9d7f18e3df04c18c0618d9b</guid></item></channel></rss>