Thread: RE: [Classifier4j-devel] Training and Classifying
From: Nick L. <nl...@es...> - 2004-05-17 23:12:35
> Hi Nick,
>
> The problem I am having is as follows:
>
> I wish to classify an unknown article against trained data, which is in
> the "word_probability" table.
>
> So, to do this I must use BayesianClassifier as follows:
>
> 1. Create an instance of JDBCWordsDataSource
> 2. Create a new instance of BayesianClassifier, passing the
>    JDBCWordsDataSource to the constructor
> 3. Call BayesianClassifier.classify(String)
>
> However, by creating an instance of JDBCWordsDataSource (1), the
> constructor for this method creates a new table.
> ------ which I do not want because I will have to delete the table
> containing all my trained data.

Well, that would be bad! Fortunately you are incorrect about the constructor: JDBCWordsDataSource detects if the table already exists, and if the table does exist it doesn't create it - see
<http://classifier4j.sourceforge.net/xref/net/sf/classifier4J/bayesian/JDBCWordsDataSource.html#235>

(Note that there is a bug in the 0.5 version of Classifier4J here if you are using MySQL on Unix - the table names are case sensitive. Get the CVS version.)

If you are actually seeing this behaviour (the table being dropped), could you please give more details about the database you are using, etc.

> What I want is to be able to retrieve my trained data (in a
> JDBCWordsDataSource??), then classify the new article against
> this data.
>
> How can I do this? I would very much appreciate some sample code.

It sounds like you are on the right track. Let me know if you try this and it doesn't work.

Nick
From: Nick L. <nl...@es...> - 2004-05-18 23:55:05
> Hi Nick,
>
> > Is PostgreSQL case sensitive with respect to table names?
>
> PostgreSQL is not case-sensitive - I have tested this.

Are you sure about this? There seems to be a fair bit of discussion about the inability to switch it off: <http://forums.devshed.com/archive/t-44274>

> > If that doesn't help, can you please send the result set returned from the
> > following code:
> >
> > DatabaseMetaData dbm = con.getMetaData();
> > ResultSet rs = dbm.getTables(null, null, "word_probability", null);
>
> I'm not too sure what you mean when you say to "send the result set" to
> you.
> What method should I call on the ResultSet?
>
> When I do:
>     ResultSetMetaData rM = rs.getMetaData();
>     String name = rM.getColumnName(2);
>
> The value of name is "table_schem". Shouldn't it be one of the names of
> the columns in the word_probability table??

No. The getTables(..) call gets data about the tables that exist in the database. See
<http://java.sun.com/j2se/1.4.2/docs/api/java/sql/DatabaseMetaData.html#getTables(java.lang.String,%20java.lang.String,%20java.lang.String,%20java.lang.String[])>

Could you please run code like

    System.out.println("word_probability (in lower case) Result Set:");
    DatabaseMetaData dbm = con.getMetaData();
    ResultSet rs = dbm.getTables(null, null, "word_probability", null);
    while (rs.next()) {
        System.out.println("TABLE_CAT = " + rs.getString("TABLE_CAT"));
        System.out.println("TABLE_SCHEM = " + rs.getString("TABLE_SCHEM"));
        System.out.println("TABLE_NAME = " + rs.getString("TABLE_NAME"));
        System.out.println("TABLE_TYPE = " + rs.getString("TABLE_TYPE"));
        System.out.println("REMARKS = " + rs.getString("REMARKS"));
        System.out.println("TYPE_CAT = " + rs.getString("TYPE_CAT"));
        System.out.println("TYPE_SCHEM = " + rs.getString("TYPE_SCHEM"));
        System.out.println("TYPE_NAME = " + rs.getString("TYPE_NAME"));
        System.out.println("SELF_REFERENCING_COL_NAME = "
            + rs.getString("SELF_REFERENCING_COL_NAME"));
        System.out.println("REF_GENERATION = " + rs.getString("REF_GENERATION"));
    }
    System.out.println("End of Result Set");
    rs.close();

    System.out.println("WORD_PROBABILITY (in UPPER case) Result Set:");
    dbm = con.getMetaData();
    rs = dbm.getTables(null, null, "WORD_PROBABILITY", null);
    while (rs.next()) {
        System.out.println("TABLE_CAT = " + rs.getString("TABLE_CAT"));
        System.out.println("TABLE_SCHEM = " + rs.getString("TABLE_SCHEM"));
        System.out.println("TABLE_NAME = " + rs.getString("TABLE_NAME"));
        System.out.println("TABLE_TYPE = " + rs.getString("TABLE_TYPE"));
        System.out.println("REMARKS = " + rs.getString("REMARKS"));
        System.out.println("TYPE_CAT = " + rs.getString("TYPE_CAT"));
        System.out.println("TYPE_SCHEM = " + rs.getString("TYPE_SCHEM"));
        System.out.println("TYPE_NAME = " + rs.getString("TYPE_NAME"));
        System.out.println("SELF_REFERENCING_COL_NAME = "
            + rs.getString("SELF_REFERENCING_COL_NAME"));
        System.out.println("REF_GENERATION = " + rs.getString("REF_GENERATION"));
    }
    System.out.println("End of Result Set");
    rs.close();

(Note that this hasn't been tested, but it should be pretty close.)
From: Neil G. <np...@nt...> - 2004-05-19 23:02:06
Hi Nick,

I really appreciate the advice you are providing.

After running the code you provided, the following is output:

    word_probability (in lower case) Result Set:
    TABLE_CAT = null
    TABLE_SCHEM = public
    TABLE_NAME = word_probability
    TABLE_TYPE = TABLE
    REMARKS = null
    End of Result Set
    WORD_PROBABILITY (in UPPER case) Result Set:
    End of Result Set

I had to comment out TYPE_CAT, TYPE_SCHEM, TYPE_NAME, SELF_REFERENCING_COL_NAME and REF_GENERATION because an error occurred saying that these column names could not be found.

The Java API says: "Note: Some databases may not return information for all tables."

What I understand from the above output is that when UPPER case is used, the Result Set does not have any values.
Do you agree?

Regards

Neil
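The errors for TYPE_CAT and the other four columns suggest that the driver's getTables(..) result simply does not include them. One way around commenting lines out by hand is to ask ResultSetMetaData which columns are present and print only those. The following is a minimal sketch using only the standard java.sql API; the class and method names (MetaDataDump, presentColumns, dump) are invented for this example and are not part of Classifier4J.

```java
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only.
final class MetaDataDump {

    // Keeps the wanted columns that the driver actually reports, ignoring case
    // (the thread shows the driver reporting "table_schem" in lower case).
    static List<String> presentColumns(List<String> reported, List<String> wanted) {
        List<String> present = new ArrayList<>();
        for (String w : wanted) {
            for (String r : reported) {
                if (w.equalsIgnoreCase(r)) {
                    present.add(w);
                    break;
                }
            }
        }
        return present;
    }

    // Prints every row of rs, restricted to the wanted columns that exist.
    static void dump(ResultSet rs, List<String> wanted) throws SQLException {
        ResultSetMetaData md = rs.getMetaData();
        List<String> reported = new ArrayList<>();
        for (int i = 1; i <= md.getColumnCount(); i++) {
            reported.add(md.getColumnName(i));
        }
        List<String> columns = presentColumns(reported, wanted);
        while (rs.next()) {
            for (String c : columns) {
                System.out.println(c + " = " + rs.getString(c));
            }
        }
    }
}
```

Calling MetaDataDump.dump(rs, wanted) with the ten column names from the diagnostic code would then skip whichever columns a given driver leaves out, instead of throwing.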
From: Nick L. <nl...@es...> - 2004-05-19 23:33:13
That means that the table name lookup on PostgreSQL is case sensitive.

Please get the CVS version (in particular
<http://cvs.sourceforge.net/viewcvs.py/*checkout*/classifier4j/Classifier4J/src/java/net/sf/classifier4J/bayesian/JDBCWordsDataSource.java?rev=1.14>)
or just apply the modification yourself:
<http://cvs.sourceforge.net/viewcvs.py/classifier4j/Classifier4J/src/java/net/sf/classifier4J/bayesian/JDBCWordsDataSource.java?r1=1.13&r2=1.14>

Hopefully that will fix your problems.

Nick
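For readers hitting the same problem: PostgreSQL folds unquoted identifiers to lower case, which would explain why the metadata lookup for "WORD_PROBABILITY" came back empty while "word_probability" matched. A minimal sketch of a case-insensitive existence check follows. It is not the actual change made in revision 1.14 of JDBCWordsDataSource; the class and method names (TableLookup, matchesAny, tableExists) are made up for illustration, and only standard java.sql calls are used.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Classifier4J.
final class TableLookup {

    // True if any reported table name equals the wanted name, ignoring case.
    static boolean matchesAny(List<String> reportedNames, String wanted) {
        for (String name : reportedNames) {
            if (name != null && name.equalsIgnoreCase(wanted)) {
                return true;
            }
        }
        return false;
    }

    // Lists every table the driver reports and compares names ignoring case,
    // so the result does not depend on how the database stores identifiers.
    static boolean tableExists(Connection con, String wanted) throws SQLException {
        DatabaseMetaData dbm = con.getMetaData();
        List<String> names = new ArrayList<>();
        try (ResultSet rs = dbm.getTables(null, null, "%", new String[] {"TABLE"})) {
            while (rs.next()) {
                names.add(rs.getString("TABLE_NAME"));
            }
        }
        return matchesAny(names, wanted);
    }
}
```

Listing all tables is slower than a targeted lookup on a large schema; trying the name as given, then in lower case, then in upper case would be a cheaper alternative under the same assumptions.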
From: Neil G. <np...@nt...> - 2004-05-20 10:01:12
Thanks Nick,

The case-sensitivity was the problem.

Thanks for your help.

Regards,

Neil
From: Nick L. <ni...@ma...> - 2004-05-20 11:34:53
Excellent. I'm glad that it worked.

Nick
From: <br...@bj...> - 2004-05-24 21:52:01
|
(this may be outdated.. I tried sending this a few days ago and I got a mail rejected message from sourceforge.net.. I have contacted them about this issue) Not sure if this helps or not (I havent been following this whole email thread).. but here is the code I use for classifying emails as spam: ... Class.forName("com.mysql.jdbc.Driver"); DriverMangerJDBCConnectionManager cm = new DriverMangerJDBCConnectionManager("jdbc:mysql://localhost/webgate", "dbuser", "dbpass"); JDBCWordsDataSource wds = new JDBCWordsDataSource(cm); BayesianClassifier classifier = new BayesianClassifier(wds); String classifyString = subject + " " + body; classifier.teachMatch("spam", classifyString); ... This just classifies a string in a "spam" category. Hope this helps. - Brent > -----Original Message----- > From: np...@nt... [mailto:np...@nt...] > Sent: Sunday, May 16, 2004 4:00 PM > To: cla...@li... > Cc: np...@nt... > Subject: Re: Re: [Classifier4j-devel] Training and Classifying > > Hi Nick, > > The problem I am having is as follows: > > I wish to classify an unknown article against trained data, which is > in the "word_probability" table. > > So, to do this I must use BayesianClassifier as follows: > > 1. Create an instance of JDBCWordsDataSource > 2. Create a new instance of BayesianClassifier, passing the > JDBCWordsDataSource to the constructor > 3. Call BayesianClassifier.classify(String) > > However, by creating an instance of JDBCWordsDataSource (1), the > constructor for this method creates a new table. > ------ which I do not want because I will have to delete the table > containing all my trained data. > > What I want is to be able to retrieve my trained data (in a > JDBCWordsDataSource??), then classify the new article against this > data. > > How can I do this? I would very much appreciate some sample code. > > Regards > > Nigel > > > > > > > > > > > > > From: Nick Lothian <ni...@ma...> > > Date: 2004/05/15 Sat PM 01:24:43 GMT > > To: cla...@li... 
> > Subject: Re: [Classifier4j-devel] Training and Classifying
> >
> > Are you referring to the Trainer.java & Analyser.java in the
> > net.sf.classifier4J.demo package? If so then these are intended as a
> > demo only (and a pretty old one at that).
> >
> > Classifier4J supports training directly via the teachMatch method (see
> > http://classifier4j.sourceforge.net/xref/net/sf/classifier4J/bayesian/BayesianClassifier.html#193).
> >
> > That supports data persistence via the IWordsDataSource interface,
> > in-particular the JDBCWordsDataSource
> > <../../../../net/sf/classifier4J/bayesian/JDBCWordsDataSource.html>
> > implementation of it.
> >
> > Nick
> >
> > np...@nt... wrote:
> >
> > >Hi,
> > >
> > >I would like to use Classifier4J to input "good" and "bad" articles;
> > >then provide the system with a new article for classification.
> > >
> > >Using Trainer.java I can input my articles into a table. However, how
> > >do I continue adding articles to the table?
> > >The following line in Trainer.java always creates a new table, I would
> > >like to continue adding to an existing table:
> > >
> > >JDBCWordsDataSource wds = new JDBCWordsDataSource(cm);
> > >
> > >I have a similar problem with Analyser.java. I would like the new
> > >article to be classified against the table built from Trainer.java.
> > >However, the setupClassifier method in Analyser.java always creates a
> > >new table.
> > >
> > >I would very much appreciate any advice.
> > >
> > >Regards
> > >
> > >-----------------------------------------
> > >Email provided by http://www.ntlhome.com/
> > >
> > >-------------------------------------------------------
> > >This SF.Net email is sponsored by: SourceForge.net Broadband Sign-up
> > >now for SourceForge Broadband and get the fastest
> > >6.0/768 connection for only $19.95/mo for the first 3 months!
> > >http://ads.osdn.com/?ad_id=2562&alloc_id=6184&op=click
> > >_______________________________________________
> > >Classifier4j-devel mailing list
> > >Cla...@li...
> > >https://lists.sourceforge.net/lists/listinfo/classifier4j-devel
|
From: Neil G. <np...@nt...> - 2004-05-18 10:44:02
|
Hi Nick,

Thanks for the reply; however, there is still a problem.

The following outlines how I am trying to access the data I have trained in the "word_probability" table. I wish to classify the String 'contents' against the data in the "word_probability" table:

    DriverMangerJDBCConnectionManager cm =
        new DriverMangerJDBCConnectionManager(connString, user, pw);
    JDBCWordsDataSource wds = new JDBCWordsDataSource(cm);
    ITokenizer tokenizer = new DefaultTokenizer();
    IClassifier classifier = new BayesianClassifier(wds, tokenizer);
    classifier.classify(contents);

I get an error message saying:

    net.sf.classifier4J.bayesian.WordsDataSourceException: Problem creating table

and

    Caused by: java.sql.SQLException: ERROR: Relation 'word_probability' already exists

It seems as if the JDBCWordsDataSource constructor always tries to create the table, when I just want to be able to read from the existing table!

I am using a PostgreSQL database.

I would very much appreciate it if you could provide some code that lets me read from the "word_probability" table and then use the standard methods to classify an unknown string.

Regards

Neil
|
From: Nick L. <ni...@ma...> - 2004-05-18 11:48:46
|
Neil Gandhi wrote:

>Hi Nick,
>
>Thanks for the reply, however there is still a problem.
>
>The following outlines how I am trying to access the data I have trained
>in the "word_probability" table. I wish to classify the String
>'contents' against the data in the "word_probability" table:
>
>    DriverMangerJDBCConnectionManager cm = new
>DriverMangerJDBCConnectionManager(connString, user, pw);
>
>    JDBCWordsDataSource wds = new JDBCWordsDataSource(cm);
>
>    ITokenizer tokenizer = new DefaultTokenizer();
>
>    IClassifier classifier = new BayesianClassifier(wds, tokenizer);
>
>    classifier.classify(contents);
>
>I get a system error message saying:
>net.sf.classifier4J.bayesian.WordsDataSourceException: Problem creating
>table
>
>and
>
>Caused by: java.sql.SQLException: ERROR: Relation 'word_probability'
>already exists
>
>It seems as if the JDBCWordsDataSource constructor is always trying to
>create a table, when I just want to be able to read from the existing
>table!
>
>I am using a Postgres SQL database.
>
>I would very much appreciate you providing some code which enables me to
>read from the "word_probability" table and then use the standard methods
>to classify an unknown string.

The code you have is exactly correct.

As previously noted, Classifier4J has been tested with MySQL & HSQLDB. PostgreSQL _should_ work fine with it, provided the JDBC driver supports the database metadata functionality correctly.

Is PostgreSQL case sensitive with respect to table names? If so, there was a bug we fixed after the 0.5 release that may affect you. Get the latest version from CVS (or make the change yourself: see <http://cvs.sourceforge.net/viewcvs.py/classifier4j/Classifier4J/src/java/net/sf/classifier4J/bayesian/JDBCWordsDataSource.java?r1=1.13&r2=1.14>).

If that doesn't help, can you please send the result set returned from the following code:

    DatabaseMetaData dbm = con.getMetaData();
    ResultSet rs = dbm.getTables(null, null, "word_probability", null);

Nick
|
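A minimal sketch of the metadata check discussed above, using only the standard java.sql API. The names `TableProbe`, `caseVariants` and `findTables` are hypothetical and not part of Classifier4J, and it is only an assumption that a case mismatch in the table name explains why an existing table goes undetected. The sketch queries getTables with the name as given, lower-cased and upper-cased, and reports every table that any variant matches.

```java
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper (not part of Classifier4J) for probing table metadata. */
class TableProbe {

    /** Returns the name as given, lower-cased and upper-cased, without duplicates. */
    public static List<String> caseVariants(String tableName) {
        Set<String> variants = new LinkedHashSet<>();
        variants.add(tableName);
        variants.add(tableName.toLowerCase(Locale.ROOT));
        variants.add(tableName.toUpperCase(Locale.ROOT));
        return new ArrayList<>(variants);
    }

    /**
     * Asks the driver for tables matching each case variant and returns one
     * "schema.table (type)" entry per distinct match. Note that getTables
     * treats its third argument as a pattern, so "_" matches any single character.
     */
    public static List<String> findTables(DatabaseMetaData dbm, String tableName)
            throws SQLException {
        Set<String> found = new LinkedHashSet<>();
        for (String variant : caseVariants(tableName)) {
            try (ResultSet rs = dbm.getTables(null, null, variant, null)) {
                while (rs.next()) {
                    found.add(rs.getString("TABLE_SCHEM") + "." + rs.getString("TABLE_NAME")
                            + " (" + rs.getString("TABLE_TYPE") + ")");
                }
            }
        }
        return new ArrayList<>(found);
    }

    public static void main(String[] args) {
        // Shows which names would be tried; no database connection is needed for this part.
        System.out.println(caseVariants("word_probability"));
    }
}
```

With an open connection `con`, calling `TableProbe.findTables(con.getMetaData(), "word_probability")` would return an empty list if the driver reports no such table under any of the three spellings, which is the situation in which a "create table" attempt would be expected.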
From: Neil G. <np...@nt...> - 2004-05-18 14:24:29
|
Hi Nick,

> Is PostgreSQL case sensitive with respect to table names?

PostgreSQL is not case-sensitive - I have tested this.

> If that doesn't help, can please send the result set returned from the
> following code:
>
> DatabaseMetaData dbm = con.getMetaData();
> ResultSet rs = dbm.getTables(null, null, "word_probability", null);

I'm not too sure what you mean when you say to "send the result set" to you. What method should I call on the ResultSet?

When I do:

    ResultSetMetaData rM = rs.getMetaData();
    String name = rM.getColumnName(2);

the value of name is "table_schem". Shouldn't it be one of the names of the columns in the word_probability table?

Regards

Neil
|
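For reference, a hedged sketch of one way to turn a JDBC ResultSet into text that can be pasted into an email. It relies only on the standard java.sql API; `ResultSetDump`, `formatRow`, `dump` and `tableColumns` are hypothetical names introduced for illustration. The result of getTables describes tables, so its own column labels are metadata fields such as the table schema and table name; the columns of the word_probability table itself would come from getColumns instead.

```java
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Classifier4J) for printing JDBC result sets. */
class ResultSetDump {

    /** Formats one row as comma-separated "label=value" pairs. */
    public static String formatRow(List<String> labels, List<String> values) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < labels.size(); i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(labels.get(i)).append('=').append(values.get(i));
        }
        return sb.toString();
    }

    /** Reads every remaining row of the result set into one printable line per row. */
    public static List<String> dump(ResultSet rs) throws SQLException {
        ResultSetMetaData meta = rs.getMetaData();
        List<String> labels = new ArrayList<>();
        for (int c = 1; c <= meta.getColumnCount(); c++) { // JDBC columns are 1-based
            labels.add(meta.getColumnLabel(c));
        }
        List<String> lines = new ArrayList<>();
        while (rs.next()) {
            List<String> values = new ArrayList<>();
            for (int c = 1; c <= labels.size(); c++) {
                values.add(rs.getString(c)); // may be null; printed as "null"
            }
            lines.add(formatRow(labels, values));
        }
        return lines;
    }

    /** Lists the column names of the table itself, as reported by getColumns. */
    public static List<String> tableColumns(DatabaseMetaData dbm, String tableName)
            throws SQLException {
        List<String> names = new ArrayList<>();
        try (ResultSet rs = dbm.getColumns(null, null, tableName, null)) {
            while (rs.next()) {
                names.add(rs.getString("COLUMN_NAME"));
            }
        }
        return names;
    }
}
```

Calling `ResultSetDump.dump(dbm.getTables(null, null, "word_probability", null))` and printing each returned line would show whether the driver reports the table at all; an empty list means it reported none.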