Main Page

MySQL full-text parser plugin collection

In MySQL 5.1 and later, the built-in full-text parser can be replaced with a user-provided one through the plugin interface. The MySQL full-text parser plugin collection project (mysqlftppc) provides several such full-text parser plugins. The latest mysqlftppc release is 1.6.1.

Installation & Settings

Requirements

On Fedora 14, for example, install the mysql, mysql-devel, mecab, mecab-ipadic and libicu-devel packages.


Compiling

If you downloaded the tarball, extract it and run the configure script. If the script cannot find mysql_config, pass at least the --with-mysql-config=/path/to/mysql_config argument. If your mysqld was compiled as a 64-bit binary or with debug=full, take care to supply matching CFLAGS so that the plugin is built the same way as the server.

Installation

After running 'make install', the plugin must be loaded into the running mysqld process. Connect to MySQL as an administrative user and run an INSTALL PLUGIN statement:

INSTALL PLUGIN bigram SONAME 'libftbigram.so';

my.cnf

The mysqlftppc plugins (except the snowball plugin) are not affected by the MySQL system variables ft_min_word_len and ft_max_word_len. The upper limit on word length is 254 bytes and is hardcoded (the same as HA_FT_MAXBYTELEN in MyISAM).
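Because the limit is counted in bytes rather than characters, the number of characters that fit depends on the encoding. The following is a minimal sketch of that arithmetic, assuming the indexed text is stored as UTF-8; the helper names are hypothetical and are not part of mysqlftppc, and the sketch says nothing about what a plugin does with a word that exceeds the limit.

```python
# Illustration only: these helpers are hypothetical and not part of mysqlftppc.
MAX_WORD_BYTES = 254  # hardcoded limit quoted in the text above


def utf8_length(word: str) -> int:
    """Return the number of bytes the word occupies when encoded as UTF-8."""
    return len(word.encode("utf-8"))


def within_limit(word: str, limit: int = MAX_WORD_BYTES) -> bool:
    """Return True if the UTF-8 encoding of the word fits in the byte limit."""
    return utf8_length(word) <= limit


if __name__ == "__main__":
    ascii_word = "a" * 254      # 1 byte per character
    kana_word = "\u3042" * 85   # hiragana 'a', 3 bytes per character
    for w in (ascii_word, kana_word):
        print(len(w), "chars,", utf8_length(w), "bytes, fits:", within_limit(w))
```

Under this assumption a word made only of three-byte characters, which covers most Japanese text in UTF-8, reaches the limit after 84 characters.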

If you use the skip-grant-tables option, the server does not load the plugins registered with INSTALL PLUGIN, so you may want to load the plugin explicitly at server startup. Use plugin-load in my.cnf as follows:

[mysqld]
plugin-load=space=libftspace.so
;; If you have multiple plugins:
;; plugin-load=space=libftspace.so:mecab=libftmecab.so

Using subversion

If you want to use the latest source code, check it out from the Subversion repository. You can then generate the configure script like this:

$ svn co http://mysqlftppc.svn.sourceforge.net/svnroot/mysqlftppc/mecab/trunk/ mecab
$ cd mecab
$ aclocal
$ libtoolize --automake
$ automake --add-missing
$ automake
$ autoconf
$ ./configure --with-mysql-config=/path/to/mysql_config --with-mecab-config=/path/to/mecab/bin/mecab-config

Unicode normalization

MySQL's current implementation of the Unicode Collation Algorithm is incomplete, but it is useful enough for many real applications. If the default collation behaviour is adequate for your application, you do not need to compile the plugins with the ICU library. You can rely on a MySQL collation instead, which can be specified in the CREATE TABLE statement.

Examples:

mysql> SELECT 'ガギグゲゴ'='カキクケコ' COLLATE utf8_unicode_ci;
+-------------------------------------------------------------+
| 'ガギグゲゴ'='カキクケコ' COLLATE utf8_unicode_ci           |
+-------------------------------------------------------------+
|                                                           1 |
+-------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> SELECT '㍉'='ミリ' COLLATE utf8_unicode_ci;
+----------------------------------------+
| '㍉'='ミリ' COLLATE utf8_unicode_ci    |
+----------------------------------------+
|                                      1 |
+----------------------------------------+
1 row in set (0.00 sec)

mysql> SELECT 'ガ'='ガ' COLLATE utf8_unicode_ci;
+----------------------------------------+
| 'ガ'='ガ' COLLATE utf8_unicode_ci      |
+----------------------------------------+
|                                      1 |
+----------------------------------------+
1 row in set (0.00 sec)

mysql> SELECT '①'='1' COLLATE utf8_unicode_ci;
+-----------------------------------+
| '①'='1' COLLATE utf8_unicode_ci   |
+-----------------------------------+
|                                 1 |
+-----------------------------------+
1 row in set (0.01 sec)

You need to compile the plugins with the ICU library only when you want full control over Unicode normalization, typically when you want to decompose strings or normalize them into a compatibility form (NFKC, NFKD). When ICU is enabled and Unicode normalization is in use, the plugins consume more memory and CPU.
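To see what the normalization forms do to strings like the ones in the collation examples, here is a small sketch using the unicodedata module from Python's standard library. It is an illustration of the Unicode forms themselves, not of the plugins' ICU code path, and the helper names are hypothetical.

```python
import unicodedata


def normalize(text: str, form: str) -> str:
    """Normalize text to one of the Unicode forms: NFC, NFD, NFKC or NFKD."""
    return unicodedata.normalize(form, text)


def code_points(text: str) -> str:
    """Render a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


if __name__ == "__main__":
    samples = [
        "\u3349",  # SQUARE MIRI, a single compatibility character
        "\u2460",  # CIRCLED DIGIT ONE
        "\u30ac",  # KATAKANA LETTER GA, precomposed
    ]
    for s in samples:
        for form in ("NFC", "NFD", "NFKC", "NFKD"):
            result = normalize(s, form)
            print(form, code_points(s), "->", code_points(result))
```

The canonical forms (NFC, NFD) leave the compatibility characters U+3349 and U+2460 unchanged and only compose or decompose pairs such as U+30AB plus the combining mark U+3099, while the compatibility forms (NFKC, NFKD) additionally map U+3349 to the two katakana U+30DF U+30EA and U+2460 to the digit 1.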

Reporting bugs

Please use the Tracker if you have found a bug, have a question, or have something else to report.
