MessageParser_Plugin

nanotube

The MessageParser plugin allows you to set custom regexp triggers, which will trigger the bot to respond if they match anywhere in the message. This is useful for those cases when you want a bot response even when the bot was not explicitly addressed by name or prefix character. Read on to see how to use it, and just how useful it can be.

The code for this plugin lives in the git repository of the gribble sourceforge project.

Commands

The following commands are available in the plugin:

  • add - add a trigger and its corresponding response command
  • remove - remove a trigger and its corresponding response command
  • show - show a trigger and its corresponding response command
  • info - show all info about a trigger and its corresponding response command
  • list - show all triggers and their ids
  • rank - show the 20 top-used triggers in rank order, along with actual usage counts
  • lock - lock a trigger; a locked trigger cannot be overwritten or deleted
  • unlock - unlock a trigger; an unlocked trigger can be overwritten or deleted
  • vacuum - vacuum the sqlite3 database

MessageParser.add

To add a trigger, use the obviously-named "messageparser add" command. It takes two arguments: the regexp (using standard Python-style regular expressions), and the command that is executed when it matches. If either of those contains spaces, it must be quoted. If it contains quotes, those quotes must be escaped.

Here is a basic example command:

messageparser add "some stuff" "echo I saw some stuff!"

Once that is added, any message that contains the string "some stuff" will cause the bot to respond with "I saw some stuff!".

The response command string can contain placeholders for regexp match groups. These will be interpolated into the string. Here's an example:

messageparser add "my name is (\w+)" "echo hello, $1!"

If you then send a message "hi, my name is bla", the bot will respond with "hello, bla!".
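To make the placeholder substitution concrete, here is a minimal Python sketch of the idea. It is not the plugin's actual code: the helper name interpolate is hypothetical, and the sketch only assumes that $1, $2, and so on stand for the regexp's match groups, as described above.

```python
import re

def interpolate(regexp, response, message):
    """Return response with each $N replaced by match group N, or None if no match."""
    match = re.search(regexp, message)
    if match is None:
        return None
    # Replace each $N placeholder with the text captured by group N.
    return re.sub(r"\$(\d+)", lambda m: match.group(int(m.group(1))), response)

# Prints: echo hello, bla!
print(interpolate(r"my name is (\w+)", "echo hello, $1!", "hi, my name is bla"))
```

The resulting string is what the bot would then run as a command.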

If more than one regexp trigger matches, each one will cause its respective response. If one regexp matches multiple times in a message, it will cause multiple responses.

Tip: If you want to avoid multiple occurrences of the match in the line triggering multiple times, just add a ".*" to your regexp, to consume the rest of the line after the first match.
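The effect of the trailing ".*" can be checked with Python's re module directly. This sketch assumes the plugin counts non-overlapping matches in the same way re.finditer does; the sample message is made up for illustration.

```python
import re

message = "some stuff and then some stuff again"

# Without the trailing ".*", the pattern matches twice in this message.
plain = [m.group(0) for m in re.finditer(r"some stuff", message)]

# With ".*", the first match consumes the rest of the line, leaving only one match.
greedy = [m.group(0) for m in re.finditer(r"some stuff.*", message)]

# Prints: 2 1
print(len(plain), len(greedy))
```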

You can use arbitrary supybot commands as the action - be creative, and don't limit yourself to 'echo'. Your imagination is the limit! You'll probably also enjoy the useful examples section.

regexp uniqueness

The regexp triggers are set to be unique: if you add the same regexp on top of an existing one, its response string will be overwritten. This is deliberate, to avoid accidental spam from multiple instances of the same regexp. If, however, you really want multiple responses to happen to one trigger, you can always tweak your regexp with inline flags that do not consume any characters. My favorites for this are '(?i)', which makes the regexp case-insensitive, and '(?m)', which makes the regexp multiline. See the Python documentation on the re module for more info.

So, for example, if you want to set multiple triggers on someone saying "stuff", you could add triggers for "stuff", "(?m)stuff", "(?m)(?m)stuff", "(?m)(?m)(?m)stuff", etc. If you want it to be case-insensitive, you can use (?i) to the same effect.
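The following Python sketch shows why this trick works. It models the trigger table as a dict keyed by the regexp string, which is an assumption made for illustration and not a description of the plugin's sqlite3 schema.

```python
import re

# A dict keyed by the regexp string models "adding the same regexp overwrites".
triggers = {}
for regexp in ["stuff", "stuff", "(?m)stuff", "(?m)(?m)stuff"]:
    triggers[regexp] = "echo saw: " + regexp

# The duplicate "stuff" was overwritten; the flagged variants are distinct keys,
# yet every remaining regexp still matches the same message.
matching = [r for r in triggers if re.search(r, "some stuff here")]

# Prints: 3 3
print(len(triggers), len(matching))
```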

But generally it's a good idea to avoid spamminess. :)

MessageParser.remove

You can remove a trigger using the remove command, by specifying the verbatim regexp you want to remove the trigger for, or by id if you add the --id option. Here's a simple example:

messageparser remove "some stuff"

This would remove the trigger for "some stuff" if you have set one.

If you know the trigger id (which can be seen in 'list' output, or in 'info' output), you can remove by id. E.g.:

messageparser remove --id 12

would remove the regexp entry with id 12.

MessageParser.show

You can show the contents of the response string for a particular trigger by using the show command, and specifying the verbatim regexp you want to display, or by id if you add the --id option. Here's an example:

messageparser show "my name is (\w+)"

will display the trigger with its associated response string, or, by id:

messageparser show --id 12

will display the trigger and response with id 12.

MessageParser.list

The list command lists all the regexps currently in the database, with their corresponding ids in parentheses. It takes no arguments when used in a channel; if you send it outside of a channel, specify the channel name as an argument. Here's an example of its use:

messageparser list

MessageParser.rank

By default, the plugin keeps statistics on how many times each regexp has been triggered. The rank command shows the regexps sorted in descending order of trigger count. The number in parentheses after each regexp is its trigger count. The maximum number of entries shown by this command is controlled by the rankListLength config.

If you overwrite an existing regexp by using 'add' on an existing regexp, its usage count will be preserved.

MessageParser.vacuum

This command vacuums the sqlite3 database of regexps. By default it requires the 'admin' capability to run. This can be changed through the requireVacuumCapability config.

Configuration

Supybot configuration is self-documenting. Run

config list plugins.messageparser

for a list of config keys, and

config help <config key>

for help on each one.

Useful examples

Here are a few examples of how this plugin can be extremely useful.

Single word in-line commands

messageparser add ",,(\w+)" "$1"

This one causes the bot to take one-word commands from in-message, if they're preceded by double-comma. So you could send a message like "Show me your ,,version and your ,,uptime", and you'd get two responses back, one with version, one with uptime. Also very useful for calling up factoids in-message. E.g. "see the ,,rsync docs" will output the rsync factoid, if the plugin is loaded and the rsync factoid exists.
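As a quick check of what this regexp captures, the snippet below runs it over the sample message with Python's re module. It only shows the extracted command names; running them is left to the bot.

```python
import re

message = "Show me your ,,version and your ,,uptime"

# Each ",,word" occurrence yields one captured command name.
commands = [m.group(1) for m in re.finditer(r",,(\w+)", message)]

# Prints: ['version', 'uptime']
print(commands)
```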

Multi-word in-line commands

messageparser add ",,\(([^\)]*?)\)" "$1"

This one causes the bot to take multi-word commands from in-message, if they're preceded by a double comma and an open parenthesis, and closed with a close parenthesis. So you could send a message like "I'd like a ,,(factoids search ) please", and you'd get the output of command 'factoids search '.
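The capture can be tried out in Python as follows. The message and the "echo hello world" command are made up for illustration.

```python
import re

MULTI_WORD = r",,\(([^\)]*?)\)"
message = "Please run ,,(echo hello world) and also ,,version for me"

# Only the parenthesized form matches; the bare ",,version" is ignored by this regexp.
commands = [m.group(1) for m in re.finditer(MULTI_WORD, message)]

# Prints: ['echo hello world']
print(commands)
```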

Output-redirect style addressing

messageparser add "^\)(.+)\s+\|\s+([\w\-\[\]\`^{}\|]+)" "echo $2: [$1]"

This implements the "output redirect" style functionality, so you can have the bot address a nick of your choice with the output of a command. So, e.g., a message like ")version | JoeSmith" will cause the bot to respond with "JoeSmith: <version output here>". Note here that you don't want the starting character to be your bot's regular trigger character, since messageparser ignores commands directly addressed to the bot to avoid unintentional 'double-output'. Note also that the complexity of the second match group is due to it including all characters allowed in IRC nicks.
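To see how the two match groups split such a message, here is a small Python sketch. It only builds the response command string; the "[...]" nesting would be evaluated by the bot, not by this code. The nick "nick[away]" is a made-up example of a nick with special characters.

```python
import re

REDIRECT = re.compile(r"^\)(.+)\s+\|\s+([\w\-\[\]\`^{}\|]+)")

def build_response(message):
    """Return the command string the trigger would run, or None if there is no match."""
    m = REDIRECT.search(message)
    if m is None:
        return None
    command, nick = m.groups()
    # Mirrors the response template "echo $2: [$1]".
    return "echo {}: [{}]".format(nick, command)

# Prints: echo JoeSmith: [version]
print(build_response(")version | JoeSmith"))
# Prints: echo nick[away]: [uptime]
print(build_response(")uptime | nick[away]"))
```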

Ex-post typo correction

Thanks to Joe Julian for this idea.

mp add "^(s/(.+)/(.*)/([gi]*))$" "echo What $nick meant to say was: [re \"$1\" [histsearch \"$2\" [cif [match i \"$4 \"] \"echo i\" \"utilities ignore\"]]]"
alias add histsearch "last --from [echo $nick] --regexp \"/^(?!s\/).*$1.*/@1\" --in [echo $channel]"

Here we allow a user to enter a sed-style substitution expression in a message, using the forward slash as the delimiter (a message of this form: "s/pattern/replacement/flags"). The bot then looks up the last message posted by the user that matches the pattern expression, applies the substitution to it, and outputs the corrected message.

The setup here is rather complicated, so let's go through it bit by bit.

  • "^(s/(.+)/(.*)/([gi]*))$" - This matches any message that is of the form "s/pattern/replacement/flags". Replacement may be empty, but pattern may not be. Allowable flags are 'g', for global, and 'i', for case-insensitive. The match groups in this regexp are numbered from left to right by their opening parenthesis. Thus, group 1 is the whole expression, group 2 is the pattern expression, group 3 is the replacement expression, and group 4 is the flags.
  • [cif [match i \"$4 \"] \"echo i\" \"utilities ignore\"] - This gives the 'i' flag to the histsearch command as an argument, if the 'i' flag was given in the sed regexp. Note the quotes and a space around '$4' - we are making sure we are not passing an empty string to match, even if group 4 is empty.
  • The histsearch command looks up the last message in the channel from the user that matches the pattern part of the sed expression, as long as the message itself is not a sed expression (doesn't start with 's/'). It takes an optional argument (@1), which is the expression modifier flag (the only useful one is 'i' for case-insensitivity).
  • The nested re command call uses the whole sed expression as its regexp, then passes the pattern part to histsearch as the first argument, and optionally the 'i' flag, if present. The conditional cif nesting is due to the fact that 'g' is not a valid flag for a matching regexp, which is what the last command is using to find the matching message. Thus we cannot just pass all the flags on to the last command.
  • Note also the quotes around the arguments passed to re and histsearch. These are necessary to be able to pass the regexp literally, and as one argument. Otherwise, spaces in the regexp would break up the argument, and regexp character classes (which are represented with square brackets) would be treated as nested commands, which is not what we want.

One quirk of this is that backreferences, which usually can be passed directly to the re command as '\1', for example, have to be double-backslashed. Otherwise, in parsing the string, the bot takes that to mean the literal \x01 character.

The result of this can be seen in the following sample session:

<nanotube> fill out tihs from
<nanotube> s/from/form/
<gribble> What nanotube meant to say was: fill out tihs form
<nanotube> s/tihs from/this form/
<gribble> What nanotube meant to say was: fill out this form
<nanotube> stuff
<nanotube> s/([stuf])/\\1\\1/g
<gribble> What nanotube meant to say was: ssttuuffff
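
As a rough illustration of what the trigger and alias accomplish together, the following Python sketch reproduces the sample session without the bot. It is an approximation under stated assumptions: correct and history are hypothetical names, the history is a plain list of one user's earlier messages, and Python's re.sub stands in for the bot's re command. Because no bot-level string parsing is involved here, the backreference is written with a single backslash.

```python
import re

# Same trigger regexp as above: group 2 = pattern, group 3 = replacement, group 4 = flags.
SED_RE = re.compile(r"^(s/(.+)/(.*)/([gi]*))$")

def correct(history, message):
    """Apply a sed-style message to the latest matching line in history."""
    m = SED_RE.match(message)
    if m is None:
        return None
    _, pattern, replacement, flags = m.groups()
    # Only 'i' is meaningful for the history search; 'g' affects the substitution only.
    re_flags = re.IGNORECASE if "i" in flags else 0
    count = 0 if "g" in flags else 1  # count=0 means "replace all" in re.sub
    for line in reversed(history):
        if line.startswith("s/"):
            continue  # skip earlier sed expressions, as histsearch does
        if re.search(pattern, line, re_flags):
            return re.sub(pattern, replacement, line, count=count, flags=re_flags)
    return None

history = ["fill out tihs from", "s/from/form/"]
print(correct(history, "s/tihs from/this form/"))    # fill out this form
print(correct(["stuff"], r"s/([stuf])/\1\1/g"))      # ssttuuffff
```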

More examples

If you come up with other useful examples, please do let me know and I'll include them here.


Related

Wiki: Conditional_Plugin
Wiki: Gribble_Project_Git_Repository
Wiki: Plugin_testing
