Content manipulation

September 22nd, 2009

Bad-word filter

IMSpector is able to remove offensive words from all IM messages. Three config options are used:

  • badwords_filename – Should be pointed at a file of naughty words, one per line.
  • badwords_replace_character – Should be a single character. Bad words will be replaced with the character. Default is an asterisk.
  • badwords_block_count – If a message contains more than this many bad words, the whole message is blocked rather than having the words replaced.

This filter is by no means uncircumventable, but the supplied list of naughty words is enough to filter most strong (English) swear words. The filter is not enabled by default.

This filter utilises the “categories” event field. If a replacement is made, the field will contain the number of replacements.
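The replace-or-block behaviour can be illustrated with a minimal Python sketch. This is not IMSpector's own code: the function name filter_message, the case-insensitive substring matching, and the choice to replace every character of a bad word are assumptions made for illustration only.

```python
import re


def filter_message(message, bad_words, replace_character="*", block_count=None):
    """Return (text, replacements); text is None when the message is blocked."""
    replacements = 0

    def mask(match):
        nonlocal replacements
        replacements += 1
        # Assumption: each character of the bad word is replaced.
        return replace_character * len(match.group(0))

    for word in bad_words:
        if word:  # ignore empty lines from the word file
            message = re.sub(re.escape(word), mask, message, flags=re.IGNORECASE)
    # Block only when there are MORE bad words than the configured count.
    if block_count is not None and replacements > block_count:
        return None, replacements
    return message, replacements
```

The second return value plays the role of the replacement count that the real filter records in the "categories" event field.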

File-backed ACL filtering

In addition to content replacement, IMSpector is able to completely block messages and other events from reaching the recipient. This is useful for, say, limiting people to certain contacts, or blocking a listed group of people. There are two implementations of this type of blocking: file based, and database (SQLite) based. In both cases, the mechanism used is a kind of ACL.

In the file-based filter, a single file holds the ACL. For example:

allow sales@hotmail.com client@hotmail.com company.com
allow admin@company.com
allow all support@company.com
deny all

The format of these lists is:

allow|deny localid|all [groupchat|remoteid1 ... remoteidN]

Lines are processed in file order, from top to bottom. The action is the first thing on the line, followed by the local ID, followed by an optional list of remote IDs. If the remote ID list is empty, then all remote IDs will match. Also, if the local ID is “all” then all local IDs will match. A special remote ID value of “groupchat” will match all groupchats. This can be used to block users from going into group chats, which might let them get around the ACL restrictions.

IDs can either be complete, such as user@company.com, or partial, such as company.com, which matches everyone at that domain.

Thus the example above translates to:

  1. sales@hotmail.com can talk to client@hotmail.com and everyone at company.com.
  2. The local user admin@company.com can talk to anyone at all.
  3. The remote user support@company.com can talk to any local user.
  4. Otherwise the communications are blocked.
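The evaluation order described above can be sketched in Python. This is a hedged illustration rather than IMSpector's implementation: the names parse_acl and check are hypothetical, the first matching line is assumed to decide the outcome, and a partial ID is assumed to match when it appears anywhere in the full ID.

```python
def parse_acl(text):
    """Parse ACL lines into (action, local_pattern, remote_patterns) tuples."""
    rules = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0] not in ("allow", "deny"):
            continue  # assumption: blank or malformed lines are skipped
        rules.append((fields[0], fields[1], fields[2:]))
    return rules


def check(rules, localid, remoteid, groupchat=False):
    """Return the action of the first matching line, or None if none matches."""
    for action, local_pattern, remote_patterns in rules:
        # "all" matches every local ID; otherwise use a partial (substring) match.
        if local_pattern != "all" and local_pattern not in localid:
            continue
        if not remote_patterns:
            return action  # an empty remote list matches every remote ID
        for pattern in remote_patterns:
            if pattern == "groupchat":
                if groupchat:
                    return action
            elif pattern in remoteid:
                return action
    return None


EXAMPLE_ACL = """\
allow sales@hotmail.com client@hotmail.com company.com
allow admin@company.com
allow all support@company.com
deny all
"""
```

With the example file, check(parse_acl(EXAMPLE_ACL), "sales@hotmail.com", "stranger@example.org") falls through the first three lines and is denied by the final "deny all".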

To enable ACL support, include the following option in the configuration file:

acl_filename=/path/to/file

Of course, the file must be readable by the user IMSpector runs as.

Database-backed filter

The DB-backed filter is not built by default. To build it, run make dbfilterplugin.so. SQLite client libraries are needed to build this plugin. It adds the following options to the config file:

  • db_filter_filename – The filename of the DB.

The table in this database will be called “lists”, and will be automatically created if needed:

CREATE TABLE IF NOT EXISTS lists (
id integer PRIMARY KEY AUTOINCREMENT,
localid text,
remoteid text,
action integer NOT NULL,
type integer NOT NULL,
timestamp integer NOT NULL );

localid and remoteid may be NULL, in which case the column matches any value when the table is searched. They can also be a domain, to match all users within the domain.

“action” can be one of the following values:

  • 1 – ACCEPT – allow the message to pass.
  • 2 – BLOCK – reject the message.
  • 3 – AWL – matching outgoing messages will have the localid and remoteid automatically inserted as ACCEPT rules, so replies can pass.

“type” can be one of the following values:

  • 1 – MANUAL – the type value to use for manually added rules.
  • 2 – AUTO – AWL entries will be given this type.

The timestamp is set when an entry is created by the AWL rules and is useful if one wishes to, say, remove all entries over a month old.
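As a rough sketch of such a clean-up, the following Python function deletes old AUTO entries. It assumes the timestamp column holds Unix epoch seconds, which the text above does not state; the function name prune_auto_entries and its parameters are hypothetical.

```python
import sqlite3
import time

AUTO = 2  # "type" value given to AWL-created entries


def prune_auto_entries(db, max_age_seconds=30 * 24 * 3600, now=None):
    """Delete AUTO entries older than max_age_seconds; return how many were removed."""
    # Assumption: timestamps are stored as Unix epoch seconds.
    now = int(time.time()) if now is None else now
    cursor = db.execute(
        "DELETE FROM lists WHERE type = ? AND timestamp < ?",
        (AUTO, now - max_age_seconds),
    )
    db.commit()
    return cursor.rowcount
```

It could be run periodically against the file named by db_filter_filename, for example prune_auto_entries(sqlite3.connect("/path/to/db")), where the path is a placeholder.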

The matching logic is as follows:

  1. First look for matches with action=ACCEPT, allowing the messages if we find any.
  2. If a message is outgoing, then look for action=AWL. If we find a match, then allow this message to pass, and automatically create a rule with action=ACCEPT and type=AUTO.
  3. Look for a match with action=BLOCK, blocking the message if we find any.
  4. Finally, allow the message.

Note that the ordering of rules within the table is not relevant; only the fact that a match was found somewhere in the table is important.

Example: to enable AWL for all local users except example@company.com, who should always be allowed, create three rows:

  1. localid=example@company.com, remoteid=NULL, action=ACCEPT, type=MANUAL
  2. localid=NULL, remoteid=NULL, action=AWL, type=MANUAL
  3. localid=NULL, remoteid=NULL, action=BLOCK, type=MANUAL
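The four matching steps and the three example rows can be tried out with a small Python sketch using the standard sqlite3 module. It is an approximation of the plugin's behaviour, not its source: the function names are hypothetical, step 3 is taken to mean action=BLOCK (value 2), only NULL and exact-equality matching are implemented (domain matching is left out), and timestamps are assumed to be Unix epoch seconds.

```python
import sqlite3
import time

ACCEPT, BLOCK, AWL = 1, 2, 3   # "action" values
MANUAL, AUTO = 1, 2            # "type" values

SCHEMA = """CREATE TABLE IF NOT EXISTS lists (
    id integer PRIMARY KEY AUTOINCREMENT,
    localid text, remoteid text,
    action integer NOT NULL, type integer NOT NULL,
    timestamp integer NOT NULL)"""

INSERT = ("INSERT INTO lists (localid, remoteid, action, type, timestamp) "
          "VALUES (?, ?, ?, ?, ?)")

# A NULL column matches anything; otherwise the IDs must be equal.
MATCH = """SELECT 1 FROM lists WHERE action = ?
    AND (localid IS NULL OR localid = ?)
    AND (remoteid IS NULL OR remoteid = ?) LIMIT 1"""


def has_match(db, action, localid, remoteid):
    return db.execute(MATCH, (action, localid, remoteid)).fetchone() is not None


def allowed(db, localid, remoteid, outgoing):
    """Apply the four matching steps; return True if the message may pass."""
    if has_match(db, ACCEPT, localid, remoteid):
        return True
    if outgoing and has_match(db, AWL, localid, remoteid):
        # Remember the pair so that replies are accepted later.
        db.execute(INSERT, (localid, remoteid, ACCEPT, AUTO, int(time.time())))
        return True
    if has_match(db, BLOCK, localid, remoteid):
        return False
    return True


# The three example rows from the text, in an in-memory database.
db = sqlite3.connect(":memory:")
db.execute(SCHEMA)
example_rows = [("example@company.com", None, ACCEPT), (None, None, AWL), (None, None, BLOCK)]
db.executemany(INSERT, [(l, r, a, MANUAL, int(time.time())) for l, r, a in example_rows])
```

With these rows, an incoming message to a local user other than example@company.com is blocked until that user has first sent a message to the same remote ID, at which point an ACCEPT/AUTO row is created and replies pass.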

Note 1: because of the way SQLite works, you can freely modify the table from outside of IMSpector, while it is still running, without any problems.

Note 2: It is unclear what happens when this filter is combined with the file-based ACL filter.

Other blocking

Two additional global options are available. They are applied regardless of the outcome of ACL processing:

block_files=on

This option will block all file transfers on every protocol that IMSpector is watching and for which it understands file transfers.

block_webcams=on

This option will block all webcam sessions. Currently IMSpector can only spot webcam sessions on Yahoo.

Socket-API for filtering

IMSpector is able to talk to an external process by way of a UNIX socket, in order to determine the fate of a message. This enables integrators to implement their filtering routines in any language that is able to listen on a socket. The hypothetical daemon that listens for these connections is called “censord”.

Presently only message events are handled by this plugin; all other event types are always allowed.

censord=on

This is the only option needed in IMSpector. The censord filtering plugin will connect to the UNIX socket at /tmp/.censord.sock and send the following information. All lines end with CR+LF.

imspector-{incoming|outgoing}
protocol {im protocol}
localid {local id}
remoteid {remote id}
charset UTF-8
length {count of message bytes}

{message bytes}

Currently IMSpector knows nothing about character sets, but the charset header is sent for future use.

The censoring daemon should respond with the following. Again CR+LF is the line ending.

{response}
result {category info}
length {count of message bytes}

{optional replacement bytes}

Response is one of the following:

  • BLCK – IMSpector should drop the message completely.
  • PASS – IMSpector should pass the message as is.
  • ERR! – Censoring server had an error.
  • MDFY – Censord has replacement text.

In the case of MDFY, the length returned must currently match the source length. The replacement bytes are the text that will be passed on in place of the original message.

In all cases, if a result header is present, this will be appended to the category field in the log entry. You could, for example, use this field to classify the type of profanity in the message.

Example:

IMSpector sends the following request.

imspector-incoming
protocol MSN
localid local@local.com
remoteid remote@remote.com
charset UTF-8
length 11

Mmmm pizza!

Censord responds with the following.

MDFY
result food
length 11

Mmmm *****!

This socket-API opens up the ability to implement blocking and censoring policies in any language that can manipulate a UNIX socket. In the future, it would be nice to include a simple censoring service, probably written in Perl, into the IMSpector project.
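As an illustration of what such a service might look like, here is a minimal censord sketch in Python. It is not part of IMSpector and rests on several assumptions that the description above leaves open: that exactly "length" bytes of message follow the blank line, that several requests may arrive on one connection, and that a PASS reply may carry "length 0" with no result header. The word list, the "food" category and all function names are hypothetical.

```python
import os
import socketserver

SOCKET_PATH = "/tmp/.censord.sock"  # socket path given in the text above
BAD_WORDS = {b"pizza"}              # hypothetical word list for this sketch


def read_request(rfile):
    """Read one request from a binary stream; return None at end of stream."""
    line = rfile.readline()
    while line in (b"\r\n", b"\n"):  # tolerate stray blank lines between requests
        line = rfile.readline()
    if not line:
        return None
    request = {"direction": line.strip().decode("ascii", "replace")}
    while True:
        line = rfile.readline()
        if line in (b"\r\n", b"\n", b""):  # a blank line ends the headers
            break
        name, _, value = line.strip().partition(b" ")
        request[name.decode("ascii", "replace")] = value.decode("utf-8", "replace")
    # Assumption: exactly "length" bytes of message follow the blank line.
    request["message"] = rfile.read(int(request.get("length", 0)))
    return request


def build_response(request):
    """Return MDFY with same-length replacement text, or PASS if nothing matched."""
    censored = request["message"]
    for word in BAD_WORDS:
        censored = censored.replace(word, b"*" * len(word))
    if censored == request["message"]:
        return b"PASS\r\nlength 0\r\n\r\n"  # assumption: no body is sent on PASS
    head = "MDFY\r\nresult food\r\nlength %d\r\n\r\n" % len(censored)
    return head.encode("ascii") + censored


class CensordHandler(socketserver.StreamRequestHandler):
    def handle(self):
        while True:
            request = read_request(self.rfile)
            if request is None:
                break
            self.wfile.write(build_response(request))


def serve():
    """Listen on the UNIX socket until interrupted."""
    if os.path.exists(SOCKET_PATH):
        os.unlink(SOCKET_PATH)  # remove a stale socket from a previous run
    with socketserver.UnixStreamServer(SOCKET_PATH, CensordHandler) as server:
        server.serve_forever()
```

Calling serve() starts the listener; with censord=on set, IMSpector would then connect to it. Because each bad word is replaced by the same number of asterisks, the MDFY reply keeps the source length, as the protocol currently requires.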

  1. cpm
    April 22nd, 2009 at 11:18 | #1

    I think the information above is not enough to implement the Socket-API for filtering. Could anyone please explain in detail how to implement a socket-API with IMSpector in any programming language?

  2. leandronf
    January 30th, 2010 at 18:53 | #2

    I saw that there are several options for blocking but have not found the one I need.

    I wonder if there is the possibility of allowing only certain logins (user @ domain) to connect to msn server, and block the rest.

    Thank you for your attention!

  3. February 15th, 2010 at 15:36 | #3

    Blocking is event based, and in IMSpector’s eyes login is not an “event”. It could be, however. I’ll look into how feasible this is. Thanks for the comment!
