Writing Rules

The rules configuration defines the checks that are run on each message. Each rule has a unique identifier, written in all caps, and can have multiple options.

After all rules are checked, a final score is computed for the message, and the required_score option determines whether that score marks the message as spam.

Defining a rule

Every rule must be in the following format:

<rule type>     <rule identifier>   <value>

Simple rule definition example:

body        LOOK_FOR_TEST   /test/

Here body is the rule type, LOOK_FOR_TEST is the identifier, and /test/ is the value. This rule looks for the string “test” in the body of the message and is triggered if the string is found. When a rule is triggered, its score is added to the total score of the message.
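The three-field format can be illustrated with a short Python sketch. This is not the project's parser; parse_rule_line is a hypothetical helper that only shows how a definition line splits into its parts.

```python
def parse_rule_line(line):
    """Split '<rule type> <rule identifier> <value>' into its three parts."""
    # maxsplit=2 keeps any whitespace inside the value intact,
    # e.g. 'Subject =~ /test/' for a header rule.
    parts = line.strip().split(None, 2)
    if len(parts) != 3:
        raise ValueError("expected '<rule type> <identifier> <value>': %r" % line)
    rule_type, identifier, value = parts
    return rule_type, identifier, value


print(parse_rule_line("body        LOOK_FOR_TEST   /test/"))
# prints ('body', 'LOOK_FOR_TEST', '/test/')
```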

For every message, all defined rules are checked and their scores applied, with the following exceptions:

  • Any rule that has an identifier starting with __ will not be checked.
  • Any rule that has a score of 0 will not be checked.

Note

Rules that are not checked can still be used in combination with other rules. See the meta rule type for more details.
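The two exceptions can be sketched in Python as follows. The names is_checked and total_score are hypothetical and only mirror the behaviour described above: rules whose identifier starts with __ and rules with a score of 0 do not contribute to the total.

```python
DEFAULT_SCORE = 1.0  # default score of a rule, as stated in this document


def is_checked(identifier, score=DEFAULT_SCORE):
    """Return True if the rule is checked directly and its score applied."""
    if identifier.startswith("__"):
        return False  # only usable in combination with other rules
    if score == 0:
        return False  # a score of 0 disables the rule
    return True


def total_score(triggered, scores):
    """Sum the scores of the triggered rules that are actually checked."""
    total = 0.0
    for identifier in triggered:
        score = scores.get(identifier, DEFAULT_SCORE)
        if is_checked(identifier, score):
            total += score
    return total
```

For example, total_score(["LOOK_FOR_TEST", "__HELPER"], {"LOOK_FOR_TEST": 1.5}) counts only LOOK_FOR_TEST.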

Rule options

Additional options can be configured for any rule in the following format:

<option name> <rule identifier> <value>

The parser uses the unique identifier to apply the option to the rule with the same name. The option doesn’t have to come immediately after the rule definition (i.e. on the next line), but it has to appear somewhere after the initial rule definition.

Note

Defining the same rule or option twice will override the previous value.
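A minimal Python sketch of this behaviour, assuming a simplified parser that only handles three-field rule and option lines (not report or lang lines). RULE_TYPES and collect_rules are hypothetical names, and the set of rule types is only the subset mentioned in this document.

```python
# Hypothetical subset of rule types; the real parser may know more.
RULE_TYPES = {"body", "header", "meta"}


def collect_rules(lines):
    """Collect rules and their options; later lines override earlier ones."""
    rules = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        keyword, identifier, value = line.split(None, 2)
        if keyword in RULE_TYPES:
            # Redefining a rule replaces its previous type and value.
            rules.setdefault(identifier, {}).update(type=keyword, value=value)
        elif identifier in rules:
            # Setting the same option twice keeps only the last value.
            rules[identifier][keyword] = value
        else:
            raise ValueError("option %r used before rule %r is defined"
                             % (keyword, identifier))
    return rules
```

With the lines "body LOOK_FOR_TEST /test/", "score LOOK_FOR_TEST 1.5" and "score LOOK_FOR_TEST 2", the stored score for LOOK_FOR_TEST is the string "2".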

Scoring option

Every rule has a default score of 1.0. This can be adjusted by using the score option:

  • A positive score means that the message is more likely to be spam
  • A negative score means that the message is more likely to be legitimate
  • A score of 0 disables the rule

Examples:

body    LOOK_FOR_TEST /test/
score   LOOK_FOR_TEST 1.5

header  LOOK_FOR_SUBJECT_TEST Subject =~ /test/
score   LOOK_FOR_SUBJECT_TEST -5

More advanced scoring can be specified for any rule, depending on whether the Bayesian classifier and the network tests are activated. For example:

body    LOOK_FOR_TEST /test/
score   LOOK_FOR_TEST 1 1.5 0.5 3

For the advanced scoring the following final score will be used:

  • The first score if the Bayesian classifier and the network tests are both disabled (in this case 1)
  • The second score if the Bayesian classifier is disabled but the network tests are enabled (in this case 1.5)
  • The third score if the Bayesian classifier is enabled but the network tests are disabled (in this case 0.5)
  • The fourth score if the Bayesian classifier and the network tests are both enabled (in this case 3)
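The selection among the four values can be written as a small lookup. The following Python sketch uses a hypothetical effective_score helper with hypothetical use_bayes and use_network parameters; it only mirrors the list above and is not the project's implementation.

```python
def effective_score(values, use_bayes, use_network):
    """Pick the score that applies for the given configuration.

    values is a list with either one score or the four advanced scores.
    """
    if len(values) == 1:
        return values[0]  # simple scoring: one value for every case
    if len(values) != 4:
        raise ValueError("expected 1 or 4 scores, got %d" % len(values))
    # Index 0: neither enabled, 1: network only, 2: Bayes only, 3: both.
    index = (2 if use_bayes else 0) + (1 if use_network else 0)
    return values[index]


scores = [float(v) for v in "1 1.5 0.5 3".split()]
print(effective_score(scores, use_bayes=True, use_network=False))
# prints 0.5
```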

Note

This configuration is optional and any rule that doesn’t have it will get the default score of 1.0.

Describe option

The describe option can be used to provide a short text that describes what the rule does. This text is useful when debugging and when generating reports.

Example configuration:

report ==== Start report ====
report _REPORT_

body        LOOK_FOR_TEST /test/
describe    LOOK_FOR_TEST Look for the test string in the body.

And the result for a message that matches:

$ ./scripts/match.py -t -C /root/myconf/ --sitepath /root/myconf/ < /root/test.eml
Subject: Do you think this is Spam?

This is a test.


==== Start report ====

* 1.0 LOOK_FOR_TEST BODY: Look for the test string in the body.

For more details on the report see the report section of the documentation.

Note

This configuration is optional and any rule that doesn’t have it will get “No description available”.

Priority option

This option can be used to have some rules evaluated before others. By default, rules are checked in the order they are defined in the config file, and their priority value is 0. A negative priority moves the evaluation of the rule to the end. The priority value must be an integer.

Example configuration:

body    TEST_RULE1  /test/
body    TEST_RULE2  /test/
body    TEST_RULE3  /test/
priority TEST_RULE2 5
priority TEST_RULE1 -1

They will be evaluated in the following order:

TEST_RULE2
TEST_RULE3
TEST_RULE1

Note

This configuration is optional and any rule that doesn’t have it will get the priority 0.
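The ordering in the example above is consistent with a stable sort by descending priority. The Python sketch below assumes exactly that; evaluation_order is a hypothetical helper, and the assumption that rules with equal priority keep their definition order is taken from the example rather than from the implementation.

```python
def evaluation_order(defined, priorities):
    """Order rule identifiers by descending priority.

    defined lists the identifiers in definition order; priorities maps
    identifiers to integers and defaults to 0 for rules without the option.
    """
    # sorted() is stable, so rules with the same priority keep the order
    # in which they were defined.
    return sorted(defined, key=lambda name: -priorities.get(name, 0))


order = evaluation_order(
    ["TEST_RULE1", "TEST_RULE2", "TEST_RULE3"],
    {"TEST_RULE2": 5, "TEST_RULE1": -1},
)
print(order)
# prints ['TEST_RULE2', 'TEST_RULE3', 'TEST_RULE1']
```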

Lang option (Locali[sz]ation)

The lang option can be used to provide text in a specific language. A line starting with the text lang xx will only be interpreted if the user is in that locale, allowing test descriptions and templates to be set for that language.

The lang prefix enables localized translations for rule descriptions and reports. The locale string should specify either both the language and country, e.g. lang pt_BR, or just the language, e.g. lang de.

lang nl describe <RULE IDENTIFIER> <translated text>
lang nl report <translated text>
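How a locale string is compared with the user's locale is not spelled out here. One plausible reading, sketched in Python below, is that a language-and-country string must match both parts, while a language-only string matches any country. The lang_applies helper and the handling of suffixes such as .UTF-8 are assumptions for illustration only.

```python
def lang_applies(lang, user_locale):
    """Return True if a 'lang <lang>' line applies to user_locale.

    lang is e.g. 'pt_BR' or 'de'; user_locale is e.g. 'pt_BR.UTF-8'.
    """
    # Drop encoding and modifier suffixes such as '.UTF-8' or '@euro'.
    base = user_locale.split(".")[0].split("@")[0]
    if "_" in lang:
        # Language and country given: both must match.
        return lang == base
    # Only the language given: compare with the language part.
    return lang == base.split("_")[0]


print(lang_applies("de", "de_DE.UTF-8"))
# prints True
```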

Example configuration:

report ==== Start report ====
report _REPORT_

body        LOOK_FOR_TEST /test/
describe    LOOK_FOR_TEST Look for the test string in the body.
lang en describe    LOOK_FOR_TEST Description in en.
lang en report      Look for the test string in the body.

And the result for a message that matches:

$ ./scripts/match.py -t -C /root/myconf/ --sitepath /root/myconf/ < /root/test.eml
Subject: Do you think this is Spam?

This is a test.


==== Start report ====

Look for the test string in the body.
* 1.0 LOOK_FOR_TEST BODY: Description in en.

For more details on the report see the report section of the documentation.

Note

For a line of the form lang nl describe <RULE IDENTIFIER> <translated text>, if the language given as the second parameter matches the user’s locale, the description for RULE IDENTIFIER is overridden.

Tflags option

Used to set flags on a test. These flags are used in the score-determination back end system for details of the test’s behaviour.

tflags <TEST_NAME> <net|nice|learn|userconf|noautolearn>

net
    The test is a network test and will not be run in the mass checking system or if -L is used; its score should therefore not be modified.
nice
    The test is intended to compensate for common false positives, and should be assigned a negative score.
userconf
    The test requires user configuration before it can be used.
learn
    The test requires training before it can be used.
noautolearn
    The test will explicitly be ignored when calculating the score for learning systems.

Example configuration:

report ==== Start report ====
report _REPORT_

body        LOOK_FOR_TEST /test/
tflags      LOOK_FOR_TEST nice

And the result for a message that matches:

$ ./scripts/match.py -t -C /root/myconf/ --sitepath /root/myconf/ < /root/test.eml
Subject: Do you think this is Spam?

This is a test.


==== Start report ====

* -1.0 LOOK_FOR_TEST BODY
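The output above shows a score of -1.0 for a rule flagged nice that has no score option. A minimal Python sketch of that default follows, assuming the sign flip is the only effect on the default score. KNOWN_FLAGS, parse_tflags, and default_score are hypothetical names, and accepting several whitespace-separated flags is also an assumption.

```python
KNOWN_FLAGS = {"net", "nice", "learn", "userconf", "noautolearn"}


def parse_tflags(value):
    """Parse the value of a tflags line into a set of flags."""
    flags = set(value.split())
    unknown = flags - KNOWN_FLAGS
    if unknown:
        raise ValueError("unknown tflags: %s" % ", ".join(sorted(unknown)))
    return flags


def default_score(flags):
    """Default score of a rule without a score option (assumed behaviour)."""
    return -1.0 if "nice" in flags else 1.0


print(default_score(parse_tflags("nice")))
# prints -1.0
```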

Note

This configuration is optional and any rule that doesn’t have it will get the default value False.