Writing Rules
The rules configuration defines the checks that are run against each message. Each rule has a unique identifier, written in all caps, and can have multiple options.
After all rules are checked, a final score is computed for the message and compared against the required_score option to decide whether the message is marked as spam.
Defining a rule
Every rule must be in the following format:
<rule type> <rule identifier> <value>
Simple rule definition example:
body LOOK_FOR_TEST /test/
Here body is the rule type, LOOK_FOR_TEST is the identifier, and /test/ is the value. This rule looks for the string “test” in the body of the message and is triggered if it is found. When a rule is triggered, its score is added to the total score of the message.
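The three-part format above can be illustrated with a small Python sketch. This is not the project's actual parser; the names RuleLine and parse_rule_line are hypothetical and only show how a line might be split into type, identifier, and value.

```python
from typing import NamedTuple


class RuleLine(NamedTuple):
    """One parsed rule definition (hypothetical helper type)."""
    rule_type: str
    identifier: str
    value: str


def parse_rule_line(line: str) -> RuleLine:
    """Split '<rule type> <rule identifier> <value>' into its three parts."""
    # maxsplit=2 keeps any whitespace inside the value intact.
    parts = line.strip().split(None, 2)
    if len(parts) != 3:
        raise ValueError(f"expected '<type> <identifier> <value>': {line!r}")
    return RuleLine(*parts)


rule = parse_rule_line("body LOOK_FOR_TEST /test/")
print(rule.rule_type, rule.identifier, rule.value)
```

Splitting at most twice means a value that itself contains spaces, such as a header expression, stays in one piece.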
For every message, all defined rules are checked and their scores applied, with the following exceptions:
- Any rule that has an identifier starting with __ will not be checked.
- Any rule that has a score of 0 will not be checked.
Note
Rules that are not checked can still be used in combination with other rules. See the meta rule type for more details.
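As a rough illustration of the two exceptions, the following Python sketch sums the scores of triggered rules while skipping identifiers that start with __ and rules with a score of 0. The function names, the threshold value, and the use of >= for the required_score comparison are assumptions made for this example, not the project's implementation.

```python
from typing import Dict


def is_checked(identifier: str, score: float) -> bool:
    """Return True if the rule is checked on its own for a message."""
    return not identifier.startswith("__") and score != 0


def total_score(triggered: Dict[str, float]) -> float:
    """Sum the scores of triggered rules that are actually checked."""
    return sum(s for name, s in triggered.items() if is_checked(name, s))


# Hypothetical set of rules that matched a message, with their scores.
triggered = {"LOOK_FOR_TEST": 1.5, "__HELPER": 1.0, "DISABLED_RULE": 0.0}
score = total_score(triggered)

required_score = 5.0  # hypothetical threshold value
is_spam = score >= required_score  # assumption: the comparison is >=
print(score, is_spam)
```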
Rule options
Additional options can be configured for any rule in the following format:
<option name> <rule identifier> <value>
The parser uses the unique identifier to apply the option to the rule with the same name. The option doesn’t have to come immediately after the rule definition (i.e. on the next line), but it must appear somewhere after the initial rule definition.
Note
Defining the same rule or option twice will override the previous value.
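The override behaviour can be pictured as a dictionary keyed by rule identifier, where a later line replaces an earlier value. The sketch below is a simplified model under that assumption; collect_options is a hypothetical name, and the check that an option appears after its rule definition is left out.

```python
from collections import defaultdict
from typing import Dict, Iterable


def collect_options(lines: Iterable[str]) -> Dict[str, Dict[str, str]]:
    """Map each rule identifier to its options; later lines override earlier ones."""
    options: Dict[str, Dict[str, str]] = defaultdict(dict)
    for line in lines:
        # '<option name> <rule identifier> <value>'
        name, identifier, value = line.split(None, 2)
        options[identifier][name] = value
    return dict(options)


config = [
    "score LOOK_FOR_TEST 1.5",
    "describe LOOK_FOR_TEST Look for the test string in the body.",
    "score LOOK_FOR_TEST 2.0",  # overrides the earlier score line
]
print(collect_options(config))
```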
Scoring option
By default, every rule has a score of 1.0. This can be adjusted with the score option:
- A positive score means that the message is more likely to be spam
- A negative score means that the message is more likely to be legitimate
- A score of 0 disables the rule
Examples:
body LOOK_FOR_TEST /test/
score LOOK_FOR_TEST 1.5
header LOOK_FOR_SUBJECT_TEST Subject =~ /test/
score LOOK_FOR_SUBJECT_TEST -5
More advanced scoring can be specified for any rule, depending on whether the Bayesian classifier and the network tests are enabled. For example:
body LOOK_FOR_TEST /test/
score LOOK_FOR_TEST 1 1.5 0.5 3
With advanced scoring, the final score is chosen as follows:
- The first score if the Bayesian classifier and the network tests are both disabled (in this case 1)
- The second score if the Bayesian classifier is disabled but the network tests are enabled (in this case 1.5)
- The third score if the Bayesian classifier is enabled but the network tests are disabled (in this case 0.5)
- The fourth score if the Bayesian classifier and the network tests are both enabled (in this case 3)
Note
This configuration is optional and any rule that doesn’t have it will get the default score of 1.0.
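The four-value selection described above amounts to indexing the score list by the two settings. The following Python sketch shows one way to express it; select_score is a hypothetical helper, and treating a single value as applying in all four cases is an assumption.

```python
from typing import Sequence


def select_score(scores: Sequence[float], use_bayes: bool, use_network: bool) -> float:
    """Pick the effective score from a one- or four-value score option."""
    if len(scores) == 1:
        # Assumption: a single value applies regardless of the two settings.
        return scores[0]
    if len(scores) != 4:
        raise ValueError("expected one or four score values")
    # Index order: 0 = neither, 1 = network only, 2 = Bayes only, 3 = both.
    return scores[2 * int(use_bayes) + int(use_network)]


# Values taken from 'score LOOK_FOR_TEST 1 1.5 0.5 3'.
scores = [float(v) for v in "1 1.5 0.5 3".split()]
for bayes in (False, True):
    for network in (False, True):
        print(bayes, network, select_score(scores, bayes, network))
```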
Describe option
The describe option can be used to provide a short text that describes what the rule does. This text is useful when debugging and when generating reports.
Example configuration:
report ==== Start report ====
report _REPORT_
body LOOK_FOR_TEST /test/
describe LOOK_FOR_TEST Look for the test string in the body.
And the result for a message that matches:
$ ./scripts/match.py -t -C /root/myconf/ --sitepath /root/myconf/ < /root/test.eml
Subject: Do you think this is Spam?
This is a test.
==== Start report ====
* 1.0 LOOK_FOR_TEST BODY: Look for the test string in the body.
For more details on the report see the report section of the documentation.
Note
This configuration is optional and any rule that doesn’t have it will get “No description available”.
Priority option
This option can be used to have some rules evaluated before others. By default, rules are checked in the order in which they are defined in the config file, and their priority value is 0. Rules with a higher priority are evaluated first; a negative priority moves the evaluation to the end. The priority value must be an integer.
Example configuration:
body TEST_RULE1 /test/
body TEST_RULE2 /test/
body TEST_RULE3 /test/
priority TEST_RULE2 5
priority TEST_RULE1 -1
They will be evaluated in the following order:
TEST_RULE2
TEST_RULE3
TEST_RULE1
Note
This configuration is optional and any rule that doesn’t have it will get the priority 0.
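The ordering in the example is consistent with sorting by descending priority while keeping definition order for rules of equal priority. The sketch below reproduces that order under this assumption; evaluation_order is a hypothetical helper, not the project's code.

```python
from typing import Dict, List


def evaluation_order(rules: List[str], priorities: Dict[str, int]) -> List[str]:
    """Order rule identifiers by descending priority (default priority is 0)."""
    # sorted() is stable, so rules with equal priority keep their definition order.
    return sorted(rules, key=lambda name: -priorities.get(name, 0))


rules = ["TEST_RULE1", "TEST_RULE2", "TEST_RULE3"]  # definition order
priorities = {"TEST_RULE2": 5, "TEST_RULE1": -1}
print(evaluation_order(rules, priorities))
# ['TEST_RULE2', 'TEST_RULE3', 'TEST_RULE1']
```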
Lang option (Locali[sz]ation)
The lang option can be used to provide text in a specific language. A line starting with lang xx is only interpreted if the user is in that locale, which allows test descriptions and templates to be set for that language.
The locale string should specify either both the language and country, e.g. lang pt_BR, or just the language, e.g. lang de.
The option is placed in front of a describe or report line to provide a localized translation for rule descriptions and reports:
lang nl describe <RULE IDENTIFIER> <translated text>
lang nl report <translated text>
Example configuration:
report ==== Start report ====
report _REPORT_
body LOOK_FOR_TEST /test/
describe LOOK_FOR_TEST Look for the test string in the body.
lang en describe LOOK_FOR_TEST Description in en.
lang en report Look for the test string in the body.
And the result for a message that matches:
$ ./scripts/match.py -t -C /root/myconf/ --sitepath /root/myconf/ < /root/test.eml
Subject: Do you think this is Spam?
This is a test.
==== Start report ====
Look for the test string in the body.
* 1.0 LOOK_FOR_TEST BODY: Description in en.
For more details on the report see the report section of the documentation.
Note
For a line such as lang nl describe <RULE IDENTIFIER> <translated text>, if the language given as the second parameter matches the user's locale, the description for RULE IDENTIFIER is overwritten.
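One plausible reading of the locale matching is sketched below in Python: a lang value applies if it equals the full locale (language and country) or just the language part. The function lang_applies and the handling of an encoding suffix such as .UTF-8 are assumptions made for this example; the actual matching may differ.

```python
def lang_applies(lang: str, user_locale: str) -> bool:
    """Return True if a 'lang <lang>' line applies to the given locale string."""
    # Drop any encoding suffix such as '.UTF-8' before comparing.
    base = user_locale.split(".", 1)[0]
    # The language part is everything before the first underscore.
    language = base.split("_", 1)[0]
    return lang in (base, language)


print(lang_applies("pt_BR", "pt_BR.UTF-8"))  # language and country match
print(lang_applies("de", "de_DE"))           # language-only match
print(lang_applies("nl", "en_US"))           # different language
```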
Tflags option
The tflags option sets flags on a test. The score-determination back end uses these flags for details of the test’s behaviour.
tflags <TEST_NAME> <net|nice|learn|userconf|noautolearn>
- net: The test is a network test and will not be run in the mass-checking system or if -L is used; therefore its score should not be modified.
- nice: The test is intended to compensate for common false positives and should be assigned a negative score.
- userconf: The test requires user configuration before it can be used.
- learn: The test requires training before it can be used.
- noautolearn: The test will explicitly be ignored when calculating the score for learning systems.
Example configuration:
report ==== Start report ====
report _REPORT_
body LOOK_FOR_TEST /test/
tflags LOOK_FOR_TEST nice
And the result for a message that matches:
$ ./scripts/match.py -t -C /root/myconf/ --sitepath /root/myconf/ < /root/test.eml
Subject: Do you think this is Spam?
This is a test.
==== Start report ====
* -1.0 LOOK_FOR_TEST BODY
Note
This configuration is optional and any rule that doesn’t have it will get the default value False.
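Two of the flags can be illustrated with a short Python sketch. In the example output above, a rule flagged nice with no explicit score is reported with -1.0, which suggests the default score is negated for such rules; the list of flags also states that net tests are not run when -L is used. The helpers default_score and should_run are hypothetical and only model these two observations.

```python
from typing import Set


def default_score(tflags: Set[str]) -> float:
    """Default score for a rule without an explicit score option.

    Assumption inferred from the example output: 'nice' rules default to -1.0.
    """
    return -1.0 if "nice" in tflags else 1.0


def should_run(tflags: Set[str], local_only: bool) -> bool:
    """Network tests are skipped when only local tests are requested (-L)."""
    return not ("net" in tflags and local_only)


print(default_score({"nice"}))             # -1.0
print(default_score(set()))                # 1.0
print(should_run({"net"}, local_only=True))  # False
```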