MailEnable Enterprise Guide
    Bayesian filter general settings

    How to access Bayesian Filtering properties

    1. In the administration console, navigate to: MailEnable Management > Servers > Localhost > Extensions > MailEnable Message Filter
    2. Click MailEnable Message Filter to list the available filtering extensions in the right-hand pane.
    3. Double-click MailEnable Bayesian Filter.




    MailEnable dictionaries are located under Program Files\Mail Enable\Dictionaries. MailEnable provides a default dictionary that can be used with the filter. This dictionary is located in Program Files\Dictionary\default and is called MAILENABLE.TAB. For more details, see the MailEnable Default Dictionary section.

    Options (Process HTML content in Messages)

    If this option is selected and the message contains HTML, the HTML part is parsed in addition to the plain text (text/plain) part of the message, so the tokens also include data from the HTML content. This makes the filter more likely to detect HTML spam, because the tokens and patterns found in the HTML of bad messages contribute to the calculated spam probability.
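
    As an illustration of what processing HTML content can mean for tokenization, the following is a minimal Python sketch. It is not MailEnable's implementation: the function name tokenize, the process_html flag, and the token pattern are assumptions made for this example. It uses only the Python standard library to show how enabling HTML processing adds tag names, attribute values, and HTML text to the token list.

```python
import re
from email import message_from_string
from html.parser import HTMLParser

# Hypothetical token pattern: runs of letters, digits, $, ' and -.
TOKEN_RE = re.compile(r"[a-z0-9$'-]+")


class _HtmlCollector(HTMLParser):
    """Collects tag names, attribute values, and text from an HTML part."""

    def __init__(self):
        super().__init__()
        self.pieces = []

    def handle_starttag(self, tag, attrs):
        self.pieces.append(tag)
        self.pieces.extend(value for _, value in attrs if value)

    def handle_data(self, data):
        self.pieces.append(data)


def tokenize(raw_message, process_html=False):
    """Return the tokens of a message; HTML parts are used only on request."""
    tokens = []
    for part in message_from_string(raw_message).walk():
        if part.is_multipart():
            continue  # container parts carry no text of their own
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        text = payload.decode(charset, errors="replace")
        content_type = part.get_content_type()
        if content_type == "text/plain":
            tokens += TOKEN_RE.findall(text.lower())
        elif content_type == "text/html" and process_html:
            collector = _HtmlCollector()
            collector.feed(text)
            tokens += TOKEN_RE.findall(" ".join(collector.pieces).lower())
    return tokens
```

    With process_html left at False, only the text/plain part contributes tokens. With it set to True, link targets and markup from the text/html part are tokenized too, which is the extra evidence the option description refers to.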

    Spam Calculation method

    When a message is split into tokens (words) for analysis, each token is assigned a probability of indicating spam or non-spam.

    MailEnable can be configured to use one of the following methods to combine these token probabilities into the final probability of the message being spam:

    Measure highest and lowest percentiles of the most frequent tokens - Only the tokens that occur most frequently in the message are aggregated to measure the probability of the message being spam. With this option, a message containing multiple instances of a spam token will most likely be classified as spam.

    Measure all tokens in the message - All tokens occurring in the message are aggregated to calculate the probability of the message being spam. This is the recommended method because it provides a more balanced calculation.

    Measure tokens within the highest and lowest percentiles - Only the tokens in the message that most strongly indicate spam or non-spam are considered. With this option, a legitimate message containing the word 'viagra' is more likely to be detected as spam.
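
    The difference between the three methods can be sketched in Python. This is an illustrative sketch only: this topic does not give MailEnable's exact formulas, so the naive Bayes combination in combine, the neutral 0.5 probability for unknown tokens, the cut-off n, and all function names are assumptions. Each function differs only in which tokens it selects before combining their probabilities.

```python
import math
from collections import Counter

UNKNOWN = 0.5  # assumed neutral probability for tokens not in the dictionary


def combine(probs):
    """Naive Bayes combination of per-token spam probabilities (assumed)."""
    if not probs:
        return UNKNOWN
    clamped = [min(max(p, 0.01), 0.99) for p in probs]
    log_spam = sum(math.log(p) for p in clamped)
    log_ham = sum(math.log(1.0 - p) for p in clamped)
    diff = log_ham - log_spam
    if diff > 700:   # avoid overflow in math.exp
        return 0.0
    if diff < -700:
        return 1.0
    return 1.0 / (1.0 + math.exp(diff))


def measure_all(tokens, token_probs):
    """Use every distinct token in the message."""
    distinct = list(dict.fromkeys(tokens))
    return combine([token_probs.get(t, UNKNOWN) for t in distinct])


def measure_most_frequent(tokens, token_probs, n=1):
    """Use only the n tokens that occur most often in the message."""
    frequent = [t for t, _ in Counter(tokens).most_common(n)]
    return combine([token_probs.get(t, UNKNOWN) for t in frequent])


def measure_extremes(tokens, token_probs, n=1):
    """Use only the n tokens whose probability is furthest from 0.5."""
    distinct = list(dict.fromkeys(tokens))
    probs = [token_probs.get(t, UNKNOWN) for t in distinct]
    probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
    return combine(probs[:n])


# Hypothetical dictionary entries, for illustration only.
token_probs = {"viagra": 0.99, "meeting": 0.10, "agenda": 0.15,
               "tomorrow": 0.20, "the": 0.50}

legit = ["the", "meeting", "agenda", "tomorrow", "viagra", "the"]
repeated = ["viagra", "viagra", "viagra", "meeting", "agenda"]

print(measure_all(legit, token_probs))
print(measure_extremes(legit, token_probs))
print(measure_most_frequent(repeated, token_probs))
```

    With these made-up probabilities, measure_all scores the legitimate message below 0.5 because the ordinary words offset the single spam token, while measure_extremes looks only at 'viagra' and scores the same message as spam. For the message that repeats the spam token, measure_most_frequent scores it higher than measure_all does.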