MailEnable Enterprise Guide
    Setting up auto-training Bayesian filtering

    Bayesian filtering relies on two pools of messages (good and bad) and a word dictionary built from them that records how often each token (a word or text snippet) appears in each pool. MailEnable compares the tokens in a new message against this dictionary to estimate the probability that the message is spam. For example, if the token “FREE” occurs mostly in spam emails and rarely in good emails, a new message containing “FREE” is more likely to be spam. Accuracy improves as more tokens are considered. If an incoming email contains “FREE” but also the token “mailenable”, which may appear only in good emails, the good token offsets the spam token and can stop the email from being marked as spam.
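
    The “FREE” and “mailenable” example can be illustrated with a minimal sketch of naive Bayes token scoring. This is not MailEnable's implementation. The dictionary contents, the pool sizes, the add-one smoothing, the whitespace tokenizer, and the function names are all assumptions chosen for illustration.

```python
import math

# Hypothetical dictionary: token -> (spam messages containing it, good messages containing it).
# The counts and pool sizes are made-up illustration values, not MailEnable data.
DICTIONARY = {
    "free": (80, 5),
    "mailenable": (0, 60),
}
SPAM_MESSAGES = 100
GOOD_MESSAGES = 100


def token_spam_probability(token):
    """Estimate P(spam | token) with add-one smoothing; unknown tokens score 0.5."""
    spam_count, good_count = DICTIONARY.get(token, (0, 0))
    spam_rate = (spam_count + 1) / (SPAM_MESSAGES + 2)
    good_rate = (good_count + 1) / (GOOD_MESSAGES + 2)
    return spam_rate / (spam_rate + good_rate)


def message_spam_probability(text):
    """Combine per-token probabilities in log-odds space (naive independence assumption)."""
    tokens = set(text.lower().split())
    log_odds = 0.0
    for token in tokens:
        p = token_spam_probability(token)
        log_odds += math.log(p) - math.log(1.0 - p)
    return 1.0 / (1.0 + math.exp(-log_odds))


spam_like = message_spam_probability("FREE offer")
mixed = message_spam_probability("FREE mailenable upgrade")
```

    With these made-up counts, the message containing only “FREE” scores well above 0.5, while the message that also contains “mailenable” scores below 0.5, because the good token outweighs the spam token.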

    The effectiveness of this approach depends on the quality of the spam and non-spam samples. The process of compiling a dictionary from samples of spam and non-spam is called ‘training’.
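
    As a rough illustration of what training involves, the sketch below compiles a token dictionary from two small pools of sample messages. It assumes a simple whitespace tokenizer and counts each token once per message. The function names, the sample messages, and the dictionary layout are hypothetical and do not reflect MailEnable's dictionary format.

```python
from collections import Counter


def tokenize(message):
    """Split a message into lowercase tokens (a simplistic whitespace tokenizer)."""
    return message.lower().split()


def train(spam_samples, good_samples):
    """Map each token to the number of spam and good messages that contain it."""
    spam_counts = Counter()
    good_counts = Counter()
    for message in spam_samples:
        # set() ensures a token is counted once per message, not once per occurrence.
        spam_counts.update(set(tokenize(message)))
    for message in good_samples:
        good_counts.update(set(tokenize(message)))
    tokens = set(spam_counts) | set(good_counts)
    return {t: {"spam": spam_counts[t], "good": good_counts[t]} for t in tokens}


# Made-up sample pools for illustration only.
dictionary = train(
    spam_samples=["FREE prize inside", "Claim your FREE offer"],
    good_samples=["MailEnable upgrade notes", "Meeting notes attached"],
)
```

    Retraining with larger or more representative pools changes these counts, which is why the quality of the samples matters.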

    MailEnable has four options for configuring Bayesian filtering:

    1. Auto-training
    2. Using the default dictionary
    3. Manual training via a command line utility and scripts
    4. A combination of both manual and auto-training

    Setting up auto-training or manual training is not essential, but either one allows the Bayesian filter to detect spam more accurately by continuously updating and adding to the dictionary.

    Manually training the filter is a more complex process and is described in the Manual Training section.