Clustering with Pro: Internal error intermittently occurring

Raise/discuss any potential issues with MailEnable for consideration in project issue register.
rtpHarry
Posts: 5
Joined: Fri May 19, 2006 2:36 pm

Clustering with Pro: Internal error intermittently occurring

Post by rtpHarry »

Hi, we have MailEnable Pro v6.0 installed. It is set up in a load-balanced configuration, with mirroring between the two instances.

Some clients have been complaining that mails are not being sent or received. We investigated and found that, of the 15,105 emails we processed between the 16th and the 20th, 29 ended in an “Internal Error”. By server, 11 occurred on server A and 18 on server B. By route, 10 were SMTP to SMTP relays and 19 were SMTP to Postboxes.

The error logs look like this when they occur:


11/20/12 09:30:27         ME-MTA-ROUTE [427023440B4D4E3486940A07E770EEFB.MAI] from [SMTP] Connector queued to [SMTP] Connector as [559398DBFEA44CDF8D65FB7AA08355DB.MAI]
11/20/12 09:30:27         ME-MTA-ERROR Internal Error: (Error 5) Message (427023440B4D4E3486940A07E770EEFB.MAI) could not be transferred from [SMTP] Connector to [SMTP] Connector. The message could not be copied to \\directorybusiness.live\iis\MailEnable Config\Queues\SMTP\Outgoing\Messages\559398DBFEA44CDF8D65FB7AA08355DB.MAI. It has been bad mailed.
11/20/12 09:30:28         Debug 114: Message (427023440B4D4E3486940A07E770EEFB.MAI) has been copied to the system BadMail directory
11/20/12 09:30:28         Warning!- Could not delete Inbound Message: [427023440B4D4E3486940A07E770EEFB.MAI]. A filter or agent may have deleted the message.

11/18/12 19:48:26         ME-MTA-ROUTE [B042FFF7CA7F45A08FE0D005A97985C1.MAI] from [SMTP] Connector queued to [SF] Connector as [9A8DF95C913E4096AA6FC27D9E91F522.MAI]
11/18/12 19:48:26         ME-MTA-ERROR Internal Error: (Error 5) Message (B042FFF7CA7F45A08FE0D005A97985C1.MAI) could not be transferred from [SMTP] Connector to [SF] Connector. The message could not be copied to \\directorybusiness.live\iis\MailEnable Config\Queues\SF\Outgoing\Messages\9A8DF95C913E4096AA6FC27D9E91F522.MAI. It has been bad mailed.
11/18/12 19:48:26         Debug 114: Message (B042FFF7CA7F45A08FE0D005A97985C1.MAI) has been copied to the system BadMail directory
11/18/12 19:48:26         Warning!- Could not delete Inbound Message: [B042FFF7CA7F45A08FE0D005A97985C1.MAI]. A filter or agent may have deleted the message.

11/19/12 13:13:12         ME-MTA-ROUTE [AAFA61367DDB4EA2AF259061E0D34863.MAI] from [SMTP] Connector queued to [SF] Connector as [625444770929471C9403E63A28D743D9.MAI]
11/19/12 13:13:12         ME-MTA-ERROR Internal Error: (Error 5) Message (AAFA61367DDB4EA2AF259061E0D34863.MAI) could not be transferred from [SMTP] Connector to [SF] Connector. The message could not be copied to \\directorybusiness.live\iis\MailEnable Config\Queues\SF\Outgoing\Messages\625444770929471C9403E63A28D743D9.MAI. It has been bad mailed.
11/19/12 13:13:12         Debug 114: Message (AAFA61367DDB4EA2AF259061E0D34863.MAI) has been copied to the system BadMail directory
11/19/12 13:13:12         Warning!- Could not delete Inbound Message: [AAFA61367DDB4EA2AF259061E0D34863.MAI]. A filter or agent may have deleted the message.
Can anyone shed any light on this issue, or give me any thoughts on what I could try to do to resolve it / dig deeper?
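
For anyone wanting to reproduce a breakdown like the one above from their own logs, here is a minimal Python sketch. It assumes the error lines follow exactly the format quoted in this post; the function name and the regular expression are illustrative only and are not part of MailEnable.

```python
import re
from collections import Counter

# Matches the "ME-MTA-ERROR Internal Error" lines in the format quoted above.
# Assumption: other MailEnable versions may word these lines differently.
ERROR_RE = re.compile(
    r"^(?P<when>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\s+"
    r"ME-MTA-ERROR Internal Error: \(Error (?P<code>\d+)\) "
    r"Message \((?P<msg>[0-9A-Fa-f]+\.MAI)\) could not be transferred "
    r"from \[(?P<src>\w+)\] Connector to \[(?P<dst>\w+)\] Connector"
)


def tally_internal_errors(lines):
    """Count Internal Error lines per (source connector, target connector)."""
    by_route = Counter()
    for line in lines:
        match = ERROR_RE.match(line)
        if match:
            by_route[(match["src"], match["dst"])] += 1
    return by_route
```

Passing an open log file (or any iterable of lines) to `tally_internal_errors` returns a `Counter` keyed by route, so running it once per server gives the per-server and per-route split.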

MailEnable
Site Admin
Posts: 4441
Joined: Tue Jun 25, 2002 3:03 am
Location: Melbourne, Victoria Australia

Re: Internal error intermittently occurring

Post by MailEnable »

It looks like there is a connectivity or network stability problem accessing the file service \\directorybusiness.live\iis.
I suggest investigating whether that's the case by logging pings between the MailEnable server and directorybusiness.live over a test period to see if connectivity is being lost.
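
One way to log connectivity over a test period is sketched below in Python. Note the substitution: rather than ICMP ping, it times a TCP connection to port 445, on the assumption that the share is served over SMB on its standard port. The function names, the interval, and the log line layout are all illustrative.

```python
import socket
import time
from datetime import datetime


def probe_once(host, port=445, timeout=2.0):
    """Try one TCP connection; return (ok, detail)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            elapsed_ms = (time.monotonic() - start) * 1000
            return True, f"connected in {elapsed_ms:.0f} ms"
    except OSError as exc:
        # Covers timeouts, refused connections and name resolution failures.
        return False, f"{type(exc).__name__}: {exc}"


def format_result(when, host, ok, detail):
    """Build one log line, using the same timestamp style as the MTA log."""
    status = "OK" if ok else "FAIL"
    return f"{when:%m/%d/%y %H:%M:%S} {host} {status} {detail}"


def monitor(host, log_path, port=445, interval=5.0, rounds=None):
    """Probe repeatedly and append each result to log_path.

    rounds=None runs until interrupted.
    """
    count = 0
    while rounds is None or count < rounds:
        ok, detail = probe_once(host, port=port)
        with open(log_path, "a", encoding="utf-8") as log:
            log.write(format_result(datetime.now(), host, ok, detail) + "\n")
        count += 1
        if rounds is None or count < rounds:
            time.sleep(interval)
```

Running `monitor("directorybusiness.live", "probe.log")` on each MailEnable server and comparing any FAIL timestamps with the Internal Error timestamps would show whether the two coincide.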
Regards, Andrew

rtpHarry
Posts: 5
Joined: Fri May 19, 2006 2:36 pm

Re: Internal error intermittently occurring

Post by rtpHarry »

Thanks for your reply. I passed this to one of our devs and he came back with the following:
There is a DFS Event Log and that logs replication errors – there are a few, but none coincide with the times of the failed emails.

I believe the behaviour in such a situation is to just continue on without the missing replication partner and sync up later anyway.

I managed to dig out the replication debug logs that Windows keeps, though. They show only a request to create the file and populate it (both of which complete successfully before the Mail Transfer Agent picks it up), followed by the delete operation, which is processed at the same time the MTA log records attempting and failing to delete it:

11/29/12 11:29:04 Warning!- Could not delete Inbound Message: [07FF9C743842418CB435143069EC265F.MAI]. A filter or agent may have deleted the message.

There’s no file activity being recorded as attempted in between, so I have no idea what could be causing the MTA to fail to read these emails.
As I understand it, what he is saying is that the file system records the delete as succeeding, but the mail server then reports an error. Additionally, there is no record of any file operation moving the message from the inbound queue to the outbound queue.

Do you have any additional thoughts based on this?

MailEnable
Site Admin
Posts: 4441
Joined: Tue Jun 25, 2002 3:03 am
Location: Melbourne, Victoria Australia

Re: Internal error intermittently occurring

Post by MailEnable »

Having a closer look at the logging, the errors look like contention, i.e. it looks like you may have both Professional Edition servers sharing the same queues folder: \\directorybusiness.live\iis\MailEnable Config\Queues\SF\Outgoing\Messages

MailEnable Professional Edition is not cluster aware, so two servers cannot share the same queues with Professional Edition.
The clustering locking mechanism is only activated in Enterprise Edition. Clustering is largely uncharted territory with Professional Edition.

If you are bound to using Professional Edition then the starting point is to only have one server active at a time.
It is either that or take the queues off the share and have them on discrete servers and just have the message store on a shared volume.

Also, the queues themselves are very volatile, and replicating them is probably not ideal. DFS for the message store is more appropriate though.

Irrespective though, there are certain transient system files that should not be replicated/backed up (as follows).

* Anything matching "_activity.*" absolutely cannot be backed up or replicated. Doing so could cause system lockups.
The content of the _activity files is (very) volatile, so there is nothing gained by backing them up. They exist under the POSTOFFICES folder.

* These ones should also be excluded (particularly in the queues folder, but also under the message store (POSTOFFICES folder) and CONFIG (Configuration Folder)): _change.*, *.blk, *.tmp, *.maid
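
To check whether a replicated or backed-up tree currently contains any of these files, a small Python sketch like the following could be used. It only reports matches and changes nothing. The pattern list is copied from the two bullets above; the function names are illustrative, and matching is done case-insensitively on the assumption that the files live on a Windows volume.

```python
import fnmatch
import os

# Patterns taken from the exclusion list above.
EXCLUDE_PATTERNS = ["_activity.*", "_change.*", "*.blk", "*.tmp", "*.maid"]


def is_excluded(filename):
    """Return True if a bare file name matches any exclusion pattern."""
    name = filename.lower()
    return any(fnmatch.fnmatchcase(name, pattern) for pattern in EXCLUDE_PATTERNS)


def find_excluded(root):
    """Yield full paths under root whose file names match the exclusion list."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if is_excluded(filename):
                yield os.path.join(dirpath, filename)
```

Pointing `find_excluded` at the queues, POSTOFFICES and CONFIG folders on the replica lists anything that the replication or backup filter is still letting through.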
Regards, Andrew
