Tuesday, January 16, 2007

Exchange 2003 DR

Honestly, why does Exchange go down ONLY when I'm not in the building?

The trouble was diagnosed as a hardware fault with the fibre RAID array, so I'm sorry Apple, but your array is going elsewhere and will never darken my Exchange server again. It looks like a disk failed and the controller did not write the contents of the cache back to disk. Consequently the stores were not consistent and I got that nightmare of every Exchange admin: a -1018 error.

Choices:
  • Aelita Recovery Manager from the clever people at quest.com (yes, but I only have my own copy, not a corporate copy)
  • Exchange command line fix routine (not guaranteed to work)
  • Move mailboxes to a new store or new store on new server (corrupt emails will be deleted and there were a lot)
  • Restore from backup

Well, my last successful backup was Friday's. Monday's failed. Cue me tearing out what little hair I have left, users fretting and the manager wanting to know the ETA.

The answer: restore from backup using the no loss option (do not delete the transaction logs). Each store is mounted as its restore completes, and the logs, which are on a different array, are then played forward. So everything plays forward consistently, with no corrupt emails, and the 40,000 or so emails sitting on the bridgehead can then flow in.
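For anyone wondering why keeping the logs matters so much, here is a toy sketch of the idea in Python. It is only an illustration of "restore the backup, then replay every later log record in order", not how Exchange actually implements it; the `Store` class, the `restore_and_replay` function and the sample data are all made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """Toy mailbox store: message id -> body, plus the last log sequence applied."""
    messages: dict = field(default_factory=dict)
    last_applied: int = 0


def restore_and_replay(backup, logs):
    """Start from a copy of the backup, then apply every later log record in order."""
    store = Store(dict(backup.messages), backup.last_applied)
    for seq, op, msg_id, body in sorted(logs, key=lambda record: record[0]):
        if seq <= store.last_applied:
            continue  # this change is already contained in the backup
        if seq != store.last_applied + 1:
            # A missing log means we cannot roll forward consistently.
            raise ValueError(f"log gap before sequence {seq}")
        if op == "put":
            store.messages[msg_id] = body
        elif op == "delete":
            store.messages.pop(msg_id, None)
        store.last_applied = seq
    return store


# Hypothetical data: Friday's backup plus everything logged since.
friday = Store({"m0": "old", "m1": "hello"}, last_applied=2)
logs = [
    (1, "put", "m0", "old"),
    (2, "put", "m1", "hello"),
    (3, "put", "m2", "monday mail"),
    (4, "delete", "m1", None),
]
recovered = restore_and_replay(friday, logs)
```

The point of the sketch is the gap check: if the logs written since the backup had been deleted or damaged, the replay would stop and everything after Friday would be gone. Having the logs on a different array is what saved me.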

Users restored: 135
Data restored: 210GB
Users harmed in the making of this blog: 0 (so far)
Servers restored: 1
Neil Hobsons to thank: 1
Apple RAID arrays to be replaced with another manufacturer: 1

Moral of the story? Well, there are lots of them, but a good one is to continue to use Veritas Backup Exec 8.6. Yes, that 5-year-old piece of backup ingenuity that works. It doesn't mess about and it just gets on with it. Thank you Veritas: between you and the MVP program, email is running once again.

Carry on.
