This article covers the most common issues we run into when troubleshooting the members of a Database Availability Group (DAG) in Exchange.
Introduction:
A Database Availability Group (DAG) provides high availability for mailboxes: even if a member server or a database goes down, end users can still reach their mailboxes through the passive copy, so they see no major difference. Although users can still access their mailboxes, we need to fix the failure and bring the server or database back up. The sections below cover the most common issues that occur in a DAG.
1. Quorum and File Share Witness (FSW):
The word quorum comes from parliamentary procedure: a majority of the voting members must be present (here, online) before a decision can be made. The File Share Witness is used to reach that majority and keep quorum even if one server is down. The FSW is just a folder created during DAG configuration, and its path is assigned to the WitnessDirectory parameter of the DAG. A Hub server can host the FSW. If the FSW is hosted on a non-Exchange server, add the Exchange Trusted Subsystem group to that server's local Administrators group.
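To see which witness server and folder a DAG is currently using, a query along the following lines can help. This is a minimal sketch, assuming it is run from the Exchange Management Shell; "DAG_Name", "HUB01" and the folder path are placeholders, not values from this article.

```powershell
# Show the witness settings of the DAG.
# -Status is required for run-time values such as WitnessShareInUse.
Get-DatabaseAvailabilityGroup -Identity "DAG_Name" -Status |
    Format-List Name, Servers, WitnessServer, WitnessDirectory, WitnessShareInUse

# Point the DAG at a witness server and folder.
# "HUB01" and the path are hypothetical examples.
Set-DatabaseAvailabilityGroup -Identity "DAG_Name" `
    -WitnessServer "HUB01" -WitnessDirectory "C:\DAG_Name_FSW"
```

The first command only reads the configuration, so it is safe to run at any time. The second changes the DAG and should be run only when the witness really has to be relocated.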
2. Ignored File Share Witness:
The File Share Witness is ignored when the DAG has an odd number of member servers. The member servers form the quorum by themselves, so the witness is not needed. This happens because the quorum is calculated with the formula Q = (Nodes/2) + 1, where the division is rounded down. A DAG with an even number of members should therefore always have a File Share Witness. Let’s look at an example.
A small company initially starts its Exchange deployment with 1 Mailbox server, 1 CAS server and 1 Transport server. For mailbox site resiliency, it adds a second Mailbox server in a disaster recovery site and forms a DAG with both Mailbox servers. Since the DAG has an even number of member servers, the Hub server is added as the File Share Witness to provide the deciding vote. This configuration is called “Node and File Share Majority”. However, if the company adds one more Mailbox server to the DAG in site 1, the FSW is automatically ignored, because the DAG now has an odd number of member servers that can maintain quorum themselves, and the quorum configuration automatically switches to “Node Majority”.
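The arithmetic behind the example can be sketched in a few lines of PowerShell. Get-DagQuorum is a hypothetical helper written for illustration, not an Exchange cmdlet. It assumes that every member holds one vote and that the witness adds one vote only when the member count is even.

```powershell
# Sketch only: Get-DagQuorum is a hypothetical helper, not an Exchange cmdlet.
function Get-DagQuorum {
    param([int]$MemberCount)

    # An even member count relies on the File Share Witness for the extra vote.
    $usesWitness = ($MemberCount % 2) -eq 0
    $totalVotes  = $MemberCount
    if ($usesWitness) { $totalVotes = $MemberCount + 1 }

    # Q = (Nodes / 2) + 1, with the division rounded down.
    $required = [int][math]::Floor($MemberCount / 2) + 1

    New-Object PSObject -Property @{
        Members       = $MemberCount
        UsesWitness   = $usesWitness
        TotalVotes    = $totalVotes
        VotesRequired = $required
        CanLose       = $totalVotes - $required
    }
}

# The two-server and three-server DAGs from the example above.
Get-DagQuorum -MemberCount 2
Get-DagQuorum -MemberCount 3
```

In both cases 2 of 3 votes are required. With two members the third vote comes from the witness, and with three members it comes from the additional Mailbox server, which is why the witness is no longer needed.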
3. Split Brain Syndrome:
This is one of the more interesting DAG issues. Split brain syndrome is a bit difficult to understand, so we will put it in layman's terms. Let’s say we have two Active Directory sites and a DAG configured with two Mailbox servers, one in each site. If communication between the two sites is lost, the passive server assumes that the active server is down and tries to mount its passive database copy. In this scenario both servers hold active copies, and users connect to the server in their own location. Once connectivity is restored, the primary copy takes priority and the passive copy is dismounted. The data that was written to the passive copy during the outage is lost. If the database itself is corrupted, we can use the passive copy or try an Exchange OST file recovery option.
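Exchange has a DAG setting aimed at this situation, Datacenter Activation Coordination (DAC) mode, which is off by default. The sketch below shows how the setting could be inspected and enabled. It assumes the Exchange Management Shell and uses "DAG_Name" as a placeholder. Whether DAC mode suits a given deployment is a design decision that this sketch does not settle.

```powershell
# Check whether DAC mode is enabled (the value is Off or DagOnly).
Get-DatabaseAvailabilityGroup -Identity "DAG_Name" |
    Format-List Name, DatacenterActivationMode

# Enable DAC mode for the DAG.
Set-DatabaseAvailabilityGroup -Identity "DAG_Name" -DatacenterActivationMode DagOnly
```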
4. Failure of Primary Active Manager (PAM):
The PAM role is held by the DAG member that owns the cluster's core resource group (“Cluster Group”), which is initially the first server that joined the DAG. When that server goes down, selecting the best copy becomes difficult, so we move the PAM to a different server. This can be done with PowerShell. The following command shows which server currently holds the role:
Get-DatabaseAvailabilityGroup -Status -Identity "DAG_Name" | Format-List PrimaryActiveManager
This command tells us which server holds the Primary Active Manager role. To move the role to another server, use the command below.
Move-ClusterGroup -Cluster "DAG_Name" -Name "Cluster Group" -Node "Server Name"
Moving the PAM is very rarely necessary. We move it only when the server holding the PAM role goes down or is unreachable.
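After a move, it is worth confirming that the cluster group and the PAM role ended up where expected. The following is a minimal sketch, assuming the FailoverClusters module is available on the machine where it runs and that "DAG_Name" is replaced with the real DAG name.

```powershell
# The cluster cmdlets live in the FailoverClusters module.
Import-Module FailoverClusters

# Which nodes are up, and which node owns the core cluster group?
Get-ClusterNode -Cluster "DAG_Name" | Format-Table Name, State -AutoSize
Get-ClusterGroup -Cluster "DAG_Name" -Name "Cluster Group" |
    Format-List Name, OwnerNode, State

# The PAM reported by Exchange should match the owner node above.
Get-DatabaseAvailabilityGroup -Identity "DAG_Name" -Status |
    Format-List Name, PrimaryActiveManager
```

All three commands are read-only, so they can also be used before a move to check whether one is needed at all.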
Conclusion:
Though a DAG can run into many other issues, the ones discussed above are the primary issues and topics to know. Once these basics are understood, the rest can be handled more easily.
Author Introduction:
Sophia Mao is a data recovery expert at DataNumen, Inc., the world leader in data recovery technologies, including PST email repair and Word recovery software products. For more information, visit www.datanumen.com