Problems getting a DAG up and running.

Status
Not open for further replies.
Lain Robertson



Hi all,

I'm struggling here a little bit. At the moment, I'm working along with the following guide:
http://technet.microsoft.com/en-us/library/dd638129.aspx

and the following problem description:
http://social.technet.microsoft.com/Forums/en-US/exchange2010/thread/eced682e-b553-4667-a8c8-c7a4866c8643

I'm running the following commands from the PowerShell command line:

New-DatabaseAvailabilityGroup -Name DAG00 -WitnessServer fsw01.university.edu.au -WitnessDirectory "D:\Exchange\Witness\DAG00" -DatabaseAvailabilityGroupIpAddresses 10.1.1.1,10.1.5.1
<skipped the Set-DatabaseAvailabilityGroup command, because we don't have a backup witness server>
Add-DatabaseAvailabilityGroupServer -Identity DAG00 -MailboxServer exc01.university.edu.au
Add-DatabaseAvailabilityGroupServer -Identity DAG00 -MailboxServer exc02.university.edu.au

At this point everything has run with no errors, though I did originally stumble at the fourth point, because while the script adds permissions for the DAG00$ computer account to the share, the account had no rights on the file system.
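A minimal sketch of how the missing file system rights could be granted, run from a PowerShell prompt on the witness server. The NetBIOS domain name UNIVERSITY and the choice of full control are assumptions for illustration; the thread does not say which rights were actually granted.

```powershell
# Assumption: 'UNIVERSITY' is the NetBIOS domain name (not stated in the thread).
# Grants the DAG00$ computer account full control on the witness directory,
# inherited by subfolders (CI) and files (OI).
icacls 'D:\Exchange\Witness\DAG00' /grant 'UNIVERSITY\DAG00$:(OI)(CI)F'

# Show the resulting ACL to confirm the entry is present.
icacls 'D:\Exchange\Witness\DAG00'
```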

It seems the longer I take to make this post, the more changes crop up. Firstly, I had this error when I went to refresh the GUI:

When I double-checked this with the following cmdlet, the error was actually coming up on the source server, the one we consider to be the primary. The cmdlet was:

Get-DatabaseAvailabilityGroupNetwork | Sort-Object -Property Name

Anyway, half an hour passed while I got sidetracked helping someone else, and lo and behold, the cmdlet finally returned both network definitions and the GUI reflected this. I was hoping all would be well, but then I noticed in the Networks tab that the second definition is incomplete, as shown below:

This is also confirmed by the output of the "Get-DatabaseAvailabilityGroupNetwork | Sort-Object -Property Name" cmdlet (results below):

Identity            ReplicationEnabled  Subnets
--------            ------------------  -------
DAG00\DAGNetwork01  True                {{10.x.x.0/21,Up}}
DAG00\DAGNetwork02  True                {{10.x.y.0/24,Unknown}}
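To see more detail on a network whose subnet reports "Unknown", the full property list can be more useful than the default table view. This is a sketch only; it assumes the network identity shown in the output above and needs to be run from the Exchange Management Shell.

```powershell
# List the subnet and interface state for the second DAG network.
# An empty Interfaces list would suggest no DAG member currently has
# an interface on that subnet.
Get-DatabaseAvailabilityGroupNetwork -Identity 'DAG00\DAGNetwork02' |
    Format-List Name, Subnets, Interfaces, MapiAccessEnabled, ReplicationEnabled
```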

The only thing I can think of here is that within DNS, only the first of the two IP addresses shows up, i.e. only the first has been registered as an A record. This is despite providing both IPs on the command line as per the TechNet article linked at the top of this post, and both being shown by the "Get-DatabaseAvailabilityGroup | fl" cmdlet.

What I'm not sure about is whether it's safe for me to manually enter the second A record into DNS, as I see that the one that was successfully added is not classed as static and has (by our standards) a very short TTL of 20 minutes. I'm assuming that if I add the second A record manually, it could cause issues at failover time.
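One way to compare what Exchange has recorded for the DAG with what DNS currently returns is sketched below. The DAG's fully qualified name and the domain controller name dc02 are assumptions, as neither appears in the thread.

```powershell
# Assumption: the DAG's cluster name is registered under the same DNS suffix
# as the servers named in this thread.
$dagFqdn = 'DAG00.university.edu.au'

# IP addresses assigned to the DAG, as recorded by Exchange.
# The wildcard avoids depending on the exact property name.
Get-DatabaseAvailabilityGroup -Identity DAG00 |
    Format-List Name, DatabaseAvailabilityGroupIp*

# Addresses the local resolver returns for the DAG name
# (this may be served from the local DNS cache).
[System.Net.Dns]::GetHostAddresses($dagFqdn) |
    ForEach-Object { $_.IPAddressToString }

# Ask one specific DNS server directly; 'dc02' is a hypothetical
# domain controller at the remote site.
nslookup $dagFqdn dc02.university.edu.au
```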

Does anyone have any thoughts on this problem?

Cheers,
Lain
 
Lain Robertson



Actually, I just realised that I spent all my time describing the problem but not the environment.
The two members of the DAG are Exchange 2010. The witness server is Server 2008 R2 Core only, not a Hub Transport, and it is located at the primary site. The sites are physically separate, but the response times are around 50ms, well below the prescribed maximum of 250ms that I read about.

Cheers,
Lain
 
Zahir Hussain Shah



Hi Lain,

Before helping you resolve your issue, I would like to ask some questions:
1. Did you add the remote witness server's computer account to the Exchange Trusted Subsystem group in AD?
2. You must make sure that the witness server at the primary site has full (administrative) rights on the Exchange organization.
3. You may also check the NTFS permissions for this computer object on the physical drive/folder.
4. Can you also confirm the location of your Mailbox and CAS servers across these two separate sites?
5. If you have one Mailbox server in one site and another Mailbox server in the other site, you also have to give two IP addresses for DatabaseAvailabilityGroupIpAddresses, covering both subnets.

Please post some more information here, so that it is easier for us to resolve the problem or find its actual cause.
Zahir

Zahir Hussain Shah | Infrastructure Practice Consultant | Messaging & Unified Communication Engineer | SMTP: zhshah@live.com | HTTP: www.zahirshahblog.com | Voice: 0092-50-8249903 | Visit my blog for Exchange, AD & OCS Solutions: http://zahirshahblog.com
 
Lain Robertson



Hi Zahir and Martin,

I'll answer both sets of questions in point form:
1. Yes.
2. I hadn't read this requirement anywhere so far, so I had not done this. Do you have a link to this information? I'd be happy to give this a try if I can find some technical material to reference this requirement.
3. As I mentioned in the first post, I stumbled across this while working through the TechNet article and fixed it straight away.
4. There are three servers: two Exchange 2010 servers hosting the CAS+mailbox+hub roles, while the third server is a non-Exchange Server 2008 R2 Enterprise machine. The primary Exchange server is in the same site as the witness server, while the secondary Exchange server is in a separate subnet (and therefore site).
5. As I already mentioned in the first post, I assigned two IP addresses to the DAG via the New-DatabaseAvailabilityGroup cmdlet: one for each subnet.

Martin: Yeah, I was worried about the firewall. Initially, I enabled logging for dropped packets only (since I usually leave logging off altogether), but nothing was being trapped. Eventually, I ended up creating one rule on each of the Exchange servers allowing traffic to any local port from the opposing Exchange server, but this actually made no difference.
Ultimately, nothing changed until I had to disappear for half an hour, after which I had moved on from the original problem shown in the first picture to the one in the second, so to a degree it appeared to resolve its own issues.

What I'll probably do tomorrow (since it's coming up to midnight here now) is tear down the entire DAG again, remove the shares and run through the same steps I listed above, pausing for half an hour or more before adding the second DAG member.

Thanks for the replies, guys.

Cheers,
Lain
 
Zahir Hussain Shah



Hi Lain,
It sounds like you have made good progress towards resolving this problem.
I'd also suggest my blog entry on creating a DAG with the Exchange Management Shell:

http://zahirshahblog.com/2010/05/04/creating-dag-in-exchange-2010-using-exchange-management-shell/
Keep us posted.
Cheers!
Zahir

 
jader3rd



FSWs don't need to be members of the Exchange Trusted Subsystem group; it's just that if they're not, you need to deal with configuring quorum yourself, because Exchange can't do it.

With your networks, which one do you want to be the MAPI network and which one is going to be the replication network?

Do both of the NICs on both of the nodes have IP addresses?

The IP Addresses for the replication network shouldn't be registered in DNS.
 
Zahir Hussain Shah



Yes, if we put the witness directory on an Exchange server, we don't need to add the server account to the Exchange Trusted Subsystem group in AD. But if you configure the witness directory on a non-Exchange server, you must add that non-Exchange computer account to the Exchange Trusted Subsystem group in AD.

The MAPI network of the DAG should be configured in Failover Cluster Manager > Networks: open the properties of the MAPI network and tick the option for clients to use this network for connecting to the cluster.
You can untick that option for a non-MAPI network, such as your replication one.
And as jader3rd said, in the network properties of the non-MAPI (replication) network you can disable NetBIOS and untick the option to register this connection in DNS.
Zahir

 
Martin Sundström



Good to hear, I'm glad that it seems to be working now. It might be a good thing to redo the process. If nothing else, you will at least have a good idea of how it works... And by the way, there is a wait time when working with DAGs, as with any Microsoft product ;)

Martin Sundström | Microsoft Certified Trainer | MCITP: Enterprise Messaging Administrator | http://msundis.wordpress.com
 
Martin Sundström

Zahir, please don't mark every post you make as an answer to the threads. This will not help you to get more points!
 
Lain Robertson



Hi Martin and co,

Martin, time was indeed the critical factor here, as when I followed the process below, the second node added fine:
1. Run from the primary site:
New-DatabaseAvailabilityGroup -Name DAG00 -WitnessServer fsw01.university.edu.au -WitnessDirectory "D:\Exchange\Witness\DAG00" -DatabaseAvailabilityGroupIpAddresses 10.1.1.1,10.1.5.1
2. Run from the primary site:
Add-DatabaseAvailabilityGroupServer -Identity DAG00 -MailboxServer exc01.university.edu.au
3. Use repadmin /replicate to force replication of the default naming context, DC=DomainDnsZones and DC=ForestDnsZones across to the remote subnet;
4. Wait for the remote subnet's domain controller to load the DAG00 A record into the zone;
5. Run from the remote site's Exchange server:
Add-DatabaseAvailabilityGroupServer -Identity DAG00 -MailboxServer exc02.university.edu.au
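The replication and waiting steps in the process above could look roughly like the following sketch. The domain controller names dc01 and dc02, the domain distinguished name (assumed to match the DNS suffix in a single-domain forest) and the DAG's fully qualified name are all assumptions, as none of them appear in the thread.

```powershell
# Hypothetical domain controllers: dc01 at the primary site, dc02 at the remote site.
$sourceDc = 'dc01.university.edu.au'
$destDc   = 'dc02.university.edu.au'

# Assumption: the AD domain DN mirrors the DNS suffix used in this thread.
$namingContexts = @(
    'DC=university,DC=edu,DC=au',
    'DC=DomainDnsZones,DC=university,DC=edu,DC=au',
    'DC=ForestDnsZones,DC=university,DC=edu,DC=au'
)

foreach ($nc in $namingContexts) {
    # Syntax: repadmin /replicate <destination DC> <source DC> <naming context>
    repadmin /replicate $destDc $sourceDc $nc
}

# Query the remote site's DC directly, and repeat until it returns
# an address for the DAG name (assumed FQDN).
nslookup DAG00.university.edu.au $destDc

# Only then add the second member, from the remote site's Exchange server.
Add-DatabaseAvailabilityGroupServer -Identity DAG00 -MailboxServer exc02.university.edu.au
```

The zone on the remote DC may still take a short while to load the replicated record, so the lookup is worth repeating by hand rather than relying on the replication command's success message alone.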

So, this has resolved the problems I was experiencing in the first post. Unfortunately, I've run straight into another problem, as outlined by another user in this post:

http://social.technet.microsoft.com/Forums/en-US/exchange2010/thread/1c1ccd21-f2eb-4acf-ac8a-87d339d6cd2c/

Just for posterity, here's the error produced when I try to bring the primary node back online:

Anyway, I'll mark this thread as resolved, since the new issue can be addressed separately: I should open a new thread for it (or contribute to the thread linked above on the same topic, though it looks like the owner might have since abandoned it).

Cheers,
Lain
 