CAS Array Failover Issue

Status
Not open for further replies.

rdsoxfn

So I have two sites, A and B. Site A will be the live site and contains two CAS/HUB servers configured in a CAS array and two MBX servers. Site B contains a single CAS/HUB server and a single MBX server. All three MBX servers are members of one DAG, and the CAS/HUB in Site B will only be used in case of a full Site A failure. Last night we tested the CAS array by shutting down one of the CAS/HUB servers. While that server was shut down we were unable to browse OWA or perform any client access functions. As soon as it came back up everything was fine.

I ran get-mailboxdatabase | fl *rpcclientaccess*,name and all my databases point to the CAS Array.

What can I look at as a reason for my NLB CAS array failing when just one of the servers in the array is shut down?

Thanks,

Jim
 
Courtenay Snell

Hi Jim,

Does the DNS A record for your CAS array point to the VIP of your load balancer? If you've accidentally pointed it to the host address of the CAS server you shut down, then you will see this issue.
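For anyone who wants to script the DNS check, here is a minimal Python sketch. It is only an illustration: the function names, the array name, and all addresses are made up for the example, and it only looks at what the local resolver returns, so it should be run from a client that sees the problem.

```python
import socket

def resolve_ipv4(fqdn):
    """Return the sorted IPv4 addresses the local resolver gives for fqdn."""
    infos = socket.getaddrinfo(fqdn, None, socket.AF_INET, socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})

def classify_array_record(resolved, vip, node_ips):
    """Describe where the CAS array name points, given its resolved addresses."""
    resolved = set(resolved)
    if not resolved:
        return "no A record found"
    if resolved == {vip}:
        return "points to the VIP"
    hits = sorted(resolved & set(node_ips))
    if hits:
        return "points to a node address: " + ", ".join(hits)
    return "points somewhere unexpected: " + ", ".join(sorted(resolved))

# Hypothetical usage (names and addresses are assumptions, not from the thread):
#   addrs = resolve_ipv4("casarray.example.local")
#   print(classify_array_record(addrs, "10.0.0.50", ["10.0.0.11", "10.0.0.12"]))
```

If the result names a node address rather than the VIP, clients are bypassing the load balancer and will fail whenever that node goes down.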

Also, what load balancing solution are you running? It could be a configuration issue with your priority, affinity, or service health checking that is still trying to send requests on to the failed host.

Cheers
Courtenay Snell
 
Neil Hobson [MVP]

What was the status of the other CAS server in the NLB array at the time you shut down the first CAS server?

Neil Hobson, Exchange MVP
 
rdsoxfn

Courtenay:

The CAS array DNS A record does point to the VIP of the load balancer. We are using Windows Network Load Balancing (WNLB). I did see in my Advanced TCP/IP settings on the NLB NICs I have two IPs. One obviously is the NIC IP, but the 2nd is the VIP. Do I need the VIP bound to the NLB NICs?

Neil:

During the test we only shut down one CAS server, so the other was up and running. While it was down I logged into the other server and launched the NLB management client. The cluster was there, but only the CAS server that was up appeared in the client; the server that had been shut down disappeared from the cluster completely. I would have expected it to still be listed, but in a failed state.
 
rdsoxfn

Just another note about our CAS/HUB environment. The first CAS server is a physical server and the second is a virtual server in a vSphere environment. The server we shut down was the virtual one. I just read on msexchange.org:

"When you plan to configure a WNLB array for virtual Exchange 2010 CAS servers that uses VMware ESX Server as the virtualization platform, it is recommend to configure your WNLB in multi-cast mode since you otherwise will expect an issue with the WNLB array not working properly." It then references the following VMware KB article:

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1556

I checked the Cluster Operation Mode and we are set to Unicast. What's your opinion on the Unicast setting causing this?
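Some background that may help when comparing the two modes on the switch side: as far as I know, WNLB derives the cluster MAC address from the VIP, using a 02-BF prefix in unicast mode, 03-BF in multicast mode, and 01-00-5E-7F plus the last two octets of the VIP in IGMP multicast mode. The Python sketch below just computes those values so they can be compared against ARP caches or switch MAC tables. The function name and the sample VIP are made up, and the result is worth verifying against what NLB Manager shows in the cluster properties.

```python
def wnlb_cluster_mac(vip, mode):
    """Compute the cluster MAC WNLB is expected to use for a VIP (assumed scheme)."""
    octets = [int(part) for part in vip.split(".")]
    if len(octets) != 4 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError("expected a dotted IPv4 address")
    if mode == "unicast":
        prefix, tail = [0x02, 0xBF], octets
    elif mode == "multicast":
        prefix, tail = [0x03, 0xBF], octets
    elif mode == "igmp":
        # IGMP multicast keeps only the last two octets of the VIP.
        prefix, tail = [0x01, 0x00, 0x5E, 0x7F], octets[2:]
    else:
        raise ValueError("mode must be 'unicast', 'multicast', or 'igmp'")
    return "-".join(f"{b:02X}" for b in prefix + tail)

# Hypothetical VIP, not taken from the thread:
#   wnlb_cluster_mac("10.0.0.50", "unicast")    -> "02-BF-0A-00-00-32"
#   wnlb_cluster_mac("10.0.0.50", "multicast")  -> "03-BF-0A-00-00-32"
```

In unicast mode every node answers with the same cluster MAC, which is why the virtual switch and physical switch settings matter for whether traffic reaches the surviving node.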
 
Tkoeppe

Did you ever find an answer to this? I am having mixed performance and connectivity issues with the exact same setup (WNLB in unicast mode for two CAS servers running on ESX 3.5 Update 5). I have not performed the steps in the VMware KB (http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1556) because they impact the entire vSwitch that my servers are running on. I would like to just switch my WNLB to multicast as a simpler solution, if it's also a recommended solution for my problem.
 
James-Luo

Quote: “I did see in my Advanced TCP/IP settings on the NLB NICs I have two IPs. One obviously is the NIC IP, but the 2nd is the VIP. Do I need the VIP bound to the NLB NICs?”

What's your definition of an NLB NIC? As far as I know, in unicast mode each node has two NICs: one (the public NIC) is on the IP subnet that all servers and clients can communicate with, and the other (the NLB NIC) is on an IP subnet that only the NLB nodes use to talk to each other.

If your definition is the same, the VIP should be set on the public NIC, not the NLB NIC. The NLB NICs are only used for NLB communication between the two nodes, over a crossover cable, or over a VLAN if you add more nodes to the NLB cluster in the future.

James Luo

 
rdsoxfn

Just to update this post: we changed the NLB cluster for the CAS array from unicast to multicast mode and it broke all communication. We were unable to browse OWA, and Outlook clients could not connect to Exchange. Once we switched back to unicast all was good.

James, thank you for your post. My definition is the same as yours. NIC 1 (Public) does have the full config (with gateway) while NIC 2 (NLB) has just the IP and subnet. During our next maintenance window (likely this weekend) I will remove the VIP from the NLB NIC config on both CAS servers in the array and bind it to the Public NIC. I will update again after we make this change.

Jim
 
rdsoxfn

I finally got approval to test tonight. I will let you all know the results.
 
rdsoxfn

James, here is the update. I removed the CAS array VIP from the NLB NICs on both CAS/HUB servers and moved the VIP to the Public NIC. After the move I could not browse OWA, and NLB Cluster Manager showed an error that the CAS/HUB servers were misconfigured, with a yellow/black exclamation point next to each server. As soon as I moved the VIPs back to the NLB NICs, OWA browsed without issue and the cluster returned to normal.

So with that test I have confirmed that the VIP of the CAS array belongs on the NLB NICs. During the test I left the VIP on the primary NIC and shut down one of the CAS/HUB servers. I was still unable to browse OWA, so the CAS failover failed.

Any suggestions as to the next step would be greatly appreciated.

Jim
 
rdsoxfn

Just an update to this issue. Yesterday I discovered that the NLB IPs I have set are also assigned to other devices according to DNS. Instead of tracking down what was using them, I just obtained two new IPs in the same VLAN and verified no entries existed for them in DNS.
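For anyone hitting the same thing, a quick way to spot this kind of overlap is to export the A records from the zone and look for addresses that appear under more than one name. The Python sketch below assumes the records are already available as (hostname, IP) pairs; how they are exported is left out, and the function name and sample data are hypothetical.

```python
from collections import defaultdict

def find_shared_ips(records):
    """Map each IP that appears under more than one hostname to those hostnames."""
    names_by_ip = defaultdict(set)
    for hostname, ip in records:
        names_by_ip[ip].add(hostname.lower())
    return {ip: sorted(names) for ip, names in names_by_ip.items() if len(names) > 1}

# Hypothetical sample data, not taken from the thread:
sample = [
    ("cashub01-nlb", "10.0.0.21"),
    ("printer7", "10.0.0.21"),
    ("cashub02-nlb", "10.0.0.22"),
]
print(find_shared_ips(sample))
```

A DNS entry only shows that an address was registered at some point, so anything this reports still needs to be confirmed against the device that actually holds the address.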

I changed the IPs and then created new DNS entries for each IP. Afterwards, I verified I could ping each IP. I then shut down CAS01 and could still ping and telnet to the CAS array and, most importantly, could browse to OWA and log into my mailbox.

However, as soon as CAS02 was shut down, requests to the CAS array FQDN timed out, so I could neither ping it nor browse OWA.
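The telnet-style part of this test can be scripted so it is easy to repeat from several client machines. Below is a minimal Python sketch that simply attempts a TCP connection; it does not send ICMP, so it is not a replacement for ping. The function name, host name, and port in the usage comment are assumptions for illustration.

```python
import socket

def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False

# Hypothetical usage against the array name on the HTTPS port used by OWA:
#   tcp_port_open("casarray.example.local", 443)
```

Running the same check from the server VLAN and from a client VLAN at the same moment helps separate an NLB problem from a network path problem.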

I have checked the configs on both servers and everything matches. Opening a ticket with Microsoft this morning.
 
CKZ-NLD

Did you change the RpcClientAccessServer property on the databases? Databases that existed before the CAS array was created don't pick up this property automatically; only the ones you create afterwards do.

get-MailboxDatabase -Identity %STORE% | fl RpcClientAccessServer
 
rdsoxfn

All databases point to the CAS array. I didn't have to set this because the array was created before all the databases. I'm on a call with Microsoft right now. We have found that when we shut down CASHUB02 (the VM) we can browse to OWA without issue if we stay in the server VLAN. After a period of time (about 2 to 3 minutes) the client machines, and those connected to the VPN, figure out that CASHUB01 is up in the array, and the CAS array starts to respond to ping again. The MS network engineer is currently looking for a solution to this.
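To put a number on that delay rather than estimating it, a small polling loop can record how long a client takes to reach the array again after a node is shut down. This Python sketch is only one way to do it: the function name is made up, and the reachability test itself is passed in as `check`, which can be any zero-argument callable that returns True or False (for example, one that attempts a TCP connection to the array name on port 443).

```python
import time

def time_until_reachable(check, interval=5.0, max_wait=600.0,
                         clock=time.monotonic, sleep=time.sleep):
    """Poll check() until it returns True; return elapsed seconds, or None on timeout."""
    start = clock()
    while True:
        if check():
            return clock() - start
        if clock() - start >= max_wait:
            return None
        sleep(interval)
```

Started on a client VLAN machine right after the shutdown, the returned value gives the recovery time as seen from that network segment, which can then be compared with the same measurement taken inside the server VLAN.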
 
CKZ-NLD

Are the Outlook clients pointing to this CAS array FQDN?

And is the CAS array configured with its own IP address in the DNS server?
 
rdsoxfn

Yes to both questions. MS network engineer says the CAS array config as well as the NLB config look good. He believes this may be a routing issue and is performing further research and will get back to me. I will post his feedback once I have it.
 