My clients lost connectivity with the server [SOLVED]

jfgoodhew1

Honorable
Aug 30, 2013
Background: I'm a volunteer, with experience of fixing quite a few client side problems (TH has been an excellent resource there!). Now, in rural Africa, I've been put in charge of a network that was created by a service company. No other reason than I'm the best they've got.

Network details: ~15 clients, 1 DC/server, 1 router.

The service company seem to have done a lot of advanced things (not all of them sensible according to things I'm learning at the moment), which makes troubleshooting difficult for a beginner.

Clients: Win XP/7 - yes we are upgrading all to Win 7 soon :)
Server: Win Server 2003


SCENARIO:
Clients can connect to each other, and the server can connect to them.
Clients cannot connect to the server.
Users can still log into the domain (luckily, lease must still be remembered).
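
A scenario like this can be narrowed down by testing individual TCP ports on the server from one of the clients. The following is a minimal Python sketch, assuming Python is available on a client. The function names and the port list are illustrative choices, not something taken from this network.

```python
import socket

# Ports a Windows domain controller commonly listens on (illustrative list).
COMMON_DC_PORTS = {
    "DNS": 53,
    "Kerberos": 88,
    "NetBIOS session": 139,
    "LDAP": 389,
    "SMB": 445,
}

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False

def check_server(host: str) -> dict:
    """Map each service name to True/False depending on reachability."""
    return {name: port_open(host, port) for name, port in COMMON_DC_PORTS.items()}
```

Calling `check_server` with the server's IP address from a client shows which services answer. If every port fails while ping works, a filter on the server (firewall, IPsec policy, NAT driver) is a more likely suspect than the cabling or the switch.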

NIC settings are: server for default DNS, static IP for all, gateway is router.

I have tried disabling Routing and Remote Access from both services.msc and computer management on the server.

I have tried all the netsh reset commands, restart DNS Server, restart IPsec, restart net logon, etc. to no avail.

I have discovered trying to change Windows firewall settings leads to error about ipnat.sys being in use already. Only security program running is ClamWin, which doesn't have a firewall.
I also found there are 4 dnsnodes in our zone under AD -> Domain -> System -> MicrosoftDNS -> reverse zone: @, 127, 201 and 99. I *think* 99 might be the only one that should be there (our server), though I'm not sure about the @. If I right-click and delete the other numbers from AD (first quadrant of actual client IPs), they reappear very quickly. Not sure about deleting them from DNS.

File and Connection sharing is enabled as far as I can tell.

Win 7 clients can see the full network map, by opening My Computer and clicking Network.


RECENT DNS ERRORS IN SERVER EVENT LOG:

410 - list of restricted interfaces does not contain a valid IP for the server.
4007 - unable to open zone _msdcs.xxx.local from AD partition domaindnszones
4015 - critical error from AD
6702 - DNS server has updated its own host records
4016 - timed out on attempting service operation on DC=xxxx (a current client PC name) DC=xxxx.local (domain name) ...
4004 - unable to complete directory service enumeration of zone xxxx.local (our domain forward zone)
4004 - same but for our reverse lookup zone
4004 - same for our _msdcs forward zone
4004 - same for zone ..
4521 - encountered error 32 attempting to load zone reverse lookup (1.168.192.in-addr.arpa)
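
For readers unsure how the numbered dnsnodes relate to the reverse zone named in the 4521 error: each node in a /24 reverse zone is one octet of a host's address. A small Python sketch using the standard `ipaddress` module shows the mapping. The address 192.168.1.99 is an assumption pieced together from the zone name and the "99" node, and the function name is made up for illustration.

```python
import ipaddress

def reverse_zone_and_node(ip: str, zone_labels: int = 3):
    """Split an IPv4 address's PTR name into (node, reverse zone).

    zone_labels=3 means the zone covers the first three octets,
    which matches a zone named like 1.168.192.in-addr.arpa.
    """
    # reverse_pointer gives e.g. '99.1.168.192.in-addr.arpa'
    labels = ipaddress.ip_address(ip).reverse_pointer.split(".")
    node_count = 4 - zone_labels          # octets left over for the node name
    node = ".".join(labels[:node_count])
    zone = ".".join(labels[node_count:])
    return node, zone

# Assumed server address, for illustration only.
print(reverse_zone_and_node("192.168.1.99"))
```

Under that assumption, nodes such as 127 and 201 would simply be PTR records for hosts ending in .127 and .201 in the same subnet, which would explain why they come back after being deleted: the hosts (or DHCP) re-register them.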

Thank you for any help and assistance you can offer. I've been Googling for 2 days with a server down... Luckily our DB program can be run from a client, so everyone's slightly consoled... (excuse the pun).

Oh I almost forgot, no computers on the network have internet connectivity either. I tried setting external DNS address on one client and server, but it failed both times. I reset those to the server IP straight away thanks to Ace Fekay's helpful blog on the subject: http://msmvps.com/blogs/acefekay/archive/2009/08/17/ad-and-its-reliance-on-dns.aspx

Something about the server is wrong, I cannot figure out what. Your help is tremendously appreciated. Please do start with basic configuration as I'm not even convinced that is correct.


SOLUTION:
Part 1: Server network adapter properties -> TCP/IPv4 -> Advanced -> WINS -> NetBIOS over TCP/IP was disabled; set it back to Default.
Part 2: stop Routing and Remote Access service on server.

Internet connection issue is a separate problem, going to make a new post.
Thanks for help Josh!
 

jfgoodhew1

Honorable
Aug 30, 2013
Thanks guys for reading. I have an update from this morning:
NETBIOS over TCP was disabled in the network adapter properties -> TCP/IPv4 settings. I set it back to default and we regained connectivity to the server in safe mode with networking (firewall off).
Rebooted into normal mode, and server is still online.

Problem now is internet connection: no computers on the network can access the internet. All computers can ping external addresses but browsers do not load external pages.
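
The symptom "ping to an external address works, browser does not" can be turned into a small decision table. The Python sketch below is only that: a hedged illustration of the reasoning, with made-up function names. `can_resolve` is a helper that would need a live network and is not required by the classifier itself.

```python
import socket

def can_resolve(name: str) -> bool:
    """Return True if the system resolver can turn name into an IPv4 address."""
    try:
        socket.gethostbyname(name)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False

def classify(ping_ip_ok: bool, resolve_ok: bool, http_ok: bool) -> str:
    """Map three simple observations to the most likely problem area."""
    if not ping_ip_ok:
        return "no route to the internet: check gateway, router and NAT"
    if not resolve_ok:
        return "DNS: addresses are reachable but names do not resolve"
    if not http_ok:
        return "names resolve: check proxy settings or port filtering"
    return "connectivity looks fine"

# The situation described in this post: external pings succeed,
# pages do not load, and name resolution is the untested middle step.
print(classify(ping_ip_ok=True, resolve_ok=False, http_ok=False))
```

If pinging an external IP succeeds but pinging an external name fails, the second branch applies and DNS is the place to look.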
 

Shadowjk

Honorable
Aug 2, 2013
DNS is the issue there. If the server is acting as a DHCP server then you need to add a second DNS server to the lease. For example:

Primary: {DC's IP address}

Secondary: {Router's IP address}

As long as the router has valid DNS servers configured for it the above should resolve this issue. If you do not use DHCP then you are going to have to statically assign the additional DNS servers.

Hope This Helps,
Josh :)
 

jfgoodhew1

Honorable
Aug 30, 2013


Hi Josh, thanks so much for the advice. I totally agree it looks like DNS, but I don't know my way around it well enough to diagnose :( Unfortunately all clients are already configured like you said: static IP, DC = DNS IP 1, router = DNS IP 2. The router is from the ISP, so it is pre-configured with their DNS servers.

Another problem is that my last update was wrong :( Clients *could* ping external addresses on Friday, possibly while the server was in safe mode with networking, possibly related to NetBIOS over TCP/IP. I honestly don't know what caused that to stop working. Now the server is running in normal mode and no one can ping externally anymore. AFAIK there's no firewall on the server, yet ipnat.sys is in use, so Windows Firewall can't run. Is that a red herring, do you think?

So, it's a tricky one... Not been able to look into it much more today since our internal DB was throwing issues as well. Fixed those, back on the case now.
 

Shadowjk

Honorable
Aug 2, 2013
Does the DC have NAT installed? Under routing and remote access service. Unless you are routing the clients through the server to get to the router it should work.... hmmm...

Can you edit DNS on the DC so that if a DNS query fails it forwards it on to the router?

For example, my DC is configured to forward to my router's IP address whenever a query fails.

https://docs.google.com/file/d/0B3xO7mJ8TY7mZGUyQzl6NGtYb2c/edit?usp=sharing
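
The forwarding idea can be sketched in a few lines of Python. This is a toy model, not how the Windows DNS service is implemented: the DC answers from the zone it hosts and hands everything else to a forwarder (the router in my case). All names and addresses below are hypothetical placeholders.

```python
from typing import Callable, Dict, Optional

def resolve_with_forwarder(
    name: str,
    local_zone: Dict[str, str],
    forwarder: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Answer from the local zone if possible, otherwise ask the forwarder."""
    key = name.rstrip(".").lower()   # DNS names are case-insensitive
    if key in local_zone:
        return local_zone[key]       # internal name, answered by the DC itself
    return forwarder(key)            # anything else goes upstream

# Hypothetical data: one internal host and a stand-in for the router's resolver.
zone = {"server.example.local": "192.168.1.99"}
upstream = {"www.example.com": "203.0.113.10"}  # documentation-range address

print(resolve_with_forwarder("SERVER.example.local", zone, upstream.get))
print(resolve_with_forwarder("www.example.com", zone, upstream.get))
```

With a forwarder in place the clients only need the DC as their DNS server, since the DC passes external lookups on by itself.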

Hope This Helps,
Josh
 

jfgoodhew1

Honorable
Aug 30, 2013


NAT - don't know, will check. Sounds good, because RRAS is now thoroughly disabled (that was a potential solution I found earlier). If I enable it again (also thoroughly), having fixed the NetBIOS issue, it might just magically work.

I'll look at your link. Sounds like a great idea, especially if the timeout is short.

Just noticed - can't ping router from clients either...

Thanks so much for the pointers, I had literally lost all sense of where to go with this one.

James
 

jfgoodhew1

Honorable
Aug 30, 2013
UPDATE:
Clients can now connect to the server. The issues were 2:
Network adapter properties -> TCP/IPv4 -> Properties -> Advanced -> WINS tab -> NetBIOS over TCP/IP was disabled. I set it to default.

Also Routing and Remote Access was enabled. I disabled it and clients regained connectivity.

I know that if I now enable RRAS again, clients lose connectivity. I'm not sure what happens if NetBIOS is enabled with RRAS disabled.

Internet connection is still down, but a power cycle on the router enabled us to login and ping the router. Still can't ping externals or access internet.

I didn't check for NAT yet, we also had internal database issues as we switched back from client to server...!! A few heart attacks later, and hopefully we're almost there.

GENERAL ADVICE: don't tell your computer to boot to safe mode (especially without networking) via MSCONFIG. It's a permanent change that requires a user to be logged in to reset it. Without networking, domain users do not work - and there's an even chance local users have been disabled either by group policy or other means. Thankfully for me, ours were still enabled - it was trouble with one of the keys that meant our password entry was wrong!!