Best way to isolate packet loss between computer and router

gladdin

Jul 17, 2013
Let me apologize ahead of time for my ignorance. Having isolated packet loss to between a computer and default gateway, and without the ability to test on another computer or troubleshoot physicals, what is the best way to determine what is causing the packet loss (computer/router/Ethernet cable)?

We can assume the packet loss is either the computer (hardware/software), Ethernet cable or router (hardware/software), correct?

Are there network tests that can further isolate the cause (something other than assuming it's one or the other and booting the computer into safe mode, running an antivirus scan, reinstalling NIC drivers, power-cycling the router, etc.)?

Would a lack of packet loss when pinging 127.0.0.1 combined with packet loss when pinging default gateway be indicative of an issue with Ethernet Cable/router? I don't feel that would be comprehensive. My goal is to determine the best way to isolate packet loss of this nature.

Edit:

I'm speaking generally (and I'm aware that as a result you may not be able to provide any help). In an enterprise environment, an end user calls in and indicates they're experiencing VPN stability issues. After disconnecting them from the VPN and isolating the packet loss to between the computer and router (without the ability to see interface statistics, and with each user's setup being different), is there a reliable way to isolate the packet loss? What is an efficient method of "proving" the cause of the packet loss so that either A. we can convey to the user that they should contact their ISP, or B. the issue can be escalated for network or computer troubleshooting?

Thanks for any assistance.
 
Pinging 127.0.0.1 would only test the software side (the local TCP/IP stack). Loopback traffic never reaches the network adapter, cable, or router.
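
As a rough illustration of comparing a loopback ping with a gateway ping, here is a minimal Python sketch. It assumes the system `ping` command is on the PATH and that its summary contains "% packet loss" (Linux/macOS) or "% loss" (English-language Windows); other locales may print something different. The address 192.168.1.1 is only a placeholder for the real default gateway, and `parse_loss` / `ping_loss` are names made up for this example.

```python
import platform
import re
import subprocess

# Matches "0% packet loss" (Linux), "0.0% packet loss" (macOS), "(0% loss)" (Windows).
LOSS_RE = re.compile(r"(\d+(?:\.\d+)?)% (?:packet )?loss")


def parse_loss(ping_output):
    """Return the loss percentage from ping's summary text, or None if absent."""
    match = LOSS_RE.search(ping_output)
    return float(match.group(1)) if match else None


def ping_loss(host, count=20):
    """Run the system ping and return the loss percentage it reports."""
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    result = subprocess.run(
        ["ping", count_flag, str(count), host],
        capture_output=True, text=True, check=False,
    )
    return parse_loss(result.stdout)


if __name__ == "__main__":
    # 192.168.1.1 is a placeholder; substitute the real default gateway.
    for target in ("127.0.0.1", "192.168.1.1"):
        print(target, ping_loss(target))
```

Clean loopback results with loss to the gateway only say the problem is somewhere past the local stack. They do not separate the adapter, cable, ports, and router from each other.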

Most network adapter makers have test software that you can download. Some work better than others. Some will even do rudimentary tests on your cable.

There is equipment that can test cables, but it is normally cheaper to just replace the cable. ( https://www.amazon.com/dp/B000QJ3G42/ ... yikes!)

Most routers have a system log that will track detected errors. But then most routers are so cheap that people just replace them instead of messing with them.
 
It is actually fairly rare to get any form of packet loss on ethernet. We have pulled logs from building switches that have thousands of active ports and see almost zero errors. Most of the time a port only shows errors when a PC first boots, because the port is not really up yet. With most bad cables, the link comes up at the wrong speed or does not come up at all.
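
Since a bad cable often shows up as a link negotiated at the wrong speed, one quick check on the computer side is to read what the adapter actually negotiated. The sketch below is for Linux only and assumes the standard sysfs files `/sys/class/net/<iface>/speed` and `/sys/class/net/<iface>/duplex`. The interface name `eth0` and the expected 1000 Mb/s are assumptions for illustration, and `read_link` / `link_warnings` are hypothetical helper names.

```python
from pathlib import Path


def read_link(iface, sysfs_root="/sys/class/net"):
    """Read the negotiated speed (Mb/s) and duplex for a Linux interface.

    Reading these files can raise OSError on some drivers when the link is down.
    """
    base = Path(sysfs_root) / iface
    speed = int((base / "speed").read_text().strip())
    duplex = (base / "duplex").read_text().strip()
    return speed, duplex


def link_warnings(speed_mbps, duplex, expected_mbps=1000):
    """Describe signs of a marginal cable or port in the negotiated link."""
    warnings = []
    if speed_mbps <= 0:
        warnings.append("link is down or speed is unknown")
    elif speed_mbps < expected_mbps:
        warnings.append(f"negotiated {speed_mbps} Mb/s, expected {expected_mbps} Mb/s")
    if duplex != "full":
        warnings.append(f"duplex is {duplex!r}, expected 'full'")
    return warnings


if __name__ == "__main__":
    # "eth0" is a placeholder interface name.
    print(link_warnings(*read_link("eth0")))
```

A gigabit adapter and router that settle on 100 Mb/s are a reasonable hint to swap the cable first, though an older router or a deliberately forced speed would look the same.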

It depends, I suppose, on what you call an enterprise environment. Most commercial switch equipment has error counters on its ports. We have ours rigged to produce SNMP traps when a port gets a lot of errors in a short period of time.
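
The "a lot of errors in a short period" rule can be sketched as a sliding-window check over periodic readings of a cumulative error counter. This is not SNMP itself and not our actual switch configuration, just a minimal Python illustration of the thresholding idea. The default threshold of 100 errors per 60 seconds is an arbitrary placeholder, and `error_bursts` is a made-up name.

```python
def error_bursts(samples, max_new_errors=100, window_s=60.0):
    """Find windows in which a cumulative error counter rose too quickly.

    samples: list of (timestamp_seconds, cumulative_error_count), oldest first.
    Returns a list of (window_start, window_end, new_errors) tuples.
    """
    bursts = []
    start = 0
    for t_end, count_end in samples:
        # Advance the window start until it is within window_s of this sample.
        while t_end - samples[start][0] > window_s:
            start += 1
        new_errors = count_end - samples[start][1]
        if new_errors > max_new_errors:
            bursts.append((samples[start][0], t_end, new_errors))
    return bursts


if __name__ == "__main__":
    readings = [(0, 0), (30, 5), (60, 10), (90, 400), (120, 405), (200, 406)]
    for burst in error_bursts(readings):
        print(burst)
```

On a Linux computer the same kind of counter is available locally, for example in `/sys/class/net/<iface>/statistics/rx_errors`, which can help when the router's own statistics are out of reach.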

The key when talking to users is to be really sure they are not using wireless. It is extremely common to see packet loss/delays on wireless.
 

Actually, it could be the computer, the interface between the computer's Ethernet port and the cable's RJ-45 plug, the cable, the interface between the cable's RJ-45 plug and the router's Ethernet port, or the router. 5 possibilities, not 3. Sometimes the plug doesn't fit into the port properly, or the user doesn't insert it all the way in (especially if the retaining clip has broken off), or debris gets in there, or the spring mechanism on the wire contacts in the port breaks. If the customer is plugging into a wall ethernet port, that adds yet another suspect interface.

Because the interfaces are also suspect, that's why it's so useful to test with another computer, and to test with the cable plugged into a different port on the router.

The causes I've seen for packet loss on a LAN, from most to least common in my experience are:

  • A bad cable. People roll over them with chairs or pinch them around corners or in doors, so eventually some of them go bad. It's also one of the easiest to test over the phone: just ask them to try a different cable.
  • A bad/dying router, or a router that is overheating or overloaded (older or low-end with a weak CPU and trying to do too much, usually with third party hardware). You can eliminate the router as a suspect if you ask the customer to test ping times on a different computer plugged into the same router. If it has no packet loss, then that vindicates the router. If it does have packet loss, that doesn't prove the problem is the router, but it does make it the most likely suspect. Overheating is more common than you'd think, usually due to people putting routers on the carpet or on top of other equipment. It needs to sit on a hard, flat surface away from hot air. A big clue is that they report the packet loss happens more during the day.
  • A bad/dying port on the router. They just sometimes start to die. I have no idea why, but it seems to happen more frequently than the computer's ethernet port dying: I've seen it happen about a half dozen times, vs. just two ethernet cards/ports dying. Fortunately most routers have 4 LAN ports, so you can just ask the customer to switch the cable to a different port.
  • Debris in a port or the cable improperly plugged in. This usually gets covered when you ask them to try a different cable. Just ask them to blow out the port before plugging in the new cable, and make sure it's fully inserted. If all other troubleshooting fails, I'll come back to this and ask that the customer peek into the port with a flashlight to make sure the spring-loaded contact wires are all up and springing. I've twice encountered a broken/stuck contact wire which doesn't spring back up.
  • A dying ethernet card/port. Hardest to diagnose, but moving the exact same cable from the suspect computer to another computer and having it work is a pretty strong indication. It's the most expensive/difficult fix, so I usually treat it as the last resort.
Edit: I guess I've seen a problem switch a couple times too. But those are trivial to eliminate as a cause (just plug the cable going to the switch directly into the computer).
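
For a help desk script, the swap tests above could be strung together roughly like this. It is only a sketch of the order I described, not a definitive procedure. The function name, parameter names, and result strings are all made up for illustration, and `None` means "not tested yet".

```python
def likely_suspect(new_cable_fixes=None, other_port_fixes=None,
                   other_computer_has_loss=None):
    """Name the most likely culprit from the swap tests run so far."""
    if new_cable_fixes:
        return "cable, or debris/poor seating at a plug"
    if other_port_fixes:
        return "router port"
    if other_computer_has_loss:
        return "router (most likely, not proven)"
    if other_computer_has_loss is False:
        # A clean second computer vindicates the router itself.
        if new_cable_fixes is False and other_port_fixes is False:
            return "computer (ethernet card/port or software)"
        return "router vindicated; still swap the cable and router port"
    return "undetermined; run the remaining swap tests"


if __name__ == "__main__":
    print(likely_suspect(new_cable_fixes=False, other_port_fixes=True))
```

The cheap tests come first on purpose: a cable swap or port change costs the customer a minute, while blaming the ethernet card means a repair.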
 
Solution