I can't tell you how many enterprises I have worked with where the network team says the network is fine: only the new voice or video is having problems, so the application must be broken. So I drag out the test tools, run some simulated real-time traffic across the network, and demonstrate that packet loss is occurring. Oops, they say, and start to look at the problem. Why is this packet loss so invisible before voice is introduced, and so problematic once it is? It's because TCP is so good at what it does.
Transmission Control Protocol (TCP) is the protocol that the great majority of our data applications use to move data back and forth across the network. TCP is designed to get all the data delivered, 100% correct, and it has a built-in mechanism that ensures all the packets arrive. If the intervening network drops a packet (which is a common occurrence), the receiving computer's acknowledgments reveal the gap and the sender retransmits the missing data. Soon enough the wayward packet arrives and all is well. But as we know, real-time traffic (voice and video) is delivering a real-time event and doesn't have the luxury of resending a missing packet, because the application can't wait for it. So voice and video use UDP. UDP packets are handed to the network by the source and the network does its best to deliver them, with no acknowledgment and no retransmission. Send and pray. Just like the post office.
Implementation problems in the network cause packet loss. These latent problems can be causing packet loss all day and all night. If the loss percentages are not very high, applications using TCP hardly notice the problem because TCP recovers those packets so efficiently. But the voice and video applications suffer badly. Even one percent packet loss can cause a noticeable degradation in voice and video quality. Real-time traffic is like the canary in the coal mine because of its sensitivity to packet loss.
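To put one percent in perspective, here is a rough back-of-the-envelope sketch. It assumes a voice stream that sends one packet every 20 ms (a common packetization interval, but an assumption here, not a figure from any particular deployment), and the function names are made up for illustration.

```python
def lost_packets_per_minute(loss_percent, packet_interval_ms=20):
    """Expected number of packets lost per minute in one direction of a stream.

    packet_interval_ms is an assumed packetization interval (20 ms is common
    for voice, but check what your endpoints actually use).
    """
    packets_per_second = 1000 / packet_interval_ms
    return packets_per_second * 60 * loss_percent / 100


def lost_media_ms_per_minute(loss_percent, packet_interval_ms=20):
    """Milliseconds of media carried by those lost packets, per minute."""
    return lost_packets_per_minute(loss_percent, packet_interval_ms) * packet_interval_ms


if __name__ == "__main__":
    print(lost_packets_per_minute(1.0))   # packets lost per minute at 1% loss
    print(lost_media_ms_per_minute(1.0))  # milliseconds of audio in those packets
```

Under those assumptions, one percent loss means a gap in the audio roughly every two seconds. A TCP file transfer would simply retransmit and carry on; a phone call has nothing to fill the gap with except whatever concealment the codec can manage.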
So what are these problems? The biggest culprit is the half/full duplex mismatch. This is a failure of the Ethernet link to properly negotiate the duplex of the connection. When the link first comes up, one end decides to run half-duplex and the other end full-duplex. This mismatch causes packet loss in one direction across the mismatched link.
Similar packet loss problems occur with Cat 3 cables running 100 Mbps, or Ethernet cables that are longer than the specified length limit. Bad copper connections (on a T1 or T3) can cause intermittent loss too. These lower-layer problems can be found by looking at the link statistics (collisions, CRC errors, etc.) on the switches, routers or endpoints along the path.
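As a rough sketch of what "looking at the link statistics" can mean in practice, the snippet below flags links whose counters look suspicious. Everything in it is hypothetical: the LinkCounters fields, the interface names and the CRC threshold are placeholders for illustration, not the output format of any real switch or router, so you would need to fill them in from whatever your equipment reports.

```python
from dataclasses import dataclass


@dataclass
class LinkCounters:
    """Hypothetical per-interface counters, copied by hand or by script
    from a device's interface statistics."""
    name: str
    packets_in: int
    crc_errors: int
    collisions: int
    late_collisions: int
    full_duplex: bool


def suspect_reasons(link, max_crc_ratio=0.0001):
    """Return a list of reasons this link deserves a closer look.

    max_crc_ratio is an arbitrary illustrative threshold, not a standard.
    """
    reasons = []
    if link.packets_in > 0 and link.crc_errors / link.packets_in > max_crc_ratio:
        reasons.append("CRC errors")
    if link.late_collisions > 0:
        # Late collisions are a classic symptom on the half-duplex side
        # of a duplex mismatch.
        reasons.append("late collisions")
    if link.full_duplex and link.collisions > 0:
        # A link that is truly running full-duplex should not count collisions.
        reasons.append("collisions on a full-duplex link")
    return reasons


if __name__ == "__main__":
    links = [
        LinkCounters("uplink-1", 1_000_000, 2_500, 0, 0, True),
        LinkCounters("desk-17", 1_000, 0, 12, 3, False),
        LinkCounters("server-2", 1_000_000, 0, 0, 0, True),
    ]
    for link in links:
        print(link.name, suspect_reasons(link) or "looks clean")
```

The point of the design is simply to walk every hop on the path and treat any nonzero error counter as a lead worth following, rather than to define what counts as an acceptable error rate.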
Another category of loss problems is switches and routers that are not up to the task. This class of problems occurs more often with video than voice due to the higher packet volume, but voice trunks can also cause these issues. Many inexpensive switches have design compromises that let them carry short bursts of traffic at the specified data rates but not sustain the constant high packet volume of a voice trunk or video conferencing streams. Be especially cautious about using unmanaged switches or those little desktop switches that aggregate a bunch of endpoints in the office.
Routers may also have problems supporting a high volume of real-time packets. If the real-time streams are using QoS, the router may have to process each packet with its CPU, and the CPU may be running at too high a utilization. If router CPU utilization is high or memory utilization is high, check to see if there are unnecessary access control lists (ACLs) that can be eliminated. Check to see if debugging is on and also if NetFlow statistics are being gathered. If these functions can be turned off, the router can better handle the load.
Do you have implementation problems? Look at the packet loss statistics of the phones or the video endpoints. They don't lie. The endpoint monitors the sequence numbers of the incoming RTP headers, and when one is missing it increments the loss count. If you don't have your voice or video deployed, get a synthetic test tool that will test the network with traffic that simulates a voice or video stream, or get network testing done by a consultant who knows how to do it. If you find that a path has packet loss, start isolating the components of the path until you find and resolve the issues.
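Here is a minimal sketch of how an endpoint can turn RTP sequence numbers into a loss count. It is loosely modeled on the "expected minus received" bookkeeping described in RFC 3550, but it is simplified (no validation of a new source, no handling of stream restarts) and is not how any particular phone or video endpoint is implemented. The class and function names are my own.

```python
import struct


def rtp_sequence_number(packet):
    """Extract the 16-bit sequence number from bytes 2-3 of an RTP header."""
    return struct.unpack("!H", packet[2:4])[0]


class RtpLossCounter:
    """Counts loss as (packets expected) - (packets received)."""

    def __init__(self):
        self.base_seq = None  # first sequence number seen
        self.max_seq = None   # highest sequence number seen so far
        self.cycles = 0       # number of times the 16-bit counter wrapped
        self.received = 0

    def packet_arrived(self, seq):
        seq &= 0xFFFF
        if self.base_seq is None:
            self.base_seq = seq
            self.max_seq = seq
        else:
            # Distance ahead of the highest sequence seen, modulo 2^16.
            delta = (seq - self.max_seq) & 0xFFFF
            if 0 < delta < 0x8000:
                if seq < self.max_seq:
                    self.cycles += 1  # sequence number wrapped past 65535
                self.max_seq = seq
            # Otherwise the packet is late, reordered or a duplicate:
            # it still counts as received but does not move max_seq.
        self.received += 1

    @property
    def expected(self):
        if self.base_seq is None:
            return 0
        extended_max = self.cycles * 65536 + self.max_seq
        return extended_max - self.base_seq + 1

    @property
    def lost(self):
        return self.expected - self.received


if __name__ == "__main__":
    counter = RtpLossCounter()
    # 65535 and 2 never arrive; the sequence number wraps in between.
    for seq in [65533, 65534, 0, 1, 3]:
        counter.packet_arrived(seq)
    print(counter.expected, counter.received, counter.lost)
```

Counting expected minus received, rather than counting every gap as it appears, means a packet that merely arrives out of order is not reported as lost. The trade-off is that duplicate packets can hide real loss, which is one reason to compare the numbers from both ends of the call.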