Desktop video conferencing vendors are working hard to design systems that will work without network QoS. This is largely driven by the need to use the Internet, where we don't have any QoS guarantees, but it is also filtering back into the enterprise. QoS-enabled networks are more expensive than best-effort or Internet-connected networks. If this stuff will work without QoS, why not use it that way?
Because it won't work in the long run.
There are two major components to making QoS work correctly: packet prioritization and the assignment of bandwidth. Most QoS mechanisms we talk about today in fact address only packet prioritization. IEEE 802.1p is about priority. DiffServ is about assigning packets to classes and then giving some of those classes priority over others. But to make sure real-time traffic works reliably, we also have to guarantee bandwidth.
Video conferencing just won't work without good bandwidth. It doesn't matter how clever your technology is: if the network squeezes you down to 128 kbps for a video stream, you get a video image with poor resolution and a low frame rate. If you want to see a high-definition image at a decent frame rate, you have to supply the bandwidth.
So the problem with running video at the best-effort level is that we cannot guarantee the bandwidth. Data applications that use the best-effort level have a very high peak-to-average bandwidth ratio. On average they do not use a whole lot of bandwidth, but occasionally (like when we click on that web link) they use a whole lot of it to deliver the next screen of data, upload that picture or fill out the report matrix.
We used to believe that as we aggregated many users together in a LAN or WAN, these peaks of bandwidth demand would naturally be interspersed, so that when one user needs bandwidth, another is calmly studying the screen and demanding nothing of the network. Formal studies showed this not to be true. In fact the burstiness of network traffic is self-similar, meaning that at various time scales and traffic volumes we continue to see a very bursty demand by data applications on the network.
So to make video work well alongside this bursty data traffic, we need to ensure that the best-effort class has lots of headroom. The standard rule of thumb for a LAN used to be that 30% utilization meant it was "full" and needed to be upgraded, because we needed the remaining 70% as headroom for data bursts. If we fill up chunks of that headroom with video, we significantly increase the probability of interference between the two traffic streams.
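To put rough numbers on that rule of thumb, here is a small sketch of the arithmetic. The function name, the 100 Mbps link and the 20 Mbps of video are all illustrative assumptions, not measurements from any real network.

```python
def burst_headroom(link_mbps, data_avg_mbps, video_mbps):
    """Return (headroom in Mbps, headroom as a fraction of the link).

    Headroom is what is left for data bursts once the average data load
    and any video carried in the same class are subtracted from the link.
    """
    headroom = link_mbps - data_avg_mbps - video_mbps
    return headroom, headroom / link_mbps


# Hypothetical 100 Mbps LAN sitting at the old "full" mark of 30% average data load.
before = burst_headroom(100, 30, 0)    # no video in the best-effort class
after = burst_headroom(100, 30, 20)    # 20 Mbps of video added alongside the data

print("without video:", before)
print("with video:   ", after)
```

With these assumed figures, the room left for data bursts drops from 70% of the link to 50%, which is the squeeze described above.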
So, OK, let's allocate bandwidth for the video. Then let's put call admission control in place so we can ensure video does not exceed our designed video bandwidth, or at least monitor video usage to see when it is approaching our design limits. At this point we are not far from implementing full QoS: if we just move video to a class of its own, we are done.
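The idea behind call admission control can be sketched in a few lines. This is a minimal illustration only: the class name, its methods and the bandwidth figures are hypothetical, and real admission control lives in the call control platform or the network, not in a script like this.

```python
class VideoAdmissionControl:
    """Toy call admission control: admit a call only if it fits the video budget."""

    def __init__(self, capacity_kbps):
        self.capacity_kbps = capacity_kbps  # bandwidth designed for video
        self.calls = {}                     # call_id -> reserved rate in kbps

    def used_kbps(self):
        return sum(self.calls.values())

    def admit(self, call_id, rate_kbps):
        """Reserve bandwidth for a new call; refuse it if the budget would be exceeded."""
        if call_id in self.calls:
            return False
        if self.used_kbps() + rate_kbps > self.capacity_kbps:
            return False
        self.calls[call_id] = rate_kbps
        return True

    def release(self, call_id):
        """Give the bandwidth back when the call ends."""
        self.calls.pop(call_id, None)

    def utilization(self):
        """Fraction of the video budget in use, for monitoring against design limits."""
        return self.used_kbps() / self.capacity_kbps


# Assumed figures: a 4000 kbps video budget and 1500 kbps per call.
cac = VideoAdmissionControl(capacity_kbps=4000)
results = [cac.admit(name, 1500) for name in ("call-a", "call-b", "call-c")]
print(results, cac.utilization())
```

In this sketch the third call is refused because it would push video past its designed bandwidth, and the utilization figure is the number you would watch if you chose to monitor rather than block.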
I think the challenge for enterprises that expect a wide deployment of desktop-based video conferencing will be managing the bandwidth it needs. If the enterprise cheats on bandwidth, either by creating a QoS class with insufficient capacity or by leaving desktop video in the best-effort class, the quality of the video conferencing calls will suffer.
There is little point in trying to enable better collaboration, better relationship building, greater efficiencies and reduced travel through video conferencing if the video quality is not excellent and reliable. We can argue about whether desktop video needs to be in the AF41 class, the AF31 class or the best-effort class, but the bandwidth will need to be allocated no matter where the video is carried.
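For readers who want the class names above as numbers, the following sketch computes the DiffServ code points for the Assured Forwarding classes using the standard AFxy layout (DSCP = 8x + 2y). The function names are my own. Note that the marking only selects a class; it allocates no bandwidth by itself.

```python
def af_dscp(af_class, drop_precedence):
    """DSCP value for Assured Forwarding class AFxy (x = class, y = drop precedence)."""
    if not (1 <= af_class <= 4 and 1 <= drop_precedence <= 3):
        raise ValueError("AF classes run 1-4 and drop precedences 1-3")
    return 8 * af_class + 2 * drop_precedence


def tos_byte(dscp):
    """The DSCP occupies the upper six bits of the IP TOS / Traffic Class byte."""
    return dscp << 2


BEST_EFFORT = 0  # default code point

for name, dscp in (("AF41", af_dscp(4, 1)), ("AF31", af_dscp(3, 1)), ("best effort", BEST_EFFORT)):
    print(name, "DSCP", dscp, "TOS byte", hex(tos_byte(dscp)))
```

On platforms that honor it, an application can request such a marking by passing the TOS byte to the IP_TOS socket option, although networks are free to re-mark or ignore it.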