I have noticed some minor latency issues on our network recently. By latency I mean significant differences in ping times and intermittent packet loss. No one else has noticed the issue or complained, so it is not really an “Official Problem”; however, I would like to resolve it before it becomes one.
In Dec 2010 I upgraded and simplified our network by purchasing about 20 x 6249P switches. I need PoE for the IP phones and wireless access points, and in the future for video surveillance and school bells. Dell, incidentally, did a great job with my original network configuration.
So here we are, six months after the upgrade, wondering what is causing this latency and how we are going to solve it. I invited Brad Joyce from Suncoast Christian College to give me a hand, and together we addressed this problem.
As it turns out, some of our CAT 5 cable runs are simply too long. The permanent buildings are connected to the CORE by fibre optic cable, while the prefabricated temporary buildings depend on CAT 5 back to the CORE. E Block is 107 m and C Block is 149 m away from the CORE.
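For context, twisted-pair Ethernet (10/100/1000BASE-T) is specified for a maximum channel length of 100 m. A minimal sketch of checking the runs mentioned in this post against that limit follows; the function name and the dictionary layout are my own choices for illustration, not anything taken from the switches.

```python
# Maximum channel length for twisted-pair Ethernet (10/100/1000BASE-T), in metres.
MAX_COPPER_RUN_M = 100

# Cable runs mentioned in this post, in metres.
cable_runs_m = {
    "E Block": 107,
    "C Block": 149,
    "Admin port 2/15 to patch panel E13": 67,
}


def over_limit(runs, limit=MAX_COPPER_RUN_M):
    """Return {run name: metres over the limit} for runs that exceed it."""
    return {name: length - limit for name, length in runs.items() if length > limit}


for name, excess in over_limit(cable_runs_m).items():
    print(f"{name}: {excess} m over the {MAX_COPPER_RUN_M} m limit")
```

Only E Block and C Block are flagged; the 67 m run is within the limit.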
We discovered issues on 3 of the CAT 5 cables leading to the Primary Buildings; on one of the cables we had 1,744,493 recalculations. On the CORE switch we discovered that IPMapFordwardingTask was running at about 0.3% at the start of the day and reached about 30% by midday. In an ideal situation it should be less than 5%.
To solve this issue we turned off SPANNING TREE on the edge switches. This causes them to run as dumb switches, which has its drawbacks (we have no loopback protection without SPANNING TREE running).
As regards a summary of what happened: at first I had to familiarise myself with your network design/layout. After this I had a look at your switches' running configs, then looked at the switch logs and counters to see if anything stood out. I found some problems with the spanning tree counters, and the IPMapFordwardingTask figure was high at times, indicating that spanning tree recalculations were happening.
On your core switch there were several connections that had moderate levels (thousands) of received BPDU packets and one (port channel 9) that had millions. This was the aggregated link going to the admin building. On the admin building switch there were 3 connections with high levels of received BPDU packets: C and E (the long cable lengths) and an unknown on port 2/15 going to patch panel port E13 (with millions of BPDU packets). Unfortunately we were unable to find the destination. The cable length was 67 meters, so that is not the cause; I suspect we will find a faulty or wrongly configured switch at the end of it when it is discovered. This is the main source of the BPDU packets and we left it unplugged.
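The triage described above, sorting ports by how many BPDUs they have received, can be sketched roughly as follows. This is only an illustration: the port names other than port channel 9, all of the counts, and both thresholds are hypothetical stand-ins for "thousands" and "millions", not values read from our switches.

```python
# Hypothetical snapshot of received-BPDU counters, keyed by port.
# Counts are illustrative only; real values would come from the switch counters.
bpdu_received = {
    "port-channel 9": 3_200_000,
    "port A": 4_100,
    "port B": 2_750,
    "port C": 35,
}


def classify(count, moderate=1_000, severe=1_000_000):
    """Bucket a BPDU count; the thresholds are assumptions, not vendor guidance."""
    if count >= severe:
        return "severe"
    if count >= moderate:
        return "moderate"
    return "normal"


def suspects(counters):
    """Return (port, count, level) for non-normal ports, worst first."""
    flagged = [
        (port, count, classify(count))
        for port, count in counters.items()
        if classify(count) != "normal"
    ]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


for port, count, level in suspects(bpdu_received):
    print(f"{port}: {count} BPDUs received ({level})")
```

Ranking worst first puts the port with millions of BPDUs at the top of the list, which is the one worth tracing before anything else.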
- I can’t overemphasize the importance of good network documentation, which has certainly made things easier for us to diagnose and manage.
- Collaboration is an important part of good IT management. Find someone you trust and work together to resolve issues.
- Resolve issues before they become significant.
- Be proactive and future-proof your environment by providing a robust network / infrastructure.
Our efforts today have not resolved our network problems; however, they have eliminated a few options, which will make it easier for us to address this in the future.