Network Management Sucks!!!

Author
Terry Slattery
Principal Architect

Jimmy Ray Purser, a blogger and video blogger for Cisco and Network World, recently wrote a blog post with that title. In it, he makes several comments that are right on target about the deficiencies of most network management systems:

– It is a lot of manual correlation
– Correlation, Correlation, Correlation. There is so much data out there that I can use to make some very informed decisions, I just need to know what it is in readable form and not MIB OID strings.

I found the follow-up comments about various products to be an interesting view into what people think of network management products. The problem that Jimmy Ray describes is exactly why Netcordia’s NetMRI exists. Just collecting data and sticking it into a database is not network management, at least for me. I know what analysis needs to be done on the network. I know how to diagnose problems and what data to collect to do the diagnosis. What’s lacking are tools that let me build rules that do the basic analysis and correlation between various sources of data, just as Jimmy Ray describes (see his comment about correlating Flow management).

A good example of correlation comes in validating the root bridge of each VLAN in your infrastructure. Say all you have are configs (let’s say 100 switch configs), each config contains a definition of VLAN 100, and there are commands that specify the bridge priority for each STP domain. Can you use the configs alone to determine the placement of the root bridge in each STP domain (which is per VLAN when using PVST)? No. You need to know which switches are in which Layer 3 subnet, each of which corresponds to an STP domain, and that data is not in the configs. You have to go to the operational data in the network to see which switches are in each STP domain. You can then check the bridge priority of those switches in each STP domain to determine which is the root bridge (the lowest priority value wins). So we’re correlating operational data with the bridge priority to determine root bridge selection. How many STP domains do you have where the root bridge has not been properly selected? Do you know which switch in each STP domain is currently the root? And more importantly, which STP domains need you to select a root bridge (i.e. all the priority settings are the same, so the lowest MAC address is the tie-breaker)?
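To make the correlation concrete, here is a minimal Python sketch of the kind of rule I have in mind. It is not how NetMRI or any other product does it. The `BridgeRecord` fields, the `domain` labels, and the sample switches are all hypothetical, and the sketch assumes you have already used operational data to work out which switches share an STP domain. It then picks the expected root in each domain (lowest priority, then lowest MAC address) and flags domains where more than one switch shares the lowest priority, which includes the case where all the priorities are the same.

```python
from collections import defaultdict
from typing import NamedTuple


class BridgeRecord(NamedTuple):
    """One switch's view of one STP domain (all field names are hypothetical)."""
    switch: str    # switch hostname
    domain: str    # STP domain label, derived from operational data
    priority: int  # bridge priority taken from the config
    mac: str       # bridge MAC address, e.g. "00:00:0c:00:00:10"


def mac_to_int(mac: str) -> int:
    """Convert a MAC address string to an integer so MACs compare numerically."""
    return int(mac.replace(":", "").replace("-", "").replace(".", ""), 16)


def analyze_root_bridges(records):
    """Group records by STP domain and work out the expected root of each."""
    domains = defaultdict(list)
    for rec in records:
        domains[rec.domain].append(rec)

    report = {}
    for domain, members in domains.items():
        # STP elects the bridge with the lowest priority; lowest MAC breaks ties.
        root = min(members, key=lambda r: (r.priority, mac_to_int(r.mac)))
        lowest = [r for r in members if r.priority == root.priority]
        report[domain] = {
            "root": root.switch,
            # True when the MAC address, not a deliberate priority, picked the root.
            "root_by_mac_tiebreak": len(lowest) > 1,
        }
    return report


# Made-up sample data: VLAN 100 exists at two sites, forming two STP domains.
records = [
    BridgeRecord("core-1", "vlan100-siteA", 4096, "00:00:0c:00:00:10"),
    BridgeRecord("access-1", "vlan100-siteA", 32768, "00:00:0c:00:00:01"),
    BridgeRecord("access-2", "vlan100-siteB", 32768, "00:00:0c:00:00:30"),
    BridgeRecord("access-3", "vlan100-siteB", 32768, "00:00:0c:00:00:20"),
]

report = analyze_root_bridges(records)
for domain, result in sorted(report.items()):
    print(domain, result)
```

In this made-up data, the site A domain has a deliberately chosen root, while the site B domain is flagged because both switches still have the same priority and the MAC address decided the election. The hard part in practice is building the `domain` grouping from operational data, which this sketch takes as given.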

The above is just one example. Another is correlating syslog events with other operational or configuration data. There are many, many more.

I think we’re going to see network analysis and correlation become much more important in the next few years. The analysts are starting to understand it and why it is important. I can’t wait to build a lot of interesting analysis rules and increase visibility into the networks I run. BTW, NetMRI does some correlation and analysis, making it one of the few network management systems out there that actually reduces my workload instead of increasing it.

-Terry

_____________________________________________________________________________________________

Re-posted with Permission 

NetCraftsmen would like to acknowledge Infoblox for their permission to re-post this article, which originally appeared in the Applied Infrastructure blog at http://www.infoblox.com/en/communities/blogs.html
