A few months ago, I wrote about the tradeoffs between using an L3 switch and a router. That post noted that Cisco routers have many more QoS capabilities. L3 switches provide a much more constrained set of QoS features, presumably those suitable for high-speed processing in chips.
If you’re trying to do QoS on Cisco gear, how does this affect you? I’ve encountered a few ways in my recent QoS adventures. I’ve also seen some shifts in how QoS is used. We’ll go into all that below.
QoS is complicated. You have to figure out what you ought to be doing, design your policy, and then deploy it, very carefully and consistently.
NetCraftsmen has traditionally worked with organizations on the hardest part: figuring out what you’re trying to do and coming up with a sustainable overall QoS strategy. We then work with the customer to build standard configurations for the relevant Cisco device types.
Part of what we do is try to minimize the complexity, especially the number of variations in QoS at different points in your network.
Deployment is always interesting. Deploying QoS requires precision. Our experience is that many sites trip up here: they miss interfaces or forget to deploy parts of the QoS template. Sites generally want to deploy QoS internally to cut costs. We partner with them to verify correct deployment. Alternatively, we can deploy it.
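The "missed interface" failure mode lends itself to a simple automated check. Here is a minimal Python sketch, not a NetCraftsmen tool, that scans IOS-style configuration text and lists interfaces lacking an expected outbound service policy. The policy name `WAN-EDGE-OUT` and the sample configuration are hypothetical, and the sketch assumes every interface in the text is supposed to carry the policy. A real audit would first filter to the interfaces in scope.

```python
def interfaces_missing_policy(config_text, policy_name):
    """Return interface names that lack 'service-policy output <policy_name>'."""
    expected = f"service-policy output {policy_name}"
    missing = []
    current = None       # interface block we are currently inside, if any
    has_policy = False

    for line in config_text.splitlines():
        if line.startswith("interface "):
            # A new interface block starts; close out the previous one.
            if current is not None and not has_policy:
                missing.append(current)
            current = line.split(None, 1)[1].strip()
            has_policy = False
        elif current is not None:
            if not line.startswith(" "):
                # A non-indented line (such as '!') ends the interface block.
                if not has_policy:
                    missing.append(current)
                current = None
            elif line.strip() == expected:
                has_policy = True

    # Handle a config that ends while still inside an interface block.
    if current is not None and not has_policy:
        missing.append(current)
    return missing


# Hypothetical sample configuration for illustration only.
SAMPLE = """\
interface GigabitEthernet0/0
 description WAN uplink
 service-policy output WAN-EDGE-OUT
!
interface GigabitEthernet0/1
 description WAN backup
!
interface GigabitEthernet0/2
 description Second WAN
 service-policy output WAN-EDGE-OUT
"""

if __name__ == "__main__":
    print(interfaces_missing_policy(SAMPLE, "WAN-EDGE-OUT"))
```

Run against saved configurations after each change window, a check like this catches the forgotten interface before users do.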
I’ve had high hopes for APIC-EM / EasyQoS, not least because it could lower the cost of deploying QoS while increasing the accuracy of deployment. That would let us offer the design and strategy services without deployment cost being a potential barrier.
I haven’t encountered any sites using EasyQoS (as far as I know). I’ve been talking up trying it where I can.
If EasyQoS delivers, sites may not need a consultant to deploy QoS, which is a win for them. I’d like to think NetCraftsmen could still help with QoS design and planning. I have heard concerns about EasyQoS support for Nexus, e.g. Nexus 9Ks. Googling just now suggests that is still an issue.
LiveAction has some interesting QoS and templating capabilities, and can do CBQoS-MIB based reporting, but it too suffers from Nexus configuration impairment (per recent conversation with a LiveAction staff member). I’m not aware of anything else that might be a contender for “QoS deployment tool”.
Ok, I’m using “doomed” to get your attention. Here’s what that’s all about.
Cisco router AVC (formerly NBAR) does deep-dive inspection of packets to classify them. AVC is also supported in some Cisco Wireless devices, and apparently in the 3650 / 3850 / Cat 9K switches, subject to some restrictions.
Cisco AVC can export flow information, which could be useful for security. Neat technology! I like the idea of AVC. There are some practical limitations, however…
Consider HTTPS. HTTPS is HTTPS, and the source / destination is about all that is not opaque. Can AVC identify different forms of HTTPS, encrypted web traffic? I highly doubt it. So as more and more web traffic, especially Internet-bound web traffic, shifts to HTTPS, how might that traffic be classified by AVC? Long lists of destination IPs? Updated how often? I tend to doubt any of that will work well.
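To make the "long lists of destination IPs" idea concrete, here is a small Python sketch of classification by destination prefix. It is an illustration of the general approach, not how AVC works internally. The application names and prefixes are hypothetical (the addresses come from the documentation ranges).

```python
import ipaddress

# Hypothetical prefix lists, using documentation address ranges.
APP_PREFIXES = {
    "video-conferencing": ["203.0.113.0/24"],
    "saas-crm": ["198.51.100.0/25", "198.51.100.128/26"],
}


def classify_by_destination(dst_ip, app_prefixes=APP_PREFIXES):
    """Return the first application whose prefix list contains dst_ip."""
    addr = ipaddress.ip_address(dst_ip)
    for app, prefixes in app_prefixes.items():
        for prefix in prefixes:
            if addr in ipaddress.ip_network(prefix):
                return app
    return "unclassified"


if __name__ == "__main__":
    print(classify_by_destination("203.0.113.10"))
    # An address the provider added after the list was last updated:
    print(classify_by_destination("198.51.100.200"))
```

The second lookup falls through to "unclassified", which is the maintenance problem in miniature: the classification is only as good as the last update of the list.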
A Google search does not turn up anything about AVC working with HTTPS traffic. I did find a note that ETA (Encrypted Traffic Analytics) cannot be used on the same interface as AVC.
Tentative conclusion: Cisco AVC is handy, even needed, but only for unencrypted traffic. Please comment if you have reason to think otherwise!
Another challenge for AVC: the document listing AVC supported applications is interesting. There are a lot of entries, which is good. However, specifics about what the various items actually match are not there. E.g. ms-lync versus ms-lync-something. Is ms-lync a catch-all including the video and audio? How can I find out, other than by doing time-consuming testing?
The third item in the “doom” category is L3 switches. I’m seeing a lot of sites using Nexus L3 switches for 10 Gbps and faster WAN links, links to CoLo sites, links within and between CoLo sites, etc. That may represent a conscious decision to do without AVC, and perhaps QoS. I think of QoS as “insurance for your fragile traffic.” So switches can only provide limited “insurance”. Fair enough, that’s a decision factor.
I’ll have to note that despite my doing a fair amount of QoS, I have not seen sites using AVC. Some sites don’t do any QoS at all.
I end up thinking AVC can be useful for web-based and other apps if HTTPS blindness isn’t a problem for you. It does add considerable complexity, and QoS is already fairly complex. Like many choices in life or networking, one has to temper what one would like to have with what one can afford, in this case, afford in terms of time, complexity and support.
Non-use of QoS or AVC may be symptomatic of something else; QoS is complex and time-consuming, and AVC adds to the complexity. I’d have written “Cisco QoS,” but some of what I’ve seen with QoS / WAN boxes with GUIs is almost worse, encouraging micro-management of applications.
QoS helps in the narrow range where you are a bit tight on bandwidth or need to protect interactive voice and video. QoS cannot help when you’re badly tight on bandwidth (think police car with flashing lights trying to get through a massive traffic jam).
There’s a case for using QoS even with a lot of bandwidth (aggregation points, transition from high to lower speed, microbursts, etc.). Is that a fringe case? Maybe as LAN speeds increase? When I look at user ports in many sites, I see average loads in the Kbps still. As LAN speeds increase, if usage stays low, yes, dropped packets likely become much less of a concern. Other sites have users moving a lot more data, videoconferencing, etc. There, QoS is recommended!
I suspect many sites are opting to implement more bandwidth, in part hoping to avoid having to deal with QoS or holding off on QoS since there’s no clear need. This might also be the result of a conscious management decision: we’re stretched too thin, we just can’t afford to do QoS.
This also surfaces in a negative sense. I’ve run into sites with users experiencing intermittent slowness. The point I’ve had to make is that end systems and applications can easily consume a 1 Gbps link, at least in bursts. Traffic from many such ports then aggregates onto, e.g., 1 Gbps switch uplinks. One then needs good monitoring statistics to determine if that is in fact happening and causing the downstream user slowness.
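Why "good" monitoring statistics matter can be shown with a little arithmetic. The Python sketch below computes average utilization from two interface octet-counter readings, then applies it to a hypothetical case: a 1 Gbps uplink that runs at line rate for 3 seconds and is idle for the rest of a 300-second polling interval. The numbers are illustrative assumptions, and the function ignores counter wrap.

```python
def utilization_percent(octets_start, octets_end, interval_seconds, link_bps):
    """Average link utilization (%) between two octet-counter readings."""
    bits_sent = (octets_end - octets_start) * 8
    return 100.0 * bits_sent / (interval_seconds * link_bps)


# Hypothetical scenario: 3 seconds at line rate on a 1 Gbps uplink.
LINK_BPS = 1_000_000_000
burst_octets = LINK_BPS * 3 // 8

# The same traffic seen through a 5-minute poll versus a 3-second window.
five_minute_avg = utilization_percent(0, burst_octets, 300, LINK_BPS)
three_second_peak = utilization_percent(0, burst_octets, 3, LINK_BPS)

if __name__ == "__main__":
    print(f"5-minute average: {five_minute_avg:.1f}%")
    print(f"3-second window:  {three_second_peak:.1f}%")
```

A link that looks 1% loaded on a five-minute graph can still be saturated, and dropping packets, for seconds at a time. Short polling intervals or drop counters are what reveal it.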
With various voice products, it has become handy to have the application itself do the DSCP marking. Microsoft has been doing the right markings for Skype / Lync for a while. I generally follow their TechNote advice to shrink the port ranges used for various purposes. In general, I’d like total control, but applications that do the right thing and / or allow GPO tweaking work too. Part of doing QoS is figuring out a sustainable and least painful way to get what you need, and maybe moving some things from the “need” bucket to the “nice to have” bucket.
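For a sense of what "the application does the marking" means at the socket level, here is a minimal Python sketch. It assumes a Linux-style sockets stack. As far as I know, Windows generally ignores this socket option and relies on policy-based QoS instead, which is where the GPO route comes in. The function names are made up for illustration. The one fixed fact is that DSCP occupies the upper six bits of the IP ToS / Traffic Class byte, so the value is shifted left by two.

```python
import socket

DSCP_EF = 46    # Expedited Forwarding, commonly used for voice
DSCP_AF41 = 34  # Assured Forwarding 41, commonly used for interactive video


def dscp_to_tos(dscp):
    """Convert a 6-bit DSCP value to the 8-bit ToS byte (ECN bits zero)."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0-63")
    return dscp << 2


def open_marked_udp_socket(dscp):
    """Create an IPv4 UDP socket whose outgoing packets request this DSCP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock


if __name__ == "__main__":
    print(dscp_to_tos(DSCP_EF))  # ToS byte for EF
    sock = open_marked_udp_socket(DSCP_EF)
    sock.close()
```

Whether the network honors the marking is a separate question. The trust boundary on the access switch still decides that.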
Lately, we have all sorts of Internet-based voice, video, and conferencing products. Some are well documented, some poorly documented. One vendor has clearly documented that it does not use IETF and Cisco standard DSCP markings (sigh!). Ideally, an admin would be able to go to some settings page and set the markings for their organization. This sort of thing is needed, so that e.g. internal station-to-station calls can be given QoS handling, at least on-net. Conference calls, ditto.
QoS can be very helpful as insurance for fragile interactive voice and video traffic, and perhaps streaming video as well.
Cisco Nexus L3 switches can do a modest amount of QoS, which may suffice for many purposes.
Cisco AVC is available in a number of Cisco platforms, including some wireless devices, most recent routers, and some recent campus switches. AVC can provide sophisticated application awareness. It likely cannot do much with HTTPS traffic.
Comments are welcome, whether in agreement or constructive disagreement with the above. I enjoy hearing from readers and carrying on deeper discussion via comments. Thanks in advance!
Hashtags: #CiscoChampion #TheNetCraftsmenWay #QoS #Cisco
Did you know that NetCraftsmen does network / datacenter / security / collaboration design / design review? Or that we have deep UC&C experts on staff, including @ucguerilla? For more information, contact us at email@example.com.