A Radical Take on SDN

Author
Peter Welcher
Architect, Operations Technical Advisor

SDN has to be one of the most amazing hype flare-ups ever. Pour VC capital on and it behaves sort of like marketing kerosene. Is there a networking / management product left that doesn’t claim to be doing SDN?

Does SDN mean we all have to run off and learn to program, as even some Cisco training folks believe? There seem to be many Puppet and Python evangelists who say so. Is that the future? I don’t think so. This blog represents my attempt to step back from the frenzy and look at what’s happening, where it’s going, and how it affects The Rest of Us, as well as to answer day-to-day questions like “do I need to run out and learn Puppet?”

Granted: there are some early adopter use cases that are heavy on tie-ins between Development and Operations (“DevOps”). Big Internet companies, large firms needing agility. Universities. Maybe some others. But let’s not confuse early or special with the breadth of the market.

Yes, there’s a lot of innovation. However, a lot of it is at the engineer-with-a-better-mousetrap stage. Only I’m not convinced that the world needs a number of the mousetrap varieties I’m seeing. That’s a point I may want to discuss (debate?) further in another blog, to avoid achieving incoherence at great length. Instead, I’d like to haul out a well-worn reference to the book Crossing the Chasm. (Somehow, the audio track in my brain wants that to be said with a deep and significant bass voice.)

Pragmatist, guilty.

What my consulting customers would love is for someone to make their lives easier, and their networks more consistent and correct.

Sidetrack: we do network assessment. Almost nobody has implemented the usual Spanning Tree defenses. Why not? Because there’s no easy way to do it! BPDU Guard and Root Guard, ok, that’s pretty easy, just find the edge ports, the ones with “spanning-tree port type edge” (NXOS) on them. But Bridge Assurance, Loop Guard, you’ve got to carefully make sure you’ve got the right ports. Not easy when you have 100, or 900, switches. Even if you automate much of it as we have. (And we don’t say “SDN” about what we’re doing.)
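To make the “just find the edge ports” step concrete, here is a minimal sketch of that kind of check. It is not our actual tooling. It assumes you already have each switch’s running config as plain text, that interface stanzas are indented under an “interface” line, and that BPDU Guard is enabled per port with “spanning-tree bpduguard enable”. It ignores any global default setting. The function names and the sample config are made up for illustration.

```python
import re


def interface_stanzas(config_text):
    """Map each interface name to the list of its indented config lines."""
    stanzas = {}
    current = None
    for line in config_text.splitlines():
        match = re.match(r"interface\s+(\S+)", line)
        if match:
            # An unindented "interface X" line starts a new stanza.
            current = match.group(1)
            stanzas[current] = []
        elif line[:1].isspace() and current is not None:
            # Indented lines belong to the current interface.
            stanzas[current].append(line.strip())
        else:
            # Any other unindented (or blank) line ends the stanza.
            current = None
    return stanzas


def edge_ports_without_bpduguard(config_text):
    """List edge ports that lack a per-port BPDU Guard command."""
    result = []
    for name, lines in interface_stanzas(config_text).items():
        is_edge = any(
            l.startswith("spanning-tree port type edge") for l in lines
        )
        has_guard = "spanning-tree bpduguard enable" in lines
        if is_edge and not has_guard:
            result.append(name)
    return result


# Hypothetical sample config, for illustration only.
SAMPLE = """\
interface Ethernet1/1
  switchport
  spanning-tree port type edge
interface Ethernet1/2
  spanning-tree port type edge
  spanning-tree bpduguard enable
interface Ethernet1/3
  spanning-tree port type network
"""

print(edge_ports_without_bpduguard(SAMPLE))
```

That is the easy case: the config itself tells you which ports are edge ports. Bridge Assurance and Loop Guard are harder precisely because the right ports depend on topology, which no single switch config spells out.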

Relevance: the cost of implementing the feature is far greater than the perceived value. And in fact, most perceive the cost of implementing STP defenses as just prohibitive. Despite the high cost of one or more large scale STP loop outages.

Why do so many sites like / use SolarWinds? There are better products out there (gasp!). They just cost too darn much: money, fiddling time to get them working, or both.

So yes, we all want the big “Easy” button, as others have said.

I claim that what “the rest of us” don’t need is Yet Another $1M network management tool that doesn’t work perfectly. What they do need is a tool that works, and where they can get support and bug fixes on the rare occasions when it doesn’t. The problem in many organizations is just not perceived as severe enough to warrant spending at that level. The product cost has to be low enough to not outweigh its usefulness. And the product needs to not be an exercise in “why did THAT go wrong” or it rapidly becomes useless, because nobody has the time. And the product better have a good GUI.

I’m getting a bit tired of reading OpenFlow articles that start out with “we’ll handle congestion by mapping your flows for you”. (We’ll revisit this in another rant blog at some point.) It is far cheaper and simpler to throw bandwidth at most problems, unless you’re solidly up into the 10-40-100 Gbps interface world. Yeah, at some scale, the cost of 40 and 100 Gbps interfaces might cover hiring a programmer and doing some development — and supporting their output for years after.

I’ll even grant that Cisco Nexus N5Ks and N7Ks aren’t cheap. But they’re perceived as less risky than taking someone’s unsupported code, adding programming-like configuration on top of it (current status quo), and hoping for good results. With (perceived) cryptic troubleshooting if it doesn’t work.

Historical case in point: carriers bought the OPNET or WANDL tools to design Frame Relay and ATM provider networks, doing traffic engineering. Why? Because the bandwidth was too expensive, and the tool plus the labor to drive it cost far less! We invest in costly tools and the skills to drive them when we can save more by doing so. When we can’t save, we don’t invest.

Now don’t get me wrong. I’m a bit visionary. (Or would like to think so.)

We badly need SDN for some things. The Internet of Everything is going to be large scale, more of everything. It may start with programming hypervisors and/or NIC chips to do more networking functions (VMware dvSwitch + vSE, or Cisco 1000v / ASA 1000v / VSG on steroids). To manage at large scale, anybody with a consumer product that interacts over the Internet is going to need some of the skills currently being talked about. And OpenFlow flow munging is going to require a darn good GUI. One that figures out how to present it as other than classic switch + router + firewall + SLB with some priority rules for what order things happen. It’s the human configuration paradigm that’s the hard part there!

So what are the precursor requirements for SDN to cross the chasm? I’m coming up with:

  • Relatively low cost
  • Reliable
  • Good GUI and simple to use
  • Solves problems for the customer
  • Supported by vendor

Now look at what Cisco is doing. They’re doing the old IBM trick of backing all the horses. So they’re doing the OpenFlow / OpenStack / ONE thing. That meets the DevOps and wanna-program crowd, at least part way.

The other thing they’re doing is acquiring SDN / GUI products (two firms last December). And doing GUI, e.g. the former Cloupia, DCNM, now the DFA. In other words, they’re innovating more on the pragmatist customer side of the chasm! Aren’t they? Learning and innovating by shipping product to fuel tight coupling to meet customer needs.

Granted, that assumes they will listen well. And manage to overcome any legacy strong opinions about Cisco + GUI in the R&S space.

I think all this may have implications for time-to-market for OpenFlow-based solutions. The pre-chasm base may tolerate Puppet and scripting ties to, in effect, execute config commands. How long after that before there’s a solid GUI that does a whole gamut of things? Years? What’ll be the state of chipsets and switches by the time that arrives?

What do you think? Comments welcomed!

Coming Attraction!

Well, that last bit started shading into another blog topic. More on that later.

There’s another thing Cisco and VMware are doing well. One that I haven’t seen discussed in quite the way I’m thinking of it. (Challenge to the reader: what do they have in common that is different from what everyone else is doing?) I’ll have more to say about that in another blog …

Life Log

I’m glad to be transmitting. My blog ideas queue was starting to tail drop.

Disclosure

The vendors for Network Field Day 5 (#NFD5) paid for my travel expenses and perhaps small items, so I wish to disclose that in my blogs now. The vendors in question are: Cisco, Brocade, Juniper, Plexxi, Ruckus, and SolarWinds. I’d like to think that my blogs aren’t influenced by that. Yes, the time spent in presentations and discussion gets me and the other attendees looking at and thinking about the various vendors’ products, marketing spin, and their points of view. I intend to try to remain as objective as possible in my blogs. I’ll concede that cool technology gets my attention!

Stay tuned!

Twitter: @pjwelcher
