Designing for Cisco Nexus 9K
This is the second of at least two blogs about Cisco Application Centric Infrastructure (ACI). The first covered the broad set of announcements with a focus on the hardware, and can be found here. It contained a set of links about ACI and the launch. This one focuses more on the ACI part.
I’ve been using some of my scant spare time to obtain more information about the ACI part, where Cisco has so far described the big picture but not much in the way of details. What I found slightly supplements information gathered informally from some of the Insieme team.
I found the TechWise TV video about ACI well done and informative. Recommended! It is here. It covered the hardware from 21:18 to 34:32, use cases and designs from there to 44:00, and Mike Dvorkin (@dvorkinista) (resident visionary / genius / Insieme founder) from 44:00 to 52:30, at which point Joe Onisick (@jonisick) stepped in to complete the picture.
There is a rather large page of links at http://www.janheijdra.com/2013/11/cisco-application-centric.html?m=1. Some of them are new to me; some just break out the titles of the individual documents on the Cisco ACI pages.
There is a moderately informative (and short) video about the AVS virtual switch at http://www.youtube.com/watch?v=VbUho9Kdnxs&feature=youtu.be. Takeaways for me: think 1000v where the APIC (controller) replaces the VSM, fully integrated into ACI. There will be migration tools for 1000v to ACI migration. Licensing (free “upgrade”?) did not appear to get answered.
Matt Oswalt (co-TFD @ ACI Launch attendee and smart person) must be an even faster thinker / typist than I am — or used his travel time better. Anyway, he’s written a couple of super blogs, with a third on the way. The first one tackles hardware, just as I did, the second goes more into the design / fabric side of things. Good photos and diagrams, too. See:
FireFly Educate, which specializes in datacenter training (and for whom I occasionally have taught Nexus classes) just announced a 2 day Nexus 9000 class. See the FireFly Educate website for this class and more. It covers standalone mode with hands-on labs and some discussion of ACI. I suspect there may be some rapid course development going on, extending the materials on the ACI and APIC front.
It’s probably not surprising that, after rushing to get the prior blog published, some more ideas about its topics popped into my head. Rather than update the prior blog, I’m putting those items here.
1) Pricing. The 288 x 40 G ports for $75K sounds great. That price is probably discounted and may be optimized in other ways. For example, 288 x $1K (approximate heavily discounted 40G BiDir optics) = $288K. So optics are presumably not included in that wonderful sounding $75K price that was in the slides and various blogs.
2) Positioning. What are the pros/cons of N9K in standalone? Where would I still want a pre-N9K Nexus, where would I want a N9K? What do I give up by using a N9K in standalone mode? Where’s the chart of major features that the N5K, N6K, N7K, N7700, and N9K NX-OS code versions support?
3) How exactly do the “ACI fabric” switches do forwarding? Matt Oswalt’s second blog above has some interesting tidbits he picked up.
4) Arista’s Monday announcement had the message: start putting in SMF for 100 Gbps links. One wonders if by using more wavelengths, Cisco can extend the 40 G BiDir optics to do 100 G. That would likely need 5 wavelengths in each direction on one strand, all in a small transceiver. Unlikely?
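To make the pricing arithmetic in item 1 concrete, here is a back-of-the-envelope sketch. The port count, the roughly $1K optic price, and the $75K figure come from the text above; the variable names are mine, and the real numbers will depend on discounts and on what is actually bundled.

```python
# Rough check of optics cost against the quoted chassis price.
# All three inputs are the approximate figures quoted in the text.
PORTS = 288                # 40G ports
OPTIC_PRICE_USD = 1_000    # approximate heavily discounted 40G BiDir optic
QUOTED_PRICE_USD = 75_000  # price shown in the slides and blogs

optics_total = PORTS * OPTIC_PRICE_USD
ratio = optics_total / QUOTED_PRICE_USD

print(f"Optics for {PORTS} ports: ${optics_total:,}")
print(f"Quoted price:            ${QUOTED_PRICE_USD:,}")
print(f"Optics alone are about {ratio:.2f}x the quoted price")
```

If those inputs are anywhere near right, the optics alone cost several times the quoted price, which is why I presume they are not included.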
I should note that all of this is subject to change; it is my best interpretation of what I’ve heard.
<These are probably not precisely what was said; I didn’t record them at the time.>
SDN is a subset of ACI. (I also think I heard ACI positioned as 2nd generation SDN.)
There is no value in being a VLAN plumber.
We don’t want to be in the jet and sonic boom situation. (Referring to overlays as replicating the existing problem in software, adding complexity, i.e. trailing rather than the driver?)
(OpenFlow, generation 1 SDN) is not abstracting, it is just centralizing complexity.
APIC = Application Policy Infrastructure Controller.
It is claimed to be capable of managing 1M endpoints. It operates independently from switch control or data planes. It is emphatically not in the data path, which would preclude scaling. One key point repeatedly made was that, unlike some SDN, the APIC is a policy controller. It does not do giant routing and switching tables (e.g. act like OpenFlow). Since forwarding logic and data plane remain distributed, the architecture is highly scalable.
The APIC will normally start with 3 controllers running active / passive, with data stored as at least 2 shards on two different controllers. It scales to 12 controllers for very large networks.
APIC will support N7K, N2K, ASA, and ASR, but a Nexus 9K is required to build the fabric.
APIC is due out second quarter 2014. Pricing? Licensing costs for NX-OS Plus supporting APIC / fabric? (Free? Cisco wants you hooked on ACI?)
APIC controls and indirectly gets the ACI fabric components configured. The ACI fabric is stateless, and integrated with L4-7.
My current understanding is that the ACI fabric will use standard VLAN, VXLAN, and apparently ordinary routing and switching (i.e. not OpenFlow or MPLS). It does hardware VXLAN gatewaying.
The ACI chip complements the Broadcom chipset, which does the packet forwarding. The ACI chip does telemetry and provides real-time stats. It is able to locate packet loss at the atomic level and to identify which nodes are creating delay or packet loss.
The comparison is UCS, not surprising since Insieme co-founder Mike Dvorkin was co-designer of the UCS management.
The ACI profile defines the high-level relationships in the network. When a component of an application (think web/application/DB servers, also DNS, DHCP, and other services) is instantiated, its agent gets the policy information from the APIC. It then handles the policy in terms of specifics that are relevant to it. So a physical device might end up configured with VLAN information and IP address, and a virtual device might be put into a port group, with the relationship between services instantiated as a VXLAN.
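A toy sketch of that idea, as I understand it: one abstract policy, rendered differently depending on what kind of device receives it. Everything here (the class, the function names, the dictionary fields, the sample VLAN and VNI values) is hypothetical and invented for illustration; it is not the APIC object model or API, which I have not seen.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Abstract intent: which application and tier an endpoint belongs to."""
    app: str
    tier: str


def render_physical(policy: Policy, vlan_id: int, ip: str) -> dict:
    # A physical device ends up with concrete VLAN and IP settings.
    return {"kind": "physical", "tier": policy.tier, "vlan": vlan_id, "ip": ip}


def render_virtual(policy: Policy, vxlan_vni: int) -> dict:
    # A virtual device is placed in a port group; the relationship
    # between services is carried as a VXLAN segment.
    return {
        "kind": "virtual",
        "port_group": f"{policy.app}-{policy.tier}",
        "vxlan_vni": vxlan_vni,
    }


web = Policy(app="shop", tier="web")
physical_cfg = render_physical(web, vlan_id=110, ip="10.1.10.5")
virtual_cfg = render_virtual(web, vxlan_vni=5010)
print(physical_cfg)
print(virtual_cfg)
```

The point of the sketch is only that the policy itself says nothing about VLANs or port groups; those specifics are decided where the policy lands.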
I’m not a DevOps person, so this sounds somewhat like what I’ve heard about Puppet or Chef. I understand the difference to be that the policy does not specify detailed state to the agent, i.e. the APIC (like most managers) delegates responsibility and details.
One concern I have in such an environment is whether it ends up feeling like “pushing rope”. Some prior experiences with tools, including CiscoWorks, left me watching for changes to get deployed, and then scratching my head to figure out why they didn’t happen. So far I’ve heard APIC will be fast and will have tracking of state. And will know (or be able to quickly obtain) the present state of every component.
One analogy about APIC: instantiation is like what happens when your cell phone associates with a new cell tower. It rapidly gets the information it needs, and you keep on talking. Similarly, with ACI, when you instantiate a component on a physical or virtual server, it gets instantiated and everything that needs to talk to it can do so.
I have to note, this is not the application communicating with the network. It is the app developers communicating the interactions their application servers will need, and other requirements. This has been described as solving the translation problem, in that developers and server admins think in one set of terms, network people another, and perhaps business people in a third.
One key item: APIC works with whitelists. That means that unless the profile says A talks to B, A will not be able to talk to B. That provides accurate security from Day 1.
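The whitelist idea is simple enough to sketch in a few lines. This is my own minimal model, not APIC code: the tier names are made up, and I am assuming the permitted relationships are directional, which I have not confirmed.

```python
# Whitelist model: traffic is denied unless the profile explicitly allows it.
# Assumption: pairs are directional (source tier, destination tier).
ALLOWED = {
    ("web", "app"),
    ("app", "db"),
}


def may_talk(src: str, dst: str, allowed=ALLOWED) -> bool:
    """Return True only if the profile lists src -> dst; default is deny."""
    return (src, dst) in allowed


print(may_talk("web", "app"))  # listed in the profile
print(may_talk("web", "db"))   # not listed, so denied
```

Note there is no deny list anywhere: anything the profile does not mention is simply not permitted.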
Concerning management, ACI is intended to provide business level information, as well as real-time information about devices and links, latency, packet loss, etc. One person used Hadoop as an analogy. I understood this as APIC doing a map-reduce like operation, and basically getting processed information back from the components it needed information from. I also heard (hey, bar talk) “quantum entanglement”, to make the point that the packet loss data is atomic (fine-grained) enough that by comparing packet counters, packet loss can be accurately measured. (This was described as being incredibly simple once you approached it the right way: what I sometimes refer to as a Blinding Flash of the Not-So-Obvious).
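Here is how I picture the counter comparison, as a rough sketch only. I am assuming each node along a path can report how many packets it saw for the same set of packets (same flow, same interval); how the ACI hardware actually marks and counts packets was not described, and the node names and counts below are invented.

```python
def locate_loss(path_counts):
    """Find where packets went missing along an ordered path.

    path_counts: list of (node_name, packets_seen) in forwarding order,
    all counting the same set of packets (an assumption of this sketch).
    Returns a list of (upstream_node, downstream_node, packets_lost).
    """
    drops = []
    for (up, up_count), (down, down_count) in zip(path_counts, path_counts[1:]):
        if down_count < up_count:
            drops.append((up, down, up_count - down_count))
    return drops


# Hypothetical counters for one flow crossing leaf -> spine -> leaf.
counts = [("leaf1", 1000), ("spine1", 1000), ("leaf2", 997)]
print(locate_loss(counts))
```

With comparable counters at every hop, locating the loss is just subtraction between neighbors, which fits the "incredibly simple once you approach it the right way" remark.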
The ACI scheme uses one-writer shared memory with no locking. This vastly reduces the impact of gathering data, be it SNMP or otherwise. There will be little CPU impact, e.g. to retrieving a full BGP routing table, unlike present routers. (Which spin the CPU up due to having to walk the routing table and restructure and send the information in lexicographic order.)
APIC will have APIs and support all the usual suspects, errrh tools. Python, embedded Puppet agent, Chef, CFEngine, etc.
I’ll note this is not the application communicating with the network. It is the application vendor or developer writing a profile describing the application’s relationships and needs. I presume we can alter some elements of that profile if we need to, i.e. if our network policy differs from the developer’s.
We live in interesting times.
This is an epic, mammoth undertaking.
The proof will be in the execution.
This trip to NYC has been quite a bit of fun. I’d like to thank Cisco, especially Amy Lewis (@CommsNinja), Tom Hollingsworth (@networkingnerd), and Tech Field Day for inviting me! And the rest of the TFD folks, for the interesting discussions!