Cisco NX-OS Data Center Features

1
CISCO NX-OS DATA
CENTER FEATURES
Jack Ross
CCIE #16728
2
Agenda
• Brief Hardware Overview
• Software Versions
• NX-OS Layer 2
• NX-OS Layer 3
• FabricPath
• Virtual Device Contexts (VDCs)
• Fibre Channel over Ethernet (FCoE)
• Overlay Transport Virtualization (OTV)
• Virtual Port Channels (vPCs)
3
Nexus 7000 Overview
Nexus 7000/7700
– Typically DC core or aggregation
– High performance, density, & availability
– Unified I/O
• FCoE switch, but not a native FC switch
– Redundant Power, Line Cards and Supervisors
4
Nexus 7000 Platform Overview
• Currently 7 form factors
– 7018, 7010, 7009, 7004, 7718, 7710, 7706
• Currently 2 types of line cards
– M Series Cards - Layer 3 cards
• Feature rich cards
– F Series Cards - Layer 2 cards*
• Performance oriented cards
5
Line Card Features
• M Series Specific
– Layer 3 Routing
– FEX
– OTV
– TrustSec
• F Series Specific
– FabricPath
– vPC+
– FCoE
6
Nexus 5000 Overview
Nexus 5000/5500
– Typically End of Row (EoR) aggregation or Top
of Rack (ToR) access
– Typically Layer 2, but can do limited Layer 3 with an add-on
daughter card in the 5500 Series
– Unified I/O
• Both FCoE and native FC switching
– Redundant power but not supervisors
7
Nexus 5000 Platform Overview
• Currently 2 Generations
– 1st Gen - Nexus 5000 – 5010 & 5020
– 2nd Gen - Nexus 5500 – 5548 & 5596
• Mainly layer 2 switching
– 5500 can support L3 add-in card
• Supports Unified I/O
– Both FCoE Forwarder (FCF) and native FC switching
• 5500 supports Unified Ports (UP models)
– Ports can run as Ethernet or native Fibre Channel
– Ethernet ports are allocated starting at port 1, counting up
– Fibre Channel ports are allocated starting at the last port, counting down
– Requires a reboot to re-allocate port’s role (like UCS FI)
8
Nexus 2000
• Fabric Extender (FEX)
• Acts as a remote line card of 7K or 5K chassis
• All management performed on Parent Switch
– No console or VTY ports on FEX
– NX-OS automatically downloaded from Parent
• No local switching
– Essentially a VN-Tag/802.1BR switch, not an Ethernet
switch
– Traffic between local ports on FEX must flow “north” via
uplink to Parent and then “south” back down
– Can impact design decision of platform placement
9
Software Versions for CCIE Lab
• NX-OS v6.0(2) on Nexus 7000 Switches (6.2(6) latest)
• NX-OS v5.1(3) on Nexus 5000 Switches
• NX-OS v4.2(1) on Nexus 1000v
• NX-OS v5.2(2) on MDS 9222i Switches
• UCS Software release 2.0(1x) for UCS-6248
Fabric Interconnect
10
Nexus NX-OS Basics
Nexus at its core is a Layer 2/3 Switch
Similar in many aspects to Catalyst IOS
– VLANs, Trunking, VTP, Rapid PVST, MST,
EtherChannel, PVLANs, UDLD, FHRPs, IGPs,
BGP, etc.
Key new features beyond Catalyst IOS
– FEX, vPC, Fabric Path, OTV, FC Switching, FCoE, etc
11
NX-OS Port Channels/EtherChannels
• Unlike IOS, NX-OS does not support PAgP
– Channels must be statically on or LACP negotiated;
there is no PAgP "desirable" mode
– LACP must be enabled with feature lacp (see the sketch below)
• One of the “killer apps” of NX-OS is
Virtual Port Channels (vPC)
– Multi-Chassis EtherChannel (MEC/MCEC)
– Analogous to the 3750 Cross-Stack EtherChannel & 6500
Virtual Switching System (VSS)
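A minimal sketch of building an NX-OS port channel with LACP (interface and channel numbers here are placeholders):
feature lacp
!
! Bundle two uplinks with LACP ("mode on" would build a static channel instead)
interface Ethernet1/1-2
  switchport mode trunk
  channel-group 10 mode active
!
interface port-channel 10
  switchport mode trunk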
12
NX-OS Spanning Tree
• Unlike IOS, NX-OS does not support legacy CST/PVST+
– Default STP mode is Rapid-PVST+
– i.e. per-VLAN, but uses 802.1w Rapid STP
– Also supports 802.1s Multiple Spanning Tree (MST)
• NX-OS defines three STP port types
– spanning-tree port type [normal | edge | network]
13
NX-OS Switchport Types
spanning-tree port type normal
– Normal ports act like Catalyst IOS ports
– Default STP port type; runs Rapid Per-VLAN STP
spanning-tree port type edge
– Edge ports are STP PortFast ports
spanning-tree port type network
– Network ports run STP Bridge Assurance
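A minimal sketch of the three port types (interface numbers are placeholders):
! Link toward another switch - runs Bridge Assurance
interface Ethernet1/1
  spanning-tree port type network
!
! Host-facing port - PortFast behavior
interface Ethernet1/10
  spanning-tree port type edge
!
! Default behavior, equivalent to a standard Catalyst port
interface Ethernet1/20
  spanning-tree port type normal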
14
Bridge Assurance
• All STP Network Ports send BPDUs regardless of STP port state
– Legacy 802.1d only sends BPDUs from Root Bridge
downstream
– Primary goal is to protect against unidirectional links
– BPDU becomes a bidirectional keepalive
– Replaces LoopGuard functionality
• A secondary result is the same functional effect as VTP Pruning
– VLANs stop forwarding on trunk links where BPDUs are not
received for that VLAN
• Enabled on interfaces with spanning-tree port type network
15
Bridge Assurance Diagram
[Diagram: N5K-1 (VLANs 10, 20, 30) and N5K-2 (VLANs 20, 30, 40) connected by a
trunk with switchport trunk allowed vlan 20, 30 – with Bridge Assurance, only the
VLANs that receive BPDUs on the trunk keep forwarding.]
16
NX-OS Layer 3
Like Catalyst IOS, NX-OS supports…
– Native Layer 3 routed interfaces
• i.e. no switchport
– Switched Virtual Interfaces (SVIs)
• i.e. VLAN interfaces
• Must be enabled with feature interface-vlan
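A minimal sketch of both interface types (addresses and interface numbers are placeholders):
feature interface-vlan
!
! SVI / VLAN interface
interface Vlan10
  ip address 10.0.10.1/24
  no shutdown
!
! Native Layer 3 routed port
interface Ethernet1/5
  no switchport
  ip address 10.0.50.1/24
  no shutdown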
17
NX-OS Routing Protocols
• Like IOS, NX-OS supports routing with…
– Static routing
– RIPv2 & RIPng
– EIGRP & EIGRPv6
– OSPF & OSPFv3
– IS-IS
– BGP
– Policy Routing
– No network command in IGPs; routing is activated per interface (see the sketch below)
• Not all protocols use the same license
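A minimal sketch of interface-based IGP activation, using OSPF as the example (the process tag, area, and interface are placeholders):
feature ospf
!
router ospf 1
  router-id 1.1.1.1
!
! Enable OSPF on the link itself - no network statement under the process
interface Ethernet1/5
  ip router ospf 1 area 0.0.0.0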
18
NX-OS VRF
• Like IOS, NX-OS Virtual Routing & Forwarding Instances are
used to create separate logical routing tables
– Layer 3 interfaces in different VRFs cannot exchange traffic
by default
• NX-OS VRFs behave slightly differently than in IOS, as…
– All layer 3 interfaces are automatically in VRF table “default”
– MGMT0 is automatically in vrf “management”
– VRFs are defined as vrf context
– Static routes are defined under the vrf context
– Dynamic routing is VRF aware, but configured under the
same process
– Exec mode routing-context vrf can change the default VRF
for verifications
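A minimal sketch of the VRF behaviors above (the VRF name, interface, and addresses are placeholders):
vrf context TENANT-A
  ! Static routes are defined under the vrf context
  ip route 0.0.0.0/0 192.0.2.1
!
interface Ethernet1/10
  no switchport
  ! Changing VRF membership removes the IP, so re-apply it afterwards
  vrf member TENANT-A
  ip address 192.0.2.2/30
!
! Exec mode - show/verification commands now default to this VRF
routing-context vrf TENANT-A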
19
NX-OS Redistribution
• Unlike IOS, route-maps are required to perform
redistribution on NX-OS
– Same route-map match/set logic as IOS
• Redistribution does not include directly connected
interfaces
– Requires redistribute direct route-map…
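A minimal sketch of redistributing connected interfaces into OSPF (names and prefixes are placeholders):
ip prefix-list CONNECTED-NETS seq 5 permit 10.0.50.0/24
!
route-map CONNECTED-TO-OSPF permit 10
  match ip address prefix-list CONNECTED-NETS
!
router ospf 1
  ! "direct" covers connected interfaces; the route-map is mandatory
  redistribute direct route-map CONNECTED-TO-OSPF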
20
Fabric Path
• Pre-standard version of TRILL
• Essentially Layer 2 Ethernet Routing
• Uses ISIS to route Layer 2 Frames instead of using STP
• Can build arbitrary Topologies – Full Mesh, Partial Mesh,
Triangle, Square, etc
• Adds a TTL in Layer 2
• When used with vPC, referred to as vPC+
21
Fabric Path Configuration
• Very few commands necessary
• Enable FabricPath
– install feature-set fabricpath
– feature-set fabricpath
• Configure FabricPath VLANs
– mode fabricpath under VLAN
• Configure FabricPath Core Ports
– switchport mode fabricpath
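Putting those commands together (VLAN and interface numbers are placeholders):
install feature-set fabricpath
feature-set fabricpath
!
vlan 100
  mode fabricpath
!
! FabricPath core port toward the other FabricPath switches
interface Ethernet1/1
  switchport mode fabricpath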
22
Virtual Device Contexts (VDC)
VDCs used to virtualize physical hardware of Nexus 7000
– Loosely analogous to SDRs in IOS XR or Contexts in ASA
• VDCs also virtualize control plane protocols of Nexus 7000
– Not analogous to VLANs or VRFs in IOS
– Separate control plane per VDC
• VLAN 40 in VDC 1 is not VLAN 40 in VDC 2
• OSPF PID 20 in VDC 1 is not OSPF PID 20 in VDC 2
23
Why VDCs?
Multiple logical roles per physical chassis
– E.g. Core & Aggregation/Distribution on same box
Multi-Tenancy
– E.g. VDCs as a managed service to customers
Test Lab Environment for later Production Use
Required for certain features – FCoE/Storage
24
VDC Caveats
Some features can't co-exist in the same VDC
– OTV and VLAN interfaces (SVIs)
– F2 cards and M1/F1
– FCoE requires its own “Storage” VDC
Hardware and software version dependent; check the
release notes
25
VDC Maximums
• 4 VDCs per chassis with SUP 1
• 4+1 VDCs per chassis with SUP 2
• 8+1 VDCs per chassis with SUP 2E*
• No internal cross VDC communication
– E.g. no route leaking like in VRFs
– Physical cable can be used to connect VDCs
26
The Default VDC
• Default VDC “1” always exists and cannot be removed
• Used to create and manage other VDCs
– Controls VDC port allocations
• All ports allocated to default VDC at initialization
– Controls VDC resource allocations
• Number of VLANs, VRFs, Routing table memory, etc.
• Can be used for normal data plane operations
– “Recommended” for management of chassis only
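A minimal sketch, run from the default VDC (the VDC name and interface range are placeholders):
! Create a new VDC and move ports out of the default VDC into it
vdc AGG
  allocate interface Ethernet2/1-8
!
! Exec mode - drop into the new VDC's own CLI
switchto vdc AGG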
27
Default VDC Tasks
Some tasks can only be performed in the default VDC
– VDC creation/deletion/suspend
– Resource allocation – interfaces, memory, MACs
– NX-OS Upgrade across all VDCs
– ISSU or EPLD Upgrade
– Ethanalyzer captures – control plane traffic
– Feature-set installation for Nexus 2000, FabricPath, FCoE
– Control Plane Policing (CoPP)
– Port Channel load balancing hash
– Hardware IDS check control
– ACL Capture feature enable
– System-Wide QoS
28
Converged Ethernet or FCoE
• Lots of terms that essentially mean the
same thing
– Unified Fabric
– Unified Wire
– Converged Ethernet
– Converged Enhanced Ethernet
– Data Center Ethernet
– Data Center Bridging
• What they all really mean…
– You are running the physical framing for both Ethernet and Fibre
Channel over the same physical links
29
FCoE Terms
• FCoE Initialization Protocol (FIP)
• FCoE Forwarder (FCF)
• ENode – End Device
• Virtual Fibre Channel (VFC) Interface
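A minimal sketch of how these pieces fit on a Nexus 5500 acting as FCF (VLAN/VSAN and interface numbers are placeholders):
feature fcoe
!
vsan database
  vsan 100
! Map the FCoE VLAN to the VSAN
vlan 100
  fcoe vsan 100
!
! Converged port toward the ENode (CNA)
interface Ethernet1/10
  switchport mode trunk
  switchport trunk allowed vlan 1,100
!
! The VFC rides on top of the Ethernet port
interface vfc10
  bind interface Ethernet1/10
  no shutdown
vsan database
  vsan 100 interface vfc10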
30
FC, FCoE, FCIP and iSCSI
31
How FCoE Works
• FCoE replaces layer 1 & 2 transport for FC
• All upper layer FC services remain
– Domain IDs, FSPF, FCNS, FLOGI, Zoning.
• New FCoE Initialization Protocol (FIP) to negotiate
between Fabric and Node
– Fabric is the FCF
– Node is the ENode
32
FCoE Control and Data Planes
• FIP is the control plane of FCoE
• FCoE is the actual data plane
• FIP
– New EtherType 0x8914
– Used to discover FCFs and perform FLOGI
– UCS C-Series servers, when FCoE is turned on, use LLDP to begin
negotiation
• FCoE
– New Ethertype 0x8906
– Requires an MTU of at least 2240 bytes, since FC has a larger payload
• Implies Jumbo Frames are required
33
OTV Basics
• Overlay Transport Virtualization (OTV)
– Layer 2 VPN over IPv4
• Specifically OTV is…
– IPv4/IPv6 over Ethernet… over MPLS… over GRE…
over IPv4…
34
OTV vs. other Layer 2 DCIs
Layer 2 DCI is needed for Virtual Machine workload mobility, e.g. VMware
vMotion
• Many possible options for L2 DCI
– Dark Fiber (CWDM/DWDM)
– Layer 2 Tunneling Protocol version 3 (L2TPv3)
– Any Transport over MPLS (AToM)
– Virtual Private LAN Services (VPLS)
– Bridging over GRE – Spanning Tree Bridge Group
• These options can be used for DCI, but OTV was made for DCI
– Optimizes ARP flooding over DCI
– Demarc of the STP domain
– Can overlay multiple VLANs without complicated design
– Allows multiple edge routers without complicated design
35
OTV Terms
• OTV Edge Device
– Edge router(s) running OTV
• Authoritative Edge Device (AED)
– Active edge router for a particular VLAN
– Allows multiple redundant edge routers while
preventing loops
• Extend VLANs
– VLAN being bridged over OTV
• Site VLAN
– Internal VLAN used to elect AED
36
OTV Terms Continued
• Site Identifier
– Unique ID per DC site, shared between AEDs
• Internal Interface
– Layer 2 interface where traffic to be encapsulated is
received
• Overlay Interface
– The logical OTV tunnel interface that performs the OTV
encapsulation
• OTV Join Interface
– The Layer 3 physical link or port-channel that you use
to route upstream towards the DCI
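A minimal sketch tying these terms together, assuming a multicast-capable DCI (addresses, VLAN ranges, and interface numbers are placeholders):
feature otv
!
otv site-vlan 3
otv site-identifier 0000.0000.0001
!
interface Overlay1
  ! Layer 3 uplink routed toward the DCI
  otv join-interface Ethernet1/1
  ! ASM group used by the IS-IS control plane
  otv control-group 239.1.1.1
  ! SSM range used for multicast data traffic
  otv data-group 232.1.1.0/28
  otv extend-vlan 50-70
  no shutdown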
37
OTV Overview
[Diagram: Two data centers connected over any Layer 3 DCI. Site 1 (Site VLAN 3,
local VLANs 10 - 70) uses edge devices N7K-1-1 and N7K-1-2; Site 2 (Site VLAN 4,
local VLANs 50 - 90) uses N7K-2-1 and N7K-2-2. Extend VLANs 50 - 70 are bridged
over the OTV overlay, with the AED role split across the edge devices (e.g. one
AED for VLANs 50 - 59 and one for VLANs 60 - 70). Each edge device has an Internal
Interface toward the local VLANs, a Join Interface routed toward the DCI, and a
logical OTV Overlay interface. Server 1 and Server 2 both sit in VLAN 60, one at
each site.]
38
OTV Control Plane
• Uses IS-IS to advertise MAC addresses between AEDs
– IS-IS is its own transport and is extensible
• IS-IS is encapsulated in the multicast Control Group
– IS-IS over Ethernet over MPLS over GRE over
IPv4 Multicast
– Implies that DCI must support ASM Multicast
39
OTV Data Plane
• Uses both Unicast and Multicast Transport
• Multicast Control Group
– Multicast or Broadcast Control Plane Protocols
– E.g. ARP, OSPF, EIGRP, etc.
• Unicast Data
– Normal Unicast is encapsulated as Unicast between AEDs
• Multicast Data Group
– Multicast Data flows are encapsulated as SSM Multicast
– Implies AEDs use IGMPv3 for (S,G) Joins
40
OTV Adjacency Server
• Normally OTV requires that the DCI runs multicast
– Needed to find and form IS-IS adjacencies and to tunnel
multicast data traffic
• OTV Adjacency Server removes multicast requirement
– One (or more) AEDs are configured as the adjacency
server
– All other AEDs register with the adjacency server
– Now all endpoints are known
• All control and data plane traffic is now unicast encapsulated
– Will result in "Head End Replication" when more than 2
DCs are connected over the DCI (see the sketch below)
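A minimal sketch of the unicast-only option (the adjacency server address is a placeholder):
! On the edge device acting as adjacency server
interface Overlay1
  otv adjacency-server unicast-only
!
! On every other edge device
interface Overlay1
  otv use-adjacency-server 198.51.100.1 unicast-only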
41
OTV DCI Optimizations
• Other DCI options bridge all traffic over DCI
– STP, ARP, L2 Flooding, broadcast storms, etc.
• OTV reduces unnecessary flooding by…
– Proxy ARP/ICMPv6 Cache on AED
– Terminating the STP domain on AED
42
vPC Port Channels
– Port Channels, EtherChannels, & NIC Teaming/Bonding
terms used interchangeably
– Regardless of vendor, 802.3ad (LACP) refers to Port Channeling
• Used to aggregate bandwidth of multiple links between devices
– E.g. 4 x physical 1GigE links form a 4GigE logical Port Channel
• Appears as one logical link from STP’s perspective
– Avoids active/standby and allows active/active
43
vPC Port Channels
• Data flows are load balanced between member links
– Single flow cannot exceed BW of any physical member link
• E.g. increases lanes on the highway but not the speed limit
• Does not perform LFI like PPP Multilink
• Flows are load balanced based on L2, L3, & L4 header
information
– SRC/DST VLAN, MAC, IP, & TCP/UDP Port
• Default is SRC/DST L3 for IPv4/IPv6 and SRC/DST MAC
for non IP
– Can result in over/under subscribed links
44
Port Channels
• Port Channeling was originally between only 2 devices
– 1 downstream device & 1 upstream device
• E.g. end host to Catalyst 3550 via 2 x FE links
– Increases BW but still has single point of failure
• Multi Chassis EtherChannel (MCEC/MEC) is between
3 devices
– 1 downstream device & 2 upstream devices
• E.g. end host to 2 x Catalyst 3750s via 2 x GigE links
– Increases BW and resiliency
– Logically appears the same as a 2 device Port Channel
45
Multi-Chassis EtherChannels
• 3750 StackWise & 6500 VSS use a single control plane
– StackWise via stacking cables connecting the backplanes
– VSS via Virtual Switch Link (VSL)
• vPC uses two separate control planes
– Configurations managed independently
– Separate control plane protocol instances
• STP, FHRPs, IGPs, BGP, etc.
– Synchronization via a Peer Link
• Similar logic to VSS’s VSL
46
vPC Peer Switches
• vPC made up of 2 physical switches
– The vPC Peers
• vPC Peers each have…
– vPC Peer Link
– vPC Peer Keepalive Link
– vPC Member Ports
47
vPC Overview
[Diagram: N7K-1 and N7K-2 are vPC peers joined by a Peer Link, with a separate
Peer Keepalive link between them. Each peer has vPC Member Ports down to the
access switches N5K-1 and N5K-2.]
48
vPC Peer Link
• Layer 2 trunk link used to sync control plane between vPC peers
– CAM table, ARP cache, IGMP Snooping DB, etc.
– Uses Cisco Fabric Service over Ethernet (CFSoE) protocol
– Used to elect a vPC Primary and vPC Secondary Role
• Normally not used for the data plane
– Peer Link generally much lower BW than aggregate of
vPC Member Ports
– If Peer Link used in the data plane, it is the bottleneck
49
vPC Peer Keepalive
• Layer 3 link used as heartbeat in the control plane
– Used to prevent active/active or “Split Brain” vPC Roles
– Not used in the vPC data plane
– Uses unicast UDP port 3200
• Peer Keepalive Link can be…
– Mgmt0 port
• Back to back or over routed infrastructure
• Ideally in an isolated VRF
50
vPC Member Ports
• Data plane port channel towards downstream neighbor
• Each vPC Peer has at least one member port per vPC
– Can be more, up to hardware platform limits
• From perspective of downstream neighbor, upstream
vPC Peers are one switch
– Physical result is a triangle
– Logical result is a point-to-point Port Channel with no
STP blocking ports
51
vPC Order of Operations
• Enable feature vpc
• Create a vPC domain
• Configure the vPC peer keepalive link
• Create the vPC peer link
• Move member ports to a vPC
– Configurations must be consistent to avoid Type 1 and
Type 2 consistency-check errors (see the sketch below)
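A minimal sketch following that order of operations on one peer (addresses, domain, and port-channel numbers are placeholders; the other peer is configured as a mirror image):
feature vpc
!
vpc domain 10
  peer-keepalive destination 10.255.0.2 source 10.255.0.1 vrf management
!
! Peer Link - Layer 2 trunk that syncs the control plane
interface port-channel 1
  switchport mode trunk
  vpc peer-link
!
! Member ports toward the downstream switch
interface port-channel 20
  switchport mode trunk
  vpc 20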
52
vPC Loop Prevention
• Goal of vPC is to hide redundant links from STP
– Could result in layer 2 flooding loops
• Loops are prevented via “vPC Check” behavior
– Frames received in the vPC Peer Link cannot flood out a vPC
Member Port while the remote vPC Peer has active vPC
Members in the same vPC
• vPC Check Exception
– If vPC Peer’s Member Ports are down, the
vPC Member Ports become “Orphan Ports” and the
vPC Check is disabled
– vPC Peer Link is essentially a last resort connection
53
vPC and FHRP
• Nexus 7000 is typically L2 & L3 network boundary
– N7K is vPC Peer but also end host’s FHRP Default Gateway
• FHRP behavior changes to accommodate active/active forwarding
over vPC
– Traffic received in vPC Member Port of FHRP Standby to FHRP
Virtual MAC is not forwarded over Peer Link to Active FHRP member
– Essentially HSRP Standby acts as HSRP Active
• FHRP vPC can break in certain non-standard vendor applications
– Frames sent to FHRP Standby with physical DST MAC of FHRP
Active are sent out the Peer Link
– peer-gateway allows FHRP Standby to forward frames on
behalf of the DST MAC of FHRP Active without going over the Peer Link
(see the sketch below)
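A minimal sketch of an HSRP gateway on a vPC peer with peer-gateway enabled (addresses and numbers are placeholders):
vpc domain 10
  ! Forward frames addressed to the other peer's physical gateway MAC locally
  peer-gateway
!
feature interface-vlan
feature hsrp
interface Vlan60
  ip address 10.0.60.2/24
  hsrp 60
    ip 10.0.60.1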
54
vPC and Multicast
• When source is reachable via vPC Member Port, both
vPC Peers act as PIM DR
– Called “Dual DR” or “Proxy DR”
• Allows either vPC Primary or Secondary to receive traffic
from source and forward it north without having to cross the
vPC Peer Link
– Respects vPC check rule