OpenFlow Deployment Anecdotes and Solutions
David Erickson, Stanford University
October 17th, 2011
Datacenter Network Research Cluster (DNRC)
– 160 servers, XenServer 5.6
– 20 hardware OpenFlow switches
– 160 software OpenFlow switches
[Diagram: cluster topology; labels: Beacon (OF Controller), Non-OpenFlow, OpenFlow]
Gotchas
– Flooding
– Inband switch control
– Performance
Flooding Gotchas
– OpenFlow does not provide spanning tree
– Plan for topologies with loops or multiple external network connections
– DNRC filters out all broadcast packets
  – ARP broadcast-to-unicast module for known hosts
  – DHCP broadcast-to-unicast module
  – Hosts send gratuitous ARPs every 60 s for discovery
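The broadcast-to-unicast idea for ARP can be sketched as follows. This is a minimal illustration in Python, not Beacon's actual module: the class `ArpUnicaster`, the `ArpRequest` record, and the policy of silently dropping requests for unknown targets are all assumptions made for the example.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional

BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


@dataclass(frozen=True)
class ArpRequest:
    """Hypothetical, simplified view of an ARP request seen by the controller."""
    src_mac: str
    src_ip: str
    dst_mac: str
    target_ip: str


class ArpUnicaster:
    """Rewrites broadcast ARP requests into unicast ones for known hosts."""

    def __init__(self) -> None:
        self.known_hosts: Dict[str, str] = {}  # IP address -> MAC address

    def handle(self, req: ArpRequest) -> Optional[ArpRequest]:
        # Every ARP, including a gratuitous one, teaches us the sender.
        self.known_hosts[req.src_ip] = req.src_mac
        if req.src_ip == req.target_ip:
            return None  # gratuitous ARP: used for discovery only
        if req.dst_mac != BROADCAST_MAC:
            return req  # already unicast, pass through unchanged
        target_mac = self.known_hosts.get(req.target_ip)
        if target_mac is None:
            return None  # unknown target: drop rather than flood (assumed policy)
        return replace(req, dst_mac=target_mac)
```

With this shape, the periodic gratuitous ARPs keep `known_hosts` fresh, so an ordinary request can be delivered to exactly one host without ever flooding a topology that has loops.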
Flooding Gotchas
– Problem #1: Hosts appeared to be bouncing around the network
[Diagram: Problem #1, host-to-Internet traffic; labels: Beacon (OF Controller), Non-OpenFlow, OpenFlow]
Flooding Gotchas
– Problem #1: Hosts appeared to be bouncing around the network
– Issue: MAC timeout at the non-OpenFlow switch
[Diagram: Problem #1; labels: ARP timeout, MAC Entry Timeout, Beacon (OF Controller), Non-OpenFlow, OpenFlow]
Flooding Gotchas
– Problem #1: Hosts appeared to be bouncing around the network
– Issue: MAC timeout at the non-OpenFlow switch
– Solution: Static MAC mapping on the switch, plus ingress MAC filtering in Beacon as a fallback
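One plausible reading of the ingress MAC filtering fallback is sketched below in Python. It assumes the bouncing came from frames with an internal host's source MAC being flooded back in through an uplink after the non-OpenFlow switch aged out an entry. The class `IngressMacFilter`, the notion of a configured set of uplink ports, and the drop policy are assumptions for illustration, not Beacon's actual code.

```python
from typing import Dict, Set, Tuple

Port = Tuple[str, int]  # (switch id, port number), hypothetical identifiers


class IngressMacFilter:
    """Refuses to relearn an internal host's location from an uplink port."""

    def __init__(self, uplink_ports: Set[Port]) -> None:
        self.uplink_ports = set(uplink_ports)
        self.host_location: Dict[str, Port] = {}  # MAC -> attachment point

    def accept(self, src_mac: str, ingress: Port) -> bool:
        if ingress in self.uplink_ports:
            # A known internal MAC arriving from outside is treated as a
            # flooded copy, not as a host move, and is dropped.
            return src_mac not in self.host_location
        # Internal port: learn (or update) the host's attachment point.
        self.host_location[src_mac] = ingress
        return True
```

External MACs such as a gateway are still accepted on uplinks, since they never appear in `host_location`; only frames that would make an internal host seem to jump to the uplink are filtered.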
Inband Gotchas
– Problem #2: Gratuitous ARPs from hosts never made it to the controller; those from VMs were fine
– Issue: The Open vSwitch inband algorithm auto-forwarded them using ‘hidden’ tables/rules
– Solution: Modified the inband algorithm to be more selective about which ARPs it auto-forwards
Inband Gotchas
– Problem #3: Open vSwitch timing out and reconnecting every few minutes
– Particularly challenging
– Symptoms:
  – OVS log/Wireshark showed echo requests being sent but never replied to
  – Beacon log showed incoming echo requests and immediate replies sent
[Diagram: Problem #3, OVS disconnecting; labels: ARP Timeout, Echo Req, Echo Rep, ARP Req, Beacon (OF Controller), Non-OpenFlow, OpenFlow]
Inband Gotchas
– Problem #3: Open vSwitch timing out and reconnecting every few minutes
– Issue: An ARP timeout on the controller machine resulted in ARP requests being encapsulated and returned to the controller
– Solution: Static ARP entries on the controller; could also add static entries to always deliver ARP requests
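As a rough illustration of the static ARP fix, the Python helper below renders one `ip neigh replace ... nud permanent` command (iproute2) per switch. It assumes a Linux controller machine; the helper name `static_arp_commands`, the interface name, and the addresses in the usage note are hypothetical.

```python
from typing import Dict, List


def static_arp_commands(entries: Dict[str, str], dev: str) -> List[str]:
    """Render iproute2 commands that pin IP -> MAC neighbor entries.

    `entries` maps each switch's IP address to its MAC address. Entries
    marked `nud permanent` do not expire, so the controller machine never
    has to send an ARP request for those addresses.
    """
    return [
        f"ip neigh replace {ip} lladdr {mac} dev {dev} nud permanent"
        for ip, mac in sorted(entries.items())
    ]
```

For example, `static_arp_commands({"10.0.0.11": "aa:aa:aa:aa:aa:11"}, "eth0")` yields a single command string that could be run as root on the controller host. With the entries pinned, an echo reply no longer has to wait on an ARP request that the switch would encapsulate and send back to the controller.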
Performance Gotchas
– Benchmark hardware under the expected use case
– A slow switch CPU can cause:
  – Unexpected delays, packets popping up in odd places
  – Switch livelock
  – Slow steady-state convergence
– DNRC source routes based on VLAN tag, with some reactive routing in the host’s OVS
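Source routing on a VLAN tag can be pictured roughly as below. This Python sketch assumes one VLAN ID per precomputed path and a per-switch table mapping VLAN ID to output port; the function names, switch names, and that mapping scheme are assumptions for illustration, not a description of how DNRC actually encodes its routes.

```python
from typing import Dict, List, Optional, Tuple

Hop = Tuple[str, int]  # (switch id, output port), hypothetical identifiers
Tables = Dict[str, Dict[int, int]]  # switch id -> {VLAN ID -> output port}


def install_path(tables: Tables, vlan: int, path: List[Hop]) -> None:
    """Proactively install one VLAN-matching rule per switch on the path."""
    for switch, out_port in path:
        tables.setdefault(switch, {})[vlan] = out_port


def forward(tables: Tables, switch: str, vlan: int) -> Optional[int]:
    """Return the output port for a tagged packet, or None if no rule matches."""
    return tables.get(switch, {}).get(vlan)
```

Under this scheme the hardware switches only match on the tag, so their slow CPUs stay out of the per-flow path. The reactive part, choosing which tag to apply to a new flow, would sit in the software switch on the host.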