PARALLELIZING NETWORK PACKET PROCESSING IN SOFTWARE ROUTER
Team Members:
Amriteshwar Singh, Bibin Thomas, Divya Sasidharan,
Hars Vardhan, and Pritesh Baviskar
Parallel Systems and Architecture CS6399
Term Project
Overview
• Introduction
• Project Description
• Software Router v/s Hardware Router
• Parallelizing Across Servers
• Parallelizing Within Server
• Project Methodology
• Experiment Results
• Conclusion
• References
Introduction
• What is a software router?
• How to go about developing one
• Across-server parallelism
• Within-server parallelism
• RouteBricks architecture
Project Description
Objective
• Understanding and evaluating the clustered software router architecture
• Exploiting parallelism
Challenges
• High performance
• Re-programmability
Software Router v/s Hardware Router
[Figure: Traditional Router Architecture (external lines of R bps each, attached to an internal switching fabric) beside the Software Router Architecture in two forms: Single Server Software Router and Cluster Router Architecture (servers 1 to N joined by inter-server connections).]
Key features
• Each server processes packets at a rate proportional to R (that is, cR, where c is a constant independent of N).
• Decentralization – load-balanced interconnects.
• Extensible – the number of ports grows in proportion to N as new servers are added.
Parallelizing Across Servers
Valiant Load Balancing (VLB) routing
• Full mesh interconnection.
• Two-phase routing
   • Phase I – Source sends the input packet to a random intermediate node.
   • Phase II – Intermediate node relays the packet to the destination node.
• Node roles – Source, Intermediate, Destination.
Drawbacks
• Additional cost of forwarding packets: per-server rate of 3R as opposed to 2R, which is still << NR.
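The two-phase scheme can be sketched as a small simulation. This is a minimal illustration and not the project's MPI code: the names `vlb_route` and `intermediate_load`, the node count, and the fixed random seed are all assumptions made for the example.

```python
import random
from collections import Counter

def vlb_route(src, dst, num_nodes, rng):
    """Return the list of nodes one packet visits under two-phase VLB."""
    # Phase I: the source picks an intermediate node uniformly at random.
    intermediate = rng.randrange(num_nodes)
    # Phase II: the intermediate node relays the packet to the destination.
    return [src, intermediate, dst]

def intermediate_load(num_nodes, num_packets, seed=0):
    """Count how many packets each node relays as the intermediate hop."""
    rng = random.Random(seed)
    load = Counter()
    for _ in range(num_packets):
        # Skewed input on purpose: every packet goes from node 0 to node 1.
        path = vlb_route(0, 1, num_nodes, rng)
        load[path[1]] += 1
    return load

if __name__ == "__main__":
    load = intermediate_load(num_nodes=8, num_packets=80_000)
    for node in range(8):
        print(node, load[node])
```

Even though every packet in this run has the same source and destination, the relay work spreads roughly evenly over all nodes, which is the property that keeps the per-server rate proportional to R rather than NR.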
Direct VLB
• Applies when traffic is observed to be uniform across nodes.
• Skips Phase I of VLB, whose only purpose is to randomize the inflow.
• Avoids the additional per-server processing cost of VLB: under uniform traffic the per-server rate is 2R.
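A minimal sketch of the Direct VLB idea follows. The per-pair `direct_quota` is an assumption introduced for illustration (the slides only state that Phase I is skipped when traffic is uniform), and the function names are hypothetical.

```python
import random

def direct_vlb_paths(packets, num_nodes, direct_quota, seed=0):
    """Route (src, dst) packets with a per-pair budget of direct sends.

    Up to `direct_quota` packets per (src, dst) pair skip Phase I and go
    straight to the destination; the rest fall back to two-phase VLB.
    """
    rng = random.Random(seed)
    used = {}  # (src, dst) -> packets already sent directly
    paths = []
    for src, dst in packets:
        if used.get((src, dst), 0) < direct_quota:
            used[(src, dst)] = used.get((src, dst), 0) + 1
            paths.append([src, dst])             # Phase I skipped
        else:
            mid = rng.randrange(num_nodes)       # random intermediate node
            paths.append([src, mid, dst])        # classic two-phase VLB
    return paths

def hops_per_packet(paths):
    """Average number of inter-server hops (path length minus one)."""
    return sum(len(p) - 1 for p in paths) / len(paths)

N = 4
# Uniform traffic: every node sends one packet to every other node.
uniform = [(s, d) for s in range(N) for d in range(N) if s != d]
# Skewed traffic: node 0 sends all 12 packets to node 1.
skewed = [(0, 1)] * 12

if __name__ == "__main__":
    print(hops_per_packet(direct_vlb_paths(uniform, N, direct_quota=1)))
    print(hops_per_packet(direct_vlb_paths(skewed, N, direct_quota=1)))
```

With the uniform input every packet fits inside its quota and takes a single hop; with the skewed input only the first packet goes direct and the remainder pay for the extra Phase I hop.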
Valiant Load Balanced Mesh
[Figure: Logical view of an 8-port VLB mesh. External lines attach to nodes 1 to 8; each node appears three times (n, n', n''), and each internal line carries R/8 bps.]

Topology: k-ary n-fly
• k = per-server fanout; n = log_k N
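The relation n = log_k N can be worked through with a short helper. This is only a sketch: the function name `nfly_stages` is hypothetical, and rounding up when N is not an exact power of k is an assumption, since the slide gives only the exact formula.

```python
def nfly_stages(num_ports, fanout):
    """Smallest n such that fanout**n >= num_ports (n = log_k N, rounded up)."""
    if fanout < 2 or num_ports < 1:
        raise ValueError("fanout must be >= 2 and num_ports >= 1")
    stages, reach = 0, 1
    # Integer arithmetic avoids floating-point error in math.log.
    while reach < num_ports:
        reach *= fanout
        stages += 1
    return stages

if __name__ == "__main__":
    for ports, k in [(8, 2), (64, 4), (1024, 32)]:
        print(ports, k, nfly_stages(ports, k))
```

A larger per-server fanout k reduces the number of stages a packet crosses, at the price of more interconnect ports on every server.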
Configurations (Feasibility Study)
• Current servers – 1 router port and 5 NICs per server (max N = 32 ports)
• Additional NICs – 1 router port and 20 NICs (max N = 128 ports)
• Faster servers – 2 router ports and 20 NICs (max N = 2048 ports)
Parallelism within server
Exploring the design considerations
• Server architecture
• Workload balancing across cores
• Single queue or multiple queues
• Batch processing
Server Architecture (Traditional)
Server Architecture (NUMA)
Load balancing across cores
• Accessing the queue to transmit/receive packets
   Rule: each network queue must be accessed by a single core
• Processing the packets
   Rule: a parallel approach must be adopted
Queuing technique
• Single queue or multiple queues
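The two rules and the multiple-queue option can be illustrated together. The sketch below uses Python threads as stand-ins for cores; the queue count, batch size, and flow-to-queue mapping are assumptions chosen for the example, not measured settings from the project.

```python
import queue
import threading

NUM_QUEUES = 4   # assumption: one receive queue per core
BATCH_SIZE = 8   # assumption: maximum packets taken per poll
STOP = None      # sentinel telling a worker to exit

def pick_queue(flow_id, num_queues=NUM_QUEUES):
    """Map a flow to a queue so one flow always lands on the same core."""
    return flow_id % num_queues

def worker(core_id, rx_queue, processed):
    """Only this worker ever reads rx_queue (one core per queue)."""
    done = False
    while not done:
        batch = [rx_queue.get()]                 # block for the first packet
        while len(batch) < BATCH_SIZE:           # then drain up to a batch
            try:
                batch.append(rx_queue.get_nowait())
            except queue.Empty:
                break
        for pkt in batch:
            if pkt is STOP:
                done = True
                break
            processed[core_id].append(pkt)       # stand-in for real processing

def run(packets):
    """Distribute (flow_id, payload) packets and return per-core results."""
    queues = [queue.Queue() for _ in range(NUM_QUEUES)]
    processed = [[] for _ in range(NUM_QUEUES)]
    threads = [threading.Thread(target=worker, args=(i, queues[i], processed))
               for i in range(NUM_QUEUES)]
    for t in threads:
        t.start()
    for flow_id, payload in packets:
        queues[pick_queue(flow_id)].put((flow_id, payload))
    for q in queues:
        q.put(STOP)
    for t in threads:
        t.join()
    return processed
```

Because each queue has exactly one consumer, packets of a flow are handled in arrival order without any locking between cores, while the cores still work in parallel on different queues.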
Parallelism within server
Exploring the design considerations
• Server architecture
• Division of work among cores
• Single queue or multiple queues
• Batch processing
Experimental Results
• Machines used: para1 – para8.utdallas.edu
• Mesh interconnect of up to 16 nodes
• Valiant Load Balancing
• MPI with multi-threading
• Threads on each node:
   • Packet generator (data-in)
   • Packet receiver within the system
   • Table lookup and sender (data-out)
• MPI source code
• Results
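The three thread roles listed above can be sketched in a single process. This is not the project's MPI source: `queue.Queue` objects stand in for MPI messages between nodes, and the forwarding table contents and all function names are hypothetical.

```python
import queue
import threading

# Hypothetical forwarding table: destination address -> output node.
FORWARDING_TABLE = {"10.0.0.1": 0, "10.0.0.2": 1, "10.0.0.3": 2}
STOP = None  # sentinel marking the end of the packet stream

def packet_generator(out_q, destinations):
    """data-in: inject one packet per destination address, then stop."""
    for seq, dst in enumerate(destinations):
        out_q.put({"seq": seq, "dst": dst})
    out_q.put(STOP)

def lookup_and_send(in_q, links):
    """data-out: look up the output node and put the packet on its link."""
    while True:
        pkt = in_q.get()
        if pkt is STOP:
            for link in links.values():
                link.put(STOP)       # tell every receiver to finish
            return
        links[FORWARDING_TABLE[pkt["dst"]]].put(pkt)

def packet_receiver(link, delivered):
    """Receiver: drain the packets that arrive over one internal link."""
    while True:
        pkt = link.get()
        if pkt is STOP:
            return
        delivered.append(pkt["seq"])

def run_node(destinations):
    """Run generator, lookup/sender, and one receiver per output node."""
    in_q = queue.Queue()
    links = {n: queue.Queue() for n in set(FORWARDING_TABLE.values())}
    delivered = {n: [] for n in links}
    threads = [
        threading.Thread(target=packet_generator, args=(in_q, destinations)),
        threading.Thread(target=lookup_and_send, args=(in_q, links)),
    ]
    threads += [threading.Thread(target=packet_receiver,
                                 args=(links[n], delivered[n])) for n in links]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return delivered
```

In an MPI version the links would presumably be replaced by send and receive calls between ranks, with the MPI library initialized for multi-threaded use.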
Experiment Results
[Figure: No. of nodes v/s Rate. Processing rate (Mbps), axis from 0 to 6000, plotted against number of nodes, axis from 0 to 20.]
Conclusion
Performance
• Competitive performance when compared to a hardware router.
Cost
• 20-port 10 Gbps ~ 13K
Programmability
• Use existing MPI library for implementation
Relaxed performance guarantee
References
• RouteBricks: Exploiting Parallelism To Scale Software Routers, Mihai Dobrescu, Norbert Egi, Katerina Argyraki, Byung-Gon Chun, Kevin Fall, Gianluca Iannaccone, Allan Knies, Maziar Manesh, Sylvia Ratnasamy, 22nd ACM Symposium on Operating Systems Principles (SOSP), October 2009.
• Can Software Routers Scale?, Katerina Argyraki, Salman Baset, Byung-Gon Chun, Kevin Fall, Gianluca Iannaccone, Allan Knies, Eddie Kohler, Maziar Manesh, Sergiu Nedevschi, Sylvia Ratnasamy, ACM SIGCOMM Workshop - PRESTO, August 2008.
• Next Generation Intel Microarchitecture (Nehalem), White Paper, Intel Corp., 2008.