
Choosing a Proxy
Don’t roll the D20!
Leif Hedstrom
Cisco WebEx
Who am I?
• Unix developer since 1985
• Yeah, I’m really that old, I learned Unix on BSD 2.9
• Long time SunOS/Solaris/Linux user
• Mozilla committer (but not active now)
• VP of Apache Traffic Server PMC
• ASF member
• Overall hacker, geek and technology addict
[email protected]
@zwoop
+lhedstrom
So which proxy cache should you choose?
Plenty of Proxy Servers
PerlBal
And plenty of “reliable” sources…
Answer: the one that solves your problem!
http://mihaelasharkova.files.wordpress.com/2011/05/5steploop2
But first…
• While you are still awake, and the coffee is fresh:
My crash course in HTTP proxy and caching!
Forward Proxy
Reverse Proxy
Intercepting Proxy
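From the client's point of view, the three modes differ mainly in who knows about the proxy. A rough sketch with curl, where proxy.example.com and www.example.com are placeholder names:

# Forward proxy: the client is explicitly configured to send requests
# through the proxy (curl's -x/--proxy option).
curl -x http://proxy.example.com:3128 http://www.example.com/

# Reverse proxy: the client talks to what it believes is the origin server;
# DNS (or a load balancer) simply points www.example.com at the proxy.
curl http://www.example.com/

# Intercepting proxy: the client is configured for neither; traffic to port 80
# is redirected to the proxy at the network layer (e.g. by a router or firewall),
# so from the client side the request looks just like the reverse proxy case.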
Why Cache is King
• The fastest content to serve is the data the user already has locally on their computer/browser
  – This is near zero cost and zero latency!
• The speed of light is still a limiting factor
  – Reduce the latency -> faster page loads
• Serving out of cache is computationally cheap
  – At least compared to e.g. PHP or any other higher-level page generation system
  – It's easy to scale caches horizontally (a quick way to check for cache hits is sketched below)
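A minimal way to see whether responses are actually coming out of a cache, assuming the proxy sets the usual headers (Age and Via are standard; X-Cache is a common but non-standard convention, and exact names vary by product):

# Hit the same URL twice; on the second request a non-zero Age header
# (or an X-Cache: HIT style header) indicates the object was served from cache.
curl -s -D - -o /dev/null http://www.example.com/ | grep -i -E '^(age|via|x-cache)'
curl -s -D - -o /dev/null http://www.example.com/ | grep -i -E '^(age|via|x-cache)'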
Choosing an intermediary
SMP Scalability and performance
Ease of use
Extensible
HTTP/1.1 Features
Plenty of Proxy Servers
PerlBal
Plenty of Free Proxy Servers
PerlBal
Plenty of Free Caching Proxy Servers
Choosing an intermediary
SMP Scalability and performance
The problem
• You basically cannot buy a computer today with fewer than 2 CPUs or cores
• Things will only get "worse"!
  – Well, really, it's getting better
• Typical server deployments today have at least 8 – 16 cores
  – How many of those can you actually use??
  – And are you using them efficiently??
• NUMA turns out to be kind of a bitch… (a quick way to inspect what you have is shown below)
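If you want to see what you are actually working with, a couple of standard Linux commands will tell you (numactl may need to be installed separately):

nproc                                # number of logical CPUs
lscpu | grep -E 'Socket|Core|NUMA'   # sockets, cores per socket, NUMA nodes
numactl --hardware                   # NUMA node and memory layout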
Solution 1: Multi-threading
[Diagram: on a single CPU, threads 1–3 are interleaved over time; on a dual CPU they run in parallel across cores]
Problems with multi-threading
• It’s a wee bit difficult to get it right!
http://www.flickr.com/photos/stuartpilbrow/3345896050
Problems with multi-threading
"When two trains approach each other at a
crossing, both shall come to a full stop
and neither shall start up again until
the other has gone."
From Wikipedia: an illogical statute passed by the Kansas legislature.
Solution 2: Event Processing
[Diagram: scheduled events, network events and disk I/O events are placed on a queue; the event loop dispatches them to handlers (accept handler, disk handler, HTTP state machine), which can generate new events]
Problems with Event Processing
• It hates blocking APIs and calls!
  – Hating it back doesn't help :/
• Still somewhat complicated
• It doesn't scale on SMP by itself
Where are we at?

            Apache TS        Nginx     Squid     Varnish
Processes   1                1 - <n>   1 - <n>   1
Threads     Based on cores   1         1         Lots
Evented     Yes              Yes       Yes       Yes *)

*) Can use blocking calls, with (large) thread pool
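A rough way to check the table above on a running Linux box; the binary names used here are the typical defaults and may differ between distributions and packaging:

# Print process and thread counts for each daemon (if it is running).
for p in traffic_server nginx squid varnishd; do
  procs=$(pgrep -c -x "$p")
  threads=$(ps -L -C "$p" --no-headers | wc -l)
  echo "$p: $procs process(es), $threads thread(s)"
done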
Proxy Cache test setup
• AWS Large instances, 2 CPUs
• All on RFC 1918 network ("internal" net)
• 8GB RAM
• Access logging enabled to disk (except on Varnish)
• Software versions
  – Linux v3.2.0
  – Traffic Server v3.3.1
  – Nginx v1.3.9
  – Squid v3.2.5
  – Varnish v3.0.3
• Minimal configuration changes
• Cache a real (Drupal) site
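The deck does not say which load generator produced the QPS and latency numbers that follow; as a hedged sketch, something like ApacheBench pointed at the proxy's front end would measure both (the address below is a placeholder on the RFC 1918 test network):

# Hypothetical load run: 100,000 keep-alive requests at concurrency 100;
# ab reports requests/second and latency percentiles.
ab -k -n 100000 -c 100 http://10.0.0.1/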
ATS configuration
• etc/trafficserver/remap.config:
map / http://10.118.154.58
• etc/trafficserver/records.config:
CONFIG proxy.config.http.server_ports STRING 80
Nginx configuration try 1, basically defaults (broken, don't use)
worker_processes 2;
access_log logs/access.log main;
proxy_cache_path /mnt/nginx_cache levels=1:2 keys_zone=my-cache:8m \
max_size=16384m inactive=600m;
proxy_temp_path /mnt/nginx_temp;
server {
    listen 80;
    location / {
        proxy_pass http://10.83.145.47/;
        proxy_cache my-cache;
    }
}
Nginx configuration try 2 (works but really slow, 10x slower)
worker_processes 2;
access_log logs/access.log main;
proxy_cache_path /mnt/nginx_cache levels=1:2 keys_zone=my-cache:8m \
max_size=16384m inactive=600m;
proxy_temp_path /mnt/nginx_temp;
gzip on;
server {
    listen 80;
    location / {
        proxy_pass http://10.83.145.47/;
        proxy_cache my-cache;
        proxy_set_header Accept-Encoding "";
    }
}
Nginx configuration try 3 (works and reasonably fast, but WTF!)
worker_processes 2;
access_log logs/access.log main;
proxy_cache_path /mnt/nginx_cache levels=1:2 keys_zone=my-cache:8m \
max_size=16384m inactive=600m;
proxy_temp_path /mnt/nginx_temp;
server {
    listen 80;

    set $ae "";
    if ($http_accept_encoding ~* gzip) {
        set $ae "gzip";
    }

    location / {
        proxy_pass http://10.83.145.47/;
        proxy_cache my-cache;
        proxy_set_header If-None-Match "";
        proxy_set_header If-Modified-Since "";
        proxy_set_header Accept-Encoding $ae;
        proxy_cache_key $uri$is_args$args$ae;
    }

    location ~ /purge_it(/.*) {
        proxy_cache_purge my-cache $1$is_args$args$ae;
    }
}
Thanks to Chris Ueland at NetDNA for the snippet
Squid configuration
http_port 80 accel
http_access allow all
cache_mem 4096 MB
workers 2
memory_cache_shared on
cache_dir ufs /mnt/squid 100 16 256
cache_peer 10.83.145.47 parent 80 0 no-query originserver
Varnish configuration
backend default {
    .host = "10.83.145.47";
    .port = "80";
}
Performance AWS 8KB HTML (gzip)
[Chart: throughput (QPS) and time to first response (ms) for ATS 3.3.1, Nginx 1.3.9 hack, Squid 3.2.5, Varnish 3.0.3, and Varnish 3.0.3 with varnishlog -w]
Performance AWS 8KB HTML (gzip)
[Chart: throughput (QPS) and CPU usage on the dual-core instance for the same five setups]
Performance AWS 500 bytes JPG
[Chart: throughput (QPS) and time to first response (ms) for ATS 3.3.1, Nginx 1.3.9 hack, Squid 3.2.5, Varnish 3.0.3, and Varnish 3.0.3 with varnishlog -w]
Performance AWS 500 bytes JPG
[Chart: throughput (QPS) and CPU usage on the dual-core instance for the same five setups]
Choosing an intermediary
HTTP/1.1 Features
RFC 2616 is not optional!
• Neither is the new BIS revision!
• Understanding HTTP and how it relates to proxying and caching is important
  – Or you will get it wrong! I promise.
How things can go wrong: Vary!
$ curl -D - -o /dev/null -s --compress http://10.118.73.168/
HTTP/1.1 200 OK
Server: nginx/1.3.9
Date: Wed, 12 Dec 2012 18:00:48 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 8051
Connection: keep-alive
X-Powered-By: PHP/5.4.9
X-Drupal-Cache: HIT
Etag: "1355334762-0-gzip"
Content-Language: en
X-Generator: Drupal 7 (http://drupal.org)
Cache-Control: public, max-age=900
Last-Modified: Wed, 12 Dec 2012 17:52:42 +0000
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Vary: Cookie,Accept-Encoding
Content-Encoding: gzip
How things can go wrong: Vary!
$ curl -D - -o /dev/null -s http://10.118.73.168/
HTTP/1.1 200 OK
Server: nginx/1.3.9
Date: Wed, 12 Dec 2012 18:00:57 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 8051
Connection: keep-alive
X-Powered-By: PHP/5.4.9
X-Drupal-Cache: HIT
Etag: "1355334762-0-gzip"
Content-Language: en
X-Generator: Drupal 7 (http://drupal.org)
Cache-Control: public, max-age=900
Last-Modified: Wed, 12 Dec 2012 17:52:42 +0000
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Vary: Cookie,Accept-Encoding
Content-Encoding: gzip
EPIC FAIL! Note: this second client sent no Accept-Encoding (no gzip support), yet was handed the gzip-compressed variant anyway.
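A simple way to probe a cache's Vary handling, along the same lines as the two requests above (proxy.example.com is a placeholder): ask for the same URL with and without gzip support and compare what comes back.

# Client that accepts gzip:
curl -s -D - -o /dev/null -H 'Accept-Encoding: gzip' http://proxy.example.com/ \
  | grep -i -E '^(content-encoding|content-length|vary)'

# Client with no gzip support: a Vary-correct cache must hand back an
# uncompressed variant (no Content-Encoding: gzip, a larger Content-Length).
curl -s -D - -o /dev/null http://proxy.example.com/ \
  | grep -i -E '^(content-encoding|content-length|vary)'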
What type of proxy do you need?
• Of our candidates, only two fully support all proxy modes!
CoAdvisor HTTP protocol quality tests for reverse proxies
[Chart: per-proxy test results broken into successes, violations and failures; success share — Squid 3.2.5 81%, ATS 3.1.3 68%, Nginx 1.3.9 51%, Varnish 3.0.3 49%]
CoAdvisor HTTP protocol quality tests for reverse proxies
[Chart: same tests, highlighting the failure share — Squid 3.2.5 6%, ATS 3.1.3 15%, Varnish 3.0.3 25%, Nginx 1.3.9 27%]
Choosing an intermediary
Ease of use
Extensible
My subjective opinions
ATS – The good
• Good HTTP/1.1 support, including SSL
• Tunes itself very well to the system / hardware at hand
• Excellent cache features and performance
  – Raw disk cache is fast and resilient
• Extensible plugin APIs, quite a few plugins
• Used and developed by some of the largest Web companies in the world
ATS – The bad
• Load balancing is incredibly lame
• Seen as difficult to set up (I obviously disagree)
• Developer community is still too small
• Code is complicated
  – By necessity? Maybe …
ATS – The ugly
• Too many configuration files!
• There's still legacy code that has to be replaced or removed
• Not a whole lot of commercial support
  – But there's hope (e.g. OmniTI recently announced packaged support)
Nginx – The good
• Easy to understand the code base and software architecture
  – Lots of plugins available, including SPDY
• Excellent Web and Application server
  – E.g. Nginx + fpm (fcgi) + PHP is the awesome, according to a very reputable source
• Commercial support available from the people who wrote it and know it best. Huge!
Nginx – The bad
• Adding extensions implies rebuilding the binary
• By far the most configuration required "out of the box" to do anything remotely useful
• It does not make good attempts to tune itself to the system
• No good support for conditional requests
Nginx – The ugly
• The cache is a joke! Really
• The protocol support as an HTTP proxy is rather poor. It fares the worst in the tests, and can be outright wrong if you are not very careful
• From the docs: "nginx does not handle "Vary" headers when caching." Seriously?
Squid – The Good
• Has by far the most HTTP features of the bunch. I mean, by far; nothing comes even close
• It is also the most HTTP-conformant proxy today, with the best scores in the CoAdvisor tests by a wide margin
• The features are mature, and used pretty much everywhere
• Works pretty well out of the box
Squid – The Bad
• Old code base
• Cache is not particularly efficient
• Has traditionally been prone to instability
• Complex configurations
  – At least IMO, I hate it
Squid – The Ugly
• SMP is quite an afterthought
  – Duct tape
• Why spend so many years rewriting from v2.x to v3.x without actually addressing some of the real problems? Feels like a boat has been missed…
• Not very extensible
  – Typically you write external "helper" processes, similar to fcgi. This is neither particularly flexible nor powerful (you cannot do everything you'd want as a helper, so you might have to rewrite the Squid core)
Varnish – The Good
• VCL
• And did I mention VCL? Pure genius!
• Very clever logging mechanism
• ESI is cool, even with its limited subset
  – Not unique to Varnish though
• Support from several good commercial entities
Varnish – The Bad
• Letting the kernel do the hard work might seem like a good idea on paper, but is perhaps not so great in the real world. But let's not go into a BSD vs Linux kernel war…
• Persistent caching seems like an afterthought at best
• No good support for conditional requests
• What impact does "real" logging have on performance?
Varnish – The Ugly
• There are a lot of threads in this puppy!
• No SSL. And presumably, there never will be?
  – So what happens with SPDY / HTTP/2?
• Protocol support is weak without a massive amount of VCL
• And you will probably need a PhD in VCL!
  – There's a lot of VCL hacking to do to get it to behave well
Summary
• Please understand your problem
  – Don't listen to @zwoop on twitter…
• Performance in itself is rarely a key differentiator; latency, features and correctness are
• But most important: use a proxy, preferably a good one, if you run a serious web server
Performance AWS 8KB HTML (gzip)
[Chart: throughput (QPS) and time to first response (ms) for ATS 3.3.1, Nginx 1.3.9 hack, Squid 3.2.5, Varnish 3.0.3, Varnish 3.0.3 with varnishlog -w, and plain Nginx 1.3.9]
If it ain’t broken, don’t fix it
But by all means, make it less sucky!
However, when all you have is a hammer…