CMO & CTO

Closing the Bridge Between Marketing and Technology, By Luis Fernandez


Tuning Tomcat for High Throughput

Posted on April 9, 2011 By Luis Fernandez

Are your Tomcat threads pegged, CPU hot, and response times creeping up the moment traffic gets real? Did a harmless spike from a promo email or a trending tweet turn your shiny app into a queue of timeouts and angry logs? If you are wondering whether Tomcat is slow by nature or if a few knobs can unlock high throughput, this is for you. I have been tuning Tomcat in messy production boxes, on beefy metal and on early cloud nodes, and the same lessons keep paying off.

Short answer: Tomcat can push a lot more than most defaults suggest.

Pick the right Connector first

Most teams start with the default HTTP connector and never look back. That works until it does not. Tomcat 7 ships an NIO connector that handles many keep alive clients and long poll traffic well. APR with the native library is great when SSL is busy or when you push static files straight from Tomcat. If you are fronted by Apache httpd with mod_jk or mod_proxy_ajp, AJP keeps the wires short. The point is simple: connector choice should map to your traffic shape. Lots of small keep alive requests? Pick NIO. Heavy SSL? Consider APR. Plain proxy from nginx or HAProxy? Keep HTTP and keep it simple.

Do not overthink it. Start with HTTP and switch when the pattern is clear.

<!-- server.xml examples -->
<Connector port="8080"
           protocol="org.apache.coyote.http11.Http11NioProtocol"
           maxThreads="400"
           acceptCount="1000"
           connectionTimeout="8000"
           keepAliveTimeout="5000"
           maxKeepAliveRequests="100"
           compression="on"
           compressableMimeType="text/html,text/css,application/json,application/javascript"
           URIEncoding="UTF-8" />

<Connector port="8009"
           protocol="AJP/1.3"
           maxThreads="300"
           acceptCount="800"
           URIEncoding="UTF-8" />

Threads and the accept queue

maxThreads is the workhorse. It caps concurrent request processing. On a modern quad core with decent I/O and a JVM that is not thrashing, 200 to 400 threads is a good starting lane. If your app spends time waiting on databases or web services, you can go higher. If it is CPU bound, do not. acceptCount is the bouncer at the door. When all threads are busy, new connections pile up here. Keep it large so the kernel does not refuse new clients while Tomcat is clearing work. If you see many 503s under bursts, you likely hit a small accept queue or too few threads.
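For a rough starting point, Little's Law gives you a sizing sketch: concurrency needed is roughly arrival rate times average service time. The request rate and latency below are made-up illustrative numbers, not measurements from any particular box.

```shell
# Little's Law sketch: threads needed ~= requests/sec * avg service time (sec)
# Illustrative numbers only: 800 req/s at 250 ms average service time.
REQS_PER_SEC=800
AVG_LATENCY_MS=250
THREADS=$(( REQS_PER_SEC * AVG_LATENCY_MS / 1000 ))
echo "rough maxThreads starting point: $THREADS"   # prints 200
```

Treat the result as a floor to test from, not a target; measure under load and adjust.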

Watch Busy Threads in JMX while load testing and move with data.

<Connector ... maxThreads="400" acceptCount="1000" />
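If you want to read busy threads without attaching a desktop JMX client, one option is the manager app's jmxproxy servlet, assuming you have the manager deployed and a user granted the manager-jmx role. The host and credentials here are placeholders.

```
# query the connector thread pool over HTTP via the manager's jmxproxy
# (assumes the manager webapp is deployed and the user has the manager-jmx role)
curl -u admin:secret "http://localhost:8080/manager/jmxproxy/?qry=Catalina:type=ThreadPool,name=*"
# compare currentThreadsBusy against maxThreads in the output
```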

Keep alive and timeouts decide how many sockets you can carry

Keep alive saves CPU and wins throughput when clients pipeline a few requests. It also keeps sockets open doing nothing if the timeout is generous. Set keepAliveTimeout tight around 3 to 5 seconds for typical web traffic behind a proxy. maxKeepAliveRequests can be 100 or more. If you run very chatty pages with many small assets and no proxy, go a bit higher. If you host slow clients on flaky networks, put a proxy in front and let it babysit them, not Tomcat.

Short waits keep the line moving.

<Connector ... keepAliveTimeout="5000" maxKeepAliveRequests="100" connectionTimeout="8000" />

Static files, sendfile, and the proxy question

If Tomcat is serving images, downloads, or large CSS and JS, either put nginx or Apache httpd in front or turn on useSendfile. Kernel sendfile lets the OS push bytes from disk to socket without bouncing through user space. It saves CPU and boosts throughput. It shines for large files and simple static paths. If your app streams from a database or does on the fly gzip, sendfile is out and a proxy will help a lot.

Tomcat is great at servlet work; do not make it your CDN.
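As a sketch, enabling sendfile explicitly on the NIO connector looks like the fragment below. Note that useSendfile defaults to on for the NIO and APR connectors, and that compressing a response disables sendfile for it, so pick one per traffic type.

```
<!-- sketch: explicit sendfile on the NIO connector; compression and
     sendfile are mutually exclusive for a given response -->
<Connector port="8080"
           protocol="org.apache.coyote.http11.Http11NioProtocol"
           useSendfile="true"
           compression="off" />
```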

JVM flags that matter for Tomcat performance

Heap size comes first. Set Xms and Xmx the same so the JVM does not resize under load. For big apps with many JSPs, PermGen needs love. CMS tends to keep pauses low for web traffic while Parallel can push peak throughput for pure compute. For most web stacks, CMS is a sweet spot. Turn on GC logs and confirm. Use VisualVM or JConsole to watch Old Gen during a load run. If you see Full GC spikes, increase heap or fix object churn.

Do not paste a flag zoo. Add one thing, test, then keep it or drop it.

# setenv.sh
export CATALINA_OPTS="\
 -Xms2048m -Xmx2048m \
 -XX:PermSize=256m -XX:MaxPermSize=256m \
 -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled \
 -XX:+UseParNewGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly \
 -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/var/log/tomcat/gc.log"
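Once the GC log is on, a first triage during a load run can be as simple as grepping it. The path matches the -Xloggc flag above.

```
# count Full GC events during a load run, then eyeball the most recent ones
grep -c "Full GC" /var/log/tomcat/gc.log
grep "Full GC" /var/log/tomcat/gc.log | tail -5
```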

Linux sockets and file descriptor basics

Tomcat cannot accept connections without file descriptors. Set your limits high and match the kernel queues to your acceptCount. Raise ulimit -n for the tomcat user. Bump somaxconn and the SYN backlog. Grow the ephemeral port range. Shorten TIME_WAIT with care. tcp_tw_recycle breaks clients behind NAT, so do not enable it if you have proxies or mobile users coming through carrier NAT. With these set, your connector queue will behave the way you intend.

Change one value, run a quick smoke test, then keep going.

# /etc/security/limits.conf
tomcat  soft  nofile  65535
tomcat  hard  nofile  65535

# /etc/sysctl.conf
net.core.somaxconn = 1024
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.ip_local_port_range = 10240 65535
net.ipv4.tcp_fin_timeout = 15
net.core.netdev_max_backlog = 5000
net.ipv4.tcp_tw_reuse = 1
# net.ipv4.tcp_tw_recycle = 0   # keep this off to avoid NAT pain

# apply
sysctl -p

Load test like you mean it

Pick a simple scenario and push it hard. ab, siege, and JMeter all work. Warm the JVM, then record p95 and p99, not just the average. Turn on the AccessLogValve with response time logging (%D, in milliseconds) to get a truth feed from the server side. Watch GC logs and JMX at the same time. If you are on EC2 or another cloud, test across availability zones because network jitter matters more than we wish.

One knob at a time or you will not know what fixed what.

# quick keep alive test
ab -k -n 100000 -c 200 http://yourhost:8080/health

# access log with response time in milliseconds (%D)
<Valve className="org.apache.catalina.valves.AccessLogValve"
       directory="logs" prefix="access" suffix=".log"
       pattern="%h %l %u %t &quot;%r&quot; %s %b %D" />
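With %D as the last field of each access log line, p95 and p99 can be pulled straight out with sort and awk. The log path is whatever your valve writes; adjust to taste.

```shell
# p95/p99 of the last column (%D, response time) from an access log
awk '{print $NF}' logs/access.log | sort -n | \
  awk '{a[NR]=$1} END {print "p95:", a[int(NR*95/100)]; print "p99:", a[int(NR*99/100)]}'
```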

Common traps that kill throughput

Too many threads will make the JVM spend time context switching while caches go cold. Slow SQL will back up every connector no matter how you tune Tomcat. Session replication with DeltaManager is heavy and burns CPU. If you run a cluster, prefer sticky sessions and keep the session tiny or stateless. Overzealous compression on already compressed content wastes cycles. Chatty logging at debug inside hot code paths will melt your throughput and your disks. Keep logs at info and add targeted debug only when you need it.
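For the logging trap, Tomcat's stock JULI logging lets you pin levels per package in conf/logging.properties and open targeted debug only while investigating. The application package name below is a placeholder, not from any real app.

```
# conf/logging.properties (excerpt, sketch)
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].level = INFO
# hypothetical application package, placeholder name:
com.example.app.level = INFO
# flip on only while chasing a specific problem, then turn it back off:
# com.example.app.checkout.level = FINE
```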

Profile the app before you blame the container.

Throughput is mostly about removing waits.

Filed under: Software Architecture, Software Engineering
