Today I played around with the Node.js Cluster module. Node.js runs your JavaScript in a single thread, and while Node.js is very fast, a single process can't take full advantage of a multi-core system. The Cluster module allows you to fork your application into multiple worker processes that all listen on the same port and handle requests for you.
Cluster Setup
To build a small cluster serving your Ghost blog, you can use the following code:
var cluster = require('cluster'),
    numCPUs = require('os').cpus().length,
    ghost = require('ghost');

if (cluster.isMaster) {
    // Fork one worker per CPU core.
    for (var i = 0; i < numCPUs; i++) {
        cluster.fork();
    }

    cluster.on('exit', function (worker, code, signal) {
        console.log('Worker ' + worker.process.pid + ' died');
    });
} else {
    // Each worker boots its own Ghost instance; the Cluster
    // module lets all of them share the same server port.
    ghost().then(function (app) {
        app.start();
    });
}
This code forks the process once for each CPU core your machine has. On my test machine numCPUs is 4 (the i7 is a dual-core CPU with Hyper-Threading, which reports as four cores). This means that instead of a single running Ghost instance I end up with four. The Cluster module makes it possible for all instances to share the same server port.
Results
I ran two benchmarks: one without the Cluster module and one with it. Both benchmarks were run on a MacBook Pro (Retina, 13-inch, Late 2013, 2.8 GHz Intel Core i7) with 16 GB 1600 MHz DDR3 and OS X 10.9.4. Apache Benchmark (ab) was used to load test Ghost with 20 concurrent requests and 100,000 requests in total.
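The corresponding ab invocation looks something like this (the URL is an assumption on my part; 2368 is Ghost's default port):

# 100,000 requests in total, 20 concurrent; adjust host/port
# to wherever your Ghost instance is listening.
ab -n 100000 -c 20 http://127.0.0.1:2368/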
Without Cluster
Concurrency Level: 20
Time taken for tests: 1192.632 seconds
Complete requests: 100000
Failed requests: 0
Write errors: 0
Total transferred: 252000000 bytes
HTML transferred: 227700000 bytes
Requests per second: 83.85 [#/sec] (mean)
Time per request: 238.526 [ms] (mean)
Time per request: 11.926 [ms] (mean, across all concurrent requests)
Transfer rate: 206.35 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.1 0 30
Processing: 134 238 190.1 190 4045
Waiting: 134 238 190.1 190 4044
Total: 134 238 190.1 190 4045
Percentage of the requests served within a certain time (ms)
50% 190
66% 199
75% 209
80% 226
90% 282
95% 545
98% 755
99% 961
100% 4045 (longest request)
With Cluster
Concurrency Level: 20
Time taken for tests: 523.845 seconds
Complete requests: 100000
Failed requests: 0
Write errors: 0
Total transferred: 251975561 bytes
HTML transferred: 227700000 bytes
Requests per second: 190.90 [#/sec] (mean)
Time per request: 104.769 [ms] (mean)
Time per request: 5.238 [ms] (mean, across all concurrent requests)
Transfer rate: 469.74 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.3 0 41
Processing: 15 104 38.6 100 669
Waiting: 15 104 38.5 100 669
Total: 16 105 38.6 100 669
Percentage of the requests served within a certain time (ms)
50% 100
66% 114
75% 123
80% 130
90% 151
95% 173
98% 203
99% 227
100% 669 (longest request)
You can easily see that the Cluster module increased the throughput by a factor of about 2.3 (190.90 vs. 83.85 requests per second) and was able to serve 50% of the requests within 100 ms.
Handling Failure
In the event that one of the worker processes quits unexpectedly, it would be nice to automatically spawn a replacement. As you might have noticed in the code above, the cluster emits an exit event whenever a worker process dies. We can use this event to fork a new worker process:
cluster.on('exit', function (worker, code, signal) {
    console.log('Worker ' + worker.process.pid + ' died');
    // Fork a new worker to replace the dead one.
    cluster.fork();
});
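One caveat: blindly re-forking also respawns workers that were shut down on purpose, and a worker that crashes right after boot will be restarted in a tight loop. As a minimal sketch of a refinement, you could only replace workers that exited with a non-zero exit code:

cluster.on('exit', function (worker, code, signal) {
    // Only respawn workers that actually crashed; a clean
    // shutdown (exit code 0) is left alone.
    if (code !== 0) {
        console.log('Worker ' + worker.process.pid +
            ' crashed (code ' + code + '), forking a new one');
        cluster.fork();
    }
});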
Conclusion
The Cluster module provides a very easy way to scale your Node.js application and achieve better performance on a multi-core system, and it doesn't require any changes to your application. Obviously, the module will only help you if the bottleneck is your application itself and not the database backend.
Note that the Cluster module is currently marked as experimental and may change at any time in the future. In fact, Node v0.12 will introduce round-robin scheduling to improve how requests are distributed among the workers. You can read more about that on the StrongLoop blog.
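If you want to experiment with the new behavior, it is exposed through the cluster module's schedulingPolicy setting in the v0.11 development builds; treat this as a sketch, since the details may still change before v0.12:

var cluster = require('cluster');

// Opt in to round-robin scheduling; this must be set before
// any workers are forked. Alternatively, set the environment
// variable NODE_CLUSTER_SCHED_POLICY=rr.
cluster.schedulingPolicy = cluster.SCHED_RR;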