Detected a potential I/O issue
[Charts, each plotted over the profiling interval: CPU Usage (%); Memory Usage (MB) showing RSS, Total Heap Allocated and Heap Used; Event Loop Delay (ms); Active Handles; Event Loop Utilization (%)]
Doctor has found a potential I/O issue:
  • There may be long-running asynchronous activities
  • This can mean that the bottleneck is not the Node process at all, but rather an I/O operation
  • Diagnose: Use clinic bubbleprof to explore asynchronous delays – run clinic bubbleprof -h to get started.

Understanding the analysis

Node.js provides a platform for non-blocking I/O. Unlike languages that typically block for I/O (e.g. Java, PHP), Node.js passes I/O operations to an accompanying C++ library (libuv), which delegates these operations to the operating system.

Once an operation is complete, the notification bubbles up from the OS through libuv, which can then trigger any JavaScript functions (callbacks) registered for that operation. This is the typical flow for any asynchronous I/O. Sync APIs, by contrast, block the process until the operation completes and should never be used in a server/service request-handling context.
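The difference between the two flows can be seen in a minimal sketch using the built-in fs module. The scratch file name and the `order` array are assumptions made up for this illustration; they are not part of Clinic.js.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical scratch file, created only for this illustration.
const file = path.join(os.tmpdir(), `clinic-io-example-${process.pid}.txt`);
fs.writeFileSync(file, 'hello');

// Records the order in which the steps below actually happen.
const order = [];

// Asynchronous: the read is handed off and the callback runs later,
// once the operation has completed and the current script has finished.
fs.readFile(file, 'utf8', (err, data) => {
  if (err) throw err;
  order.push(`async read done: ${data}`);
  fs.unlinkSync(file); // clean up the scratch file
  console.log(order);
});

// Runs before the callback above, because readFile does not block.
order.push('after scheduling async read');

// Synchronous: nothing else can run in this process until the read returns.
const data = fs.readFileSync(file, 'utf8');
order.push(`sync read done: ${data}`);
```

The callback passed to `fs.readFile` is always the last step to run here, even though it was registered first, because it can only fire after the synchronous code has returned control to the event loop.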

The profiled process has been observed to be unusually idle under load. Typically this means it is waiting for external I/O, because there is nothing else for it to do until the I/O completes.
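One way to see this kind of idleness outside of Doctor is to sample event loop utilization around a piece of work, using `performance.eventLoopUtilization()` from Node's perf_hooks module. The sketch below is an illustration under stated assumptions and not necessarily how Doctor takes its measurements; `slowIo` and `measureUtilization` are hypothetical helpers, and the timer stands in for a slow external call.

```javascript
const { performance } = require('node:perf_hooks');

// Hypothetical stand-in for a slow external call (database, HTTP, disk).
const slowIo = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: samples event loop utilization before and after `work`.
async function measureUtilization(work) {
  const before = performance.eventLoopUtilization();
  await work();
  // Passing the earlier sample returns the delta between the two samples.
  return performance.eventLoopUtilization(before);
}

measureUtilization(() => slowIo(200)).then((elu) => {
  // While the process only waits, idle time dominates and utilization stays low.
  console.log(`idle: ${elu.idle.toFixed(1)} ms, active: ${elu.active.toFixed(1)} ms`);
  console.log(`utilization: ${(elu.utilization * 100).toFixed(1)}%`);
});
```

A process that spends most of a request waiting like this shows low utilization and low CPU usage even when it is under load.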

To solve I/O issues we have to track down the asynchronous call(s) which are taking an abnormally long time to complete.

I/O root cause analysis is mostly a reasoning exercise. Clinic.js Bubbleprof is a tool developed specifically to inform and ease this kind of reasoning.

Next Steps

  • Use clinic bubbleprof to create a diagram of the application's asynchronous flow.
    • See clinic bubbleprof --help for how to generate the profile
    • Visit the Bubbleprof walkthrough for a guide on how to use and interpret this output
  • Explore the Bubbleprof diagram. Look for long lines and large circles representing persistent delays, then drill down to reveal the lines of code responsible
  • Pay particular attention to "userland" delays, originating from code in the profiled application itself.
  • Identify possible optimization targets using knowledge of the application's I/O touch points (the I/O to and from the Node.js process, such as databases, network requests, and filesystem access). For example:
    • Look for operations in series which could be executed in parallel
    • Look for slow operations that can be optimised externally (for example with caching or indexing)
    • Consider whether a large process has good reasons for being almost constantly in the queue (for example, some server handlers)
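As an illustration of the "series versus parallel" point above, the following sketch compares two ways of awaiting independent operations. `fetchUser`, `fetchOrders` and `timed` are hypothetical helpers, and the 50 ms timers stand in for real database or network calls.

```javascript
// Hypothetical stand-in for I/O latency.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical I/O calls that do not depend on each other's results.
async function fetchUser(id) {
  await sleep(50);
  return { id, name: `user-${id}` };
}

async function fetchOrders(userId) {
  await sleep(50);
  return [{ userId, total: 10 }];
}

// In series: the second call does not start until the first has finished.
async function loadInSeries(id) {
  const user = await fetchUser(id);
  const orders = await fetchOrders(id);
  return { user, orders };
}

// In parallel: both calls start immediately and are awaited together.
async function loadInParallel(id) {
  const [user, orders] = await Promise.all([fetchUser(id), fetchOrders(id)]);
  return { user, orders };
}

// Small timing helper returning the result and the elapsed milliseconds.
async function timed(fn) {
  const start = process.hrtime.bigint();
  const result = await fn();
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  return { result, ms };
}
```

Calling `timed(() => loadInSeries(1))` and `timed(() => loadInParallel(1))` returns the same data, with the series version taking roughly the sum of the two delays and the parallel version roughly the longer of the two. This only applies when the operations are truly independent; if the second call needs the result of the first, it has to stay in series.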

Reference