The 'Death' of a Node.js Command Line Program

Story

This is an ordinary Node.js script that helps you do something at some point in the future.

setTimeout(() => {
  // Do something in the future
}, 1000000)

As the story becomes more complex, this Node.js script grows in size and becomes a Node.js command-line program.

It performs some initialization on startup and then does its work after a delay. If the Node.js process is terminated unexpectedly (for example, with Ctrl+C), it needs to clean up or record its intermediate state.

// to-be.js

// Do some initialization

setTimeout(() => {
  // Do something in the future
}, 1000000)

process.on('SIGINT', () => {
  // Clean up if this process is terminated
})

Then you realize that you made a mistake, or that you no longer want the script to run. You frantically press Ctrl+C in the terminal, but nothing happens.

Pressing Ctrl+C in the terminal sends a SIGINT signal to the process running in the foreground. By default, Node.js exits when it receives SIGINT or SIGTERM. Once you register your own listener for one of these signals, however, that default behavior is disabled for it, and Node.js no longer exits on its own. That is why the script above ignores Ctrl+C.

See the Node.js documentation for more details on Signal events.
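To make this concrete, here is a minimal sketch that only counts listeners. The variable names are illustrative. It shows that registering a listener is what switches the default off, and that removing the last listener hands control back to Node.js.

```javascript
// With no 'SIGINT' listener registered, Node.js keeps its default behavior
// and exits on Ctrl+C.
const before = process.listenerCount('SIGINT')

// Registering any listener, even one that only logs, disables that default.
const onSigint = () => {
  console.log('SIGINT received, but the process keeps running')
}
process.on('SIGINT', onSigint)
const during = process.listenerCount('SIGINT')

// Removing the last listener restores the default exit behavior.
process.removeListener('SIGINT', onSigint)
const after = process.listenerCount('SIGINT')

console.log({ before, during, after })
```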

A Brutal Workaround

Then, you come up with a brutal workaround:

// not-to-be.js

setTimeout(() => {
  // Do something in the future
}, 1000000)

process.on('SIGINT', () => {
  // Clean up if this process is terminated
  process.exit()
})

Okay, this does solve many problems. But what if you have multiple callback functions that do different cleanup work in different contexts? Any call to process.exit terminates the process abruptly and prevents the remaining cleanup callbacks from running.

Abusing process.exit also makes a Node.js command-line program harder to debug. You may completely forget about a process.exit you wrote earlier, and then be at a loss when your program suddenly terminates. If you are developing a library, you should not call process.exit at all. Prefer more explicit ways to stop, such as throwing an exception.
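The first problem can be reproduced in a few lines. This is a sketch that assumes a POSIX system, where a process can send SIGINT to itself with process.kill. The child script and the "cleanup A" and "cleanup B" labels are made up for the demonstration.

```javascript
const { spawnSync } = require('node:child_process')

// Child script: two independent cleanup listeners.
// The first one calls process.exit(), so the second never gets a turn.
const childScript = `
  process.on('SIGINT', () => {
    console.log('cleanup A')
    process.exit(0)
  })
  process.on('SIGINT', () => {
    console.log('cleanup B')
  })
  setTimeout(() => {}, 5000) // keep the child alive until the signal arrives
  process.kill(process.pid, 'SIGINT') // simulate Ctrl+C
`

// Run the script in a separate Node.js process and capture what it prints.
const result = spawnSync(process.execPath, ['-e', childScript], {
  encoding: 'utf8',
  timeout: 10000
})

console.log(result.stdout)
```

Listeners run in registration order, so whichever one calls process.exit first silently cancels every listener registered after it.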

An "Elegant" Solution?

So, a more "elegant" solution is:

const timer = setTimeout(() => {
  // Do something in the future
}, 1000000)

process.on('SIGINT', () => {
  // Clean up if this process is terminated
  clearTimeout(timer)
})

After receiving the SIGINT signal, manually clear the previously started timer. At this point, Node.js finds that there are no asynchronous tasks running or waiting to run, so it will exit normally. This seems to make more sense.

However, it also brings new difficulties. As your CLI application becomes more complex, you start more asynchronous tasks (more timers, establishing TCP servers, running child processes, etc.). You may forget what has not been stopped, causing this "elegant" solution to sometimes be unreliable, depending on whether you have added the corresponding cleanup callback function for each task.
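One way to reduce the risk of forgetting something is to record a disposer next to every resource you create. Here is a minimal sketch of that idea. createResourceRegistry, track, and releaseAll are hypothetical names, not part of Node.js or any library.

```javascript
// Hypothetical helper: keeps one disposer function per live resource.
function createResourceRegistry() {
  const disposers = new Set()
  return {
    // Register a disposer; the returned function forgets it again.
    track(dispose) {
      disposers.add(dispose)
      return () => disposers.delete(dispose)
    },
    // Release everything, most recently tracked first.
    releaseAll() {
      for (const dispose of [...disposers].reverse()) {
        dispose()
      }
      disposers.clear()
    },
    size() {
      return disposers.size
    }
  }
}

const resources = createResourceRegistry()
const timer = setTimeout(() => {
  // Do something in the future
}, 1000000)
resources.track(() => clearTimeout(timer))

// In a real program this call would live in the SIGINT listener:
//   process.once('SIGINT', () => resources.releaseAll())
// Calling it directly here shows that the event loop is left empty afterwards.
resources.releaseAll()
```

This only helps if every timer, server, and child process is actually tracked, so the reliability concern above still applies.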

Why is node running

Here is a tool to help you debug: why-is-node-running or its alternative why-is-node-still-running.

Import this library at the very beginning of your Node.js command-line program. It uses the async_hooks API to track every asynchronous resource that gets created. Here is a demo of the library, followed by the output it prints:

const log = require('why-is-node-running') // should be your first require
const net = require('net')

function createServer () {
  const server = net.createServer()
  setInterval(function () {}, 1000)
  server.listen(0)
}

createServer()
createServer()

setTimeout(function () {
  log() // logs out active handles that are keeping node running
}, 100)

There are 5 handle(s) keeping the process running

# Timeout
/home/maf/dev/node_modules/why-is-node-running/example.js:6  - setInterval(function () {}, 1000)
/home/maf/dev/node_modules/why-is-node-running/example.js:10 - createServer()

# TCPSERVERWRAP
/home/maf/dev/node_modules/why-is-node-running/example.js:7  - server.listen(0)
/home/maf/dev/node_modules/why-is-node-running/example.js:10 - createServer()

# Timeout
/home/maf/dev/node_modules/why-is-node-running/example.js:6  - setInterval(function () {}, 1000)
/home/maf/dev/node_modules/why-is-node-running/example.js:11 - createServer()

# TCPSERVERWRAP
/home/maf/dev/node_modules/why-is-node-running/example.js:7  - server.listen(0)
/home/maf/dev/node_modules/why-is-node-running/example.js:11 - createServer()

# Timeout
/home/maf/dev/node_modules/why-is-node-running/example.js:13 - setTimeout(function () {
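If you only need a quick look and do not want an extra dependency, recent Node.js versions also ship process.getActiveResourcesInfo(), which the Node.js documentation marks as experimental. A minimal sketch, with a guard for versions that do not have it:

```javascript
const timer = setTimeout(() => {}, 1000000)

// process.getActiveResourcesInfo() returns the type names of the resources
// that are currently keeping the event loop alive. It is not available in
// older Node.js versions, so fall back to an empty list there.
const active = typeof process.getActiveResourcesInfo === 'function'
  ? process.getActiveResourcesInfo()
  : []

// An array of resource type names; the pending timer shows up as 'Timeout'.
console.log(active)

clearTimeout(timer)
```

Unlike why-is-node-running, this reports only resource types, not the source locations where they were created.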

@breadc/death

We can keep the brutal workaround but route it through a centralized event bus that acts as the single listener for termination signals such as SIGINT. This leaves room for more complex cleanup logic. To do this, I created the @breadc/death library.

// Excerpt from https://github.com/yjl9903/Breadc/blob/main/packages/death/src/death.ts

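import { EventEmitter } from 'node:events';

// Assumed callback shape, declared here so the excerpt is self-contained
type OnDeathCallback = (signal: NodeJS.Signals) => unknown | Promise<unknown>;

const emitter = new EventEmitter();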

const handlers = {
  SIGINT: makeHandler('SIGINT')
};

function makeHandler(signal: NodeJS.Signals) {
  return async (signal: NodeJS.Signals) => {
    // listeners() returns a copy, so reversing it is safe
    const listeners = emitter.listeners(signal);

    // Run every listener in reverse registration order
    for (const listener of listeners.reverse()) {
      await listener(signal);
    }

    // Remove our listener to restore the Node.js default behaviour
    // and avoid an infinite loop
    process.removeListener('SIGINT', handlers.SIGINT);
    // Resend the signal that was received
    process.kill(process.pid, signal);
  };
}

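export function onDeath(callback: OnDeathCallback): () => void {
  // Attach the shared handler only once, however many callbacks are registered
  if (!process.listeners('SIGINT').includes(handlers.SIGINT)) {
    process.on('SIGINT', handlers.SIGINT);
  }
  emitter.addListener('SIGINT', callback);
  return () => {
    emitter.removeListener('SIGINT', callback);
  };
}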

As you can see, we replace the built-in event bus on process with our own emitter. Registering a callback through onDeath attaches our shared handler with process.on('SIGINT', ...) and stores the callback itself on the emitter.

After receiving the SIGINT signal, we make a copy of the array of all callback functions and run them in reverse order. The reasoning is that cleanup callbacks are either independent of order or need to run in the reverse of the order in which their resources were allocated, so reverse order is the safe choice.

Finally, we remove the SIGINT callback function to restore Node.js's default exit behavior and resend the received termination signal.
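The reverse-order behavior is easy to check in isolation. The following is a self-contained sketch of the same idea, not the library's actual API. createDeathBus, runCleanups, and the 'death' event name are hypothetical, and no real signal is involved.

```javascript
const { EventEmitter } = require('node:events')

// Hypothetical stand-in for the pattern above, without touching process signals.
function createDeathBus() {
  const emitter = new EventEmitter()
  return {
    // Register a cleanup callback; the returned function unregisters it.
    onDeath(callback) {
      emitter.addListener('death', callback)
      return () => emitter.removeListener('death', callback)
    },
    // Await every callback, last registered first.
    async runCleanups(signal) {
      // listeners() returns a copy, so reversing it does not affect the emitter.
      for (const listener of emitter.listeners('death').reverse()) {
        await listener(signal)
      }
    }
  }
}

const bus = createDeathBus()
const order = []
bus.onDeath(() => {
  order.push('close database')
})
bus.onDeath(async () => {
  order.push('flush cache')
})

// The cache was set up after the database, so it is flushed first.
const done = bus.runCleanups('SIGINT').then(() => console.log(order))
```

Awaiting each callback in turn means a slow asynchronous cleanup delays the ones registered before it, which is the price of a predictable order.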

The Death of a Node.js Process

The other side of the coin from "Why is node running?" is "Why does your Node.js process suddenly crash?"

The following reasons can cause a Node.js process to terminate unexpectedly:

Action	Example
Manual process exit	`process.exit(1)`
Uncaught exceptions	`throw new Error()`
Unhandled Promise rejections	`Promise.reject()`
Ignored error events	`EventEmitter#emit('error')`
Unhandled signals	`$ kill <PROCESS_ID>`

Table reference from The Death of a Node.js Process.
This blog post also contains tips on how to handle errors in Node.js that you can read for further information.

Uncaught exceptions can be handled with try/catch around the problematic code or around the program's entry point. Alternatively, you can listen for the uncaughtException event:

process.on('uncaughtException', error => {
  console.error(error)
})

Unhandled Promise rejections can be handled by listening for the unhandledRejection event:

process.on('unhandledRejection', error => {
  console.error(error)
})

Therefore, registering listeners for the uncaughtException and unhandledRejection events, as well as for the SIGINT, SIGTERM, and SIGQUIT termination signals, lets you handle a "dying" Node.js command-line program more robustly.
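Putting these pieces together, one possible shape is a single helper that runs the same cleanup whichever way the process dies. This is a sketch under a few assumptions: installExitHandlers is a hypothetical name, a POSIX system is assumed for the signals, and a fatal error ends with process.exit(1) once cleanup has finished, because the program state can no longer be trusted after an uncaught exception.

```javascript
// Hypothetical helper: the name and shape are illustrative, not a library API.
function installExitHandlers(cleanup) {
  const signals = ['SIGINT', 'SIGTERM', 'SIGQUIT']
  let cleanupPromise = null

  // Run the cleanup at most once, whatever triggers it first.
  const runOnce = (reason) => {
    if (!cleanupPromise) {
      cleanupPromise = Promise.resolve().then(() => cleanup(reason))
    }
    return cleanupPromise
  }

  const onSignal = (signal) => {
    runOnce(signal)
      .catch((error) => console.error(error))
      .finally(() => {
        // Remove our listeners, then resend the signal. With no other
        // listeners left, Node.js falls back to its default handling.
        uninstall()
        process.kill(process.pid, signal)
      })
  }

  const onFatal = (error) => {
    console.error(error)
    runOnce(error)
      .catch((cleanupError) => console.error(cleanupError))
      .finally(() => process.exit(1))
  }

  function uninstall() {
    for (const signal of signals) process.removeListener(signal, onSignal)
    process.removeListener('uncaughtException', onFatal)
    process.removeListener('unhandledRejection', onFatal)
  }

  for (const signal of signals) process.on(signal, onSignal)
  process.on('uncaughtException', onFatal)
  process.on('unhandledRejection', onFatal)

  return uninstall
}
```

A program would call installExitHandlers once at startup with its own cleanup function, and keep the returned function if it ever needs to detach the listeners again.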