Stack trace dump on Go

I'm from Java world. Means I spend most of the time coding things on Java and running them in Java Virtual Machine. This has own benefits and disadvantages. For example, I really wish JVM would use less memory and be faster from the start. It is awesome you can benefit from JIT eventually, but till then you should wait a code to run for thousands times before it can be translated into machine code.

At the same time, Java has bunch of benefits. I believe the most important one is a set of available tools: from small utilities to powerful IDEs to application servers.

One of such utility tools is jstack. It is use to print a stack dump of application threads, including main one. What a useful thing! Saved so many time when I tried to investigate why my application got stuck or is really hard on CPU.

And it is also one of the tools I really needed a few days ago: I had to understand why service written on Go is so eager for CPU resources. I spent some time searching for a tool that would work similarly to jstack but for a Go application. Couldn't find one. Yet found how to get a stack trace with a small change in my Go application.

Go standard library already has everything you need to print a stack dump: signals support to notify application and runtime.Stack function to get stack dump, and print it.

NOTE: signals are primarily supported on UNIX based systems, it means this approach might not work as expected on Windows.


And now you can run you program, and ask for a stack dump by sending a signal using kill command:
$ killall -SIGQUIT accountservice
and you'd see something like this


Output consists of stack traces for each goroutine. Each stack trace starts with 'goroutine', its number and current state. Stack trace, like Java one, shows the most recent operation on top, contains code file name (not a class name), line number, and also function parameters. Also stack trace contains information about goroutine creator.

Now, it should be easy to find why application got stuck, or what is it currently doing so that almost all CPU is used up. However, this solution might not work always: in case all resources are used, it might take a time until SIGQUIT notification is processed by application.

No comments: