Linux Diagnostic Information

June 19th, 2020


When the Mango process is stuck or performing poorly, it can be useful to dump the thread stacks and object counts from memory to get a better picture of what is happening inside the JVM. Note that some of the JDK commands may not be on your path, so you may need to find them in your jdk/bin installation directory. You may also need to prefix the jmap/jstack commands with sudo so that they run as the Mango user:

sudo -u mango <cmd>
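
For example, to run the memory histogram command described below as the mango user (finding the PID is covered in the next section):

sudo -u mango jmap -histo <pid>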

Find the Mango process ID (PID)

To see the full command used to run Mango, along with its PID, you can run

ps aux | grep java

If you only have a single java process, you can easily get its PID by running

pidof java

However, if you use systemd to start Mango, the best way to find the PID is to run

systemctl show --property MainPID --value mango
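
Since this systemctl lookup is repeated in the commands below, you may find it convenient to capture the PID in a shell variable once (MANGO_PID is just an illustrative name):

MANGO_PID=$(systemctl show --property MainPID --value mango)
echo $MANGO_PID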

Memory histogram

To count the number and size of each type of object in Mango’s memory, use this command. It will output the counts in descending order and can be useful for seeing what is using the majority of memory in the JVM.

jmap -histo $(systemctl show --property MainPID --value mango) > mangoMemMap.txt
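
The histogram can be very long, so a quick way to see the biggest consumers is to look at just the first lines of the output file, for example:

head -n 30 mangoMemMap.txt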

Heap Dump

Heap dumps are larger files that contain a point-in-time snapshot of the entire JVM memory space, allowing a more detailed analysis with tools that show the connections between all the objects in memory. When diagnosing memory problems this is the best way to ‘see’ into the heap of a running JVM. We recommend using JVisualVM or Eclipse MAT to analyze the generated files.

The first command below generates a full heap dump that includes objects which are ready for garbage collection (note that this file will be significantly bigger); the second dumps only live objects:

jmap -dump:format=b,file=mangoHeapFull $(systemctl show --property MainPID --value mango)
jmap -dump:live,format=b,file=mangoHeapLive $(systemctl show --property MainPID --value mango)
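
Heap dump files can be several gigabytes in size, so if you need to send one to support it is usually worth compressing it first, for example:

gzip mangoHeapFull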

Thread dump / stack trace

You can use the jstack tool to see what threads are running, their state, and their stack traces. This will tell you what Mango is doing at any point in time. There is also a tool built into Mango that shows the same information under "System information", "Threads".

jstack -l $(systemctl show --property MainPID --value mango) > mangoThreads.txt
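
When diagnosing a stuck or slow process it can help to capture several thread dumps a few seconds apart and compare them. A minimal sketch (the number of dumps and the sleep interval are arbitrary):

for i in 1 2 3; do
    jstack -l $(systemctl show --property MainPID --value mango) > mangoThreads_$i.txt
    sleep 10
done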

Profiling

The Radix IoT support staff may ask you to capture a "flight recording" from your running Mango process. This allows us to diagnose problems inside the application while it is running.

jcmd $(systemctl show --property MainPID --value mango) JFR.start duration=60s filename=mango.jfr

Other useful commands are JFR.check to check the status of the recording, and JFR.stop to stop recording.
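
For example, assuming the recording started above, you could check on it and stop it early like this (JFR.stop needs the recording name, which is shown in the JFR.check output; an unnamed recording is typically called 1):

jcmd $(systemctl show --property MainPID --value mango) JFR.check
jcmd $(systemctl show --property MainPID --value mango) JFR.stop name=1 filename=mango.jfr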

After the recording file is captured, you can send it to the Radix IoT staff or view it yourself using the JDK Mission Control app.

Number of open files

The Mango timeseries database (mangoNoSql module) uses individual files for each data point. This means that Mango will attempt to open a large number of files at once; we recommend setting db.nosql.maxOpenFiles in your mango.properties file to 2x the number of data points you are using. If Mango attempts to open more files than Linux will permit, you will see an error message like this in your log file:

ERROR 2021-09-13T11:04:16,717 (com.infiniteautomation.nosql.MangoNoSqlBatchWriteBehindManager$PointWrittenEntry.writeBatch:499) - Should never happen, data loss for unknown reason
java.lang.RuntimeException: java.io.FileNotFoundException: /data/mango/databases/mangoTSDB/74/12010/759.data.rev (Too many open files)
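
For example, with roughly 10,000 data points the 2x guideline above would suggest a setting along these lines in mango.properties (the value is only an illustration, adjust it to your installation):

db.nosql.maxOpenFiles=20000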

If you use the supplied systemd service file (mango.service) and start Mango via systemd, you should not see this error. The service file sets LimitNOFILE=1048576, which should be more than enough for most installations.
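
If you want to verify the limit and the current usage for the running process, you can compare the open file limit with the number of open file descriptors, for example (you may need to prefix these commands with sudo):

cat /proc/$(systemctl show --property MainPID --value mango)/limits | grep "Max open files"
ls /proc/$(systemctl show --property MainPID --value mango)/fd | wc -l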

Number of memory mapped files

The Mango timeseries database (mangoNoSql module) uses memory mapped files to provide improved read speeds. Linux, however, limits the number of memory mapped files that a process can have open at once. If Mango hits this limit, the JVM will crash and you will find an hs_err_pidxxx.log file in your Mango home directory (/opt/mango). The Hotspot error log will contain a confusing message like this at the top of the file:

# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate x bytes for AllocateHeap
# Out of Memory Error (allocation.cpp:46), pid=1338811, tid=1338824

You can check your Linux memory mapped file limit by running

cat /proc/sys/vm/max_map_count
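
If the limit is too low for your installation you can raise it with sysctl; the value below is only an illustration (to make the change permanent, add it to /etc/sysctl.conf or a file under /etc/sysctl.d):

sudo sysctl -w vm.max_map_count=262144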

You can check how many memory mapped files Mango is using by running

cat /proc/$(systemctl show --property MainPID --value mango)/maps | wc -l

You can also count the memory mapped files at the time of the crash from the hs_err_pidxxx.log file

cat hs_err_pid1338811.log | grep /opt/mango/databases | wc -l

For more information please see this MapDB article on memory mapped files.

Running out of disk space while using the MapDB point value cache

If you run out of disk space while using the MapDB point value cache inside Mango, your JVM will probably crash with an hs_err_pidxxx.log file like this:

# A fatal error has been detected by the Java Runtime Environment:
# SIGBUS (0x7) at pc=0x00007f6d80937086, pid=3534205, tid=3535458
# Problematic frame:
# v  ~StubRoutines::jshort_disjoint_arraycopy
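
To check whether the partition holding Mango's databases has run out of space you can run something like this (the path assumes the default /opt/mango install location mentioned above):

df -h /opt/mango/databases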

Copyright © 2024 Radix IoT, LLC.