When the Mango process is stuck or performing poorly, it can be useful to dump the thread stacks and object counts from memory to get a better picture of what is happening inside the JVM. Note that some of the JDK commands may not be on your PATH, so you may need to find them in the bin directory of your JDK installation.
You may also need to prefix the jmap/jstack commands with sudo in order to run them as the Mango user.
sudo -u mango <cmd>
Find the Mango process ID (PID)
To get the full output of the command used to run Mango, along with its PID you can run
ps aux | grep java
If you only have a single java process you can easily get its PID by running
pidof java
However the best way to find the PID if you use systemd to start Mango is to run
systemctl show --property MainPID --value mango
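If you script these diagnostics, keep in mind that systemd reports a MainPID of 0 when the service is not running. The following is a minimal sketch of a helper that handles that case. The function name find_mango_pid and the pgrep fallback (which simply assumes the oldest process named java is Mango) are illustrative assumptions, not something shipped with Mango.

```shell
# Print the Mango PID, or return non-zero if none can be found.
# Assumes the systemd unit is named "mango", as in the command above.
find_mango_pid() {
    pid=$(systemctl show --property MainPID --value mango 2>/dev/null)
    # systemd reports MainPID=0 when the service is not running
    if [ -n "$pid" ] && [ "$pid" != "0" ]; then
        echo "$pid"
        return 0
    fi
    # Fallback (assumption): the oldest process named exactly "java" is Mango
    pgrep -o -x java
}

# Example: jstack -l "$(find_mango_pid)" > mangoThreads.txt
```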
Memory histogram
To count the number and total size of each type of object in Mango’s memory, use this command. The output is sorted by total size in descending order, which makes it useful for seeing what is using the majority of memory in the JVM.
jmap -histo $(systemctl show --property MainPID --value mango) > mangoMemMap.txt
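A single histogram shows what is large, but not what is growing. One possible approach is to capture two histograms a few minutes apart and compare the byte counts per class. The sketch below assumes the usual jmap -histo column layout (rank, instances, bytes, class name); the function name histo_growth and the file names in the example are hypothetical.

```shell
# Print per-class growth in bytes between two jmap -histo outputs, largest first.
# $1 = earlier histogram file, $2 = later histogram file
histo_growth() {
    awk '
        # Data rows start with a rank such as "1:"; the columns are
        # rank, instance count, bytes, class name
        $1 ~ /^[0-9]+:$/ {
            if (FILENAME == ARGV[1]) before[$4] = $3
            else growth[$4] = $3 - before[$4]
        }
        END { for (c in growth) printf "%.0f %s\n", growth[c], c }
    ' "$1" "$2" | sort -rn
}

# Example: histo_growth mangoMemMap1.txt mangoMemMap2.txt | head -n 20
```

Classes that only appear in the later histogram are reported with their full size as growth, since their earlier size is treated as zero.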
Heap Dump
Heap dumps are larger files that contain a snapshot in time of the entire JVM memory space and are a way to do a more detailed analysis using tools that show the connections between all the objects in memory. When diagnosing memory problems this is the best way to ‘see’ into the heap of a running JVM. It is recommended to use JVisualVM or Eclipse MAT when analyzing the generated files.
To generate a full heap dump that includes objects that are ready for garbage collection, use this (note that the file will be significantly bigger)
jmap -dump:format=b,file=mangoHeapFull $(systemctl show --property MainPID --value mango)
To dump only live (reachable) objects, add the live option
jmap -dump:live,format=b,file=mangoHeapLive $(systemctl show --property MainPID --value mango)
Thread dump / stack trace
You can use the jstack tool to see what threads are running, their state, and their stack traces. This will tell you what Mango is doing at any point in time.
There is also a tool built into Mango that shows the same information under "System information", "Threads".
jstack -l $(systemctl show --property MainPID --value mango) > mangoThreads.txt
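Thread dumps from a busy system can be long, so a quick count of threads per state is often a useful first look (for example, an unusually high number of BLOCKED threads). This is a minimal sketch that relies on the "java.lang.Thread.State:" lines that jstack prints for each thread; the function name thread_state_summary is hypothetical.

```shell
# Count threads per state in a jstack dump, most common state first.
# $1 = thread dump file, e.g. mangoThreads.txt
thread_state_summary() {
    awk '/java\.lang\.Thread\.State:/ { n[$2]++ }
         END { for (s in n) print n[s], s }' "$1" | sort -rn
}

# Example: thread_state_summary mangoThreads.txt
```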
Profiling
The Radix IoT support staff may ask you to capture a "flight recording" from your running Mango process.
This allows us to diagnose problems inside the application while it is running.
jcmd $(systemctl show --property MainPID --value mango) JFR.start duration=60s filename=mango.jfr
Other useful commands are JFR.check to check the status of the recording, and JFR.stop to stop recording.
After the recording file is captured, you can send it to the Radix IoT staff or view it yourself using the JDK Mission Control app.
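The JFR commands all follow the same jcmd pattern, so a small wrapper can save some typing. This is a sketch only: the function name mango_jfr and the recording name "mango" are assumptions. Giving the recording a name with name= when starting it makes it easy to refer to in JFR.stop, which needs to be told which recording to stop.

```shell
# Run a JFR sub-command (start, check or stop) against the Mango process.
# $1 = sub-command; any remaining arguments are passed through to jcmd
mango_jfr() {
    action=$1
    shift
    jcmd "$(systemctl show --property MainPID --value mango)" "JFR.$action" "$@"
}

# Examples:
#   mango_jfr start name=mango duration=60s filename=mango.jfr
#   mango_jfr check
#   mango_jfr stop name=mango
```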
Number of open files
The Mango timeseries database (mangoNoSql module) uses individual files for each data point. This means that Mango will attempt to open a large number of files at once. We recommend setting db.nosql.maxOpenFiles in your mango.properties file to 2x the number of data points you are using.
If Mango attempts to open more files than Linux will permit, you will see an error message like this in your log file:
ERROR 2021-09-13T11:04:16,717 (com.infiniteautomation.nosql.MangoNoSqlBatchWriteBehindManager$PointWrittenEntry.writeBatch:499) - Should never happen, data loss for unknown reason
java.lang.RuntimeException: java.io.FileNotFoundException: /data/mango/databases/mangoTSDB/74/12010/759.data.rev (Too many open files)
If you use the supplied systemd service file (mango.service) and start Mango via systemd, you should not see this error. The service file sets LimitNOFILE=1048576, which should be more than enough for most installations.
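If you are unsure which limit actually applies to the running process, you can read it from /proc and compare it with the number of file descriptors currently open. The following is a minimal sketch for Linux; the function name open_file_usage is hypothetical, and reading the /proc entries of a process owned by another user usually requires sudo.

```shell
# Report how many file descriptors a process has open versus its soft limit.
# $1 = PID of the process to inspect
open_file_usage() {
    # In /proc/<pid>/limits the soft limit is the fourth column of "Max open files"
    limit=$(awk '/^Max open files/ { print $4 }' "/proc/$1/limits")
    used=$(ls "/proc/$1/fd" | wc -l | tr -d ' ')
    echo "open files: $used of $limit"
}

# Example: open_file_usage "$(systemctl show --property MainPID --value mango)"
```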
Number of memory mapped files
The Mango timeseries database (mangoNoSql module) uses memory mapped files to provide improved read speeds. Linux however limits the number of memory mapped files that Mango can open at once. If you hit this limit, the operating system will kill Java and you will find a hs_err_pidxxx.log file in your Mango home directory (/opt/mango). The Hotspot error log will contain a confusing message like this at the top of the file:
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate x bytes for AllocateHeap
# Out of Memory Error (allocation.cpp:46), pid=1338811, tid=1338824
You can check your Linux memory mapped file limit by running
cat /proc/sys/vm/max_map_count
You can check how many memory mapped files Mango is using by running
cat /proc/$(systemctl show --property MainPID --value mango)/maps | wc -l
You can also count the memory mapped files at the time of the crash in the hs_err_pidxxx.log file
grep -c /opt/mango/databases hs_err_pid1338811.log
For more information please see this MapDB article on memory mapped files.
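The limit and the current usage can be combined into a single check, which makes it easier to see how close Mango is to the limit before a crash happens. This is a sketch under the assumption that you run it on Linux with permission to read the process's /proc entry; the function name map_usage is hypothetical. Note that /proc/<pid>/maps lists every mapped region, not only the database files.

```shell
# Report the number of memory map entries of a process against vm.max_map_count.
# $1 = PID of the process to inspect
map_usage() {
    limit=$(cat /proc/sys/vm/max_map_count)
    used=$(wc -l < "/proc/$1/maps" | tr -d ' ')
    # Integer arithmetic, so the percentage is rounded down
    echo "memory maps: $used of $limit ($((used * 100 / limit))%)"
}

# Example: map_usage "$(systemctl show --property MainPID --value mango)"
```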
Running out of disk space while using the MapDB point value cache
If you run out of disk space while using the MapDB point value cache inside Mango, your JVM will probably crash and leave an hs_err_pidxxx.log file like this
# A fatal error has been detected by the Java Runtime Environment:
# SIGBUS (0x7) at pc=0x00007f6d80937086, pid=3534205, tid=3535458
# Problematic frame:
# v ~StubRoutines::jshort_disjoint_arraycopy
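To confirm or rule out a full disk as the cause, check the free space on the file system that holds the Mango data. The sketch below uses df with the POSIX output format; the function name disk_usage_percent is hypothetical, and /opt/mango in the example is only the Mango home directory mentioned earlier, so substitute the directory where your databases actually live.

```shell
# Print the used capacity (as a whole-number percentage) of the file system
# that contains the given path.
# $1 = any path on the file system to check
disk_usage_percent() {
    # -P forces one line per file system; column 5 is the capacity, e.g. "42%"
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Example:
#   if [ "$(disk_usage_percent /opt/mango)" -ge 90 ]; then
#       echo "Warning: disk holding /opt/mango is at least 90% full"
#   fi
```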