Watchdog
Introduction
The application manager features a built-in watchdog mechanism that monitors the main thread's event loop, every Quick window's render thread, and all clients of the application manager's Wayland compositor for unresponsive behavior. Event loop and render thread monitoring is implemented for both the System UI and any QML application running in multi-process mode.
The watchdog is implemented as a separate thread that periodically (see checkInterval) checks the state of the monitored subsystems. If any of these fails to respond within a given time frame, the watchdog first issues a warning (see warnTimeout) and eventually kills (see killTimeout) the affected thread or client. Keep in mind that, due to the periodic nature of this check, the actual warning and kill messages might be delayed by up to the checkInterval. Killing the affected thread directly (instead of just aborting the whole process) causes the application manager's crash handler to print a backtrace for the stuck thread, which can be very useful for diagnosing freezes.
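To make the possible delay concrete, here is a minimal Python sketch. It is not the application manager's implementation: it assumes the watchdog thread checks at exact multiples of checkInterval, and the function name and sample numbers are hypothetical.

```python
def detection_time_ms(stall_start_ms, timeout_ms, check_interval_ms):
    """Return the time of the first periodic check at or after the deadline.

    Assumption: checks happen at exact multiples of check_interval_ms.
    """
    deadline = stall_start_ms + timeout_ms
    # Integer ceiling division: index of the first check tick >= deadline.
    ticks = -(-deadline // check_interval_ms)
    return ticks * check_interval_ms


# Hypothetical values: a thread stalls at t=1200ms, warnTimeout is 1000ms,
# and the watchdog checks every 500ms.
stall_start, warn_timeout, interval = 1200, 1000, 500
deadline = stall_start + warn_timeout                                # 2200
detected = detection_time_ms(stall_start, warn_timeout, interval)    # 2500
delay = detected - deadline                                          # 300
print(f"deadline={deadline}ms detected={detected}ms delay={delay}ms")
```

Under these assumptions the warning shows up 300ms after the timeout actually expired, and the extra delay always stays below one checkInterval.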
Note: The watchdog is disabled by default. To enable it, set at least one of the checkInterval configuration values in the main configuration file to a value that suits your specific device setup.
Systemd Support
Support for systemd's watchdog is built into the application manager as well; see Installation.
If enabled, the application manager automatically detects at startup whether it was launched by systemd and whether the systemd unit file has the WatchdogSec option set. If this is the case, the application manager periodically sends the requested notifications to systemd from its watchdog thread.
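For background, systemd's general watchdog protocol hands the configured timeout to the service in the WATCHDOG_USEC environment variable and expects WATCHDOG=1 datagrams on the socket named in NOTIFY_SOCKET. The Python sketch below illustrates only that generic protocol; it is not the application manager's code, and both function names are hypothetical.

```python
import os
import socket


def systemd_ping_interval(environ=os.environ):
    """Return a keep-alive interval in seconds, or None if no watchdog is requested."""
    usec = environ.get("WATCHDOG_USEC")
    if not usec:
        return None
    pid = environ.get("WATCHDOG_PID")
    if pid is not None and int(pid) != os.getpid():
        # The watchdog request is addressed to a different process.
        return None
    # systemd recommends notifying at half of the configured timeout.
    return int(usec) / 1_000_000 / 2


def notify_systemd(message="WATCHDOG=1", environ=os.environ):
    """Send one notification datagram; return False when not run under systemd."""
    path = environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    # A leading '@' denotes a socket in the Linux abstract namespace.
    address = b"\0" + path[1:].encode() if path.startswith("@") else path
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode(), address)
    return True


# With WatchdogSec=30 in the unit file, systemd sets WATCHDOG_USEC=30000000.
print(systemd_ping_interval({"WATCHDOG_USEC": "30000000"}))
```

Outside of systemd neither environment variable is set, so both helpers simply report that there is nothing to do.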
Logging
The watchdog logs all its messages to the am.wd logging category. All logging is done from the separate watchdog thread and the main thread to minimize interference with the monitored threads or render loops.
The following logging levels are used:
Log Level | Description |
---|---|
info | The watchdog started (or stopped) watching an object (thread, window, Wayland client). |
warning | A warnTimeout has been exceeded. |
critical | A killTimeout has been exceeded. |
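As a rough illustration of how the two timeouts map to these log levels, consider the following Python sketch. It assumes that a timeout of 0 means "disabled", as described in the Configuration section; the function name is hypothetical and the real watchdog logic may differ.

```python
def log_level_for(elapsed_ms, warn_timeout_ms, kill_timeout_ms):
    """Return the log level for an object that has been unresponsive for elapsed_ms."""
    # A timeout of 0 disables the respective stage.
    if kill_timeout_ms > 0 and elapsed_ms > kill_timeout_ms:
        return "critical"
    if warn_timeout_ms > 0 and elapsed_ms > warn_timeout_ms:
        return "warning"
    return None


# Hypothetical settings: warnTimeout 1000ms, killTimeout 5000ms.
for elapsed in (500, 1500, 6000):
    print(elapsed, log_level_for(elapsed, 1000, 5000))
```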
Performance Considerations
Nothing in life comes for free and the watchdog is no exception. While the overhead of the watchdog is generally very low, it does impact three areas:
- For every frame rendered, the watchdog adds three invocations of a direct signal/slot connection: each call retrieves the current system time and stores it via an atomic fetch-and-store operation.
- For every Qt event delivered in a watched thread, the watchdog adds two callbacks: each call checks the state via an atomic load, then retrieves the current system time, but only one stores it via an atomic fetch-and-store operation.
- The separate watchdog thread runs a periodic check (see checkInterval). It retrieves the current system time and then collects time data via atomic load operations, once for each of the watched objects.
Configuration
The watchdog is configured via the watchdog key in the main configuration file. Applications inherit these settings, but can also override any value by setting the corresponding key in their info.yaml manifest file.
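The inherit-and-override behavior can be pictured with a small Python sketch. The dictionaries mirror the key names from the table in this section, but the values are purely illustrative, and the merge function is a hypothetical model rather than the application manager's actual code.

```python
import copy


def effective_watchdog_config(main_cfg, app_overrides):
    """Overlay an application's overrides on the inherited watchdog settings."""
    result = copy.deepcopy(main_cfg)  # leave the inherited settings untouched
    for group, values in app_overrides.items():
        result.setdefault(group, {}).update(values)
    return result


# Illustrative settings under the watchdog key of the main configuration.
main_cfg = {
    "eventloop": {"checkInterval": "1000ms", "warnTimeout": "2000ms", "killTimeout": "10000ms"},
    "quickwindow": {"checkInterval": "1000ms", "warnTimeout": "500ms", "killTimeout": "off"},
}
# Illustrative override from one application's manifest: never kill its main thread.
app_overrides = {"eventloop": {"killTimeout": "off"}}

effective = effective_watchdog_config(main_cfg, app_overrides)
print(effective["eventloop"])
```

In this model the application keeps the inherited checkInterval and warnTimeout and only replaces the one value it sets itself.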
The interval and timeout values listed below let you specify exact times with millisecond precision.
Setting any of the values to 0ms (or off) disables the respective functionality.
There's also the --disable-watchdog command line option, which makes your life easier when debugging or testing in a production environment, as it completely disables all watchdog functionality in the System UI as well as in QML applications.
Config Key | Type | Description |
---|---|---|
eventloop/checkInterval | duration | If set to a positive time duration, the main event loop will be monitored by triggering a timer every checkInterval. (default: off) |
eventloop/warnTimeout | duration | If the check timer does not fire within warnTimeout, the watchdog will print a warning. In addition, another warning will be printed if the timer does eventually fire, stating the exact duration the event loop was blocked. (default: off) |
eventloop/killTimeout | duration | If the check timer does not fire within killTimeout, the watchdog will print a critical warning and then abort the thread running the main event loop. (default: off) |
quickwindow/checkInterval | duration | The render thread monitor works a bit differently from the event loop and Wayland ones: instead of just a single "blocked" state, three different states are monitored: syncing, rendering, and swapping. As a render thread is not always actively rendering, the watchdog will only print a warning every |
quickwindow/warnTimeout | duration | The watchdog will print a warning if a render thread is stuck in any of the syncing, rendering, or swapping states for longer than warnTimeout. In addition, another warning will be printed if the thread eventually leaves the state it was stuck in, stating the exact duration it was blocked. (default: off) |
quickwindow/killTimeout | duration | If a render thread is stuck in any of the syncing, rendering, or swapping states for longer than killTimeout, the watchdog will print a critical warning and then abort the thread. (default: off) |
wayland/checkInterval | duration | If set to a positive time duration, all currently active Wayland clients that use the XDG shell protocol will be pinged every checkInterval. (default: off) |
wayland/warnTimeout | duration | If the pong reply from the Wayland client is not received within warnTimeout, the watchdog will print a warning. In addition, another warning will be printed if the pong reply is eventually received, stating the exact duration the ping/pong round-trip took. (default: off) |
wayland/killTimeout | duration | If the pong reply from the Wayland client is not received within killTimeout, the watchdog will print a critical warning and then kill the unresponsive Wayland client. For application manager apps, ApplicationObject::stop() with forceKill set to true will be invoked. Other apps will be killed by raising SIGKILL on the process ID associated with the Wayland client. (default: off) |
© 2024 The Qt Company Ltd. Documentation contributions included herein are the copyrights of their respective owners. The documentation provided herein is licensed under the terms of the GNU Free Documentation License version 1.3 as published by the Free Software Foundation. Qt and respective logos are trademarks of The Qt Company Ltd. in Finland and/or other countries worldwide. All other trademarks are property of their respective owners.