Access to the database is vital for running jobs, running the scheduler, and cooperating with other nodes. Touching the database is also used to detect dead processes. When the JVM process of NodeB is killed, it stops touching the database, and the other nodes can detect this.
0s-30s: last touch on DB; NodeB or its connection to the database is down.
90s: NodeA sees the last touch.
0-40s: check-task running on NodeA detects the obsolete touch from NodeB; the status of NodeB is changed to stopped, and jobs running on NodeB are solved, which means that their status is changed to UNKNOWN and the event is dispatched among the Cluster nodes. The job result is considered an error.
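The check described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the server's actual implementation: the `Node` class, the `check_nodes` function, and the `ready` and `RUNNING` status names are hypothetical, and the sketch assumes a touch counts as obsolete once it is older than the forced-stop interval.

```python
from dataclasses import dataclass, field

# Default of cluster.node.touch.forced_stop.interval, in milliseconds.
FORCED_STOP_INTERVAL_MS = 60_000


@dataclass
class Node:
    # Hypothetical in-memory stand-in for a node's record in the database.
    node_id: str
    last_touch_ms: int
    status: str = "ready"  # hypothetical name for a live node
    running_jobs: dict = field(default_factory=dict)  # job id -> status


def check_nodes(nodes, now_ms,
                forced_stop_interval_ms=FORCED_STOP_INTERVAL_MS,
                solve_running_jobs=True):
    """Mark nodes with an obsolete touch as stopped and return their ids."""
    stopped = []
    for node in nodes:
        if node.status == "stopped":
            continue
        # Assumption: a touch is obsolete once it is older than the interval.
        if now_ms - node.last_touch_ms > forced_stop_interval_ms:
            node.status = "stopped"
            if solve_running_jobs:
                # "Solving" a job: its status is changed to UNKNOWN.
                for job_id in list(node.running_jobs):
                    if node.running_jobs[job_id] == "RUNNING":
                        node.running_jobs[job_id] = "UNKNOWN"
            stopped.append(node.node_id)
    return stopped
```

Calling `check_nodes` with a node whose last touch is 70 seconds old marks that node as stopped and sets its running jobs to UNKNOWN, while a node touched 5 seconds ago is left alone. Dispatching the event to the other Cluster nodes is left out of the sketch.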
cluster.node.touch.interval
Interval between database touches, in milliseconds.
Default: 20000
cluster.node.touch.forced_stop.interval
Interval during which the other nodes accept the last touch, in milliseconds.
Default: 60000
cluster.node.check.checkMinInterval
Periodicity of Cluster node checks, in milliseconds.
Default: 40000
cluster.node.touch.forced_stop.solve_running_jobs.enabled
A boolean value that enables or disables the solving of running jobs described above.
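To see how the three intervals interact, the following Python sketch reads them from properties-style text and estimates how long a dead node may go unnoticed. The `load_intervals` and `worst_case_detection_ms` helpers are hypothetical, and the estimate rests on an assumption not stated in the text: the last touch is accepted for the full forced-stop interval, and the next check may run up to one check interval after that.

```python
# Defaults as listed above, in milliseconds.
DEFAULTS = {
    "cluster.node.touch.interval": 20000,
    "cluster.node.touch.forced_stop.interval": 60000,
    "cluster.node.check.checkMinInterval": 40000,
}


def load_intervals(properties_text):
    """Parse key=value lines, falling back to the documented defaults."""
    settings = dict(DEFAULTS)
    for line in properties_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and lines without a key=value pair.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key in DEFAULTS:
            settings[key] = int(value)
    return settings


def worst_case_detection_ms(settings):
    """Upper bound on the time from the last touch to detection (assumed)."""
    # Assumption: the touch is accepted for forced_stop.interval, and the
    # next check may run up to checkMinInterval after it becomes obsolete.
    return (settings["cluster.node.touch.forced_stop.interval"]
            + settings["cluster.node.check.checkMinInterval"])
```

With the default values, this estimate comes to 100000 milliseconds between the last touch and the moment the node is marked as stopped. Shortening the intervals reduces that delay at the cost of more frequent database access.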