
check-telemetry

Overview

Standalone telemetry publishing utility for health checks. Accepts health check results as command-line parameters and publishes them to configured sinks without performing validation. Used for testing telemetry pipelines and integrating external health checks.

Command-Line Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| --exit_code | Integer | - | Health check exit code: 0=OK, 1=WARN, 2=CRITICAL, 3=UNKNOWN (required) |
| --health-check-name | String | - | Name of the health check (required) |
| --node | String | - | Hostname where check was executed (required) |
| --msg | String | "" | Descriptive message about the check result |
| --job-id | Integer | 0 | SLURM job ID associated with the check |
| --sink | String | do_nothing | Telemetry sink destination |
| --sink-opts | Multiple | - | Sink-specific configuration |
| --verbose-out | Flag | False | Display detailed output |
| --log-level | Choice | INFO | DEBUG, INFO, WARNING, ERROR, CRITICAL |
| --log-folder | String | /var/log/fb-monitoring | Log directory |
| --heterogeneous-cluster-v1 | Flag | False | Enable heterogeneous cluster support |
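
Because the utility publishes whatever it is given without validating it, a caller may want to sanity-check the exit code before passing it along. The following is a minimal sketch under that assumption, not part of the tool itself: `status_name` and `validate_exit_code` are hypothetical helper names, and the mapping is taken from the --exit_code description above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of check-telemetry): map and validate
# the --exit_code values documented in the options table.

# Print the status label for a documented exit code; return 1 otherwise.
status_name() {
  case "$1" in
    0) echo "OK" ;;
    1) echo "WARN" ;;
    2) echo "CRITICAL" ;;
    3) echo "UNKNOWN" ;;
    *) return 1 ;;
  esac
}

# Succeed only if the argument is one of the four documented codes.
validate_exit_code() {
  status_name "$1" >/dev/null
}
```

For example, a caller could run `validate_exit_code "$rc" || rc=3` to report anything outside the documented range as UNKNOWN before invoking the command.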

Exit Conditions

| Exit Code | Condition |
| --- | --- |
| OK (0) | Feature flag disabled (killswitch active) |
| OK (0) | Always exits OK regardless of telemetry success or failure |
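
Since the command always exits 0, its exit status says nothing about the check being reported or about whether publishing succeeded. A wrapper around an external check therefore has to keep the check's own exit code. The sketch below illustrates one way to do that; `run_and_report`, `HEALTH_CHECKS_BIN`, the `CLUSTER` default, and the wrapped command are hypothetical, while the flags and the trailing cluster and `app` arguments mirror the usage example on this page.

```shell
#!/bin/sh
# Hypothetical wrapper: run an external check, publish its result with
# check-telemetry, then return the check's own exit code.

# Assumption: the health_checks CLI is on PATH; override for testing.
HEALTH_CHECKS_BIN="${HEALTH_CHECKS_BIN:-health_checks}"
# Assumption: placeholder cluster name; replace with your own.
CLUSTER="${CLUSTER:-my_cluster}"

# usage: run_and_report CHECK_NAME COMMAND [ARGS...]
# Note: uses global variables (check_name, rc, output) to stay POSIX sh.
run_and_report() {
  check_name="$1"
  shift

  # Capture the check's output and exit code without aborting the script.
  rc=0
  output="$("$@" 2>&1)" || rc=$?

  # check-telemetry is documented to always exit 0, so its status is
  # ignored; "|| true" also covers a missing binary.
  "$HEALTH_CHECKS_BIN" check-telemetry \
    --exit_code "$rc" \
    --health-check-name "$check_name" \
    --node "$(uname -n)" \
    --msg "$output" \
    --sink stdout \
    "$CLUSTER" \
    app || true

  # Hand the external check's result back to the caller.
  return "$rc"
}
```

The external command is assumed to follow the 0 to 3 convention from the options table; a check that uses other exit codes would need its result mapped before it is passed to --exit_code.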

Usage Examples

Basic Telemetry Test

health_checks check-telemetry \
--exit_code 0 \
--health-check-name test_check \
--node node001 \
--msg "Test passed successfully" \
--sink stdout \
[CLUSTER] \
app