Skip to main content

check-ib-interfaces

Overview

Validates that the expected number of InfiniBand network interfaces are present and in UP operational state using the ip command with JSON output parsing.

Requirements

  • ip: iproute2 package with JSON output support (v4.13+)
  • InfiniBand Drivers: Mellanox/NVIDIA OFED or inbox drivers

Package Installation

# RHEL/CentOS
yum install iproute

# Ubuntu/Debian
apt-get install iproute2

Command-Line Options

OptionTypeDefaultDescription
--interface-numInteger8Expected number of UP InfiniBand interfaces
--timeoutInteger300Command execution timeout in seconds
--sinkStringdo_nothingTelemetry sink destination
--sink-optsMultiple-Sink-specific configuration
--verbose-outFlagFalseDisplay detailed output
--log-levelChoiceINFODEBUG, INFO, WARNING, ERROR, CRITICAL
--log-folderString/var/log/fb-monitoringLog directory
--heterogeneous-cluster-v1FlagFalseEnable heterogeneous cluster support

Exit Conditions

Exit CodeCondition
OK (0)Feature flag disabled (killswitch active)
OK (0)Interface count matches expected
WARN (1)Command execution failed
WARN (1)Exception during execution
CRITICAL (2)Interface count mismatch
CRITICAL (2)Invalid JSON output

Usage Examples

Basic Interface Count Check

health_checks check-ib check-ib-interfaces \
--interface-num 8 \
--sink otel \
--sink-opts "log_resource_attributes={'attr_1': 'value1'}" \
[CLUSTER] \
app

Custom Interface Count

health_checks check-ib check-ib-interfaces \
--interface-num 16 \
--timeout 30 \
--sink file --sink-opts filepath=/var/log/ib_interfaces.json \
[CLUSTER] \
app

Debug Mode

health_checks check-ib check-ib-interfaces \
--log-level DEBUG \
--verbose-out \
--sink stdout \
[CLUSTER] \
app