HAO-AVP: An Entropy-Gini Reinforcement Learning Assisted Hierarchical Void Repair Protocol for Underwater Wireless Sensor Networks

2026 Jan 20;26(2):684. doi: 10.3390/s26020684

Algorithm 1: RL-Based Routing with Entropy & Gini Reward

Input: Current Node n_{i}, Neighbor Set N_{i}
Output: Next Hop n_{next}

Observe State S_{t} = (E_{res}, Dist_{sink}, Q_{len})
// Action Selection (Epsilon-Greedy)
IF random() < ε THEN Select random n_{next} from N_{i}
ELSE n_{next} = argmax_{n_{j}} Q(S_{t}, n_{j})
Forward Packet to n_{next} and Observe S_{t+1}
// Reward Calculation (Core Innovation)
Compute R_{progress} and R_{energy} using Equations (23) and (24)
Compute R_{equilibrium} based on Entropy Equation (27) & Gini Equation (28)
R_{total} = w_{1} \cdot R_{progress} + w_{2} \cdot R_{energy} + w_{3} \cdot R_{equilibrium}
// Update Q-Value
Q(S_{t}, n_{next}) = (1 - α) \cdot Q(S_{t}, n_{next}) + α \cdot (R_{total} + γ \cdot max(Q(S_{t+1})))
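The steps of Algorithm 1 can be illustrated with a minimal tabular Q-learning sketch in Python. It is not the authors' implementation. The class `QRouter`, its method names, the default hyperparameters, and the weights `(w1, w2, w3)` are hypothetical. The state is assumed to be a discretized, hashable tuple (E_{res}, Dist_{sink}, Q_{len}). Equations (23), (24), (27), and (28) are not reproduced in this section, so R_{progress} and R_{energy} are taken as inputs, and the helpers `shannon_entropy` and `gini` use the textbook definitions over neighbor residual energies. Combining them as entropy minus Gini in `equilibrium_reward` is an assumption for illustration only.

```python
import math
import random
from collections import defaultdict


def shannon_entropy(energies):
    """Normalized Shannon entropy of energy shares (1.0 = perfectly even).
    Textbook definition; assumed, not taken from the paper's Equation (27)."""
    total = sum(energies)
    if total <= 0 or len(energies) < 2:
        return 0.0
    shares = [e / total for e in energies if e > 0]
    return -sum(p * math.log(p) for p in shares) / math.log(len(energies))


def gini(energies):
    """Gini coefficient of energies (0.0 = perfectly even).
    Textbook definition; assumed, not taken from the paper's Equation (28)."""
    n, total = len(energies), sum(energies)
    if n == 0 or total <= 0:
        return 0.0
    weighted = sum((i + 1) * x for i, x in enumerate(sorted(energies)))
    return 2 * weighted / (n * total) - (n + 1) / n


def equilibrium_reward(energies):
    # Assumption: high entropy and low Gini both indicate balanced energy use.
    return shannon_entropy(energies) - gini(energies)


class QRouter:
    """Hypothetical tabular Q-learning router following the steps of Algorithm 1."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1,
                 weights=(0.4, 0.3, 0.3), rng=None):
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.w1, self.w2, self.w3 = weights
        self.q = defaultdict(float)  # (state, neighbor) -> Q-value, default 0.0
        self.rng = rng or random.Random()

    def select_next_hop(self, state, neighbors):
        # Epsilon-greedy: explore with probability epsilon, otherwise exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(neighbors)
        return max(neighbors, key=lambda n: self.q[(state, n)])

    def total_reward(self, r_progress, r_energy, r_equilibrium):
        # R_total = w1 * R_progress + w2 * R_energy + w3 * R_equilibrium
        return (self.w1 * r_progress + self.w2 * r_energy
                + self.w3 * r_equilibrium)

    def update(self, state, action, reward, next_state, next_neighbors):
        # Q <- (1 - alpha) * Q + alpha * (R_total + gamma * max Q(S_{t+1}, .))
        best_next = max((self.q[(next_state, n)] for n in next_neighbors),
                        default=0.0)
        key = (state, action)
        self.q[key] = ((1 - self.alpha) * self.q[key]
                       + self.alpha * (reward + self.gamma * best_next))
        return self.q[key]
```

In use, a node would call `select_next_hop` before forwarding, then `total_reward` and `update` once the next state has been observed. A dictionary-backed table keeps the sketch short; a real deployment would need to bound the state space, for example by binning energy, distance, and queue length.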