Sustainability, Vol. 18, Pages 1562: HydroSNN: Event-Driven Computer Vision with Spiking Transformers for Energy-Efficient Edge Perception in Sustainable Water Conservancy and Urban Water Utilities

Source: Sustainability - Scientific journal (MDPI)
Sustainability doi: 10.3390/su18031562
Authors:
Jing Liu
Hong Liu
Yangdong Li

Digital transformation in water conservancy and urban water utilities demands edge perception systems that are accurate, fast, energy-efficient, and maintainable over long service lifecycles. We present HydroSNN, a neuromorphic computer-vision framework that couples an event-driven sensing pipeline with a spiking-transformer backbone to support monitoring of canals, reservoirs, treatment plants, and buried pipeline networks. By reducing always-on compute and unnecessary data movement, HydroSNN targets sustainability goals in smart water infrastructure: lower operational energy use, fewer site visits, and improved resilience under harsh illumination and weather. HydroSNN introduces three novel components: (i) spiking temporal tokenization (STT), which converts asynchronous events and optional frames into latency-aware spike tokens while preserving motion cues relevant to hydraulics; (ii) physics-guided spiking attention (PGSA), which injects lightweight mass-conservation/continuity constraints into attention weights via a differentiable regularizer to suppress physically implausible interactions; and (iii) cross-modal self-supervision (CM-SSL), which aligns RGB frames, event streams, and low-cost acoustic/vibration traces using masked prediction to reduce annotation requirements. We evaluate HydroSNN on public water-surface and event-vision benchmarks (MaSTr1325, SeaDronesSee, DSEC, MVSEC, DAVIS, and DDD20) and report accuracy, latency, and an operation-based energy proxy. HydroSNN improves mIoU/F1 over strong CNN/ViT baselines while reducing end-to-end latency and the estimated energy proxy in event-driven settings. These efficiency gains are practically relevant for off-grid or power-constrained deployments and support sustainable development by enabling continuous, low-power monitoring and timely anomaly response.
These results demonstrate that event-driven spiking vision, augmented with simple physics guidance, offers a practical and efficient solution for resilient perception in smart water infrastructure.
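
The abstract reports an "operation-based energy proxy" without defining it. As a rough illustration only, the sketch below shows one convention commonly used in the spiking-network literature: dense layers are charged one multiply-accumulate (MAC) per connection, while spiking layers are charged one accumulate (AC) only for connections driven by a spike. The per-operation energies (4.6 pJ per MAC, 0.9 pJ per AC, frequently quoted 45 nm CMOS estimates), the function names, and the example layer size and firing rate are all assumptions made for this sketch, not values or code taken from HydroSNN.

```python
# Assumed per-operation energies (45 nm CMOS estimates often quoted in SNN
# papers). Illustrative assumptions only, not values reported for HydroSNN.
E_MAC_PJ = 4.6  # one 32-bit floating-point multiply-accumulate, in picojoules
E_AC_PJ = 0.9   # one 32-bit floating-point accumulate, in picojoules


def ann_energy_pj(macs: float) -> float:
    """Energy proxy for a dense (CNN/ViT-style) layer: every MAC is executed."""
    return macs * E_MAC_PJ


def snn_energy_pj(macs: float, firing_rate: float, timesteps: int) -> float:
    """Energy proxy for a spiking layer (hypothetical helper).

    Only connections driven by a spike perform an accumulate, so the synaptic
    operation count is macs * firing_rate * timesteps.
    """
    if not 0.0 <= firing_rate <= 1.0:
        raise ValueError("firing_rate must lie in [0, 1]")
    synaptic_ops = macs * firing_rate * timesteps
    return synaptic_ops * E_AC_PJ


if __name__ == "__main__":
    layer_macs = 1e9  # hypothetical layer size
    dense = ann_energy_pj(layer_macs)
    spiking = snn_energy_pj(layer_macs, firing_rate=0.1, timesteps=4)
    # 1 mJ = 1e9 pJ
    print(f"dense: {dense / 1e9:.2f} mJ, spiking: {spiking / 1e9:.2f} mJ")
```

Under this convention the spiking estimate falls below the dense one only when the firing rate multiplied by the number of timesteps is small enough, which is why sparse, event-driven input matters for the efficiency claim.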