Enhancing Non-Intrusive Load Monitoring with Features Extracted by Independent Component Analysis

Hoosh, Kamyshev, Ouerdane

In this paper, a novel neural network architecture is proposed to address the challenges in energy disaggregation algorithms. These challenges include the limited availability of data and the complexity of disaggregating a large number of appliances operating simultaneously. The proposed model utilizes independent component analysis as the backbone of the neural network and is evaluated using the F1-score for varying numbers of appliances working concurrently. Our results demonstrate that the model is less prone to overfitting, exhibits low complexity, and effectively decomposes signals with many individual components. Furthermore, we show that the proposed model outperforms existing algorithms when applied to real-world data.

Basic Information

Paper ID: 2501.16817
Title: Enhancing Non-Intrusive Load Monitoring with Features Extracted by Independent Component Analysis
Authors: Sahar Moghimian Hoosh, Ilia Kamyshev, Henni Ouerdane (Skolkovo Institute of Science and Technology)
Classification: eess.SY cs.LG cs.SY
Publication Date: January 28, 2025
Paper Link: https://arxiv.org/abs/2501.16817

Abstract

This paper proposes a novel neural network architecture to address challenges in energy decomposition algorithms. These challenges include limited data availability and the complexity of decomposing signals from numerous concurrently operating devices. The proposed model leverages Independent Component Analysis (ICA) as the backbone of the neural network and employs F1 score to evaluate performance across varying numbers of concurrent devices. Results demonstrate that the model exhibits low propensity for overfitting, maintains low complexity, and effectively decomposes signals with multiple independent components. Furthermore, we demonstrate that the proposed model outperforms existing algorithms when applied to real-world data.

Research Background and Motivation

Problem Background

Non-Intrusive Load Monitoring (NILM), also known as energy disaggregation, is a technique that decomposes household total energy consumption into individual device-level components through advanced analytics. This concept was originally proposed by G. Hart in the 1980s and has received considerable attention in recent years due to its potential in improving energy efficiency, demand response, and load forecasting.

Core Challenges

Data Limitations: Limited availability of labeled data impedes deep neural network training
Complexity Issues: Decomposition complexity arising from multiple simultaneously operating devices
Algorithm Limitations: Existing algorithms consume substantial memory, are sensitive to overfitting, and are difficult to deploy on sensors
Dataset Bias: Limited device combinations in available datasets, biased toward most commonly used devices
Practical Application Difficulties: Detecting the simultaneous switching of multiple devices and estimating accurately in noisy real-world scenarios

Research Motivation

Existing deep learning models in NILM face challenges including reduced decomposition accuracy, increased generalization error, and overfitting due to limited training data. This research aims to develop a more robust and efficient energy decomposition algorithm by combining physical principles with ICA techniques.

Core Contributions

First Application of ICA as Feature Extraction Technique: First use of ICA for feature extraction in multi-label classification models for NILM, particularly in high-frequency sampling scenarios (>1kHz)
Proposed ICA+ResNetFFN Architecture: A novel neural network architecture designed around the physical characteristics of the energy decomposition problem
Comprehensive Performance Evaluation: Systematic evaluation of algorithm performance across varying numbers of simultaneously operating devices
Synthetic Data Generation Method: Generation of linearly separable synthetic device categories based on Kirchhoff's law
Experimental Validation: Demonstration of method superiority on both real and synthetic data

Methodology Details

Task Definition

Input: Aggregated power signal X (voltage and current signals)
Output: Binary vector indicating whether corresponding device categories are present in the mixed signal
Constraints: Handling scenarios with 1 to nclasses devices operating simultaneously, considering device repetition (e.g., multiple chargers, light bulbs)
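
To illustrate the output format, here is a minimal sketch of the binary presence vector. The class names are hypothetical placeholders, not the dataset's actual labels. Repeated devices collapse to a single 1 because the task asks about category presence, not device count.

```python
# Hypothetical class names, for illustration only.
CLASSES = ["charger", "kettle", "light_bulb", "fan"]

def encode_presence(active_devices, classes=CLASSES):
    """Return a binary vector: 1 if a device category is present, else 0.

    Repeated devices (e.g. two chargers) still map to a single 1.
    """
    present = set(active_devices)
    return [1 if name in present else 0 for name in classes]

label = encode_presence(["charger", "charger", "light_bulb"])
print(label)  # [1, 0, 1, 0]
```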

Model Architecture

ICA+ResNetFFN Architecture

Aggregated Signal X → ICA Decomposition → Linear Projection → ResNet Block Sequence → Multi-label Classification

Core Steps:

ICA Decomposition: Using FastICA implementation to obtain unmixing matrix U, decomposing aggregated signal X into nclasses+1 components:
```
X' = XU^T
```
where "+1" accounts for the Gaussian component
Linear Projection: Projecting X' to space with dimension dmodel:
```
Xd = X'W^T + b = XUW^T + b
```
ResNet Processing: Xd passes through a sequence of nblocks residual blocks, each consisting of a pair of linear layers with ReLU activation and a residual connection

Parameter Settings: dmodel = 64, nblocks = 15, total parameters = 65,000
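
The three steps above can be sketched as a NumPy forward pass. This is an illustrative sketch, not the authors' implementation. All weights are random placeholders, including U, which in practice would come from a fitted FastICA. The input width n_features, the position of the ReLU inside each block, and the sigmoid output head are assumptions that the summary does not specify.

```python
import numpy as np

def init_params(n_features, n_classes, d_model=64, n_blocks=15, seed=0):
    """Random placeholder weights (assumption: no training, no fitted ICA)."""
    rng = np.random.default_rng(seed)
    n_comp = n_classes + 1  # "+1" accounts for the Gaussian component
    s = 0.05                # small scale keeps activations bounded
    return {
        "U": rng.normal(size=(n_comp, n_features)),       # unmixing matrix (placeholder)
        "W": rng.normal(scale=s, size=(d_model, n_comp)),  # linear projection
        "b": np.zeros(d_model),
        "blocks": [
            (rng.normal(scale=s, size=(d_model, d_model)), np.zeros(d_model),
             rng.normal(scale=s, size=(d_model, d_model)), np.zeros(d_model))
            for _ in range(n_blocks)
        ],
        "W_out": rng.normal(scale=s, size=(n_classes, d_model)),  # assumed output head
        "b_out": np.zeros(n_classes),
    }

def forward(X, p):
    """X: (batch, n_features) -> presence probabilities (batch, n_classes)."""
    X_ica = X @ p["U"].T              # X' = X U^T
    h = X_ica @ p["W"].T + p["b"]     # Xd = X' W^T + b
    for W1, b1, W2, b2 in p["blocks"]:
        z = np.maximum(h @ W1.T + b1, 0.0)  # first linear layer + ReLU
        h = h + (z @ W2.T + b2)             # second linear layer + residual connection
    logits = h @ p["W_out"].T + p["b_out"]
    return 1.0 / (1.0 + np.exp(-logits))    # one sigmoid per class (multi-label)

params = init_params(n_features=100, n_classes=16)  # 100 is a hypothetical input width
X = np.random.default_rng(1).normal(size=(4, 100))
probs = forward(X, params)
print(probs.shape)  # (4, 16)
```

A per-class sigmoid is used here instead of a softmax because several device categories can be present at once.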

Physical Principle Support

ICA selection is grounded in the following physical principles:

Kirchhoff's Law: Aggregated signal follows iagg(t) = Σk ik(t)
Linear Mixture Assumption: ICA assumes source signals are linearly mixed, consistent with power grid physical characteristics
Source Separation: Aggregated signal is a linear mixture of individual source contributions
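
The link between the linear mixture assumption and the unmixing step X' = XU^T can be written out as follows. This is a sketch under the standard ICA assumptions (as many observed mixtures as sources, and an invertible mixing matrix A); whether the paper's setup meets them exactly is not stated here. S holds one source per column, P is a permutation matrix, and D is a diagonal scaling matrix.

```latex
% Assumption: n sources, n observed mixtures, invertible mixing matrix A.
X = S A^{\top}, \qquad
X' = X U^{\top} = S A^{\top} U^{\top} = S\,(U A)^{\top}, \qquad
U A \approx P D \;\Rightarrow\; X' \approx S\,(P D)^{\top}
```

In words, ICA can recover the sources only up to ordering and scale. Kirchhoff's law corresponds to a single mixture whose mixing weights are all equal to 1.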

Comparison with Baseline Methods

1. Temporal Pooling NILM (TP-NILM)

Encoder-temporal pooling-decoder structure
Convolutional and max pooling layers extract 256-dimensional features
Average pooling layers with four different filter configurations

2. FIT-PS+LSTM

Frequency Invariant Transformation of Periodic Signals (FIT-PS) feature extraction
Signal segmentation based on fundamental frequency using zero-crossing points
LSTM network processes temporal features
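
The zero-crossing segmentation step can be sketched as below. This is a simplified illustration, not the FIT-PS reference implementation: it cuts only at rising zero crossings of the voltage and omits the per-period resampling that a full pipeline would likely need. The 60 Hz mains frequency and the 3 kHz sampling rate are toy values.

```python
import math

def rising_zero_crossings(voltage):
    """Indices i where the voltage goes from negative to non-negative."""
    return [i for i in range(1, len(voltage))
            if voltage[i - 1] < 0.0 <= voltage[i]]

def split_into_periods(voltage, current):
    """Cut the current into one segment per fundamental period of the voltage."""
    starts = rising_zero_crossings(voltage)
    return [current[a:b] for a, b in zip(starts, starts[1:])]

# Toy signal: 60 Hz mains sampled at 3 kHz, i.e. 50 samples per period.
fs, f0, n = 3000, 60, 300
t = [k / fs for k in range(n)]
voltage = [math.sin(2 * math.pi * f0 * tk - 0.3) for tk in t]
current = [0.5 * math.sin(2 * math.pi * f0 * tk - 0.8) for tk in t]

periods = split_into_periods(voltage, current)
print(len(periods), [len(p) for p in periods])  # 5 [50, 50, 50, 50, 50]
```

Each segment could then be fed to the LSTM as one time step, so that the sequence length counts mains periods, not raw samples.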

3. Fryze+CNN

Feature extraction based on Fryze power theory
Decomposes the current into orthogonal active and non-active components: i(t) = ia(t) + if(t)
Four-block CNN structure with channels 16, 32, 64, 128
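
For reference, the Fryze decomposition sets the active current to ia(t) = (P / Vrms^2) v(t), where P is the mean of v(t) i(t) over whole periods, and takes the remainder as the non-active part. A minimal sketch on a toy sinusoid follows; the signal values are illustrative and not taken from the dataset.

```python
import math

def fryze_decompose(voltage, current):
    """Split current into active i_a and non-active i_f parts (Fryze).

    i_a(t) = (P / V_rms^2) * v(t), with P the mean of v*i over the window,
    and i_f(t) = i(t) - i_a(t). The window should span whole periods.
    """
    n = len(voltage)
    p_active = sum(v * i for v, i in zip(voltage, current)) / n
    v_rms_sq = sum(v * v for v in voltage) / n
    g = p_active / v_rms_sq  # equivalent conductance
    i_active = [g * v for v in voltage]
    i_nonactive = [i - ia for i, ia in zip(current, i_active)]
    return i_active, i_nonactive

# One full period; the current lags the voltage by 60 degrees.
n = 50
voltage = [math.sin(2 * math.pi * k / n) for k in range(n)]
current = [math.sin(2 * math.pi * k / n - math.pi / 3) for k in range(n)]

i_a, i_f = fryze_decompose(voltage, current)
dot = sum(a * f for a, f in zip(i_a, i_f))
print(abs(dot) < 1e-9)  # True: the two components are orthogonal
```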

Experimental Setup

Datasets

PLAID Dataset

Scale: 1,800 samples, 30kHz sampling rate, 16 device categories
Preprocessing: Resampled to 3kHz, extracting 19,000 regions of interest
Split Ratio: 70% training, 10% validation, 20% testing
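
As a quick check of the split sizes, here is a sketch that shuffles 19,000 region indices and cuts them 70/10/20. It assumes a plain random split; the summary does not say whether the split was random or stratified.

```python
import random

def split_indices(n_items, fractions=(0.7, 0.1, 0.2), seed=0):
    """Shuffle indices and cut them into train/validation/test parts."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)
    n_train = round(fractions[0] * n_items)
    n_val = round(fractions[1] * n_items)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

train, val, test = split_indices(19_000)
print(len(train), len(val), len(test))  # 13300 1900 3800
```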

Synthetic Dataset

Generation Method: Artificial combination of individual device measurement signals based on Kirchhoff's law
Characteristics: Linearly separable categories, reducing class imbalance
Device Repetition: Considering 1-10 device repetitions (e.g., multiple chargers, light bulbs)
Random Generation: Each category appears with equal probability in mixed signals
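
The generation procedure can be sketched as follows. It rests on stated assumptions: each class is switched on with the same probability p_present (0.5 is a placeholder), repeated devices are modelled as identical in-phase copies (an integer multiple of one measured signal), and at least one device is always on. The toy signals are hypothetical.

```python
import random

def make_mixture(device_signals, rng, max_repeats=10, p_present=0.5):
    """Sum individual device currents into one aggregate (Kirchhoff's law).

    device_signals: dict mapping class name -> list of current samples,
    all of the same length. Returns (aggregate, label, counts).
    """
    names = list(device_signals)
    counts = [0] * len(names)
    while not any(counts):  # redraw until at least one device is on
        counts = [rng.randint(1, max_repeats) if rng.random() < p_present else 0
                  for _ in names]
    n = len(device_signals[names[0]])
    aggregate = [sum(c * device_signals[name][k] for c, name in zip(counts, names))
                 for k in range(n)]
    label = [1 if c > 0 else 0 for c in counts]
    return aggregate, label, counts

# Hypothetical two-sample "signals", for illustration only.
signals = {"charger": [1.0, 2.0], "kettle": [0.5, -0.5], "light_bulb": [0.0, 1.0]}
aggregate, label, counts = make_mixture(signals, random.Random(0))
print(label, counts)
```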

Evaluation Metrics

Primary Metric: F1 score (sample-averaged)
Detailed Analysis: F1 score distribution for 1 to nclasses simultaneously operating devices
Ideal Target: Uniform F1 score distribution across different device quantities
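
A minimal sketch of both metrics follows: the sample-averaged F1 score, and the same score grouped by the number of truly active devices. Scoring an empty prediction against an empty target as 1.0 is an assumption made here; libraries differ on that edge case.

```python
from collections import defaultdict

def sample_f1(y_true, y_pred):
    """F1 for one sample's binary label vectors: 2*TP / (|true| + |pred|)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    denom = sum(y_true) + sum(y_pred)
    return 2 * tp / denom if denom else 1.0  # assumption: empty vs empty scores 1.0

def sample_averaged_f1(Y_true, Y_pred):
    """Mean of the per-sample F1 scores."""
    return sum(sample_f1(t, p) for t, p in zip(Y_true, Y_pred)) / len(Y_true)

def f1_by_device_count(Y_true, Y_pred):
    """Mean per-sample F1, grouped by how many devices are truly active."""
    groups = defaultdict(list)
    for t, p in zip(Y_true, Y_pred):
        groups[sum(t)].append(sample_f1(t, p))
    return {k: sum(v) / len(v) for k, v in sorted(groups.items())}

Y_true = [[1, 0, 0], [1, 1, 0], [1, 1, 1]]
Y_pred = [[1, 0, 0], [1, 0, 0], [1, 1, 0]]
print(round(sample_averaged_f1(Y_true, Y_pred), 4))  # 0.8222
print(f1_by_device_count(Y_true, Y_pred))
```

A flat profile in the grouped output corresponds to the uniform F1 distribution named above as the ideal target.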

Implementation Details

Hardware Environment: 2× RTX 2080 Ti GPUs, 128GB RAM
Training Time: 45 minutes per experiment
Comparison Models: 6 models (4 deep learning + 2 classical machine learning)

Experimental Results

Main Results

Synthetic Data Experiments

Model	F1 Score
ICA+ResNetFFN	0.95
Random Forest	0.93
k-NN	0.88
FIT-PS+LSTM	0.72
Fryze+CNN	0.68
Temporal Pooling NILM	0.67

Real Data Experiments

Model	F1 Score
ICA+ResNetFFN	0.77
Random Forest	0.76
k-NN	0.75
Fryze+CNN	0.64
FIT-PS+LSTM	0.62
Temporal Pooling NILM	0.60

Key Findings

1. Convergence Performance

ICA+ResNetFFN: Exhibits lowest validation loss and highest F1 score with smoother convergence
Other Models: Significant performance degradation with 2-10 concurrent devices

2. Robustness Analysis

Synthetic Data: Proposed method maintains consistent F1 scores across varying device quantities
Real Data: F1 scores are no longer perfectly uniform, but the proposed method outperforms the other algorithms in the regions where they degrade

3. t-SNE Visualization Analysis

Real Data: Complex device category structure with multiple data point clusters or overlaps
Synthetic Data: Linearly separable categories with clear structure
Overlap Causes: Devices share common electrical components (e.g., washing machines and kettles both have heating elements)

Related Work

Traditional Methods

k-NN Algorithm: Uses steady-state features for device identification, but performs poorly on unknown devices
Classical Machine Learning: Performs well on ICA features but lacks deep feature extraction capability

Deep Learning Methods

LSTM Networks: Improve classification accuracy when combined with the FIT-PS representation, but require a validation set for optimal initialization
CNN Methods: Deep convolutional networks based on image segmentation techniques, but feature space expansion comes at the cost of reduced temporal resolution
Temporal Pooling: Extends feature dimensions for multi-label classification, but exhibits higher computational complexity

Advantages of This Work

Physics-Guided Design: ICA selection grounded in Kirchhoff's law
Low Complexity: Relatively simple architecture design
Overfitting Resistance: Superior generalization capability
Multi-device Handling: Effective processing of numerous concurrent devices

Conclusions and Discussion

Main Conclusions

ICA Effectiveness: Using ICA as feature extraction method significantly improves NILM performance
Importance of Physical Principles: Model design considering data physical characteristics is crucial
Synthetic Data Value: Linearly separable synthetic data aids in guiding optimal architecture development
Performance Superiority: Outperforms existing baseline methods on both real and synthetic data

Limitations

Device Quantity Constraints: Current work focuses only on three-device classification
Data Dependency: Requires abundant training samples to address all possible device combinations
Real Data Challenges: Complex structure and overlap issues in real device categories require further resolution
Generalization Capability: Performance on larger numbers of devices requires further verification

Future Directions

Extended Device Quantities: Verify method performance across more device categories
Improved Feature Extraction: Address device overlap issues in real data
Real-time Applications: Optimize algorithms for real-time monitoring requirements
Cross-domain Generalization: Enhance model adaptability across different power grid environments

In-Depth Evaluation

Strengths

Strong Innovation: First combination of ICA with deep learning for NILM with clear physical theoretical support
Comprehensive Experiments: Thorough evaluation on synthetic and real data with multiple baseline comparisons
In-depth Analysis: t-SNE visualization explains performance differences
Practical Value: Low-complexity design facilitates real-world deployment
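Convincing Results: Achieves the highest F1 score on both synthetic (0.95) and real (0.77) data, narrowly ahead of Random Forest (0.93 and 0.76) and well ahead of the deep learning baselines (0.60 to 0.72)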

Weaknesses

Device Scale Limitations: Validation only on 16 device categories, lacking large-scale verification
Insufficient Theoretical Analysis: Lacks theoretical explanation for ICA+ResNet combination effectiveness
Missing Complexity Analysis: Lacks detailed time and space complexity analysis
Limited Robustness Testing: Insufficient evaluation of robustness to noise, device aging, and other practical factors

Impact

Academic Contribution: Provides new research perspectives and methods for NILM field
Practical Value: Simple and effective architecture design with real-world application potential
Reproducibility: Python implementation code provided for easy reproduction and extension
Inspirational Significance: Demonstrates importance of physics-guided model design

Applicable Scenarios

Smart Homes: Household energy management and monitoring systems
Industrial Monitoring: Factory equipment energy consumption analysis
Power Grid Management: Distribution network load disaggregation and forecasting
Energy Efficiency Applications: Device-level monitoring-based energy optimization

References

This paper cites 16 relevant references covering classical NILM work (Hart, 1992), deep learning methods, feature extraction techniques, and related datasets, providing solid theoretical foundation and comparison benchmarks.

Overall Assessment: This is an innovative work in the NILM field that proposes an effective solution by combining physical principles with deep learning. While exhibiting certain limitations in device scale and theoretical analysis, its core ideas and experimental results provide valuable contributions to the field's development.