Enhancing Non-Intrusive Load Monitoring with Features Extracted by Independent Component Analysis

Hoosh, Kamyshev, Ouerdane
In this paper, a novel neural network architecture is proposed to address the challenges in energy disaggregation algorithms. These challenges include the limited availability of data and the complexity of disaggregating a large number of appliances operating simultaneously. The proposed model utilizes independent component analysis as the backbone of the neural network and is evaluated using the F1-score for varying numbers of appliances working concurrently. Our results demonstrate that the model is less prone to overfitting, exhibits low complexity, and effectively decomposes signals with many individual components. Furthermore, we show that the proposed model outperforms existing algorithms when applied to real-world data.

Basic Information

  • Paper ID: 2501.16817
  • Title: Enhancing Non-Intrusive Load Monitoring with Features Extracted by Independent Component Analysis
  • Authors: Sahar Moghimian Hoosh, Ilia Kamyshev, Henni Ouerdane (Skolkovo Institute of Science and Technology)
  • Classification: eess.SY cs.LG cs.SY
  • Publication Date: January 28, 2025
  • Paper Link: https://arxiv.org/abs/2501.16817

Abstract

This paper proposes a novel neural network architecture to address challenges in energy disaggregation algorithms. These challenges include limited data availability and the complexity of disaggregating a large number of appliances operating simultaneously. The proposed model uses Independent Component Analysis (ICA) as the backbone of the neural network and is evaluated with the F1 score for varying numbers of concurrently operating appliances. The results show that the model is less prone to overfitting, has low complexity, and effectively decomposes signals with many individual components. The model also outperforms existing algorithms on real-world data.

Research Background and Motivation

Problem Background

Non-Intrusive Load Monitoring (NILM), also known as energy disaggregation, is a technique that decomposes a household's total energy consumption into individual device-level components. The concept was originally proposed by G. Hart in the 1980s and has received considerable attention in recent years because of its potential to improve energy efficiency, demand response, and load forecasting.

Core Challenges

  1. Data Limitations: Limited availability of labeled data impedes deep neural network training
  2. Complexity Issues: Decomposition complexity arising from multiple simultaneously operating devices
  3. Algorithm Limitations: Existing algorithms consume substantial memory, are sensitive to overfitting, and are difficult to deploy on sensors
  4. Dataset Bias: Available datasets contain limited device combinations and are biased toward the most commonly used devices
  5. Practical Application Difficulties: Detecting simultaneous switching of multiple devices and estimating accurately in noisy real-world scenarios

Research Motivation

Existing deep learning models for NILM suffer from reduced disaggregation accuracy, increased generalization error, and overfitting when training data is limited. This research aims to develop a more robust and efficient energy disaggregation algorithm by combining physical principles with ICA.

Core Contributions

  1. First Application of ICA as Feature Extraction Technique: First use of ICA for feature extraction in multi-label classification models for NILM, particularly in high-frequency sampling scenarios (> 1 kHz)
  2. Proposed ICA+ResNetFFN Architecture: A novel neural network architecture that uses ICA as its backbone, chosen to match the physical characteristics of the energy disaggregation problem
  3. Comprehensive Performance Evaluation: Systematic evaluation of algorithm performance across varying numbers of simultaneously operating devices
  4. Synthetic Data Generation Method: Generation of linearly separable synthetic device categories based on Kirchhoff's law
  5. Experimental Validation: Demonstration of method superiority on both real and synthetic data

Methodology Details

Task Definition

  • Input: Aggregated power signal X (voltage and current signals)
  • Output: Binary vector indicating whether the corresponding device categories are present in the mixed signal
  • Constraints: Handling scenarios with 1 to nclasses devices operating simultaneously, considering device repetition (e.g., multiple chargers, light bulbs)

Model Architecture

ICA+ResNetFFN Architecture

Aggregated Signal X → ICA Decomposition → Linear Projection → ResNet Block Sequence → Multi-label Classification

Core Steps:

  1. ICA Decomposition: A FastICA implementation estimates the unmixing matrix U, which decomposes the aggregated signal X into nclasses+1 components:
    X' = XU^T
    where the "+1" accounts for the Gaussian component
  2. Linear Projection: X' is projected to a space of dimension dmodel:
    Xd = X'W^T + b = XU^TW^T + b
  3. ResNet Processing: Xd passes through a sequence of nblocks blocks, each a pair of linear layers with ReLU activation and a residual connection

Parameter Settings: dmodel = 64, nblocks = 15, total parameters = 65,000
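
The three core steps can be sketched as a plain NumPy forward pass. This is a minimal illustration, not the authors' implementation: the unmixing matrix U is a random placeholder (in practice it would be estimated by FastICA), the block layout (two dmodel × dmodel layers per block), the sigmoid output head, the input sizes, and all function names are assumptions, and nblocks is reduced from 15 to 2 for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes for illustration; n_blocks is reduced from the reported 15.
n_samples, n_features = 8, 32
n_classes, d_model, n_blocks = 3, 64, 2

# Placeholder unmixing matrix U; in practice it would be estimated by FastICA.
U = rng.standard_normal((n_classes + 1, n_features))
W = 0.1 * rng.standard_normal((d_model, n_classes + 1))
b = np.zeros(d_model)

def make_block():
    """Random weights for one residual block (assumed: two d_model x d_model layers)."""
    return (0.05 * rng.standard_normal((d_model, d_model)), np.zeros(d_model),
            0.05 * rng.standard_normal((d_model, d_model)), np.zeros(d_model))

def resnet_block(x, w1, b1, w2, b2):
    h = np.maximum(x @ w1.T + b1, 0.0)   # first linear layer + ReLU
    return x + (h @ w2.T + b2)           # second linear layer + skip connection

def forward(X, blocks, w_out, b_out):
    X_ica = X @ U.T            # X' = X U^T, shape (n_samples, n_classes + 1)
    X_d = X_ica @ W.T + b      # X_d = X' W^T + b, shape (n_samples, d_model)
    for params in blocks:
        X_d = resnet_block(X_d, *params)
    logits = X_d @ w_out.T + b_out
    return 1.0 / (1.0 + np.exp(-logits))   # assumed per-class sigmoid head

X = rng.standard_normal((n_samples, n_features))
blocks = [make_block() for _ in range(n_blocks)]
w_out = 0.1 * rng.standard_normal((n_classes, d_model))
b_out = np.zeros(n_classes)
probs = forward(X, blocks, w_out, b_out)
labels = (probs > 0.5).astype(int)   # binary presence vector per sample
```

Because XU^TW^T = X(WU)^T, the ICA step and the projection compose into a single linear map applied to the aggregated signal.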

Physical Principle Support

ICA selection is grounded in the following physical principles:

  • Kirchhoff's Law: The aggregated current is the sum of the individual device currents, i_agg(t) = Σ_k i_k(t)
  • Linear Mixture Assumption: ICA assumes source signals are linearly mixed, consistent with power grid physical characteristics
  • Source Separation: Aggregated signal is a linear mixture of individual source contributions

Comparison with Baseline Methods

1. Temporal Pooling NILM (TP-NILM)

  • Encoder-temporal pooling-decoder structure
  • Convolutional and max pooling layers extract 256-dimensional features
  • Average pooling layers with four different filter configurations

2. FIT-PS+LSTM

  • Frequency Invariant Transform Periodic Signal (FIT-PS) feature extraction
  • Signal segmentation based on fundamental frequency using zero-crossing points
  • LSTM network processes temporal features
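
The zero-crossing segmentation step can be illustrated with a short sketch. It cuts the current at rising zero crossings of the voltage and resamples each period to a fixed length by linear interpolation. The function names, the 32-point period length, and the toy 50 Hz signal are assumptions for illustration and are not taken from the FIT-PS baseline itself.

```python
import math

def rising_zero_crossings(v):
    """Indices k where the voltage goes from negative (k-1) to non-negative (k)."""
    return [k for k in range(1, len(v)) if v[k - 1] < 0.0 <= v[k]]

def resample(segment, n_points):
    """Linearly interpolate a segment onto n_points equally spaced positions."""
    out = []
    for j in range(n_points):
        pos = j * (len(segment) - 1) / (n_points - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(segment) - 1)
        frac = pos - lo
        out.append(segment[lo] * (1.0 - frac) + segment[hi] * frac)
    return out

def segment_periods(v, i, n_points=32):
    """Cut current i into one row per voltage period, each resampled to n_points."""
    zc = rising_zero_crossings(v)
    return [resample(i[a:b], n_points) for a, b in zip(zc, zc[1:])]

# Toy signal (assumed): 50 Hz mains sampled at 3 kHz, five periods long.
fs, f0 = 3000, 50
t = [n / fs for n in range(5 * fs // f0)]
v = [math.sin(2 * math.pi * f0 * x - 0.1) for x in t]
i = [0.5 * math.sin(2 * math.pi * f0 * x - 0.4) for x in t]
periods = segment_periods(v, i)   # list of rows, one per complete period
```

Each row then has the same length regardless of small drifts in the mains frequency, which is what makes the representation suitable as a sequence input.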

3. Fryze+CNN

  • Feature extraction based on Fryze power theory
  • Decomposes the total current into orthogonal active and non-active components: i(t) = i_a(t) + i_f(t)
  • Four-block CNN structure with channels 16, 32, 64, 128
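
The Fryze decomposition behind this baseline can be sketched in a few lines. The active current is the part of i(t) proportional to the voltage, i_a(t) = (P / V_rms^2) v(t), and the non-active current is the remainder. The function name and the toy sinusoidal signals are assumptions for illustration.

```python
import math

def fryze_decompose(v, i):
    """Split current i into an active part i_a (proportional to v) and a non-active remainder i_f."""
    n = len(v)
    p_active = sum(vk * ik for vk, ik in zip(v, i)) / n   # mean (active) power P
    v_rms_sq = sum(vk * vk for vk in v) / n               # V_rms squared
    g = p_active / v_rms_sq                               # equivalent conductance
    i_a = [g * vk for vk in v]
    i_f = [ik - iak for ik, iak in zip(i, i_a)]
    return i_a, i_f

# Toy single-period signals (assumed): current lags voltage by 60 degrees.
n_points = 60
v = [math.sin(2 * math.pi * k / n_points) for k in range(n_points)]
i = [2.0 * math.sin(2 * math.pi * k / n_points - math.pi / 3) for k in range(n_points)]
i_a, i_f = fryze_decompose(v, i)

# Orthogonality: the inner product of the two components is zero up to rounding.
inner = sum(a * f for a, f in zip(i_a, i_f))
```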

Experimental Setup

Datasets

PLAID Dataset

  • Scale: 1,800 samples, 30 kHz sampling rate, 16 device categories
  • Preprocessing: Resampled to 3 kHz, extracting 19,000 regions of interest
  • Split Ratio: 70% training, 10% validation, 20% testing

Synthetic Dataset

  • Generation Method: Artificial combination of individual device measurement signals based on Kirchhoff's law
  • Characteristics: Linearly separable categories, reducing class imbalance
  • Device Repetition: Considering 1-10 device repetitions (e.g., multiple chargers, light bulbs)
  • Random Generation: Each category appears with equal probability in mixed signals
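
A minimal sketch of this kind of generator is shown below, assuming hypothetical per-class waveforms in place of measured device signals. Each class is switched on with the same probability, an active class is repeated 1 to 10 times, and the mixture is the sum of the contributions, following Kirchhoff's law. The function names, the waveform shapes, and the probability 0.5 are assumptions, not details from the paper.

```python
import math
import random

def make_signatures(n_classes, n_points=60):
    """Hypothetical per-class current waveforms: one harmonic and phase per class."""
    return [[math.sin(2 * math.pi * (c + 1) * n / n_points + 0.3 * c)
             for n in range(n_points)]
            for c in range(n_classes)]

def synthesize(signatures, rng, p_on=0.5, max_repeats=10):
    """Return (mixed signal, per-class repetition counts, binary label vector)."""
    n_classes = len(signatures)
    on = [False] * n_classes
    while not any(on):   # require at least one active class
        on = [rng.random() < p_on for _ in range(n_classes)]
    counts = [rng.randint(1, max_repeats) if flag else 0 for flag in on]
    # Kirchhoff's law: the aggregate is the sum of the individual currents.
    mix = [sum(counts[c] * signatures[c][n] for c in range(n_classes))
           for n in range(len(signatures[0]))]
    label = [int(c > 0) for c in counts]
    return mix, counts, label

rng = random.Random(0)
signatures = make_signatures(3)
mix, counts, label = synthesize(signatures, rng)
```

The label records only presence, so a mixture with five chargers and one with a single charger share the same target vector.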

Evaluation Metrics

  • Primary Metric: F1 score (sample-averaged)
  • Detailed Analysis: F1 score distribution for 1 to nclasses simultaneously operating devices
  • Ideal Target: Uniform F1 score distribution across different device quantities
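
The sample-averaged F1 score and its breakdown by number of active devices can be computed as in the sketch below. The function names and toy labels are illustrative, and returning 0.0 when both the true and predicted label sets are empty is an assumed convention.

```python
from collections import defaultdict

def sample_f1(y_true, y_pred):
    """F1 for one sample: 2|T ∩ P| / (|T| + |P|) over binary label vectors."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    denom = sum(y_true) + sum(y_pred)
    return 2.0 * tp / denom if denom else 0.0   # assumed convention for empty sets

def mean_sample_f1(Y_true, Y_pred):
    """Sample-averaged F1 over a whole set of predictions."""
    scores = [sample_f1(yt, yp) for yt, yp in zip(Y_true, Y_pred)]
    return sum(scores) / len(scores)

def f1_by_device_count(Y_true, Y_pred):
    """Mean sample F1 grouped by how many devices are truly active."""
    groups = defaultdict(list)
    for yt, yp in zip(Y_true, Y_pred):
        groups[sum(yt)].append(sample_f1(yt, yp))
    return {k: sum(v) / len(v) for k, v in sorted(groups.items())}

# Toy labels for three device classes.
Y_true = [[1, 0, 0], [1, 1, 0], [1, 1, 1], [0, 1, 1]]
Y_pred = [[1, 0, 0], [1, 0, 0], [1, 1, 1], [0, 1, 0]]
overall = mean_sample_f1(Y_true, Y_pred)
per_count = f1_by_device_count(Y_true, Y_pred)
```

A flat `per_count` profile corresponds to the ideal target named above: the classifier is equally reliable whether one device or many are running.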

Implementation Details

  • Hardware Environment: 2× RTX 2080 Ti GPUs, 128GB RAM
  • Training Time: 45 minutes per experiment
  • Comparison Models: 6 models (4 deep learning + 2 classical machine learning)

Experimental Results

Main Results

Synthetic Data Experiments

Model                    F1 Score
ICA+ResNetFFN            0.95
Random Forest            0.93
k-NN                     0.88
FIT-PS+LSTM              0.72
Fryze+CNN                0.68
Temporal Pooling NILM    0.67

Real Data Experiments

Model                    F1 Score
ICA+ResNetFFN            0.77
Random Forest            0.76
k-NN                     0.75
Fryze+CNN                0.64
FIT-PS+LSTM              0.62
Temporal Pooling NILM    0.60

Key Findings

1. Convergence Performance

  • ICA+ResNetFFN: Exhibits lowest validation loss and highest F1 score with smoother convergence
  • Other Models: Significant performance degradation with 2-10 concurrent devices

2. Robustness Analysis

  • Synthetic Data: Proposed method maintains consistent F1 scores across varying device quantities
  • Real Data: F1 scores are no longer uniform across device counts, but the proposed method outperforms the other algorithms in the regions where they degrade

3. t-SNE Visualization Analysis

  • Real Data: Complex device category structure with multiple data point clusters or overlaps
  • Synthetic Data: Linearly separable categories with clear structure
  • Overlap Causes: Devices share common electrical components (e.g., washing machines and kettles both have heating elements)

Related Work

Traditional Methods

  • k-NN Algorithm: Uses steady-state features for device identification, but performs poorly on unknown devices
  • Classical Machine Learning: Performs well on ICA features but lacks deep feature extraction capability

Deep Learning Methods

  • LSTM Networks: Improve classification accuracy when combined with the FIT-PS representation, but require a validation set for optimal initialization
  • CNN Methods: Deep convolutional networks based on image segmentation techniques, but feature space expansion comes at the cost of reduced temporal resolution
  • Temporal Pooling: Extends feature dimensions for multi-label classification, but exhibits higher computational complexity

Advantages of This Work

  1. Physics-Guided Design: ICA selection grounded in Kirchhoff's law
  2. Low Complexity: Relatively simple architecture design
  3. Overfitting Resistance: Superior generalization capability
  4. Multi-device Handling: Effective processing of numerous concurrent devices

Conclusions and Discussion

Main Conclusions

  1. ICA Effectiveness: Using ICA as feature extraction method significantly improves NILM performance
  2. Importance of Physical Principles: Model design considering data physical characteristics is crucial
  3. Synthetic Data Value: Linearly separable synthetic data aids in guiding optimal architecture development
  4. Performance Superiority: Outperforms existing baseline methods on both real and synthetic data

Limitations

  1. Device Quantity Constraints: Current work focuses only on three-device classification
  2. Data Dependency: Requires abundant training samples to address all possible device combinations
  3. Real Data Challenges: Complex structure and overlap issues in real device categories require further resolution
  4. Generalization Capability: Performance on larger numbers of devices requires further verification

Future Directions

  1. Extended Device Quantities: Verify method performance across more device categories
  2. Improved Feature Extraction: Address device overlap issues in real data
  3. Real-time Applications: Optimize algorithms for real-time monitoring requirements
  4. Cross-domain Generalization: Enhance model adaptability across different power grid environments

In-Depth Evaluation

Strengths

  1. Strong Innovation: First combination of ICA with deep learning for NILM with clear physical theoretical support
  2. Comprehensive Experiments: Thorough evaluation on synthetic and real data with multiple baseline comparisons
  3. In-depth Analysis: t-SNE visualization explains performance differences
  4. Practical Value: Low-complexity design facilitates real-world deployment
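  5. Convincing Results: Achieves the highest F1 score on both datasets, with a large margin over the deep learning baselines (0.95 vs. at most 0.72 on synthetic data) but a narrow one over Random Forest (0.77 vs. 0.76 on real data)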

Weaknesses

  1. Device Scale Limitations: Validation only on 16 device categories, lacking large-scale verification
  2. Insufficient Theoretical Analysis: Lacks theoretical explanation for ICA+ResNet combination effectiveness
  3. Missing Complexity Analysis: Lacks detailed time and space complexity analysis
  4. Limited Robustness Testing: Insufficient evaluation of robustness to noise, device aging, and other practical factors

Impact

  1. Academic Contribution: Provides new research perspectives and methods for NILM field
  2. Practical Value: Simple and effective architecture design with real-world application potential
  3. Reproducibility: Python implementation code provided for easy reproduction and extension
  4. Inspirational Significance: Demonstrates importance of physics-guided model design

Applicable Scenarios

  1. Smart Homes: Household energy management and monitoring systems
  2. Industrial Monitoring: Factory equipment energy consumption analysis
  3. Power Grid Management: Distribution network load disaggregation and forecasting
  4. Energy Efficiency Applications: Device-level monitoring-based energy optimization

References

This paper cites 16 relevant references covering classical NILM work (Hart, 1992), deep learning methods, feature extraction techniques, and related datasets, providing solid theoretical foundation and comparison benchmarks.


Overall Assessment: This is an innovative work in the NILM field that proposes an effective solution by combining physical principles with deep learning. While exhibiting certain limitations in device scale and theoretical analysis, its core ideas and experimental results provide valuable contributions to the field's development.