torchsom: The Reference PyTorch Library for Self-Organizing Maps
Berthier, Shokry, Moreaud et al.
This paper introduces torchsom, an open-source Python library that provides a reference implementation of the Self-Organizing Map (SOM) in PyTorch. This package offers three main features: (i) dimensionality reduction, (ii) clustering, and (iii) user-friendly data visualization. It relies on a PyTorch backend, enabling (i) fast and efficient training of SOMs through GPU acceleration, and (ii) easy and scalable integration with the PyTorch ecosystem. Moreover, torchsom follows the scikit-learn API for ease of use and extensibility. The library is released under the Apache 2.0 license with 90% test coverage, and its source code and documentation are available at https://github.com/michelin/TorchSOM.
This paper introduces torchsom, an open-source Python library based on PyTorch that provides a reference implementation of Self-Organizing Maps (SOMs). The library offers three primary functionalities: (1) dimensionality reduction, (2) clustering, and (3) user-friendly data visualization. Through its PyTorch backend, the library enables (1) fast and efficient SOM training with GPU acceleration, and (2) seamless, extensible integration with the PyTorch ecosystem. Furthermore, torchsom follows the scikit-learn API design paradigm for ease of use and extensibility. The library is released under the Apache 2.0 license with 90% test coverage.
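To make the idea of a PyTorch-backed SOM with a scikit-learn-style interface concrete, here is a minimal sketch of a batch-trained rectangular SOM. It is not torchsom's code or API: the class name MinimalSOM, its parameters, the linear decay schedule, and the batch update rule are all assumptions chosen for brevity. It only illustrates how best-matching-unit search and the neighborhood update reduce to tensor operations that run unchanged on a GPU.

```python
import torch


class MinimalSOM:
    """Toy rectangular SOM with a fit/predict interface (hypothetical, not torchsom's API)."""

    def __init__(self, rows=5, cols=5, n_features=3, lr=0.5, sigma=1.5,
                 epochs=20, device=None, seed=0):
        self.device = device or ("cuda" if torch.cuda.is_available() else "cpu")
        gen = torch.Generator().manual_seed(seed)
        # One prototype vector per grid node, shape (rows * cols, n_features).
        self.weights = torch.rand(rows * cols, n_features, generator=gen).to(self.device)
        # Grid coordinates of each node, used by the neighborhood function.
        ys, xs = torch.meshgrid(torch.arange(rows), torch.arange(cols), indexing="ij")
        self.coords = torch.stack([ys.flatten(), xs.flatten()], dim=1).float().to(self.device)
        self.lr, self.sigma, self.epochs = lr, sigma, epochs

    def _as_tensor(self, X):
        return torch.as_tensor(X, dtype=torch.float32, device=self.device)

    def predict(self, X):
        """Index of the best-matching unit (BMU) for each sample."""
        return torch.cdist(self._as_tensor(X), self.weights).argmin(dim=1)

    def quantization_error(self, X):
        """Mean distance between each sample and its BMU prototype."""
        return torch.cdist(self._as_tensor(X), self.weights).min(dim=1).values.mean().item()

    def fit(self, X):
        X = self._as_tensor(X)
        for epoch in range(self.epochs):
            decay = 1.0 - epoch / self.epochs          # assumed linear decay schedule
            lr, sigma = self.lr * decay, max(self.sigma * decay, 0.3)
            bmu = self.predict(X)                       # (n_samples,)
            # Squared grid distance from each sample's BMU to every node.
            grid_d2 = torch.cdist(self.coords[bmu], self.coords).pow(2)
            # Gaussian neighborhood, normalized over samples for each node;
            # softmax gives exp(-d2 / (2 sigma^2)) / sum in a numerically stable way.
            h = torch.softmax(-grid_d2 / (2 * sigma**2), dim=0)
            target = h.T @ X                            # neighborhood-weighted sample means
            self.weights += lr * (target - self.weights)
        return self


# Usage on synthetic data (values are illustrative only).
X = 3.0 + torch.rand(200, 3, generator=torch.Generator().manual_seed(1))
som = MinimalSOM()
qe_before = som.quantization_error(X)
som.fit(X)
qe_after = som.quantization_error(X)
labels = som.predict(X)
```

Returning self from fit mirrors the scikit-learn convention, so calls can be chained. Moving to a GPU only changes the device argument, since every step is a dense tensor operation.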
Although Self-Organizing Maps (SOMs) remain a valuable technique in modern data analysis, existing Python SOM implementations have notable shortcomings:
Outdated Technical Architecture: Lack of GPU acceleration support
Insufficient Ecosystem Integration: Difficulty integrating with modern deep learning frameworks
Poor User Experience: Absence of user-friendly APIs and visualization capabilities
Maintenance Issues: Existing libraries are poorly maintained with incomplete documentation
torchsom addresses these shortcomings with the following contributions:
First Comprehensive PyTorch-based SOM Library: Provides a complete SOM implementation that supports GPU acceleration and integrates with modern deep learning workflows
Standardized API Design: Follows scikit-learn API style for consistent user experience
Rich Visualization Tools: Provides 9 categories of visualization functionality supporting rectangular and hexagonal topologies
Built-in Clustering Functionality: Integrates K-means, GMM, and HDBSCAN clustering algorithms
High-Quality Software Engineering: 90% test coverage, complete documentation, modular design
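One common way to combine a SOM with a clustering algorithm is to cluster the map's prototype vectors and then view the labels on the grid. The sketch below assumes that workflow and uses scikit-learn's KMeans and GaussianMixture directly. It does not use torchsom's clustering interface, and the prototype array is synthetic stand-in data, not the output of a trained map.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Stand-in for trained SOM prototypes on a 6x6 grid with 2 features:
# nodes in the left half sit near (0, 0), nodes in the right half near (5, 5).
rows, cols = 6, 6
centers = np.where(np.arange(cols)[None, :, None] < cols // 2, 0.0, 5.0)
prototypes = centers + 0.1 * rng.standard_normal((rows, cols, 2))

# Cluster the flattened prototypes, one row per grid node.
flat = prototypes.reshape(rows * cols, 2)
kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(flat)
gmm_labels = GaussianMixture(n_components=2, random_state=0).fit_predict(flat)

# Reshape labels back onto the grid so each node carries a cluster id.
kmeans_map = kmeans_labels.reshape(rows, cols)
```

Each sample can then inherit the cluster label of its best-matching unit, which turns the trained map into a two-stage clustering model.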
Overall Assessment: This is a high-quality software engineering paper that improves the usability and performance of SOMs through a modernized implementation. While its algorithmic innovation is limited, its engineering value and practical significance are noteworthy, and it offers a good example of bringing a traditional machine learning algorithm into a modern computing environment.