MiniSom
Self Organizing Maps
MiniSom is a minimalistic, NumPy-based implementation of Self Organizing Maps (SOM). A SOM is a type of Artificial Neural Network that converts complex, nonlinear statistical relationships between high-dimensional data items into simple geometric relationships on a low-dimensional display. MiniSom is designed to let researchers easily build on top of it and to let students quickly grasp its details.
Updates about MiniSom are posted on Twitter.
Installation
Just use pip:
pip install minisom
or download MiniSom to a directory of your choice and use the setup script:
git clone https://github.com/JustGlowing/minisom.git
cd minisom
python setup.py install
How to use it
To use MiniSom, organize your data as a NumPy matrix where each row corresponds to an observation, or as a list of lists like the following:
data = [[0.80, 0.55, 0.22, 0.03],
        [0.82, 0.50, 0.23, 0.03],
        [0.80, 0.54, 0.22, 0.03],
        [0.80, 0.53, 0.26, 0.03],
        [0.79, 0.56, 0.22, 0.03],
        [0.75, 0.60, 0.25, 0.03],
        [0.77, 0.59, 0.22, 0.03]]
Then you can train MiniSom just as follows:
from minisom import MiniSom
som = MiniSom(6, 6, 4, sigma=0.3, learning_rate=0.5) # initialization of 6x6 SOM
som.train(data, 100) # trains the SOM with 100 iterations
You can obtain the position of the winning neuron on the map for a given sample as follows:
som.winner(data[0])
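Conceptually, the winning neuron is the one whose weight vector lies closest to the sample. The following is a minimal pure-Python sketch of that search, assuming Euclidean distance. The names `find_winner` and `weights` are hypothetical and made up for illustration; they are not part of MiniSom, whose own NumPy implementation may differ.

```python
import math  # math.dist requires Python 3.8 or newer


def find_winner(weights, sample):
    """Return the (row, col) of the neuron closest to `sample`.

    `weights` is a grid (list of rows) of weight vectors.
    Hypothetical helper for illustration, not MiniSom's API.
    """
    best_pos, best_dist = None, math.inf
    for i, row in enumerate(weights):
        for j, w in enumerate(row):
            dist = math.dist(w, sample)  # Euclidean distance
            if dist < best_dist:
                best_pos, best_dist = (i, j), dist
    return best_pos


# Toy 2x2 map with 4-dimensional weights, chosen by hand
weights = [
    [[0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]],
    [[0.8, 0.5, 0.2, 0.0], [0.1, 0.9, 0.4, 0.7]],
]
sample = [0.80, 0.55, 0.22, 0.03]
print(find_winner(weights, sample))  # (1, 0)
```

The returned tuple plays the same role as the map coordinates returned by som.winner: it identifies a cell of the grid, not a class label.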
For an overview of all the features implemented in MiniSom, browse the following examples: https://github.com/JustGlowing/minisom/tree/master/examples
Export a SOM and load it again
A model can be saved using pickle as follows:
import pickle
som = MiniSom(7, 7, 4)
# ...train the som here
# saving the som in the file som.p
with open('som.p', 'wb') as outfile:
    pickle.dump(som, outfile)
and can be loaded as follows:
with open('som.p', 'rb') as infile:
    som = pickle.load(infile)
Note that if a lambda function is used to define the decay factor, MiniSom will no longer be picklable.
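The limitation comes from pickle itself: functions are stored by reference to their module-level name, and a lambda has no such name. The sketch below illustrates this with a plain dictionary standing in for the model state. `asymptotic_decay`, `can_pickle` and the three-argument decay signature are assumptions made for illustration; check the decay function signature expected by your MiniSom version.

```python
import pickle


def asymptotic_decay(learning_rate, t, max_iter):
    """Hypothetical named decay function: shrinks the rate as t grows."""
    return learning_rate / (1 + t / (max_iter / 2))


def can_pickle(obj):
    """Return True if `obj` can be serialized with pickle."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# Stand-ins for the state of a model that stores its decay function
state_with_lambda = {"decay_function": lambda lr, t, max_iter: lr / (1 + t)}
state_with_named = {"decay_function": asymptotic_decay}

print(can_pickle(state_with_lambda))  # False: a lambda cannot be looked up by name
print(can_pickle(state_with_named))   # a module-level function is stored by reference
```

Defining the decay as a named, module-level function keeps the object picklable, provided the same function can be imported when the file is loaded again.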
Explore parameters
You can use this dashboard to explore the effect of the parameters on a sample dataset: https://share.streamlit.io/justglowing/minisom/dashboard/dashboard.py
Examples
Here are some of the charts you'll see how to generate in the examples:
Seeds map | Class assignment |
Handwritten digits mapping | Hexagonal Topology |
Color quantization | Outliers detection |
Other tutorials
- Self Organizing Maps on the Glowing Python by me ;-)
- Lecture notes from the Machine Learning course at the University of Lisbon
- Introduction to Self-Organizing by Derrick Mwiti
- Self Organizing Maps on gapminder data [in German]
- Discovering SOM, an Unsupervised Neural Network by Gisely Alves
- Video tutorials made by the GeoEngineerings School: Part 1; Part 2; Part 3; Part 4
- Video tutorial Self Organizing Maps: Introduction by SuperDataScience
- MATLAB Implementations and Applications of the Self-Organizing Map by Teuvo Kohonen (Inventor of SOM)
How to cite MiniSom
@misc{vettigliminisom,
title={MiniSom: minimalistic and NumPy-based implementation of the Self Organizing Map},
author={Giuseppe Vettigli},
year={2018},
url={https://github.com/JustGlowing/minisom/},
}
Who uses MiniSom?
- Gorgoglione, Angela, Alberto Castro, Vito Iacobellis, and Andrea Gioia. A Comparison of Linear and Non-linear Machine Learning Techniques (PCA and SOM) for Characterizing Urban Nutrient Runoff. Sustainability 13, no. 4. 2021.
- Mazin A, Hawkins SH, Stringfield O, Dhillon J, Manley BJ, Jeong DK, Raghunand N. Identification of sarcomatoid differentiation in renal cell carcinoma by machine learning on multiparametric MRI. Nature, Scientific Reports. 2021.
- Qi J, Ma G, Navarro-Alarcon D, Zhang H, Lyu Y. Towards Latent Space Based Manipulation of Elastic Rods using Autoencoder Models and Robust Centerline Extractions. arXiv:2101.07513. 2021.
- Julianna C. Oliveira, Eduardo Zorita, Vimal Koul, Thomas Ludwig, Johanna Baehr. Forecast opportunities for European summer climate ensemble predictions using Self-Organising Maps. CI2020: Proceedings of the 10th International Conference on Climate Informatics. 2020.
- Gorgoglione, A., Castro, A., Gioia, A., & Iacobellis, V. Application of the Self-organizing Map (SOM) to Characterize Nutrient Urban Runoff. International Conference on Computational Science and Its Applications. Springer, Cham, 2020.
- Bonelli Toro, A. G., and M. P. Gómez. Machine Learning Applied to Acoustic Emission for Tool Wear Classification during Milling of Composite Materials. An International Forum For The AE Science and Technology. Vol. 37. 2020.
- Mancini, R., Ritacco, A., Lanciano, G., & Cucinotta, T. XPySom: High-Performance Self-Organizing Maps. Proceedings of IEEE 32nd International Symposium on Computer Architecture and High Performance Computing. 2020.
- Chen, Yang, Nami Ashizawa, Seanglidet Yean, Chai Kiat Yeo, and Naoto Yanai. Self-Organizing Map assisted Deep Autoencoding Gaussian Mixture Model for Intrusion Detection. arXiv preprint arXiv:2008.12686 2020.
- Athanasakis E. Data-Analysis in environmental and traffic data for Thessaloniki Greece. Master Thesis at Aristotle University of Thessaloniki. 2020.
- Schillaci G, Ciria A, Lara B. Tracking Emotions: Intrinsic Motivation Grounded on Multi-Level Prediction Error Dynamics. IEEE ICDL-Epirob 2020. 2020.
- Massaro, Alessandro, Giuseppe Mastandrea, Luigi D'Oriano, Giuseppe Rocco Rana, Nicola Savino, and Angelo Galiano. Systems for an intelligent application of Automated Processes in industry: a case study from “PMI IoT Industry 4.0” project. 2020 IEEE International Workshop on Metrology for Industry 4.0 & IoT. 2020.
- Ko, Ili, Desmond Chambers, and Enda Barrett. Feature dynamic deep learning approach for DDoS mitigation within the ISP domain. International Journal of Information Security. 2020.
- Wesam Salah Alaloul and Abdul Hannan Qureshi. Data Processing Using Artificial Neural Networks. In Dynamic Data Assimilation-Beating the Uncertainties. IntechOpen, 2020.
- Gennadi Lessin, Luca Polimene, Yuri Artioli, Momme Butenschön, Darren R. Clark, Ian Brown, Andrew P. Rees. Modeling the seasonality and controls of nitrous oxide emissions on the northwest European continental shelf. Journal of Geophysical Research: Biogeosciences. 2020.
- Jorge Amaya, Romain Dupuis, Maria Elena Innocenti, and Giovanni Lapenta. Visualizing and Interpreting Unsupervised Solar Wind Classifications. Frontiers in Astronomy and Space Sciences Space Physics. 2020.
- Sandipan Dey. Python Image Processing Cookbook: Over 60 recipes to help you perform complex image processing and computer vision tasks with ease. Packt Publishing Ltd, April 2020.
- Odestål, Oscar and Palmqvist Sjövall, Anna. Adaptive Reference Images for Blood Cells using Variational Autoencoders and Self-Organizing Maps. Master Thesis, Lund University. 2020.
- Hadleigh D. Thompson, Stephen J. Déry, Peter L. Jackson, Bernard E. Laval. A synoptic climatology of potential seiche‐inducing winds in a large intermontane lake: Quesnel Lake, British Columbia, Canada. International Journal of Climatology. 2020.
- Benyamin Motevalli, Baichuan Sun, Amanda S. Barnard. Understanding and Predicting the Cause of Defects in Graphene Oxide Nanostructures Using Machine Learning. The Journal of Physical Chemistry C. 2020.
- Daniel L. Donaldson, Dilan Jayaweera. Effective solar prosumer identification using net smart meter data. International Journal of Electrical Power & Energy Systems Volume 118. 2020.
- Pauli Tikka, Moritz Mercker, Ilya Skovorodkin, Ulla Saarela, Seppo Vainio, Veli-Pekka Ronkainen, James P. Sluka, James A. Glazier, Anna Marciniak-Czochra, Franz Schaefer. Computational Modelling of Nephron Progenitor Cell Movement and Aggregation during Kidney Organogenesis. Pre-print on biorxiv.org. 2020.
- Felix M. Riese, Sina Keller, Stefan Hinz. Supervised and Semi-Supervised Self-Organizing Maps for Regression and Classification Focusing on Hyperspectral Data. Remote Sensing, special Issue Advanced Machine Learning Approaches for Hyperspectral Data Analysis. 2020.
- Giobergia, Flavio, and Elena Baralis. Fast Self-Organizing Maps Training. 2019 IEEE International Conference on Big Data (Big Data). IEEE, 2019.
- Silva, Roberto F., Gustavo M. Mostaço, Fernando Xavier, Antonio Mauro Saraiva, and Carlos E. Cugnasca. COMPARISON OF THE K-MEANS AND SELF-ORGANIZING MAPS TECHNIQUES TO LABEL AGRICULTURAL SUPPLY CHAIN DATA. Digitizing Agriculture, 12th EFITA International Conference. 2019.
- Üstünkök, Tolga, Ozan Can Acar, and Murat Karakaya. Image Tag Refinement with Self Organizing Maps. 2019 1st International Informatics and Software Engineering Conference (UBMYK). IEEE, 2019.
- Rohana, N. A., Yusof, N., Uti, M. N., and Din, A. H. M. Exploring spatio-temporal wave pattern using unsupervised technique. Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci., XLII-4/W16, 543–548. 2019.
- Nguyen, Thanh Hai. Metagenome-Based Disease Classification with Deep Learning and Visualizations Based on Self-organizing Maps. International Conference on Future Data and Security Engineering. Springer, Cham, 2019.
- Ujjawal Kamal Panchal, Sanjay Verma. Identification of Potential Future Credit Card Defaulters from Non Defaulters using Self Organizing Maps. 10th International Conference on Computing, Communication and Networking Technologies (ICCCNT). 2019.
- Mengxia Luo, Can Yang, Xiaorui Gong, Lei Yu. FuncNet: A Euclidean Embedding Approach for Lightweight Cross-platform Binary Recognition. Security and Privacy in Communication Networks, 15th EAI International Conference, SecureComm 2019.
- Melvin Gelbard. A Data Mining Approach to the Study of Dynamic Changes in Brain White Matter. Master Thesis, University of Edinburgh. 2019.
- E Mele, C Elias, A Ktena. Machine Learning Platform for Profiling and Forecasting at Microgrid Level. Electrical, Control and Communication Engineering. 2019.
- György Kovács. Smote-variants: A python implementation of 85 minority oversampling techniques. Neurocomputing, 2019 - Elsevier.
- Catalin Stoean, Ruxandra Stoean, Roberto Antonio Becerra-García, Rodolfo García-Bermúdez, Miguel Atencia, Francisco García-Lagos, Luis Velázquez-Pérez, Gonzalo Joya. Unsupervised Learning as a Complement to Convolutional Neural Network Classification in the Analysis of Saccadic Eye Movement in Spino-Cerebellar Ataxia Type 2. IWANN 2019: Advances in Computational Intelligence pp 26-37, 2019.
- I Ko, D Chambers, E Barrett. Feature dynamic deep learning approach for DDoS mitigation within the ISP domain. International Journal of Information Security, 2019.
- Leonardo Barreto, Edjard Mota. Self-organized inductive reasoning with NeMuS. June 2019.
- Casavantes, Marco, Roberto López, Luis Carlos González-Gurrola, and Manuel Montes-y-Gómez. UACh-INAOE at HASOC 2019: Detecting Aggressive Tweets by Incorporating Authors' Traits as Descriptors. In FIRE (Working Notes). 2019.
- Marco Casavantes, Roberto Lopez, and Luis Carlos Gonzalez. UACh at MEX-A3T 2019: Preliminary Results on Detecting Aggressive Tweets by Adding Author Information Via an Unsupervised Strategy. Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2019), 2019.
- G Roma, O Green, PA Tremblay. Adaptive Mapping of Sound Collections for Data-driven Musical Interfaces. 19th edition of NIME, 2019.
- H. D. Thompson. Wind climatology of Quesnel Lake, British Columbia. Master Thesis, University of Northern British Columbia, Prince George, BC. 2019.
- Florent Forest, Mustapha Lebbah, Hanene Azzag and Jérôme Lacaille. Deep Architectures for Joint Clustering and Visualization with Self-Organizing Maps. LDRC 2019 (Learning Data Representation for Clustering workshop), Macau, China. 2019.
- Stephanie Kas. Multiparameter Analysis of the Belle II Pixeldetector’s Data. Bachelor Thesis, University of Giessen, July 2019.
- Katharina Dort. Search for Highly Ionizing Particles with the Pixel Detector in the Belle II Experiment. Master Thesis, University of Giessen, May 2019.
- Rahul Kumar. Machine Learning Quick Reference: Quick and essential machine learning hacks for training smart data models. Packt Publishing Ltd, 31 Jan 2019.
- Michaela Vystrčilova. Similarity methods for music recommender systems. Bachelor Thesis in Computer Science, Charles University, 2019.
- Felix M. Riese, Sina Keller. SUSI: Supervised Self-Organizing Maps for Regression and Classification in Python.
- Dogo, E. M., et al. Sensed Outlier Detection for Water Monitoring Data and a Comparative Analysis of Quantization Error Using Kohonen Self-Organizing Maps. 2018 International Conference on Computational Techniques, Electronics and Mechanical Systems (CTEMS). IEEE, 2018.
- Y. Xie, L. Le, Y. Zhou, V. V. Raghavan. Deep Learning for Natural Language Processing. Chapter of Computational Analysis and Understanding Natural Languages, Elsevier, 2018.
- Enea Mele, Charalambos Elias, Aphrodite Ktena. Electricity use profiling and forecasting at microgrid level. IEEE 59th International Scientific Conference on Power and Electrical Engineering of Riga Technical University (RTUCON), 2018.
- Chintan Shah, Anjali Jivani. A Hybrid Approach of Text Summarization Using Latent Semantic Analysis and Deep Learning. 2018 International Conference on Advances in Computing, Communications and Informatics (ICACCI), 2018.
- Katsutoshi Masai. Facial Expression Classification Using Photo-reflective Sensors on Smart Eyewear. Keio University, Doctoral Thesis, 2018.
- Katsutoshi Masai, Kai Kunze, Yuta Sugiura, Maki Sugimoto. Mapping Natural Facial Expressions Using Unsupervised Learning and Optical Sensors on Smart Eyewear. Proceedings of the 2018 ACM International Joint Conference and 2018 International Symposium on Pervasive and Ubiquitous Computing and Wearable Computers, 2018 ACM.
- Ili Ko, Desmond Chambers, Enda Barrett. A Lightweight DDoS Attack Mitigation System within the ISP Domain Utilising Self-organizing Map. Proceedings of the Future Technologies, 2018 Springer.
- T. M. Nam et al. Self-organizing map-based approaches in DDoS flooding detection using SDN. 2018 International Conference on Information Networking (ICOIN), 2018.
- Li Yuan. Implementation of Self-Organizing Maps with Python. Master Thesis, University of Rhode Island, 2018.
- Ying Xie, Linh Le, Yiyun Zhou, Vijay V. Raghavan. Deep Learning for Natural Language Processing. Elsevier Handbook of Statistics, 2018.
- André de Vasconcelos Santos Silva. Sparse Distributed Representations as Word Embeddings for Language Understanding. Master Thesis, University Institute of Lisbon, 2018.
- Vincent Fortuin, Matthias Hüser, Francesco Locatello, Heiko Strathmann, and Gunnar Rätsch. Deep Self-Organization: Interpretable Discrete Representation Learning on Time Series. 2018.
- John Mwangi Wandeto. Self-Organizing Map Quantization Error Approach for Detecting Temporal Variations in Image Sets. Doctoral Thesis, University of Strasbourg, 2018.
- Birgitta Dresp-Langley, John Mwangi Wandeto, Henry Okola Nyongesa. Using the quantization error from Self‐Organizing Map (SOM) output for fast detection of critical variations in image time series. ISTE OpenScience, 2018.
- John M. Wandeto, Henry O. Nyongesa, Birgitta Dresp-Langley. Detection of Structural Change in Geographic Regions of Interest by Self Organized Mapping: Las Vegas City and Lake Mead across the Years. 2018.
- Denis Mayr Lima Martins, Gottfried Vossen, Fernando Buarque de Lima Neto. Learning database queries via intelligent semiotic machines. IEEE Latin American Conference on Computational Intelligence (LA-CCI), 2017.
- Udemy online course. Deep Learning A-Z™: Hands-On Artificial Neural Networks
- Fredrik Broch Elgaaen, Nicholas Mowatt Larssen. Data mining i banksektoren - Prediksjonsmodellering og analyse av kunder som sier opp boliglån. University of Oslo, May 2017.
- Óscar Clavería González, Enric Monte Moreno, Salvador Torra Porras. A self-organizing map analysis of survey-based agents' expectations before impending shocks for model selection: The case of the 2008 financial crisis. International Economics Volume 146, Pages 40–58. August 2016.
- Sameen Mansha, Faisal Kamiran, Asim Karim, Aizaz Anwar. A Self-Organizing Map for Identifying Influential Communities in Speech-based Networks. Proceeding CIKM '16 Proceedings of the 25th ACM International on Conference on Information and Knowledge Management Pages 1965-1968. 2016.
- Sameen Mansha, Zaheer Babar, Faisal Kamiran, Asim Karim. Neural Network Based Association Rule Mining from Uncertain Data. Neural Information Processing Volume 9950 of the series Lecture Notes in Computer Science pp 129-136. 2016.
- Makiyama, Vitor Hirota, M. Jordan Raddick, and Rafael DC Santos. Text Mining Applied to SQL Queries: A Case Study for the SDSS SkyServer. 2nd Annual International Symposium on Information Management and Big Data. 2015.
- Remi Domingues. Machine Learning for Unsupervised Fraud Detection. Royal Institute of Technology School of Computer Science and Communication KTH CSC. 2015.
- Ivana Kajić, Guido Schillaci, Saša Bodiroža, Verena V. Hafner. Learning hand-eye coordination for a humanoid robot using SOMs. Proceedings of the 2014 ACM/IEEE international conference on Human-robot interaction Pages 192-193.
Guidelines to contribute
- In the description of your Pull Request, explain clearly what it implements or fixes and what you changed. If possible, include an example in the description. If the PR is about a code speedup, provide a reproducible example and quantify the speedup.
- Give your pull request a helpful title that summarises what your contribution does.
- Write unit tests for your code and make sure the existing tests are up to date. pytest can be used for this:
pytest minisom.py
- Make sure that there are no stylistic issues using pycodestyle:
pycodestyle minisom.py
- Make sure your code is properly commented and documented. Each public method needs to be documented in the same way as the existing ones.
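As a rough illustration of the last two guidelines, here is a sketch of a documented function together with a pytest-style test. The function `manhattan_distance` is hypothetical and not part of MiniSom, and the docstring layout shown is an assumption; follow whatever layout the existing public methods use.

```python
def manhattan_distance(x, y):
    """Return the Manhattan (L1) distance between two vectors.

    Parameters
    ----------
    x : sequence of float
        First vector.
    y : sequence of float
        Second vector, same length as x.

    Returns
    -------
    float
        Sum of the absolute coordinate differences.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return sum(abs(a - b) for a, b in zip(x, y))


def test_manhattan_distance():
    # pytest collects functions whose names start with "test_"
    assert manhattan_distance([0, 0], [1, 2]) == 3
    assert manhattan_distance([1.5], [1.5]) == 0
```

Keeping the test next to the code it exercises lets a single pytest run on the file cover it.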