A Matlab toolbox for Deep Learning.
Deep Learning is a new subfield of machine learning that focuses on learning deep hierarchical models of data. It is inspired by the human brain’s apparent deep (layered, hierarchical) architecture. A good overview of the theory of Deep Learning is Learning Deep Architectures for AI.
For a more informal introduction, see the following videos by Geoffrey Hinton and Andrew Ng.
- The Next Generation of Neural Networks (Hinton, 2007)
- Recent Developments in Deep Learning (Hinton, 2010)
- Unsupervised Feature Learning and Deep Learning (Ng, 2011)
If you use this toolbox in your research please cite:
Prediction as a candidate for learning deep hierarchical models of data (Palm, 2012)
Directories included in the toolbox
NN/ – A library for Feedforward Backpropagation Neural Networks
CNN/ – A library for Convolutional Neural Networks
DBN/ – A library for Deep Belief Networks
SAE/ – A library for Stacked Auto-Encoders
CAE/ – A library for Convolutional Auto-Encoders
util/ – Utility functions used by the libraries
data/ – Data used by the examples
tests/ – Unit tests to verify the toolbox is working
For references on each library, check REFS.md.
Deep Learning Research Groups
Some labs and research groups that are actively working on deep learning:
University of Toronto – Machine Learning Group (Geoffrey Hinton, Rich Zemel, Ruslan Salakhutdinov, Brendan Frey, Radford Neal)
Université de Montréal – MILA Lab (Yoshua Bengio, Pascal Vincent, Aaron Courville, Roland Memisevic)
Google Research – Jeff Dean, Geoffrey Hinton, Samy Bengio, Ilya Sutskever, Ian Goodfellow, Oriol Vinyals, Dumitru Erhan, Quoc Le et al
Google DeepMind – Alex Graves, Karol Gregor, Koray Kavukcuoglu, Andriy Mnih, Guillaume Desjardins, Xavier Glorot, Razvan Pascanu, Volodymyr Mnih et al
Facebook AI Research (FAIR) – Yann LeCun, Rob Fergus, Jason Weston, Antoine Bordes, Soumith Chintala, Léon Bottou, Ronan Collobert, Yann Dauphin et al.
Twitter’s Deep Learning Group – Hugo Larochelle, Ryan Adams, Clement Farabet et al
Microsoft Research – Li Deng et al
UCLA – Alan Yuille
University of Washington – Pedro Domingos’ group
IDIAP Research Institute – Ronan Collobert’s group
University of California Merced – Miguel A. Carreira-Perpinan’s group
University of Helsinki – Aapo Hyvärinen’s Neuroinformatics group
Université de Sherbrooke – Hugo Larochelle’s group
University of Guelph – Graham Taylor’s group
University of Michigan – Honglak Lee’s group
Technical University of Berlin – Klaus-Robert Müller’s group
Baidu – Kai Yu’s and Andrew Ng’s group
Aalto University – Juha Karhunen and Tapani Raiko’s group
U. Amsterdam – Max Welling’s group
CMU – Chris Dyer
U. California Irvine – Pierre Baldi’s group
Ghent University – Benjamin Schrauwen’s group
University of Tennessee – Itamar Arel’s group
IBM Research – Brian Kingsbury et al
University of Bonn – Sven Behnke’s group
Gatsby Unit @ University College London – Maneesh Sahani, Peter Dayan
Computational Cognitive Neuroscience Lab @ University of Colorado Boulder
Deep Learning software
- Theano – CPU/GPU symbolic expression compiler in Python (from MILA lab at University of Montreal)
- Torch – provides a Matlab-like environment for state-of-the-art machine learning algorithms in Lua (from Ronan Collobert, Clement Farabet and Koray Kavukcuoglu)
- Pylearn2 – Pylearn2 is a library designed to make machine learning research easy.
- Blocks – A Theano framework for training neural networks
- Tensorflow – TensorFlow™ is an open source software library for numerical computation using data flow graphs.
- MXNet – MXNet is a deep learning framework designed for both efficiency and flexibility.
- Caffe – Caffe is a deep learning framework made with expression, speed, and modularity in mind.
- Lasagne – Lasagne is a lightweight library to build and train neural networks in Theano.
- Keras – A Theano-based deep learning library.
- Deep Learning Tutorials – examples of how to do Deep Learning with Theano (from LISA lab at University of Montreal)
- DeepLearnToolbox – A Matlab toolbox for Deep Learning (from Rasmus Berg Palm)
- Cuda-Convnet – A fast C++/CUDA implementation of convolutional (or more generally, feed-forward) neural networks. It can model arbitrary layer connectivity and network depth. Any directed acyclic graph of layers will do. Training is done using the back-propagation algorithm.
- Deep Belief Networks. Matlab code for learning Deep Belief Networks (from Ruslan Salakhutdinov).
- RNNLM – Tomas Mikolov’s Recurrent Neural Network based Language models Toolkit.
- RNNLIB – RNNLIB is a recurrent neural network library for sequence learning problems. Applicable to most types of spatiotemporal data, it has proven particularly effective for speech and handwriting recognition.
- matrbm. Simplified version of Ruslan Salakhutdinov’s code, by Andrej Karpathy (Matlab).
- deeplearning4j – Deeplearning4J is an Apache 2.0-licensed, open-source, distributed neural net library written in Java and Scala.
- Estimating Partition Functions of RBM’s. Matlab code for estimating partition functions of Restricted Boltzmann Machines using Annealed Importance Sampling (from Ruslan Salakhutdinov).
- Learning Deep Boltzmann Machines. Matlab code for training and fine-tuning Deep Boltzmann Machines (from Ruslan Salakhutdinov).
- The LUSH programming language and development environment, which is used at NYU for deep convolutional networks.
- Eblearn.lsh is a LUSH-based machine learning library for doing Energy-Based Learning. It includes code for “Predictive Sparse Decomposition” and other sparse auto-encoder methods for unsupervised learning. Koray Kavukcuoglu provides Eblearn code for several deep learning papers on this page.
- deepmat – Deepmat, Matlab-based deep learning algorithms.
- MShadow – MShadow is a lightweight CPU/GPU Matrix/Tensor Template Library in C++/CUDA. Its goal is to provide an efficient, device-invariant, and simple tensor library for machine learning projects that aim for both simplicity and performance. It supports CPU, GPU, multi-GPU, and distributed systems.
- CXXNET – CXXNET is a fast, concise, distributed deep learning framework based on MShadow. It is a lightweight and easily extensible C++/CUDA neural network toolkit with a friendly Python/Matlab interface for training and prediction.
- Nengo – Nengo is a graphical and scripting based software package for simulating large-scale neural systems.
- Eblearn is a C++ machine learning library with a BSD license for energy-based learning, convolutional networks, vision/recognition applications, etc. EBLearn is primarily maintained by Pierre Sermanet at NYU.
- cudamat is a GPU-based matrix library for Python. Example code for training Neural Networks and Restricted Boltzmann Machines is included.
- Gnumpy is a Python module that interfaces in a way almost identical to numpy, but does its computations on your computer’s GPU. It runs on top of cudamat.
- The CUV Library is a C++ framework with Python bindings for easy use of Nvidia CUDA functions on matrices. It contains an RBM implementation, as well as annealed importance sampling code and code to calculate the partition function exactly (from AIS lab at University of Bonn).
- 3-way factored RBM and mcRBM is Python code calling CUDAMat to train models of natural images (from Marc’Aurelio Ranzato).
- Matlab code for training conditional RBMs/DBNs and factored conditional RBMs (from Graham Taylor).
- mPoT is Python code using CUDAMat and gnumpy to train models of natural images (from Marc’Aurelio Ranzato).
- neuralnetworks is a Java-based GPU library for deep learning algorithms.
- ConvNet is a Matlab-based convolutional neural network toolbox.
- Elektronn is a deep learning toolkit that makes powerful neural networks accessible to scientists outside the machine learning community.
- OpenNN is an open source class library written in the C++ programming language which implements neural networks, a main area of deep learning research.
- NeuralDesigner is an innovative deep learning tool for predictive analytics.
Symbolic Music Datasets
- Piano-midi.de: classical piano pieces (http://www.piano-midi.de/)
- Nottingham: over 1000 folk tunes (http://abc.sourceforge.net/NMD/)
- MuseData: electronic library of classical music scores (http://musedata.stanford.edu/)
- JSB Chorales: set of four-part harmonized chorales (http://www.jsbchorales.net/index.shtml)
Natural Images
- MNIST: handwritten digits (http://yann.lecun.com/exdb/mnist/)
- NIST: similar to MNIST, but larger
- Perturbed NIST: a dataset developed in Yoshua’s class (NIST with tons of deformations)
- CIFAR10 / CIFAR100: 32×32 natural image dataset with 10/100 categories (http://www.cs.utoronto.ca/~kriz/cifar.html)
- Caltech 101: pictures of objects belonging to 101 categories (http://www.vision.caltech.edu/Image_Datasets/Caltech101/)
- Caltech 256: pictures of objects belonging to 256 categories (http://www.vision.caltech.edu/Image_Datasets/Caltech256/)
- Caltech Silhouettes: 28×28 binary images containing silhouettes of the Caltech 101 dataset
- STL-10 dataset is an image recognition dataset for developing unsupervised feature learning, deep learning, and self-taught learning algorithms. It is inspired by the CIFAR-10 dataset but with some modifications. http://www.stanford.edu/~acoates//stl10/
- The Street View House Numbers (SVHN) Dataset – http://ufldl.stanford.edu/housenumbers/
- NORB: binocular images of toy figurines under various illumination and pose (http://www.cs.nyu.edu/~ylclab/data/norb-v1.0/)
- ImageNet: image database organized according to the WordNet hierarchy (http://www.image-net.org/)
- Pascal VOC: various object recognition challenges (http://pascallin.ecs.soton.ac.uk/challenges/VOC/)
- Labelme: A large dataset of annotated images, http://labelme.csail.mit.edu/Release3.0/browserTools/php/dataset.php
- COIL 20: different objects imaged at every angle in a 360° rotation (http://www.cs.columbia.edu/CAVE/software/softlib/coil-20.php)
- COIL100: different objects imaged at every angle in a 360° rotation (http://www1.cs.columbia.edu/CAVE/software/softlib/coil-100.php)
Artificial Datasets
- Arcade Universe – An artificial dataset generator with images containing arcade game sprites such as Tetris pentomino/tetromino objects. This generator is based on O. Breleux’s bugland dataset generator.
- A collection of datasets inspired by the ideas from BabyAISchool:
- Datasets generated for the purpose of an empirical evaluation of deep architectures (DeepVsShallowComparisonICML2007):
Faces
- Labelled Faces in the Wild: 13,000 images of faces collected from the web, labelled with the name of the person pictured (http://vis-www.cs.umass.edu/lfw/)
- Toronto Face Dataset
- Olivetti: a few images of several different people (http://www.cs.nyu.edu/~roweis/data.html)
- Multi-Pie: The CMU Multi-PIE Face Database (http://www.multipie.org/)
- Face-in-Action (http://www.flintbox.com/public/project/5486/)
- JACFEE: Japanese and Caucasian Facial Expressions of Emotion (http://www.humintell.com/jacfee/)
- FERET: The Facial Recognition Technology Database (http://www.itl.nist.gov/iad/humanid/feret/feret_master.html)
- mmifacedb: MMI Facial Expression Database (http://www.mmifacedb.com/)
- IndianFaceDatabase: http://vis-www.cs.umass.edu/~vidit/IndianFaceDatabase/
- The Yale Face Database (http://vision.ucsd.edu/content/yale-face-database) and The Yale Face Database B (http://vision.ucsd.edu/~leekc/ExtYaleDatabase/ExtYaleB.html)
Text
- 20 newsgroups: classification task, mapping word occurrences to newsgroup ID (http://qwone.com/~jason/20Newsgroups/)
- Reuters (RCV*) Corpuses: text/topic prediction (http://about.reuters.com/researchandstandards/corpus/)
- Penn Treebank: used for next word prediction or next character prediction (http://www.cis.upenn.edu/~treebank/)
- Broadcast News: large text dataset, classically used for next word prediction (http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC97S44)
- Wikipedia Dataset
- Multidomain sentiment analysis dataset: http://www.cs.jhu.edu/~mdredze/datasets/sentiment/
Speech
- TIMIT Speech Corpus: phoneme classification (http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC93S1)
- Aurora: TIMIT with noise and additional information
Recommendation Systems
- MovieLens: Two datasets available from http://www.grouplens.org. The first dataset has 100,000 ratings for 1682 movies by 943 users, subdivided into five disjoint subsets. The second dataset has about 1 million ratings for 3900 movies by 6040 users.
- Jester: This dataset contains 4.1 million continuous ratings (-10.00 to +10.00) of 100 jokes from 73,421 users.
- Netflix Prize: Netflix released an anonymised version of their movie rating dataset; it consists of 100 million ratings, done by 480,000 users who have rated between 1 and all of the 17,770 movies.
- Book-Crossing dataset: This dataset is from the Book-Crossing community, and contains 278,858 users providing 1,149,780 ratings about 271,379 books.
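The user, item, and rating counts quoted for the rating datasets above imply how sparse each user-by-item matrix is. The following is a minimal Python sketch of that arithmetic, not part of the toolbox. The dictionary keys and the `density` helper are illustrative names chosen here, the "about 1 million" MovieLens figure is rounded to 1,000,000, and each (user, item) pair is assumed to carry at most one rating.

```python
# Rating-matrix density = ratings / (users * items).
# Counts are copied from the dataset descriptions in the list above;
# the labels and the helper name are illustrative, not official names.
DATASETS = {
    "MovieLens (first)": (100_000, 943, 1_682),
    "MovieLens (second)": (1_000_000, 6_040, 3_900),  # "about 1 million" ratings
    "Jester": (4_100_000, 73_421, 100),
    "Netflix Prize": (100_000_000, 480_000, 17_770),
    "Book-Crossing": (1_149_780, 278_858, 271_379),
}


def density(ratings, users, items):
    """Fraction of the user-by-item matrix that holds a rating."""
    return ratings / (users * items)


if __name__ == "__main__":
    for name, (ratings, users, items) in DATASETS.items():
        print(f"{name:20s} {density(ratings, users, items):.4%}")
```

Running the script prints one density per dataset. Jester comes out far denser than the others because it has only 100 items, while Book-Crossing is the sparsest of the five.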