Deep Learning applies deep neural networks (DNNs) to a variety of challenging problem areas – many of which have been revolutionized by this application, especially the processing of images, text, and speech.
Neural networks (NNs) have existed for many decades, but it was only in the last two decades that breakthroughs have occurred that made deep neural networks possible. NNs consist of at least two layers of neurons which each feed their results to the next network layer. However, state-of-the-art computer vision DNNs have 20 to 100 layers with millions of connections and can take multiple days to train on clusters of dozens of high-powered computers. Each training image may take more than a billion calculations to adjust the model’s many connections, and each of the thousands of images will be presented to the model hundreds of times during training.
Text is a sequence of tokens (words, symbols, numbers) and the same DNN techniques can be used for other kinds of sequences. For example, sequences of events, sequences of sounds (music, speech), sequences of measurements over time (time series), etc. Images are two-dimensional spatial arrangements of pixels and 2-d DNN techniques can be used for other spatial tasks such as geospatial, temporal-geospatial, and can be extrapolated to other spatial tasks in 3-d and higher dimensions. So, the basic techniques of Image Deep Learning (Computer Vision) and Text Mining are applicable to most Deep Learning applications.