DL — Deep Learning

Deep Learning

Deep Learning (DL) is a subfield of machine learning that uses artificial neural networks (ANNs) with many hidden layers to learn hierarchical representations of data — enabling breakthroughs in natural language processing, image recognition, speech synthesis, and protein structure prediction that were out of reach for classical ML.

Deep learning architectures — including convolutional neural networks (CNNs) for vision, recurrent neural networks (RNNs) for sequences, and the transformer for language — each extract increasingly abstract features layer by layer, and are trained end-to-end via backpropagation on GPUs, scaling effectively with more data and compute.

Deep learning is the foundation of virtually every modern AI system: transformer-based large language models (LLMs), diffusion models for image generation, and reinforcement learning systems aligned through RLHF all rely on deep neural networks optimised by gradient descent.

🔍 Click image to zoom

Deep Learning — complete visual overview

Frequently Asked Questions

How many layers makes a network "deep"?

There is no hard rule, but in practice a network with two or more hidden layers is considered deep. Modern transformer-based LLMs have anywhere from 12 to 96 or more layers. The key characteristic is that depth enables hierarchical feature learning, where lower layers capture simple patterns and higher layers capture abstract concepts.

What hardware does deep learning require?

Deep learning training requires GPUs (Graphics Processing Units) or TPUs (Tensor Processing Units) because training involves billions of repeated matrix multiplications that CPUs are too slow to execute in reasonable time. For inference (running a trained model), smaller models can run on CPUs or mobile chips, but large models like LLMs require dedicated GPU or NPU hardware. Cloud providers such as AWS, Google Cloud, and Azure offer on-demand GPU instances for deep learning workloads.

What is backpropagation in deep learning?

Backpropagation is the algorithm used to train deep neural networks. After the network makes a prediction, backpropagation computes how much each parameter contributed to the error by applying the chain rule of calculus backwards through all network layers. The computed gradients are then used by an optimiser (such as Adam or SGD) to update weights in the direction that reduces the error. This cycle of forward pass, error measurement, and backward pass repeats millions of times during training.

DL

Frequently Asked Questions

How many layers makes a network "deep"?

What hardware does deep learning require?

What is backpropagation in deep learning?

See Also