What are Artificial Neural Networks?

How do artificial neural networks work?

They are the basis of all the successful AI applications of the last decade: artificial neural networks. But how do they actually work?

Artificial neural networks are arguably the most influential technology of the last decade. They are the fundamental building block of deep learning, which is at the center of the current AI boom. Neural networks unlock smartphones via facial recognition, translate texts, detect diseases such as cancer in medical images, and generate deepfakes.

The mathematical foundations of the artificial neuron emerged as early as the 1940s. The algorithm for the first neural network, called the perceptron, was written in 1958. But it was only with the wide availability of high computing power and large amounts of data over the last ten years that the triumph of neural networks began.

Neural networks: math, layers, architectures

While artificial neural networks are loosely inspired by their biological counterparts, they have little to do with the brain's electrochemical computing: instead of neurotransmitters, an artificial neural network processes numbers. Neural networks are mathematical constructs that can approximate almost any mathematical function and can thus solve complex mathematical problems.

An artificial neural network usually consists of several layers: an input layer through which external data such as images, audio, or text are fed into the network; one or more intermediate layers (hidden layers) that process the data; and an output layer that outputs the result of the function computed by the network.

Each layer in turn consists of artificial neurons that are connected to one another via so-called weights. Neurons and weights carry numerical values that change over the course of training.
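To make this structure concrete, here is a minimal sketch in Python with NumPy (not from the article; the layer sizes, the uniform weight initialization, and the sigmoid activation are illustrative assumptions):

```python
import numpy as np

def sigmoid(x):
    # Squashes any real number into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

# Illustrative architecture: 4 input neurons, 3 hidden neurons, 1 output neuron.
# The weights are matrices that connect one layer to the next.
W_hidden = rng.uniform(-1, 1, size=(4, 3))  # input -> hidden
W_output = rng.uniform(-1, 1, size=(3, 1))  # hidden -> output

def forward(x):
    # Each layer multiplies its inputs by the weights, sums them up,
    # and applies the activation function (explained below).
    hidden = sigmoid(x @ W_hidden)
    return sigmoid(hidden @ W_output)

x = np.array([0.5, -1.0, 0.25, 0.9])  # external data fed into the input layer
print(forward(x))                      # the network's output
```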

In an image recognition AI, each layer typically detects its own kind of pattern based on the outputs of the upstream neurons: first edges, then textures, later parts of objects, and finally entire objects. | Image: Deepmind

The number of artificial neurons, layers, and connections between the neurons does not usually change during the training of a neural network, whereas the brain keeps changing in response to external influences well into old age. The entire package of neurons, layers, and their connections is also known as the architecture of the network.

There are a variety of architectures within AI research and practice. Fjodor van Veen from the Asimov Institute has put together a practical overview.

Neurons, weights and activation function

The connections between the neurons are called weights. Each weight is a numerical value that determines how strongly the output of one neuron affects the input of the next. If the value is zero, the upstream neuron has no influence.

The most important elements of an artificial neural network are the artificial neurons. Each neuron after the input layer receives inputs from other neurons in the network, multiplies these inputs by the values of the weights, sums up all the resulting values, and then passes the sum to a so-called activation function.

The activation function determines the output of the neuron, which serves as input for the downstream neurons in the next layer or, in the last layer, forms the result of the network.
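As a sketch, the computation of a single neuron fits in a few lines of Python; the sigmoid used here is just one common choice of activation function, not the only one:

```python
import math

def neuron(inputs, weights):
    # Multiply each input by its weight and sum everything up ...
    total = sum(i * w for i, w in zip(inputs, weights))
    # ... then pass the sum through an activation function (here: sigmoid).
    return 1.0 / (1.0 + math.exp(-total))

# Three upstream neurons feed this neuron via three weights.
print(neuron([0.8, 0.2, -0.5], [0.4, -0.7, 0.9]))
```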

Schematic representation of an artificial neuron: the inputs are multiplied by the weights, summed up by the transfer function, and passed on to the activation function. | Image: Perhelion, NeuronModel German, CC BY-SA 3.0

A special neuron is the so-called bias neuron. It receives no input from within the neural network and always carries the value 1. As a neutral entity, in combination with its weight, it can shift the results in the network in a specific direction. Usually there is one bias neuron per layer, connected to all neurons in that layer.

Since the value of the bias neuron does not change, it can influence the activation function independently of the rest of the network, for example shifting it to the left or right on a graph (a translation). The weight between the bias neuron and the associated layer is learned during training.
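A small illustration of this shifting effect, reusing the sigmoid from above (the concrete bias weights are made up for demonstration): the bias neuron's constant output of 1, multiplied by its learned weight, is simply added to the weighted sum, which moves the point where the activation function "switches on".

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# The bias neuron always outputs 1; its learned weight b is added to the
# weighted sum and shifts the activation function along the input axis.
for b in (-2.0, 0.0, 2.0):  # made-up bias weights for illustration
    outputs = [round(sigmoid(x + b), 3) for x in (-1.0, 0.0, 1.0)]
    print(f"bias weight {b:+}: {outputs}")
```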

The training of artificial neural networks

Before training, the weights of an artificial neural network are assigned random values, usually between -1 and 1. The key to powerful AI: the neural network must learn the correct values for the weights.
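In code, such an initialization could look like this (a sketch assuming NumPy; the layer shape is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(42)

# Random starting weights between -1 and 1 for an arbitrary example layer
# with 4 inputs and 3 neurons; training has to correct these values.
weights = rng.uniform(-1.0, 1.0, size=(4, 3))
print(weights)
```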

For this, the network is trained with examples, such as labeled pictures of cats, dogs, and bananas. Labeled means that the image of a cat also carries the word “cat” in its metadata. With each training example, the network adjusts its weights to better recognize the class “cat” in an image.

Specifically, with each training iteration the weights of the network are modified so that the prediction error, for example for the cat class, is minimized. To do this, the network calculates the deviation of its own prediction from the provided label, taking all the weights into account.

The value obtained in this way represents the cost of the difference between the desired and the actual prediction. The function that calculates this value is therefore called the cost function. The weights of the network are then adjusted, working backwards through the network (backpropagation), so that the cost is lower in the next training run.
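As a sketch: the article does not name a specific cost function, but squared error is one common choice. It compares the network's prediction with the desired label:

```python
def squared_error(prediction, target):
    # Cost of the gap between actual and desired prediction:
    # small when the network is right, large when it is wrong.
    return (prediction - target) ** 2

print(squared_error(0.98, 1.0))  # confident "cat" on a cat image -> tiny cost
print(squared_error(0.10, 1.0))  # missed the cat -> large cost
```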

For a better understanding, it can help to picture the cost function as a graph: there are high and low costs, depending on the output of the network. During training, the network tries to find values for the weights that lead to a minimum on the cost curve, i.e. that produce minimal costs.

This process, called gradient descent, gradually adjusts the weights of the network so that the prediction error ends up as small as possible and the prediction therefore as accurate as possible. Anyone who wants to understand the mathematical details better should take a look at the fantastic explanation by 3Blue1Brown.
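Here is a minimal, self-contained gradient descent loop for a single sigmoid neuron; the squared-error cost, the learning rate, and the toy task (learning a logical AND) are illustrative assumptions, not the article's example:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Toy task (made up for illustration): learn a logical AND.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
t = np.array([0.0, 0.0, 0.0, 1.0])  # desired outputs (labels)

w = rng.uniform(-1, 1, size=2)  # random starting weights
b = 0.0                         # bias weight
lr = 2.0                        # learning rate: step size down the cost curve

for step in range(5000):
    y = sigmoid(X @ w + b)            # forward pass: current predictions
    grad = 2 * (y - t) * y * (1 - y)  # slope of the squared-error cost
    w -= lr * (X.T @ grad) / len(X)   # step against the gradient
    b -= lr * grad.mean()

print(np.round(sigmoid(X @ w + b), 2))  # should approach [0, 0, 0, 1]
```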

Neural Networks - The Solution to All Problems?

Neural networks have become the standard within AI research and can be found in image, text, and pattern recognition; they recognize, translate, and generate language, control complex processes, provide forecasts, form the basis of early warning systems, model biological and economic systems, and beat humans at board and video games.

However, they are only one subfield of AI research, and deep learning is just one variant of machine learning. Whether they are the means of choice to get closer to the great goal of artificial general intelligence is controversial among experts.

AI researchers such as Gary Marcus criticize blind trust in neural networks; other researchers, such as Turing Award winner Geoffrey Hinton, believe that with deep learning they have found the key to superintelligent AI, provided progress continues as quickly and fundamentally as in recent years.

Anyone convinced that neural networks lead into the future of AI, however, must face the question of what is still missing for the big breakthrough: Is it better hardware? Are fundamental principles missing? Or do we need more neurobiology for better AI?

Either way, neural networks existed for almost 50 years before researchers were able to make practical use of them.

Frank Rosenblatt, the inventor of the perceptron, said in an interview with the New York Times as early as 1959: “Later perceptrons will be able to recognize people and call out their names. Printed pages, handwritten letters, and even voice commands are within easy reach. Only one more development step, a difficult step, is needed before a device can hear speech in one language and immediately translate it into speech or text in another language.”

And he saw even more potential in artificial neurons: in principle, according to Rosenblatt, it is possible to build perceptrons that reproduce themselves and are aware of their own existence. Perhaps it will take another 50 years for Rosenblatt's second prediction to come true.
