
AI: Neural Network for beginners (Part 1 of 3)

AI: An introduction to Neural Networks

Introduction

This article is Part 1 of a series of 3 articles that I am going to post. The proposed article content will be as follows:

  1. Part 1: This one, an introduction to Perceptron networks (single-layer neural networks)
  2. Part 2: Will cover multi-layer neural networks and the back propagation training method, applied to a non-linear classification problem such as the logic of an XOR gate. This is something that a Perceptron can't do, as explained further within this article
  3. Part 3: Will cover how to use a genetic algorithm (GA) to train a multi-layer neural network to solve some logic problem

Let's start with some biology

Nerve cells in the brain are called neurons. There are an estimated 10^10 to 10^13 neurons in the human brain, and each neuron can make contact with several thousand other neurons. Neurons are the units the brain uses to process information.

So what does a neuron look like?

A neuron consists of a cell body, with various extensions from it. Most of these are branches called dendrites. There is one much longer process (possibly also branching) called the axon. The dashed line in the figure shows the axon hillock, where transmission of signals starts.

The following diagram illustrates this.

[Figure 1: Neuron]

The boundary of the neuron is known as the cell membrane. There is a voltage difference (the membrane potential) between the inside and outside of the membrane.

If the input is large enough, an action potential is then generated. The action potential (neuronal spike) then travels down the axon, away from the cell body.

[Figure 2: Neuron Spiking]

Synapses

The connections between one neuron and another are called synapses. Information always leaves a neuron via its axon (see Figure 1 above), and is then transmitted across a synapse to the receiving neuron.

Neuron Firing

Neurons only fire when their input exceeds some threshold. It should, however, be noted that the firing doesn't get stronger as the stimulus increases; it's an all-or-nothing arrangement.

[Figure 3: Neuron Firing]

Spikes (signals) are important, since they are what other neurons receive. Neurons communicate with spikes, and the information sent is coded by those spikes.

The input to a Neuron

Synapses can be excitatory or inhibitory.

Spikes (signals) arriving at an excitatory synapse tend to cause the receiving neuron to fire. Spikes (signals) arriving at an inhibitory synapse tend to inhibit the receiving neuron from firing.

The cell body and synapses essentially compute (by a complicated chemical/electrical process) the difference between the incoming excitatory and inhibitory inputs (spatial and temporal summation).

When this difference is large enough (compared to the neuron's threshold) then the neuron will fire.

Roughly speaking, the faster excitatory spikes arrive at a neuron's synapses, the faster it will fire (and similarly for inhibitory spikes, which suppress firing).

So how about artificial neural networks?

Suppose that each neuron has a firing rate, and that a neuron connects with m other neurons, so it receives m inputs x1 … xm. We could imagine this configuration looking something like:

[Figure 4: Artificial Neuron configuration]

This configuration is actually called a Perceptron. The perceptron (an invention of Rosenblatt [1962]) was one of the earliest neural network models. A perceptron models a neuron by taking a weighted sum of its inputs and outputting 1 if the sum is greater than some adjustable threshold value, and 0 otherwise (this is the all-or-nothing spiking described in the Neuron Firing section above). The function making this decision is called the activation function.

The inputs (x1, x2, … xm) and connection weights (w1, w2, … wm) in Figure 4 are typically real values, both positive (+) and negative (-). If the feature of some xi tends to cause the perceptron to fire, the weight wi will be positive; if the feature xi inhibits the perceptron, the weight wi will be negative.

The perceptron itself consists of the weights, the summation processor, an activation function, and an adjustable threshold processor (called the bias hereafter).

For convenience, the normal practice is to treat the bias as just another input (one whose value is always 1). The following diagram illustrates the revised configuration.

[Figure 5: Artificial Neuron configuration, with bias as additional input]

The bias can be thought of as the propensity (a tendency towards a particular way of behaving) of the perceptron to fire irrespective of its inputs. The perceptron network shown in Figure 5 fires if the weighted sum > 0, or, if you're into math-type explanations:

$$
y = \begin{cases} 1 & \text{if } \sum_{i=0}^{m} w_i x_i > 0 \\ 0 & \text{otherwise} \end{cases}
\qquad \text{where } x_0 = 1 \text{ and } w_0 = b
$$
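To make that concrete, here is a minimal C# sketch (my own illustrative code, not the article's downloadable source) of the feed-forward calculation, with the bias folded in as weights[0] acting on a constant input of 1:

```csharp
using System;

// A minimal sketch of a perceptron's feed-forward step.
// weights[0] plays the role of the bias, paired with a constant input of 1.
double[] weights = { -0.5, 0.7, 0.7 };  // weights[0] is the bias weight
double[] inputs  = {  1.0, 0.0, 1.0 };  // inputs[0] is always 1 (the bias input)

double sum = 0.0;
for (int i = 0; i < weights.Length; i++)
    sum += weights[i] * inputs[i];

int output = sum > 0 ? 1 : 0;           // fire (1) only if the weighted sum > 0
Console.WriteLine(output);              // prints 1 here: -0.5 + 0.7 = 0.2 > 0
```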

Activation Function

The activation function usually takes one of the following forms.

Sigmoid Function

The stronger the input, the faster the neuron fires (the higher the firing rate). The sigmoid is also very useful in multi-layer networks, as the sigmoid curve allows for differentiation (which is required in back propagation training of multi-layer networks).

[Image: the sigmoid curve]

or, if you're into math-type explanations:

$$
f(x) = \frac{1}{1 + e^{-x}}
$$

Step Function

A basic on/off type function: if x < 0 then 0, else if x >= 0 then 1.

[Image: the step function]

or, if you're into math-type explanations:

$$
f(x) = \begin{cases} 1 & \text{if } x \ge 0 \\ 0 & \text{if } x < 0 \end{cases}
$$
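As a rough sketch (C#, my own helper names, not from the article's source), the two activation functions could be written as:

```csharp
using System;

static class Activation
{
    // Sigmoid: squashes any input smoothly into the range (0, 1);
    // differentiable everywhere, which back propagation (Part 2) relies on.
    public static double Sigmoid(double x) => 1.0 / (1.0 + Math.Exp(-x));

    // Step: the all-or-nothing function used by the basic perceptron.
    public static int Step(double x) => x >= 0 ? 1 : 0;
}
```

For example, Activation.Sigmoid(0) returns 0.5, while Activation.Step(0) returns 1, matching the formulas above.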

Learning

A foreword on learning

Before we carry on to talk about perceptron learning, let's consider a real-world example:

How do you teach a child to recognize a chair? You show him examples, telling him, "This is a chair. That is not a chair," until the child learns the concept of what a chair is. In this stage, the child can look at the examples we have shown him and answer correctly when asked, "Is this object a chair?"

Furthermore, if we show the child new objects that he hasn't seen before, we can expect him to recognize correctly whether the new object is a chair or not, provided that we've given him enough positive and negative examples.

This is exactly the idea behind the perceptron.

Learning in perceptrons

Learning in a perceptron is the process of modifying the weights and the bias. A perceptron computes a binary function of its inputs. Whatever a perceptron can compute, it can learn to compute.

"The perceptron is a program that learn concepts, i.e. it can learn to respond with True (1) or False (0) for inputs we present to it, by repeatedly "studying" examples presented to it.

The Perceptron is a single layer neural network whose weights and biases could be trained to produce a correct target vector when presented with the corresponding input vector. The training technique used is called the perceptron learning rule. The perceptron generated great interest due to its ability to generalize from its training vectors and work with randomly distributed connections. Perceptrons are especially suited for simple problems in pattern classification."

Professor Jianfeng Feng, Centre for Scientific Computing, Warwick University, England.

The Learning Rule

The perceptron is trained to respond to each input vector with a corresponding target output of either 0 or 1. The learning rule has been proven to converge on a solution in finite time if a solution exists.

The learning rule can be summarized in the following two equations:

b = b + [ T - A ]

For all inputs i:

W(i) = W(i) + [ T - A ] * P(i)


Where W is the vector of weights, P is the input vector presented to the network, T is the correct result that the neuron should have shown, A is the actual output of the neuron, and b is the bias.
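Translated directly into code (a C# sketch that can be dropped into any class; the names w, b, p, t and a mirror the symbols W, b, P, T and A above), one application of the rule looks like this:

```csharp
// One application of the perceptron learning rule to a single training vector.
// w: weights, b: bias (passed by ref so it is updated in place),
// p: input vector, t: target output, a: actual output.
static void ApplyLearningRule(double[] w, ref double b, double[] p, int t, int a)
{
    int error = t - a;           // [ T - A ]
    b += error;                  // b = b + [ T - A ]
    for (int i = 0; i < w.Length; i++)
        w[i] += error * p[i];    // W(i) = W(i) + [ T - A ] * P(i)
}
```

Note that when the output is correct, error is 0 and nothing changes, which is exactly the training behaviour described next.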

Training

Vectors from a training set are presented to the network one after another.

If the network's output is correct, no change is made.

Otherwise, the weights and bias are updated using the perceptron learning rule (as shown above). When an entire pass through all of the input training vectors (called an epoch) has occurred without error, training is complete.

At this point, any of the training vectors may be presented to the network and it will respond with the correct output. If a vector P not in the training set is presented, the network will tend to exhibit generalization, responding with an output similar to the target outputs for training vectors close to the previously unseen vector P.
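Putting the pieces together, the following self-contained C# sketch (again my own illustrative code, not the article's download) trains a two-input perceptron on the AND function, repeating epochs until a full pass produces no errors:

```csharp
using System;

class PerceptronDemo
{
    static void Main()
    {
        // Training set for the AND gate: four input vectors and their targets.
        double[][] inputs =
        {
            new[] { 0.0, 0.0 }, new[] { 0.0, 1.0 },
            new[] { 1.0, 0.0 }, new[] { 1.0, 1.0 }
        };
        int[] targets = { 0, 0, 0, 1 };

        double[] w = { 0.0, 0.0 };
        double b = 0.0;

        bool errorFree = false;
        while (!errorFree)                     // one pass of this loop = one epoch
        {
            errorFree = true;
            for (int v = 0; v < inputs.Length; v++)
            {
                double sum = b;
                for (int i = 0; i < w.Length; i++)
                    sum += w[i] * inputs[v][i];
                int actual = sum > 0 ? 1 : 0;  // step activation: fire if sum > 0

                int error = targets[v] - actual;   // [ T - A ]
                if (error != 0)                    // wrong: apply the learning rule
                {
                    errorFree = false;
                    b += error;
                    for (int i = 0; i < w.Length; i++)
                        w[i] += error * inputs[v][i];
                }
            }
        }
        Console.WriteLine($"Learned: w1={w[0]}, w2={w[1]}, b={b}");
    }
}
```

With this presentation order it settles after six epochs (on w1=2, w2=1, b=-2). The loop only terminates because AND is linearly separable; for a problem like XOR you would need to cap the number of epochs, as training would otherwise run forever (see the discussion below).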

So what can we do with neural networks?

Well, if we stick to using a single-layer neural network, the tasks that can be achieved are different from those that can be achieved by multi-layer neural networks. As this article is mainly geared towards dealing with single-layer networks, let's discuss those further:

Single-layer neural networks

Single-layer neural networks (perceptron networks) are networks in which each output unit is independent of the others - each weight affects only one output. Using perceptron networks it is possible to achieve linearly separable classifications, like those shown in the diagrams below (assuming we have a network with 2 inputs and 1 output).

[Image: linearly separable classification regions]

It can be seen that this is equivalent to the AND / OR logic gates, shown below.

[Figure 6: Classification tasks (AND / OR)]

So that's a simple example of what we could do with one perceptron (essentially a single neuron), but what if we were to chain several perceptrons together? We could build some quite complex functionality. Basically, we would be constructing the equivalent of an electronic circuit.

Perceptron networks do, however, have limitations. If the vectors are not linearly separable, learning will never reach a point where all vectors are classified properly. The most famous example of the perceptron's inability to solve problems with linearly non-separable vectors is the boolean XOR problem, as the short derivation below shows.
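To see why, suppose (for contradiction) that weights $w_1, w_2$ and bias $b$ existed that reproduce XOR under the "fire if the weighted sum > 0" rule. The four input vectors would require:

$$
\begin{aligned}
(0,0) \mapsto 0:&\quad b \le 0\\
(0,1) \mapsto 1:&\quad w_2 + b > 0\\
(1,0) \mapsto 1:&\quad w_1 + b > 0\\
(1,1) \mapsto 0:&\quad w_1 + w_2 + b \le 0
\end{aligned}
$$

Adding the two middle inequalities gives $w_1 + w_2 + 2b > 0$, while adding the first and last gives $w_1 + w_2 + 2b \le 0$: a contradiction. No straight line can separate XOR's outputs, which is exactly the limitation that multi-layer networks remove.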

Multi-layer neural networks

With multi-layer neural networks we can solve non-linearly separable problems such as the XOR problem mentioned above, which is not achievable using single-layer (perceptron) networks. The next part of this article series will show how to do this using multi-layer neural networks and the back propagation training method.

Well, that's about it for this article. I hope it's a nice introduction to neural networks. I will try to publish the other two articles when I have some spare time (in between MSc dissertation and other assignments). I want them to be pretty graphical, so it may take me a while, but I'll get there soon, I promise.

What Do You Think?

That's it. I would just like to ask: if you liked the article, please vote for it.

Points of Interest

I think AI is fairly interesting; that's why I am taking the time to publish these articles. So I hope someone else finds it interesting, and that it might help further someone's knowledge, as it has my own.

History

v1.0 17/11/06

Bibliography

Rich, E. and Knight, K., Artificial Intelligence, 2nd edition, McGraw-Hill Inc.

Russell, S. and Norvig, P., Artificial Intelligence: A Modern Approach, Prentice Hall.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

