Artificial neuron

From Wikipedia, the free encyclopedia

An artificial neuron, also called a node, Nv neuron, binary neuron or McCulloch-Pitts neuron, is an abstraction of biological neurons and the basic unit of an artificial neural network. The artificial neuron receives one or more inputs (representing the one or more dendrites) and sums them to produce an output (synapse). Usually the inputs to each node are weighted, and the sum is passed through a non-linear function known as an activation or transfer function. The canonical form of transfer function is the sigmoid, but it may also take the form of other non-linear functions, piecewise linear functions, or step functions. Generally, transfer functions are monotonically increasing.

1 Basic structure
2 History
3 Types of transfer functions
4 See also
5 Bibliography

Basic structure

For a given artificial neuron k, let there be m inputs with signals $x_1$ through $x_m$ and weights $w_{k1}$ through $w_{km}$.

The output of neuron k is:

$y_k = \varphi \left( \sum_{j=1}^m w_{kj} x_j \right)$

where $\varphi$ (phi) is the transfer function.

The output propagates to the next layer (through a weighted synapse) or finally exits the system as part or all of the output.
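
As a minimal sketch of the output formula above, the following Python function computes one neuron's output from its inputs and weights. The names `neuron_output` and `identity` are hypothetical, chosen only for illustration, and the transfer function is passed in as a parameter because no particular choice is fixed at this point.

```python
from typing import Callable, Sequence


def neuron_output(
    inputs: Sequence[float],
    weights: Sequence[float],
    transfer: Callable[[float], float],
) -> float:
    """Return transfer(sum_j weights[j] * inputs[j]) for a single neuron."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return transfer(weighted_sum)


def identity(u: float) -> float:
    """Placeholder transfer function (an assumption for this example)."""
    return u


# Weighted sum: 0.5*1.0 + (-1.0)*2.0 + 2.0*3.0 = 4.5
print(neuron_output([1.0, 2.0, 3.0], [0.5, -1.0, 2.0], identity))  # 4.5
```

Any of the transfer functions described later could be supplied in place of `identity`.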

History

The original artificial neuron is the Threshold Logic Unit first proposed by Warren McCulloch and Walter Pitts in 1943. As a transfer function, it employs a threshold or step function taking on the values 1 or 0 only.

Types of transfer functions

The transfer function of a neuron is chosen to have a number of properties which either enhance or simplify the network containing the neuron. Crucially, for instance, any multi-layer perceptron using a linear transfer function has an equivalent single-layer network; a non-linear function is therefore necessary to gain the advantages of a multi-layer network.
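
The collapse of a linear multi-layer network into a single layer can be checked numerically. The sketch below assumes two layers with the identity transfer function and no bias terms; the weight values and the helper names `matmul` and `apply_layer` are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply matrix a (n x k) by matrix b (k x m)."""
    return [
        [sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def apply_layer(weights: Matrix, x: List[float]) -> List[float]:
    """Apply one layer whose transfer function is the identity (linear)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


# Hypothetical weights: layer 1 maps 2 inputs to 3 units,
# layer 2 maps those 3 units to 1 output.
w1 = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]]
w2 = [[2.0, 1.0, -1.0]]

x = [4.0, 5.0]
two_layer = apply_layer(w2, apply_layer(w1, x))

# A single layer whose weight matrix is the product w2 * w1.
combined = matmul(w2, w1)
one_layer = apply_layer(combined, x)

print(combined)   # [[-1.0, 2.0]]
print(two_layer)  # [6.0]
print(one_layer)  # [6.0]
```

Because the composition of linear maps is itself linear, the two networks agree for every input, not only for the one shown.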

Below, u refers in all cases to the weighted sum of all the inputs to the neuron, i.e. for n inputs,

$u = \sum_{i = 1}^n w_{i} x_{i}$

where w is a vector of synaptic weights and x is a vector of inputs.

Step function

The output y of this transfer function is binary, depending on whether the input meets a specified threshold, θ. The "signal" is sent, i.e. the output is set to one, if the activation meets the threshold.

$y = \left\{ \begin{matrix} 1 & \mbox{if }u \ge \theta \\ 0 & \mbox{if }u < \theta \end{matrix} \right.$
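
The definition above translates directly into code. This is a minimal sketch; the function name `step` and the sample values are illustrative.

```python
def step(u: float, theta: float) -> int:
    """Return 1 if the weighted sum u meets the threshold theta, else 0."""
    return 1 if u >= theta else 0


print(step(0.7, 0.5))  # 1
print(step(0.5, 0.5))  # 1, because the comparison is "greater than or equal"
print(step(0.2, 0.5))  # 0
```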

See: Step function

Linear combination

The output y is a linearly weighted sum of the unit's inputs plus a bias term b, which, like θ above, is independent of the inputs.

$y = u + b$

Networks based on this formulation are known as perceptrons. In its pure form, this transfer function is typically useful only in a regression setting. For binary classification, the sign of the output denotes the predicted class; in this case it is more sensible (and more convenient for the learning algorithm) to treat positive outputs as 1 and negative outputs as 0, which reduces the transfer function to the step function above with $\theta = -b$.
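
The equivalence between classifying by sign and applying the step function with a threshold of minus the bias can be checked on a few values. This is a sketch under one stated assumption: an output of exactly zero is assigned to class 1, a case the text leaves open. The names and the bias value are hypothetical.

```python
def step(u: float, theta: float) -> int:
    """Step transfer function: 1 if u meets the threshold theta, else 0."""
    return 1 if u >= theta else 0


def classify_by_sign(u: float, b: float) -> int:
    """Class from the sign of the linear output u + b.

    Assumption: an output of exactly zero counts as class 1.
    """
    return 1 if (u + b) >= 0 else 0


b = -1.5  # hypothetical bias
for u in [-2.0, 1.0, 1.5, 2.0, 3.25]:
    # u + b >= 0 holds exactly when u >= -b, so both rules agree.
    assert classify_by_sign(u, b) == step(u, theta=-b)
    print(u, classify_by_sign(u, b))
```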

See: Perceptron

Sigmoid

A fairly simple non-linear function, the sigmoid also has an easily calculated derivative, which is used when calculating the weight updates in the network. It thus makes the network more easily manipulable mathematically, and was attractive to early computer scientists who needed to minimise the computational load of their simulations.
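
To illustrate why the derivative is cheap to compute, the sketch below assumes the logistic function $1 / (1 + e^{-u})$, a common choice of sigmoid that the text does not name explicitly. Its derivative can be written using only the function's own value, so no further exponential is needed once the output is known.

```python
import math


def sigmoid(u: float) -> float:
    """Logistic sigmoid 1 / (1 + e^(-u))."""
    return 1.0 / (1.0 + math.exp(-u))


def sigmoid_derivative(u: float) -> float:
    """Derivative written in terms of the sigmoid's own value: s * (1 - s)."""
    s = sigmoid(u)
    return s * (1.0 - s)


print(sigmoid(0.0))             # 0.5
print(sigmoid_derivative(0.0))  # 0.25
```

This sketch does not guard against overflow in `math.exp` for very large negative inputs.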

See: Sigmoid function

Criticism

The artificial neuron is generally criticized for lacking a correct biophysical mapping. Although it captures the numerous incoming dendrites, it does not provide multiple output synapses. This simplification speeds up computation, but at the cost of biophysical accuracy.

Pseudocode Algorithm

The following is a simple pseudocode implementation of a single TLU that takes boolean inputs (true or false) and returns a single boolean output when activated. An object-oriented model is used. No method of training is defined, since several exist. If a purely functional model were used, the class TLU below would be replaced with a function TLU that takes threshold, weights, and inputs as parameters and returns a boolean value.

 class TLU defined as:
  data member threshold : number
  data member weights : list of numbers of size X
  function member fire( inputs : list of booleans of size X ) : boolean defined as:
   variable T : number
   T ← 0
   for each i in 1 to X :
    if inputs(i) is true :
     T ← T + weights(i)
    end if
   end for each
   if T > threshold :
    return true
   else:
    return false
   end if
  end function
 end class
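
One possible runnable rendering of the pseudocode above, sketched in Python. The use of a dataclass, the length check, and the AND-gate weights in the usage example are assumptions added for illustration and are not part of the pseudocode.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class TLU:
    """Threshold logic unit with boolean inputs and a boolean output."""

    threshold: float
    weights: List[float]

    def fire(self, inputs: Sequence[bool]) -> bool:
        if len(inputs) != len(self.weights):
            raise ValueError("inputs and weights must have the same length")
        # Add up the weights of the inputs that are true.
        total = sum(w for w, active in zip(self.weights, inputs) if active)
        # Strict '>' mirrors the pseudocode; a "meets the threshold"
        # convention (u >= theta) would use '>=' instead.
        return total > self.threshold


# Hypothetical example: weights and threshold chosen to behave as a logical AND.
and_gate = TLU(threshold=1.5, weights=[1.0, 1.0])
print(and_gate.fire([True, True]))    # True
print(and_gate.fire([True, False]))   # False
print(and_gate.fire([False, False]))  # False
```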