An Introduction to Spiking Neural Networks [Working]

1 What’s it all about

TL;DR: A more neuroscientifically realistic model of artificial neural networks.

The current generation of widely applied ANN models is much more artificial than intelligent. I don’t want to elaborate too much on this, but most applied ANNs can be seen as universal function approximators, and since I don’t believe in a functionalist theory of mind, ANNs are not much different from traditional statistical learning methods to me. They are not dynamic, by and large, which limits their applicability even to temporal sequential data.

One (and almost the only) great example of a neuroscience-inspired ANN model is the fabled convolutional neural network, purported to be derived from how columns in the primary visual cortex work. But it stretches the point to say we are anywhere near making useful neuromorphic models. Sensory systems are mere utilities and largely feedforward, hence the relative ease of studying and modeling them. They sit at the very low tier of peripheral systems supporting the brain. Things get funky really quickly if we try to climb up the ladder to even how the lower levels of sub-symbolic processing work. And beyond that we have to bridge the gap between the sub-symbolic systems and the huge unexplored ??? zones of neural processing, for which we don’t even have adequate philosophical tools to grasp the issue.

There will always be another unsatisfied lunatic. I’m not quite sure if I’m on board yet, but let’s take a look; it’s not gonna hurt.

One cool thing about SNNs is that there’s no well-established model/framework/training method at all, because the field is still in its infancy. So there’s a wide plethora of wildly different models and methods trying to tackle the challenge, and I doubt there will ever be one grand unified methodology like gradient backpropagation is for currently applied ANNs. Sky is the limit!

2 The idea

To be real is to be realistic. We are no gods, so we mimic god’s(-sic) work by looking at the brain. The gross history of ANN development is a chase after higher biological realism.

2.1 Towards biological realism [2]

First-generation ANNs: synchronous linear-threshold activation with binary output.

Second-generation ANNs: synchronous non-linear activation with real-valued output. (We are here.)

Third(?)-generation ANNs: asynchronous, non-linear threshold-firing models with binary spike-train output.

For a slightly more detailed description, check out the referenced article. The spirit is that we push toward higher biological fidelity whenever we can.

Do note that the real-valued output of 2nd-gen ANNs is not a deviation from biological realism, and the binary output of the purported 3rd-gen ANNs is not a regression from it. Current synchronous networks with decidedly non-temporal, real-valued outputs can be viewed as a smoothed approximation of the firing rates of binary spike trains, incorporating limited temporal information without introducing too much additional complexity. They can be seen as a special case of SNNs in which the precise timing of spikes is discarded.

2.2 Why?

The brain is a crazy and chaotic place. Feedforward neural networks are beautiful mathematical entities with desirable properties that give you a sense of controllability and peace of mind (people doing deep learning (like me) might disagree; read any book about neural dynamics and computational neuroscience and you’ll see what I mean). That’s the reason why almost nobody is doing SNNs. (NIPS 2019 accepted only 2 of 1430 papers that have anything to do with spiking neurons, and one of them is about using autoencoders to model sequential spiking neural data.)

(This can be totally wrong.) By getting out of the comfort zone of feedforward networks, we can allow much bigger variation in network topology, neural coding scheme, and dynamics. By thinking outside the box of back-prop trainability, we can get a step closer to understanding how the brain actually thinks and learns, being a giant self-organized holistic system that relies largely on local interactions.

P.S. The biggest financial incentive behind SNN research is its locality and event-driven nature, assumed to cost far less energy than current networks when instantiated on a dedicated chip with neuromorphic sensors. Not like many people care about knowing how the brain works lol.

3 Angles of attack

3.0 Methodological Considerations

Spiking neural networks are a paradigm shift in neuronal information processing, so it goes without saying that they are not drop-in replacements for current neuronal models (remember the good days when you could stack neuronal models like Lego and your software handled everything for you?). SNNs in general are utterly hard to simulate and train without introducing a considerable amount of reduction and simplification.

The only way to do meaningful work on SNN-based computing is to divide and conquer, in an incremental way. And there are different starting points.

3.1 By extension of 2nd gen. ANNs

If it’s too hard to make SNNs work from scratch, then it’s sensible to modify currently working neural network models, add some SNN flavor, and hope they gain some of the desirable features of SNNs.

  • Gradient-based methods are just great and everybody loves them; how nice would it be if we could train spiking neural networks with backpropagation? There is a lot of work on trying to adapt SNNs for gradient-based training, with additional constraints (see the sketch after this list).
  • Remember that SNNs use less energy, and that 2nd-gen ANNs are, in a way, special cases of SNNs? Can we convert trained ANNs to SNNs, bypassing training while harvesting the energy efficiency? You just might.
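To make the first bullet concrete, here is a minimal sketch, in PyTorch-flavored Python with names of my own invention, of the surrogate-gradient trick commonly used to push gradients through the non-differentiable spike: the forward pass emits a binary spike, while the backward pass substitutes a smooth surrogate derivative. Treat it as an illustration of the idea, not the implementation of any particular paper.

```python
# Hedged sketch: a Heaviside spike with a "fast sigmoid" surrogate derivative,
# so the spiking non-linearity can sit inside a backprop-trained network.
# All names (SpikeFn, the 10.0 slope constant) are my own choices.
import torch

class SpikeFn(torch.autograd.Function):
    @staticmethod
    def forward(ctx, v_minus_thresh):
        ctx.save_for_backward(v_minus_thresh)
        return (v_minus_thresh > 0).float()      # binary spike: 1 if potential exceeds threshold

    @staticmethod
    def backward(ctx, grad_output):
        (v_minus_thresh,) = ctx.saved_tensors
        # The true derivative of the step function is zero almost everywhere;
        # replace it with the derivative of a fast sigmoid so gradients can flow.
        surrogate = 1.0 / (1.0 + 10.0 * v_minus_thresh.abs()) ** 2
        return grad_output * surrogate

spike = SpikeFn.apply  # usable like any other differentiable op: spike(v - v_th)
```

The second bullet (ANN-to-SNN conversion) typically works the other way around: train an ordinary ReLU network first, then map its activations to firing rates of IF neurons at inference time.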

3.2 Reservoir Computing

If SNNs are so hard to train, how about we don’t train them at all? How is that going to be useful? Actually, that’s exactly what reservoir computing does. I’m not going to talk in detail about liquid state machines or echo state networks, so I’ll just give their general idea here.

Basically, you throw a rock into a bucket of water (hence “reservoir”) and let the waves evolve and bounce around for a fixed period of time, then freeze the waves. Now you try to infer the size and shape of the rock by looking at the frozen wave pattern.

For reservoir computing, you have a pool of randomly initialized spiking neurons (both connectivity and other parameters are random) that you can feed information into. After the information propagates through the network for a predetermined time period, you use the resulting activation pattern of the spiking-neuron soup as input to a traditional trainable feedforward network, the “read-out” neurons, to carry out classification or regression tasks.

We can treat the pool of interacting neurons as performing a high-dimensional embedding of the input. Maybe the reservoir has a certain clustering effect on the input, but it’s not very clear how we benefit from an untrained spiking neural network. (Further investigation pending.)

Reservoir computing methods have been applied to many real-life problems like spoken word recognition, spatio-temporal spike pattern classification, motion prediction, and motor control. They are purported to generate smoother movement for motor control problems.

Augmentations of vanilla reservoir computing methods include:

  • Add a feedback input to the reservoir from the read-outs
  • Apply unsupervised learning rules for SNNs on the reservoir, like Hebbian learning or spike-timing-dependent plasticity (STDP)
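Below is a toy sketch of the vanilla scheme, assuming a sparse random pool of crude LIF-style units whose time-averaged firing rates are fed to a ridge-regression read-out. Every size, constant, and label here is made up purely for illustration.

```python
# Toy reservoir computing sketch: random spiking pool + trained linear read-out.
import numpy as np

rng = np.random.default_rng(0)
N_RES, N_IN, T = 200, 10, 100                     # reservoir size, input dims, time steps
W_res = rng.normal(0, 0.1, (N_RES, N_RES)) * (rng.random((N_RES, N_RES)) < 0.1)  # sparse random recurrent weights
W_in = rng.normal(0, 1.0, (N_RES, N_IN))          # random input weights

def run_reservoir(x_seq, v_th=1.0, leak=0.9):
    """Drive the untrained spiking pool with an input sequence; return mean firing rates."""
    v = np.zeros(N_RES)
    spikes = np.zeros(N_RES)
    counts = np.zeros(N_RES)
    for x in x_seq:                                # x_seq has shape (T, N_IN)
        v = leak * v + W_in @ x + W_res @ spikes   # leaky integration of input + recurrent spikes
        spikes = (v > v_th).astype(float)          # fire where the threshold is crossed
        v[spikes > 0] = 0.0                        # reset fired units
        counts += spikes
    return counts / len(x_seq)                     # read-out features: per-neuron firing rates

# Train only the read-out (ridge regression) on the frozen reservoir states.
X = np.stack([run_reservoir(rng.random((T, N_IN))) for _ in range(50)])
y = rng.integers(0, 2, 50).astype(float)          # fake binary labels, just to show the mechanics
W_out = np.linalg.solve(X.T @ X + 1e-2 * np.eye(N_RES), X.T @ y)
pred = (X @ W_out > 0.5).astype(float)
```

The reservoir itself is never trained; all the learning happens in W_out, which is exactly why the augmentations listed above (read-out feedback, STDP inside the pool) are attractive.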

4 Neuroscience facts

Some facts that may contribute to thinking about SNNs.

Edit: I was compiling a list of neuroscientific facts on neural information processing that are not respected by current neural network models, but it soon got out of control because there are too many crazy intricacies in our brain that are hard to prioritize and categorize in an informative manner, so I’ll leave this part for another post.

5 Spiking Neurons and Networks

Finally we arrive at the description of SNNs themselves.

One fun fact about neuronal modeling is that it’s almost fractal. When you dig deeper, there’s always more detail to be modeled. We can never reach absolute biological realism, and it would be stupid to try. The important thing is to do your best to understand how the brain works and make reasonable abstractions, preserving the principal principles/mechanisms behind the thinking brain, and utilizing them as best you can. We are not copying the brain, but learning from it.

For a more comprehensive description of neuronal dynamics please refer to the great textbook [4], where all sorts of neuronal models of different granularity are described.

There’s no established practice on how to make a SNN, so you are free^ to choose different combinations of various neuronal models, network topologies and learning rules. I’ll describe some simple and commonly used or explored examples.

^ Please bear in mind that by free I mean you can do whatever you wish, but there’s no guarantee it will work. Because natural neural networks are temporal systems evolving continuously, we have to choose a proper model granularity and time resolution to run the network. Relevant abstraction is of utmost importance. A more sophisticated neuronal model is not necessarily better, unless you have unlimited computational power.

5.1 Neuron models

5.1.1 (Leaky) Integrate and Fire Model

To a first and rough approximation, neuronal dynamics can be conceived as a summation process (sometimes also called an “integration” process) combined with a mechanism that triggers action potentials above some critical voltage.

This is the simplest and historically most common biologically realistic model, but it’s powerful enough to capture many aspects of how natural neurons work.

An intuitive understanding of the IF model (leaky if there’s a hole leaking water) is the classic bucket picture: input current pours water into a bucket, the water level is the membrane potential, a hole in the bottom slowly drains it (the leak), and the neuron fires when the level reaches the rim (the threshold).

It simply integrates the time-dependent input current and fires when a threshold (is the threshold fixed, or does it even exist at all? That’s an interesting topic that deserves another post) is reached. Let’s derive the formula for a simplified LIF model from biological neurons.

IF models are defined by two sub-components:

  • An equation that describes the evolution of the membrane potential
  • A mechanism to generate spikes

By ‘simplified’ I mean:

  • We use a linear differential equation to describe the evolution of the membrane potential
  • We use a fixed threshold for spike firing

First we define some variables for our model:

  • \mathcal{V}: firing threshold
  • u_i(t): membrane potential of neuron i
  • t_i^f: firing time, when the voltage u_i(t) reaches \mathcal{V} from below
  • u_{rest}: membrane potential in the absence of any input, hence the name resting potential
  • I(t): injected current, applied directly by the experimenter or coming from other neurons

Because action potentials fired by the same neuron always have the same form, no information is conveyed by the shape of a spike. Thus we can reduce neuronal firing activity to a train of spiking events, fully specified by their timing.

The cellular membrane is pretty good at insulating, so we can treat it as a capacitor with capacitance C. And because it’s not a perfect insulator, we denote the leak resistance by R (you can intuitively see that the capacitor does the integrating while the resistor does the leaking, hence leaky integrate):

The capacitor and resistor run in parallel, connecting the extracellular and intracellular fluid, possibly driven by a current I(t).

From the law of current conservation we split the driving current into two parts:

I(t)=I_R+I_C

in which the resistor current is found by Ohm’s law:

I_R=\frac{u_R}{R}; u_R=u(t)-u_{rest}

and the capacitor current, from q = Cu, is:

I_C=\frac{dq}{dt}=C\frac{du}{dt}

Thus:

I(t)=\frac{u(t)-u_{rest}}{R}+C\frac{du}{dt}

That’s pretty much it. Simple, isn’t it? But we need some adaptation for use in neuronal computation models. First, we only care about the membrane potential relative to the resting potential, so we substitute u(t)-u_{rest} with u(t). Second, we need to further divide the driving current into an externally injected current and currents from different input neurons: I(t)=i_o(t)+\sum_{j}{w_j i_j(t)}

Substituting and reorganizing, we get:

C\frac{du}{dt}=-\frac{u(t)}{R}+(i_o(t)+\sum_{j}{w_ji_j(t)})

where i_o(t) is the external driving current and i_j(t) is the injected current from the j-th synaptic input, with w_j being its synaptic strength. (For R\rightarrow \infty the formula describes a non-leaky IF model.)

For the firing mechanism, in this simplified setting we simply reset the membrane potential u(t) to 0 or a fixed value u_{rest} when it reaches \mathcal{V}, and then send a spike down the axon (to other neurons). To respect the refractoriness of biological neurons, we can clamp the membrane potential for a fixed period of time after firing.

Notice that in this representation the driving current (i_o(t)+\sum_{j}{w_j i_j(t)}) contains unspecified time-dependent functions; you are free to choose different input functions at your convenience.

One simple option: treat each input spike as an infinitely short pulse that delivers a constant charge q. Then we can substitute the entire input current component with:

q\sum_{f}\delta(t-t^f)

where \delta is the Dirac delta function and each term represents one spike arriving at time t^f.

Another common practice is to use:

i_j(t)=\int_{0}^{\infty}S_j(t-s)\exp(-\frac{s}{\tau_s})ds

where S_j(t) is an arbitrarily complex presynaptic spike train from input j, and \tau_s is the synaptic time constant that controls how fast the input decays exponentially.
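In discrete time this exponential filter reduces to a simple leaky accumulator. Here is a tiny sketch of that, assuming the spike train is a 0/1 array sampled on a grid of step dt (the discretization and names are mine):

```python
# Exponentially filtered synaptic current from a binary presynaptic spike train.
import numpy as np

def exp_filtered_current(spike_train, tau_s=5.0, dt=0.1):
    """Discrete-time version of convolving S_j with exp(-s / tau_s)."""
    i_j = np.zeros(len(spike_train))
    decay = np.exp(-dt / tau_s)
    acc = 0.0
    for t, s in enumerate(spike_train):
        acc = acc * decay + s          # each incoming spike adds 1, then decays exponentially
        i_j[t] = acc
    return i_j
```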

By now we can see that the power of LIF models largely depends on how you choose a specific representation for the input functions. Simple as it is, the generalized LIF model can even accurately predict the spike train of a biological neuron (with constraints)!
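To tie the pieces together, here is a minimal forward-Euler simulation of the LIF equation above, with a fixed threshold, reset, and refractory clamp. All constants are arbitrary placeholders, not values from any reference:

```python
# Minimal LIF neuron: C du/dt = -u(t)/R + I(t), fixed threshold, reset, refractory clamp.
import numpy as np

C, R = 1.0, 10.0             # membrane capacitance and leak resistance (arbitrary units)
V_TH, U_RESET = 1.0, 0.0     # firing threshold and reset potential
DT, T_REF = 0.1, 2.0         # integration step and refractory period

def simulate_lif(i_input, dt=DT):
    """Integrate the LIF equation for an input-current array; return the voltage trace and spike times."""
    u, refractory = 0.0, 0.0
    trace, spike_times = [], []
    for step, i_t in enumerate(i_input):
        if refractory > 0:                       # clamp the potential after a spike
            refractory -= dt
            u = U_RESET
        else:
            u += (-(u / R) + i_t) * dt / C       # leaky integration (forward Euler)
            if u >= V_TH:                        # threshold crossed: emit a spike and reset
                spike_times.append(step * dt)
                u = U_RESET
                refractory = T_REF
        trace.append(u)
    return np.array(trace), spike_times

# A constant drive whose steady state (I * R = 1.5) exceeds the threshold gives a regular spike train.
trace, spikes = simulate_lif(np.full(1000, 0.15))
```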

5.1.2 Hodgkin-Huxley Model

5.2 Network topology

I hate to say this, but we really don’t know enough about neural wiring in the brain. Simple feedforward networks rarely appear in biological networks. Even the peripheral system and the lower levels of the somatosensory system, which have a gross feedforward structure, contain a lot of feedback loops that control how the network behaves. As we get closer to the central nervous system, the picture gets more and more murky.

Also, neurons are usually modeled as points with no spatial extent. This biological inaccuracy may have some impact on modeling neuronal information processing. Thousands of input synapses at different dendritic locations interact with each other non-linearly, which is in itself a very complex computational process, very much like logic gates in integrated circuits. The assumption of linearly summable input is only for convenience and not biologically warranted.

Besides its spatial complexity, the brain is also evolving temporally. Everybody has a different brain scheme, controlled not only by their genetic code but also by their experience. The genetic code decides the overall structure and wiring of different modules, but the maturation of the neural network and its local topology is intrinsically experience-driven. This topologically dynamic nature of the brain is even harder to capture. How the brain is formed and matured from the prenatal period to adolescence is a fascinating topic in itself.

So we’re back at square 2, trying to make baby steps from currently well-understood networks. The choice of topology also affects how hard the network is to train: the deeper and more freely recurrent it is, the harder it is to train.

5.2.1 Feedforward Network

In feedforward networks the information flow is one-directional. FF networks form the backbone of many biological peripheral systems, where a lot of information is passed from sensory systems to the central networks. Thus FF networks are usually applied to model low-level sensory systems.

FF networks are also generally easier to train than networks with recurrent connections; many supervised learning methods are only applicable to strictly FF spiking neural networks. But the lack of feedback loops can seriously limit the network’s capability.

5.2.2 Recurrent Networks

Recurrent networks form a broad spectrum, from simpler ones augmented from FF networks to stochastically connected networks where the sense of direction is completely lost. In recurrent networks neurons interact with each other through reciprocal connections. This gives the networks temporal internal states, resulting in richer dynamics and potentially higher computational capabilities than FF networks. But it is way harder to train, or even to control and stabilize, the network.

Because of the coexistence of positive and negative feedback, recurrent neural networks can have very rich internal dynamics, where theories of complex dynamical systems come into consideration. (A good example of instability in biological neural networks is epilepsy.)

The difficulty of training a recurrent SNN limits its application to real-life problems. Currently RSNNs are mostly used in brain dynamics modeling, to investigate biological neuronal information processing.

5.2.3 Hybrid Networks

See 3.2 Reservoir Computing for an example.

6 Information Processing in SNN

6.1 Biological Perspective

The brain is a fascinating piece of information-processing machinery, but how is information actually encoded in it?

6.1.1 Time Resolution

First we need to determine the required temporal resolution of signaling in the neural network.

  • Rate Coding: Behind the 2nd. gen. networks

The rate coding theory, i.e. that information is encoded in the firing frequency of neurons, has been the dominant paradigm for both theoretical neural information processing and artificial neural networks for decades. Describing neural coding with firing rates smooths out the neuronal output by discarding the precise timing of each spike. This is what lets us use real-valued outputs on artificial neurons, respecting some temporal aspects of biological networks with non-temporal networks. But newer evidence shows that while firing rates are important for sensorimotor systems, rate coding cannot account for many aspects of higher-level information processing in biological networks.

  • Pulse Coding: Precise timing matters

many behavioral responses are completed too quickly for the underlying sensory processes to rely on the estimation of neural firing rates over extended time windows.

Recent neurophysiological results suggest that both information processing (pulse coding) and learning (spike-timing-dependent plasticity) in biological networks are heavily dependent on the precise timing of individual spikes rather than on their firing rate.
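As a toy contrast between the two views, here is a sketch that encodes the same scalar either as a firing rate over a window or as the latency of a single spike; the encoding choices (Bernoulli rate code, linear latency) are made up purely for illustration:

```python
# Rate coding vs. latency (temporal) coding of one scalar x in [0, 1].
import numpy as np

rng = np.random.default_rng(0)

def rate_code(x, window=100):
    """Bernoulli spike train whose mean firing rate is proportional to x."""
    return (rng.random(window) < x).astype(int)

def latency_code(x, window=100):
    """A single spike that arrives earlier for larger x."""
    train = np.zeros(window, dtype=int)
    train[int((1.0 - x) * (window - 1))] = 1
    return train

# The rate code needs the whole window to estimate x; the latency code
# reveals x as soon as its first (and only) spike arrives.
x = 0.8
print(rate_code(x).mean(), np.argmax(latency_code(x)))
```

The latency scheme hints at why precise-timing codes can be fast: a single early spike already carries the answer, without waiting for a rate estimate to converge.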

(pause)

6.1.2 Information Coding

6.2 Computational Perspective

6.2.1 Input Coding

6.2.2 Internal Coding

6.2.3 Output Coding

7 SNN and Learning

7.1 By conversion

7.2 Supervised

7.3 Unsupervised

7.4 Reinforcement

7.5 Other possibilities?

 

References

[1] 2003 – Article – Spiking Neural Networks an Introduction – Jilles Vreeken – A legacy introduction to SNN, more about spiking neurons and less about networks/training, etc.

[2] 2009 – Article – Third Generation Neural Networks: Spiking Neural Networks – Samanwoy Ghosh-Dastidar, et al. – Another introduction with a feedforward focus, not worth reading if already familiar with spiking neurons. Listed for the sake of reference.

[3] 2011 – Article – Introduction to Spiking Neural Networks, Information Processing, Learning and Applications – Filip Ponulak, et al. – A more extensive and recent introduction to this topic, recommended.

[4] 2014 – Book – Neuronal Dynamics: From Single Neurons to Networks and Models of Cognition – Wulfram Gerstner, et al. – Great textbook on the basics and modeling of neuronal dynamics, from detailed single-neuron modeling to network modeling with reasonably simplified neuron models.
