# AN ANALOG VLSI INTEGRATE-AND-FIRE NEURAL NETWORK FOR SOUND SEGMENTATION

## MARK A. GLOVER, ALISTER HAMILTON

Department of Electrical Engineering, University of Edinburgh, Kings Buildings, Mayfield Road, Edinburgh EH9 3JL, Scotland, E. U. (mag/alister@ee.ed.ac.uk)

## LESLIE S. SMITH

Department of Computing Science and Mathematics, University of Stirling, Stirling FK9 4LA, Scotland, E. U. (lss@cs.stir.ac.uk)

May 21, 1998

## Abstract

This paper presents a cascadable aVLSI integrate-and-fire neural network chip (SPIKE I) capable of biologically realistic time constants, and reports results from incorporating it into a real-time, software-based sound segmentation system. The sound segmentation system is based on an engineering abstraction of the functionality of the cochlea and auditory nerve. A comparison of results from the software simulation and from the software/hardware combination indicates that clustering does occur. Furthermore, the patterns of onsets and offsets generated are broadly similar. Analysis of the results indicates areas for improvement. These have been included in a second integrate-and-fire neural network chip (SPIKE II), presently being fabricated.

## 1 Introduction

This paper provides an overview of the sound segmentation system, the integrate-and-fire neural model used and the architecture of the neural network implemented. A comparison of software and hardware implementations using real-time data is performed, and suggests that clustering of onsets and offsets does occur within the network. Improvements to the system architecture, intended to produce a more general purpose integrate-and-fire neural network chip with increased flexibility, are discussed.

Sound segmentation software developed by one of the authors [2, 3] is based on a model of early mammalian audition. The model is an engineering approximation of some of the biological processes within the cochlea, auditory nerve and cochlear nucleus. Functions rather than the biological processes are modelled. The cochlea function is implemented by a multi-channel filter bank (the Gammatone filter bank [5]), followed by simple rectification (modelling the inner hair cells of the organ of Corti), onset enhancement (modelling the transfer characteristics of the auditory nerve), and compression (to allow the system to cope with a large dynamic range).
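As a rough illustration of the per-channel stages described above (rectification, onset enhancement, compression), the following Python sketch processes one already band-passed channel. The Gammatone filter bank itself is omitted, and the specific choices are assumptions rather than the method of [2, 3]: half-wave rectification, onset enhancement as a fast average minus a slow average floored at zero, and logarithmic compression. All function names and smoothing constants are hypothetical.

```python
import math

def rectify(samples):
    """Half-wave rectification: keep only the positive part of each sample."""
    return [max(0.0, s) for s in samples]

def exp_average(samples, alpha):
    """First-order low-pass filter (exponential moving average), 0 < alpha <= 1."""
    out, acc = [], 0.0
    for s in samples:
        acc += alpha * (s - acc)
        out.append(acc)
    return out

def onset_enhance(envelope, fast=0.2, slow=0.02):
    """Assumed onset detector: fast average minus slow average, floored at zero.

    The difference is large just after the signal level rises and decays
    towards zero once the slow average has caught up.
    """
    fast_avg = exp_average(envelope, fast)
    slow_avg = exp_average(envelope, slow)
    return [max(0.0, f - s) for f, s in zip(fast_avg, slow_avg)]

def compress(signal):
    """Assumed compression: log(1 + x), which reduces the dynamic range."""
    return [math.log1p(x) for x in signal]

def channel_onset_signal(band_samples):
    """Rectify, onset-enhance and compress one band-passed channel."""
    return compress(onset_enhance(rectify(band_samples)))

# Usage: silence followed by a tone burst in one (hypothetical) channel.
burst = [0.0] * 200 + [math.sin(2 * math.pi * 0.05 * n) for n in range(400)]
onset = channel_onset_signal(burst)
```

The output stays at zero during the silence and responds most strongly shortly after the burst begins, which is the behaviour the integrate-and-fire stage relies on.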

This preprocessing produces an onset signal in each channel (the channel onset signal), which is input to an integrate-and-fire neuron. Neuron outputs are connected as fixed excitatory inputs to ten adjacent neurons (five on either side), and the resultant network performs clustering of sound onsets across time and channels. For the purposes of this paper a "cluster" is defined as those neurons in the network which fire within a "small" time interval of each other.
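The definition of a cluster given above can be made concrete with a short sketch. The code below groups spikes, given as `(time_ms, channel)` pairs, whenever consecutive spikes are separated by no more than a window. The 5 ms window, the chaining rule and the minimum cluster size are illustrative assumptions; the paper does not quantify "small".

```python
def find_clusters(spikes, window_ms=5.0, min_size=2):
    """Group (time_ms, channel) spikes into clusters.

    A spike joins the current cluster if it occurs within window_ms of the
    previous spike (an assumed chaining rule). Groups smaller than min_size
    are treated as isolated spikes and discarded.
    """
    clusters, current = [], []
    for t, ch in sorted(spikes):
        if current and t - current[-1][0] > window_ms:
            if len(current) >= min_size:
                clusters.append(current)
            current = []
        current.append((t, ch))
    if len(current) >= min_size:
        clusters.append(current)
    return clusters

# Usage with made-up spike times: two clusters and one isolated spike.
spikes = [(100.0, 3), (101.5, 4), (102.0, 5), (250.0, 10), (400.0, 7), (403.0, 8)]
clusters = find_clusters(spikes)
for c in clusters:
    print("cluster from %.1f ms, channels %s" % (c[0][0], [ch for _, ch in c]))
```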

The integrate-and-fire neural network is being implemented in aVLSI because it can exploit the parallelism of the sound segmentation process itself, i.e. the 32 bandpass-filtered channels and their post-processing. Implementation in aVLSI will enable a space-efficient design, as storage and multiplication are not required.

## 2 Integrate-And-Fire Neuron Model

Recent studies have indicated that the integrate-and-fire neuron is computationally more powerful than previous neuron models, as it allows the use of time as a resource [6][7].

Between spikes an integrate-and-fire neuron's activity (see Figure 1) is governed by the following equation, where  $V_{C_{\text{int}}}$  is the activation voltage and I(t) is the post synaptic input:

$$\frac{dV_{C_{\rm int}}(t)}{dt} = -\frac{V_{C_{\rm int}}(t)}{RC_{\rm int}} + \frac{I(t)}{C_{\rm int}} \tag{1}$$

The input I(t) has two components, the primary component being the channel onset signal and the secondary component being caused by adjacent neuron activity. The leaky integrator integrates the resultant current until the threshold, $\theta$, of the comparator is reached, whereupon the neuron fires. The output of the comparator goes high, causing $V_{C_{\text{int}}}$ to be zeroed and held low for the refractory period. With $V_{C_{\text{int}}}$ zeroed, the comparator output goes low, completing the neuron spike.
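The behaviour described by equation (1), together with the threshold and refractory rules, can be sketched as a forward-Euler simulation. This is a minimal software illustration, not the circuit: the component values `R`, `C` and the constant input `I0` are hypothetical, chosen so that RC = 20 ms; the 3 V threshold and 15 ms refractory period are taken as representative of the SPIKE I figures quoted elsewhere in the paper.

```python
import math

def simulate_lif(input_current, dt, r, c, threshold, t_refractory):
    """Forward-Euler integration of dV/dt = -V/(RC) + I(t)/C with reset.

    When V reaches the threshold the neuron fires, V is zeroed, and V is
    held at zero for the refractory period. Returns (spike_times, trace).
    """
    refractory_steps = round(t_refractory / dt)
    v, hold = 0.0, 0
    spike_times, trace = [], []
    for step, i_in in enumerate(input_current):
        if hold > 0:
            hold -= 1              # refractory: capacitor held at zero
            v = 0.0
        else:
            v += dt * (-v / (r * c) + i_in / c)
            if v >= threshold:     # comparator trips: spike and reset
                spike_times.append((step + 1) * dt)
                v = 0.0
                hold = refractory_steps
        trace.append(v)
    return spike_times, trace

# Hypothetical component values giving RC = 20 ms.
R, C = 20e6, 1e-9                  # ohms, farads
THRESHOLD, T_REF = 3.0, 15e-3      # volts, seconds
DT, I0 = 1e-5, 0.3e-6              # seconds, amperes (constant input)

spike_times, trace = simulate_lif([I0] * 20000, DT, R, C, THRESHOLD, T_REF)

# Closed-form charging time for constant input, for comparison.
t_theta = -R * C * math.log(1.0 - THRESHOLD / (I0 * R))
print("first spike: %.2f ms (analytic %.2f ms)" % (spike_times[0] * 1e3, t_theta * 1e3))
```

With a constant input the neuron fires periodically, each period being the charging time plus the refractory period.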

## 3 Integrate-And-Fire Neural Network

It was decided to implement the integrate-and-fire neural network in aVLSI, leading to the development of SPIKE I. A block diagram of the implemented neuron is shown in Figure 2; a more detailed description of the functional blocks is provided in [1]. A 4-bit digital number representing the preprocessed output of earlier sound segmentation stages is loaded onto the neuron via the data bus. To the network this incoming signal appears to be time varying, as the load cycle takes nanoseconds whereas the network operates on a timescale of milliseconds.

For clustering to occur across frequency channels, inter-neuron communication is required. On SPIKE I, each neuron output is connected to the adjacent 5 higher and 5 lower frequency channels. Each output voltage pulse causes a pulse of current to be applied to the integrating capacitor of each adjacent neuron, equivalent to 0.3 V, or one tenth of the upper threshold value.
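The effect of this interconnect can be illustrated with a small software sketch: 32 leaky integrators, each of which adds 0.3 V to its neighbours within 5 channels when it fires. The 3 V threshold follows from 0.3 V being one tenth of the threshold. The per-channel drive values, the 20 ms time constant, the 15 ms refractory period and the update order are assumptions made for the illustration, and the function names are hypothetical.

```python
def neighbours(i, n, radius=5):
    """Indices of the channels within `radius` of channel i, excluding i."""
    return [j for j in range(max(0, i - radius), min(n, i + radius + 1)) if j != i]

def first_spike_times(drives, kick, dt=1e-5, tau=20e-3, threshold=3.0,
                      t_refractory=15e-3, t_end=0.1):
    """Time of each neuron's first spike in a locally coupled network.

    drives[i] is a constant input in volts per second (I/C for channel i).
    Each spike adds `kick` volts to every non-refractory neighbour.
    """
    n = len(drives)
    v, hold, first = [0.0] * n, [0] * n, [None] * n
    refractory_steps = round(t_refractory / dt)
    for step in range(round(t_end / dt)):
        fired = []
        for i in range(n):
            if hold[i] > 0:            # refractory: held at zero
                hold[i] -= 1
                v[i] = 0.0
                continue
            v[i] += dt * (-v[i] / tau + drives[i])
            if v[i] >= threshold:
                fired.append(i)
        for i in fired:                # reset every neuron that fired
            v[i] = 0.0
            hold[i] = refractory_steps
            if first[i] is None:
                first[i] = (step + 1) * dt
        for i in fired:                # then deliver the excitatory pulses
            for j in neighbours(i, n):
                if hold[j] == 0:
                    v[j] += kick
    return first

def spread(times):
    """Interval between the earliest and latest first spike."""
    return max(times) - min(times)

# Hypothetical drives: higher channels charge slightly faster.
drives = [200.0 + 2.0 * i for i in range(32)]
uncoupled = first_spike_times(drives, kick=0.0)
coupled = first_spike_times(drives, kick=0.3)
print("spread without interconnect: %.2f ms" % (spread(uncoupled) * 1e3))
print("spread with interconnect:    %.2f ms" % (spread(coupled) * 1e3))
```

Because the pulses are purely excitatory, they can only advance a neighbour's firing time, so the first spikes across channels are pulled closer together in time, which is the clustering behaviour described in the text.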

Each chip contains 8 neurons with local weighted interconnect and is cascadable. Inter-neuron and inter-chip communication uses the robust pulse outputs of the individual neurons, which are relatively immune to noise, simple to implement and digital in nature.

Throughout the design process the ability to test the chip (SPIKE I) was considered and maximised, although compromises had to be made because of cost and pin limits. The basic integrate-and-fire neuron design [1] used in the SPIKE I chip proved that the basic principles of the design worked, and achieved time constants of $RC \le 20\,\mathrm{ms}$ and a refractory period $\le 100\,\mathrm{ms}$.
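For intuition about how these two time constants shape the firing rate, equation (1) can be solved for the idealised case of a constant input $I_0$ starting from $V_{C_{\rm int}}(0) = 0$. The symbols $I_0$, $t_\theta$ (charging time to threshold), $t_{\rm ref}$ (refractory period) and $f$ (firing rate) are introduced here for illustration only:

$$V_{C_{\rm int}}(t) = I_0 R \left(1 - e^{-t/RC_{\rm int}}\right)$$

$$t_\theta = -RC_{\rm int} \ln\left(1 - \frac{\theta}{I_0 R}\right), \qquad I_0 R > \theta$$

$$f = \frac{1}{t_{\rm ref} + t_\theta} < \frac{1}{t_{\rm ref}}$$

Under this idealisation the neuron never fires if $I_0 R \le \theta$, and the refractory period sets an upper bound on the firing rate however large the input.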

## 4 Results

Results gained from a four-chip network of 32 neurons with real-time data applied indicate that the network successfully clusters sound onsets (see Figure 3). The figure compares the output of the software integrate-and-fire neural network, without and with interconnect, with the output of a four-chip SPIKE I network fed with identical real-time data. The top three traces show the complete run; on initial inspection the results look very good. The bottom three traces show a small section of the overall run expanded, highlighting discrepancies between simulation and hardware. A cluster in Figure 3 appears as a "vertical" line consisting of a number of spikes.

Experiments have shown that there is performance variation both across a chip and between chips. To process spatio-temporally encoded data this variation should be minimised.

The performance of individual neurons is well matched for charging time and refractory period (see Figure 4 and Figure 5), although there is some variation [1]. A fixed input pattern fed to three adjacent neurons and scanned across the system highlighted variations in firing pattern. The problem has been identified as coming primarily from variation in the weighted interconnect.

Comparison of software and hardware data has shown that the input to each hardware neuron has to be scaled by a factor of between 0.25 and 0.5 to produce a good match [4]. There are three possible explanations for this. First, charge injection from switches dumps charge into $C_{\rm int}$, increasing the voltage level. Secondly, imperfect current sources may result in larger currents than calculated being applied to $C_{\rm int}$. Finally, the switching transients of pulsed current sources may also result in excess charge being injected on to $C_{\rm int}$.

## 5 Discussion

The SPIKE I chip was designed as an ASIC, specifically to interface with the sound segmentation software with the minimum of support circuitry. However, this limits its flexibility and usefulness in a wider sense. A more general purpose integrate-and-fire neural network would offer greater insights into system behaviour. Drawing on the results gained from SPIKE I, the following areas for improvement were identified:

- The interconnect in SPIKE I implements a fixed excitatory weight, which has the effect of advancing the potential firing times of neighbouring neurons when triggered. Recent work has shown the importance of variable interconnections between neurons [8]. A programmable 4-bit inhibitory/excitatory interconnect is suggested, so that the relationship between clustering and interconnect can be explored, allowing both advancing and retarding of firing times.
- The interconnection between neurons is fixed in strength and radius at +/-5 neurons. A wider and more flexible interconnection neighbourhood would improve the flexibility of the system.
- The comparator with hysteresis used in the original design is a circuit which relies on the size relationship between its transistors, so any variation will adversely affect performance. A more appropriately designed comparator should be used within the design to enhance performance.
- Stray capacitances within the design and variation in capacitance adversely affect performance; care must be taken to minimise these effects.
- Partitioning of analogue and digital power supplies would act to limit any interference.
- With any clocked or switched system charge injection noise can be a problem. If this is injected on to a capacitor it could result in a premature fire of a neuron or ending of a refractory period. Care must be taken to limit its effects.

The above have been taken into account when designing SPIKE II, which is presently being fabricated. The design is cascadable, and each chip consists of 4 neurons with 32 programmable synapses per neuron. The synapse design is based on dynamic current mirror techniques [9][10], which pulse current on to capacitors. As with SPIKE I, all inter-chip and inter-neuron communication will use robust pulsed voltage signals. For a comparison of SPIKE I and SPIKE II see Table 1. SPIKE II has 4 neurons per chip, compared with 8 on SPIKE I. However, SPIKE II has roughly 3 times the number of synapses per neuron, and each synapse's weight is programmable, allowing investigation into the effects of interconnect strength and giving greater system flexibility. The design of SPIKE II also takes into account the flaws present in SPIKE I.
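To make the programmable weight concrete, the sketch below converts a weight code into the voltage step it would apply to a neighbouring neuron, using the figures in Table 1: up to 15 steps in either direction, each 1% of the comparator threshold. The sign-magnitude bit layout and the 3 V threshold (the SPIKE I value implied by 0.3 V being one tenth of the threshold) are assumptions; the paper does not specify the SPIKE II encoding.

```python
def weight_to_voltage_step(code, threshold_v=3.0):
    """Decode a 5-bit weight into a voltage step on the target neuron.

    Assumed layout (hypothetical): bit 4 is the sign (1 = inhibitory),
    bits 3..0 are the magnitude in steps of 1% of the comparator threshold.
    """
    if not 0 <= code <= 0b11111:
        raise ValueError("weight code must fit in 5 bits")
    magnitude = code & 0b01111
    sign = -1 if code & 0b10000 else 1
    return sign * magnitude * 0.01 * threshold_v

# Usage: the strongest excitatory and inhibitory weights, and SPIKE I's
# fixed 10% excitatory step expressed in the same scheme.
strongest_excitatory = weight_to_voltage_step(0b01111)
strongest_inhibitory = weight_to_voltage_step(0b11111)
spike_one_equivalent = weight_to_voltage_step(0b01010)
```

Under these assumptions a positive step advances the neighbour's firing time and a negative step retards it, which is the extra freedom the programmable interconnect is intended to provide.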

## 6 Conclusions

A cascadable integrate-and-fire neural chip has been successfully implemented using aVLSI techniques. Comparison of simulation and hardware results has shown that it successfully clusters sound onsets and offsets. Areas for improvement have been identified, and these improvements are included in a second chip, SPIKE II, results from which should be available at the time of the conference.

## 7 Acknowledgements

The authors acknowledge the assistance of Adrian O'Lenskie and Frank Kelly in the design and construction of the circuitry for testing the analogue hardware.

Mark Glover is supported by the UK EPSRC.

## References

- [1] Glover, M., Hamilton, A., Smith, L. S.: Analogue VLSI Integrate and Fire Neural Network for Clustering Onset and Offset Signals in a Sound Segmentation System. In: Neuromorphic Systems: Engineering Silicon from Neurobiology, eds L. S. Smith and A. Hamilton. World Scientific (in press).
- [2] Smith, L. S.: Onset-based sound segmentation. Advances in Neural Information Processing Systems, eds Touretzky D.S., Mozer M.E., Hasselmo M.E. (1996) 729-735. MIT Press.
- [3] Smith, L. S. (inventor): Onset/offset coding for interpretation and segmentation of sound. Patent Application No GB2299247, filed by University of Stirling, 23 March 1995, published 25 September 1996.
- [4] Smith, L. S., Glover, M. A., Hamilton, A.: A Comparison of a Hardware and a Software Integrate and Fire Neural Network for Clustering Onsets in Cochlear Filtered Sound. Submitted to Workshop on Neural Networks for Signal Processing, Aug 31-Sept 3 1998.
- [5] Patterson, R. D. & Allerhand, M. H.: Time-domain modelling of peripheral auditory processing: A modular architecture and software platform. J. Acoust. Soc. Am. 98 (4), October 1995.
- [6] Maass, W.: Networks of spiking neurons: The third generation of neural network models. Neural Networks, 10(9): 1659-1671, 1997.
- [7] Maass, W.: Lower bounds for computational power of networks of spiking neurons. Neural Computation, 8(1): 1-40, 1996.
- [8] Nischwitz A., Glunder H.: Local lateral inhibition: a key to spike synchronization. Biol. Cybern. 73, 389-400 (1995).
- [9] Mayes, D.: Implementing radial basis function neural networks in pulsed aVLSI. Ph.D Thesis, The University of Edinburgh, 1997.
- [10] Vittoz A. & Wegmann G.: Chapter 7, Dynamic Current Mirrors. In: Analogue IC design: the current mode approach, edited by C. Toumazou, F. J. Lidgey & D. G. Haigh. Peter Peregrinus Ltd, ISBN 0 86341 215 7.

Table 1: Comparison of SPIKE I and SPIKE II chips.

|                                  | SPIKE I                            | SPIKE II                                   |
|----------------------------------|------------------------------------|--------------------------------------------|
| Number of neurons                | 8                                  | 4                                          |
| Synapses per neuron              | 10                                 | 32                                         |
| Total synapses per chip          | 80                                 | 128                                        |
| Local interconnect               | +/- 5 neurons                      | 32 neurons                                 |
| Interconnect weights             | Fixed                              | 5-bit programmable                         |
| Inter-neuron voltage steps       | Fixed, 10% of comparator threshold | +/- 15 steps, 1% of comparator threshold   |
| Refractory period (milliseconds) | 15 - 95                            | Approx. 10 - 100                           |
| Cascadable                       | Yes                                | Yes                                        |
| Voltage rail                     | 5V DC rail                         | 5V DC rail, 5V AC rail                     |
| Number of chips for algorithm    | 4                                  | 8                                          |
| Current drawn                    | 32 mA                              | -                                          |
| Power consumed                   | 160 mW                             | -                                          |
| Technology                       | Mietec 2.4 µm CMOS                 | Mietec 2.4 µm CMOS                         |



Figure 1: Integrate-And-Fire Neuron. There is one neuron per channel and its input is formed from the channel onset signal I(t) combined with pulsed current signals from neurons firing in adjacent channels.



Figure 2: SPIKE I Integrate-And-Fire Neuron: The preprocessed input (a 4-bit value) is converted to a current by a 4-bit DAC (IDAC) which drives the VLDCO. The VLDCO produces pulses of current which are integrated by the leaky integrator.  $F_{\rm CR}$  controls the CR time constant by varying the value of R with frequency for all neurons on a chip. When the integrator reaches a predetermined threshold, its output goes high. The refractory timer then zeros the integrated voltage  $V_{C_{\rm int}}$ , causing the integrator output to go low completing the spike. Until the refractory timer times out, the integrator is inhibited. Frequency  $F_{\rm ref}$  controls the refractory period for all neurons on a chip.



Figure 3: Spikes generated for TIMIT utterance dr1/fsjk0/sa1. X axis is time (milliseconds), Y axis is channel, with low frequency (60 Hz) at bottom and high frequency (6000 Hz) at top. There are 29 channels. (a) shows spikes generated by simulated neurons, one per channel, with no interconnection. (b, c) show spikes generated when each neuron excites its 10 (+/-5) adjacent neurons: (b) is the simulation, and (c) is the hardware neuron. To illustrate the spatio-temporal clustering which occurs, the section from 1775 to 2200 ms has been enlarged. (d) is an enlargement of (a), (e) of (b) and (f) of (c).



Figure 4: Very Low Duty Cycle Oscillator (VLDCO) (see Figure 2) output and performance for varying input level: comparison of inter-pulse intervals for neurons 6 and 7.



Figure 5: Refractory periods for neuron 7 on chips A, B, C and D: $F_{\rm ref}$ period adjust is used to alter the refractory period of the integrate-and-fire neurons on a chip.