An orientation selective multi-chip aVLSI system for parallel image processing

Kazuhiro Shimonomura & Tetsuya Yagi
Osaka University
2-1 Yamadaoka, Suita, Osaka, 565-0871 Japan
e-mail: {kazu,yagi}@ele.eng.osaka-u.ac.jp

ABSTRACT
We describe a multi-chip aVLSI system that emulates the orientation-selective response of the simple cell in the primary visual cortex. The system consists of two analog chips, a silicon retina and an orientation selection chip, which together mimic the parallel and hierarchical architecture of the visual system in the brain. The image is first filtered by the Laplacian-Gaussian-like receptive field of the silicon retina and is then transferred to the orientation selection chip. Communication between the two chips uses analog signals to represent pixel values. The orientation selection chip selectively aggregates multiple pixels of the silicon retina, mimicking the feedforward model proposed by Hubel and Wiesel. The system exhibits orientation selectivity with even-type and odd-type responses. The spatial properties of the analog responses from the orientation selection chip were verified for different receptive field sizes under indoor illumination. Output images of multiple orientations and types can be obtained within a single frame period. The multi-chip aVLSI architecture used in the present study is considered applicable to the implementation of higher-order cells in the primary visual cortex, such as the complex cell.

INTRODUCTION
The brain processes images with an algorithm and architecture quite different from those of conventional engineering systems. An image is sensed and processed by the retina and is then sent to the brain. There, the computations are carried out with a parallel architecture and analog signal representation in hierarchically arranged neuronal networks. Using this algorithm and architecture, the visual system of the brain can perceive the external scene in real time with extremely low power dissipation, although the response speed of a single neuron is much slower than that of semiconductor devices. In contrast, a conventional vision system, which consists of a von Neumann type digital computer and an image pickup device such as a CCD (charge-coupled device) camera, operates with a sequential processing algorithm and digital signal representation. Because of this mismatch, the conventional vision system faces an explosive increase in energy consumption.

In this study, we fabricated a multi-chip aVLSI system to emulate the orientation-selective response of the simple cell in the primary visual cortex, with the aim of developing a new image processing device that can be used for real-time image computation in engineering applications. The multi-chip system consists of a silicon retina and an orientation selection chip, mimicking the hierarchical architecture of the visual system in the brain.

Several types of neuromorphic aVLSI circuits have previously been fabricated to emulate the orientation-selective receptive field in a multi-chip configuration [2, 3]. Those chips use AER (Address Event Representation) for inter-chip communication, in which the spike output of each pixel on the sending array is encoded with a unique address [4, 10]. Here we used a simpler technique, analog signal transfer, in view of its utility for practical engineering applications. This method also provides more intuitive insight into the computational essence of early vision problems, although it does not reflect the spike representation used for communication between the retina and the primary visual cortex.

A MODEL
Convergence of the responses of neurons with center-surround receptive fields can produce an elongated, orientation-selective receptive field [5]; this is known as the feedforward model. Recent physiological studies suggest that the mechanism generating orientation selectivity is more complex [6]. Although the feedforward model does not fully explain all properties of the simple cell, we used it as the basic structure for generating an orientation-selective receptive field.

Figure 1: Electronic circuit model to obtain an orientation selective receptive field based on the feedforward model. $V_3$ is the Laplacian-Gaussian-like filtered output. $V_{v_p}$ is the orientation selective output.

The electronic circuit model for obtaining orientation selectivity with a feedforward connection is shown in Fig. 1. The upper half of the model, consisting of a voltage source, resistive networks and a subtraction block, generates a Laplacian-Gaussian-like center-surround receptive field [9]. The voltage $V_p$ depends on the light intensity applied to the photosensor. The input image is smoothed by two layers of resistive networks that differ in the strength of electrical coupling between neighboring pixels. A Laplacian-Gaussian-like receptive field is then obtained by subtracting $V_2$ from $V_1$. The distribution profiles of $V_1$, $V_2$ and $V_3$ can be interpreted in terms of early vision by standard regularization theory [7, 9]. $V_3$ of each pixel is sent to a corresponding voltage follower. The output nodes of the voltage followers located along a straight line are shorted together, and the orientation of that line corresponds to the preferred orientation $\theta_p$ of the circuit. This forms a follower-aggregator, which computes the average output of the connected voltage followers.
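
As a rough numerical illustration of the model in Fig. 1, the following Python sketch solves two resistive layers for their steady-state node voltages, subtracts them to obtain $V_3$, and averages $V_3$ along a line as the follower-aggregator does. Several things here are assumptions made for illustration and are not taken from the circuit: a square 4-neighbor grid instead of the hexagonal array, both layers driven directly by $V_p$, the dimensionless coupling values, and the function names `resistive_layer`, `center_surround` and `follower_aggregate`.

```python
import numpy as np

def resistive_layer(v_p, coupling):
    """Steady-state node voltages of a resistive grid driven by v_p.

    Each node i satisfies (V_i - v_p_i) + coupling * sum_j (V_i - V_j) = 0,
    where j runs over the 4 nearest neighbors (square grid: an assumption).
    """
    h, w = v_p.shape
    a = np.eye(h * w)
    for y in range(h):
        for x in range(w):
            i = y * w + x
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                yy, xx = y + dy, x + dx
                if 0 <= yy < h and 0 <= xx < w:
                    a[i, i] += coupling
                    a[i, yy * w + xx] -= coupling
    return np.linalg.solve(a, v_p.ravel()).reshape(h, w)

def center_surround(v_p, coupling_narrow=0.5, coupling_wide=4.0):
    """V3 = V1 - V2; the coupling values are illustrative, not measured."""
    v1 = resistive_layer(v_p, coupling_narrow)  # weak coupling, narrow spread
    v2 = resistive_layer(v_p, coupling_wide)    # strong coupling, wide spread
    return v1 - v2

def follower_aggregate(v3, row, cols):
    """Average of the shorted follower outputs along one row (0 deg line)."""
    return float(np.mean(v3[row, cols]))

# Demo: a single bright spot on a 21 x 21 array, aggregated over 8 pixels.
spot = np.zeros((21, 21))
spot[10, 10] = 1.0
v3 = center_surround(spot)
print(v3[10, 10], follower_aggregate(v3, row=10, cols=slice(6, 14)))
```

In this sketch each layer only redistributes the input between nodes, so both smoothed images have the same sum and $V_3$ sums to zero over the array: a positive center is balanced by a negative surround.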

SYSTEM ARCHITECTURE

Some aVLSI vision chips have previously been fabricated to implement orientation selectivity monolithically [13, 14]. In a monolithic implementation, it is generally difficult to achieve both high spatial resolution and computational complexity. To overcome this difficulty, we adopted a multi-chip implementation. Various neuromorphic multi-chip systems have also been fabricated for visual processing [2, 3, 10, 11, 12]. These systems use the address-event representation (AER) protocol for inter-chip communication. In the multi-chip system of the present work, the analog output of each pixel on the sender chip is read out in sequence and transferred to the corresponding pixel on the receiver chip.

Figure 2: The schematic diagram of the multi-chip system consisting of the silicon retina and the orientation selection chip.

Figure 3: The circuit structure for aggregating multiple pixels located on a straight line. All pixels located on an orientation wire are selected with a NAND gate. The shift register for size selection then determines the number of lines connected to the output node $V_{out}$.

BICS 2004 Aug 29- Sept 1 2004

BIS2.4 2 of 7
Figure 4: The block diagram of the system to obtain the even and odd-type orientation selective output. \((\Delta x, \Delta y) = (0, 2)\) for the \(0^\circ\) output, and \((2, 0)\) for the \(60^\circ\) and \(120^\circ\) orientation selective outputs.

Fig. 2 shows the block diagram of the multi-chip system that implements the model shown in Fig. 1. The system consists of a silicon retina and an orientation selection chip. The silicon retina was designed to emulate the function of the outer retinal circuit [8]. On this chip, the pixel circuits, each including a photodiode (PD), are arranged on a hexagonal grid. Neighboring pixels are connected by two layers of resistive networks. The spatial response of this silicon retina exhibits a Laplacian-Gaussian-like receptive field. The orientation selection chip consists of a pixel circuit array and five shift registers. Each pixel circuit includes an analog memory to hold the analog voltage input from the silicon retina. The image represented by analog voltages in the silicon retina is transferred to the orientation selection chip pixel by pixel using the horizontal and vertical shift registers. After the image is transferred, multiple-pixel aggregation for obtaining orientation selectivity is achieved using the wiring shown as gray lines in Fig. 2. The output image of the orientation selection chip is then read out using three shift registers.

Fig. 3 shows the pixels connected by an oriented wire on the orientation selection chip. Each pixel circuit includes an analog memory consisting of a hold capacitor and a transconductance amplifier, logical NAND gates and analog switches. \(v_{in}\) is the analog voltage transferred from the corresponding pixel of the silicon retina. When a pixel is selected by the signals from the output shift registers, the analog switch turns on through the logical operation of the NAND gates, and the voltage held by the capacitor is read out. The multiple pixels defined by the vertical shift register for size selection are shorted and generate the orientation selective output \(v_{out}\) by follower aggregation.

Figure 5: Response to spot light. A is the pattern presented to the silicon retina. B is the output of the orientation selection chip without pixel aggregation. C is the orientation selective output with 8-pixel aggregation.

Figure 6: Cross section of the raw output from the orientation selection chip responding to a horizontal slit pattern. A and B are the even-type and odd-type responses, respectively. These were obtained from the 10th vertical column of the chip.

Figure 7: Orientation tuning curves obtained from the even-type output. The preferred orientation of the system was 0°, 60° and 120° in A, B and C, respectively. N, the number of aggregated pixels, was 4 (dot-dashed line), 8 (dashed line) and 16 (solid line). The accumulation time of the silicon retina was 33 ms.

The multiple-pixel aggregation for orientation selectivity is executed by selecting multiple pixels located along the orientation wire. The orientation wire shown in Fig. 2 is for the \(120^\circ\) selective output. Although other orientation selective outputs could be obtained by preparing a common NAND gate for each orientation, only the \(120^\circ\) connection was embedded in this prototype chip. In this chip, therefore, the \(0^\circ\) selective output is obtained by selecting the pixels to be aggregated using the output horizontal shift register. For the \(60^\circ\) selective output, the output image of the silicon retina is transferred upside down. The response produced by the single orientation selection chip is even-type. Odd-type receptive fields can be obtained by off-chip subtraction between neighboring even-type pixels that are tuned to the same preferred orientation, as shown in Fig. 4.
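
The two techniques described above, reusing the single embedded wire by transferring the image upside down and forming odd-type outputs by subtraction, can be sketched in Python as follows. This is an illustration under stated assumptions, not the chip's circuit: a square grid replaces the hexagonal array, so the fixed wire is a 45° diagonal standing in for the \(120^\circ\) wire; the output is flipped back so that it can be compared with the input; and the sign convention and the reading of \((\Delta x, \Delta y)\) as pixel offsets along columns and rows are assumptions. All function names are hypothetical.

```python
import numpy as np

def aggregate_diagonal(memory, n):
    """Average up to n pixels along the fixed diagonal wire through each pixel."""
    h, w = memory.shape
    out = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            vals = [memory[y + k, x + k]
                    for k in range(-(n // 2), n - n // 2)
                    if 0 <= y + k < h and 0 <= x + k < w]
            out[y, x] = np.mean(vals)  # follower aggregation = average
    return out

def even_response(retina_image, n, upside_down=False):
    """Even-type output; the flipped transfer mirrors the preferred orientation."""
    memory = retina_image[::-1, :] if upside_down else retina_image
    out = aggregate_diagonal(memory, n)
    # Flip back for comparison with the input (a choice made for this sketch).
    return out[::-1, :] if upside_down else out

def odd_response(even, dx, dy):
    """Off-chip difference between even-type pixels separated by (dx, dy)."""
    h, w = even.shape
    odd = np.zeros_like(even)
    odd[: h - dy, : w - dx] = even[: h - dy, : w - dx] - even[dy:, dx:]
    return odd

# Demo: a line along the wire and its mirror image.
line = np.eye(9)
mirrored = line[::-1, :].copy()
print(even_response(line, 4)[4, 4], even_response(mirrored, 4)[4, 4])
print(even_response(mirrored, 4, upside_down=True)[4, 4])
print(odd_response(even_response(line, 4), dx=2, dy=0)[4, 4])
```

With the normal transfer the wire responds most strongly to the line that runs along it; with the upside-down transfer the same wire responds most strongly to the mirrored line, which is how one embedded connection can serve two preferred orientations.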

The orientation selection chip was fabricated in a standard 0.35 μm CMOS process. The pixel size is 60.95 μm (V) × 58.85 μm (H), yielding a 21 × 21 pixel array on a 2.4 × 2.4 mm² chip. The average power consumption is 13.6 mW at a 3.3 V power supply.

RESULTS

A white light spot on a black background, shown in Fig. 5A, was presented to the silicon retina. The output of the orientation selection chip is shown as grayscale images in Fig. 5B and C. B is the output without pixel aggregation, which is about the same as the response of the silicon retina. An inhibitory response occurs in the surrounding region because of the center-surround receptive field of the silicon retina. The upper and lower rows in C show the even-type and odd-type outputs with 8-pixel aggregation, respectively. The orientation selectivity of the orientation selection chip was set to 0°, 60° and 120° preference. The excited and inhibited regions appear side by side alternately, similar to the elongated receptive fields of simple cells in the primary visual cortex. Fig. 6 shows cross sections of the raw output in response to a horizontal slit pattern. These outputs were obtained under indoor illumination. The accumulation time of the silicon retina was 33 ms.

A white slit pattern on a black background, rotated in steps of 15°, was presented to the silicon retina. Fig. 7 shows the response of the orientation selection chip at the pixel at the center of the slit. The preferred orientation is 0°, 60° and 120° in A, B and C, respectively. In each case, the response becomes larger as the orientation of the slit approaches the preferred orientation of the pixels. The number of aggregated pixels, N, was changed to 4, 8 and 16 as in Fig. As N decreases, the orientation tuning curve becomes shallower because the aspect ratio of the receptive field becomes smaller.
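
The dependence of tuning sharpness on N can be reproduced with a deliberately simplified geometric sketch. It assumes an ideal binary slit of fixed width passing through the center of a horizontal row of N aggregated pixels, and it ignores the center-surround filtering of the silicon retina and the hexagonal sampling; `slit_response` and `slit_width` are hypothetical names. The response is simply the fraction of the aggregated pixels that the slit covers.

```python
import numpy as np

def slit_response(theta_deg, n, slit_width=2.0):
    """Fraction of n aggregated pixels covered by a slit at angle theta_deg.

    The pixels lie on a horizontal line (preferred orientation 0 deg) and the
    slit passes through the center of that line. Widths are in pixel units.
    """
    k = np.arange(n) - (n - 1) / 2.0                    # positions along the row
    dist = np.abs(k * np.sin(np.deg2rad(theta_deg)))    # distance to slit axis
    return float(np.mean(dist <= slit_width / 2.0))

# Tuning curves sampled every 15 degrees, as in the measurement.
for n in (4, 8, 16):
    curve = [round(slit_response(a, n), 3) for a in range(0, 181, 15)]
    print(n, curve)
```

For a longer row, a small rotation already moves the slit off the outer pixels, so the response falls off faster with angle. This corresponds to the sharper tuning observed for larger N.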

Figure 8: Spatial frequency response characteristics of the orientation selection chip. The bias voltage $V_{brs2}$, which determines the receptive field size, was set to 0.24 V, 0.48 V, 0.82 V and 1.4 V for the curves marked with squares, circles, triangles and inverted triangles, respectively.

Figure 9: Response to a hand with various orientations in each column. A is the output of the orientation selection chip without pixel aggregation. B is the orientation selective output with 8-pixel aggregation.

We investigated the spatial frequency properties of the orientation selection chip. Grayscale grating patterns with varying spatial frequencies were presented to the silicon retina. The orientation of the grating pattern was the same as the preferred orientation of the pixels. By varying the bias voltage $V_{bias}$, which controls the resistance $R$ in Fig. 1, experiments were carried out for four different receptive field sizes. The measured spatial frequency properties are shown in Fig. 8. The preferred orientation of the system was $0^\circ$. The response shown on the vertical axis is the maximum amplitude in response to the drifting grating pattern. Spatial frequency was calculated from the photosensor spacing of the silicon retina. The peak of the response shifts to lower frequencies as the receptive field broadens.
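
Why the peak moves toward lower frequencies can be illustrated with a difference-of-Gaussians stand-in for the Laplacian-Gaussian-like receptive field. This is an assumption: the kernels produced by resistive networks are not exactly Gaussian, and the mapping from the bias voltage to the kernel width is not modelled here. With a center width `sigma_c` (in pixels) and a surround width `ratio * sigma_c`, the gain at spatial frequency f (cycles per pixel) and the location of its maximum have closed forms, which the sketch checks numerically. The names `dog_gain`, `peak_frequency`, `sigma_c` and `ratio` are hypothetical.

```python
import numpy as np

def dog_gain(f, sigma_c, ratio=2.0):
    """Frequency response of a difference of two unit-area Gaussians."""
    sigma_s = ratio * sigma_c
    return (np.exp(-2.0 * (np.pi * sigma_c * f) ** 2)
            - np.exp(-2.0 * (np.pi * sigma_s * f) ** 2))

def peak_frequency(sigma_c, ratio=2.0):
    """Frequency that maximizes dog_gain, from setting its derivative to zero."""
    return np.sqrt(np.log(ratio) / (ratio ** 2 - 1.0)) / (np.pi * sigma_c)

# Compare the closed form with a brute-force search for three field sizes.
freqs = np.linspace(0.0, 0.5, 5001)
for sigma_c in (1.0, 2.0, 4.0):
    numeric = freqs[np.argmax(dog_gain(freqs, sigma_c))]
    print(sigma_c, numeric, peak_frequency(sigma_c))
```

In this approximation the peak frequency is inversely proportional to `sigma_c`, so doubling the receptive field size halves the preferred spatial frequency.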

The response of the orientation selection chip to a natural image under indoor illumination ($0.2W/m^2$) was verified, as shown in Fig. 9. The accumulation time of the silicon retina was 33.3 ms. A hand with various orientations was presented to the system. Fig. 9A shows the response without pixel aggregation. The orientation selective responses to the hand with pixel aggregation are shown in B. In each case, 8 pixels were aggregated on the orientation selection chip. Fingers with the same orientation as the preferred orientation of the chip respond clearly, while the others are blurred. These responses were obtained sequentially within one frame period (33 ms) from the same orientation selection chip by changing the orientation of the aggregated pixels. A major advantage of a hardware system, however, is that separate chips with different preferred orientations can be connected to the silicon retina in parallel and operated simultaneously.

CONCLUSIONS
In the present study, we implemented an analog multi-chip neuromorphic vision system, with the aim of developing a new image processing device that can be used for real-time, parallel image computation in engineering applications. This multi-chip vision system provides an orientation-selective response that mimics the spatial properties of the simple cell response in the primary visual cortex. The orientation selectivity of the system is produced by a hierarchical architecture, based on the feedforward model, consisting of the silicon retina and the orientation selection chip. The present system is expected to be applicable to emulating the responses of other simple and complex cell models, using the multiple types of output obtained in parallel from the system.

ACKNOWLEDGEMENT
The VLSI chip in this study was fabricated through the chip fabrication program of the VLSI Design and Education Center (VDEC), the University of Tokyo, in collaboration with Rohm Corporation and Toppan Printing Corporation.

REFERENCES