# CHAPTER V CONCLUDING REMARKS

The scientific community has made great efforts to analyze the computing properties of oscillatory neural networks, and a large number of new models have appeared in recent years. However, computer simulations of complex oscillatory systems require solving numerous differential equations, which is usually slow and demands powerful computers. This drawback makes them unsuited to real-time applications in most cases. On the other hand, if these systems are physically implemented instead of simulated, we can take advantage of device characteristics, and time and power requirements can be considerably reduced. In addition, these networks can be embedded in sensing stages, which considerably simplifies communication tasks. A clear example is focal plane processing in visual processing applications, which is the subject of this thesis.

Unfortunately, physical implementations of these models are not as common as simulations due to their complexity. This lack of physical designs would not be a serious problem if implementations behaved like simulations, because results would be alike for both approaches. But obtaining accurate results is incompatible with reducing area and power as much as possible. As neuron models are simplified and adapted to a simple physical implementation, their characteristics change: secondary effects appear and lead to a different network behavior. These adapted models, in addition to being inspired by simulated models that have proven their ability, therefore need to be studied in detail to obtain successful results.

The objective of this work has been to study the characteristics of these oscillatory neural networks and their possibilities for a microelectronic implementation, and to design and implement on silicon a sample model to check the validity of the results. The chapter is structured as follows.
After some remarks on neuromorphic implementations, the contributions of this research to the engineering community and its conclusions are presented. Next, some practical applications of the design as presented are shown. Finally, the future of this line of research and its applications to image preprocessing are briefly discussed.

## V.1. REMARKS ON NEUROMORPHIC IMPLEMENTATIONS

Segmentation of real images is a complex problem that often does not have a unique answer. A simple example of this complexity is a scene of an office with its desks, chairs and books, which is composed of multiple small objects and textures. These small elements also form bigger elements. Both a keyboard key and the whole keyboard can be perceived as separate elements, although the former is a part of the latter. Only previous knowledge of what we are seeing, and the ability to pay attention to certain parts of the image and ignore others, can produce a perfect segregation of the key or the keyboard from the background. Real objects are too complex to be distinguished without attention and previous knowledge of them.

Because of this complexity, the visual perception process cannot be carried out by a single layer of cognition when applied to real images, and most layers need results not only from lower stages but from higher stages too. Thus, in most cases, some kind of feedback between different layers must be established to achieve correct results. After the segmentation process, higher stages must analyze the obtained data and then modify some parameters of the segmentation layer to segregate objects again. This process must be repeated until correct results are obtained. For this reason, it is important for the segmentation network to be fast enough to segregate objects several times before the next image has to be processed.
Our design accomplishes this objective because segmentation is obtained in very few oscillation cycles, typically 2, with a period on the order of tens of microseconds. Thus, thousands of images can be segmented each second.

In addition to being fast, another property makes this network very suitable for segmentation purposes. As results depend on biasing parameters, other visual stages can change these parameters and thereby easily modify network behavior and segmentation properties, as demonstrated in the previous chapters. The duty cycle is mainly controlled by the discharge current, and the number of possible segmented objects strongly depends on it. This number also depends on the speed of the excitatory couplings, which in turn depends on the excitation current biasing. The ratio between inhibition and excitation currents controls the size of segmented objects, and the position current is important for tolerance to noise in input images. When this value is low enough, only pixels that belong to big objects can become active, while isolated pixels, usually produced by input image noise, always remain silent and do not disturb the segmentation process.

It may be argued that the number of elements that the network can segregate is limited by biasing values. However, this network is inspired by the visual system of living beings, and the number of elements that animals and humans can segregate is limited too. To solve this problem, the artificial system must behave as biological systems do. An attentional layer must select a few objects among the great number of elements found in a visual scene and concentrate the attention of the whole system on them. First, the system can make a rough segmentation; then it can concentrate on one complex object by disabling the others and segregating the parts of the selected object, and so on until processing is finished. In this way, the whole system will be able to segregate more elements than the rough segmentation network does.
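
The coarse-to-fine procedure just described can be pictured with a small software analogy. The sketch below is not the chip's mechanism: it assumes a tiny gray-level image where 0 is background, uses 4-connected grouping as a stand-in for the oscillator network, and the names `components`, `attend` and `scene` are hypothetical. A rough pass groups all non-background pixels, an attention step disables everything except one selected object, and a second pass splits that object by gray level.

```python
from collections import deque

def components(grid, same):
    """Group 4-connected non-zero pixels; same(a, b) decides whether
    two neighbouring pixel values belong to the same group."""
    rows, cols = len(grid), len(grid[0])
    seen, groups = set(), []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 0 or (r, c) in seen:
                continue
            group, queue = [], deque([(r, c)])
            seen.add((r, c))
            while queue:
                y, x = queue.popleft()
                group.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and (ny, nx) not in seen
                            and grid[ny][nx] != 0
                            and same(grid[y][x], grid[ny][nx])):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            groups.append(group)
    return groups

def attend(grid, group):
    """Attention step: disable (zero) every pixel outside the selected object."""
    keep = set(group)
    return [[v if (r, c) in keep else 0 for c, v in enumerate(row)]
            for r, row in enumerate(grid)]

# Invented example scene: one composite object (values 1 and 2) and one simple object (3).
scene = [
    [1, 1, 2, 0, 0],
    [1, 1, 2, 0, 3],
    [0, 0, 0, 0, 3],
]

# Rough pass: any two adjacent non-background pixels are grouped together.
rough = components(scene, lambda a, b: True)

# Attention selects the largest object; the fine pass groups by equal gray level.
selected = max(rough, key=len)
fine = components(attend(scene, selected), lambda a, b: a == b)

print(len(rough), "rough objects;", len(fine), "parts inside the selected one")
```

In this toy scene the rough pass finds two objects and the second pass resolves two parts inside the selected one, so the two-pass system distinguishes more elements than a single pass, which is the point made above.
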
An evident example of this process in living beings takes place when reading written text. Our mind decodes only a few words, while other words are seen but not understood unless our attention is focused on them. Notice that we understand sentences because we remember the words and not because we see all of them simultaneously. Higher cognition levels have to be used to achieve proper results.

These characteristics mean that a pre-attentive segmentation layer such as the one presented in this thesis must be considered as one element of a whole perception system in which other layers exist: layers that concentrate attention on a specific part of an image, layers that store information about what we are seeing and what we have seen before, and other layers that carry out further cognitive processes, which vary with the level of intelligence the visual system is to be equipped with.

Evaluating and comparing a single perception layer is not an easy task. Each vision algorithm emphasizes some properties and minimizes others; thus, each one is better suited to a specific kind of image and environment, and no standard rules can be fixed for all images. In addition, for a correct comparison of performance and power consumption, it is essential to consider the different stages that the circuit is connected to. Communication between layers can be an important bottleneck when a great amount of information is involved, as in image processing. Parallel architectures can be very fast when processing data, but they are more sensitive to slow input and output of information. If the parallel architecture is analyzed as a stand-alone device that has to communicate with its periphery via slow serial connections, it will obviously lose all its advantages in the communication process. However, if it can be embedded in a complex system with other cognition stages, communication can be parallel and no losses due to interfacing are produced.
Our network is a good example of this. Loading the shift register with the input image data is a time-consuming operation. Nevertheless, the register D flip-flops in our design can be replaced by light sensors in a final design. No time-consuming process is then needed to input data, because it is acquired in parallel and, as each photoreceptor is embedded in the system, it can be located close to its oscillator.

Another important aspect of the architecture presented here is that each object of the image can be easily selected by choosing the proper sampling time during each oscillation cycle. Depending on the state of each cell when the network state is checked, we obtain an image that contains only one object while the others are ignored. This information can then be fed to higher processing stages to be analyzed. If these stages also operate in parallel, they can be embedded together with the detection sensor layer and the oscillator network; extra communication is then not required and the interface bottleneck disappears. An elementary example of this is given in this work. The global inhibitor, whose evolution reflects whether any cell in the network is active, can be considered a higher stage that counts how many objects are segregated. This higher stage can easily output its results using a single line because they do not require a great amount of data.

Fault tolerance is another important advantage of parallel systems over non-parallel approaches. As a parallel system, the network we have presented consists of a population of active agents; it can tolerate the failure of some of its oscillators and still work properly as a whole, whereas small faults in designs based on a central agent are nearly always fatal. This property has been demonstrated with noisy images. Oscillators associated with noisy pixels stopped oscillating as if they had failed, and results were correct although noisy.

## V.2. CONTRIBUTIONS AND CONCLUSIONS

The main contribution of this thesis is the demonstration, by means of simulations and a practical implementation, that bioinspired oscillatory algorithms for image processing can be implemented using analog VLSI technology.

We focused on the microelectronic implementation of image segmentation networks based on the LEGION algorithm. This algorithm was developed by Wang and Terman, and it has proven its segmentation capabilities in software simulations applied to natural images. However, its oscillatory nature produces a large computational load when it is not simplified. On the other hand, the algorithm has some intrinsic properties that make it suitable for parallel electronic implementations: its connections are mainly local, and its oscillations are based on differential equations that can be generated by physical elements. Thus, there is no need to simulate oscillations as in computer implementations; they can be generated in a physical system, and the state to which the system has evolved is then simply read.

The key element of such networks is the basic oscillator, which emulates a neuron and has to be replicated to form an array containing as many cells as there are pixels in the input image. Several approaches to neural oscillator implementation have been presented in the literature:

- 1.- Mimicking biological neurons as closely as possible in software models or hardware implementations, with a view to improving our comprehension of living beings. These models are difficult to analyze and simulate due to their complexity.
- 2.- Software implementations of simplified neurons that emulate biological abilities while reducing their mathematical analysis and their computational load when simulated. These models can be easily configured and modified, although they are not usually fast and require large resources in terms of computing complexity or simulation time. In addition, a direct hardware implementation of them to improve performance is not simple.
- 3.- Hardware implementations of neurons designed to simplify their architecture for specific electronic applications. Their main drawbacks are the difficulty of their mathematical analysis, due to the complexity of their expressions, and the fact that they cannot be easily modified after implementation. However, if area overhead and power consumption are to be reduced, they are the best suited. They are not the most common ones because of their design cost, and only a few electronic models exist for image segmentation architectures.

In this thesis, we have focused on the third option and have designed a neural cell based on an astable oscillator, with additional circuits to couple cells, input information, extract results, and detect network activity in order to inhibit cell oscillations when required. The oscillatory and synchronization properties of such cells have been characterized analytically and by simulations, and it has been demonstrated that their characteristics are suited to the task of segmentation. The complex cell structures used in software simulations\* have been simplified, while keeping their synchronization characteristics, and properly adapted for a direct VLSI implementation. In addition, computer simulations do not usually consider some secondary effects that arise in physical implementations unless they are specifically programmed, e.g. delays and mismatch between oscillator characteristics. In our work we have studied these effects, and some interesting conclusions about their influence on segmentation properties have arisen.

Besides simulations, a VLSI design for this network has been presented and implemented in an available analog VLSI technology. The final circuit has been tested and its functionality as an image segmentation system has been demonstrated. In addition, we have corroborated the importance of mismatch in this system, as predicted in simulations, especially when area and power consumption are reduced to their minimum possible values.
In our trials we have demonstrated that the proposed oscillators can synchronize in spite of their output delay, and we have shown that the temporal lag between cells is due to mismatch and to capacitances in output nodes. Thus, in future designs, capacitances or output voltage shifts could be reduced to improve network performance. However, delays are not critical for network functionality: although its performance is reduced, the network still works properly.

Results also prove that the network can properly segment noisy images when certain biasing currents are low enough. With this biasing, non-coupled oscillators, which are associated with noisy pixels, cannot oscillate and stay silent during the experiment. This behavior allows the network to work with noisy inputs without using the lateral potential concept used in digital implementations, thus considerably reducing its complexity. In addition, we have confirmed that there is no need to add noise to the system for it to work properly, as is done in computer simulations. Electronic noise and mismatch are enough to desynchronize oscillators when operation starts, in contrast to digital implementations, which need to add some Gaussian noise to start the segmentation process.

<sup>\*</sup> The term complex is tricky and may lead to interpretation errors. Here, as we are studying the network from an electronic perspective, complex means difficult to implement using VLSI techniques, although it may be simple to model using closed expressions and to simulate.

It can be argued that an important drawback of the system is that this segmentation network is sensitive to the tuning of biasing currents. Depending on their values, segmentation results may vary, as may power consumption.
However, as the perception of an image is subjective, this is a common problem of any low-level image processing algorithm, and the speed of an algorithm is an important characteristic to consider because some trials must be done before correct results are obtained. This characteristic justifies processing speeds faster than the image input rate, because more than one segmentation process per image is needed.

## V.3. APPLICATIONS

Scene segmentation layers are among the most important elements in visual systems, because segregating the components that appear in an image into subsets that correspond to physical elements in a scene is essential to enable the tasks of higher processing stages. However, a single segmentation layer without other processing stages, such as light sensing, preprocessing or analysis of previously segmented information, is not very useful for real-world applications.

The design presented in this dissertation was intended to demonstrate that a microelectronic implementation of the algorithm is possible. Since it was developed as a test chip, no other stages have been implemented and its resolution is limited to small input images (16-by-16 pixels), while practical imaging applications need higher-resolution devices. It seems logical to suppose that bigger networks could process real images as properly as the test chip presented. Nevertheless, the scalability of the network is an issue to analyze in future designs. As the network grows, the parasitic capacitances of common nodes become greater, but stronger drivers can solve that problem. The main problems when implementing bigger networks are the number of possible objects that appear in an image and the increase in noisy pixels (commonly related to the higher spatial frequencies that may appear in bigger images).
The test chip presented in this dissertation demonstrated that it could cope with noisy images if proper bias conditions are used, and it is reasonable to suppose that a bigger design would be able to obtain the same performance. Regarding the number of different objects that may appear in bigger images, some kind of attention layer should be added to the network in larger designs to concentrate attention on particular parts of the input image and ignore others.

Optical sensors have not been included in the design in order to simplify its implementation. However, embedding photoreceptors instead of a shift register loaded by an external computer is a necessary improvement for final products. Focal plane processing designs will be able to work without being connected to any external machine, and the problems to solve will be more realistic.

Finally, although not many applications exist for the network as it has been presented, since it is only a part of a whole image processing system, we can enumerate some of them. First, it can isolate objects from complex images. Depending on the instant at which the network state is checked, we obtain different results. Each instant corresponds to a single connected component of the image, and the right reading moment can be easily determined by looking at the global inhibitor. The output image composed of active oscillators corresponds to the pixels of a single object, while oscillators associated with background pixels and with other objects stay silent. This information can then be used to concentrate the attention of the system on a single object. Secondly, the global inhibitor activity counts the total number of connected components in an image and gives an idea of their complexity.
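
The counting role of the global inhibitor can be illustrated in software. The sketch below assumes that the inhibitor output has been sampled as a binary trace covering exactly one oscillation period (1 = high, 0 = low); the trace values are invented for illustration and the function name `pulse_lengths` is hypothetical. It measures the length of each activity pulse, treating the trace as circular, and counts the low intervals that separate the pulses.

```python
def pulse_lengths(trace):
    """Return the lengths of the high pulses in a periodic binary trace.

    The trace is treated as circular (one full period), so a pulse that
    wraps around the end of the list is counted once.
    """
    if 0 not in trace:
        return []  # inhibitor never goes low: no valley to delimit a pulse
    start = trace.index(0)
    rotated = trace[start:] + trace[:start]  # begin on a low sample
    lengths, run = [], 0
    for sample in rotated:
        if sample:
            run += 1
        elif run:
            lengths.append(run)
            run = 0
    if run:
        lengths.append(run)  # pulse that ends at the wrap-around point
    return lengths

# Invented trace: three activity pulses separated by three low intervals.
inhibitor = [1, 1, 1, 0, 1, 1, 0, 0, 1, 0]
lengths = pulse_lengths(inhibitor)
valleys = len(lengths)  # one low interval follows each pulse within the period

print("valleys:", valleys, "pulse lengths (samples):", lengths)
```

Under these assumptions the number of valleys serves as the object count, and the pulse lengths, in units of the sampling interval, give the relative duration of each activity pulse.
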
When the oscillators grouped in one cluster shift to the silent state but the oscillators grouped in another cluster have not yet shifted to the active state, the global cell shifts from its high state to its low state and stays there until the next cluster shifts to the active state. Thus, the number of valleys in inhibitor activity in one period accounts for the total number of objects in an image. In addition to this information, the length of the activity pulses of the inhibitor is proportional to object complexity. As the object becomes more complex, that is to say bigger and with longer paths between pixels of the same object, oscillator delays increase and inhibition pulses become longer.

## V.4. FUTURE WORK

There are several ways to improve the proposed system and make it more suitable for portable low-power applications. As stated in the results chapter, the main drawback of the design is the combined effect of mismatch and the output delay of the oscillators. This combination produces a significant delay between cells, especially when low current biasing is used. As a result, oscillators that are not directly coupled shift to the active state asynchronously and may even be active at different instants. It also considerably decreases the number of different objects that can be segmented in an image. To overcome this drawback, one or both negative effects must be reduced. Mismatch can be reduced through advanced layout techniques or by increasing biasing or area. However, increasing power or area goes against the objectives of the design; thus, reducing output delays seems the only option to follow. For this purpose, the use of fast class AB key comparators has proven to have a large power consumption figure; thus, input/output current-mode comparators seem the best approach to reduce power consumption.
If output voltages are nearly constant and do not have to swing from $V_{DD}$ to ground and back, power consumption will be reduced, while transitions from active to silent and back will be considerably faster and unaffected by parasitic output capacitances.

Excitatory connections between cells are local in this design, and they must remain local in future designs to keep their number under a reasonable limit for parallel implementations. However, while keeping them local, a diffusion network can be implemented to spread the effect of excitations further. This network can be as simple as comparing gray levels between cells and establishing a connection strength that depends on the result, or it can be improved to detect motion, color, textures, etc. As stated at the beginning of this dissertation, this design is only one step of the whole vision process, and this network must be considered as a part of a complete system to validate its performance properly. What makes it so well suited to vision problems is that different algorithms can be implemented through the excitatory synapses, and our network can then be used to segment the different elements that such an algorithm has detected.

Another important issue is synchronization mechanisms; in particular, the use of Fast Threshold Modulation (FTM) must be considered and studied in greater detail. Mismatch is unavoidable in physical systems and, to reduce it, other variables such as area or power must be increased. Two questions remain to be solved in future designs. The first is whether weaker couplings can be used to synchronize oscillators with large mismatch. The second is how much output delays can be reduced without considerably increasing power needs. The more they are reduced, the faster transitions will be or the weaker the excitation biasing we will need. If these values cannot be kept below critical boundaries, strong couplings should still be used in the future.
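
The gray-level comparison mentioned above for the diffusion network could take many forms. The following minimal sketch shows one plausible choice, which is an assumption and not the circuit proposed in this thesis: the connection strength between two 4-neighbouring cells decreases as their gray-level difference grows, using the illustrative rule `w_max / (1 + |difference|)`. The function name `coupling_weights`, the parameter `w_max` and the sample image are all hypothetical.

```python
def coupling_weights(gray, w_max=1.0):
    """Map each pair of 4-neighbouring cells to a coupling strength.

    Assumed rule (illustrative only): equal gray levels give w_max,
    and the strength falls off as the gray-level difference grows.
    """
    rows, cols = len(gray), len(gray[0])
    weights = {}
    for r in range(rows):
        for c in range(cols):
            # Visit only the right and lower neighbours so each pair appears once.
            for nr, nc in ((r, c + 1), (r + 1, c)):
                if nr < rows and nc < cols:
                    diff = abs(gray[r][c] - gray[nr][nc])
                    weights[((r, c), (nr, nc))] = w_max / (1 + diff)
    return weights

# Invented 2-by-3 image: a dark region on the left and a bright column on the right.
gray = [
    [10, 10, 200],
    [12, 11, 205],
]
w = coupling_weights(gray)

strong = w[((0, 0), (0, 1))]   # identical gray levels
weak = w[((0, 1), (0, 2))]     # across the dark/bright boundary
print(f"inside region: {strong:.3f}, across boundary: {weak:.4f}")
```

With a rule of this kind, cells inside a uniform region remain strongly coupled while cells on opposite sides of an edge are almost decoupled, all connections still being strictly local.
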
Segmentation of gray-level images is another application of the presented network that must be studied in detail in subsequent implementations. LEGION has proven its effectiveness for such images; thus, a microelectronic design could also be successful. In addition, the analog nature of the network makes it better suited to analog values than digital systems are, because no A/D conversion is needed and analog values can be stored in single devices, whereas the same values need several bits to be represented digitally. Thus, compared with digital designs, the neuromorphic network should show a considerable advantage in performance for this kind of image.

Finally, the ultimate objective of this kind of circuit will be to embed all vision tasks in one single design. In view of the fast advances in microelectronics and neuroscience, it would not be foolish to foresee in the near future a new breed of devices, NSoC, which stands for Neuromorphic System on a Chip. These devices will be inspired by living beings and will integrate in a single design the sensing devices and the low-level and high-level processing stages, yielding small, low-power intelligent systems that can compete with modern high-end but bulky and power-hungry computers. The advantages of these systems will be integration and power consumption, allowing their use in small portable and autonomous systems that cannot integrate present-day complex computers. Above all, the main advantage will be the capacity for fully parallel processing, avoiding the serial communication bottlenecks that slow down overall system performance and complicate designs.