
Neural networks (history)

Warren McCulloch and Walter Pitts in 1943 showed that simple networks of simulated neurons could evaluate any computable function, and claimed that, if supplemented by a tape and a means of writing and altering symbols on it, such a network could function as a Universal Turing Machine. Each unit of the network (a McCulloch-Pitts neuron) is a binary unit (two states, on/off) receiving excitatory and inhibitory inputs (+1 or -1) from other units or from outside the network. The state of the network emerges over several cycles. In each cycle, any inhibitory input blocks firing; excitatory inputs are summed and, if they exceed the unit's threshold, the unit fires. This is a simplified model of a neuron, and equally of an electric relay. McCulloch and Pitts further associated neurons with propositions and their activation states with truth values.
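As a rough illustration of such a unit (the function name, interface and the AND example are illustrative assumptions, not taken from McCulloch and Pitts), one cycle of a McCulloch-Pitts neuron can be sketched as follows:

    # Sketch of one cycle of a McCulloch-Pitts unit (illustrative, simplified).
    def mcculloch_pitts_unit(excitatory, inhibitory, threshold):
        """Return 1 (fire) or 0 (quiescent) for one cycle.

        excitatory -- list of 0/1 inputs on excitatory connections
        inhibitory -- list of 0/1 inputs on inhibitory connections
        threshold  -- integer firing threshold
        """
        # Any active inhibitory input vetoes firing outright.
        if any(inhibitory):
            return 0
        # Otherwise fire iff the summed excitatory input reaches the threshold.
        return 1 if sum(excitatory) >= threshold else 0

    # Example: an AND gate over two excitatory inputs (threshold 2),
    # showing how such units can compute simple logical functions.
    print(mcculloch_pitts_unit([1, 1], [], threshold=2))  # 1
    print(mcculloch_pitts_unit([1, 0], [], threshold=2))  # 0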

Donald O. Hebb (1949) published The Organization of Behavior: A Neuropsychological Theory: ``Stimulus and response - and what happens in the brain in the interval between them.'' (He emphasised the anti-localisation approaches of the Gestalt theorists and of his own mentor, Lashley; see sec. [*].)

``When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased'' (p. 62). This idea laid the conceptual foundation for connectionism.

Hebb postulated that a particular stimulation leads to the slow development of a ``cell assembly'': sensory neurons capable of acting together to stimulate other specific motor assemblies. These assemblies were created by an interaction in which the repeated firing of one cell due to another strengthens the connection between them.

Cell assemblies were later simulated on digital computers; these simulations included inhibitory connections and the formulation of the ``learning rules'' employed in later connectionist modelling.
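Hebb's postulate is often summarised as ``cells that fire together wire together''. A minimal sketch of such a Hebbian weight update (the function name, learning rate and toy data are illustrative assumptions, not anything Hebb specified) might look like:

    import numpy as np

    # Hebbian update sketch: co-active pre/post pairs get stronger connections.
    def hebbian_update(weights, pre, post, lr=0.1):
        """weights[i, j] connects presynaptic unit j to postsynaptic unit i.

        pre  -- activity vector of presynaptic units
        post -- activity vector of postsynaptic units
        lr   -- learning rate
        """
        # Delta w_ij = lr * post_i * pre_j
        return weights + lr * np.outer(post, pre)

    w = np.zeros((2, 3))
    w = hebbian_update(w, pre=np.array([1, 0, 1]), post=np.array([1, 0]))
    print(w)  # only connections between co-active units have grown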

Apart from these models at the architectural level there were more complex models, e.g. of perception, which used spatial maps of sensory inputs and motor outputs directing, say, eye movements.

Frank Rosenblatt (1962) constructed perceptrons (two-layer networks). An input device (the ``retina'') consisted of binary units with weighted connections to associator units, which became active when their combined input exceeded a threshold. The associator units in turn activated the response units. During each cycle the activity of each response unit was compared to the desired response: connections which gave the correct response were strengthened and those giving the wrong one were weakened. Rosenblatt proved that if an appropriate set of connection weights existed, this procedure would find it. His work involved both mathematical modelling and experiments. His Mark I Perceptron had a 20x20 grid of photoelectric cells connected to electrical networks with variable connections.
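A minimal sketch of this error-correction procedure (the variable names, learning rate and toy OR data below are illustrative assumptions; Rosenblatt's machine adjusted its variable connections electromechanically) might look like:

    import numpy as np

    # Perceptron-style training sketch: strengthen connections after a miss,
    # weaken them after a false alarm, leave them alone when correct.
    def train_perceptron(X, y, lr=1.0, epochs=20):
        """X: input patterns (rows), y: desired binary responses (0/1)."""
        w = np.zeros(X.shape[1])
        b = 0.0
        for _ in range(epochs):
            for x, target in zip(X, y):
                out = 1 if np.dot(w, x) + b > 0 else 0  # thresholded response unit
                w += lr * (target - out) * x
                b += lr * (target - out)
        return w, b

    # Linearly separable toy data (logical OR); the perceptron convergence
    # theorem guarantees a separating weight set is found if one exists.
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    y = np.array([0, 1, 1, 1])
    print(train_perceptron(X, y))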

A few years later Minsky and Papert (Perceptrons, 1969) proved that single-layer perceptron networks could not compute some of the simplest functions involved in pattern recognition, e.g. they could not tell connected patterns from disconnected ones. Minsky and Papert's book sharply decreased interest in connectionism. Most cognitively-oriented computational work in the decade following its publication was based on symbolic processing, in the spirit of Newell and Simon's program. (Contributing factors: divided funds, the practical limitations of hardware and available computing power, an inability to deal with high-level cognition, and an aura of speculation and mystical holism.)

Revival of interest in neural networks

Though ignored by cognitive scientists, work on connectionist networks continued in the background until, in the mid-1980s, interest in connectionism saw a dramatic revival. Among the factors:

In the 1980s NETtalk (Sejnowski) learned to convert letters into phonemes and thus to read aloud; over training its output progressed from babbling, to baby-like talk, to clear speech.

