Project AGI

Building an Artificial General Intelligence

Saturday 14 February 2015

A response to "when is missing data a valid state?"

I recently published a post on the question of classifying 'no input' or missing data. Fergal Byrne wrote a comprehensive reply in a thread on the NuPIC Theory mailing list. I am re-posting it here (below) as it provides a valuable perspective on the discussion in the context of human cortical function.

Jan 24th 2015 - by Fergal Byrne


Natural sensory systems have evolved to transduce some physical phenomenon into an information-carrying neural signal which is of value to the organism. Different modalities have different amounts of local processing, different schemas of encoding, and different types of encodings for "no input" or "nothing to see here". Which configuration a particular sensory modality uses is essentially arbitrary from an information-theoretic point of view, as you (i.e. the system) just need to learn (or evolve) how to interpret and use the stream of data appropriately.
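To make the point about arbitrary "no input" encodings concrete, here is a minimal sketch in Python. It is not NuPIC's encoder API: the function `encode_scalar`, its parameters, and both missing-data conventions are hypothetical and chosen only for illustration. One convention encodes missing data as silence (all zeros), the other as a dedicated block of active bits.

```python
from typing import List, Optional


def encode_scalar(value: Optional[float], lo: float = 0.0, hi: float = 1.0,
                  n_bits: int = 20, width: int = 5,
                  dedicated_missing_bits: bool = False) -> List[int]:
    """Encode a scalar as a contiguous run of `width` active bits.

    A missing value (None) is encoded in one of two hypothetical ways:
    as all zeros, or as a reserved block of `width` bits appended after
    the value bits. Neither is claimed to match any biological encoder.
    """
    total = n_bits + (width if dedicated_missing_bits else 0)
    bits = [0] * total
    if value is None:
        if dedicated_missing_bits:
            # Explicit "nothing to see here" symbol in the reserved block.
            for i in range(n_bits, total):
                bits[i] = 1
        return bits
    # Clip to the encoder's range, then place the run of active bits.
    clipped = min(max(value, lo), hi)
    start = round((clipped - lo) / (hi - lo) * (n_bits - width))
    for i in range(start, start + width):
        bits[i] = 1
    return bits


silent = encode_scalar(None)                                  # no active bits
explicit = encode_scalar(None, dedicated_missing_bits=True)   # reserved bits active
```

Either convention carries the same information, provided the downstream consumer learns which one is in use.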

The input from a sensory source contains not just a series of "values" representing the physical phenomenon being sensed (as we often use in NuPIC), but also contains information "about" the input which is used by the nervous system to process that input. This information might be explicit in the data stream or it might be some implicit, distributed property in the correlations or sequencing of several signals. It might also be "remote" in some sense from the signal source, for example the motor command being sent to a muscle, from which stretch-sensing neurons are sending load-feedback information, mixed with sensory force-feedback information from the fingertips.

A region of cortex is always integrating some amount of feedforward input with some amount of input from elsewhere in its connectome. The mixture of feedforward and recurrent or feedback input will vary greatly, as will the absolute "amount" of signal (measured both by number of signalling axons and rate of signalling) and the information density of that input stream.

The job of a particular region is to attempt to form a stable sequence of representations of this varying stream of incoming signals, by predicting atomic sensorimotor transitions (in L4), pooling those transitions and fitting the pooled representations into learned sequences (in L2/3), executing behaviour (in L5), and feeding back towards the sources of the inputs (in L6). The extent to which a region succeeds in this task is encoded in the sparsity of its outputs (as well as the identities of the signalling axons), allowing other regions and the motor system to use this information appropriately.

This model differs from that of Predictive Coding (mentioned on your blog) in that every SDR encodes both a semantic representation and the prediction error, not just the error. The stream of signals coming out of a layer is maximally sparse when the layer has made a "perfect" prediction given its inputs, with extra neurons temporarily recruited to represent less well predicted information.
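The relationship between prediction quality and output sparsity can be sketched as follows. This is a simplified illustration of the column "bursting" idea, not NuPIC's temporal memory implementation: the function `activate_columns`, its data layout, and the default of 4 cells per column are assumptions made for the example. A column whose input was predicted activates only its predicted cells; an unpredicted column recruits all of its cells.

```python
from typing import Dict, Set, Tuple


def activate_columns(active_columns: Set[int],
                     predicted_cells: Dict[int, Set[int]],
                     cells_per_column: int = 4
                     ) -> Tuple[Set[Tuple[int, int]], float]:
    """Return the active (column, cell) pairs and the fraction of bursting columns."""
    active_cells: Set[Tuple[int, int]] = set()
    bursting = 0
    for col in sorted(active_columns):
        predicted = predicted_cells.get(col, set())
        if predicted:
            # Well predicted: only the predicted cells fire (sparse output).
            active_cells.update((col, c) for c in predicted)
        else:
            # Unpredicted: every cell in the column fires (bursting).
            bursting += 1
            active_cells.update((col, c) for c in range(cells_per_column))
    burst_fraction = bursting / len(active_columns) if active_columns else 0.0
    return active_cells, burst_fraction


# All three columns predicted: 3 active cells, no bursting.
perfect_cells, perfect_burst = activate_columns({1, 5, 9}, {1: {0}, 5: {2}, 9: {3}})

# Only column 1 predicted: 1 + 4 + 4 = 9 active cells, two of three columns burst.
partial_cells, partial_burst = activate_columns({1, 5, 9}, {1: {0}})
```

In this sketch the same set of columns carries the semantic content in both cases, while the number of active cells within those columns carries the prediction error.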

In each region, the information gets several chances to be converted into a nice, stable, sparse SDR:

- In L4, the raw, fast-changing sensorimotor inputs are integrated with the stabler information from L2/3, L6 and with local predictive input from L4 itself, and L4 attempts to simply predict what input will come in next.
- L2/3 uses top-down context signals in L1 as well as sequences it learns itself to form a stable, slower-changing representation of the outputs from L4.
- L5 uses the content and stability of L2/3, as well as the "goal" or "target" meaning of top-down L1 inputs to generate appropriate, corrective motor output and signal to higher regions.
- L6 polls information from thalamic sensorimotor input, from L5 (and thus L2/3, L1) and from within L6 itself to output "subgoal" data to lower regions, as well as gating or amplifying feedback to thalamus, and looped modulation of L4 activity.

Both L2/3 and L5 can signal large errors in sparse prediction to higher regions (where they might be handled) and L6 can control the inputs (both by thalamic gating and by controlling L4) and what the source regions are trying to do (by feedback into L1 in the lower region).

In the case where the feedforward signal is silent (in the sense of the design of the encoding, not literally "zero signal"), a region will rely more on other inputs (both feedforward and feedback) to decide its activity and output. Depending on its inputs and purpose, it may use L6 to try to "amplify" the inputs, it may use L5 to generate behaviour which causes new sensory data to come in, or it may simply "hallucinate" something consistent with past context and top-down influence.

In the situation of an alert human in a silent, dark room, V1 will essentially be receiving a very sparse stream of effectively random, stochastic signals from the retina (perhaps with the exception of a constant signal from specific classes of darkness-encoding ganglion cells). Any variation in these signals will just be noise, so V1 will be unable to form a stable predictive sparse output. Its output will also be a sequence of randomly chosen, uninterpretable bursting columns, which no higher region can use. This "anti-pattern" - an effectively random SDR with maximal bursting - is the "symbol" (or one of a vast number of equivalent semantically empty symbols) for darkness or absence of information.
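A toy simulation can illustrate why noise never settles into a sparse, predicted output. The sketch below is a deliberately crude first-order memory keyed on the exact previous pattern, not a model of V1 or of NuPIC; the function `burst_fractions`, the 2048-column and 40-active-column sizes, and the fixed random seed are all assumptions for illustration.

```python
import random
from typing import Dict, FrozenSet, List, Optional, Set


def burst_fractions(stream: List[FrozenSet[int]]) -> List[float]:
    """For each step, return the fraction of active columns that were not predicted.

    The predictor simply remembers which columns followed each exact
    previous pattern. Every pattern in `stream` must be non-empty.
    """
    memory: Dict[FrozenSet[int], Set[int]] = {}
    fractions: List[float] = []
    prev: Optional[FrozenSet[int]] = None
    for sdr in stream:
        predicted = memory.get(prev, set()) if prev is not None else set()
        fractions.append(len(sdr - predicted) / len(sdr))
        if prev is not None:
            # Learn the transition prev -> sdr after scoring it.
            memory.setdefault(prev, set()).update(sdr)
        prev = sdr
    return fractions


# A repeating three-step sequence becomes fully predicted after one pass.
a, b, c = frozenset({1, 2, 3}), frozenset({4, 5, 6}), frozenset({7, 8, 9})
learned = burst_fractions([a, b, c] * 4)

# Random 40-of-2048 patterns never repeat in practice, so nothing is ever predicted.
rng = random.Random(0)
noise = [frozenset(rng.sample(range(2048), 40)) for _ in range(50)]
noisy = burst_fractions(noise)
```

Under these assumptions the structured stream drops to zero bursting once each transition has been seen, while the noise stream stays at maximal bursting throughout, which is the "anti-pattern" described above.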

At some point in cortex (perhaps in V1), a region will choose to save energy by switching off the inputs altogether using L6 control of thalamus. This leads to the perception of darkness which we experience.

An illustration of this is the terrifying experience of "tunnel vision". We normally maintain an illusion of living in a "dome-like" visual space by holding a memory of the visual world somewhere in the brain. We differentially update and refresh pieces of this memory using a combination of foveal and peripheral vision processing. This is an expensive exercise, but it's very useful to an animal, so we're usually prepared to bear the costs and include it in our energy budget.

In an emergency involving the brain (extreme stress, trauma, disease), however, some triage mechanism activates and this luxury is withdrawn. The inputs and pathways which update the "visual dome" memory are gated out one by one, leaving only the direct, live feed from the foveal area providing visual input to the system. Regions performing the background painting of the memory stop receiving inputs, interpret that as "darkness", and paint the memory as "black". The oldest memories are the first to "expire", and they're usually the ones out near the edges of our dome. The dome gradually fades to black from periphery towards the centre, leading to the experience of looming darkness shrinking vision to a "tunnel".

In answer to your question about biological encoders, this lecture [1] (from a great course on vision [2]) gives you a flavour of the kind of pre-processing carried out in the retina.


Fergal Byrne

[1] https://www.youtube.com/watch?v=rWBW-OrVGAA
[2] https://www.youtube.com/playlist?list=PLCEC78997E3E2DAB4
