## More on Channel Theory

In my last post I introduced a couple of concepts from the channel theory of Jon Barwise and Jerry Seligman [1]. In this post I would like to continue that introduction.

To review, channel theory is intended to help us understand information flows of the following sort: a’s being F carries the information that b is G. For example, we might want a general framework in which to understand how a piece of fruit’s bitterness may carry the information that it is toxic, or how a mountainside’s having a particular distribution of flora can carry information about the local micro-climate, or how a war leader’s generous gift-giving may carry information about the success of a recent campaign, or how the sighting of a gull can carry the information that land is near. In a previous post, we asked how the positions of various participants in a fono might forecast information about the political events of the day. One would hope that such a framework may even illuminate how an incident in which a person gets sick and dies may be perceived to carry the information that there is a sorcerer who is responsible for this misfortune.

In my last post, I introduced a simple sort of data structure called a classification. A classification simply links particulars to types. But as my examples above were intended to show, classifications are not only intended to model  ‘categorical’ data, as usually construed.

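Def 1. A classification is a triple $A = \langle tok(A), typ(A), \vDash_{A} \rangle$ consisting of a set $tok(A)$ of tokens, a set $typ(A)$ of types, and a binary relation $\vDash_{A}$ between them, such that for every token $a \in tok(A)$ and every type $\alpha \in typ(A)$, $a \vDash_{A} \alpha$ if and only if $a$ is of type $\alpha$.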
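To make Def 1 concrete, here is a minimal Python sketch of a classification as a plain data structure. The class name `Classification`, the method `holds`, and the fruit data are all made up for illustration; nothing here comes from Barwise and Seligman beyond the three components of the triple.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Classification:
    """A classification: a set of tokens, a set of types, and the
    (token, type) pairs for which the token is of that type."""
    tok: frozenset
    typ: frozenset
    rel: frozenset  # set of (token, type) pairs

    def holds(self, a, alpha):
        """True when token a is of type alpha."""
        return (a, alpha) in self.rel


# Hypothetical example data: pieces of fruit classified by taste and toxicity.
fruit = Classification(
    tok=frozenset({"berry1", "berry2", "apple1"}),
    typ=frozenset({"bitter", "sweet", "toxic"}),
    rel=frozenset({
        ("berry1", "bitter"), ("berry1", "toxic"),
        ("berry2", "sweet"),
        ("apple1", "sweet"),
    }),
)

print(fruit.holds("berry1", "toxic"))  # True
print(fruit.holds("apple1", "toxic"))  # False
```

Note that a token may fall under several types or under none; the relation is not required to be a function.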

One might remark that a classification is not much more than a table whose attributes have only two possible values, a sort of degenerate relational database. However, unlike a record/row in a relational database, channel theory treats each token as a first-class object. Relational databases require keys to guarantee that each tuple is unique, and key constraints to model relationships between records in tables. By treating tokens as first-class objects, we may model relationships using an infomorphism:

Def 2. Let $A$ and $B$ be two classifications. An infomorphism $f : A \rightleftarrows B$ is a pair of functions $f = \lbrace f^{\wedge}, f^{\vee} \rbrace$, with $f ^{\wedge} : typ(A) \rightarrow typ(B)$ and $f^{\vee}: tok(B) \rightarrow tok(A)$, satisfying the following property: for every type $\alpha \in typ(A)$ and every token $b \in tok(B)$, $b \vDash_{B} f^{\wedge}(\alpha)$ if and only if $f^{\vee}(b) \vDash_{A} \alpha$.

An infomorphism is more general than an isomorphism between classifications, i.e. an isomorphism is a special case of an infomorphism. For example, an infomorphism $f : A \rightleftarrows B$ between classifications A and B might map two or more types in A onto a single type $\beta$ in B, provided that from B’s point of view those types are indistinguishable, or more precisely that for all tokens b in B and any two such types $\alpha$ and $\alpha^{\prime}$ in A, $f^{\vee}(b) \vDash_{A} \alpha$ if and only if $f^{\vee}(b) \vDash_{A} \alpha^{\prime}$. Note that this requirement does not mean that those types in A are not distinguishable in A (or more technically, that they are co-extensional in A). There may be tokens in A outside the range of $f^{\vee}$ for which, for example, $a \vDash_{A} \alpha$ but not $a \vDash_{A} \alpha^{\prime}$. A dual observation may be made about the tokens of B. Two tokens of B may be mapped onto the same token in A, provided that those tokens in B are indistinguishable with respect to the set of types $\beta$ in B for which there exists some $\alpha$ such that $f^{\wedge}(\alpha) = \beta$. Again, this does not mean these same tokens in B are wholly indistinguishable in B. For example, there may be types outside the range of $f^{\wedge}$ classifying them differently. Thus, an infomorphism may be thought of as a kind of view or filter into the other classification.
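The infomorphism condition of Def 2 is easy to check mechanically on small finite classifications. The sketch below is only an illustration: classifications are plain dictionaries, and the names `holds`, `is_infomorphism`, `f_up` (for $f^{\wedge}$) and `f_down` (for $f^{\vee}$), as well as the toy data, are my own. The example collapses two types of A onto one type of B and two tokens of B onto one token of A.

```python
def holds(cls, token, typ):
    """True when the token is of the given type in the classification."""
    return (token, typ) in cls["rel"]


def is_infomorphism(A, B, f_up, f_down):
    """Check that b |=_B f_up(alpha) iff f_down(b) |=_A alpha
    for every type alpha of A and every token b of B."""
    return all(
        holds(B, b, f_up[alpha]) == holds(A, f_down[b], alpha)
        for alpha in A["typ"]
        for b in B["tok"]
    )


# Toy data (hypothetical). a2 tells alpha and alpha_prime apart in A,
# and gamma tells b1 and b2 apart in B.
A = {
    "tok": {"a1", "a2"},
    "typ": {"alpha", "alpha_prime"},
    "rel": {("a1", "alpha"), ("a1", "alpha_prime"), ("a2", "alpha")},
}
B = {
    "tok": {"b1", "b2"},
    "typ": {"beta", "gamma"},
    "rel": {("b1", "beta"), ("b2", "beta"), ("b1", "gamma")},
}

f_up = {"alpha": "beta", "alpha_prime": "beta"}  # two types onto one
f_down = {"b1": "a1", "b2": "a1"}                # two tokens onto one

print(is_infomorphism(A, B, f_up, f_down))       # True

# Sending b2 to a2 instead breaks the condition, because a2 is not of
# type alpha_prime while b2 is of type beta.
bad_down = {"b1": "a1", "b2": "a2"}
print(is_infomorphism(A, B, f_up, bad_down))     # False
```

The first check succeeds even though `a2` distinguishes the two types in A and `gamma` distinguishes the two tokens in B, since `a2` lies outside the range of `f_down` and `gamma` outside the range of `f_up`.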

It is actually rather difficult to find infomorphisms between arbitrary classifications. In many cases there will be none. If infomorphisms were too easy to find, they would not be particularly meaningful; if the requirement were too stringent, they would not be very applicable. However, two classifications may be joined in a fairly standard way. For example, we can add them together:

Def 3. Given two classifications A and B, the sum of A and B is the classification A+B such that:

1.      $tok(A + B)=tok(A)\times tok(B)$,

2.     $typ(A + B)$ is the disjoint union of $typ(A)$ and $typ(B)$ given by $\langle 0,\alpha \rangle$ for each type $\alpha \in typ(A)$ and $\langle 1,\beta \rangle$ for each type $\beta \in typ(B)$, such that

3.      for each token $\langle a,b\rangle \in tok(A+B)$, $\langle a,b\rangle \vDash_{A+B} \langle 0,\alpha \rangle$ iff $a \vDash_{A} \alpha$, and $\langle a,b\rangle \vDash_{A+B} \langle 1,\beta \rangle$ iff $b \vDash_{B} \beta$.

Remark. For any two classifications A and B there exist infomorphisms $\varepsilon_{A} : A \rightleftarrows A+B$ and $\varepsilon_{B} : B \rightleftarrows A+B$ defined such that $\varepsilon_{A}^{\wedge}(\alpha) = \langle 0,\alpha \rangle$ and $\varepsilon_{B}^{\wedge}(\beta) = \langle 1,\beta \rangle$ for all types $\alpha \in typ(A)$ and $\beta \in typ(B)$, and $\varepsilon_{A}^{\vee}(\langle a,b\rangle) = a$ and $\varepsilon_{B}^{\vee}(\langle a,b\rangle) = b$ for each token $\langle a,b\rangle \in tok(A+B)$.
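The sum and its two injections can be sketched in a few lines of Python. As before, this is an illustration under my own assumptions: classifications are dictionaries, the helper names are invented, and the bulb and switch data are made up.

```python
from itertools import product


def holds(cls, token, typ):
    """True when the token is of the given type in the classification."""
    return (token, typ) in cls["rel"]


def is_infomorphism(A, B, f_up, f_down):
    """Check the infomorphism condition for f : A <-> B."""
    return all(
        holds(B, b, f_up[alpha]) == holds(A, f_down[b], alpha)
        for alpha in A["typ"]
        for b in B["tok"]
    )


def classification_sum(A, B):
    """Build A + B: tokens are pairs, types are tagged with 0 or 1."""
    tok = set(product(A["tok"], B["tok"]))
    typ = {(0, t) for t in A["typ"]} | {(1, t) for t in B["typ"]}
    rel = {((a, b), (0, t)) for (a, b) in tok for t in A["typ"] if holds(A, a, t)}
    rel |= {((a, b), (1, t)) for (a, b) in tok for t in B["typ"] if holds(B, b, t)}
    return {"tok": tok, "typ": typ, "rel": rel}


# Hypothetical example data.
bulbs = {"tok": {"bulb1", "bulb2"}, "typ": {"Lit", "Unlit"},
         "rel": {("bulb1", "Lit"), ("bulb2", "Unlit")}}
switches = {"tok": {"sw1", "sw2"}, "typ": {"On", "Off"},
            "rel": {("sw1", "On"), ("sw2", "Off")}}

S = classification_sum(bulbs, switches)

# The injections from the Remark: tag the type, project the token pair.
eps_A_up = {t: (0, t) for t in bulbs["typ"]}
eps_A_down = {(a, b): a for (a, b) in S["tok"]}
eps_B_up = {t: (1, t) for t in switches["typ"]}
eps_B_down = {(a, b): b for (a, b) in S["tok"]}

print(len(S["tok"]), len(S["typ"]))                            # 4 4
print(is_infomorphism(bulbs, S, eps_A_up, eps_A_down))         # True
print(is_infomorphism(switches, S, eps_B_up, eps_B_down))      # True
```

The sum pairs every bulb with every switch, which is why it carries no information about which bulb goes with which switch.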

To see how this is useful, we turn now to Barwise and Seligman’s notion of an information channel.

Def 4. A channel C  is an indexed family of infomorphisms $\{ f_{i} : A_{i} \rightleftarrows C \} _{i \in I}$ each having co-domain in a classification C called the core of the channel.

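As it turns out, in a result known as the Universal Mapping Property of Sums, given a binary channel C = $\{ f : A \rightleftarrows C, g : B \rightleftarrows C \}$, and infomorphisms $\varepsilon_{A} : A \rightleftarrows A+B$ and $\varepsilon_{B} : B \rightleftarrows A+B$, there is a unique infomorphism $f+g : A+B \rightleftarrows C$ making the resulting diagram commute: composing $\varepsilon_{A}$ with $f+g$ gives $f$, and composing $\varepsilon_{B}$ with $f+g$ gives $g$.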

The result is general and can be applied to arbitrary channels and sums.
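For the binary case, the mediating infomorphism from the sum to the core can be written down directly. The sketch below assumes the usual construction: on types it undoes the tagging and applies $f^{\wedge}$ or $g^{\wedge}$, and on tokens it sends a core token c to the pair $\langle f^{\vee}(c), g^{\vee}(c) \rangle$. The variable names (`h_up`, `h_down`, and so on) and the flashlight data are hypothetical.

```python
def holds(cls, token, typ):
    return (token, typ) in cls["rel"]


def is_infomorphism(A, C, up, down):
    """Check c |=_C up(alpha) iff down(c) |=_A alpha for all alpha, c."""
    return all(holds(C, c, up[alpha]) == holds(A, down[c], alpha)
               for alpha in A["typ"] for c in C["tok"])


def classification_sum(A, B):
    tok = {(a, b) for a in A["tok"] for b in B["tok"]}
    typ = {(0, t) for t in A["typ"]} | {(1, t) for t in B["typ"]}
    rel = {((a, b), (0, t)) for (a, b) in tok for t in A["typ"] if holds(A, a, t)}
    rel |= {((a, b), (1, t)) for (a, b) in tok for t in B["typ"] if holds(B, b, t)}
    return {"tok": tok, "typ": typ, "rel": rel}


# Hypothetical data: two flashlights, each with one bulb and one switch.
bulbs = {"tok": {"bulb1", "bulb2"}, "typ": {"Lit", "Unlit"},
         "rel": {("bulb1", "Lit"), ("bulb2", "Unlit")}}
switches = {"tok": {"sw1", "sw2"}, "typ": {"On", "Off"},
            "rel": {("sw1", "On"), ("sw2", "Off")}}
flashlights = {"tok": {"fl1", "fl2"},
               "typ": {"BulbLit", "BulbUnlit", "SwitchOn", "SwitchOff"},
               "rel": {("fl1", "BulbLit"), ("fl1", "SwitchOn"),
                       ("fl2", "BulbUnlit"), ("fl2", "SwitchOff")}}

# The binary channel: f : bulbs <-> flashlights, g : switches <-> flashlights.
f_up = {"Lit": "BulbLit", "Unlit": "BulbUnlit"}
f_down = {"fl1": "bulb1", "fl2": "bulb2"}
g_up = {"On": "SwitchOn", "Off": "SwitchOff"}
g_down = {"fl1": "sw1", "fl2": "sw2"}

S = classification_sum(bulbs, switches)

# The mediating infomorphism h = f + g from the sum to the core.
h_up = {(0, t): f_up[t] for t in bulbs["typ"]}
h_up.update({(1, t): g_up[t] for t in switches["typ"]})
h_down = {c: (f_down[c], g_down[c]) for c in flashlights["tok"]}

# Injection of bulbs into the sum.
eps_A_up = {t: (0, t) for t in bulbs["typ"]}
eps_A_down = {(a, b): a for (a, b) in S["tok"]}

print(is_infomorphism(S, flashlights, h_up, h_down))                    # True
# Commutation on the bulb side: types forwards, tokens backwards.
print(all(h_up[eps_A_up[t]] == f_up[t] for t in bulbs["typ"]))          # True
print(all(eps_A_down[h_down[c]] == f_down[c] for c in flashlights["tok"]))  # True
```

The switch side commutes in the same way, with `(1, t)` tags and the second projection.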

I still haven’t exactly shown how this is useful. To do that we introduce some inference rules that can be used to reason from the periphery to the core and back again in the channel.

A sequent $\langle \Gamma ,\Delta \rangle$ is a pair of sets of types. A sequent $\langle \Gamma ,\Delta \rangle$ is a sequent of a classification $A$ if all the types in  $\Gamma$ and $\Delta$ are in $typ(A).$

Def 5. Given a classification $A,$ a token $a\in tok(A)$ is said to satisfy a sequent $\langle \Gamma ,\Delta \rangle$ of $A$ if, whenever $a \vDash_{A} \alpha$ for every type $\alpha \in \Gamma$, then $a \vDash_{A} \alpha$ for some type $\alpha \in \Delta$. If every $a\in tok(A)$ satisfies $\langle \Gamma ,\Delta \rangle$, then we say that $\Gamma$ entails $\Delta$ in $A$, written $\Gamma \vdash_{A} \Delta$, and $\langle \Gamma ,\Delta \rangle$ is called a constraint of $A.$
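Satisfaction and entailment can be tested directly on a finite classification. The sketch below reads satisfaction as a conditional (a token that has every type in $\Gamma$ must have some type in $\Delta$), so a token that lacks some type in $\Gamma$ satisfies the sequent vacuously. The function names and the switch data are assumptions of mine.

```python
def holds(cls, token, typ):
    return (token, typ) in cls["rel"]


def satisfies(cls, a, gamma, delta):
    """Token a satisfies <gamma, delta>: if a is of every type in gamma,
    then a is of some type in delta."""
    if all(holds(cls, a, t) for t in gamma):
        return any(holds(cls, a, t) for t in delta)
    return True  # antecedent fails, so the sequent is satisfied vacuously


def is_constraint(cls, gamma, delta):
    """gamma entails delta in cls: every token satisfies the sequent."""
    return all(satisfies(cls, a, gamma, delta) for a in cls["tok"])


# Hypothetical example data.
switches = {
    "tok": {"sw1", "sw2"},
    "typ": {"On", "Off"},
    "rel": {("sw1", "On"), ("sw2", "Off")},
}

# On precludes Off: no token is of both types.
print(is_constraint(switches, {"On", "Off"}, set()))  # True
# Every switch is On or Off.
print(is_constraint(switches, set(), {"On", "Off"}))  # True
# Not every switch is On.
print(is_constraint(switches, set(), {"On"}))         # False
```

An empty $\Delta$ expresses exclusion and an empty $\Gamma$ expresses exhaustiveness, which is why sequents with sets on both sides are convenient even without negation among the types.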

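Barwise and Seligman introduce two inference rules: f-Intro and f-Elim. Given an infomorphism from a classification A to a classification C, $f:A\rightleftarrows C$, write $\Gamma^{f}$ for the image of a set $\Gamma$ of types of A under $f^{\wedge}$, and $\Gamma^{-f}$ for the set of types of A that $f^{\wedge}$ maps into a set $\Gamma$ of types of C. The rules are then: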

$f\text{-Intro: }\frac{{{\Gamma }^{-f}}{{\vdash }_{A}}{{\Delta }^{-f}}}{\Gamma {{\vdash }_{C}}\Delta }$

$f\text{-Elim: }\frac{{{\Gamma }^{f}}{{\vdash }_{C}}{{\Delta }^{f}}}{\Gamma {{\vdash }_{A}}\Delta }$

The two rules have different properties. f-Intro preserves validity, whereas f-Elim does not; f-Intro fails to preserve invalidity, whereas f-Elim preserves it. f-Elim is, however, valid precisely for those tokens in A for which there is a token c of C mapped onto them by $f^{\vee}$.

Suppose then that we have a channel. At the core is a classification of flashlights, and at the periphery are classifications of bulbs and switches. We can take the sum of the classifications of bulbs and switches. We know that there are infomorphisms from these classifications to the sum (and so this too makes up a channel), and using f-Intro, we know that any constraints of the classifications of bulbs and switches will still hold in the sum classification bulbs + switches. But note that, since the classification bulbs + switches pairs every bulb token with every switch token, any constraints that might properly hold between bulbs and switches will not hold in the sum classification. Similarly, all the constraints holding in the classification bulbs + switches will hold in the core of the flashlight channel. However, there will be constraints in the core (namely those holding between bulbs and switches) not holding in the sum classification bulbs + switches.

In brief: suppose that we know that a particular switch is in the On position, and that it is a constraint of switches that a switch being in the On position precludes it being in the Off position. We can project this constraint into the core of the flashlight channel reliably. But in the channel additional constraints hold (the ones we are interested in). Suppose that in the core of the channel, there is a constraint that if a switch is On in a flashlight then the bulb is Lit in the flashlight. We would like to know that because *this* switch is in the On position, a particular bulb will be Lit. How can we do it? Using f-Elim we can pull back the constraint of the core to the sum classification. But note that this constraint is *not valid* in the sum classification. It fails, however, for precisely those pairs of bulbs and switches that are not connected in the channel. In this way, we can reason from local information to a distant component of a system, but in so doing, we lose the guarantee that our reasoning is valid, and we lose the guarantee that it is sound.
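The flashlight story can be replayed on a toy model. This is a sketch under my own assumptions: two flashlights, each connecting one bulb to one switch, with invented names throughout, and with the map from the sum to the core built as the pairing of the two peripheral infomorphisms. It shows a core constraint pulled back to the sum by f-Elim, where it fails only on a pair that no flashlight connects.

```python
def holds(cls, token, typ):
    return (token, typ) in cls["rel"]


def satisfies(cls, a, gamma, delta):
    """a satisfies <gamma, delta>: all of gamma implies some of delta."""
    return (not all(holds(cls, a, t) for t in gamma)) or any(
        holds(cls, a, t) for t in delta)


def is_constraint(cls, gamma, delta):
    return all(satisfies(cls, a, gamma, delta) for a in cls["tok"])


def classification_sum(A, B):
    tok = {(a, b) for a in A["tok"] for b in B["tok"]}
    typ = {(0, t) for t in A["typ"]} | {(1, t) for t in B["typ"]}
    rel = {((a, b), (0, t)) for (a, b) in tok for t in A["typ"] if holds(A, a, t)}
    rel |= {((a, b), (1, t)) for (a, b) in tok for t in B["typ"] if holds(B, b, t)}
    return {"tok": tok, "typ": typ, "rel": rel}


# Hypothetical data.
bulbs = {"tok": {"bulb1", "bulb2"}, "typ": {"Lit", "Unlit"},
         "rel": {("bulb1", "Lit"), ("bulb2", "Unlit")}}
switches = {"tok": {"sw1", "sw2"}, "typ": {"On", "Off"},
            "rel": {("sw1", "On"), ("sw2", "Off")}}
flashlights = {"tok": {"fl1", "fl2"},
               "typ": {"BulbLit", "BulbUnlit", "SwitchOn", "SwitchOff"},
               "rel": {("fl1", "BulbLit"), ("fl1", "SwitchOn"),
                       ("fl2", "BulbUnlit"), ("fl2", "SwitchOff")}}

S = classification_sum(bulbs, switches)

# Infomorphism h from the sum to the core (types forwards, tokens backwards).
h_up = {(0, "Lit"): "BulbLit", (0, "Unlit"): "BulbUnlit",
        (1, "On"): "SwitchOn", (1, "Off"): "SwitchOff"}
h_down = {"fl1": ("bulb1", "sw1"), "fl2": ("bulb2", "sw2")}

# A sequent of the sum: switch On entails bulb Lit.
gamma, delta = {(1, "On")}, {(0, "Lit")}

# Premise of f-Elim: the translated sequent is a constraint of the core.
premise = is_constraint(flashlights,
                        {h_up[t] for t in gamma}, {h_up[t] for t in delta})
# Conclusion of f-Elim: the sequent itself, read in the sum.
conclusion = is_constraint(S, gamma, delta)

counterexamples = {p for p in S["tok"] if not satisfies(S, p, gamma, delta)}
connected = set(h_down.values())

print(premise)                      # True
print(conclusion)                   # False
print(sorted(counterexamples))      # [('bulb2', 'sw1')]
print(counterexamples & connected)  # set()
```

The only counterexample pairs an Unlit bulb from one flashlight with an On switch from the other, a combination that exists in the sum but corresponds to no token of the core.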

[1] Barwise, Jon, and Jerry Seligman. 1997. Information Flow: The Logic of Distributed Systems. Cambridge tracts in theoretical computer science 44. Cambridge: Cambridge University Press.

### 5 Responses to “More on Channel Theory”

1. You’re wrong! My dissertation proves it.

2. My meta dissertation owns yours.

3. I wasn’t quite sure how to connect to this – I’m delighted by it, but the notation gets prohibitive for my puny blogworld attention span real quick. But then in another place I was thinking about reactions to the latest political sex scandal (Weinergate) and it sort of clicked as perhaps an example of what you’re getting at; which I think of as ‘linkage’, not of particulars to types but of one class of information to another. That is, do we learn something useful/important/at all relevant about politicians as political representatives from their sex lives? Have I got a valid example here?

4. I should first offer a word of explanation for what some may view as Dead Vole’s deadest vole. In the last few years I have mentioned channel theory to John McCreery (and to some of the other folks here as well) more than a few times, promising that I would present at least some of the theory. Unfortunately, it’s quite a bit of work to do, both to write and to read. Indeed, I’ve only presented the preliminaries; there is a more elegant formulation known as local logics and logic infomorphisms, and it is most interesting in its potential applications, which of course can only be got at once the concepts are gotten hold of. Of course, neither is it a panacea, nor is it the most appropriate choice of framework for specific modeling applications; its purpose is to delineate the conditions necessary for information flow (of a certain kind), and as information can be quite heterogeneous in presentation and multimodal in format, channel theory is quite general in its setup. Unlike, for example, a logic which may build up its propositions (types) out of an array of components (e.g. all the machinery of first order logic), channel theory does not presume any particular internal structuring of either its types or its tokens, although these can be externally imposed. It does not even assume closure under negation.

I think that Carl hits it on the head. A classification is a kind of informational domain constituted by a relation of satisfaction between tokens and types. But channel theory is most concerned with a certain kind of linkage /between/ these informational domains. Information flow occurs /between/ classifications. It could be a classification of people by their sexual inclinations and a classification of politicians by their qualities as political representatives. We might have a theory that a person who cannot be faithful to his wife (or who is careless enough to get caught in so public a way) will not be faithful to his constituents (or too careless to do it right). Of course, the theory is only supposed to hold when it is the same individual in question. That identity is the posited linkage between individuals in each classification. If it were a channel, then the core of this channel might be some (native, or scientific) theory of behavior.

What I would really like to do is present a few examples of how it is supposed to work. One common application in the literature is an application of channel theory to the ‘alignment’ of concepts or ontologies. A well-cited example in the (small-ish) literature on the theory is the construction of a theory bridging French and American ontological classifications of bodies of running water. The idea is basically that you might want to have a guide to translating between two ontologies, e.g. if x is a riviere then it is either a river or a stream. If it is a stream, it is a riviere, but if it is a river, then it is either a riviere or a fleuve. If it is a fleuve then it is a river. This in itself is not at all radical, but channel theory hopes to give a grounded information theoretical basis for exactly how one conceptual scheme can translate or carry information about the other. It is not enough that the types be linked in some way. This is more acutely seen when the issue one confronts is how to bridge two ontologies without a sufficient basis for comparison (at the level of tokens). You might think that it can be done at the level of types via structural comparisons of some sort, but it doesn’t work out that well, at all.