On Understanding Understanding.
Perception-based processing of NL texts in SCIP systems, or meaning constitution as visualized learning.

Burghard B. Rieger¹
FB II: Computational Linguistics, University of Trier
Universitätsring, D-54286 Trier, Germany
URL: `www.ldv.uni-trier.de:8080/rieger.html`

Abstract: Inspired by information systems theory, Semiotic Cognitive Information Processing (SCIP) is grounded in (natural/artificial) system-environment situations. SCIP systems' knowledge-based processing of information makes it cognitive; their sign and symbol generation, manipulation, and understanding capabilities render it semiotic. Based upon structures whose representational status is not a presupposition to, but a result of, recursive processing, SCIP algorithms initiate and modify the structures they operate on to realize (rather than simulate) language understanding by meaning constitution. Thus, the symbolic (de)composition of propositional structures in traditional semantics is complemented by SCIP, which models learning and understanding dynamically by visualizing what is understood in a perception-based, sub-symbolic, multi-resolutional way of processing natural language discourse. An experimental 2-dim scenario with object locations described relative to a mobile agent's varying positions allows SCIP systems' performance to be tested against human natural language understanding in a controlled way².

Keywords: Computational semiotics, discourse understanding, meaning constitution, semantic space, fuzzy modeling, symbol grounding, dynamics, systems theory, visualization, quantitative linguistics.

1 Introduction

In terms of information systems theory, life may be understood as the ability to survive by adapting to changing requirements in the real world. Living systems do so by processing information they receive or derive from relevant portions of their surrounding environments, by learning from their experience, and by changing their behavior accordingly. In contrast to other living systems, which transmit experiential results of environmental adaptation only biogenetically³ to their descendants, human information processing systems have additional means to convey their knowledge to others. In addition to the vertical transmission of system specific (intraneous) experience through (biogenetically successive) generations, mankind has developed complementally effective horizontal means of mediating specific and foreign (extraneous) experience and knowledge to (biogenetically unrelated) fellow systems within their own or any later generation.

1.1 Process and Result

This is made possible by a semiotic move based on representations that make it possible not only to distinguish processes from results of experience, but also to convert the essentially transient status of experiential results into more stable, though dynamic, cognitive entities of knowledge. These can be retrieved, activated, re-used, modified and improved by later processes and their results in learning.

Vehicle and medium of this move is a particular kind of representation. Based upon more or less complex sign systems which constitute languages to form more or less abstract compositional structures or textures, these may be realized in processes of communicative exchange, called actualization⁴. Such textures function - whether internal to a system (as its knowledge) or external to it (as language texts or discourse) - like virtual environments⁵ to any system properly attuned, i.e. being able to recognize and interpret them.

1.2 Mediate and Immediate

In terms of information systems theory, virtuality can be characterized by the fact that it dispenses with the identity of space and time coordinate values for a system and its environment which normally prevails for this relation when qualified to constitute reality [3,pp. 287]. The hypothesis is that the dispensation of this identity (of value pairs) is a condition not only for the possibility of distinguishing (mutually and relatively independent) systems from their environments, but also for establishing a systems theoretical perspective on representations. This opens new vistas on language understanding and how it can be modeled apart from propositional decomposition of language structures.

Immediate or space-time-identical system-environment settings can be distinguished from mediate or space-time-dispensed system-environment situations. The former may be characterized (and modeled) by some stimulus-response form of interaction whereas the latter are tied to some intermediate stratum of a particular format which needs actualization in order to be realized. Its representational function seems to evolve in the very process of actualizing which is itself immediate in the sense that the system concerned has to have physical access to that stratum and its particular format in order to let it represent something else or become mediate in the above specified sense. The distinction corresponds to the twofold status which semioticians like SAUSSURE [4] and PEIRCE [5] had long identified as being characteristic of signs and symbols as well as text and discourse. These can both be perceived as some physical language material (consisting of components) and also be realized as language structure (having meaning) which is to be understood.

It is this double identity (or ontology) of language signs and symbols which calls not only for a two-level modus of actualization but also for a multi-level modeling approach to realize understanding by machine. Taking up conceptions developed in situation theory [1,6], the semiotic approach to understand language understanding may tentatively be characterized as follows:

For information processing systems appropriately adapted (tuned) to their environments the process of actualization consists essentially in a twofold embedding

: to perceive the space-time-identity of pairs of immediate system-environment coordinates which will let the system experience the material properties of texts as composed of signs (i.e. by functions of physical presentation and mutually homomorphic appearance). These properties apply to the percepts of language structures accessible to a system in particular discourse situations, and
: to realize the representational relatedness of pairs of mediate system-environment parameters which will let the system experience the semantic properties of texts as meanings (i.e. by functions of emergence, identification, organization, representation of structures). These apply to the comprehension of language structures as multi-level and multi-dimensional entities (re)cognized by a system to form the described situations which can be understood.

1.3 Knowledge and Cognition

In terms of cognitive theory, realizational functions like identifying structures, interpreting signs, and understanding meanings translate to processes which extend the segments of reality accessible to a living (natural and possibly artificial) information processing system. This extension applies to both the immediate and mediate relations which a system establishes by adaptive learning and emergent understanding based on its own innate or acquired structuredness, its processing capabilities, and its knowledge. Knowledge-based information processing therefore became the paradigm for modeling cognition. Traditional approaches to cognition have tried to provide the knowledge externally as propositionally formatted symbol representations and consequently had to exclude the dynamics of change and self-organizing which are characteristic of adaptation and learning. More recent models of cognitive information systems are designed to model attunement to (textual) environments, endowed with knowledge acquisition, modification, and representation capabilities to allow for the dynamic processing of representational data or sign structures to actualize relevant information. These are referred to as semiotic cognitive information processing (SCIP) systems [7], one of which will be dealt with here.

These introductory remarks (I) are followed by a short survey (II) of what cognitive models of meaning have dealt with so far. Some essentials of SCIP systems as part of a future dynamic image generating semantics (DIGS) will then (III) characterize the computational semiotics approach which - for the purpose of modeling - allows the process of understanding referential expressions to be separated from that of generating them in describing real world entities. The model will be realized in three steps of increasing procedural concretion of processing: by formal definition as morphisms and constraints (IV), their specified instantiations as setting and experiment (V), and their implementations as process and measurement (VI) with test results to illustrate the model's performance. Finally, a summary and outlook to future work (VII) will conclude the paper.

2 Cognitive Models of Meaning

It is common practice in cognitive modeling and mathematical semantics [1,p. 57] to identify the real world with the (symbolic) structure that represents it. From a semiotic point-of-view, this identification is hiding rather than revealing what makes a structured sign aggregate represent or stand for (symbolize) something else.

2.1 Reality, Perception, and Representations

In the context of disciplines focusing on aspects of cognition, like language philosophy, logics, linguistic semantics, biological neuro-science, and computational connectionism, it has been outlined [8] that the relationship between the real world or objective reality (R) of observable entities external to a cognitive system, and the perception of such entities by observations which constitute a system's experience or subjective actuality (A), is cognitively as well as epistemologically highly relevant and model-theoretically most decisive. Suggestions for how this mediation relation may be (re-)constructed have resulted over the years in a number of types of models. These range from simple identity as A = R, to functions as A = f(R) depending on reality (R) only, or as A = f(R, O, C) being based additionally on features of the observing system (O) and its cultural and/or experiential background (C), and reach out to structurally coupled resonance phenomena of semantically closed cognitive systems as A_{t+1} = f(A_t, E, P), which relate perturbations (P) inflicted on the system-environment from outside and the structure of a state space (E) determining that system's possible states, to account for the dynamic changes of the system's actual states A_t along a time scale. In this formula, A seemingly can do altogether without R [9]. This is a consequence of self-organizing, dynamic, autopoietic systems [10] for which the observability of entities external to a cognitive system hinges on their communicability to others which include internal results of commonly experienced external perturbations. Reality R, therefore, should be viewed more like a situational condition for the possibility of inter-subjective and social collections of experiential results rather than an independently existing realm of entities.
Thus, suggesting and finding parameters to reconstruct the background of experiential perception for the interpretation of what can be considered observable reality in this way, underscores the importance of distinguishing endo- from exo-views of reality to overcome the traditional mind/matter duality. In view of representations like natural languages, the endo-exo distinction allows for a semiotically more adequate approach to entities whose observable reality provides for an experiential perception which is also the precondition for their understanding (and the modeling of it).
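For orientation, the four types of models listed above can be set side by side. This is only a restatement in display form of the formulas already given; the annotations on the right paraphrase the surrounding text and add no further assumptions.

```latex
\begin{align*}
A       &= R               && \text{(simple identity)}\\
A       &= f(R)            && \text{(dependence on reality $R$ only)}\\
A       &= f(R, O, C)      && \text{(observing system $O$ and background $C$ added)}\\
A_{t+1} &= f(A_t, E, P)    && \text{(state space $E$, perturbations $P$; no explicit $R$)}
\end{align*}
```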

2.2 Semantic Theory, Meaning, and Understanding

Theoretical and computational linguistics - mediated by (language) philosophy, (formal) logics, and (discrete) mathematics - have clearly dominated research and explicative theory development. They decided on how natural languages (NL), their (compositional) structures, and their (semantic) functions are to be understood and explicated as symbol manipulation and transformation systems. NL communication has long been conjectured to consist of what only recently the cognitive sciences have identified as a complex of multi-level processes. These were conceived as operating on (world, linguistic, situational) knowledge which has to be considered conditional for any information processing. However, the knowledge bases (KB) designed to comply with these conditions were hypothesized as physical symbol systems [2,11] whose static conception of structure proved to be unable to adapt to changing conditions (learning). Some of the problems [12] that cognitive modeling along these lines encountered since are due to the declarative (i.e. symbolic, compositional, propositional) formats employed and the (deterministic, rule-based, modular) procedures chosen in generating, forming, and manipulating linguistic concepts like morphemes, syllables, words, phrases, sentences, texts, and their meanings. As these tend to be construed of clear-cut elements (aggregates, structures, relations, functions, processes, etc.) of systems of language entities, their crisp and determinate definitions do not comply with the way they are perceived which is variable, context dependent, fuzzy, and possibilistic in nature.

In order to understand the dynamics of how natural languages serve the communicative purposes they do, fuzzy [13,14] and procedural modeling [15] approaches to semiotic systems [16,17] and NL understanding [18,19] have advanced some ideas [20,21] for a computational theory of cognitive processing of fuzzy percepts. Conceived as a multi-layered process of structure identification and dynamic representation, the fuzzy modeling techniques employed in SCIP systems so far allow for (numerical, sub-symbolic, distributed, non-propositional) formats whose (parallel, pattern-based, quantitative) computation results in (the emergence of) meanings as enactment of labeled processes of choice restriction [22]. Accordingly, meanings are the outcome rather than the presuppositions of processing [23], whose modeling is a form of realization rather than simulation [24]. It appears that a perception-based simulation of processes (of constraint detection and representation) may bring about results which realize meaning constitution and understanding (of symbolic structures) as grounded in these very processes.

One of the most severe problems arising with this kind of non-linguistic model of language understanding, though, is how its performance and its results may be evaluated and tested in an inter-subjectively agreeable and preferably empirical way. Generating images of what declarative referential language expressions describe could be a solution, provided that the semantic contents detected computationally are a result of and not a presupposition to the detection process.

3 SCIP Systems and DIG Semantics

Based on the above (as yet) fragments of a computational theory of cognitive processing of fuzzy percepts, the contours of a dynamic image generating semantics (DIGS) can be identified which may eventually be able to cope with variability and vagueness, adaptivity and learning, emergence and plasticity of knowledge and understanding in a unifying and comprehensive way. As a fully-fledged theory it would comprise perception-based, sub-symbolic parts side by side with rule-based, symbolic components, the former ideally grounding the latter. Anticipating this combination in a weaker sense, the setup of the modeled semiotic cognitive information processing (SCIP) system is complemented by a component which employs rule-based, symbolic processing for deterministic language generation purposes. As these techniques have long been developed and applied in natural language processing (NLP) in general and computational linguistics (CL) in particular, they are well understood and ready to be used.

In our setup they will be employed to generate the language material (texts) describing real world situations in a formally controlled, i.e. rule-based, determinate and symbolic way. This training material will then be submitted to an equally controlled, though simulative way of processing which is perception-based, non-determinate (stochastic), sub-symbolic (numerical) and hypothesized to model a symbol grounding process of language understanding [25,26]. Suitable visualization of processing results will allow for an ad oculos test and comparison with the real world scenario described by the texts processed.

3.1 Knowledge, Memory, and Models

Most cognitive scientists agree that cognition is a form of information processing. Models of cognition therefore are inspired by information systems theory and based upon (natural or artificial) system-environment situations. Any system whose processing of external, environmental data (input) is determined by its own internal structuredness will generally produce some information (output) relative to both, its internal and external conditions. As soon as the flow of input data consists not only of signals but also of signs or symbol aggregates, the simple system-environment relation will become more complex, oscillating between immediate and mediate as characterized in system theoretical terms. This is due to the double ontology of signs which are not only perceived but have to be recognized as representations in order to be processed accordingly, i.e. interpreted as standing for something else that the perceivable signal is not.

Traditional models of cognitive information processing try to account for this double ontology of signs and symbols - which are physically real like data but in addition also have meaning - by providing the processing system with the necessary information via arbitrarily complex representations (sets, structures, systems) of sign-meaning correspondences, named knowledge-bases. KBs extend the system's data processing capabilities to cognitive, i.e. knowledge-based processing in generating, manipulating, and interpreting sign and symbol aggregates of different kinds. These comprise linguistic knowledge in form of grammars (rules of syntax and semantics), and world knowledge in form of network structures (like frames, scripts, and scenes).

Conceived as being externally attributable to the modeled system and therefore assembled and formatted by the model designer, KBs obviously serve to model functions which are considered essential to the original or natural cognitive systems and their structure (i.e. knowledge and memory). The assumption behind most KBs, however, that knowledge is propositional and its only format of representation is truth-functional, rule-based, and symbolic in nature (linguistic transparency) has been refuted [7,pp. 350]. It was revised in so far as the process of language understanding can neither be identified with sentence parsing nor with the inverted process of generating natural language expressions applying formal syntax and semantics as provided by computational linguistics.

Picture Omitted

Figure 1: Schema of test layout to compare the situated SCIP system's (enigmatic) internal-view (endo-reality) resulting from its (well-defined) processing, against the observer's (well-defined) external-view (exo-reality) which traditional, symbol based, cognitive modeling identifies prematurely with the (enigmatic) processes underlying natural language understanding. Whereas the referential semantics and propositional text grammar are employed to generate PHT corpora of NL descriptions of (real world) situations, the subsymbolic, two-level processing of these descriptions yields the SCIP system's semantic space structure. Its algorithmic visualization (Fig. 9) allows for a comparison with the external observers' view of real world situations (Fig. 3) which traditional models describe by grammatically correct and semantically true propositions encoded as referential meaning or informational content.

3.2 Cognitive Information Processing

In order to let traditional models of cognitive language information processing (CLIP) become semiotic, their knowledge and memory functions have to be conceived as procedural and internal to the systems changing their character from static determination to dynamic flexibility (Fig. 1: processing loops). Additionally, the representational format for knowledge structures and memory functions should facilitate adaptation to changing environmental and processing conditions (learning), and enable identification in changing contexts (efficiency) for a singular system concerned, as well as among a plurality of systems interacting by means of externalized sign representations (communication).

Allowing for variable, ill-defined, underdetermined data to be processed, and enabling the self-organized constitution (emergence) of vague and fuzzy entities to be represented and operated on, semiotic cognitive information processing (SCIP) is based on well-defined procedures which can handle imprecision in a precise way. SCIP systems' ability comprises their performance in knowledge-based information processing and representing its results [27], organizing these representations by activating others from prior processing [28], constituting meanings [29], allowing for (semantic) inferencing [30], and planning [31] by selecting from organized and represented dispositions [32], and modifying them according to changing conditions, results, and states of evolving system-environment adaptedness [33]. Based on NL structures, SCIP performance is a form of complex, multi-resolutional information processing. As a process of meaning constitution it is tied to (and may even be identified with) language understanding [34] or meaning acquisition. Whenever the meaning of signs is not a presupposition to but a result of algorithmic processing of (symbolic) data whose representational status (as in NL discourse) is commonly accepted, then these learning algorithms - being able to initiate and modify the structures they are operating on - may qualify as semiotic and thereby as part of computational semiotics.

3.3 Perception-based Discourse Understanding

The SCIP system's approach to discourse understanding is - very much like modeling vision [35] - essentially perception-based. As such it complements the declarative, symbolic (de)composition of propositional structures exercised by traditional NL semantics in a way which allows for the dynamics these lack. Provided by procedural definitions of quantitative, sub-symbolic and flexible pattern identification, representation, and manipulation, their flexibility might become a central part of an evolving dynamic image generating semantics (DIGS). Its adaptivity would essentially depend on the SCIP system's format of non-symbolic, distributed numerical representations whose processing allows new representations to emerge when needed. They tie the system to those segments of the real world which the language expressions are a part of and - when processed properly - convey information about as their meanings⁶. They do so both according to their grammaticality and propositional contents as determined in a formally specified sense external to the system, and according to the system's own or internal understanding based upon the non-propositional, syntagmatic and paradigmatic regularities in textual structures which can also be visualized. To achieve this, DIGS would have to formalize these ties in two ways: as a deterministic system of grammatical rules for semantic and syntactic constraints to generate true and correct language descriptions of real world entities, and inversely - independent from grammatical rules and their symbolic representations - as a class of restrictions that are typified by (soft) constraints, modeled as procedures which produce (fuzzy) relations represented as (word type/numerical value) distributions.
Whereas the former can straightforwardly be provided by computational linguistics, the latter are not just another instance of transformed data representation but - as they result from non-symbolic, numerical computation - a new type of structural representation associating emergent entities (concepts) with observable entities (objects/signs) to realize what may be named their understanding.

Picture Omitted

Figure 2: Diagram of morphisms mapping vocabulary items (signs) z ∈ T ⊆ V onto meaning points or intensions p ∈ M ⊆ I, allowing their designation des ⊆ V×M (cognitive interpretants) to be reconstructed as composition par∘syn. The denotation den ⊆ M×X, relating intensions to real world entities, may be reconstructed as composition env∘sys of the attuned system's constraints' relation sys ⊆ M×S and the environmental segment's constraints' relation env ⊆ S×X. Thus, den relates (fuzzy) intensions p ∈ M ⊆ I to real situations by classifying (fuzzy) subsets X of entities (objects) x ∈ X ⊂ U in the universe of discourse due to types of (abstracted) situational uniformities s ∈ S common to both. Hence, the reference relation ref ⊆ T×X is reconstructed as composition den∘des, whereas its inverse or description relation dsc ⊆ X×T is (re)constructed as composition stx∘sem of sem ⊆ X×E and stx ⊆ E×T, relating (real) entities (objects) x ∈ X ⊂ U via (formal) language expressions (logical interpretants) e ∈ E ⊂ G of the grammar to semantically true and syntactically correct (natural) language strings (signs) z ∈ T ⊆ V.

4 Morphisms and Constraints

Being grounded in system-environment situations, SCIP systems may formally be characterized by morphisms⁷ which allow meanings and functions of language entities to be represented as evolving from multi-level decompositions of cycles of constraint processing (referring here and below to Fig. 2) operating on and modifying the structured entities concerned [7,p. 380]. Thus, morphisms designate a very general type of relatedness which allows the procedural notion of semioticity to be characterized formally on a rather abstract level. Morphisms call for further specifications which in turn may be instantiated in a variety of ways. Some of these will permit operational application in SCIP-like settings of which a few might even realize PEIRCE's conception of semiosis⁸.

4.1 Decomposition I

The first level of decomposition applies to both the reference and the description morphisms ref and dsc (Fig. 2).

A.1 For the process type of describing entities in the universe of discourse, the morphism dsc: X → T is introduced (Fig. 2). In order to generate semantically true sem and syntactically correct stx natural language expressions T ⊆ V from a given vocabulary, the decomposed morphism dsc = stx∘sem ⊆ X×T will have to be instantiated. This instantiation can theoretically be specified and algorithmically determined by formal expressions e ∈ E ⊂ G of grammatical adequacy as provided by computational linguistics. The morphisms stx and sem define a notion of constrained syntactic correctness and semantic truth of propositional structures. These are dynamically generated to describe real world entities x ∈ X ⊂ U in a controlled way to form NL expressions in texts z ∈ T ⊆ V. Assembled into collections of increasing size, this language material T ⊆ V forms PHT-corpora (of pragmatically homogeneous texts) whose semantic contents (meaning) are the described situations these texts refer to.

A.2 For the inverse process type of understanding natural language expressions T ⊆ V, the referencing morphism ref: T → X is introduced (Fig. 2). Due to the designative and denotative constraints des and den hypothesized to constitute referential meaning, the decomposition ref = den∘des ⊆ T×X allows the reference morphism to be instantiated, relating language entities T ⊆ V to specified real world entities X ⊂ U in the universe, i.e. constituting these NL expressions' meanings.

However, whereas the description process can be based on externally defined formal grammars G whose expressions of symbol manipulation rules E ⊆ G fully determine the language generation, the meanings or concepts M ⊆ I which instantiate the referencing process cannot be provided from the outside without losing the chance to see the system's own, internal way of meaning constitution diverge from the external observers' view and to model its possible approximation to the model designers' understanding. In order to keep that possibility and let the model produce such potential divergence, another level of decomposition has to be introduced to allow instantiation of the as yet unspecified morphisms des and den.
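Read extensionally, the decompositions dsc = stx∘sem and ref = den∘des introduced above are ordinary compositions of binary relations. The following minimal sketch illustrates only that reading; the function name compose and all toy elements (such as "p_triangle" or "x_object") are hypothetical and not taken from the model.

```python
def compose(outer, inner):
    """Return outer ∘ inner = {(a, c) | (a, b) in inner and (b, c) in outer}."""
    return {(a, c) for (a, b) in inner for (b2, c) in outer if b == b2}

# Toy, hypothetical instantiation of the understanding direction.
des = {("triangle", "p_triangle"), ("left", "p_left")}       # des ⊆ T × M
den = {("p_triangle", "x_object"), ("p_left", "x_region")}   # den ⊆ M × X
ref = compose(den, des)                                       # ref = den ∘ des ⊆ T × X

# Toy, hypothetical instantiation of the description direction.
sem = {("x_object", "e_obj")}                                 # sem ⊆ X × E
stx = {("e_obj", "triangle")}                                 # stx ⊆ E × T
dsc = compose(stx, sem)                                       # dsc = stx ∘ sem ⊆ X × T
```

In this toy case the intermediate sets M and E are written down by hand, which is exactly what the argument above rules out for M: the meanings are supposed to result from processing rather than be supplied from outside.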

4.2 Decomposition II

The second level of decomposition applies in turn to both the designation and the denotation morphisms des and den (Fig. 2).

B.1 As a relational notion of correspondence between observable language elements z ∈ T ⊂ V and realized entities of an abstract conceptual nature p ∈ M ⊆ I, the designation morphism des = par∘syn ⊆ T×M is defined as a composition of principles which restrict the combinability of language entities in a way universal to all natural languages. These principles characterize natural languages' ability to form discernible entities and patterns recursively by aggregational or syntagmatic (syn) and selective or paradigmatic (par) restrictions. These can be instantiated by implementable semiotic algorithms for the recursive computation of the combinatorial constraints syn and par and their multi-layered, multi-resolutional representation y ∈ C in (patterns of) distributions of (emergent and abstract) entities p ∈ M ⊆ I.

B.2 For the abstract entities p ∈ M ⊆ I which are being realized conceptually to relate to real world entities x ∈ X ⊂ U in the universe, the denotation morphism den = env∘sys ⊆ M×X is defined as a composition of structural constraints inherent to the system sys on the one hand and to the environment env on the other. They couple the system and its environment to each other and determine their mutual structuredness, restricting the range of components in typified situations s ∈ S common to both. Instantiated as a cluster analyzing algorithm, the sys constraints provide the internal or endo-view a SCIP system may obtain of its environment in collecting structural information (uniformities) as gathered from processing the discourse that describes it. Visualizing these uniformities is a transformation algorithm which instantiates the sys morphism to yield an image of the real world situation comparable to x ∈ X ⊂ U.
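To give a rough idea of the two-level computation named in B.1, the sketch below derives a usage profile per word type from co-occurrences (a stand-in for syn) and then compares these profiles pairwise (a stand-in for par). It is an illustration under loud assumptions: plain co-occurrence counts and Euclidean distance are chosen here only for simplicity, the measures actually used are not specified in this section, and the function names and the three toy sentences are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import dist

def syn_profiles(texts):
    """Assumed syntagmatic step: per word type, count co-occurrences with every other type."""
    vocab = sorted({w for t in texts for w in t.split()})
    counts = Counter()
    for t in texts:
        # Each unordered pair of distinct word types in a sentence counts once.
        for a, b in combinations(sorted(set(t.split())), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    return vocab, {w: [counts[(w, v)] for v in vocab] for w in vocab}

def par_distances(vocab, profiles):
    """Assumed paradigmatic step: distance between the usage profiles of two word types."""
    return {(a, b): dist(profiles[a], profiles[b]) for a in vocab for b in vocab}

# Hypothetical toy corpus, not taken from the PHT corpora.
texts = ["triangle near left", "square near left", "triangle far right"]
vocab, profiles = syn_profiles(texts)
distances = par_distances(vocab, profiles)
```

Each row of distances for a fixed word type is a (word type/numerical value) distribution; nothing in it was entered as a meaning beforehand, which is the property the section relies on.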

4.3 Natural Language and Symbol Grounding

In order to demonstrate the suggested SCIP system's potential and factual discourse understanding capability, it has to be made more concrete procedurally. The base for such a concretion will be language or rather natural language discourse in its structured form. It functions as structural coupling [10], which not only relates an information processing system to its embedding environment (and vice versa), but also serves to instantiate the hypothesized description and reference morphisms by different processes due to the semiotic functions that characterize situated natural language discourse. Whereas the process dsc: X → T of describing real world entities X ⊂ U by NL expressions T ⊆ V can and will be controlled by means of formal grammars (syntax and semantics) dsc = stx∘sem ⊆ X×T, the process ref: T → X of referencing, or how NL expressions T ⊆ V can stand for or represent some real world entities X ⊂ U, is still enigmatic. Its hypothesized reconstruction ref = den∘des ⊆ T×X as understanding process composed of meaning constitution (mental imaging) des = par∘syn ⊆ T×M and structure visualization (symbol grounding) den = env∘sys ⊆ M×X needs to be specified.

To ease the burden of declaring and outlining the correspondences between the formal types of processes introduced so far and their procedural instantiations, the measurements of constraints and their algorithmic implementation, up to computed results and their visualizations, we will in what follows confine ourselves to an experimental setting chosen to limit the SCIP system's testable performance with reference to earlier publications where appropriate.

5 Setting and Experiment

Figure 3: 2-dim reality of stationary object locations △ and □ with mobile agent A, oriented North. The agent's system-positions relative to the object-locations determine the propositional descriptions of SPOL relations in simple, declarative sentences. These are composed of four core predicates (left, right, front, behind) modified by five hedge predicates (two first order: near, far; three second order: extremely, very, rather) as specified by the formal grammar (syntax Tab. IV and semantics Tab. V) which define and control the semantic content of the generated descriptions (PHT corpus), not however the way it is processed for understanding by the SCIP model.

Modeling understanding as a particular form of information processing within a system-environment frame (Fig. 1) is to take advantage of running real-time process simulation tests. For the purpose of experimentally testing semiotic processes, however, their situational complexity has to be reduced by abstracting away irrelevant constituents, hopefully without oversimplifying the issue and trivializing the problem. Therefore, a simple 2-dimensional real world scenario (Fig. 3) was devised as a reference plane with stationary objects △, □ ∈ X ⊂ U (environment) and an oriented mobile agent A ∈ X ⊂ U (system) whose trajectories can be used to generate verbal descriptions of the objects' locations relative to the agent's changing system positions (SPOL relations) in simple declarative sentences⁹. Thus, the propositional form of natural language predication will be used solely to control the contents of the natural language descriptions generated as training material, not, however, to determine the way it is processed to model its understanding. Moreover, the scenario determines the overall situation and provides for the observer's external view of reality, allowing the model designer to distinguish what the SCIP system might grasp (or understand) of it in processing the NL discourse which describes that scenario.

SCIP_System	=	{O, B, W, F, K}
Orientation O	:=	{ N⃗ = (0,1), S⃗ = (0,-1), O⃗ = (1,0), W⃗ = (-1,0) }
Mobility B (pace and direction)	:=	{ g(0,1), g(1,1), g(1,0), g(1,-1), g(0,-1), g(-1,-1), g(-1,0), g(-1,1) : g=1 }
Perception W	:=	{ K := {k_t}, L := Σ_{t=1}^{T} l_t, V := {z_i}, H_i := Σ_{t=1}^{T} h_it : i=1,…,j,…,N }
Processing F	:=	{a, d, z, ...}
K	:=	{ã | x, d̃ | y, ...}
Semantics	:	none
Syntax	:	none

Table 1: Definition of SCIP-systemic properties.

SCIP_Environment	=	{ℜ_E, ℜ_O, ℜ_G, D_⊕, l_ℜ}
Plane ℜ_E	:=	{ P_n,m : ∃ R_n,m ∈ ℜ_G(n₀,m₀,g), P_n,m ∈ R_n,m }
Object ℜ_O	:=	{ □, △, ○, ... }
Grid ℜ_G(n₀,m₀,g)	:=	{ R_n,m = [(n-1)g, ng] × [(m-1)g, mg], 1 ≤ n ≤ n₀, 1 ≤ m ≤ m₀, g > 0 }
Direction D_⊕	:=	{ N⃗ = (0,1), S⃗ = (0,-1), O⃗ = (1,0), W⃗ = (-1,0) }
Object location l_ℜ	:	ℜ_O → ℜ_E

Table 2: Definition of SCIP-environmental properties.

5.1 System and Environment

To be able to test the perception-based non-propositional form of language understanding realized in SCIP systems, it has to be enacted on natural language discourse whose semantic content is well known and certain in an externally defined sense in order to ascertain internal divergences from it. This knowledge and certainty is formally guaranteed by inter-subjectively agreeable correct expressions of true propositions describing a specified segment of reality. Controlling this situated process of description are a formal syntax and semantics employed to generate sentences and texts in pragmatically homogeneous discourse corpora to form the language material. Thus, the non-symbolic form of perception-based processing of these natural language texts (discourse) ideally realizes understanding as symbol grounding which can be compared to, and tested against the real-world scenario whose descriptions are given in the texts processed.

SCIP_Coupling:	Language entities coupling system and environment structurally

Word: the sign-object identified as vocabulary element (type) whose occurrences in (linear) sets of sign-objects (tokens) are countable;

Sentence: the string (non-empty, linear set) of words forming a (syntactically) correct expression of a (semantically) true proposition which denotes a named object's location relative to the system's position (SPOL-relation);

Text: the string (non-empty, linear set) of sentences with identical (pairs of) core-predicates which describe SPOL-relations resulting from the (mobile) system's linear and step-wise movement relative to (fixed) objects;

Corpus: the (non-empty) set of texts comprising descriptions of (any/ all/ samples of) factually possible SPOL-relations generated by a systemically and environmentally specified SCIP setting.

Table 3: Definition of structural SCIP-Coupling entities.

In order to let this perception-based processing be modeled in terms of information systems theory, some conditions have to be specified and defined. They will ensure

that the three main components of the experimental setting, the system, the environment, and their structural coupling, are specified by sets of conditioning properties. These define the SCIP system (Tab. I) by way of a set of procedural entities like orientation, mobility, perception, processing. The SCIP environment (Tab. II) is defined as a set of formal entities like plane, objects, grid, direction, location. And the language discourse material or SCIP-coupling (Tab. III) mediating between system and environment is organized by a number of structural properties of embedded part-whole relations like word, sentence, text, corpus, of which sentence and text require further linguistic specification to ensure correct and true descriptions of real world situations;

T(ext)	:=	{ S_i | S_i → S_i+1 : B ∧ ({KP₁,KP₂} ∈ S_i = {KP₁,KP₂} ∈ S_i+1) ∧ ∀ KP_j ∈ S_i ∪ S_i+1 ; j=1,2 ; i=1,…,I }
B (pace and direction)	:=	{ g(0,1), g(1,1), g(1,0), g(1,-1), g(0,-1), g(-1,-1), g(-1,0), g(-1,1) : g=1 }
S_i	→	NP VP
NP	→	N
VP	→	V PP
PP	→	HP KP_j
N	→	The ⟨ triangle | square | circle ⟩
V	→	is
HP	→	⟨ extremely | very | rather ⟩ ⟨ near | far ⟩
KP₁	→	⟨ in front | behind ⟩
KP₂	→	⟨ on the left | on the right ⟩

Table 4: Text generating phrase structure syntax.

that the environmental data perceived by the SCIP system consists of a corpus of (natural language) texts whose correct expressions of true propositions can inter-subjectively be agreed on. This is achieved by introducing a formal text generating syntax¹⁰ (Tab. IV) and a corresponding reference semantics¹¹ (Tab. V) on the basis of which sentences and texts may automatically be generated. As correct expressions of true propositions they describe the environmental situation the system finds itself exposed to, i.e. the object-locations relative to changing system-positions (SPOL-relations). Both syntax and semantics represent the formally specified exo-view of reality (or the described situations). And finally
that the system's internal picture of its surroundings representing the endo-view (or discourse situations) is to be derived from this textual language environment data other than by way of propositional reconstruction, i.e. without syntactic parsing and semantic interpretation of sentence and text structures. Because this part is the core of the perception-based model of discourse understanding, the measurements and processes employed will be dealt with in more detail below.
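The sentence-level rules of the phrase structure syntax (Tab. IV) can be illustrated by a minimal sketch that enumerates every string derivable from S → NP VP. It covers well-formedness only: which of these sentences is true in a given situation is decided by the reference semantics, and the text-level condition T(ext) is not modeled. The function and constant names are illustrative assumptions, not part of the original system.

```python
from itertools import product

# Terminal alternatives copied from the phrase structure syntax (Tab. IV).
NOUNS = ["triangle", "square", "circle"]
DEGREES = ["extremely", "very", "rather"]
RANGES = ["near", "far"]
CORE_PREDICATES = ["in front", "behind", "on the left", "on the right"]  # KP1, KP2


def all_sentences():
    """Enumerate every sentence the rules S -> NP VP can derive."""
    sentences = []
    for noun, degree, rng, core in product(NOUNS, DEGREES, RANGES, CORE_PREDICATES):
        # NP -> N ; VP -> V PP ; PP -> HP KP_j
        sentences.append(f"The {noun} is {degree} {rng} {core}")
    return sentences


sentences = all_sentences()
print(len(sentences))   # 3 nouns * 3 degrees * 2 ranges * 4 core predicates
print(sentences[0])
```

The small size of this sentence inventory is what makes the semantic content of the generated corpus fully controllable from the outside.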

Core-predicates (KP) in SPOL relations of system-positions x,y and object-locations n,m (with the origin at the bottom left) for all orientations N, S, E, W of the mobile agent

North x,y	in front	behind
on the left	> m, < n	> m, > n
on the right	< m, < n	< m, > n

South x,y	in front	behind
on the left	< m, > n	< m, < n
on the right	> m, > n	> m, < n

East x,y	in front	behind
on the left	< m, < n	> m, < n
on the right	< m, > n	> m, > n

West x,y	in front	behind
on the left	> m, > n	< m, > n
on the right	> m, < n	< m, < n

Hedge-predicates (HP) as distance measure for SPOL-relations (under crisp interpretation): in numbers of grid-points |x-n| and |y-m| of a 12×12 grid laid on the reference plane (Fig. 3)

Crisp interpretation	1	2	3	4	5	6	7	8	9	10	11	12
extremely near	1	1	0	0	0	0	0	0	0	0	0	0
very near	0	0	1	1	0	0	0	0	0	0	0	0
rather near	0	0	0	0	1	1	0	0	0	0	0	0
rather far	0	0	0	0	0	0	1	1	0	0	0	0
very far	0	0	0	0	0	0	0	0	1	1	0	0
extremely far	0	0	0	0	0	0	0	0	0	0	1	1

Table 5: Reference semantics for hedged core predicates.
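The crisp interpretation of the hedge predicates in Tab. V amounts to a lookup in which each hedge covers two consecutive grid distances. The following sketch shows that lookup and how it combines with a core predicate into a sentence of the form licensed by Tab. IV. The function names are hypothetical, and the core predicate is passed in as given rather than derived from positions.

```python
# Hedge predicates in the row order of Tab. V; each one covers two
# consecutive grid distances on the 12x12 grid (1-2, 3-4, ..., 11-12).
HEDGES = [
    "extremely near", "very near", "rather near",
    "rather far", "very far", "extremely far",
]


def crisp_hedge(distance):
    """Return the hedge predicate for a grid distance of 1..12."""
    if not 1 <= distance <= 12:
        raise ValueError("distance must lie between 1 and 12 grid points")
    return HEDGES[(distance - 1) // 2]


def describe(noun, distance, core_predicate):
    """Build one sentence in the form licensed by the syntax of Tab. IV."""
    return f"The {noun} is {crisp_hedge(distance)} {core_predicate}"


print(describe("triangle", 4, "in front"))
```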

5.2 Scenario and SPOL Relations

With the overall situation being a two dimensional reference plane with some stationary objects and a mobile agent (Fig. 3) the SCIP system's perceptive capabilities are limited to its language processing without (as yet) any other ability to act or react. It is on the grounds of the discourse the SCIP system is exposed to and processes in a sub-symbolic, non-propositional, and perception-based way that the object-locations have to be identified in the reference plane.

The semioticity of this processing is conditioned ex negativo by the fact that - whatever the system might gather from its language environment - in doing so it will not apply any grammatical knowledge of (symbolically coded) syntax or semantic rules made available prior to that process. Instead, SCIP is defined to be based solely on the system's own (co- and contextually restricted) susceptibility and processing capabilities to (re-)cognize, identify, and (re-)organize environmental data structures and to (re-)present the results in some dynamic structure which determines the system's knowledge (organization), learning (change) and understanding (representation). It is based on the assumption that a deeper representational level or core structure might be identified as a common base for different notions of meaning developed so far in theories of referential and situational semantics as well as some structural or stereotype semantic theories.

The natural language descriptions, i.e. the syntactically correct expressions of semantically true propositions of predications which represent in their sum a pragmatically homogeneous text (PHT) corpus and specify the overall view or the external observer's exo-reality, couple the SCIP system and the SCIP environment to each other (as defined by Tabs. I to III). Submitting these descriptions to the perception-based, sub-symbolic, cognitive processing as defined by DIGS formalisms ref = den∘des ⊆ T×X is to detect and identify structures and patterns inherent in the language material which relate to structures and patterns that organize the real world these texts describe as their referential meanings. Due to instantiated and implementable SCIP algorithms to process designation as des = par∘syn ⊆ T×M and denotation as den = env∘sys ⊆ M×X, the detected patterns and structures will result in some mappings and vectorial representations (mental images) in the semantic space p ∈ M ⊆ I constituting its understanding. Structures and patterns in semantic space therefore should reveal some of the SCIP system's internal view of its environment (endo-reality) as computed from processing the PHT corpus which describes that environment externally (exo-reality).

In order to reveal what structures may be found in the semantic space and to visualize them accordingly, cluster analyzing algorithms have been employed. They are numerical and independent of any string processing or symbol manipulation techniques common in computational and linguistic semantics, and provide for agglomerative tree structure (dendrogram) generation as a means of visualization which is formally controlled, repeatable, and may inter-subjectively be agreed upon. Based on such dendrograms, another image generating algorithm was developed to allow for a direct comparison and experimental testing of the SCIP system's capacity to understand the referential meanings of language expressions against the externally observed situational reality as specified, described, and represented in the discourse processed.

6 Process and Measurement

[Figure 4 layout: three matrices arranged from left to right. V×V with entries a_11 … a_mm over the items z_1 … z_m (a-abstraction, syntagmatic constraints) is mapped by syn = a | z_i onto C×C with entries d_11 … d_mm over the corpus points y_1 … y_m (d-abstraction, paradigmatic constraints), which is mapped by par = d | y_i onto M×M with entries z_11 … z_mm over the meaning points p_1 … p_m.]

Figure 4: Formalization of syntagmatic and paradigmatic constraints as two-level mapping of usage regularities of items z_i ∈ V and their differences y_i ∈ C. These mappings, which are based first on the correlation measure a: V×V → ℑ_a (Eqn. 4) and second on the Euclidean distance d: C×C → ℑ_d (Eqn. 7), constitute consecutive (a- and d-)abstractions which result in meaning representations p_i ∈ M ⊆ I respectively.

Generating language structures and/or analyzing language regularities by computational procedures cannot only be concerned with the application of rules to strings of symbols in order to produce, re-write, transform, unify, etc. other strings of symbols (sentences), nor is it merely about measuring varying degrees of combinatorial determinacy and to detect different patterns of the language elements' and structures' linear distributions. What is important though is to identify computationally these patterns' and structures' different types and represent them as (symbolically) labeled possibility distributions of (numerical) values that distinguish and determine (define) these labels¹².

6.1 Syntagmatics and Paradigmatics

Computational processes serving that purpose may therefore be identified with procedural definitions of those regularities which they are able to detect and analyze as constraints and/or to generate and represent as structures. Fuzzy linguistics [37] has successfully operationalized some of these processes and applied them recursively to huge amounts of NL data in PHT corpora. These algorithms detect and analyze language regularities, exploit structures as produced by the constraints concerned [38], and represent these as vectors in possibility spaces from which observable syntagmata and paradigmata can be derived. Based upon the fundamental distinction of natural language items' agglomerative or syntagmatic and selective or paradigmatic relatedness¹³, the core of the representational formalism can be characterized as a two-level process of abstraction (Fig. 4). Semiotically, these formal constraints syn ⇒ a | z_n and par ⇒ d | y_n and their consecutive mappings des = par∘syn ⇒ d | y ∘ a | z model the meanings of words as a function of all differences of all usage regularities (Fig. 5) detected for any vocabulary as employed in a PHT corpus.

A. 1 The first level of constraint exploration or a-abstraction (instantiating the syn-relation in Fig. 2) on the set {T} of fuzzy subsets of the vocabulary z ∈ T ⊆ V provides the word-types' usage regularities or corpus points y ∈ C.

The basically descriptive statistics used to grasp these relations on the level of words in discourse are centered around a correlational measure (Eqn. 4) to specify intensities of co-occurring lexical items in texts, and a measure of similarity (or rather, dissimilarity) (Eqn. 7) to specify these correlational value distributions' differences. Simultaneously, these measures may also be interpreted semiotically as set theoretical constraints or formal mappings (Eqns. 5 and 8) which instantiate the designation morphism des = par∘syn ⊆ T×M (Figs. 2 and 5) as a function of differences of usage regularities of words.

For any PHT corpus K = { k_t }; t = 1,…,T of texts with an overall length

\[ L = \sum_{t=1}^{T} l_t ; \quad 1 \le l_t \le L \tag{1} \]

of word-tokens per text, and a vocabulary

\[ V = \{ z_n \} ; \quad n = 1,\ldots,i,j,\ldots,N \tag{2} \]

of word-types whose item frequencies are denoted by

\[ H_i = \sum_{t=1}^{T} h_{it} ; \quad 0 \le h_{it} \le H_i \tag{3} \]

the correlation coefficient a_i,j expresses the pairwise relatedness of word-types (z_i, z_j) ∈ V×V in numerical values a_i,j ∈ ℑ_a ranging from -1 ≤ a_i,j ≤ +1 by calculating co-occurring word-token frequencies in the following way

\[ a_{i,j} = \frac{\sum_{t=1}^{T} (h_{it}-e_{it})\,(h_{jt}-e_{jt})}{\left( \sum_{t=1}^{T} (h_{it}-e_{it})^{2} \; \sum_{t=1}^{T} (h_{jt}-e_{jt})^{2} \right)^{1/2}} ; \tag{4} \]

\[ \text{where } e_{it} = \frac{H_i}{L}\, l_t \text{ and } e_{jt} = \frac{H_j}{L}\, l_t \]

Figure 5: Fuzzy mapping relations a and d between the structured sets {T} and {R} of vocabulary items z_n ∈ T ⊆ V, of corpus points y_n ∈ R ⊆ C, and of meaning points p_i ∈ M ⊂ I as instantiated reconstruction of the designation morphism des in Fig. 2.

Evidently, pairs of word types whose tokens frequently either co-occur in, or are both absent from, a number of texts will be positively correlated (affinity), whereas those of which only one (and not the other) frequently occurs in a number of texts will be negatively correlated (repugnancy).
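A minimal sketch of the correlation measure of Eqn. 4 may make affinity and repugnancy concrete. It assumes the expected frequencies are e_it = (H_i / L) l_t, and it returns 0.0 for a word-type whose frequencies never deviate from expectation, which is a convention chosen here and not stated in the text. The function name, argument layout, and toy frequencies are illustrative assumptions.

```python
from math import sqrt


def correlation_matrix(h, lengths):
    """Compute a[i][j] following Eqn. 4.

    h[i][t]    -- frequency of word-type i in text t
    lengths[t] -- number of word-tokens l_t in text t
    """
    total_length = sum(lengths)                      # L
    n_types = len(h)
    # Deviations h_it - e_it, with e_it = (H_i / L) * l_t (assumed reading).
    dev = []
    for row in h:
        h_total = sum(row)                           # H_i
        dev.append([hit - h_total / total_length * lt
                    for hit, lt in zip(row, lengths)])
    a = [[0.0] * n_types for _ in range(n_types)]
    for i in range(n_types):
        for j in range(n_types):
            num = sum(di * dj for di, dj in zip(dev[i], dev[j]))
            den = sqrt(sum(d * d for d in dev[i]) * sum(d * d for d in dev[j]))
            # Convention of this sketch: no deviation at all gives 0.0.
            a[i][j] = num / den if den else 0.0
    return a


# Toy corpus of four texts with four tokens each (made-up frequencies):
# word-types 0 and 1 always co-occur, word-type 2 fills the other texts.
h = [[2, 0, 2, 0],
     [2, 0, 2, 0],
     [0, 4, 0, 4]]
a = correlation_matrix(h, [4, 4, 4, 4])
print(a[0][1], a[0][2])
```

In this toy corpus word-types 0 and 1 show maximal affinity and word-types 0 and 2 maximal repugnancy.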

As a fuzzy binary relation, ã: V×V → ℑ_a can be conditioned on any z_i ∈ V, which yields a crisp mapping as operational definition of the syn morphism (Fig. 2)

\[ \mathrm{syn} := \tilde{a} \mid z_i : V \to C ; \quad C := \{ y_i \mid 1 \le i \le N \} \tag{5} \]

where C is the set of corpus-points {y_n} representing the numerically specified, syntagmatic usage regularities that have been observed for any word-type z_i against all other z_n ∈ V as measured by a-values. The so-called a-abstraction over the first of the components of each ordered pair (z_i, z_n) determines these usage regularities' abstract representation

\[ y_i := \bigl( a(i,1), \ldots, a(i,N) \bigr)^{N} \tag{6} \]

as a point in the N-dimensional corpus space y_i ∈ C spanned by the number of axes N corresponding to the number of vocabulary items (word-types) z_n ∈ V.

A. 2 The second level of constraint exploration or d-abstraction (instantiating par in Fig. 2) on the set {R} of fuzzy subsets of corpus points y ∈ R ⊆ C provides the corresponding meaning points p ∈ M ⊂ I as a function (i.e. the set theoretical composition par∘syn) of word-types which are being instantiated by word-tokens employed in texts.

Figure 6: Cluster dendrogram of labeled meaning points p ∈ M ⊆ I depicting semantic space structure after processing of 500 texts. Labels of hedge predicates extremely, very, rather (near | far) and core predicates left, right, front, behind are abbreviated to their first letters respectively.

If ⟨C, d⟩ is considered a representational structure (corpus space) of abstract entities constituted by syntagmatic regularities of word-token occurrences in pragmatically homogeneous discourse, then the similarities and/or dissimilarities of these entities will capture what constitutes their corresponding word-types' paradigmatic regularities. These may be calculated by a distance measure d of, say, Euclidean metric

\[ d(y_i, y_j) = \left( \sum_{n=1}^{N} \bigl( a(z_i,z_n) - a(z_j,z_n) \bigr)^{2} \right)^{1/2} ; \tag{7} \]

\[ \Im_d := 0 \le d(y_i, y_j) \le 2\sqrt{N} \]

Thus, d may serve as a second mapping function to represent any item's differences of usage regularities measured against those of all other items. As a fuzzy binary relation, d̃: C×C → ℑ_d can be conditioned on y_i ∈ C, which again yields a crisp mapping as operational definition of the par morphism (Fig. 2)

\[ \mathrm{par} := \tilde{d} \mid y_i : C \to M ; \quad M := \{ p_i \mid 1 \le i \le N \} \tag{8} \]

where M is the set of meaning-points {p_n} representing the numerically specified paradigmatic structure that has been derived for each abstract syntagmatic usage regularity y_i against all other y_n ∈ C. The distance values can therefore be abstracted analogously to Eqn. 6, this time, however, over the other of the two components in each ordered pair, thus defining an element

\[ p_i := \bigl( d(i,1), \ldots, d(i,N) \bigr)^{N} \tag{9} \]

called meaning point p_i ∈ M ⊂ I in an N-dimensional structure called semantic space.
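The two abstraction levels can be chained in a few lines. The sketch below takes a correlation matrix as given, reads its rows as corpus points (Eqn. 6), computes their Euclidean distances (Eqn. 7), and collects these into meaning points (Eqn. 9). The function names and the toy matrix are assumptions made for illustration only.

```python
from math import sqrt


def corpus_points(a):
    """a-abstraction (Eqn. 6): row i of the correlation matrix is y_i."""
    return [list(row) for row in a]


def distance(y_i, y_j):
    """Euclidean distance between two corpus points (Eqn. 7)."""
    return sqrt(sum((u - v) ** 2 for u, v in zip(y_i, y_j)))


def meaning_points(a):
    """d-abstraction (Eqn. 9): p_i collects d(y_i, y_n) for n = 1..N."""
    ys = corpus_points(a)
    return [[distance(y_i, y_n) for y_n in ys] for y_i in ys]


# Toy correlation matrix for N = 3 word-types (made-up values).
a = [
    [1.0, 1.0, -1.0],
    [1.0, 1.0, -1.0],
    [-1.0, -1.0, 1.0],
]
p = meaning_points(a)
print(p[0])
```

With these extreme toy values the distance between the first and third word-type reaches the upper bound 2√N of Eqn. 7 for N = 3, while the first two word-types, having identical usage regularities, coincide.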

FRONT
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 4 4 4 4 4 4 3 3 3 3 0 2 2 2 2 2 2 0 0 0 0 0
0 4 4 4 4 4 4 3 3 3 3 0 2 2 2 2 2 2 0 0 0 0 0
0 4 4 4 4 4 4 3 3 3 3 0 2 2 2 2 2 2 0 0 0 0 0
0 4 4 4 4 4 4 3 3 3 3 0 2 2 2 2 2 2 0 0 0 0 0
0 6 6 5 5 6 6 3 3 3 3 0 2 2 2 2 2 2 0 0 0 0 0
0 6 6 5 5 6 6 3 3 3 3 0 2 2 2 2 2 2 0 0 0 0 0
0 6 6 5 5 6 6 3 3 3 3 0 2 2 2 2 2 2 0 0 0 0 0
0 6 6 5 5 6 6 3 3 3 3 0 2 2 2 2 2 2 0 0 0 0 0
0 3 3 3 3 3 3 4 4 4 4 0 2 2 2 2 2 2 0 0 0 0 0
0 3 3 3 3 3 3 4 4 4 4 0 2 2 2 2 2 2 0 0 0 0 0
LEFT 0 0 0 0 0 0 0 0 0 0 0 △ 0 0 0 0 0 0 0 0 0 0 0 RIGHT
0 2 2 2 2 2 2 2 2 2 2 0 6 6 5 5 4 4 0 0 0 0 0
0 2 2 2 2 2 2 2 2 2 2 0 6 6 5 5 4 4 0 0 0 0 0
0 2 2 2 2 2 2 2 2 2 2 0 5 5 6 6 4 4 0 0 0 0 0
0 2 2 2 2 2 2 2 2 2 2 0 5 5 6 6 4 4 0 0 0 0 0
0 2 2 2 2 2 2 2 2 2 2 0 6 6 5 5 4 4 0 0 0 0 0
0 2 2 2 2 2 2 2 2 2 2 0 6 6 5 5 4 4 0 0 0 0 0
0 2 2 2 2 2 2 2 2 2 2 0 4 4 4 4 5 5 0 0 0 0 0
0 2 2 2 2 2 2 2 2 2 2 0 4 4 4 4 5 5 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
BEHIND

Figure 7: Endo₁(i,j)_[500] showing numerical patterns of relative object location viewed from the system agent (△ oriented north) by sums of grid points marked (from center: extremely | very | rather near; rather | very | extremely far) according to layers of (agglomerative) clusters which the (crisply interpreted) hedged core predicates (left, right, in front, behind) exhibit in semantic space as analyzed after processing of 500 texts.

Thus, the perception-based, non-symbolic, numerical processing of the PHT corpus of natural language expressions describing real world situations yields vectorial representations of meaning points in semantic space whose structuredness is obvious as a result of differences in word usage regularities and well documented as connotative meaning representation [31,39, 40, 41, 42]. The question to be answered here, however, is whether - and to what extent - the semantic space structure corresponds or even refers to any of the situational patterns in the real world which have been described in the discourse processed.

NORTH
396 416 390 364 342 320 294 248 216 184 152 120
416 442 420 397 378 358 334 286 252 218 184 150
400 430 414 396 382 366 346 300 267 234 201 168
384 417 406 394 386 374 358 314 282 250 218 186
368 404 398 392 390 382 370 328 297 266 235 204
352 392 392 391 394 390 382 342 312 282 252 222
326 370 378 384 393 394 392 356 327 298 269 240
300 346 360 372 386 393 398 366 338 310 282 254
274 322 342 360 379 392 404 376 349 322 295 268
228 274 298 320 342 358 374 352 328 304 280 256
194 238 263 286 309 326 343 324 304 284 264 244
160 202 228 252 276 294 312 296 280 264 248 232
SOUTH

Figure 8: Endo₂(m,n)_[500] showing two maxima (442 and 404) of object location likelihood in sums of density values per grid point. These are computed by superimposing locality pattern values from Endo₁(i,j)_[500] according to Eqn. 10.

6.2 Semantic Space Structure and Visualization

In the course of the sub-symbolic, numerical analysis of the PHT corpus that describes real world situations as SPOL relations, the two-level consecutive mappings enacted by the SCIP system (Figs. 4 and 5) resulted in a vectorial representation of meaning points or concepts p ∈ M ⊆ I in semantic space. Its intrinsic structure is to be analyzed to reveal some of the situational (systemic and environmental) constraints, which can be employed in a four-stage visualization process:

applying methods of average linkage cluster analysis [43] makes it possible to identify, comparable to results as produced by Kohonen maps [44], semiotically similar word-types (object labels and hedged core predicate labels) as structurally adjacent meaning points or concepts p ∈ M ⊆ I in a dendrogram format (Fig. 6),
superimposing the hedges' numerical (crisp) interpretation for distance values and the core predicates' directional interpretations for the regions of object locations relative to a centrally positioned agent system, the sums of cluster agglomerations from Fig. 6 produce an intermediate, 23×23 representation (Fig. 7) of the system's own oriented view of its environment which can be transformed to
a mapping that images the system's endo-view, which is orientation dependent and directionally indeterminate, as its directionally determinate exo-view representation¹⁴ (Fig. 8). This can subsequently be transformed into another format to visualize
the referential plane as structured holistically by a profile of (numerically interpolated) polygons which connect regions of denotational likelihood by so-called isoreferentials; their emerging overall pattern denotes possible object locations (Fig. 9).

B. 1 Earlier investigations into the intrinsic structure of semantic space data had revealed [43,45] that topological adjacencies of meaning points can well be identified, and clusters of points be detected and represented in an agglomerative process with an average linkage cluster criterion. Applying these techniques to the semantic space structure as presently computed from processing PHT corpora of increasing size (50 to 500 texts) resulted in dendrograms like Fig. 6. It clearly separates the collections of core predicate labels (34):front-left from (35):behind-right and identifies the latter as (37):square. The former is - in conjunction with the copula (36):is - less distinct as (39):triangle comprises all labels.
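The agglomerative process with an average linkage criterion can be sketched in plain Python. This is a naive, unoptimized illustration of the criterion only; the implementation actually used for the dendrograms is not described in this section, and the function name and the four test items are hypothetical.

```python
def average_linkage(dist):
    """Agglomerative clustering with the average linkage criterion.

    dist -- symmetric matrix of pairwise distances between the items
    Returns the merge history as (members_a, members_b, distance) triples.
    """
    clusters = [(i,) for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                ca, cb = clusters[x], clusters[y]
                # Mean of all pairwise distances between the two clusters.
                avg = sum(dist[i][j] for i in ca for j in cb) / (len(ca) * len(cb))
                if best is None or avg < best[0]:
                    best = (avg, x, y)
        avg, x, y = best
        merges.append((clusters[x], clusters[y], avg))
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)] + [merged]
    return merges


# Four hypothetical items on a line at positions 0, 1, 10 and 11.
positions = [0.0, 1.0, 10.0, 11.0]
dist = [[abs(u - v) for v in positions] for u in positions]
merges = average_linkage(dist)
for step in merges:
    print(step)
```

The merge history is what a dendrogram draws: each triple is one agglomerative step, and the number of steps an item takes part in is the kind of count the later grid transformation relies on.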

B. 2 As the semantic space structure may be considered the internal model (endo-view) of what the SCIP system gathered in processing the language data it was exposed to, this structure has to be transformed in order to be visualized and to allow for a comparison with the real world situation (exo-view) described according to an externally defined grammar (syntax and semantics). For this transformation the four hedged core predicates (left, right, front, behind) are employed to determine a 2-dimensional 23×23 grid (Fig. 7) which spans from the oriented system's center position into four directions along the concentric frames (extremely, very, rather, rather, very, extremely) of hedged near and far regions. The numbers at each point i,j - that make the grid the Endo1_i,j transformed data representation - are frequencies as provided by the cluster dendrogram, which allows each hedged core predicate to be identified with the number of agglomerative steps it is part of.

Figure 9: 2-dim visualizations of potential object locations (isoreferentials) depicting the SCIP system's incremental meaning constitution or learning to understand (without any knowledge of grammar) by sub-symbolic, perceptual processing of textual constraints in an increasing number of (50 to 500) texts. Thus, the SCIP language understanding capacity is tested against real world situations which these texts refer to. The situations are created by the mobile agent's random walk producing changing system positions relative to stationary object locations (SPOL-relations) in a 2-dim reality (Fig. 3), and their descriptions in NL texts of simple, declarative sentences are generated in a grammatically well defined way controlled by formal syntax (Tab. IV) and referential semantics (Tab. V).

B. 3 The Endo1_i,j data (Fig. 7) serves as the basis for this third step, which is a line- and column-wise data compression transformation. It results in a new mapping Endo2_m,n (Fig. 8) according to the summation equation

\[ \mathrm{Endo2}_{m,n} = \sum_{i=m}^{m+11} \; \sum_{j=n}^{n+11} \mathrm{Endo1}_{i,j} \tag{10} \]
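Eqn. 10 is a sliding 12×12 window sum over the 23×23 grid, which yields a 12×12 result. The following sketch applies it to the Endo1 values of Fig. 7, entered compactly as row patterns with repeat counts; the agent's center cell is entered as 0. The function and variable names are illustrative assumptions, and indices are 0-based in the code while Eqn. 10 counts from 1.

```python
def window_sums(endo1, size=12):
    """Eqn. 10: sum every size x size window of the Endo1 grid."""
    out_n = len(endo1) - size + 1
    return [
        [
            sum(endo1[i][j]
                for i in range(m, m + size)
                for j in range(n, n + size))
            for n in range(out_n)
        ]
        for m in range(out_n)
    ]


# Endo1 grid of Fig. 7 (23 x 23), written as (row pattern, repeat count).
Z = [0] * 23
UP_A = [0] + [4] * 6 + [3] * 4 + [0] + [2] * 6 + [0] * 5
UP_B = [0] + [6, 6, 5, 5, 6, 6] + [3] * 4 + [0] + [2] * 6 + [0] * 5
UP_C = [0] + [3] * 6 + [4] * 4 + [0] + [2] * 6 + [0] * 5
LO_A = [0] + [2] * 10 + [0] + [6, 6, 5, 5, 4, 4] + [0] * 5
LO_B = [0] + [2] * 10 + [0] + [5, 5, 6, 6, 4, 4] + [0] * 5
LO_C = [0] + [2] * 10 + [0] + [4, 4, 4, 4, 5, 5] + [0] * 5
layout = [(Z, 1), (UP_A, 4), (UP_B, 4), (UP_C, 2), (Z, 1),
          (LO_A, 2), (LO_B, 2), (LO_A, 2), (LO_C, 2), (Z, 3)]
endo1 = [list(row) for row, count in layout for _ in range(count)]

endo2 = window_sums(endo1)
print(endo2[0][0], endo2[1][1], endo2[8][6])
```

Run on the Fig. 7 values, the sketch reproduces for instance the top-left value 396 of Fig. 8.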

B. 4 The matrix Endo2_m,n (Fig. 8) represents the data structure transformed for an external observer's visualization of the system's endo-view as processed from texts describing SPOL-relations, i.e. fixed object locations relative to changing system positions. The corresponding (two-dimensional) images generated on the basis of Endo2 increments for text corpora of increasing size (50 to 500 texts) give an impression of the dynamics of the developing picture of referential likelihood (Fig. 9). The polygons interpolating the Endo2_m,n data points are called isoreferentials; their overall pattern forms a profile that denotes potential object locations quite clearly, however fuzzily, as regional maxima.

References

[1]: J. Barwise and J. Perry, Situations and Attitudes, Cambridge, MA: MIT Press, 1983.
[2]: H. A. Simon, The Sciences of the Artificial, 2nd ed. Cambridge, MA: MIT Press, 1982.
[3]: B. B. Rieger, "Situation Semantics and Computational Linguistics: towards Informational Ecology. A semiotic perspective for cognitive information processing systems," in Information. New Questions to a Multidisciplinary Concept, K. Kornwachs and K. Jacoby, Eds. Akademie Verlag, 1996, pp. 285-315.
[4]: F. d. Saussure, Cours de Linguistique Générale. (1916) Kritische Edition R. Engler. Wiesbaden: Backhaus, 1967.
[5]: C. S. Peirce, "Pragmatism in Retrospect: A Last Formulation," in The Philosophical Writings of Peirce, ser. (CP 5.11-5.13), J. Buchler, Ed. New York: Dover, 1906, pp. 269-289.
[6]: K. Devlin, Logic and Information. Cambridge: Cambridge UP, 1991.
[7]: B. B. Rieger, "Semiotic Cognitive Information Processing: Learning to Understand Discourse. A Systemic Model of Meaning Constitution," in Perspectives on Adaptation and Learning, R. Kühn et al., Eds. Berlin/ Heidelberg/ New York: Springer, 2003, pp. 347-403.
[8]: M. F. Peschl and A. Riegler, "Does Representation Need Reality?" in Understanding Representation in the Cognitive Sciences, A. Riegler, M. Peschl, and A. von Stein, Eds. New York/ Boston: Kluwer Academic/Plenum, 1999, pp. 9-17.
[9]: H. R. Maturana, "Biology of Language. The epistemology of reality," in Psychology and Biology of Language and Thought, G. Miller and E. Lenneberg, Eds. New York: Academic Press, 1978, pp. 27-64.
[10]: H. Maturana and F. Varela, Autopoiesis and Cognition. Dordrecht/ Boston/ London: Reidel, 1980.
[11]: A. Newell, "Physical symbol systems," Cognitive Science, vol. 4, pp. 135-183, 1980.
[12]: A. Carsetti, "Introduction," in Functional Models of Cognition. Self-Organizing Dynamics and Semantic Structure in Cognitive Systems, A. Carsetti, Ed. Dordrecht/Boston/London: Kluwer Academic, 2000, pp. 127-142.
[13]: L. A. Zadeh, "PRUF - a meaning representation language for natural languages," Int. Journ. Man-Machine-Studies, no. 10, pp. 395-460, 1978.
[14]: --, "Fuzzy logic = Computing with words," in Computing with Words in Information/Intelligent Systems I, L. Zadeh and J. Kacprzyk, Eds. Heidelberg/ New York: Physica Verlag, 1999, pp. 3-23.
[15]: H. Ritter and T. Kohonen, "Self-Organizing Semantic Maps," Biological Cybernetics, vol. 61, pp. 241-254, 1989.
[16]: A. Meystel, Semiotic Modeling and Situation Analysis: an Introduction. Bala Cynwyd, PA: AdRem Inc, 1995.
[17]: B. B. Rieger, "Semiotics and Computational Linguistics. On Semiotic Cognitive Information Processing," in Computing with Words in Information/Intelligent Systems I, L. Zadeh and J. Kacprzyk, Eds. Heidelberg/ New York: Physica Verlag, 1999, pp. 93-118.
[18]: --, "Meaning Acquisition by SCIPS," in ISUMA-NAFIPS-95, B. M. Ayyub, Ed. Los Alamitos, CA: IEEE Computer Society Press, 1995, pp. 390-395.
[19]: --, "Fuzzy Word Meanings as Semantic Granules. Emergent constraints for self-organizing tree structures in SCIP systems," in 5th International Joint Conference on Information Sciences (JCIS-2000), L. Zadeh, P. Wang, and J. Kacprzyk, Eds. Durham, NC: Duke UP, 2000, pp. 56-59.
[20]: L. A. Zadeh, "Toward a Theory of Fuzzy Information Granulation and its Centrality in Human Reasoning and Fuzzy Logic," Fuzzy Sets and Systems, vol. 90, no. 3, pp. 111-127, 1997.
[21]: --, "From computing with numbers to computing with words - from manipulation of measurement to manipulation of perceptions," in Computing with Words, P. P. Wang, Ed. New York, NY: John Wiley & Sons, 2001, pp. 35-68.
[22]: B. B. Rieger, "Fuzzy Computational Semantics," in Fuzzy Systems. Proceedings of the Japanese-German-Center Symposium, H. Zimmermann, Ed. Berlin: JGCB, 1994, pp. 197-217.
[23]: S. Wachsmuth, G. Fink, F. Kummert, and G. Sagerer, "Using speech in visual object recognition," in Mustererkennung 2000, G. Sommer, N. Krüger, and C. Perwass, Eds. Berlin/ Heidelberg: Springer, 2000, pp. 428-435.
[24]: H. Pattee, "Simulations, Realizations, and Theories of Life," in Artificial Life, C. Langton, Ed. Reading, MA: Addison Wesley, 1989, pp. 63-77.
[25]: S. Harnad, "The symbol-grounding problem," Physica-D, vol. 42, pp. 335-346, 1990.
[26]: T. Ziemke, "Rethinking Grounding," in Understanding Representation in the Cognitive Sciences, A. Riegler, M. Peschl, and A. von Stein, Eds. New York/ Boston: Kluwer Academic/Plenum, 1999, pp. 177-190.
[27]: B. B. Rieger, "Distributed Semantic Representation of Word Meanings," in Parallelism, Learning, Evolution. Evolutionary Models and Strategies, WOPPLOT-89, J. D. Becker, I. Eisele, and F. W. Mündemann, Eds. Berlin/ Heidelberg/ New York: Springer, 1991, pp. 243-273.
[28]: B. B. Rieger and C. Thiopoulos, "Situations, Topoi, and Dispositions. On the phenomenological modelling of meaning," in 5. Österreichische Artificial-Intelligence-Tagung, Innsbruck-Igls, J. Retti and K. Leidlmair, Eds. Berlin/ Heidelberg/ New York: Springer, 1989, pp. 365-375.
[29]: B. B. Rieger, "A Systems Theoretical View on Computational Semiotics. Modeling text understanding as meaning constitution by SCIPS," in Joint Conference on the Science and Technology of Intelligent Systems, J. S. Albus, Ed. Piscataway, NJ: IEEE & NIST, 1998, pp. 840-845.
[30]: --, "Procedural Meaning Representation. An empirical approach to word semantics and analogical inferencing," in COLING-82, J. Horecky, Ed. Amsterdam/ New York: North Holland, 1982, pp. 319-324.
[31]: --, "Semantic Relevance and Aspect Dependency in a Given Subject Domain," in COLING-84, D. Walker, Ed. Stanford: ICCL-ACL, 1984, pp. 298-301.
[32]: --, "Relevance of Meaning, Semantic Dispositions, and Text Coherence. Modeling reader expectation from natural language discourse," in Text and Discourse Connectedness, M. Conte, J. Petöfi, and E. Sözer, Eds. Amsterdam/ Philadelphia: Benjamins, 1988, pp. 153-173.
[33]: B. B. Rieger and C. Thiopoulos, "Semiotic Dynamics: a self-organizing lexical system in hypertext," in Contributions to Quantitative Linguistics. Proceedings of the 1st Quantitative Linguistics Conference - QUALICO-91, R. Köhler and B. Rieger, Eds. Dordrecht: Kluwer Academic Publishers, 1993, pp. 67-78.
[34]: B. B. Rieger, "Computing Granular Word Meanings. A fuzzy linguistic approach in Computational Semiotics," in Computing with Words, P. P. Wang, Ed. New York, NY: John Wiley & Sons, 2001, pp. 147-208.
[35]: D. Marr, Vision. San Francisco: Freeman, 1982.
[36]: R. Goldblatt, Topoi: the Categorial Analysis of Logic. Amsterdam: North Holland, 1979.
[37]: B. Rieger, "Fuzzy Modellierung linguistischer Kategorien," in Lexikon und Text, H. Feldweg and E. Hinrichs, Eds. Tübingen: Niemeyer, 1996, pp. 155-169.
[38]: B. B. Rieger, "Computational Semiotics and Fuzzy Linguistics. On meaning constitution and soft categories," in A Learning Perspective: Proceedings of the 1997 International Conference on Intelligent Systems and Semiotics (ISAS-97), A. Meystel, Ed. Washington, DC: US Gov. Printing Office, 1997, pp. 541-551.
[39]: --, "Fuzzy Word Meaning Analysis and Representation in Linguistic Semantics. An empirical approach to the reconstruction of lexical meanings in East- and West-German newspaper texts," in COLING-80, M. Nagao and K. Fuchi, Eds. Tokyo: ICCL, 1980, pp. 76-84.
[40]: --, "Connotative Dependency Structures in Semantic Space," in Empirical Semantics II. A Collection of New Approaches in the Field, B. B. Rieger, Ed. Bochum: Brockmeyer, 1981, pp. 622-711.
[41]: --, "Fuzzy Representation Systems in Linguistic Semantics," in Progress in Cybernetics and Systems Research Vol. XI, R. Trappl, N. Findler, and W. Horn, Eds. Washington/ New York/ London: McGraw-Hill Intern., 1982, pp. 249-256.
[42]: --, "Generating Dependency Structures of Fuzzy Word Meanings in Semantic Space," in Proceedings of the XIIIth International Congress of Linguists, S. Hattori and K. Inoue, Eds. Tokyo: CIPL, 1983, pp. 543-548.
[43]: --, "Clusters in Semantic Space," in Actes du Congrès International Informatique et Sciences Humaines, L. Delatte, Ed. Liège: LASLA, 1983, pp. 805-814.
[44]: T. Kohonen, Self-Organization and Associative Memory, 3rd ed. Berlin/ Heidelberg/ New York/ London: Springer, 1989.
[45]: B. Rieger, Unscharfe Semantik. Frankfurt/ Bern/ Paris: Peter Lang, 1989.
[46]: B. B. Rieger, "Perception-based Processing of NL Texts. Modeling discourse understanding as visualized learning in SCIP systems," in Proceedings 4th Intern. Conf. on Recent Advances in Soft Computing (RASC-02), A. Lotfi, J. Garibaldi, and R. John, Eds. Nottingham: Trent UP, 2002, pp. 506-511.
[47]: --, "Discourse Understanding as Image Generation. On perception-based processing of NL texts in SCIP systems," in Proceedings 6th Conf. of the United Kingdom Simulation Society (UKSIM-03), D. Al-Dabass, Ed. Nottingham: UKSimSoc, 2003, pp. 1-8.
[48]: B. B. Rieger, C. Flores, and D. John, "The experimental SCIP," www.ldv.uni-trier.de:8080/rieger/SCIP.html, 2003.
[49]: L. Zadeh and J. Kacprzyk, Eds., Computing with Words in Information/Intelligent Systems. Heidelberg/ New York: Physica Verlag, 1999.
[50]: P. P. Wang, Ed., Computing with Words. New York, NY: John Wiley & Sons, 2001.

Burghard B. Rieger, Professor em. of Computational Linguistics and former Head of Department of Linguistic Computing at the University of Trier, Germany, has been a researcher and academic teacher for more than three decades. His interdisciplinary work is on topics ranging from German language and literature to linguistics and cognitive science with an early affinity to quantitative and computational approaches. Most of his research is in computational semantics and knowledge representation with special focus on vagueness and fuzzy modeling. His recent work and current interest is in computational semiotics as the study and implementation of dynamic systems of meaning acquisition and language understanding by man and machine. He received his PhD and Dr. habil. in Linguistics from the Technical University (RWTH) Aachen and held various university positions as lecturer, researcher, and visiting professor (Nottingham, Aachen, Amsterdam, Essen, Trier) before he was appointed Professor ordinarius (Chair of Computational Linguistics) at the University of Trier (1986). He wrote two books on quantitative text analysis and on fuzzy computational semantics and is the author of more than 80 articles. He is the editor of several collections and conference proceedings on topics in Empirical Semantics, Computational Linguistics, and Linguistic Computing. He was president of the German Society for Linguistic Computing GLDV (1989-93) and vice-president of the International Society for Terminology and Knowledge Engineering TKE (1990-94), served as Dean and Vice-Dean of his Faculty (1997-2001), and is now Professor emeritus of Trier University.

Footnotes:

¹The author is indebted to two anonymous referees whose comments helped to improve the written version of the lecture and, hopefully, its readability. All errors are, as always, my own.

²The implementation of the SCIP system-environment testbed is due to my PhD students, Christoph Flores and Daniel John, whose design and programming proficiency is gratefully acknowledged.

³According to standard theory, experiential results are not genetically coded directly; rather, they are transmitted indirectly through the selective advantage that organisms with certain genetic mutations gain over those without them, namely surviving under changing environmental conditions with higher reproduction rates.

⁴Term borrowed from Situation Semantics [1,pp. 60] where abstract, actual and factual signify levels of typified (ontological) specificity in characterizing events, state-of-affairs, courses-of-events, situations etc.

⁵Simon's [2] remark "There is a certain arbitrariness in drawing the boundary between inner and outer environments of artificial systems. ... Long-term memory operates like a second environment, parallel to the environment sensed through eyes and ears" (pp. 104) is not a case in point here. As will become clear in what follows, his distinction between the inner (memory structure) and outer (world structure) environments of a system misses the special semiotic quality of natural language signs, whose twofold environmental embedding (textual structure) cuts across the inner/outer distinction, resolving both memory and world structures in their becoming representational for each other.

⁶The meaning conveyed cannot always be represented in a language-independent way, e.g. by observable operations/processes enacted without being understood prior to their (re)presentation as semantic contents. This is also why traditional cognitive approaches readily accept linguistic analyses of propositional language structure as the only explication (linguistic transparency) of understanding, and why linguistic semantics in turn appeals to formal logic as an available format for representing the predicative functioning of declarative NL expressions.

⁷The concept of morphism [36] is employed because it captures a notion of generality as a type of abstract relatedness whose possible instantiations (as mappings, relations, partial or total functions, etc.) due to yet unknown conditions of definiteness cannot and need not be decided on.

⁸"By semiosis I mean [...] an action, or influence, which is, or involves, a coöperation of three subjects, such as sign [z ∈ T ⊆ V], its object [x ∈ X ⊂ U], and its [cognitive: p ∈ M ⊆ I or logical: e ∈ E ⊂ G] interpretant, this tri-relative influence not being in any way resolvable into actions between pairs." [5, p. 282]

⁹"Triangle is very far in front, very near to the left. Square is very near in front, extremely near to the right. ... " etc.

¹⁰The simple phrase structure grammar defines texts to consist of sentences whose core predicates are the same.

¹¹The core predicates' denotations are given according to their symmetric directional dependencies whereas the hedge predicates' (crisp) interpretation is numerical which also allows for continuous (fuzzy) definitions [3,p. 311].

¹²It should be noted that the computational processes dealt with here (and below) are not introduced ad hoc, but instead were derived from and are embedded in the semiotically motivated extension of an information systems theory inspired approach to natural language understanding as part of dynamic image generating semantics (DIGS).

¹³According to SAUSSURE [4], universal constraints control the multi-level combinability and formation of language entities, based upon the distinction between restrictions on the linear aggregation of elements (syntagmatics) and restrictions on their selective replacement (paradigmatics). It is these constraints which make it possible to distinguish not only different levels of entity and structure formation, but also different functions of structure and meaning constitution, which structural linguists have since learned to understand better.

¹⁴As the experimental setting does not (yet) allow the mobile system's orientation to change while traversing the reference plane in different directions, the predicates employed during generation of SPOL-relation descriptions are confined to be directionally determinate (in front=north, left=west, etc.). To allow changing orientations for the mobile system would necessitate a procedural modeling and algorithmic reconstruction of less restricted environmental and systemic constraints env°sys which (so far) have been assumed part of the structural coupling.



