Abstract
The emergence of semantic structure as a self-organizing process is studied in Semiotic Cognitive Information Processing (SCIP) systems on the basis of word-usage regularities in natural language discourse, whose linearly agglomerative (syntagmatic) and selectively interchangeable (paradigmatic) constraints are exploited by text-analysing algorithms. These algorithms accept natural language discourse as input and produce a vector space structure as output, which may be interpreted as an internal (endo) representation of the SCIP system's states of adaptation to the external (exo) structures of its environment, as mediated by the discourse processed. In order to evaluate the system's endo-representation against the exo-view of its environment as described by the natural language discourse processed, a corpus of texts, composed of correct and true sentences with well-defined referential meanings, was generated according to a (very simple) phrase structure grammar and a fuzzy referential semantics which interprets simple composite predicates of cores (like: on the left, in front, etc.) and hedges (like: extremely nearby, very faraway, etc.). Processed during the system's training phase, the corpus reveals structural constraints which the system's hidden structures or internal meaning representations apparently reflect. The system's architecture is a two-level consecutive mapping of distributed representations of systems of (fuzzy) linguistic entities whose states acquire symbolic functions that can be equated with (basal) referential predicates. Test results from an experimental setting with varying fuzzy interpretations of hedges are presented to illustrate the SCIP system's miniature (cognitive) language understanding and meaning acquisition capacity without any initial explicit syntactic and semantic knowledge.
Whereas traditional approaches in artificial intelligence (AI) research or computational linguistics (CL) model cognitive tasks or natural language understanding in information processing systems according to the realistic view of semantics, it is argued here that meaning need not be introduced as a presupposition of semantics but may instead be derived as a result of procedural modelling1, as soon as a semiotic line of approaches to cognition is followed [3].
According to Situation Semantics, any language expression is tied to reality in two ways: by the discourse situation, which allows an expression's meaning to be interpreted, and by the described situation, which allows its interpretation to be evaluated truth-functionally. Within this relational model of semantics, meaning may be considered the derivative of information processing which (natural or artificial) systems, due to their own structuredness, perform by recognizing similarities or invariants between situations that structure their surrounding realities (or fragments thereof).
By ascertaining these invariants and by mapping them as uniformities across situations, cognitive systems properly attuned to them are able to identify and understand those bits of information which appear to be essential to form these systems' particular views of reality: a flow of types of situations related by uniformities such as individuals, relations, and time-space-locations. These uniformities constrain a system's external world to become its view of reality as a specific fragment of persistent (and remembered) courses of events whose expectability renders them interpretable or even objective.
In semiotic sign systems like natural languages, such uniformities appear to be signalled also by word-types whose employment as word-tokens in texts exhibits a special form of structurally conditioned constraints. Not only does their use allow speakers/hearers to convey/understand meanings differently in different discourse situations (efficiency), but at the same time the discourses' total vocabulary and word usages also provide an empirically accessible basis for the analysis of structural (as opposed to referential) aspects of event-types and of how these are related by virtue of word uniformities across phrases, sentences, and texts uttered. Thus, as a means for the intensional (as opposed to the extensional) description of (abstract, real, and actual) situations, the regularities of word-usage may serve as an access to, and a representational format for, those elastic constraints which underlie and condition any word-type's meaning, the interpretations it allows within possible contexts of use, and the information its actual word-token employment on a particular occasion may convey.
The philosophical concept of the language game can be combined with the formal notion of situations, allowing for the identification of a cognitive system's (internal) structure with the (external) structure of that system's environment. Being tied to the observables of actual language performance enacted in communicative language usage opens up an empirical approach to procedural semantics. Whatever can formally be analysed as uniformities in Barwiseian discourse situations may eventually be specified by word-type regularities as determined by co-occurring word-tokens in pragmatically homogeneous samples of language games. Going back to the fundamentals of structuralist descriptions of the syntagmatic linearity and paradigmatic selectivity of language items, the correlational analysis of discourse allows for a multi-level word meaning and world knowledge representation whose dynamism is a direct function of elastic constraints established and/or modified in language communication.
As has been outlined in some detail elsewhere [4] [6] [8] [12], the meaning function's range may be computed and simulated as a result of exactly those (semiotic) procedures by way of which (representational) structures emerge and their (interpreting) actualisation is produced from observing and analysing the domain's regular constraints, as imposed on the linear ordering (syntagmatics) and the selective combination (paradigmatics) of natural language items in communicative language performance. For natural language semantics this is tantamount to (re)presenting a term's meaning potential by a fuzzy distributional pattern of the modelled system's state changes, rather than by a single symbol whose structural relations are to represent the system's interpretation of its environment. Whereas the latter has to exclude, the former will automatically include the (linguistically) structured, pragmatic components which the system will both embody and employ as its (linguistic) import to identify and to interpret its environmental structures by means of its own structuredness.
The basically descriptive statistics used to grasp these relations on the level of words in discourse are centred around a correlational measure (Eqn. 1) to specify intensities of co-occurring lexical items in texts, and a measure of similarity (or rather, dissimilarity) (Eqn. 4) to specify the differences between these correlational value distributions. Simultaneously, these measures may also be interpreted semiotically as set-theoretical constraints or formal mappings (Eqns. 2 and 5) which model the meanings of words as a function of differences of usage regularities.
The correlation coefficient $\alpha_{i,j}$ expresses the pairwise relatedness of word-types $(x_i, x_j) \in V \times V$ in numerical values ranging from $-1$ to $+1$ by calculating co-occurring word-token frequencies in the following way:

$$\alpha(x_i, x_j) = \frac{\sum_{t=1}^{T}(h_{it} - e_{it})(h_{jt} - e_{jt})}{\left(\sum_{t=1}^{T}(h_{it} - e_{it})^2 \; \sum_{t=1}^{T}(h_{jt} - e_{jt})^2\right)^{1/2}} \qquad (1)$$

where $e_{it} = \frac{H_i}{L}\, l_t$ and $e_{jt} = \frac{H_j}{L}\, l_t$, with the text corpus $K = \{k_t\};\; t = 1, \ldots, T$ having an overall length $L = \sum_{t=1}^{T} l_t;\; 1 \le l_t \le L$, measured by the number of word-tokens per text, and a vocabulary $V = \{x_n\};\; n = 1, \ldots, i, j, \ldots, N$ whose frequencies are denoted by $H_i = \sum_{t=1}^{T} h_{it};\; 0 \le h_{it} \le H_i$.
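As a minimal sketch, the correlation measure of Eqn. 1 can be computed from per-text token counts as follows; the toy corpus and all names here are illustrative and not taken from the original system:

```python
import math

def alpha(corpus, xi, xj):
    """Correlation of word-types xi, xj over the texts of a corpus (Eqn. 1).

    corpus: list of dicts mapping word-type -> token frequency h_it per text.
    Returns a value in [-1, +1]: positive pairs are 'affined',
    negative pairs 'repugnant'.
    """
    lengths = [sum(text.values()) for text in corpus]   # l_t per text
    L = sum(lengths)                                    # overall corpus length
    Hi = sum(text.get(xi, 0) for text in corpus)        # corpus frequency H_i
    Hj = sum(text.get(xj, 0) for text in corpus)
    num = sii = sjj = 0.0
    for text, lt in zip(corpus, lengths):
        di = text.get(xi, 0) - Hi / L * lt              # h_it - e_it
        dj = text.get(xj, 0) - Hj / L * lt              # h_jt - e_jt
        num += di * dj
        sii += di * di
        sjj += dj * dj
    return num / math.sqrt(sii * sjj) if sii and sjj else 0.0

# toy corpus of three "texts"
corpus = [
    {"sun": 3, "moon": 0, "sky": 2},
    {"sun": 0, "moon": 4, "sky": 1},
    {"sun": 2, "moon": 0, "sky": 2},
]
print(alpha(corpus, "sun", "sky"))    # positive: the pair is affined
print(alpha(corpus, "sun", "moon"))   # negative: the pair is repugnant
```

Items that co-occur in (or are jointly absent from) the same texts more often than their overall frequencies predict come out positive, items in complementary distribution negative, matching the affined/repugnant distinction drawn below.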
Evidently, pairs of items which frequently either co-occur in, or are both absent from, a number of texts will be positively correlated and hence called affined; those of which only one (and not the other) frequently occurs in a number of texts will be negatively correlated and hence called repugnant.
As a fuzzy binary relation, $\tilde\alpha : V \times V \to I$ can be conditioned on $x_n \in V$, which yields a crisp mapping

$$\tilde\alpha|_{x_n} =: \alpha_n : V \to I \qquad (2)$$

where the tuples $\langle (x_{n,1}, \tilde\alpha(n,1)), \ldots, (x_{n,N}, \tilde\alpha(n,N)) \rangle$ represent the numerically specified syntagmatic usage regularities that have been observed for each word-type $x_i$ against all other $x_n \in V$. $\alpha$-abstraction over one of the components in each ordered pair defines

$$y_n := \langle \tilde\alpha(n,1), \ldots, \tilde\alpha(n,N) \rangle \qquad (3)$$
Hence, the regularities of usage of any lexical item will be determined by the tuple of its affinity/repugnancy values towards each other item of the vocabulary, which, interpreted as coordinates, can be represented by points in a vector space C spanned by as many axes as there are entries in the vocabulary.
Measuring the differences between any two such points by the distance

$$\delta(y_i, y_j) = \left( \sum_{n=1}^{N} \left(\tilde\alpha(i,n) - \tilde\alpha(j,n)\right)^2 \right)^{1/2} \qquad (4)$$

$\delta$ may serve as a second mapping function to represent any item's differences of usage regularities measured against those of all other items. As a fuzzy binary relation, $\tilde\delta : C \times C \to I$ can be conditioned on $y_n \in C$, which again yields a crisp mapping

$$\tilde\delta|_{y_n} =: \delta_n : C \to I \qquad (5)$$

where the tuples $\langle (y_{n,1}, \tilde\delta(n,1)), \ldots, (y_{n,N}, \tilde\delta(n,N)) \rangle$ represent the numerically specified paradigmatic structure that has been derived for each abstract syntagmatic usage regularity $y_j$ against all other $y_n \in C$. The distance values can therefore be abstracted analogously to Eqn. 3, this time, however, over the other of the components in each ordered pair, thus defining an element $z_j \in S$, called a meaning point, by

$$z_j := \langle \tilde\delta(1,j), \ldots, \tilde\delta(N,j) \rangle \qquad (6)$$
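Continuing the sketch with illustrative names (not the original implementation): given a full matrix of α-values, each item's corpus point y_i is simply its row of α-values, and the tuple of δ-distances from that row to every other row constitutes the item's meaning point z_j:

```python
import math

def corpus_points(alpha_matrix):
    """y_i: the tuple of alpha-values of item i against all items."""
    return [tuple(row) for row in alpha_matrix]

def delta(yi, yj):
    """Distance between two corpus points (difference of usage regularities)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(yi, yj)))

def meaning_points(alpha_matrix):
    """z_j: the tuple of delta-values of item j against all corpus points."""
    ys = corpus_points(alpha_matrix)
    return [tuple(delta(yj, yn) for yn in ys) for yj in ys]

# toy symmetric alpha-matrix for three items (x1, x2, x3):
# x1 and x2 are used alike, x3 is used in complementary contexts
A = [[ 1.0,  0.9, -0.8],
     [ 0.9,  1.0, -0.7],
     [-0.8, -0.7,  1.0]]
Z = meaning_points(A)
print(Z[0])  # z_1: small distance to y_2, large distance to y_3
```

The second-stage distances thus no longer compare raw co-occurrence but whole usage profiles: two items get a small δ precisely when they correlate similarly with everything else, i.e. when they are paradigmatically interchangeable.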
Identifying $z_n \in S$ with the numerically specified elements of potential paradigms, the set of possible combinations $S \times S$ may structurally be constrained and evaluated without (direct or indirect) recourse to any pre-existent external world. Introducing a Euclidean metric

$$\zeta(z_i, z_j) = \left( \sum_{n=1}^{N} \left(\tilde\delta(i,n) - \tilde\delta(j,n)\right)^2 \right)^{1/2} \qquad (7)$$

the hyperstructure $\langle S, \zeta \rangle$, or semantic hyper space (SHS), is declared, constituting the system of meaning points as an empirically founded and functionally derived representation of a lexically labelled knowledge structure (Tab. 1).
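A sketch of how such a metric can be used to inspect a meaning point's topological neighbourhood in the SHS, e.g. by ranking all other items by their ζ-distance (labels and data are invented for illustration):

```python
import math

def zeta(zi, zj):
    """Euclidean metric on the semantic hyper space (SHS)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(zi, zj)))

def neighbours(points, labels, i):
    """All items ranked by zeta-distance from meaning point i."""
    ranked = sorted(range(len(points)), key=lambda j: zeta(points[i], points[j]))
    return [(labels[j], zeta(points[i], points[j])) for j in ranked if j != i]

# toy meaning points (tuples of delta-values) with invented labels
Z = [(0.0, 0.17, 3.0),
     (0.17, 0.0, 2.9),
     (3.0, 2.9, 0.0)]
result = neighbours(Z, ["left", "right", "far"], 0)
print(result)  # "right" ranks nearest to "left", "far" ranks last
```

The point of the two-stage construction is precisely that such neighbourhood queries are meaningful: topological closeness in ⟨S, ζ⟩ stands in for paradigmatic relatedness, without any symbol-level semantics having been supplied.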
As a result of the two-stage consecutive mappings, any meaning point's position in SHS is determined by all the differences ($\delta$- or distance-values) of all regularities of usage ($\alpha$- or correlation-values) which each lexical item shows against all others in the discourse analysed. Without recourse to any investigator's or his test-persons' word or world knowledge (semantic competence), but solely on the basis of usage regularities of lexical items in discourse resulting from actual or intended acts of communication (communicative performance), text understanding is modelled procedurally as the process of constructing and identifying the topological positions of any meaning point $z_i \in \langle S, \zeta \rangle$ corresponding to the vocabulary items $x_i \in V$, which can formally be stated as the composition of the two restricted relations $\tilde\delta|_{y}$ and $\tilde\alpha|_{x}$ (Fig. 1).
Processing natural language texts the way these algorithms do would appear to grasp some interesting portion of the ability to recognize and represent, and to employ and modify, the structural information available and accessible under such performance. A semiotic cognitive information processing system (SCIPS) endowed with this ability and able to perform likewise would consequently be said to have constituted some text understanding. The problem is, however, whether (and if so, how) the contents of what such a system is said to have acquired can be tested, i.e. made accessible other than by the language texts in question and/or without committing to a presupposed semantics determining possible interpretations.
To allow such testing, the experimental setting requires, first, that its three main components, the system, the environment, and the discourse, be specified by sets of conditioning properties. These define the SCIP system by way of a set of procedural entities like orientation, mobility, perception, processing (Tab. 2); the SCIP environment as a set of formal entities like plane, objects, grid, direction, location (Tab. 3); and the SCIP discourse material mediating between system and environment as structured by a number of part-whole related entities like word, sentence, text, corpus (Tab. 4), of which sentence and text require further formal restrictions specified by a formal syntax (Tab. 5) and a referential semantics (Tab. 6). It requires, second, that the system's environmental data consist in a corpus of (natural language) texts of correct expressions of true propositions denoting system-object relations, described according to the formally specified syntax and semantics (representing the exo-view or described situations); and, third, that the system's internal picture of its surroundings (representing the endo-view or discourse situations) be derived from this textual language environment other than by way of propositional reconstruction, i.e. without syntactic parsing and semantic interpretation of sentence and text structures.

The system thus has to build up its endo-view on the grounds of the natural language descriptions of system-position and object-location relations it is exposed to. Although the system's perception is limited to its (formal) language processing, and its ability to act (and react) is restricted to pacewise linear movement, what makes it semiotic is that, whatever the system might gather from its environment, it will not apply any coded knowledge available prior to that process, but will instead be confined to its own (co- and contextually restricted) susceptibility and processing capabilities to (re-)organize the environmental data and to (re-)present the results in some dynamic structure which determines the system's knowledge (susceptibility), learning (change), and understanding (representation). This is based on the assumption that some deeper representational level or core structure might be identified as a common base for the different notions of meaning developed so far in theories of referential and situational semantics as well as in structural or stereotype semantics.

For the purpose of testing semiotic processes, their situational complexity has to be reduced by abstracting away irrelevant constituents, hopefully without oversimplifying the issue and trivializing the problem. Therefore, the propositional form of natural language predication will be used here only to control the format of the natural language training material, not, however, to determine the way it is processed to model understanding.

Illustrating an example situation, the reference plane (Fig. 2) shows two object-locations. These have (automatically) been described in a corpus of language expressions comprising some 12 432 word tokens of 26 word types in 2 483 sentences and 684 texts, generated according to the formal syntax and semantics specified for all possible system positions and orientations. This training set of language material was then exposed to the SCIP system, which perceived it as environmental data to be processed according to its system faculties as specified. It is worth noting here again that this processing is neither based on, nor does it involve, any knowledge of syntax or semantics on the system's side.
In the course of processing, the two-level consecutive mappings (Tab. 1, Fig. 1) result in the semantic hyper space (SHS), whose intrinsic structure reveals some properties which can be made visible in a multi-stage process:
First, applying methods of Kohonen maps [2] or, with comparable results, average linkage cluster analysis [7] allows structurally adjacent word-types (like object-label and predicate-label candidates) to be identified [11]. Second, their numerical hedge interpretation yields the distance values, and their directional core interpretations determine the regions of object locations relative to a centrally positioned system (Tab. 7), producing an intermediate representation of the system's own oriented view. Third, the Endo1i,j data (Tab. 7) serves as the base for a line- and column-wise transform which, according to the summation equation, results in a new mapping Endo2m,n (Tab. 8) imaging an orientation-independent representation of the system's endo-view of its environment. Fourth, this can be visualized in another format as a holistic representation of the referential plane, structured by a pattern of polygons which connect regions of denotational likelihood, or isoreferentials (Fig. 3).
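The first step, grouping structurally adjacent word-types, can be approximated by a plain average-linkage agglomeration over a pairwise distance matrix such as the ζ-distances. This is a generic sketch of the clustering technique, not the system's actual implementation, and the toy distance matrix is invented:

```python
def average_linkage(dist, n_clusters):
    """Agglomerate items by smallest average inter-cluster distance.

    dist: symmetric matrix of pairwise distances.
    Returns a list of clusters (lists of item indices) once
    n_clusters clusters remain.
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # average distance between clusters a and b
                d = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                d /= len(clusters[a]) * len(clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)   # merge the closest pair
    return clusters

# toy distances: items 0,1 are close (e.g. object-label candidates),
# items 2,3 are close (e.g. predicate-label candidates)
D = [[0.0, 0.2, 3.0, 3.1],
     [0.2, 0.0, 2.9, 3.0],
     [3.0, 2.9, 0.0, 0.3],
     [3.1, 3.0, 0.3, 0.0]]
print(average_linkage(D, 2))  # -> [[0, 1], [2, 3]]
```

With realistic data one would of course run this over the full vocabulary's meaning-point distances; the point here is only that adjacency in the SHS alone suffices to separate the candidate label classes.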
The matrix Endo2m,n (Tab. 8) contains the data for an external observer's image of the system's endo-view, as computed from the described object locations relative to system positions. The two-dimensional scattergram of Endo2 (Fig. 3) gives an overall picture of even referential likelihood by isoreferentials denoting potential object locations quite clearly, however fuzzily.
1Procedural models denote a class of models whose interpretation is not (yet) tied to the semantics provided by an underlying theory of the objects (or their expressions), but consists (so far) in the procedures and their algorithmic implementations, whose instantiations as processes (and their results) by way of computer programs provide the only means for their testing and evaluation. The lack of an abstract (theoretical) level of representation for these processes (and their results), apart from the formal notation of the underlying algorithms, is one of the reasons why fuzzy set and possibility theory [15] [16] and their logical derivatives were welcome to provide an open and new procedural format for computational approaches to natural language semantics, without obligation either to reject or to accept traditional formal and model-theoretic concepts.
2The concept of knowledge underlying its use here may be understood as follows: known refers to what has well-established (scientific, however controversial, but at least inter-subjective) models to deal with, whereas unknown refers to the lack of such models.