In linguistic semantics, cognitive psychology, and knowledge representation, most of the necessary data concerning lexical, semantic, and/or external-world information is still provided introspectively. Researchers explore (or have test persons explore) their own linguistic/cognitive capacities and memory structures, depicting their findings (or testing hypotheses about them) in various representational formats (lists, arrays, trees, nets, active networks, etc.). It is widely accepted that the model structures resulting from these analyses have a more or less ad hoc character and tend to be confined to their limited theoretical or operational performance within a specified subject domain and/or implemented system. Thus, these approaches can - by definition - only map what is already known to the analysts about the fragment of the world under investigation, not what might be conveyed about it in texts unknown to them. Being basically interpretative and lacking operational control, such knowledge representations are quite naturally restricted to undisputed informational structures which can be mapped in accepted and well-established (concept-hierarchical, logically deductive) formats. They also lack the flexibility and dynamics of more constructive model structures, which are needed if automatic meaning analysis and representation from input texts is to allow a component to build up and/or modify a system's own knowledge, however shallow and vague that may appear compared to human understanding.
Unlike these more orthodox lines of introspective data acquisition in meaning and knowledge representation research, the present approach is based on the algorithmic analysis of discourse that real speakers/writers produce in actual situations of performed or intended communication on a certain subject domain. The approach makes essential use of procedural means to map fuzzy word meanings and their connotative interrelations in the format of conceptual stereotypes. Their varying dependencies constitute dynamic dispositions2 that render only those concepts accessible which may - differently within differing contexts - be considered relevant under a specified perspective or aspect. Thus - under the notion of lexical relevance and semantic disposition - a new meaning relation may operationally be defined between elements in a conceptual representation system which may itself be reconstructed empirically from natural language discourse. Such dispositional dependency structures would seem to be an operational prerequisite to, and a promising candidate for, the simulation of content-driven (analogically-associative), instead of formal (logically-deductive), inferences in semantic processing.
After these (1.) introductory lines, and for illustrative purposes rather than for a detailed and qualifying discussion, some of the standard concept and/or word-meaning representational formats in memory models and knowledge systems (2.) will be compared in order to motivate our rather strict departure from them in developing and using (3.) some statistical means for the analysis of texts and for the representation of the data obtained, which will briefly be introduced as the semantic space model. Starting from the notion of priming and spreading activation in memory as a cognitive model of comprehension processes, we will (4.) deal with our procedural method of representing semantic dispositions by way of inducing a relation of lexical relevance among labeled concept representations in semantic space3. Concluding (5.), two or three problem areas connected with word meaning and concept processing will be touched upon which might be tackled anew and perhaps brought to a more adequate though still tentative solution under an empirically founded approach in procedural semantics.
In early artificial intelligence research a different type of knowledge representation was developed for question-answering systems. A fragment of the most common schema of the semantic network type [3] is shown in Fig. 2.2. Here again we have labeled concept nodes linked to one another by pointers representing labeled relations, which form a network instead of a tree structure. This enables the system to answer questions like "Is Susy a cat?" correctly by identifying the SUSY node, its ISA-relation pointer, and the CAT node. Moreover, the pointer structure allows for the processing of paths laid through the network, initiated by questions like "Susy, cat?", which will prompt the answer "Susy is a cat. Cat eats fish. Cat is an animal. Fish is an animal."
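The question-answering behaviour described above can be sketched as a small pointer structure. The node and relation labels (SUSY, CAT, ISA, EATS) follow the Fig. 2.2 fragment; the dictionary encoding and the traversal function are illustrative assumptions, not the original system's implementation.

```python
# A labeled semantic network as a map from (node, relation) to target node.
# Labels follow the Fig. 2.2 fragment; the encoding itself is assumed.
network = {
    ("SUSY", "ISA"): "CAT",
    ("CAT", "ISA"): "ANIMAL",
    ("CAT", "EATS"): "FISH",
    ("FISH", "ISA"): "ANIMAL",
}

def answer_isa(subject, obj):
    """Answer 'Is <subject> a(n) <obj>?' by following ISA pointers."""
    node = subject
    while (node, "ISA") in network:
        node = network[(node, "ISA")]
        if node == obj:
            return True
    return False

print(answer_isa("SUSY", "CAT"))     # True
print(answer_isa("SUSY", "ANIMAL"))  # True, via the transitive ISA chain
```

The transitive lookup along ISA pointers is what distinguishes the network from a flat attribute list: "Is Susy an animal?" succeeds although no direct SUSY-ANIMAL pointer exists.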
A schematic representation of concept relatedness as envisaged by cognitive theorists who work along more procedural lines of memory models [4] is shown in Fig. 2.3. Their distance-relational conception lends itself readily to the notion of stereotype representation for concepts that do not have intersubjectively identifiable sharp boundaries [5]. Instead of binarily decidable category membership, stereotypical concepts or prototypes are determined by way of their adjacency to other prototypes. Taken as a memory model, stimulation of a concept will initiate spreading activation that primes the more adjacent concepts more intensely than those farther away in the network structure, thus determining a realm of concepts related by their primed semantic affinity. In the given example, the stimulation of the concept node MANAGEMENT will activate that of BUSINESS first, then INDUSTRY and ORGANISATION with about the same intensities, then ADMINISTRATION, and so on, with the intensities decreasing as a function of the activated nodes' distances.

These three schemata of model structures - although obviously concerned with the simulation of symbol understanding processes - are designed to deal primarily with static aspects of meaning and knowledge. Thus, in interpreting input symbols/strings, pre-defined/stored meaning relations and constructions can be identified and their representations retrieved. Without their grounding made explicit and represented in that structure, however, possibly distorted or modified instantiations of such relations, or relevant supplementary semantic information, can hardly be recognized or provided within such representational systems.
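The priming mechanism sketched for Fig. 2.3 can be approximated in a few lines. Activation intensity is modelled here, by assumption, as a decreasing function of shortest-path distance from the stimulated node; the node labels follow the example, but the numerical distances are invented stand-ins for the figure's configuration.

```python
import heapq

# Symmetric distance-weighted edges; labels follow the Fig. 2.3 example,
# the distance values are illustrative assumptions.
edges = {
    "MANAGEMENT": {"BUSINESS": 1.0, "INDUSTRY": 2.0, "ORGANISATION": 2.0},
    "BUSINESS": {"MANAGEMENT": 1.0, "ADMINISTRATION": 1.5},
    "INDUSTRY": {"MANAGEMENT": 2.0},
    "ORGANISATION": {"MANAGEMENT": 2.0},
    "ADMINISTRATION": {"BUSINESS": 1.5},
}

def prime(start):
    """Spread activation from the stimulated node: intensity decreases
    with shortest-path distance (Dijkstra), here as 1 / (1 + d)."""
    dist = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist[node]:
            continue
        for nbr, w in edges.get(node, {}).items():
            nd = d + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(queue, (nd, nbr))
    return {n: 1.0 / (1.0 + d) for n, d in dist.items()}

activation = prime("MANAGEMENT")
# BUSINESS is primed most intensely; INDUSTRY and ORGANISATION equally;
# ADMINISTRATION, farther away, least of all.
```

With these toy distances the intensity ordering reproduces the example: BUSINESS before INDUSTRY/ORGANISATION before ADMINISTRATION.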
As the necessary data is not taken from natural language discourse in communicative environments but elicited in experimental settings, by exploring either one's own or the test persons' linguistically relevant cognitive and/or semantic capacities, usage similarities of different items and/or contextual variations of identical items are difficult to ascertain. This is rather unsatisfactory from the point of view of a linguist who considers his discipline an empirical one and, hence, holds that descriptive semantics ought to be based upon linguistic data produced by real speakers/hearers in factual acts of communicative performance, in order to let new meaning representations (or fragments of them) replace (or improve) older ones and so change/update a static memory structure.
The empirical analysis of discourse and the formal representation of vague word meanings in natural language texts as a system of interrelated concepts is based on the Wittgensteinian [8] notion of language games and their functions5. His assumption is that analysing a great number of texts for the terms' usage regularities will reveal essential parts of the concepts and hence of the meanings conveyed.
The statistics used so far for the systematic analysis not of propositional strings but of their elements, namely words in natural language texts, are basically descriptive. Developed from and centred around a correlational measure to specify intensities of co-occurring lexical items used in natural language discourse, these analysing algorithms allow for the systematic modelling of a fragment of the lexical structure constituted by the vocabulary employed in the texts, as part of the concomitantly conveyed world knowledge.
A correlation coefficient, appropriately modified for the purpose, has been used as a mapping function. It allows the relational interdependence of any two lexical items to be computed from their textual frequencies. Items which frequently co-occur in a number of texts will be positively correlated and hence called affined; items of which only one (and not the other) frequently occurs in a number of texts will be negatively correlated and hence called repugnant. Different degrees of word-repugnancy and word-affinity - indicated by numerical values ranging from -1 to +1 - may thus be ascertained without recourse to an investigator's or his test persons' word and/or world knowledge (semantic competence), based solely upon the usage regularities of lexical items observed in a corpus of pragmatically homogeneous texts, spoken or written by real speakers/hearers in actual or intended acts of communication (communicative performance).
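As a minimal sketch, the affinity/repugnancy measure can be approximated by a plain Pearson correlation of two items' per-text frequency profiles. The coefficient actually used in the project is a modified correlation coefficient not reproduced here, and the toy frequency counts below are invented for illustration.

```python
from math import sqrt

def correlation(freq_x, freq_y):
    """Pearson correlation (-1..+1) of two per-text frequency lists,
    one entry per text of the corpus K."""
    t = len(freq_x)
    mx, my = sum(freq_x) / t, sum(freq_y) / t
    cov = sum((a - mx) * (b - my) for a, b in zip(freq_x, freq_y))
    vx = sum((a - mx) ** 2 for a in freq_x)
    vy = sum((b - my) ** 2 for b in freq_y)
    return cov / sqrt(vx * vy)

# Items co-occurring across the same texts come out affined ...
print(correlation([3, 0, 5, 1], [4, 1, 6, 0]))  # close to +1
# ... items excluding one another come out repugnant.
print(correlation([3, 0, 5, 1], [0, 4, 0, 3]))  # close to -1
```

The sign of the coefficient thus encodes affinity versus repugnancy, and its magnitude the intensity of the usage regularity, without any appeal to an investigator's semantic competence.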
Let K be such a corpus consisting of t texts belonging to a specific language-game, i.e. satisfying the condition of pragmatic homogeneity, and let V be the vocabulary of i lexical entries x used in it.
The resulting system of sets of fuzzy subsets of the vocabulary represents a structured lexicon. It is a relational data structure which may be interpreted topologically as a hyperspace with a natural metric, called semantic space. Its linguistically labelled elements represent meaning points, and their mutual distances represent meaning differences. The position of a meaning point may be described by its semantic environment. This is determined by those other points in the semantic hyperspace which - within a given diameter - are most adjacent to the central one chosen for illustration, according to the following Euclidean metric
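The metric itself is not reproduced here; in its standard Euclidean form - with $y_{i,k}$ denoting, by assumption, the $k$-th coordinate of meaning point $x_i$ in the $n$-dimensional hyperspace - it would read:

```latex
\delta(x_i, x_j) \;=\; \left( \sum_{k=1}^{n} \bigl( y_{i,k} - y_{j,k} \bigr)^{2} \right)^{1/2}
```

Under this metric, small values of $\delta$ correspond to small meaning differences, so that a semantic environment collects exactly those points whose distance to the central point stays below the given diameter.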
Having seen that topological environments of that sort do in fact assemble meaning points of a certain semantic affinity solely by the performance of the text analysing algorithms and without any competent language user's interference, a number of questions arose whose answers should at least be mentioned:
A check of a great number of environments ascertained that they do in fact assemble meaning points of a certain semantic affinity. Further investigation revealed [9] that there are regions of higher point density in the semantic space, forming clouds and clusters. These were detected by multivariate and cluster-analyzing methods [10], which showed, however, that both the paradigmatically and the syntagmatically related items formed what may be named connotative clouds rather than what is known as semantic fields [11]. Although their internal relations appeared unspecifiable in terms of any logically deductive or concept-hierarchical system, their elements' positions revealed a high degree of stable structure, suggesting a regular form of content-dependent associative connectedness [12]. This gave rise to the idea of having the variable relevance of related meanings and/or concepts defined procedurally [13], [14], [15].
Taking up the heuristics provided by Spreading Activation Theory in semantic memory, cognitive structures, and concept representation as advanced by [16], [17], and [18], the notion of spreading activation can be employed not only to denote the activation of related concepts in the process of priming, studied in subsequent publications like [19] and [20], but - generically somewhat prior to that - may also signify the very procedure which induces these relations between concepts. Originally developed as a procedural model to cope with observed latencies of activated concepts in comprehension processes, priming and spreading activation are based on network-type models of world-knowledge structures as illustrated briefly above. Essentially defined by nodes, representing concepts, meanings, or objects, and pointers which relate them conceptually, semantically, or logically to one another, these formats have a considerable advantage over the semantic space structure outlined above: one of the problems of distance-like data structures in semantic processing is that - distance being a symmetric relation - well-known search strategies for retrieval, matching, and inferencing purposes cannot be applied, because these are based upon non-symmetric relations, as realized by pointer structures in well-known word meaning and/or world knowledge representations.
In order to make such procedures operate on the semantic space data, its structure has to be transformed into some hierarchical organisation of its elements. For this purpose, the semantic space model has to be re-interpreted as a sort of conceptual raw data and associative base structure. What at first appeared to be a disadvantage now turns out to be an advantage over more traditional formats of representation. Unlike those approaches, which have to presuppose the structural format of the semantic memory models that are to be tested in word recall and/or concept recognition experiments, the semantic space provides some of the necessary data for the procedural definition of dynamic, instead of static, model structures that allow variable stereotype instead of fixed categorial concept representations. Thus, the concept nodes, as abstract mappings of meanings of lexical items, are not just linked to one another according to what cognitive scientists supposedly know about the way conceptual information is structured in memory; rather, this very structure is already considered to be a dynamic format of stereotype concept organization. Defined as procedures that operate on the semantic space data, this is tantamount to a dynamic re-structuring of meaning points and - depending on the controlling parameters - the generation of paths between them along which - in case of priming - activation might spread whenever a meaning point is stimulated.
Unlike formats with ready-set and fixed relations among nodes, an algorithm has been devised which operates on the semantic space data structure as its base to induce dependencies between its elements, i.e. among subsets of the meaning points. The recursively defined procedure detects fragments of the semantic space according to the meaning point it is started with and according to the semantic similarities, i.e. the distance relations, it encounters during operation, constituting what we termed semantic relevance. Stop conditions may deliberately be formulated either qualitatively (naming a target point) or quantitatively (number of points to be processed).
Given one meaning point's position as a start, the algorithm will - other than in [10] and [11] - first list all its neighbouring points by increasing distances, second provide similar lists for each of these neighbours, and third prime the starting point as dominant node to mark the tree's root. Then, the algorithm's generic procedure will take the first entry from the first list, determine from the appropriate second list its most adjacent neighbour among those points already primed, in order to identify that neighbour as the ancestor (mother-node) to which the new descendant (daughter-node) is linked, whose label then gets deleted from the first list. Repeated successively for each of the meaning points listed and in turn primed in accordance with this procedure, the algorithm will select a particular fragment of the relational structure latently inherent in the semantic space data under a certain perspective, i.e. the aspect or initially primed meaning point the algorithm is started with. Working its way through and consuming all labeled points in the space structure - unless stopped under conditions of given target points, number of points to be processed, or a threshold of maximal distance - the algorithm transforms prevailing similarities of meanings, as represented by adjacent points, to establish - in the process of priming - a binary, non-symmetric, and transitive relation between them. This relation allows for the hierarchical re-organization of meaning points as descendant nodes under a primed head or root in an n-ary DDS-tree [12]. Weighted numerically as a function of a node's distance values and the level of its tree position, this measure either expresses a concept's dependencies as given by the root's descendants in that tree, or, inversely, it evaluates their criterialities for that concept as specified and determined by that tree's root.
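The generic procedure just described can be sketched as follows, assuming two-dimensional toy coordinates in place of real semantic space data: points are primed in order of increasing distance from the start point, and each newly primed point is linked as daughter to its most adjacent already-primed point.

```python
from math import dist

# Invented two-dimensional stand-ins for meaning point positions.
points = {
    "a": (0.0, 0.0), "b": (1.0, 0.5), "c": (2.0, 0.0),
    "d": (0.5, 2.0), "e": (2.5, 1.5),
}

def dds_tree(start, max_nodes=None):
    """Return {daughter: mother} links of the dependency tree rooted
    at `start`; `max_nodes` models the quantitative stop condition."""
    # Process points by increasing distance from the primed root.
    order = sorted((p for p in points if p != start),
                   key=lambda p: dist(points[start], points[p]))
    primed, tree = [start], {}
    for p in order:
        if max_nodes is not None and len(primed) >= max_nodes:
            break  # quantitative stop condition reached
        # Link the new daughter to its most adjacent primed point.
        mother = min(primed, key=lambda q: dist(points[p], points[q]))
        tree[p] = mother
        primed.append(p)
    return tree

print(dds_tree("a"))  # → {'b': 'a', 'c': 'b', 'd': 'b', 'e': 'c'}
```

Starting from a different point of the same configuration yields a similar but differing tree, which is exactly the perspective- or aspect-dependence the text describes.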
Without introducing the algorithms formally, some of their operative characteristics can well be illustrated in the sequel by a few simplified examples. Beginning with the schema of a distance-like data structure as shown in the two-dimensional configuration of 11 points labeled a to k (Fig. 4.1), the stimulation of three different starting points a, b, and c results in the dependency structures which the algorithm of least distance selects (Fig. 4.2): as distance detections (first row), as step-list representations of the process of selecting the points activated (second row), as their n-ary tree representations (third row), and finally as their transformations into binary-tree structures (fourth row) of the points respectively primed. It is apparent that stimulation of other points within the same configuration of basic data points will result in similar but nevertheless differing trees, depending on the aspect under which the structure is accessed, i.e. the point initially stimulated to start the algorithm with.
Applied to the semantic space data of 360 defined meaning points calculated from the text corpus of the 1964 editions of the German newspaper Die Welt, the Dispositional Dependency Structures (DDS) of AUFTRAG/order and GESCHAEFT/business are given in Figs. 4.3 and 4.4 as generated by the procedure described. Different stop conditions given for the generation of the DDS resulted in different trees: DDS⟨AUFTRAG⟩ qualitative stop by target node GESCHAEFT, grade 7, depth 13, 64 nodes; and DDS⟨GESCHAEFT⟩ quantitative stop by number of nodes to be processed, grade 4, depth 10, 60 nodes. In the DDS⟨AUFTRAG⟩ (Fig. 4.3) we find only one descendant (LEIT/lead) on level 1, three connotative alternates on level 2, one of which (ELEKTRON/electronic) has as many as 7 descendants on level 3, etc. In the DDS⟨GESCHAEFT⟩ (Fig. 4.4) there are two descendant connotative alternates (WERB/advertising; KENNTNIS/knowledge) on level 2, each of which has four descendants on level 3, etc. Attention is drawn to the dependency of the direct descendants (BITTE/request) → (PERSON/person) → (HAUS/house). As in DDS⟨AUFTRAG⟩, this dependency is found in exactly the same order in the DDS⟨GESCHAEFT⟩, but here it is situated farther from the root, starting only on the tree's sixth level instead of its third.
Figure 4.3: The Dispositional Dependency Structure (DDS) of AUFTRAG (=order)
Figure 4.4: The Dispositional Dependency Structure (DDS) of GESCHAEFT (=business)
To calculate such differences, a numerical measure of criteriality Cr_i of a node z_d with respect to its mother-node z_a under a given aspect i can be defined as a function of its distance value d2(z_d, z_a), the tree's root z_r, and the tree level g concerned.
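One plausible instantiation of such a measure - an assumption for illustration, not necessarily the formula used in the project - lets criteriality decrease both with a node's distance to its mother-node and with its level in the tree:

```python
def criteriality(distance, level, max_dist=2.0):
    """Hypothetical criteriality weight in (0, 1]: nodes close to
    their mother and near the root score highest. `max_dist` bounds
    the distances occurring in the tree (assumed parameter)."""
    return (1.0 - distance / (2.0 * max_dist)) / (level + 1)

# A close direct descendant of the root ...
print(criteriality(0.5, 0))  # 0.875
# ... versus a distant node deep in the tree.
print(criteriality(1.5, 3))  # 0.15625
```

Whatever its exact form, any such function makes the criterialities of one and the same dependency chain differ between trees, which is what distinguishes the BITTE → PERSON → HAUS chain on level 3 of one DDS from the same chain on level 6 of the other.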
1This paper (an intermediate version of which was read at ICCH/83) reports on the empirical foundations of a project in computational semantics on the automatic analysis and representation of natural language meanings in texts. This project was supported by the North Rhine-Westphalia Ministry of Science and Research under grant IV A2 FA 8600. Published in: Agrawal, J.C./Zunde, P. (Eds.): Empirical Foundations of Information and Software Science. New York/London (Plenum Press) 1985, pp. 273-291.
2Instead of formally introducing any of the algorithms developed and tested so far for the purposes at hand, an impression of their performance and application shall in the sequel be given by way of some - hopefully illustrative - figures and examples. For more detailed introductions the reader is referred to the bibliography at the end of this paper, where additional information on the MESY project in general and its procedural approach in particular may be found in a number of the author's recent publications.
3The system, comprising both the text analysing algorithm leading to the semantic space structure and the generative procedure operating on that structure to yield the DDS-trees, is implemented in FORTRAN, CDC-ASSEMBLER, and SIMULA on the CDC Cyber 175 of the Technical University of Aachen Computing Center.
4See also [7], where the principle of semantization is introduced as a process which can be emulated by procedural means to constitute meanings by consecutive restrictions of elementary choices among entities on the levels from pragmatics, via semantics and syntactics, down to morpho-phonetics. The elements and/or entities on each of the semiotic levels are in turn generated by an inversely operating procedure: recurrent combinations of elements are identified against those combinatorial possibilities not realized on that level, and these combinations constitute the new elements which may be combined on the next level, etc.
5"A meaning of a word is a kind of employment of it. For it is what we learn when the word is incorporated into our language. That is why there exists a correspondence between the concept rule and meaning. [...] Compare the meaning of a word with the function of an official. And different meanings with different functions. When language games change, then there is a change in concepts, and with the concepts the meanings of words change." [8] No. 61-65, p. 10e