1. Based on statistical means for the empirical analysis and formal representation of vague word meanings in natural language texts, procedures have been devised which allow the systematic modelling of a fragment of the lexical structure, as constituted by the vocabulary employed in the texts and the components of linguistic meaning thereby conveyed [3]. The coefficients applied map lexical items onto fuzzy subsets of the vocabulary according to the numerically specified regularities with which these items have been used in the discourse analysed. The resulting system of sets of fuzzy subsets [4] is a data structure which may be interpreted topologically as a hyperspace with a natural metric. Its elements are abstract objects representing meaning points, and the distances between them represent their mutual meaning differences. They form discernible clouds and clusters whose meanings are measured and mapped as a composite function of any one lexical item's collection of differences of usage regularities, calculated against those of all other items occurring in the texts analysed [5]. Thus, the analysing algorithm takes natural language discourse from any specified subject domain as input and produces as output a distance-like data structure (semantic space) of linguistically labeled elements (meaning points) whose topologies (position, adjacency, environment, etc.) reveal associative properties of the conceptual prototypes according to which lexical items have been employed in the texts to form stereotypical linguistic meanings.
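The two-stage mapping described above (usage regularities first, then differences of usage regularities) can be illustrated with a minimal sketch. The text does not name the coefficients actually applied, so the choices here are assumptions made only for illustration: a Pearson correlation over per-text occurrence counts stands in for the usage regularity, and Euclidean distance stands in for both the difference measure and the metric of the semantic space. The function names and the toy corpus are likewise hypothetical.

```python
import math

def pearson(x, y):
    """Correlation of two count vectors; 0.0 if either is constant (assumed measure)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def euclid(u, v):
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def semantic_space(texts):
    """Return the vocabulary and a dict (item, item) -> distance between meaning points."""
    vocab = sorted({w for t in texts for w in t.split()})
    # Occurrence counts of every item in every text.
    counts = {w: [t.split().count(w) for t in texts] for w in vocab}
    # Stage 1: usage regularities of each item against all items.
    usage = {w: [pearson(counts[w], counts[v]) for v in vocab] for w in vocab}
    # Stage 2: a meaning point is the item's collection of differences
    # of usage regularities against those of all other items.
    points = {w: [euclid(usage[w], usage[v]) for v in vocab] for w in vocab}
    # Distances between meaning points form the semantic space.
    dist = {(w, v): euclid(points[w], points[v]) for w in vocab for v in vocab}
    return vocab, dist

# Hypothetical toy corpus: four very short "texts".
texts = [
    "industrie wirtschaft arbeit",
    "industrie wirtschaft",
    "kunst musik",
    "kunst musik arbeit",
]
vocab, dist = semantic_space(texts)
```

In this toy corpus, items with identical distributions across the texts coincide as meaning points, while items used in different texts lie further apart. A realistic corpus would require tokenisation, lemmatisation, and far more text than shown here.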
2. In order to cope with phenomena of semantic efficiency, such as usage similarities of different items and/or contextual variations of identical items, the present model separates the format of a basic (stereotype) meaning representation system from its latent (dependency) relational organization by means of variable conditional constraints to be modeled. Whereas the former is a rather static, topologically structured (associative) memory representing the data that text analysing algorithms provide, the latter can be characterized as a collection of dynamic processes which re-organize these data under various principles. Unlike inefficient meanings, which allow declarative knowledge to be characterized by unconditional constraints and represented in pre-defined structures like semantic networks, efficient meanings of contextual knowledge are heavily dependent on the communicative situations and vary according to the conditional constraints concerned. One of these constraints can be formulated as, and modelled by, a generative algorithm that induces a relation of lexical relevance which - under any item's perspective - will produce varying dependencies, called semantic dispositions, whenever and only when it is executed on changing data [6].
3. This is achieved by a recursively defined procedure which operates on the semantic space structure. Given one meaning point's position as a start, the algorithm will work its way through all labeled points in the semantic space - unless stopped under conditions of a given target node, number of nodes to be processed, or threshold of maximal distance - transforming prevailing similarities of meanings as represented by adjacency of points to induce a binary, non-symmetric, and transitive relation between them. This relation allows for the hierarchical reorganization of meaning points as nodes under a primed head in an n-ary tree or dispositional dependency structure (DDS). Weighted numerically as a function of a node's distance values and its level and position in the tree, this relation either expresses the head-node's meaning-dependencies on the daughter-nodes or inversely their meaning-criterialities under the specific aspect determined by that head [7]. To illustrate the feasibility of the generative procedure operating on the semantic space structure to yield DDS-trees, Fig. 1 shows the linguistic meaning of the lexical item INDUSTRIE in the format of its semantic dispositions. These are constrained by those other meanings/items that proved to be relevant (first value: distance/second value: criteriality) according to their usages in a corpus of German newspaper texts ("Die Welt", 1964).
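A minimal sketch of such a tree-building procedure is given below. The text does not spell out the attachment rule or the criteriality weighting, so the following is an assumption made for illustration only: points are visited in order of increasing distance from the head, and each is attached as a daughter of the nearest node already in the tree. The sketch is iterative rather than recursive, it omits the numerical weighting, and the name `build_dds`, its parameters, and the toy positions are hypothetical. The three stopping conditions mirror those named in the text.

```python
def build_dds(dist, labels, head, target=None, max_nodes=None, max_distance=None):
    """Return (mother, level) dicts describing a tree rooted at `head`.

    dist maps (a, b) pairs of labels to distances in the semantic space.
    """
    mother = {head: None}   # node -> mother node (None for the head)
    level = {head: 0}       # node -> depth in the tree
    # Visit the remaining points by increasing distance from the head.
    others = sorted((l for l in labels if l != head), key=lambda l: dist[(head, l)])
    for node in others:
        if max_distance is not None and dist[(head, node)] > max_distance:
            break           # threshold of maximal distance reached
        if max_nodes is not None and len(mother) >= max_nodes:
            break           # number of nodes to be processed reached
        # Assumed rule: attach to the nearest node already in the tree.
        nearest = min(mother, key=lambda m: dist[(m, node)])
        mother[node] = nearest
        level[node] = level[nearest] + 1
        if node == target:
            break           # given target node reached
    return mother, level

# Hypothetical toy space: four points on a line, distance = absolute difference.
position = {"A": 0.0, "B": 1.0, "C": 2.0, "D": 10.0}
labels = list(position)
toy_dist = {(a, b): abs(position[a] - position[b]) for a in labels for b in labels}

mother, level = build_dds(toy_dist, labels, head="A")
```

Because the relation depends on the chosen head, running the same function with a different `head` on the same distances generally yields a different tree, which is the perspective-dependence the dispositional structures are meant to capture.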
1 Published in: Bahner, W./Schildt, J./Viehweger, D. (Eds.): Proceedings of the XIV. International Congress of Linguists 1987, Volume II, Berlin (Akademie-Verlag) 1990, pp. 1233-1235.