Toward a Semantic General Theory of Everything
←
→
Page content transcription
If your browser does not render page correctly, please read the page content below
Toward a Semantic General Theory of Everything The notion of a universal semantic cognitive map is introduced as a general index- ing space for semantics, useful to reduce semantic relations to geometric and topo- logical relations. As a first step in designing the concept, the notion of semantics is operationalized in terms of human subjective experience and is related to the con- cept of spatial position. Then synonymy and antonymy are introduced in geometri- cal terms. Further analysis building on previous studies of the authors indicates that the universal semantic cognitive map should be locally low-dimensional. This essay ends with a proposal to develop a metric system for subjective experiences based on the outlined approach. We conclude that a computationally defined uni- versal semantic cognitive map is a necessary tool for the emerging new science of the mind: a scientific paradigm that includes subjective experience as an object of study. ! 2009 Wiley Periodicals, Inc. Complexity 15: 12–18, 2010 INTRODUCTION I magine that in a remote future all meaningful information is laid out in some ALEXEI V. SAMSONOVICH, REBECCA abstract space based on its semantics. Let us call this space the universal F. GOLDIN, AND GIORGIO A. ASCOLI semantic space, and the entire distribution of available chunks* of information in it (which may look like stars in the galaxy on the cover illustration) the univer- sal semantic cognitive map. The idea may sound intuitively familiar, as spatial Alexei V. Samsonovich and Giorgio A. analogies play a prominent role in human cognition and memory [1, 2]. However, Ascoli are at Krasnow Institute for the notion of a universal semantic cognitive map goes beyond mere indexing or Advanced Study, George Mason analogy. The map in and by itself can be used as a language and a complete repre- University, 4400 University Drive, sentation of all information that is ‘‘indexed’’ on it. Indeed, if each precise map Fairfax, Virginia 22030 location uniquely determines the associated meaning then, literally, one point of (e-mail: ascoli@gmu.edu) this map should be worth thousands of words. This representation of information, however, would make no practical sense, if map coordinates were assigned to Rebecca F. Goldin is at Department of chunks at random. Instead, as for a road atlas, a useful map would enable seman- Mathematics, George Mason Univer- tic inferences by virtue of geometric and topological inferences. Therefore, we sity, 4400 University Drive, Fairfax, assume that: Virginia 22030 c semantic relations among chunks are captured by geometric and topological relations among their images in the map; c points of the map and displacements on the map are associated with definite and consistent semantics; and *By a ‘‘chunk’’ we refer to a coherent representation of meaningful information, in any format and in any medium. Examples: an utterance, a document, a sculpture, a movie, a traffic light. A message of any kind is also a chunk. 12 C O M P L E X I T Y Q 2009 Wiley Periodicals, Inc., Vol. 15, No. 4 DOI 10.1002/cplx.20293 Published online 25 November 2009 in Wiley InterScience (www.interscience.wiley.com)
c the semantics of each map location rather than quantities of information. The next question refers to the can be obtained as a composition of For this reason, e.g., it seems natural to notion of semantic space that provides semantics of finite displacements expect that semantics should be meas- an infrastructure for the map. In cogni- along a path leading to that location ured by some sort of a vector rather tive psychology and linguistics, there from some reference point on the than a scalar. Even the task of finding were many limited attempts (and asso- map. the number and semantics of the vec- ciated criticisms: [6]) to make the idea tor components seems too challenging. of semantic space mathematically pre- If this hypothetical construct is ever We start with an observation that could cise. For example, Russell [7] intro- to materialize, it will in fact constitute simplify the task substantially: to con- duced a two-dimensional circumplex a General Theory of Everything, once a struct a meaningful multidimensional model to represent feelings, Osgood dream of a great-grandfather of Ijon semantic map, we only need a measure et al. [8] introduced a three-dimen- Tichy (a Stanislaw Lem’s character). In of semantic dissimilarity of any two sional semantic differential, Gärdenfors comparison, we have a rather modest chunks that can be interpreted as dis- [4] developed the notion of a concep- ambition to outline a more precise tance with a direction, plus a chunk tual space, etc. The state of the art in notion of this challenge from episte- with known semantic measure. The lat- linguistics related to the semantic mological and mathematical perspec- ter is easy to ascribe: it could be the space idea is represented by latent tives. In this essay, we first address the empty chunk, defined to have zero semantic analysis (LSA; [9]) and its critical question of how to measure semantic ‘‘position.’’ variations, including the probabilistic semantics practically. Next, we intro- To operationalize the notion of topic model [10], association spaces duce key distinctions and definitions semantic dissimilarity, we interpret [11], and other techniques. We eschew of semantic maps. Then, we illustrate semantics as potential average human any requirement to give a comprehen- these concepts with a simplified exam- experience. If a representation is sive review here. ple. Finally, we end with a discussion meaningful to a human subject then The growing success of LSA and of the potential impact that semantic the subject understands its meaning related examples supports our central map may have on the evolution of sci- (whatever this means) when he/she assumption that the idea of the univer- ence and the progress of humankind. becomes aware of it. Building on this sal semantic cognitive map can be premise, the notion of the meaning of given a precise mathematical sense in a chunk can be made precise by asso- general, and not only in those limited COGNITIVE-PSYCHOLOGICAL AND ciation with the mental state of aware- special cases. Specifically, we assume COGNITIVE-NEUROSCIENTIFIC ness of the chunk, as follows. Assume that it is possible to map any given set UNDERPINNINGS that, in principle, it is possible to mea- of chunks to points in a universal In this work, we treat semantics as sure the dissimilarity of any given pair semantic space endowed with a metric potential experience of a human sub- of mental states, using their neural that captures semantic relations ject. In general, the notion of semantics (e.g., functional magnetic resonance among chunks. and its definition have been a subject imaging) and/or behavioral (e.g., intro- of perpetual philosophical debate spective report) correlates. Of course, ([3, 4]; see also the last section) to KINDS AND PROPERTIES OF there are substantial technical issues which we are not offering a conclusion. SEMANTIC COGNITIVE MAPS here: how to filter out the noise and In this text, we use the word ‘‘map’’ to We will expand on our position in the last section with explanations and dis- effects of unrelated factors, how to mean both a map of sets from the cussion of related background issues. overcome cultural differences, etc. We chunks to a metric space, as well as its For now, we just note that our choice skip their discussion for now and image. Semantic relations among allows us to measure semantics by assume that there is a protocol accord- chunks captured by the map image measuring human experience and ing to which measurements can be may include dissimilarity, antonymy, therefore to define all necessary con- repeated with a sufficient number of and synonymy. For example, the ambi- cepts operationally. In this case, how subjects yielding consistent results. In ent space in which the map lives may should we measure the meaning of a this case, we could say that the notion have a metric that captures dissimilar- chunk? The measurement of the mean- of dissimilarity of two mental states is ity (called dissimilarity metric): the ing is not the measurement of the operationally defined, and so is the more dissimilar two chunks are the amount of information in the sense of notion of dissimilarity of two chunks. greater is the distance between them Shannon [5], because in the current At this point, it is only an assumption on the map. Naturally, zero distance framework, we need to distinguish that these general operational defini- should imply identity of meaning and among various qualities or qualia tions exist. vice versa. In this case, we call the Q 2009 Wiley Periodicals, Inc. C O M P L E X I T Y 13 DOI 10.1002/cplx
We also assume that with a suffi- ciently large set of chunks, elements of FIGURE 1 a path can be made small enough so that semantics of a path element would approach an ‘‘elementary semantic difference.’’ In other words, our hope is that as the difference of meaning of two chunks becomes smaller and smaller, it should also become simpler and easier to grasp. Indeed, when the necessary level of precision is specified, the difference between two similar items is usually easier to characterize than the differ- ence between items of different kind. Consider, for example, the following pairs of chunks: ‘‘a new car’’ and ‘‘a used car,’’ ‘‘the number 4’’ and ‘‘the Kinds of semantic cognitive map: strong (left: modified with permission from http://en. number 5,’’ ‘‘a new car with a scratch’’ wikipedia.org/wiki/File:Calabi-Yau.png) and weak (right). and ‘‘polysemy.’’ Consistent with this observation, we assume that the notion of an elementary semantic fla- vor makes sense, understood as the meaning of a finite semantic difference map a strong semantic cognitive map. rical framework outlined above in that cannot be further reduced to a As an alternative, a map may tend to which only the notion of dissimilarity composition of smaller differences. pull all synonyms together and all is initially assumed to be represented Based on this definition, elementary antonyms apart, with a soft constraint by distances? To the best of our knowl- semantic flavors must differ from each on the spread of the entire distribution edge, this is an open problem [13]. other qualitatively, implying that two and without representing dissimilarity Another issue concerns the seman- different flavors cannot correspond to per se [12]. In this case, we call the tics of the coordinates of the ambient the same direction in Rn. On the other map a weak semantic cognitive map space in which the map image lives. A hand, every direction in Rn associated (Figure 1). recognized limitation of LSA and with a pair of chunks that are sepa- A proposal to build a universal related approaches is that semantic rated along this direction should be semantic cognitive map capable of ac- dimensions cannot be clearly identi- representable by a composition of ele- commodating all possible human fied: ‘‘The typical feature of LSA is that mentary semantic flavors. Based on knowledge and experiences would dimensions are latent. That means these considerations, we assume that imply a huge program for many gener- there are no explicit interpretations for any point on the map has local co- ations to come. Today, we are far from the dimensions’’ (Ref. 14, p. 414). Is it ordinates associated with elementary the final goal, yet it could make sense possible to construct a semantic cogni- semantic flavors. to address one relatively simple ques- tive map, the dimensions of which We shall continue this consideration tion about the final product. Assuming have clearly identifiable, general in the next section using a simple that the universal semantic cognitive semantics? example (which is not representative of map can be constructed, what can we For now, we assume that the ambi- today’s state of the art in computa- say a priori about its geometric and ent space of the cognitive map is a tional or cognitive linguistics). Specifi- topological properties? vector space Rn, and therefore chunks cally, we restrict chunks to English We propose to narrow this question are associated with vectors under any documents, replace elementary seman- down to two issues that in our view semantic map of chunks to Rn. tic flavors with English words, neglect- can be addressed today. One has to According to the above assumptions, ing their ambiguity and context-de- do with geometric representation of the semantics of a given chunk are pendence, and assume that coordinates antonymy on a strong semantic cogni- given by a composition of semantics of given by these words on the map are tive map. How can one introduce syn- elements of a path leading to it from 0, global Cartesian coordinates. Although onymy and antonymy into the geomet- representing the empty chunk. these simplifications may introduce 14 C O M P L E X I T Y Q 2009 Wiley Periodicals, Inc. DOI 10.1002/cplx
inconsistencies, this toy example is c the coordinates of Rn have definite [17]. Our present goal is not to find the useful to illustrate our ideas. semantics, best representation of documents, but c A(X) records dissimilarities among rather to introduce a framework in elements of X by distances in Rn which such maps could be explored. A TOY EXAMPLE (with an appropriate choice of met- The first question that we address Let us start with a finite set X consist- ric): the more distant two docu- here is how to incorporate the notions ing of the set of documents currently ments are, the less similar are their of synonymy and antonymy into this existing in the literature. We are setting meanings, and vice versa. geometrical framework. This question out to present a mathematical frame- relates to the issue of translating rela- work in which X can be given geomet- It may not be possible to satisfy all tionships among words and docu- rical structure, and that structure can these properties at once, but we begin ments into relationships among docu- be studied to understand more about by noting that a part of this general ments. the structure of language and human framework has been used already in We posit the existence of two func- knowledge. As a first step in this pro- LSA. Attempts to improve on this anal- tions (defined operationally): cess, we would like to consider a map ysis will not only rely on changing pa- rameters but also on varying A and applicability g: X 3 W ? [0, 1], and A : X !
gA on A(X) 3 W, where A(X) is the dimension of the map by analyzing the physical world ‘‘out there.’’ This belief image. In particular, we let fA(p, w) 5 system of synonym–antonym relations. allows us to develop phenomenological f(x, w) and gA(p, w) 5 g(x, w), where p We have previously demonstrated knowledge of the World which, together 5 A(x). On each fiber, Fw, we then extend by example the possibility of simulta- with precise metrics, logic, and experi- fA(., w) and gA(., w) to smooth functions neous geometric representation of syn- mentation, leads to progress, including on an open neighborhood of the image onym and antonym relations in W tools to modify the World itself. Para- A(Fw). Such extensions are clearly not using a weak semantic cognitive map doxically, however, in this circle of em- unique, but we choose one to be as [12]. Our example constructed by nu- pirical science experiences themselves simple as possible, i.e. such that it merical optimization of a certain are left out as a hard problem. We does not vary wildly in regions not energy function was a map from W to recently proposed a scientific approach containing points in A(X). We abuse a vector space V: to study experiences, because humans notation and denote the extensions by observe them just as they observe other r:W!V (3) real phenomena [18]. Because experien- fA and gA as well. Thus, the domains of fA(., w) and gA(., w) are subsets of Rn. i.e., words were represented as vectors ces are subjective, their phenomenolog- Then, for each word w, we may associ- in V, such that almost every antonym ical knowledge must be developed via ate a vector to each point in its do- pair had an obtuse angle between their introspection rather than with psycho- main of applicability: the gradient of vectors, whereas almost every syno- physiological measures. To yield con- fA(., w) at that point. nym pair had an acute angle between sistent scientific results, however, those The notions of synonymy and their vectors (exceptions from this rule observations must become objective, in antonymy that conform to a common constituted only 1% of synonym and the sense that they can be reliably intuitive understanding can be now antonym pairs and could be due to repeated by, and quantitatively commu- introduced as follows. We compute gra- inconsistencies of the dictionary itself). nicated among, independent research- dient vectors at p for all words that In contrast to Rn, the vector space V is ers (see the cover illustration of this apply to documents at p. Then, we low dimensional, with most of the data issue). The key missing elements for define synonyms and antonyms to be captured in just four dimensions (95% this purpose are precise metrics and pairs of words such that gradient vec- of the variance of the data is captured measuring tools for human subjective tors at p are nearly parallel for syno- in three dimensions; the fifth and experiences, such as Cartesian coordi- nyms and nearly antiparallel for anto- higher dimensions have negligible var- nates of the mental space. Logic and nyms; the notions ‘‘nearly parallel’’ and iance). Moreover, the first three princi- experimentation, applied to measures ‘‘nearly antiparallel’’ are made precise pal components (PC) of the emergent of subjective experiences, will lead to with a threshold angle, which is a pa- distribution of words in V had clearly more progress, including tools to engi- rameter of the theory. Note that if A(X) identifiable semantics: ‘‘good-bad’’ (PC neer artificial minds. satisfies an appropriate density prop- 1), ‘‘calming-exciting’’ (PC 2), and A long time has passed since scien- erty near p in Rn then the gradients ‘‘open-closed’’ (PC 3). This result sug- tists were satisfied with the belief of a at p will be independent of the exten- gests that in the case outlined above of world made of particles and fields. sions of fA and gA to a neighborhood of a strong semantic cognitive map, it Today physicists are preoccupied with A(X). will be possible to select a semantically interpretation of far more abstract con- Assuming that it is possible to gen- meaningful local coordinate system cepts, while even the century-old quan- eralize this toy example to the univer- using these same general semantics tum mechanics lacks interpretation sal semantic cognitive map considered (possibly producing similar PC charac- [19, 20]. Brain scientists are similarly above, we relate the local coordinates teristics for the local density of the puzzled with developing a scientific on the universal semantic cognitive universal semantic cognitive map). description of the brain at the higher map to the above notions of synonyms cognitive level. In particular, there is no and antonyms. A pair of antonyms WHY SEMANTIC COGNITIVE general consensus on how to introduce should correspond to opposite direc- MAPPING MATTERS TO SCIENCE subjective experience into natural sci- tions along a coordinate that captures The brain creates the Universe as we ence [21]. The new scientific paradigm their semantics, two synonyms should know it, and the only Universe that we [18] should be based on the classical correspond to nearly the same direc- know directly: the mind. It all starts scientific method [22–24], while at the tion, and semantically different pairs of from humans being aware of their own same time it needs to be (a) semanti- antonyms should correspond to differ- subjective experiences. Scientists call cally oriented and (b) subject-oriented, ent local coordinates on the map. This some of these experiences ‘‘observa- i.e. it needs to include subjective expe- approach allows us to estimate the tions’’ and attribute their origin to the rience as a subject of study. The 16 C O M P L E X I T Y Q 2009 Wiley Periodicals, Inc. DOI 10.1002/cplx
explicit connection between semantics discussions of semantics in the litera- not add a restriction in this context. and subjectivity can be ensured by ture from ancient to modern times jus- Accordingly, we expect implications of selecting an operational definition of tify the freedom of our choice, which is semantic cognitive maps for the entire semantics based on subjectivity, and to define semantics in terms of subjec- natural science, not just for the new this is the choice we made in this work. tive experience: in other words, we say science of the mind. In conclusion, to To better understand what seman- that ‘‘meaningful’’ equals ‘‘meaningful develop a unified semantic theory ap- tics are, how to measure and to quan- to a subject’’. In this case, the notion of plicable to subjective experience and tify semantics, and how to build quan- semantics and the notion of potential to objective physical reality, we need titative theories in semantic terms, it experience become synonymous. to better understand semantics itself, would be useful if we could represent Although the new scientific para- from a conceptual to a computational semantics geometrically and apply digm needs to include subjective expe- level. This sort of knowledge can be powerful tools of geometry to (real rience as a topic of investigation, no extracted from natural language and rather than latent) semantic analysis. conceptual and mathematical tools are all documents, viewed as a collective The notion of semantics, or the mean- yet available to represent and study product of all human minds of all gen- ing of a representation, has many defi- subjective experiences quantitatively erations. nitions and interpretations. Major phil- with adequate accuracy, completeness, osophical, linguistic, and mathematical and generality. We need a powerful ap- ACKNOWLEDGMENTS movements have been founded on try- paratus to deal with semantics of expe- The authors thank Dr. Harold J. Moro- ing to articulate these notions [25, 26]. rience, and therefore, with semantics witz for his continuous encourage- The long and numerous controversial in general: the word ‘‘experience’’ does ment. REFERENCES 1. Lakoff, G.; Johnson, M. Metaphors We Live by, 2nd ed.; University of Chicago Press: Chicago, IL, 1980. 2. Samsonovich, A.V.; Ascoli, G.A. A simple neural network model of the hippocampus suggesting its pathfinding role in episodic memory retrieval. Learn Memory 2005, 12, 193–208. 3. Baldinger, K. Semantic Theory: Towards a Modern Semantics; Brown W.C., Wright, R., Transl.; St. Martin’s Press: New York, 1980. 4. Gärdenfors, P. Conceptual Spaces: The Geometry of Thought. MIT Press: Cambridge, MA, 2004. 5. Shannon, C.E. A mathematical theory of communication. Bell Syst Technical J, 1948, 27, 379–423, 623–656. 6. Tversky, A.; Gati, I. Similarity, separability, and the triangle inequality. Psychol Rev 1982, 89, 123–154. 7. Russell, J.A. A circumplex model of affect. J Pers Soc Psychol 1980, 39, 1161–1178. 8. Osgood, C.E.; Suci, G.; Tannenbaum, P. The Measurement of Meaning. University of Illinois Press: Urbana, IL, 1957. 9. Landauer, T.K.; Dumais, S.T. A solution to Plato’s problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological Rev 1997, 104, 211–240. 10. Griffiths, T.L.; Steyvers, M. A probabilistic approach to semantic representation. In: Proceedings of the 24th Annual Conference of the Cognitive Science Society; Gray, W.D.; Schunn, C.D., Eds.; Lawrence Erlbaum Associates: Hillsdale, NJ, 2002; pp 381–386. 11. Steyvers, M.; Shiffrin, R.M.; Nelson, D.L. Word association spaces for predicting semantic similarity effects in episodic memory. In: Experimental Cognitive Psychology and Its Applications: Festschrift in Honor of Lyle Bourne, Walter Kintsch, and Thomas Landauer; Healy, A., Ed. American Psychological Association: Washington, DC, 2004; pp 237–249. 12. Samsonovich, A.V.; Ascoli, G.A. Cognitive map dimensions of the human value system extracted from natural language. In: Advances in Artificial General Intelligence: Concepts, Architectures and Algorithms.Proceedings of the AGI Workshop 2006. Frontiers in Artificial Intelligence and Applications, vol. 157; Goertzel, B.; Wang, P., Eds. Amsterdam, The Netherlands: IOS Press, 2007; pp 111–124. ISBN 978–1-58603-758-1. 13. Schwab, D.; Lafourcade, M.; Prince, V. Antonymy and Conceptual Vectors. In: Proceedings COLING 2002: The 19th International Conference on Computational Linguistics. http://www.aclweb.org/anthology-new/C/C02/C02-1061.pdf; 2002. 14. Hu, X.; Cai, Z.; Wiemer-Hastings, P.; Graesser, A.C.; McNamara, D.S. Strengths, limitations, and extensions of LSA. In: Handbook of Latent Semantic Analysis; Landauer, T.K.; McNamara, D.S.; Dennis, S.; Kintsch, W., Eds.; Lawrence Erlbaum Associates: Mah- wah, NJ, 2007; pp 401–425. 15. Mandelbrot, B. The Fractal Geometry of Nature. Lecture notes in mathematics 1358. W. H. Freeman: New York, NY, 1982. ISBN 0716711869. 16. Martin, D.I.; Berry, M.W. Mathematical foundations behind latent semantic analysis. In: Handbook of Latent Semantic Analysis; Landauer, T. K.; McNamara, D.S.; Dennis, S.; Kintsch, W., Eds.; Lawrence Erlbaum Associates: Mahwah, NJ, 2007; pp 35–55. 17. Dennis, S. Introducing word order within the LSA framework. In: Handbook of Latent Semantic Analysis; Landauer, T.K.; McNamara, D.S.; Dennis, S.; Kintsch, W., Eds.; Lawrence Erlbaum Associates: Mahwah, NJ, 2007. pp 449–464. 18. Ascoli, G.A.; Samsonovich, A.V. Science of the conscious mind. Biol Bull 2008, 215, 204–215. 19. Bell, J.S. On the problem of hidden variables in quantum mechanics. Rev Mod Phys 1966, 38, 447. 20. Feynman, R.P. Simulating physics with computers. Int J Theoretical Phys 1982, 21, 467–488. Q 2009 Wiley Periodicals, Inc. C O M P L E X I T Y 17 DOI 10.1002/cplx
21. Chalmers, D.J. The conscious mind. Search of a Fundamental Theory. Oxford University Press: Oxford, UK, 1996. 22. Popper, K.R. The Logic of Scientific Discovery. Unwin Hyman: London, 1934/1959. 23. Kuhn, T.S. The Structure of Scientific Revolutions. University of Chicago Press: Chicago, IL, 1962. 24. Lakatos, I. Falsification and the methodology of scientific research programs. In: Criticism and the Growth of Knowledge; Lakatos, I.; Musgrave, A., Eds.; Cambridge University Press: Cambridge, UK, 1970. 25. Katz, J. Semantic Theory. Harper & Row: New York, 1972. 26. Putnam, H. Is semantics possible? In: Concepts: Core Readings; Margolis, E.; Laurence, S.; Eds. The MIT Press: Cambridge, MA, 1999; pp 177–187. 18 C O M P L E X I T Y Q 2009 Wiley Periodicals, Inc. DOI 10.1002/cplx
You can also read