
- I. Introduction
- The role of natural language processing
- Linguistics and its structure
- What we mean by computational linguistics
- Word, what is it?
- The important role of the fundamental science
- Current state of applied research on Spanish
- Conclusions
- II. A historical outline
- The structuralist approach
- Initial contribution of Chomsky
- A simple context-free grammar
- Transformational grammars
- The linguistic research after Chomsky: valencies and interpretation
- Linguistic research after Chomsky: constraints
- Head-driven phrase structure grammar
- The idea of unification
- The Meaning ⇔ Text Theory: multistage transformer and government patterns
- The Meaning ⇔ Text Theory: dependency trees
- The Meaning ⇔ Text Theory: semantic links
- Conclusions
- III. Products of computational linguistics: present and prospective
- Classification of applied linguistic systems
- Automatic hyphenation
- Spell checking
- Grammar checking
- Style checking
- References to words and word combinations
- Information retrieval
- Topical summarization
- Automatic translation
- Natural language interface
- Extraction of factual data from texts
- Text generation
- Systems of language understanding
- Related systems
- Conclusions
- IV. Language as a Meaning ⇔ Text transformer
- Possible points of view on natural language
- Language as a bi-directional transformer
- Text, what is it?
- Meaning, what is it?
- Two ways to represent meaning
- Decomposition and atomization of meaning
- Not-uniqueness of Meaning ⇒ Text mapping: synonymy
- Not-uniqueness of Text ⇒ Meaning mapping: homonymy
- More on homonymy
- Multistage character of the Meaning ⇔ Text transformer
- Translation as a multistage transformation
- Two sides of a sign
- Linguistic sign
- Linguistic sign in the MMT
- Linguistic sign in HPSG
- Are signifiers given by nature or by convention?
- Generative, MTT, and constraint ideas in comparison
- Conclusions
- V. Linguistic models
- What is modeling in general?
- Neurolinguistic models
- Psycholinguistic models
- Functional models of language
- Research linguistic models
- Common features of modern models of language
- Specific features of the Meaning ⇔ Text Model
- Reduced models
- Do we really need linguistic models?
- Analogy in natural languages
- Empirical versus rationalist approaches
- Limited scope of the modern linguistic theories
- Conclusions
- Exercises
- Review questions
- Problems recommended for exams
- Literature
- Recommended literature
- Additional literature
- General grammars and dictionaries
- References
- Appendices: some Spanish-oriented groups and resources
The role of natural language processing
We live in the age of information. It pours upon us from the pages of newspapers and magazines, radio loudspeakers, TV and computer screens. Most of this information takes the form of natural language text. Even in the area of computers, the larger part of the information they manipulate nowadays takes the form of text. It looks as if the personal computer has mainly turned into a tool to create, proofread, store, manage, and search for text documents.
Our ancestors invented natural language many thousands of years ago for the needs of a developing human society. Modern natural languages develop according to their own laws, remaining in each epoch an adequate tool for human communication and for expressing human feelings, thoughts, and actions. The structure and use of a natural language are based on the assumption that the participants in a conversation share very similar experience and knowledge, as well as a similar manner of feeling, reasoning, and acting. The great challenge of intelligent automatic text processing is to use unrestricted natural language to exchange information with a creature of a totally different nature: the computer.
For the last two centuries, humanity has successfully coped with the automation of many tasks using mechanical and electrical devices, and these devices faithfully serve people in their everyday life. In the second half of the twentieth century, human attention turned to the automation of natural language processing. People now want assistance not only in mechanical, but also in intellectual efforts. They would like the machine to read an unprepared text, to test it for correctness, to execute the instructions contained in the text, or even to comprehend it well enough to produce a reasonable response based on its meaning. Human beings want to keep for themselves only the final decisions.
The need for intelligent automatic text processing arises mainly from the following two circumstances, both connected with the quantity of texts produced and used in the world today:
- Millions and millions of people dealing with texts throughout the world do not have enough knowledge and education, or simply the time and the wish, to meet the modern standards of document processing. For example, a secretary in an office cannot take into consideration each time the hundreds of various rules necessary to write a good business letter to another company, especially when he or she is not writing in his or her native language. It is simply cheaper to teach the machine once to do this work than to repeatedly teach every new generation of computer users to do it by themselves.
- In many cases, to make a well-informed decision or to find information, one needs to read, understand, and take into consideration a quantity of texts thousands of times larger than one person is physically able to read in a lifetime. For example, to find information on the Internet on, let us say, the expected demand for a specific product in the next month, a lot of secretaries would have to read texts for a hundred years without eating or sleeping, looking through all the documents where this information might appear. In such cases, using a computer is the only possible way to accomplish the task.
Thus, the processing of natural language has become one of the main problems in information exchange. The rapid development of computers in the last two decades has made it possible to implement many ideas for solving problems that no one could even imagine being solved automatically, say, 45 years ago, when the first computers appeared.
Intelligent natural language processing is based on the science called computational linguistics. Computational linguistics is closely connected with applied linguistics and with linguistics in general. Therefore, we shall first briefly outline linguistics as a science belonging to the humanities.