Ebook: Syntactic Wordclass Tagging
- Tags: Computational Linguistics, Information Storage and Retrieval, Artificial Intelligence (incl. Robotics), Probability Theory and Stochastic Processes
- Series: Text Speech and Language Technology 9
- Year: 1999
- Publisher: Springer Netherlands
- Edition: 1
- Language: English
- Format: PDF
In both the linguistic and the language engineering community, the creation and use of annotated text collections (or annotated corpora) is currently a hot topic. Annotated texts are of interest for research as well as for the development of natural language processing (NLP) applications. Unfortunately, the annotation of text material, especially more interesting linguistic annotation, is as yet a difficult task and can entail a substantial amount of human involvement. All over the world, work is being done to replace as much as possible of this human effort by computer processing. At the frontier of what can already be done (mostly) automatically we find syntactic wordclass tagging, the annotation of the individual words in a text with an indication of their morphosyntactic classification.
This book describes the state of the art in syntactic wordclass tagging. As an attempt to give an overall view of the field, this book is of interest to (at least) two, possibly very different, types of reader. The first type consists of those people who are using, or are planning to use, tagged material and taggers. They will want to know what the possibilities and impossibilities of tagging are, but are not necessarily interested in the internal working of automatic taggers. This, on the other hand, is the main interest of our second type of reader, the builders of automatic taggers and other natural language processing software.
This book provides an in-depth discussion of the field of syntactic wordclass tagging, i.e. the annotation of the words in a text with tags indicating their syntactic properties. Represented are the viewpoints of the two main groups who take an interest in tagging: the users of tagged text and the developers of tagging software.
The book starts out by examining the field foremost from the user's point of view. After a brief historical overview, the nature and uses of tagging are discussed and current practice is described. Here the user will find what tagging is and the software developer what it is the user wants.
The book then switches to the other point of view and continues with a detailed explanation of the most common computational techniques for automatically tagging large amounts of text. Here the software developer finds information needed for the implementation of a tagger while the user gains insight into the possibilities and impossibilities of automatic tagging and how computer-provided tags should be interpreted.
Content:
Front Matter....Pages i-xvii
Front Matter....Pages 1-1
Orientation....Pages 3-7
A Short History of Tagging....Pages 9-21
The Use of Tagging....Pages 23-36
Tagsets....Pages 37-54
Standards for Tagsets....Pages 55-80
Performance of Taggers....Pages 81-94
Selection and Operation of Taggers....Pages 95-104
Front Matter....Pages 107-107
Automatic Taggers: An Introduction....Pages 109-115
Tokenization....Pages 117-133
Lexicons for Tagging....Pages 135-147
Standardization in the Lexicon....Pages 149-174
Morphological Analysis....Pages 175-205
Tagging Unknown Words....Pages 207-216
Hand-Crafted Rules....Pages 217-246
Corpus-Based Rules....Pages 247-262
Hidden Markov Models....Pages 263-284
Machine Learning Approaches....Pages 285-304
Back Matter....Pages 305-334