Ebook: Mining Very Large Databases with Parallel Processing
- Tags: Data Structures, Cryptology and Information Theory; Document Preparation and Text Processing
- Series: The Kluwer International Series on Advances in Database Systems 9
- Year: 2000
- Publisher: Springer US
- Edition: 1
- Language: English
- Format: PDF
Mining Very Large Databases with Parallel Processing addresses the problem of large-scale data mining. It is an interdisciplinary text describing advances in the integration of three areas of computer science: 'intelligent' (machine learning-based) data mining techniques, relational databases, and parallel processing. The basic idea is to use concepts and techniques from the latter two areas, particularly parallel processing, to speed up and scale up data mining algorithms.
The book is divided into three parts. The first part presents a comprehensive review of intelligent data mining techniques such as rule induction, instance-based learning, neural networks and genetic algorithms. Likewise, the second part presents a comprehensive review of parallel processing and parallel databases. Each of these parts includes an overview of commercially available, state-of-the-art tools. The third part deals with the application of parallel processing to data mining. The emphasis is on finding generic, cost-effective solutions for realistic data volumes. Two parallel computational environments are discussed: the first excludes the use of a commercial-strength DBMS, and the second uses parallel DBMS servers.
It is assumed that the reader has knowledge roughly equivalent to a first degree (BSc) in the exact sciences, and is therefore reasonably familiar with basic concepts of statistics and computer science.
The primary audience for Mining Very Large Databases with Parallel Processing is industry data miners and practitioners in general who would like to apply intelligent data mining techniques to large amounts of data. The book will also be of interest to academic researchers and postgraduate students, particularly database researchers interested in advanced, intelligent database applications, and artificial intelligence researchers interested in industrial, real-world applications of machine learning.
Content:
Front Matter....Pages i-xiii
Introduction....Pages 1-4
Front Matter....Pages 5-5
Knowledge Discovery Tasks....Pages 7-17
Knowledge Discovery Paradigms....Pages 19-29
The Knowledge Discovery Process....Pages 31-40
Data Mining....Pages 41-50
Data Mining Tools....Pages 51-57
Front Matter....Pages 59-59
Basic Concepts on Parallel Processing....Pages 61-69
Data Parallelism, Control Parallelism, and Related Issues....Pages 71-78
Parallel Database Servers....Pages 79-86
Front Matter....Pages 87-87
Approaches to Speed Up Data Mining....Pages 89-108
Parallel Data Mining without DBMS Facilities....Pages 109-142
Parallel Data Mining with DBMS Facilities....Pages 143-172
Summary and Some Open Problems....Pages 173-179
Back Matter....Pages 181-208