Ebook: Data Mining with Rattle and R: The Art of Excavating Data for Knowledge Discovery
Author: Graham Williams
- Tags: Statistics for Engineering, Physics, Computer Science, Chemistry and Earth Sciences
- Series: Use R!
- Year: 2011
- Publisher: Springer-Verlag New York
- Edition: 1
- Language: English
- Format: PDF
Data mining is the art and science of intelligent data analysis. By building knowledge from information, data mining adds considerable value to the ever-increasing stores of electronic data that abound today. In performing data mining, many decisions need to be made regarding the choice of methodology, data, tools, and algorithms.
Throughout this book, the reader is introduced to the basic concepts and some of the more popular algorithms of data mining. With a focus on the hands-on, end-to-end process for data mining, Williams guides the reader through various capabilities of the easy-to-use, free, and open source Rattle Data Mining Software, built on the sophisticated R Statistical Software. The focus on doing data mining rather than just reading about data mining is refreshing.
The book covers data understanding, data preparation, data refinement, model building, model evaluation, and practical deployment. The reader will learn to rapidly deliver a data mining project using software easily installed for free from the Internet. Coupling Rattle with R delivers a very sophisticated data mining environment with all the power, and more, of the many commercial offerings.
Data mining and analytics are the foundation technologies for the new knowledge-based world, where we build models from data and databases to understand and explore our world. Data mining can improve our business, our government, and our lives, and with the right tools anyone can begin to explore this new technology on the path to becoming a data mining professional. This book aims to get you into data mining quickly. Load some data (e.g., from a database) into the Rattle toolkit and within minutes you will have the data visualised and some models built. This is the first step in a journey to data mining and analytics.
The book encourages the concept of programming by example and programming with data: more than just pushing data through tools, this means learning to live and breathe the data, and sharing the experience so that others can copy and build on what has gone before. It is accessible to many readers, not just those with strong backgrounds in computer science or statistics. Details of some of the more popular algorithms for data mining are explained simply and, more importantly, clearly. Technology for transforming a database into knowledge through data mining and machine learning is now readily accessible.