Ebook: Scaling Machine Learning with Spark: Distributed ML with MLlib, TensorFlow, and PyTorch
Author: Adi Polak
- Genre: Computers // Cybernetics: Artificial Intelligence
- Tags: Machine Learning, Unsupervised Learning, Supervised Learning, Python, Clustering, Apache Spark, Feature Engineering, TensorFlow, Distributed Systems, Monitoring, Pipelines, Deployment, Hyperparameter Tuning, scikit-learn, Ensemble Learning, PyTorch, PySpark, Spark MLlib, Descriptive Statistics, Workflows, Data Ingestion, Data Preprocessing, MLflow, Petastorm
- Year: 2023
- Publisher: O'Reilly Media
- City: Sebastopol, CA
- Edition: 1
- Language: English
- Format: PDF
Learn how to build end-to-end scalable machine learning solutions with Apache Spark. With this practical guide, author Adi Polak introduces data and ML practitioners to creative solutions that supersede today's traditional methods. You'll learn a more holistic approach that takes you beyond specific requirements and organizational goals, allowing data and ML practitioners to collaborate and understand each other better.
Scaling Machine Learning with Spark examines several technologies for building end-to-end distributed ML workflows based on the Apache Spark ecosystem with Spark MLlib, MLflow, TensorFlow, and PyTorch. If you're a data scientist who works with machine learning, this book shows you when and why to use each technology.
You will:
• Explore machine learning, including distributed computing concepts and terminology
• Manage the ML lifecycle with MLflow
• Ingest data and perform basic preprocessing with Spark
• Explore feature engineering, and use Spark to extract features
• Train a model with MLlib and build a pipeline to reproduce it
• Build a data system to combine the power of Spark with deep learning
• Get a step-by-step example of working with distributed TensorFlow
• Use PyTorch to scale machine learning, and explore its internal architecture