Ebook: Data Science on the Google Cloud Platform: Implementing End-to-End Real-Time Data Pipelines: From Ingest to Machine Learning
Author: Valliappa Lakshmanan
- Genre: Computers // Cybernetics: Artificial Intelligence
- Tags: Google Cloud Platform, Machine Learning, Data Science, Python, Apache Spark, Spark ML, Feature Engineering, Keras, TensorFlow, Pipelines, MapReduce, Hyperparameter Tuning, Logistic Regression, Dashboards, Google BigQuery, Google Dataflow, Google Pub/Sub, MLOps, Data Ingestion, Data Exploration, XGBoost, Google Cloud Dataproc, Google Vertex AI
- Year: 2022
- Publisher: O'Reilly Media
- City: Sebastopol, CA
- Edition: 2
- Language: English
- Format: PDF
Learn how easy it is to apply sophisticated statistical and machine learning methods to real-world problems when you build using Google Cloud Platform (GCP). This hands-on guide shows data engineers and data scientists how to implement an end-to-end data pipeline with cloud-native tools on GCP.
Throughout this updated second edition, you'll work through a sample business decision by employing a variety of data science approaches. Follow along by building a data pipeline in your own project on GCP, and discover how to solve data science problems in a transformative and more collaborative way.
You'll learn how to:
• Employ best practices in building highly scalable data and ML pipelines on Google Cloud
• Automate and schedule data ingest using Cloud Run
• Create and populate a dashboard in Data Studio
• Build a real-time analytics pipeline using Pub/Sub, Dataflow, and BigQuery
• Conduct interactive data exploration with BigQuery
• Create a Bayesian model with Spark on Cloud Dataproc
• Forecast time series and do anomaly detection with BigQuery ML
• Aggregate within time windows with Dataflow
• Train explainable machine learning models with Vertex AI
• Operationalize ML with Vertex AI Pipelines