Implementing a Machine Learning solution with Azure Databricks
About this course
Course Overview
Azure Databricks is a cloud-scale platform for data analytics and machine learning. Data scientists and machine learning engineers can use Azure Databricks to implement machine learning solutions at scale.
Target Audience
This course is intended for:
- Data Scientists
- Data Engineers
- Data Analysts
- Machine Learning Engineers
- AI Developers
- Software Developers
- Cloud Solution Architects
- IT Managers and Decision Makers
- Business Intelligence Developers
- Anyone interested in learning about implementing machine learning solutions using Azure Databricks.
Course Objectives
During this course, you will learn to:
- Master Azure Databricks & Apache Spark architecture.
- Manage workspaces & clusters in Databricks.
- Utilize data storage options like Data Lake Storage, SQL Data Warehouse & Cosmos DB.
- Preprocess & clean data for machine learning models.
- Train & evaluate models for classification, regression, & clustering.
- Leverage AutoML for hyperparameter tuning.
- Deploy & manage models in production environments.
- Monitor & debug machine learning pipelines.
- Apply supervised & unsupervised learning techniques.
- Understand common machine learning algorithms & applications.
- Utilize Spark MLlib, TensorFlow, & PyTorch for model development.
- Perform feature engineering & dimensionality reduction.
- Implement data splitting, cross-validation, & evaluation techniques.
- Select & tune hyperparameters for optimal model performance.
- Interpret model results & apply explainability techniques.
- Deploy models as services using MLflow & Databricks Model Serving.
- Integrate models with web applications & other systems.
- Monitor & diagnose model performance in production.
- Understand the business value of machine learning & its applications.
- Learn best practices for building & deploying machine learning solutions.
- Prepare for data scientist & machine learning engineer roles in the cloud.
Course Content
Module 1: Explore Azure Databricks
Azure Databricks is a cloud service that provides a scalable platform for data analytics using Apache Spark.
- Provision an Azure Databricks workspace.
- Identify core workloads and personas for Azure Databricks.
- Describe key concepts of an Azure Databricks solution.
Module 2: Use Apache Spark in Azure Databricks
Azure Databricks is built on Apache Spark and enables data engineers and analysts to run Spark jobs to transform, analyze and visualize data at scale.
- Describe key elements of the Apache Spark architecture.
- Create and configure a Spark cluster.
- Describe use cases for Spark.
- Use Spark to process and analyze data stored in files.
- Use Spark to visualize data.
Module 3: Train a machine learning model in Azure Databricks
Machine learning involves using data to train a predictive model. Azure Databricks supports multiple commonly used machine learning frameworks that you can use to train models.
- Prepare data for machine learning.
- Train a machine learning model.
- Evaluate a machine learning model.
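The prepare, train, and evaluate steps listed above can be sketched in plain Python, independent of any framework. This is a minimal illustration, not the course's lab code: the toy dataset, the one-feature threshold "model", and the helper names (`train_test_split`, `train_threshold_classifier`, `evaluate`) are all assumptions made for the example. In practice a framework such as Spark MLlib or scikit-learn would take the place of these helpers.

```python
import random


def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle rows reproducibly and split them into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def train_threshold_classifier(train_rows):
    """Fit a one-feature classifier: the threshold is the midpoint of the class means."""
    pos = [x for x, label in train_rows if label == 1]
    neg = [x for x, label in train_rows if label == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def evaluate(threshold, rows):
    """Return accuracy, precision, and recall for the rule `x > threshold -> 1`."""
    tp = fp = tn = fn = 0
    for x, label in rows:
        predicted = 1 if x > threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return {
        "accuracy": (tp + tn) / len(rows),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical toy data: (feature, label) pairs, 10 rows per class.
data = [(0.5 * i, 0) for i in range(10)] + [(5.0 + 0.5 * i, 1) for i in range(10)]

train_rows, test_rows = train_test_split(data, test_fraction=0.3, seed=0)
threshold = train_threshold_classifier(train_rows)
metrics = evaluate(threshold, test_rows)
print(metrics)
```

Evaluating on rows held out from training is what makes the reported metrics an estimate of performance on unseen data rather than a measure of how well the model memorized its inputs.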
Module 4: Use MLflow in Azure Databricks
MLflow is an open-source platform for managing the machine learning lifecycle, and it is natively supported in Azure Databricks.
- Use MLflow to log parameters, metrics, and other details from experiment runs.
- Use MLflow to manage and deploy trained models.
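As a rough idea of what logging an experiment run looks like, the sketch below wraps the MLflow tracking calls `mlflow.start_run`, `mlflow.log_params`, and `mlflow.log_metrics` in a small helper. The helper name `log_training_run` and the example values in the usage note are hypothetical. It assumes the `mlflow` package is available, as it typically is on Databricks machine learning runtimes.

```python
def log_training_run(params, metrics, run_name=None):
    """Record one experiment run and return its MLflow run ID."""
    # Imported inside the function so this module still loads where MLflow is absent.
    import mlflow

    with mlflow.start_run(run_name=run_name) as run:
        mlflow.log_params(params)    # hyperparameters, e.g. {"max_depth": 5}
        mlflow.log_metrics(metrics)  # results, e.g. {"accuracy": 0.91}
        return run.info.run_id
```

A call such as `log_training_run({"max_depth": 5}, {"accuracy": 0.91}, run_name="baseline")` would create one run whose parameters and metrics can then be compared against other runs in the experiment view.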
Module 5: Tune hyperparameters in Azure Databricks
Tuning hyperparameters is an essential part of machine learning. In Azure Databricks, you can use the Hyperopt library to optimize hyperparameters automatically.
- Use the Hyperopt library to optimize hyperparameters.
- Distribute hyperparameter tuning across multiple worker nodes.
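The two objectives above might look roughly like the following sketch. The `objective` function is a hypothetical stand-in for "train a model and return its validation loss", and the search space, the `tune` helper, and its defaults are assumptions for illustration. The Hyperopt names used (`fmin`, `tpe.suggest`, `hp.loguniform`, `hp.quniform`, `Trials`, `SparkTrials`) come from the Hyperopt library. `SparkTrials` additionally needs a Spark environment such as a Databricks cluster.

```python
import math


def objective(params):
    """Hypothetical loss: smallest near learning_rate=0.1 and max_depth=6."""
    learning_rate = params["learning_rate"]
    max_depth = params["max_depth"]
    # A real objective would train a model here and return its validation loss.
    return (math.log10(learning_rate) + 1.0) ** 2 + 0.01 * (max_depth - 6) ** 2


def tune(max_evals=50, parallelism=None):
    """Search the space with Hyperopt; pass parallelism to spread trials over workers."""
    # Imported inside the function so `objective` can be used without Hyperopt installed.
    from hyperopt import Trials, fmin, hp, tpe

    space = {
        # loguniform takes the log of the bounds: learning_rate in [0.001, 1.0]
        "learning_rate": hp.loguniform("learning_rate", math.log(1e-3), math.log(1.0)),
        # quniform yields floats on a step-1 grid between 2 and 10
        "max_depth": hp.quniform("max_depth", 2, 10, 1),
    }
    if parallelism:
        from hyperopt import SparkTrials  # distributes trials across Spark workers
        trials = SparkTrials(parallelism=parallelism)
    else:
        trials = Trials()  # runs every trial on the driver
    return fmin(fn=objective, space=space, algo=tpe.suggest,
                max_evals=max_evals, trials=trials)
```

Calling `tune()` runs the search on a single node, while `tune(parallelism=4)` would evaluate up to four trials at a time on the cluster. In both cases `fmin` returns a dictionary of the best parameter values found.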
Module 6: Use AutoML in Azure Databricks
AutoML in Azure Databricks simplifies the process of building an effective machine learning model for your data.
- Use the AutoML user interface in Azure Databricks.
- Use the AutoML API in Azure Databricks.
Module 7: Train deep learning models in Azure Databricks
Deep learning uses neural networks to train highly effective machine learning models for complex forecasting, computer vision, natural language processing, and other AI workloads.
- Train a deep learning model in Azure Databricks.
- Distribute deep learning training by using the Horovod library.
Course Prerequisites
This course assumes that you have experience using Python to explore data and train machine learning models with common open-source frameworks such as scikit-learn, PyTorch, and TensorFlow.