Overview
Azure Databricks is a cloud-scale platform for data analytics and machine learning. This course equips data scientists and machine learning engineers with the skills to implement scalable data processing and machine learning solutions using Azure Databricks. Participants will explore key concepts, including Apache Spark for data analytics, model training and tuning, AutoML, and MLflow for experiment tracking. The course emphasises data transformation, feature engineering, and model deployment. Through hands-on labs, learners will gain practical experience in building, training, and deploying models in a collaborative environment.
Prerequisites
Participants should have:
- Experience using Python to explore data and train machine learning models.
- Familiarity with common open-source frameworks such as Scikit-Learn, PyTorch, and TensorFlow.
- Recommended: Completion of the "Create machine learning models" learning path before starting this course.
Target audience
This course is designed for:
- Data scientists and machine learning engineers looking to scale their ML solutions.
- AI and data professionals working with large-scale data analytics and processing.
- Teams responsible for implementing machine learning workflows in Azure.
Objectives
By the end of this course, learners will be able to:
- Describe the capabilities of Azure Databricks and its role in both data analytics and machine learning.
- Use Apache Spark in Azure Databricks to process, transform, and analyse large-scale data.
- Train and evaluate machine learning models within Azure Databricks.
- Leverage MLflow for experiment tracking, model registration, and deployment.
- Optimise hyperparameters using Hyperopt and review trial results.
- Implement AutoML using both the Azure Databricks user interface and code-based approaches.
- Manage machine learning workflows in production, including data automation, versioning, and deployment strategies.
Outline
Explore Azure Databricks
- Overview of Azure Databricks as a cloud-scale platform for data analytics and machine learning
- Key concepts and workloads in Azure Databricks
- Data governance using Unity Catalog and Microsoft Purview
- Hands-on exercise: Exploring Azure Databricks
Use Apache Spark in Azure Databricks
- Introduction to Apache Spark and its role in large-scale data analytics
- Creating and managing Spark clusters
- Using Spark notebooks to process and transform large datasets
- Working with structured and unstructured data files in Spark
- Visualising data using Spark and Databricks
- Hands-on exercise: Using Spark for data analytics
Train a machine learning model in Azure Databricks
- Principles of machine learning and predictive modelling
- Preparing data for machine learning, including feature engineering and transformations
- Training machine learning models using Scikit-Learn, PyTorch, and TensorFlow
- Evaluating model performance using standard machine learning metrics
- Hands-on exercise: Training a machine learning model
Use MLflow in Azure Databricks
- Introduction to MLflow and its role in the machine learning lifecycle
- Running experiments and tracking performance metrics with MLflow
- Registering, serving, and managing models using MLflow
- Deploying trained models for inference within Databricks
- Hands-on exercise: Using MLflow for model management
Tune hyperparameters in Azure Databricks
- Understanding hyperparameter tuning and its impact on machine learning models
- Using Hyperopt for automated hyperparameter tuning in Azure Databricks
- Reviewing and analysing Hyperopt trials for optimisation insights
- Scaling Hyperopt trials for improved performance
- Hands-on exercise: Tuning model hyperparameters using Hyperopt
Use AutoML in Azure Databricks
- Overview of AutoML and its benefits in machine learning
- Running AutoML experiments via the Azure Databricks user interface
- Using Python code to execute AutoML workflows
- Comparing AutoML results with traditional model development
- Hands-on exercise: Using AutoML for machine learning model development
Train deep learning models in Azure Databricks
- Fundamentals of deep learning and neural networks
- Training deep learning models using PyTorch in Databricks
- Using TorchDistributor for distributed deep learning model training
- Deploying deep learning models for real-world AI tasks
- Hands-on exercise: Training and optimising deep learning models in Databricks
Manage machine learning in production with Azure Databricks
- Automating data transformations and machine learning workflows in Databricks
- Exploring model development, versioning, and lifecycle management
- Deploying models for real-time inference and decision-making
- Monitoring deployed models for performance and drift detection
- Hands-on exercise: Managing a machine learning model in production
Exams and assessments
This course does not include formal exams. Participants will complete interactive labs and knowledge checks to reinforce learning outcomes.
Hands-on learning
This course includes:
- Hands-on labs for data processing, model training, hyperparameter tuning, and model deployment.
- Practical exercises using Apache Spark, MLflow, AutoML, and Hyperopt.
- Real-world case studies on implementing scalable machine learning solutions.