Training
Overview
In this course, you learn about data engineering on Google Cloud, the roles and responsibilities of data engineers, and how those map to offerings provided by Google Cloud. You also learn about ways to address data engineering challenges.
Prerequisites
Before starting this learning path, you should already have:
- Prior Google Cloud experience at the fundamental level using Cloud Shell and accessing products from the Google Cloud console.
- Basic proficiency with a common query language such as SQL.
- Experience with data modelling and ETL (extract, transform, load) activities.
- Experience developing applications using a common programming language such as Python.
Target audience
This course is aimed at data engineers, database administrators, and system administrators.
Objectives
By the end of this course, learners will:
- Understand the role of a data engineer.
- Identify data engineering tasks and core components used on Google Cloud.
- Understand how to create and deploy data pipelines of varying patterns on Google Cloud.
- Identify and utilize various automation techniques on Google Cloud.
Outline
Data Engineering Tasks and Components
- Explain the role of a data engineer.
- Understand the difference between a data source and a data sink.
- Explain the different types of data formats.
- Explain the storage solution options on Google Cloud.
- Learn about the metadata management options on Google Cloud.
- Understand how to share datasets with ease using Analytics Hub.
- Understand how to load data into BigQuery using the Google Cloud console or the gcloud CLI.
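As a brief, hedged illustration of the last point: loading data into BigQuery from the command line is commonly done with the `bq` tool (shipped with the Google Cloud SDK). This is only a sketch and needs an existing Google Cloud project; the dataset, table, and bucket names below are hypothetical placeholders.

```shell
# Sketch: load a CSV file from Cloud Storage into a BigQuery table.
# my_dataset, my_table, and the bucket path are placeholder names.
# --autodetect asks BigQuery to infer the table schema from the file.
bq load \
  --source_format=CSV \
  --autodetect \
  my_dataset.my_table \
  gs://my-bucket/data/events.csv
```

The same load can be performed interactively from the BigQuery page of the Google Cloud console, which the course covers alongside the command-line route.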
Data Replication and Migration
- Explain the baseline Google Cloud data replication and migration architecture.
- Understand the options and use cases for the gcloud command-line tool.
- Explain the functionality and use cases for Storage Transfer Service.
- Explain the functionality and use cases for Transfer Appliance.
- Understand the features and deployment of Datastream.
The Extract and Load Data Pipeline Pattern
- Explain the baseline extract and load architecture diagram.
- Understand the options of the bq command-line tool.
- Explain the functionality and use cases for BigQuery Data Transfer Service.
- Explain the functionality and use cases for BigLake as a non-extract-load pattern.
The Extract, Load, and Transform Data Pipeline Pattern
- Explain the baseline extract, load, and transform architecture diagram.
- Understand a common ELT pipeline on Google Cloud.
- Learn about BigQuery’s SQL scripting and scheduling capabilities.
- Explain the functionality and use cases for Dataform.
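To make the ELT idea above concrete, here is a minimal sketch of a transform step that runs inside BigQuery after the raw data has already been loaded, issued with the `bq query` command. All project, dataset, and table names are hypothetical, and running it requires a Google Cloud project.

```shell
# Sketch of an ELT transform step: raw data is assumed to already sit in a
# (hypothetical) raw_dataset.events table; this builds a daily summary table.
bq query --use_legacy_sql=false '
CREATE OR REPLACE TABLE analytics_dataset.daily_event_summary AS
SELECT
  event_type,
  COUNT(*) AS event_count
FROM raw_dataset.events
WHERE DATE(event_timestamp) = CURRENT_DATE()
GROUP BY event_type'
```

In practice such statements are typically managed and scheduled with BigQuery scheduled queries or Dataform rather than run by hand, which is what this module explores.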
The Extract, Transform, and Load Data Pipeline Pattern
- Explain the baseline extract, transform, and load architecture diagram.
- Learn about the GUI tools on Google Cloud used for ETL data pipelines.
- Explain batch data processing using Dataproc.
- Learn how to use Dataproc Serverless for Spark for ETL.
- Explain streaming data processing options.
- Explain the role Bigtable plays in data pipelines.
Automation Techniques
- Explain the automation patterns and options available for pipelines.
- Learn about Cloud Scheduler and Workflows.
- Learn about Cloud Composer.
- Learn about Cloud Run functions.
- Explain the functionality and automation use cases for Eventarc.
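As a hedged sketch of the Cloud Scheduler option above: a cron-style job can trigger a Workflows execution over HTTP. The job name, region, project, workflow name, and service account below are all placeholders, and the command assumes the Cloud Scheduler and Workflows APIs are enabled in a real project.

```shell
# Sketch: trigger a (hypothetical) workflow named etl-workflow every night
# at 03:00 by calling the Workflows executions API from Cloud Scheduler.
gcloud scheduler jobs create http nightly-pipeline \
  --location=europe-north1 \
  --schedule="0 3 * * *" \
  --uri="https://workflowexecutions.googleapis.com/v1/projects/my-project/locations/europe-north1/workflows/etl-workflow/executions" \
  --oauth-service-account-email=scheduler@my-project.iam.gserviceaccount.com
```

Cloud Composer, Cloud Run functions, and Eventarc offer alternative triggers for the same kind of pipeline, which this module compares.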
Exams and Assessments
There are no formal examinations for this course.
Hands-on Learning
This course includes hands-on labs that support each module's learning.
In addition, each module has a quiz to reinforce what you have learned.