
Data Handling in Python


Delivery format

Remote


Duration

3 days


Price

2263 €

This three-day course is aimed at those who want to learn how to use Python to work with and handle data. Combined with our Introduction to Data Science course, it sets you up to follow a Python learning journey into Data Science, Machine Learning, and Artificial Intelligence.

During the programme you will be introduced to Python and to the development environments and packages used for working with data, with a focus on NumPy, Pandas, Matplotlib, and Seaborn.

Along the way you will see how to clean and manipulate tabular data, apply simple statistical techniques and data visualisations, and learn how to control the flow of your program in order to automate processes.

Throughout the course you will take part in activities and discussions with one of our Data Science technical specialists, and complete technical lab activities to practise the techniques you have learnt and develop ideas for further practice.

  • To apply your knowledge of data practically, using Python to handle data in roles that involve data analysis, data engineering, data science, machine learning and AI, and data-related Ops roles.
  • If you are in a Software or IT related role where you work with Python, this course will support you in learning how to work with Data.
  • To ensure you have the necessary pre-requisite knowledge when combined with Introduction to Data Science should you wish to progress onto Data Science and Machine Learning with Python.

No prior experience with Python is necessary, though it is assumed that you will be familiar with core data concepts such as simple table structures and data types – all the pre-requisites you need are covered by our Data Fundamentals course.

1. Introduction to Programming for Data Handling

  • Describe the pros and cons of using programming languages to work with data
  • Identify the languages most suitable for data handling
  • Explain the challenges of using programming languages versus data analysis tools

2. Introduction to Python and IDEs

  • Describe the key attributes of the Python programming language.
  • Explain the role of the Jupyter IDE for Python programming.
  • Use the Jupyter IDE to write a basic Python program.
  • Write a program which uses string, integer, float and boolean data types.
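
As an illustration of the last objective, here is a minimal sketch of a program that uses all four basic types. The variable names are hypothetical, and the values are simply borrowed from this course listing; it is not taken from the course materials.

```python
# Each variable holds one of the four basic types named in the objective.
course_name = "Data Handling in Python"  # str
duration_days = 3                         # int
price_eur = 2263.0                        # float
is_remote = True                          # bool

# Dividing a float by an int gives a float.
price_per_day = price_eur / duration_days

# An f-string combines all four types into one line of text.
summary = (
    f"{course_name}: {duration_days} days, "
    f"{price_per_day:.2f} EUR per day, remote={is_remote}"
)
print(summary)
```

Pasting this into a single Jupyter cell and running it prints one summary line.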

3. Data Structures, Flow Control, Functions, and Basic Types

  • Construct collections to solve data problems.
  • Utilise selection and iteration syntax to control the flow of a Python program.
  • Write reusable functions which can be used to alter data & automate repetitive tasks.
  • Use Python's built-in open function to create, read, and edit files.
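
The objectives above can be sketched together in a few lines: a list as the collection, a `for` loop and an `if` for flow control, a reusable function, and the built-in `open` for file access. The function name `clean_names`, the sample data, and the file name are assumptions made for this example only.

```python
import os
import tempfile


def clean_names(names):
    """Strip whitespace, skip blank entries, and title-case each name."""
    cleaned = []
    for name in names:              # iteration over a list
        stripped = name.strip()
        if stripped:                # selection: ignore empty strings
            cleaned.append(stripped.title())
    return cleaned


raw_names = ["  ada lovelace", "", "grace hopper  "]
names = clean_names(raw_names)

# Write the cleaned names to a file in a temporary directory...
path = os.path.join(tempfile.mkdtemp(), "names.txt")
with open(path, "w", encoding="utf-8") as f:
    for name in names:
        f.write(name + "\n")

# ...then read them back, one name per line.
with open(path, encoding="utf-8") as f:
    lines_read = [line.rstrip("\n") for line in f]

print(lines_read)
```

Using `with` closes the file automatically, even if an error occurs inside the block.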

4. Mathematical and Statistical Programming with NumPy

  • Describe the core features of NumPy arrays.
  • Create, index, and manipulate NumPy arrays to solve data problems.
  • Use masking and querying syntax to retrieve desired values.
  • Use vectorised ufuncs.
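
A short sketch of what masking and vectorised ufuncs look like in practice, assuming a small made-up array of readings:

```python
import numpy as np

# Illustrative data only.
readings = np.array([12.5, 7.0, 19.2, 3.4, 15.1])

# A comparison produces a boolean mask the same shape as the array.
mask = readings > 10

# Indexing with the mask keeps only the elements where it is True.
high = readings[mask]

# A ufunc such as np.sqrt is applied element-wise without a Python loop.
roots = np.sqrt(readings)

# Arithmetic with a scalar is broadcast across the whole array.
centred = readings - readings.mean()

print(high)
print(roots)
print(centred)
```

The same operations written with explicit loops would be longer and, on large arrays, typically much slower.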

5. Introduction to Pandas

  • Create, manipulate, and alter Series and DataFrames with Pandas.
  • Define and change the indices of Series & DataFrames.
  • Use Pandas' functions and methods to change column types, compute summary statistics and aggregate data.
  • Read, manipulate, and write data from csv, xlsx, json and other structured file formats.
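
The sketch below touches each of these objectives on a tiny dataset. The column names and figures are invented for illustration, and the CSV is read from an in-memory string so the example runs without any files on disk.

```python
import io

import pandas as pd

# Made-up data standing in for a CSV file.
csv_text = """city,year,sales
Helsinki,2023,120
Tampere,2023,80
Helsinki,2024,150
Tampere,2024,95
"""

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# Change a column's type, then use another column as the index.
df["sales"] = df["sales"].astype("float64")
df = df.set_index("year")

# Summary statistics for one column.
print(df["sales"].describe())

# Aggregate: total sales per city.
totals = df.groupby("city")["sales"].sum()
print(totals)

# With no path given, to_csv returns the CSV text as a string.
csv_out = totals.to_csv()
```

Reading from Excel or JSON follows the same pattern with `pd.read_excel` and `pd.read_json`.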

6. Data Cleaning with Pandas

  • Identify missing data and apply techniques to deal with it.
  • Deduplicate, transform and replace values.
  • Use Pandas string methods (the .str accessor) to manipulate text data.
  • Write regular expressions which munge text data.
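
A minimal sketch of these cleaning steps on a deliberately messy table. The data is fabricated for the example, and filling missing scores with the column mean is just one possible strategy, not a recommendation.

```python
import numpy as np
import pandas as pd

# Illustrative data with stray whitespace, inconsistent case,
# a duplicated row, and missing values.
df = pd.DataFrame({
    "name": [" alice ", "BOB", "BOB", "carol"],
    "phone": ["040-123 4567", "0501234567", "0501234567", None],
    "score": [10.0, np.nan, np.nan, 7.0],
})

# Count missing values per column.
print(df.isna().sum())

# String methods via the .str accessor: trim and normalise case.
df["name"] = df["name"].str.strip().str.title()

# Replace missing scores with the mean of the known scores.
df["score"] = df["score"].fillna(df["score"].mean())

# Regular expression: remove every non-digit character from phone numbers.
df["phone"] = df["phone"].str.replace(r"\D", "", regex=True)

# Drop rows that are now exact duplicates and renumber the index.
df = df.drop_duplicates().reset_index(drop=True)
print(df)
```

The order matters here: the two "BOB" rows only become identical once the text and missing values have been normalised.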

7. Data Manipulation with Pandas

  • Construct Pivot tables in Pandas.
  • Manipulate time series data.
  • Stream data into Pandas to handle data size problems.
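
The three topics above can be sketched on one small dataset. The figures are invented, and reading in chunks via the `chunksize` argument of `read_csv` is shown here as one way to stream data; the course may cover other approaches.

```python
import io

import pandas as pd

# Made-up data standing in for a large CSV file.
csv_text = """date,region,sales
2024-01-05,North,10
2024-01-20,South,20
2024-02-03,North,30
2024-02-15,South,40
"""

# Streaming: chunksize makes read_csv yield DataFrames of at most
# two rows each, so the whole file never has to sit in memory at once.
total = 0
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=2):
    total += chunk["sales"].sum()
print(total)

# Load everything, parsing the date column into datetimes.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

# Pivot table: total sales per region.
pivot = df.pivot_table(index="region", values="sales", aggfunc="sum")
print(pivot)

# Time series: resample to month-start frequency and sum within each month.
monthly = df.set_index("date")["sales"].resample("MS").sum()
print(monthly)
```

Resampling requires a datetime index, which is why the date column is parsed and set as the index first.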

8. Methods for Visualising Data

  • Construct and tailor basic data visualisations using Matplotlib & Seaborn for both numeric & non-numeric data.
  • Meaningfully visualise aggregate data using Matplotlib and Seaborn.
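
As a rough sketch of both libraries side by side, the example below draws aggregated totals with Matplotlib and per-category means with Seaborn. The data and titles are invented, and the `Agg` backend line is only there so the script runs without a display; in a Jupyter notebook it can be removed.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; remove this line in a notebook

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Illustrative data: a categorical column and a numeric column.
df = pd.DataFrame({
    "region": ["North", "South", "North", "South"],
    "sales": [10, 20, 30, 40],
})

fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(8, 3))

# Matplotlib: aggregate first, then plot the totals as bars.
totals = df.groupby("region")["sales"].sum()
ax_left.bar(totals.index, totals.values)
ax_left.set_title("Total sales by region")
ax_left.set_ylabel("Sales")

# Seaborn: barplot aggregates for you (the mean by default).
sns.barplot(data=df, x="region", y="sales", ax=ax_right)
ax_right.set_title("Mean sales by region")

fig.tight_layout()
```

Passing `ax=` lets a Seaborn plot share a figure with plain Matplotlib axes, so both can be customised with the same methods.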