Getting started with Dask: High Performance Data Science Course
Course Summary
This course will get you up to speed with Dask and show you how to easily convert pandas workloads to blazing-fast Dask clusters (locally across cores or scaled out across cloud servers). Future courses will cover Dask and Pythonic distributed computation in other settings, such as machine learning.
Source code and course GitHub repository
github.com/coiled/talkpython-getting-started-with-dask
What's this course about and how is it different?
This free course is a quick and no-fluff introduction to Dask. It's authored by the folks over at Coiled who offer Dask as a Service, including Matthew Rocklin, one of the co-creators of Dask. So you know you're getting definitive information from people who use Dask in practice.
What topics are covered
In this course, you will:
- Explore the problem solved by Dask: What is big data and how can you work with it?
- Set up your computer to run Dask locally in a Jupyter notebook
- Learn the Dask API and how to use it
- Convert pandas code to Dask code
- Analyze the NYC taxicab data set with Dask on a local cluster
- Scale that same computation to the cloud at coiled.io
- Connect to local and remote Dask cluster visualization and reporting dashboards
- And lots more
View the full course outline.
Who is this course for?
This course is for anyone with basic Python experience who would like to use Dask to process more data, faster, than pandas easily handles. You'll need to know about variables, modules, import statements, and similar basics. But the Python code used is not deep or advanced, so it should be broadly accessible to most.
Note: All software used during this course, including editors, Python language, etc., are 100% free and open source. You won't have to buy anything to take the course.
Get hands-on for almost every chapter
While watching videos is great to give you that high-level overview of what you need to know about a technology, nothing makes that skill your own like writing actual code and scaling data science computations in your notebooks.
In this course, you'll have access to all the source code at github.com/coiled/talkpython-dask-course. You're encouraged to follow along and play with the notebook throughout this course.
This course is delivered in very high resolution
This course is delivered in 1440p (4x the pixels of 720p). When you're watching the videos for this course, it will feel like you're sitting next to the instructor looking at their screen.
Every little detail, menu item, and icon is clear and crisp. Watch the introductory video at the top of this page to see an example.
Follow along with subtitles and transcripts
Each course comes with subtitles and full transcripts. The transcripts are available as a separate searchable page for each lecture. They also are available in course-wide search results to help you find just the right lecture.
Free office hours keep you from getting stuck
One of the challenges of self-paced online learning is getting stuck. It can be hard to get the help you need to get unstuck.
That's why at Talk Python Training, we offer live, online office hours. You drop in and join a group of fellow students to chat about your course progress and see solutions via screen sharing.
Just visit your account page to see the upcoming office hour schedule.
The time to act is now
If you work with data using pandas or other data science libraries, you owe it to yourself to see how to process significantly larger datasets and how to run Python computation free of the GIL's grip, across cores and all the way out to an entire cluster. This free, short course will get you up to speed in less than one hour!