Fundamentals of Dask Transcripts
Chapter: Dask-ML
Lecture: Dask in the cloud

0:00 Welcome back. Now it's time to talk about Dask in the cloud. What do I even mean by this, and why would we want to do something along
0:09 these lines when we've seen how we can leverage distributed computation on our local workstations?
0:13 The truth is, scaling to the cloud can help us a great deal. Larger workflows will benefit a huge amount from more computational resources,
0:25 such as large clusters, which you may not have locally.
0:29 So there are cloud services that can help you leverage these types of clusters such as Amazon Web Services, Google Cloud Platform,
0:38 Microsoft Azure, and so on. How do you get Dask up and running on these services?
0:44 There are different types of Dask cloud deployments, such as a Kubernetes integration,
0:49 a YARN integration, among many others. Now, there are a significant number to choose from, and you will need to know a bunch about containerization,
0:58 Dockerization, maybe Kubernetes, these types of things, in order to get this done.
1:03 There are also significant challenges such as environment and data management.
1:09 These involve questions such as: do all the machines have the same software installed?
1:15 Can many people share the same hardware? And where is the actual data? Another challenge that's involved with cloud deployments is
1:22 security and compliance, which your team leads and IT will be very much interested in. These are questions such as authentication:
1:30 do they have access to these machines? And security: what stops others from connecting and running arbitrary code as me or you,
1:37 the user? Now, there's another challenge, which is cost management, and this is huge. You want to know what will stop a novice from jumping on and
1:46 idling 100 GPUs. You want to track costs, so you want to know how
1:51 much money everyone is spending, and you want to optimize costs and optimize workflows for cost. So how do we profile and tune for cost?
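As a rough illustration of the cost-tracking question, here is a minimal sketch in plain Python. It is not part of the lecture and does not use any Dask or cloud-provider API. The `WorkerGroup` class, the helper names, the hourly rates, and the budget are all hypothetical values chosen only for the example.

```python
from dataclasses import dataclass


@dataclass
class WorkerGroup:
    """A set of identical cloud machines (every value here is hypothetical)."""
    name: str
    count: int
    hourly_rate_usd: float  # assumed price per machine-hour, not a real quote
    hours: float            # hours the machines were up, whether busy or idle


def group_cost(group):
    """Cost of one group: machines x price per hour x hours running."""
    return group.count * group.hourly_rate_usd * group.hours


def total_cost(groups):
    """Sum the cost over every group of machines."""
    return sum(group_cost(g) for g in groups)


def over_budget(groups, budget_usd):
    """True when the combined cost exceeds the given budget."""
    return total_cost(groups) > budget_usd


# Hypothetical usage: a modest CPU cluster next to 100 GPUs left idling.
groups = [
    WorkerGroup("cpu-workers", count=20, hourly_rate_usd=0.5, hours=3.0),
    WorkerGroup("idle-gpus", count=100, hourly_rate_usd=1.0, hours=8.0),
]

for g in groups:
    print(f"{g.name}: ${group_cost(g):.2f}")
print("over budget:", over_budget(groups, budget_usd=500.0))
```

Note that idle machines are billed the same as busy ones in this sketch, which is why the idling GPU group dominates the total.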
1:58 So if you're going to get up and running on the cloud, these are the types of questions that you'll need to answer.
2:04 So I gave a talk at the Dask Distributed Summit in 2021 about getting Dask working on the cloud and hoping to get Dask available to everyone.
2:13 And I encourage you to check that out at "bit.ly/task" for everyone if
2:17 you're interested. But what's happening next is we're going to jump into a notebook and
2:22 check out how to get Dask up and running on the cloud with a particular service called Coiled. And, disclaimer: I work for Coiled and I love it a lot.
2:30 I'll see you in the notebook.

