Fundamentals of Dask Transcripts
Chapter: Dask Bag
Lecture: Dask bag to Dask DataFrame

0:00 Congrats on making it through that checkpoint. We've worked a bunch with Dask Bags. Sometimes, though,
0:07 we really want to be working with 'Dask DataFrames'. Okay, so the 'Dask Bag' can be used for simple analysis, but 'Dask DataFrames'
0:14 and 'Dask Arrays' are often more useful for complex operations. One way to think about it is that they're faster than Dask Bags
0:23 for the same reason that 'Pandas' and 'NumPy' are faster than pure Python. They also have more functionality suited to data analysis.
0:31 How do we do it? Well, we have a wonderful "to_dataframe()" method. Once again, we recreate our bag from before from our JSON files.
0:39 Then we apply the 'to_dataframe' method, and then let's check out the head of the data frame. All right, look at that. So, having done all that,
0:49 remember, it's good to be tidy in your workspace. So let's close the cluster with "client.close()". We'll be back for one last video
0:57 on Dask Bag to talk about Dask Bag's limitations and to provide some further references.

