Fundamentals of Dask Transcripts
Chapter: Dask Bag
Lecture: Dask bag limitations

0:00 To wrap up on Dask Bag, I'd like to tell you about some of its limitations.
0:06 So, firstly, Dask Bag doesn't always perform that well on computations that include inter-worker communication,
0:11 which is due to restrictions in the default multiprocessing scheduler, and we'll see this in the next chapter. On top of that, Bag
0:19 operations are slower than Array and DataFrame computations, as we saw in the previous video, just as
0:26 Python, of course, is slower than NumPy or Pandas for these types of operations. groupby is slow and you should use foldby if possible,
0:34 as we've also already discussed. On top of this, note that Bags are immutable, so you cannot change individual elements.
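To make the last two points concrete, here is a minimal sketch, not taken from the course notebook. The records and field names are invented for illustration, and the synchronous scheduler is chosen only to keep the example small. It uses foldby for a per-key total, and it shows that "changing" elements really means building a new Bag with map while the original stays as it was.

```python
import dask.bag as db

# Hypothetical records, invented for this illustration.
records = [
    {"name": "Alice", "amount": 100},
    {"name": "Bob", "amount": 200},
    {"name": "Alice", "amount": 50},
    {"name": "Bob", "amount": 25},
]
bag = db.from_sequence(records, npartitions=2)

# foldby reduces each partition per key with `binop`, then merges the
# per-partition results with `combine`, so no full shuffle is needed.
folded = bag.foldby(
    key=lambda r: r["name"],
    binop=lambda total, r: total + r["amount"],
    initial=0,
    combine=lambda a, b: a + b,
    combine_initial=0,
)
totals = dict(folded.compute(scheduler="synchronous"))

# Bags are immutable: map returns a new Bag rather than editing in place.
doubled = bag.map(lambda r: {**r, "amount": r["amount"] * 2})
doubled_amounts = [r["amount"] for r in doubled.compute(scheduler="synchronous")]
original_amounts = [r["amount"] for r in bag.compute(scheduler="synchronous")]
```

Here totals holds one running sum per name, and original_amounts is unaffected by the map that produced doubled.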
0:43 Now, if you're excited by Bags and want to use them for your work,
0:47 we've provided a list of references in the notebook and I'd also encourage you to check
0:52 out the wonderful Dask documentation that the open source community has built for us.

