Intro to ARC series: “Introduction to DASK”

This course is a part of SHARCNET's ongoing "Introduction to Advanced Research Computing" series of online courses for 2021-2022.

Course Syllabus:

Common Python libraries for data analytics, such as NumPy, pandas, and scikit-learn, usually work well when the dataset fits into the available RAM on a single machine. Larger datasets make these memory constraints a significant challenge to work around. This is where Dask can help: it provides a framework and libraries that can handle large datasets on a single multi-core machine or on a cluster.
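
As a rough illustration of the idea (not taken from the course material), the sketch below uses the dask.array module to build an array split into chunks and to sum it chunk by chunk. The array shape and chunk size are arbitrary values chosen for the example, and the array here is small enough to fit in memory; the same pattern is what lets Dask work on data that does not.

```python
import dask.array as da

# Describe a 4000 x 4000 array of ones, split into 1000 x 1000 chunks.
# Nothing is computed yet: Dask only records the task graph.
x = da.ones((4000, 4000), chunks=(1000, 1000))

# Each chunk is summed independently (potentially in parallel on several
# cores), and the partial sums are then combined into one result.
total = x.sum().compute()

print(x.numblocks)  # number of chunks along each axis
print(total)
```

Because the work is expressed per chunk, only a few chunks need to be held in memory at any one time, and the same code can be run with a different Dask scheduler on a cluster.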

This course provides an introduction to Dask.

