SDSC6009

SDSC6009 - Machine Learning at Scale

Offering Academic Unit	Department of Data Science
Credit Units	3
Course Duration	One Semester
Pre-requisite(s)	SDSC5001 Statistical Machine Learning I
Course Offering Term*	Not offered in the current academic year

* The offering term is subject to change without prior notice

Course Aims

This course teaches the principles required to develop scalable machine learning pipelines for structured and unstructured data at petabyte scale. It covers how to scale machine learning processes to big data using the MapReduce parallel computing model. It also discusses the hands-on design and development of machine learning algorithms in parallel computing environments (Spark). Students will apply MapReduce parallel computing frameworks to machine learning in industrial applications and deployments across various fields, including advertising, finance, healthcare, and search engines.

Assessment (indicative only; please check the detailed course information)

Continuous Assessment: 65%

Examination: 35%

Examination Duration: 2 hours

Detailed Course Information

SDSC6009.pdf