SDSC6009 - Machine Learning at Scale
* The offering term is subject to change without prior notice
Course Aims
This course teaches the underlying principles required to develop scalable machine learning pipelines for structured and unstructured data at petabyte scale. It covers the principles of scaling machine learning processes to big data by deploying the MapReduce parallel computing framework. It also discusses the hands-on design and development of machine learning algorithms in parallel computing environments (Spark). Students will use MapReduce parallel computing frameworks for machine learning in industrial applications and deployments across various fields, including advertising, finance, healthcare, and search engines.
Assessment (Indicative only, please check the detailed course information)
Continuous Assessment: 65%
Examination: 35%
Examination Duration: 2 hours
Detailed Course Information
SDSC6009.pdf
Useful Links
Department of Data Science