Mathematics and statistics concepts for machine learning and data science (application in healthcare)

Speaker
Day 3, 10:30 am-12:00 pm (lecture)
Richard J Munthali

🐦 @RichardMunthali

This lecture aims to systematically help participants “master” and “recognize” the core concepts in statistics and mathematics that are used at different stages of machine learning and data science projects. Topics include statistics and probability, descriptive statistics, distributions, hypothesis testing, missing data, clustering, regression, classification, errors, bias, sensitivity and specificity, linear algebra (optimization) and multivariate calculus. We will explore how these concepts connect and when to use each one, depending on the problem at hand. We will work through practical examples on healthcare data using Python and Jupyter notebooks, going from exploratory data analysis (EDA) to model building, model selection, saving the best model, and using it to make predictions on new data. We will use statistical and mathematical concepts to choose the most appropriate metrics at each stage, and SHapley Additive exPlanations (SHAP) to explain our models.

Speaker biography

Dr Richard Munthali is a Statistician at the Wits Reproductive Health and HIV Institute (WRHI), Wits University, and an independent data scientist/analyst consultant. He completed a PhD in Bioinformatics at the University of the Witwatersrand in 2017. Dr Munthali’s PhD research focused on longitudinal genetic risk factors of obesity in African populations. He spent time at the University of Cambridge, UK, and Emory University, USA, as a visiting PhD researcher under different fellowships. Dr Munthali also holds an MSc in Mathematical Science from Stellenbosch University, South Africa, and completed his undergraduate studies in Mathematical Sciences Education (Statistics and Computing) at the Polytechnic under the University of Malawi.

After his PhD, Richard completed postdoctoral training in statistical genetics at the University of British Columbia, Canada. He was involved in developing and applying algorithms for analyzing large-scale biological data through bioinformatics, data science and statistics, in the areas of longitudinal genetic and epigenetic association studies, targeted whole genome bisulfite sequencing, and the role of DNA methylation patterns in the aetiology of asthma and other allergic conditions.

Throughout his academic and professional career, Dr Munthali has attended seminars and bootcamps that helped him gain skills in data science and its applications in different fields. He has worked on projects that applied data science techniques in healthcare, finance and security. His interests include the development and application of mathematical, statistical and computational tools and algorithms in healthcare predictive and prescriptive analytics, such as health status prediction, real-time patient monitoring and disease surveillance, pricing, and readmission, among others, using lifestyle, clinical, environmental and genetic healthcare data.