I recently completed my PhD in the Department of Statistics at the University of Manitoba, with guidance from Dr. Saman Muthukumarana and Dr. Mike Domaratzki. My academic path has been grounded in the fields of statistics and machine learning, enriched by an MSc in Statistics from the University of Manitoba and a BSc in Statistics from the University of Sri Jayewardenepura, Sri Lanka.
My research focused on creating innovative methods to tackle class imbalance in classification tasks, enhancing model accuracy and interpretability. With over seven years of combined industry and academic research experience, I’ve led the design and implementation of machine learning solutions across diverse domains, from public health to education.
I currently work as a Data Scientist at the Government of Manitoba, where I apply my skills to support data-driven decision-making and policy formulation on public-sector projects.
Matharaarachchi S., Domaratzki M., Muthukumarana S. (2024). “Enhancing SMOTE for Imbalanced Data with Abnormal Minority Instances.” Machine Learning with Applications.
Matharaarachchi, S., M. Domaratzki, and S. Muthukumarana (2021). “Assessing feature selection method performance with class imbalance data.” Machine Learning with Applications. This paper was awarded a Reproducibility Badge through the Reproducibility Badge Initiative (RBI).
Matharaarachchi S., Domaratzki M., Katz A., Muthukumarana S. (2022). “Discovering Long COVID Symptom Patterns: Association Rule Mining and Sentiment Analysis in Social Media Tweets.” JMIR Form Res.
Matharaarachchi S., Domaratzki M., Marasinghe C., Muthukumarana S., and Tennakoon V. (2022). “Modeling and Feature Assessment of the Sleep Quality among Chronic Kidney Disease Patients.” Sleep Epidemiology.
Matharaarachchi S., Domaratzki M., Muthukumarana S. (2022). “Minimizing features while maintaining performance in data classification problems.” PeerJ Computer Science 8:e1081.
Enns, J., Katz, A., Yogendran, M., Urquia, M., Muthukumarana S., Matharaarachchi, S., Singer, A., Nickel, N., Star, L., Cavett, T., Keynan, Y., Lix, L. and Sanchez-Ramirez, D. (2022) “A population data-driven approach to identifying ‘Long COVID’ cases in support of diagnosis and treatment.” International Journal of Population Data Science, 7(3).
Matharaarachchi S., M. Domaratzki, A. Katz, S. Muthukumarana. (2024). “Long COVID Prediction in Manitoba Using Clinical Notes Data: A Machine Learning Approach.” Intelligence-Based Medicine.
Matharaarachchi S., M. Domaratzki, S. Muthukumarana. (2024). “Deep-ExtSMOTE: Integrating Autoencoders for Advanced Mitigation of Class Imbalance in High-Dimensional Data Classification.” Journal of Data Science.
Katz A., O. Ekuma, J. E. Enns, T. Cavett, A. Singer, D. C. Sanchez-Ramirez, Y. Keynan, L. M. Lix, R. Walld, M. S. Yogendran, N. Nickel, M. L. Urquia, L. Star, K. Olafson, S. Logsetty, R. Spiwak, J. Waruk, S. Matharaarachchi. (2024). “Identifying people with post-COVID condition using linked, population-based administrative health data from Manitoba, Canada: Prevalence and predictors in the COVID-positive population.” BMJ Open.
GPA: 4.13/4.5
Thesis: New Developments for Addressing Class Imbalance Issue in Classification Tasks.
GPA: 3.8/4.0
First two years included coursework in Mathematics, Computer Science, and Statistics.
Dissertation: Study on Parliamentary General Electoral Systems in Sri Lanka.
Summer 2022
Year | Course | Term |
---|---|---|
2022 | STAT 4150 - Bayesian Analysis and Computing | Fall 2022 |
2022 | STAT 7270 - Bayesian Inference | Fall 2022 |
2022 | STAT 2000 - Basic Statistical Analysis 2 (n=2) | Winter 2022 |
2021 | STAT 4150 - Bayesian Analysis and Computing | Fall 2021 |
2021 | STAT 7270 - Bayesian Inference | Fall 2021 |
2021 | STAT 2000 - Basic Statistical Analysis 2 (n=2) | Fall 2021 |
2021 | STAT 2000 - Basic Statistical Analysis 2 | Summer 2021 |
2021 | STAT 2000 - Basic Statistical Analysis 2 (n=2) | Winter 2021 |
2020 | STAT 2000 - Basic Statistical Analysis 2 (n=2) | Fall 2020 |
2020 | STAT 4150 - Bayesian Analysis and Computing | Winter 2020 |
2020 | STAT 7270 - Bayesian Inference | Winter 2020 |
2020 | STAT 1000 - Basic Statistical Analysis 1 (n=2) | Winter 2020 |
2019 | STAT 1000 - Basic Statistical Analysis 1 | Fall 2019 |
Note: n = Number of sections conducted for the same course.
International Statistics Conference 2024 (ISC2024), Colombo, Sri Lanka.
International Statistics Conference 2024 (ISC2024), Colombo, Sri Lanka.
PhD Thesis Defense, Department of Statistics, Faculty of Graduate Studies, University of Manitoba.
Faculty of Graduate Studies, University of Manitoba.
4th International Conference on Future of Preventive Medicine & Public Health (Future of PMPH 2024).
2024 WNAR/IMS/Graybill Annual Meeting, Fort Collins, Colorado - Student Paper Competition presentation.
CANSSI Showcase 2023.
Data to Action Day 2023, organized by the Data Science Program, Government of Manitoba.
Statistical Society of Canada (SSC) Annual Meeting 2022.
Joint Statistical Meetings (JSM) 2021.
MSc Thesis Defense, Department of Statistics, Faculty of Graduate Studies, University of Manitoba.
Machine Learning, Large Language Models (LLM), NLP, Knowledge Representation, Model Optimization, Statistical Learning, Deep Learning Techniques, Data Imbalance, Bayesian Methods.
My research focuses on machine learning, NLP, and statistical learning, with a special interest in LLMs, knowledge representation, and model optimization. I work on methods for high-dimensional data and data imbalance, developing Bayesian approaches and resampling techniques that enhance model accuracy in fields like healthcare and education. Additionally, I aim to bridge theory and practice, creating efficient, interpretable models that offer reliable, actionable insights for real-world applications.