Welcome! I am an Assistant Professor of Data Science at the New York Institute of Technology (NYIT), Vancouver campus. I hold an MSc and a PhD in Statistics from the University of Manitoba, Canada, where my research, guided by Prof. Saman Muthukumarana and Dr. Mike Domaratzki, focused on developing new methods to address class imbalance in classification tasks, improving both accuracy and interpretability.
With over nine years of combined academic, industry, and government experience, I have worked at the intersection of statistics, machine learning, and real-world applications, delivering solutions in public health, education, and beyond. My passion lies in developing methods that are not only powerful but also interpretable and accessible, enabling data-driven decision-making for meaningful change.
Last updated: August 15, 2025
Matharaarachchi S., Domaratzki M., Muthukumarana S. (2024). “Enhancing SMOTE for Imbalanced Data with Abnormal Minority Instances.” Machine Learning with Applications.
Matharaarachchi, S., M. Domaratzki, and S. Muthukumarana (2021). “Assessing feature selection method performance with class imbalance data.” Machine Learning with Applications. This paper was awarded a reproducibility badge under the Reproducibility Badge Initiative (RBI).
Matharaarachchi S., Domaratzki M., Katz A., Muthukumarana S. (2022). “Discovering Long COVID Symptom Patterns: Association Rule Mining and Sentiment Analysis in Social Media Tweets.” JMIR Formative Research.
Matharaarachchi S., Domaratzki M., Marasinghe C., Muthukumarana S., and Tennakoon V. (2022). “Modeling and Feature Assessment of the Sleep Quality among Chronic Kidney Disease Patients.” Sleep Epidemiology.
Matharaarachchi S., Domaratzki M., Muthukumarana S. (2022). “Minimizing features while maintaining performance in data classification problems.” PeerJ Computer Science 8:e1081.
Enns, J., Katz, A., Yogendran, M., Urquia, M., Muthukumarana S., Matharaarachchi, S., Singer, A., Nickel, N., Star, L., Cavett, T., Keynan, Y., Lix, L. and Sanchez-Ramirez, D. (2022) “A population data-driven approach to identifying ‘Long COVID’ cases in support of diagnosis and treatment.” International Journal of Population Data Science, 7(3).
Katz, A., Ekuma, O., Enns, J., Cavett, T., Singer, A., Sanchez-Ramirez, D., Keynan, Y., Lix, L., Walld, R., Yogendran, M., Nickel, N., Urquia, M., Star, L., Olafson, K., Logsetty, S., Spiwak, R., Waruk, J., Matharaarachchi, S. (2025) “Identifying people with post-COVID condition using linked, population-based administrative health data from Manitoba, Canada: prevalence and predictors in a cohort of COVID-positive individuals.” BMJ Open, 15(1), e087920.
Matharaarachchi S., M. Turgeon, M. Domaratzki, S. Muthukumarana. (2025). “Sequential Bayesian Estimation of the F1 Score Using the Dirichlet-Multinomial Model.” International Journal of Data Science and Analytics.
Matharaarachchi S., M. Domaratzki, A. Katz, S. Muthukumarana. (2024). “Long COVID Prediction in Manitoba Using Clinical Notes Data: A Machine Learning Approach.”
Matharaarachchi S., M. Domaratzki, S. Muthukumarana. (2024). “Deep-ExtSMOTE: Integrating Autoencoders for Advanced Mitigation of Class Imbalance in High-Dimensional Data Classification.” Journal of Big Data Research.
GPA: 4.13/4.5
Thesis: New Developments for Addressing Class Imbalance Issue in Classification Tasks.
GPA: 3.8/4.0
First two years included coursework in Mathematics, Computer Science, and Statistics.
Dissertation: Study on Parliamentary General Electoral Systems in Sri Lanka.
Fall 2025
Fall 2025
Summer 2022
Year | Course | Term |
---|---|---|
2022 | STAT 4150 - Bayesian Analysis and Computing | Fall 2022 |
2022 | STAT 7270 - Bayesian Inference | Fall 2022 |
2022 | STAT 2000 - Basic Statistical Analysis 2 (n=2) | Winter 2022 |
2021 | STAT 4150 - Bayesian Analysis and Computing | Fall 2021 |
2021 | STAT 7270 - Bayesian Inference | Fall 2021 |
2021 | STAT 2000 - Basic Statistical Analysis 2 (n=2) | Fall 2021 |
2021 | STAT 2000 - Basic Statistical Analysis 2 | Summer 2021 |
2021 | STAT 2000 - Basic Statistical Analysis 2 (n=2) | Winter 2021 |
2020 | STAT 2000 - Basic Statistical Analysis 2 (n=2) | Fall 2020 |
2020 | STAT 4150 - Bayesian Analysis and Computing | Winter 2020 |
2020 | STAT 7270 - Bayesian Inference | Winter 2020 |
2020 | STAT 1000 - Basic Statistical Analysis 1 (n=2) | Winter 2020 |
2019 | STAT 1000 - Basic Statistical Analysis 1 | Fall 2019 |
Note: n = number of sections taught for the same course in that term.
Statistical Society of Canada (SSC) Annual Meeting 2025
International Statistics Conference 2024 (ISC2024), Colombo, Sri Lanka.
International Statistics Conference 2024 (ISC2024), Colombo, Sri Lanka.
PhD Thesis Defense, Department of Statistics, Faculty of Graduate Studies, University of Manitoba.
Faculty of Graduate Studies, University of Manitoba.
4th International Conference on Future of Preventive Medicine & Public Health (Future of PMPH 2024).
2024 WNAR/IMS/Graybill Annual Meeting, Fort Collins, Colorado - Student Paper Competition presentation.
CANSSI Showcase 2023.
Data to Action Day 2023, organized by the Data Science Program, Government of Manitoba.
Statistical Society of Canada (SSC) Annual Meeting 2022.
Joint Statistical Meetings (JSM) 2021.
MSc Thesis Defense, Department of Statistics, Faculty of Graduate Studies, University of Manitoba.
Machine Learning, Statistical Learning, Classification, Algorithmic Approaches, Deep Learning Techniques, Bayesian Methods, High-Dimensional Data Analysis, Computational Statistics
My research focuses on machine learning, natural language processing (NLP), and statistical learning, with a special interest in high-dimensional data analysis, feature engineering, class imbalance, large language models (LLMs), knowledge representation, and model optimization. I develop Bayesian approaches and resampling techniques for high-dimensional and imbalanced data that improve model accuracy in fields such as healthcare and education. I also aim to bridge theory and practice by building efficient, interpretable models that offer reliable, actionable insights for real-world applications.