I am a Data Scientist. I love solving real data problems and helping stakeholders make better-informed decisions. My passion is building models that add value to the business.
I started writing my own custom functions while studying mathematics. As I gained experience, I decided to pursue graduate studies in Applied Statistics, where I developed a new algorithm, cPIE (calculated Panel of Incompatible Epitopes), based on incompatible epitopes in kidney transplantation. At the same time, I was developing my own Shiny apps for the data science projects I worked on. I later moved to Python, and I now combine the strengths of different programming languages to carry out a data science project.
My areas of focus are applying data science to business and health care. My best skill is my solid background in statistics. It lets me understand and see beyond the models available in libraries or packages, so I can create value for companies with my solutions.
When I am free, I enjoy reading technical books (currently learning 👩‍💻 about the Julia programming language and cybersecurity). I also enjoy baking and cooking, and I try to come up with new recipes 👩‍🍳.
Detection of breast cancer. In this project, we compare several supervised and unsupervised machine learning techniques (SVM, KNN, random forests, naive Bayes (NB), AdaBoost, logistic regression, ANN with back-propagation and PCA).
Model performance is measured by precision, AUC, sensitivity (recall), specificity and F1-score.
The analyses showed that neural networks outperform all other techniques when a variable selection step is carried out during data pre-processing. Compared with other studies, our models perform better in terms of accuracy.
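As an illustration only (this is not the project's code), the evaluation metrics listed above can be derived from the four cells of a binary confusion matrix. The function name `classification_metrics` and the toy labels are hypothetical, chosen just for this sketch.

```python
def classification_metrics(y_true, y_pred):
    """Compute common binary classification metrics (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def safe_div(num, den):
        # Return NaN instead of raising when a metric is undefined.
        return num / den if den else float("nan")

    sensitivity = safe_div(tp, tp + fn)  # also called recall
    specificity = safe_div(tn, tn + fp)
    precision = safe_div(tp, tp + fp)
    f1 = safe_div(2 * precision * sensitivity, precision + sensitivity)
    accuracy = safe_div(tp + tn, len(pairs))
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "accuracy": accuracy,
    }


# Hypothetical labels, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

AUC is left out of this sketch because it needs predicted scores rather than hard labels.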
It is a standard classification problem: we want to assess the default risk (probability). TabNet and XGBoost models were applied, and a Shiny app was built to make the work easier for the client (an internal auditor).
We then wanted to see whether, for business accounts, there is a significant relationship between the risk rating and the person granting the loan.
Text analysis techniques were applied to extract information about the loan granting and to enhance our machine learning models, so that the auditor can intervene before the borrower can access the money.
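One common way to check for a relationship between two categorical variables, such as a risk rating and the person granting the loan, is a chi-square test of independence. The project description does not say which test was used, so this is an assumption, and the contingency table below is made up.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table.

    `table` is a list of rows of observed counts, e.g. rows = loan officers,
    columns = risk ratings (hypothetical layout).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


# Hypothetical counts: 2 officers x 2 risk ratings.
observed = [[10, 20],
            [20, 10]]
stat, dof = chi_square_statistic(observed)
print(f"chi-square = {stat:.3f}, degrees of freedom = {dof}")
```

With one degree of freedom, the 5% critical value is 3.841, so a statistic above that value would suggest the two variables are not independent.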
Reports or transcripts of meetings. Sentiment analysis was performed to search long documents for important information (such as presence of the committee, adoption of the budget, fraud subjects). To better assist the auditor, a Shiny app was built to automate the text extraction and NLP process.
Investment. Do we have a positive return on investment? What is the likelihood of success? Risk analysis and Monte Carlo simulation were applied to answer these questions. We were able to tell the client how to save money and how to be more profitable in the years to come. The analysis was automated and an app was built.
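To give an idea of how a Monte Carlo simulation can answer the two questions above, here is a minimal sketch. Every number in it (investment size, horizon, cash-flow mean and spread) and the normal-distribution assumption are hypothetical and do not come from the client project.

```python
import random


def simulate_roi(n_sims=10_000, investment=100_000.0, years=5,
                 mean_cash_flow=25_000.0, sd_cash_flow=10_000.0, seed=42):
    """Simulate return on investment under uncertain yearly cash flows.

    Assumption for this sketch: each year's cash flow is an independent
    draw from a normal distribution.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    rois = []
    for _ in range(n_sims):
        total = sum(rng.gauss(mean_cash_flow, sd_cash_flow) for _ in range(years))
        rois.append((total - investment) / investment)
    return rois


rois = simulate_roi()
mean_roi = sum(rois) / len(rois)
# Share of simulated scenarios in which the investment pays for itself.
prob_positive = sum(r > 0 for r in rois) / len(rois)
print(f"mean ROI: {mean_roi:.1%}, P(ROI > 0): {prob_positive:.1%}")
```

The share of scenarios with a positive ROI serves as an estimate of the likelihood of success, and the spread of `rois` describes the risk.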
Using customer credit card history to cluster with network analysis. Segments were developed to predict and explain customer segments. Key clusters were detected in order to customize products and services for them.
Statistical inference, Survival analysis & Nonparametric methods | ⭐⭐⭐⭐⭐ | R, Python, SAS, Spark SQL | ⭐⭐⭐⭐⭐ |
Statistical analysis & Machine learning | ⭐⭐⭐⭐⭐ | Shiny App Development | ⭐⭐⭐⭐⭐ |
Data Visualization & Wrangling | ⭐⭐⭐⭐⭐ | JavaScript, HTML, CSS, PowerBI | ⭐⭐⭐⭐ |
Deep learning | ⭐⭐⭐⭐ | Web Development | ⭐⭐⭐⭐ |
Computer Vision, Algorithmic Trading & Android App Development | ⭐⭐ | iOS & Swift, Julia, Django, Tableau | ⭐⭐ |
University Of Quebec In Montreal.
Developing a new statistical algorithm calculating the probability of finding a matching donor based on In/compatible HLA epitopes in kidney transplantation.
University Of Quebec In Montreal.
Mathematics & Statistics
Consultant in Biostatistics. Experiment planning, selection of the appropriate statistical analysis, data analysis and report writing.
Consultant in Statistics. Provide statistical advice to clients from different sectors (marketing, business, education, psychology, medicine, etc.) to help them make evidence-based decisions by collecting and analyzing data.
Bureau De La Surveillance Desjardins
Build machine learning models for fraud prediction and for classifying sites by performance. Set up the structure and work methodology for the data science team. Recruit analysts, junior data scientists and interns.
Statistics Canada
Generate the sample list for the monthly production across Canada for the Labor Force Survey. Create scenario scripts to test the behavior of the upcoming new platform for the Labor Force Survey. Work closely with the developer team to fix bugs.
UQAM - University of Sherbrooke
Prepare and explain the labs for undergraduate students. Prepare revision labs for exams, and grade homework and exams.
Principal Author of "The Calculated Panel of Incompatible Epitopes (cPIE) in the Service of Equitable Access to Transplantation". [Link]
Principal Author of "Is Equitable Access to Transplantation Possible in the Era of HLA Epitope Compatibility?". [Link]
International Society of Nephrology "On Path to Informing Hierarchy of Eplet Mismatches as Determinants of Kidney Transplant Loss". [DOI]
Presentation of some of the statistical models I used in data analysis during my internship at the Consultation Center in Data Analysis at UQAM, and sharing of my experience with undergraduate students.
Presentation of parametric and nonparametric models in survival analysis.
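As an example of a nonparametric survival model, the Kaplan-Meier estimator can be written in a few lines. This sketch is illustrative only and is not taken from the presentation. The function name and the toy data are hypothetical.

```python
def kaplan_meier(durations, events):
    """Kaplan-Meier survival estimate.

    durations: observed times; events: 1 if the event occurred, 0 if censored.
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(durations, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same observed time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set.
        n_at_risk -= removed
    return curve


# Hypothetical follow-up times; 0 marks a censored observation.
durations = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(durations, events)
for time, prob in curve:
    print(f"t = {time}: S(t) = {prob:.2f}")
```

A parametric alternative would instead assume a distribution for the survival times (for example exponential or Weibull) and estimate its parameters.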