Survival Analysis
Lung cancer is the second most common form of cancer in both men and women, accounting for 2.3 million of the 17 million total estimated cases. In this project, I analyzed clinical trial data for ~800 patients from The Cancer Genome Atlas (TCGA) program and estimated survival rates for two kinds of non-small cell lung cancer using Kaplan-Meier curves and a Cox proportional hazards model. I also explored patients' demographic, pathological, and smoking-history data and identified statistically significant covariates that influence survival rates.
Packages : survminer, survival, RTCGA, RTCGA.clinical
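For readers unfamiliar with the Kaplan-Meier estimator mentioned above, here is a minimal sketch in plain Python. The project itself used the R packages listed above; this is only an illustration of the product-limit idea, not the project's code. The function name kaplan_meier and the toy durations and event flags are hypothetical and are not TCGA data.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Product-limit estimate of the survival function.

    durations: follow-up time for each patient
    events:    1 if the event (death) was observed, 0 if censored
    Returns a list of (time, survival_probability) at each event time.
    """
    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)  # every patient leaves the risk set at their own time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both drop out of the risk set after t.
        at_risk -= exits[t]
    return curve


# Hypothetical toy data: six patients, two of them censored (event flag 0).
durations = [5, 8, 8, 12, 15, 20]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(durations, events)
for time, prob in curve:
    print(f"t={time:>2}  S(t)={prob:.3f}")
```

Censored patients never lower the curve directly; they only shrink the risk set used at later event times, which is why the estimator can use incomplete follow-up.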
Population Estimates - Code for Philly Datathon 2020
This project was part of a datathon organized by Code for Philly and R-Ladies Philly to assist Prevention Point, a non-profit organization that works with communities affected by drug use. Our team's goal was to estimate the number of intravenous drug users in the city of Philadelphia. To support our calculations, we integrated data provided by Prevention Point with City of Philadelphia data on fatal and non-fatal drug overdoses, medically assisted treatment, and drug arrests. We also explored addiction treatment datasets from SAMHSA. Because HIPAA restrictions precluded access to granular data, we used indirect estimation methods (addition, multiplier, and truncated Poisson) to estimate the number of intravenous drug users in the city.
Packages : tidyverse, ggplot2
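Two of the indirect methods named above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the team's code, which was written in R. The truncated Poisson variant shown is Zelterman's estimator, which is one common choice and is assumed here; the description does not say which variant was used. All function names and numbers are hypothetical.

```python
import math


def multiplier_estimate(benchmark_count, proportion_in_benchmark):
    """Multiplier method: N = B / p.

    benchmark_count:         people counted in a known data source (e.g. a treatment registry)
    proportion_in_benchmark: estimated share of the target population that appears in that source
    """
    if not 0 < proportion_in_benchmark <= 1:
        raise ValueError("proportion must be in (0, 1]")
    return benchmark_count / proportion_in_benchmark


def zelterman_estimate(n_observed, seen_once, seen_twice):
    """Zelterman's truncated Poisson estimator (assumed variant).

    n_observed: distinct individuals seen at least once
    seen_once:  individuals seen exactly once (f1)
    seen_twice: individuals seen exactly twice (f2)
    """
    if seen_once <= 0 or seen_twice <= 0:
        raise ValueError("need positive counts of people seen once and twice")
    rate = 2 * seen_twice / seen_once  # estimated Poisson rate of contacts
    # Divide by the probability of being seen at least once to recover the unseen group.
    return n_observed / (1 - math.exp(-rate))


# Hypothetical numbers, chosen only to show the arithmetic.
by_multiplier = multiplier_estimate(benchmark_count=1200, proportion_in_benchmark=0.08)
by_zelterman = zelterman_estimate(n_observed=500, seen_once=400, seen_twice=80)
print(f"multiplier estimate: {by_multiplier:.0f}")
print(f"Zelterman estimate:  {by_zelterman:.0f}")
```

Both methods work from aggregate counts only, which is what makes them usable when individual-level records are unavailable.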