Technical Program Breadth
I maintain end-to-end visibility of the data science lifecycle by applying a "zero-discrepancy" mindset to every stage of development, from raw clinical data remediation to the deployment of deep learning engines.
Automated Clinical Remediation Pipeline
Designed a modular Python pipeline to remediate missingness (the weight field alone is about 97% missing) and noise in the UCI Diabetes dataset. Preserved over 101,000 patient encounters while increasing the Data Quality Index (DQI) by 25%.
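One remediation step of this kind can be sketched as follows: measure the missing fraction per column, drop columns that are mostly empty, and keep every row. This is a minimal illustration, not the project code. The "?" missing marker, the 0.5 threshold, the function names, and the sample rows are all assumptions made for the example.

```python
# Assumed marker for a missing value; the real pipeline may use a different one.
MISSING = "?"


def missing_fraction(rows, column):
    """Share of rows in which `column` is absent or holds a missing marker."""
    if not rows:
        return 0.0
    missing = sum(
        1 for row in rows if row.get(column, MISSING) in (MISSING, None, "")
    )
    return missing / len(rows)


def drop_sparse_columns(rows, threshold=0.5):
    """Drop columns whose missing fraction exceeds `threshold`; keep all rows."""
    columns = sorted({key for row in rows for key in row})
    keep = [c for c in columns if missing_fraction(rows, c) <= threshold]
    cleaned = [{c: row.get(c, MISSING) for c in keep} for row in rows]
    return cleaned, keep


# Hypothetical sample rows, invented for illustration only.
rows = [
    {"encounter_id": "1", "weight": "?", "age": "[50-60)"},
    {"encounter_id": "2", "weight": "?", "age": "[60-70)"},
    {"encounter_id": "3", "weight": "[75-100)", "age": "?"},
    {"encounter_id": "4", "weight": "?", "age": "[40-50)"},
]
cleaned, kept = drop_sparse_columns(rows, threshold=0.5)
```

Dropping a sparse column rather than the rows that lack it is what keeps the encounter count intact: in the sample, the weight column is removed and all four rows are retained.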
LA Equity & Social Resource Allocation Model
Geospatial analysis framework synthesizing U.S. Census data to identify service deserts in Los Angeles. Built a custom Priority Score to guide equitable public funding decisions.
Deep Learning Sentiment Engine (LSTM)
Engineered a Long Short-Term Memory (LSTM) neural network for NLP sentiment classification. Improved accuracy by accounting for linguistic sequence and context in unstructured text data.
3NF Relational Database Architecture
Strategic design of a normalized database schema (DSC-450) to ensure data integrity and referential consistency for high-volume enterprise reporting.
E-Commerce Sales Forecasting & Segmentation
Coupled K-Means clustering with Facebook Prophet time-series models to predict retail demand trends and define high-value customer personas.
Financial Fraud Detection (Imbalanced Data)
Developed a robust fraud detection framework utilizing SMOTE and precision-recall optimization to identify high-risk transactions in skewed financial datasets.
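The precision-recall side of this approach can be illustrated with a small threshold search: score each transaction, then pick the decision threshold that maximizes F1 instead of defaulting to 0.5. This is a minimal sketch, not the project code. It does not implement SMOTE, and the function names, labels, and scores are hypothetical values chosen for the example.

```python
def precision_recall(labels, scores, threshold):
    """Precision and recall when scores >= threshold are flagged as fraud."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for y, p in zip(labels, predicted) if y == 1 and p)
    fp = sum(1 for y, p in zip(labels, predicted) if y == 0 and p)
    fn = sum(1 for y, p in zip(labels, predicted) if y == 1 and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def best_threshold(labels, scores):
    """Return (threshold, f1) with the highest F1; ties keep the lowest threshold."""
    best_f1, best_t = -1.0, None
    for t in sorted(set(scores)):
        p, r = precision_recall(labels, scores, t)
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        if f1 > best_f1:
            best_f1, best_t = f1, t
    return best_t, best_f1


# Hypothetical skewed sample: 2 fraud cases among 10 transactions.
labels = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]
scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.40, 0.60, 0.70, 0.90]
threshold, f1 = best_threshold(labels, scores)
```

Accuracy is a poor guide on data this skewed, since flagging nothing already scores 80% here. Precision and recall measure only how the rare positive class is handled.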
Customer Churn Prediction Strategy
Classification framework (Logistic Regression & Gradient Boosting) used to quantify retention ROI by identifying at-risk users through behavioral patterns.
Full-Stack Housing Predictor (Flask)
Live-deployable web application using Lasso Regression to provide real-time property estimates for the Anchorage housing market.
Interactive Equity Performance Analysis
Interactive Plotly dashboards that visualize market volatility and rolling financial averages for equity performance.
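The quantities behind such a dashboard can be sketched without any plotting: a trailing moving average of prices and a rolling standard deviation of daily returns as a simple volatility measure. This is a minimal illustration under assumed inputs, not the project code. The 3-day window, the helper names, and the price series are hypothetical, and the Plotly rendering is not shown.

```python
from statistics import mean, stdev


def rolling(values, window, fn):
    """Apply `fn` to each trailing window; None until the window is full."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(fn(values[i + 1 - window : i + 1]))
    return out


def daily_returns(prices):
    """Simple day-over-day returns: (p_t - p_{t-1}) / p_{t-1}."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]


# Hypothetical closing prices, invented for illustration only.
prices = [100.0, 102.0, 101.0, 105.0, 107.0]
sma3 = rolling(prices, 3, mean)                    # 3-day moving average
vol3 = rolling(daily_returns(prices), 3, stdev)    # 3-day rolling volatility
```

Each series keeps the length of its input, padded with None at the start, so it can be plotted against the same date axis as the raw values.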
NLTK Sentiment Classification Pipeline
End-to-end NLP pipeline demonstrating raw text cleaning, tokenization, and sentiment classification using Python and NLTK.
Applied Data Science Visualizer (P1)
Statistical storytelling project focused on complex data distributions and visual forensics to identify data drift and statistical outliers.
RFM Customer Analytics Engine
Implemented Recency, Frequency, and Monetary (RFM) analysis to segment customer bases into actionable personas for hyper-targeted marketing.
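RFM scoring of this kind can be sketched in a few lines: aggregate each customer's transactions into recency (days since last purchase), frequency (purchase count), and monetary (total spend), then map each to a 1-3 score. This is a minimal illustration, not the project code. The cutoffs, the three-digit segment code, the function names, and the transactions are all hypothetical.

```python
from datetime import date


def rfm_table(transactions, today):
    """Aggregate (customer_id, purchase_date, amount) tuples into RFM values."""
    table = {}
    for customer, day, amount in transactions:
        rec = table.setdefault(
            customer, {"last": day, "frequency": 0, "monetary": 0.0}
        )
        rec["last"] = max(rec["last"], day)
        rec["frequency"] += 1
        rec["monetary"] += amount
    return {
        c: {
            "recency": (today - r["last"]).days,
            "frequency": r["frequency"],
            "monetary": r["monetary"],
        }
        for c, r in table.items()
    }


def score(value, cutoffs, higher_is_better=True):
    """Map a value to a 1-3 score using two assumed cutoffs (low, high)."""
    low, high = cutoffs
    if higher_is_better:
        return 1 if value < low else 2 if value < high else 3
    return 3 if value <= low else 2 if value <= high else 1


def segment(rfm):
    """Concatenate the R, F, and M scores into a code such as '333'."""
    r = score(rfm["recency"], (30, 90), higher_is_better=False)
    f = score(rfm["frequency"], (2, 3))
    m = score(rfm["monetary"], (50.0, 100.0))
    return f"{r}{f}{m}"


# Hypothetical transactions, invented for illustration only.
transactions = [
    ("a", date(2024, 1, 5), 50.0),
    ("a", date(2024, 3, 20), 70.0),
    ("a", date(2024, 3, 28), 30.0),
    ("b", date(2023, 11, 2), 20.0),
]
table = rfm_table(transactions, today=date(2024, 4, 1))
segments = {customer: segment(values) for customer, values in table.items()}
```

In this sample, customer "a" (recent, frequent, higher spend) is coded "333" and customer "b" is coded "111". Fixed cutoffs are used here for readability; a common alternative is quantile-based scoring.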