Abhijeet Sahdev

Comparative Study of Selected Clustering Algorithms with Generated Datasets Notebook Report

Benchmarked unsupervised ML algorithms, including K-Means and Hierarchical Agglomerative Clustering, on GenAI-generated datasets, removing outliers with the interquartile range (IQR) rule beforehand to enhance pattern discovery.
Achieved top silhouette scores with K-Means on Dataset 1 (0.359, 3 clusters) and Dataset 2 (0.625, 2 clusters), and with Agglomerative on Dataset 3 (0.984, 2 clusters).
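
As a rough illustration of the two steps named above (IQR outlier removal, then scoring a clustering by silhouette), here is a minimal standard-library sketch. It is not the notebook's code: the function names, the 1.5 fence multiplier, the toy points, and the hand-written labels are all assumptions made for the example, and in practice the labels would come from K-Means or Agglomerative Clustering.

```python
from math import dist
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return the (low, high) fences Q1 - k*IQR and Q3 + k*IQR for one feature."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_outliers(rows, k=1.5):
    """Keep only rows whose every feature lies inside that feature's fences."""
    bounds = [iqr_bounds(column, k) for column in zip(*rows)]
    return [
        row for row in rows
        if all(low <= x <= high for x, (low, high) in zip(row, bounds))
    ]


def silhouette_score(rows, labels):
    """Mean silhouette coefficient over all points, using Euclidean distance."""
    clusters = {}
    for row, label in zip(rows, labels):
        clusters.setdefault(label, []).append(row)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for row, label in zip(rows, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(row, other) for other in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(row, other) for other in members) / len(members)
            for other_label, members in clusters.items()
            if other_label != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy data (assumed): two tight groups plus one far-away point.
points = [
    (1.0, 1.0), (1.2, 0.9), (0.9, 1.1),
    (5.0, 5.0), (5.1, 4.9), (4.9, 5.2),
    (50.0, 50.0),
]
cleaned = remove_outliers(points)

# Hypothetical labels standing in for a clustering algorithm's output.
good = silhouette_score(cleaned, [0, 0, 0, 1, 1, 1])
poor = silhouette_score(cleaned, [0, 1, 0, 1, 0, 1])
```

Comparing `good` with `poor` shows why the score works as a benchmark: a labelling that matches the two groups scores close to 1, while one that mixes them scores much lower.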

Analyzing OxCGRT Code Report

Performed data engineering, wrangling, and cleaning on a dataset with 56 columns and 202,760 tuples: appended suitable columns, applied appropriate imputation strategies, and followed up with exploratory data analysis.
Designed research-backed experiments for two supervised machine learning approaches: multivariate linear regression on daily mortality rates, achieving an R² score of 0.974 and an MSE of 0.0003, and a decision tree classifier for daily policy effectiveness, achieving an F1 score of 0.984.

Sentiment Analysis on Movie Review Dataset Notebook

Conducted binary sentiment classification on the Large Movie Review Dataset developed by the Stanford NLP research group. The dataset provides 25,000 movie reviews each for training and testing, along with additional unlabeled data.
For preprocessing, each tuple in the text column was first converted to lowercase and split into word tokens. The tokens were filtered to remove stopwords and keep only alphabetic tokens, in line with the language of the text (English), and then reduced to their root words using lemmatization. The resulting tokens were used to build a trigram Bag of Words model on the training set, which was then used to transform the test set.
The classification report of an MLP classifier was compared with that of an SGD model; the MLP outperformed SGD on all metrics except recall for positive reviews.
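
The preprocessing steps described above can be sketched roughly as follows, using only the Python standard library. This is an illustration under stated assumptions, not the notebook's implementation: the stopword set and lemma table are tiny hypothetical stand-ins for a full stopword list and a real lemmatizer, "trigram model" is read here as counting three-token sequences only, and all function names are invented for the example.

```python
import re
from collections import Counter

# Hypothetical stand-ins: a real run would use a full English stopword list
# and a proper lemmatizer rather than these small tables.
STOPWORDS = {"i", "the", "a", "an", "was", "is", "and", "of", "it", "this"}
LEMMAS = {"movies": "movie", "loved": "love", "actors": "actor", "scenes": "scene"}


def preprocess(text):
    """Lowercase, keep alphabetic tokens, drop stopwords, map tokens to lemmas."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [LEMMAS.get(token, token) for token in tokens if token not in STOPWORDS]


def ngrams(tokens, n=3):
    """Return every run of n consecutive tokens, joined with spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def fit_vocabulary(train_texts, n=3):
    """Assign a column index to each n-gram seen in the training texts."""
    vocab = {}
    for text in train_texts:
        for gram in ngrams(preprocess(text), n):
            vocab.setdefault(gram, len(vocab))
    return vocab


def transform(texts, vocab, n=3):
    """Count vocabulary n-grams per text; n-grams unseen in training are ignored."""
    rows = []
    for text in texts:
        counts = Counter(g for g in ngrams(preprocess(text), n) if g in vocab)
        row = [0] * len(vocab)
        for gram, count in counts.items():
            row[vocab[gram]] = count
        rows.append(row)
    return rows


# Made-up example reviews, not taken from the dataset.
train = ["I loved the actors and the scenes", "This movie was a waste of time"]
test = ["The movies was a waste of time", "Loved the actors, loved the scenes!"]

vocab = fit_vocabulary(train)
train_matrix = transform(train, vocab)
test_matrix = transform(test, vocab)
```

Fitting the vocabulary on the training texts alone and reusing it for the test texts mirrors the train/test split described above: the second test review produces only trigrams that never occurred in training, so its row stays all zeros.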

Morrish Health Services Code Report

Implemented a healthcare management system with CRUD operations for each entity along with specific management and reporting views in Django and MySQL with a user guide for clinical workflows.

Open Source Contribution PR

Added opt-in selected-count display to react-native-multiple-select (500k+ downloads), keeping backward compatibility for existing apps.

DataCamp Data Scientist Code

Completed 6 projects, 23 courses, and 3 assessments; archived notebooks and exercises in Python.

Competitive Programming Code

Solved LeetCode competitive programming problems in Python with pattern-focused implementations.

Machine Learning at Verzeo Code

Conducted data wrangling and exploratory analysis across four ML mini-projects; documented results and code.

Waabec Hybrid App Report

Feasibility study for a hybrid language-training app serving sector-specific curricula across Spain.

Covid-19 Dashboard Code

A dashboard that shows the number of active Covid-19 cases, deaths, and recovered patients. It fetches data from https://ncov2019-admin.firebaseapp.com over REST, supports offline use via shared preferences, and applies structured error handling.

Minor in Computational Mathematics Code

Implemented time-series smoothing and ANOVA experiments for coursework in applied statistics and experimental design.

Coursework & Exploratory Projects