Projects
READ: This is a list of some of the projects I’ve done. Recently, most of the projects I’ve undertaken have been either research-based or company-based, and thus haven’t been included here for obvious reasons. Completed research projects that led to publications are listed in a separate category (publications). Some of the listed projects were self-projects, done either to tackle real-world problems or to learn and experiment with new tools and technologies. The others were done as part of courses (info) I’ve taken at IIIT-Delhi. I’ve added links to the demo/website, code and a blog post describing the project in detail wherever applicable.
Table of Contents
- Plotex
- TrenDetect
- NewsHelper
- Newsemble
- Neural Complexity Measures (Implementation)
- ColorSwitch
- SudokuSolver
- IPL Score Prediction
- Safeya
Plotex
- Links: [website, code, PyPI]
- Description: A minimal wrapper over matplotlib for rapid prototyping and elegant plots for publications
- Type: Self-Project
- Duration: Jan’23 - Present
- Work:
- Details coming soon!
TrenDetect
- Links: [website, code, blog]
- Description: A source to keep up with all the trending news topics, in an instant.
- Type: Self-Project
- Duration: Jun’21 - Sep’21
- Work:
- Detected trending keywords using custom-designed metrics
- Clustered keywords into six different categories
- Created a connection graph to aid in the visualization of connections with relevance-based vertex and edge sizes
- Utilized weighted heuristics like source credibility, time since publishing etc. on top of NER to assign weights to the keywords
- Employed the Newsemble API for collecting the news data
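The weighting step can be illustrated with a minimal sketch. This is not the project's actual code: the credibility table, the 12-hour half-life, and the function names are assumptions made for illustration, and the keywords are assumed to have already been extracted by NER.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical credibility scores in [0, 1]; the real sources and values are not given.
SOURCE_CREDIBILITY = {"source_a": 0.9, "source_b": 0.6}
HALF_LIFE_HOURS = 12.0  # assumed: a mention loses half its weight every 12 hours


def mention_weight(source, published, now):
    """Weight of one keyword mention: source credibility times exponential time decay."""
    age_hours = max((now - published).total_seconds() / 3600.0, 0.0)
    decay = 0.5 ** (age_hours / HALF_LIFE_HOURS)
    # Unknown sources fall back to a neutral credibility of 0.5 (an assumption).
    return SOURCE_CREDIBILITY.get(source, 0.5) * decay


def score_keywords(mentions, now):
    """Sum mention weights per keyword; return (keyword, score) pairs, highest first."""
    scores = defaultdict(float)
    for keyword, source, published in mentions:
        scores[keyword] += mention_weight(source, published, now)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


now = datetime(2021, 8, 1, 12, 0)
mentions = [
    ("budget", "source_a", now - timedelta(hours=1)),
    ("budget", "source_b", now - timedelta(hours=2)),
    ("cricket", "source_b", now - timedelta(hours=24)),
]
ranked = score_keywords(mentions, now)
```

Here a keyword mentioned recently by credible sources outranks one mentioned a day ago by a less credible source; other heuristics could be multiplied in the same way.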
NewsHelper
- Links: [demo]
- Description: Automated summary and headline generation as well as source classification for news articles, specifically designed for Indian news.
- Type: CSE556: Natural Language Processing Course Project, Monsoon’21
- Guide: Dr Md Shad Akhtar
- Duration: Jun’21 - Sep’21
- Work:
- Applied transfer learning by fine-tuning the transformer models BART, T5 and BERT on a novel dataset of over 60k Indian news articles
- Performed two-stage fine-tuning to improve the generation model, first training on an existing large-scale news corpus
- Open-sourced the custom-trained models: 1, 2, 3
- Hosted the demo application on HuggingFace Spaces using Gradio
Newsemble
- Links: [api, code, blog]
- Description: API for real-time fetching of complete content and meta-data of news articles.
- Type: Self-Project
- Duration: May’21 - Jul’21
- Work:
- Developed a REST-API using Flask with custom scrapers for each selected news source (currently, a total of six news sources are defined)
- Used a cron scheduler to run the scrapers every hour in the cloud to improve performance
- To reduce the latency caused by scraping, articles are fetched directly from the database whenever a request arrives
- Employed tools like Flask, BeautifulSoup, PyMongo, MongoDB, Heroku etc. to facilitate this API.
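The split between scheduled scraping and request handling can be sketched as follows. This is a simplified, standard-library-only illustration rather than the actual API: the in-memory `ArticleStore` stands in for MongoDB, and every name here (`ArticleStore`, `refresh`, `handle_request`, the fake scrapers) is hypothetical.

```python
class ArticleStore:
    """In-memory stand-in for the database (the project used MongoDB)."""

    def __init__(self):
        self._articles = {}  # url -> article dict, so re-scraped articles overwrite

    def upsert_many(self, articles):
        for article in articles:
            self._articles[article["url"]] = article

    def find(self, source=None):
        return [a for a in self._articles.values()
                if source is None or a["source"] == source]


def refresh(store, scrapers):
    """Slow path: run every per-source scraper. Meant to be triggered by the scheduler."""
    for scrape in scrapers:
        store.upsert_many(scrape())


def handle_request(store, source=None):
    """Fast path: answer from stored articles only; never scrapes."""
    return store.find(source)


# Fake scrapers that record when they run, to show requests do not trigger them.
scrape_calls = []


def fake_scraper_a():
    scrape_calls.append("a")
    return [{"url": "https://example.com/a1", "source": "a", "title": "A1"}]


def fake_scraper_b():
    scrape_calls.append("b")
    return [{"url": "https://example.com/b1", "source": "b", "title": "B1"}]


store = ArticleStore()
refresh(store, [fake_scraper_a, fake_scraper_b])  # what the hourly job would do
all_articles = handle_request(store)
only_b = handle_request(store, source="b")
```

Because requests only read from the store, response time is independent of how slow the news sites are; the trade-off is that results can be up to one scheduling interval old.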
Neural Complexity Measures (Implementation)
- Links: [code, writeup]
- Description: An implementation of the NeurIPS’20 paper, Neural Complexity Measures, on a new multitask dataset
- Type: CSE663A: Meta-Learning Course Project, Winter’22
- Guide: Dr Gautam Shroff
- Duration: Feb’22 - Apr’22
- Work:
- The key idea of the paper is to estimate the generalization capabilities of a task-learner by meta-learning a model that incorporates its predictions
- Implemented the paper (in PyTorch) on a multitask dataset with multiple regression tasks and multiple multiclass classification tasks
- Along with the basic implementation, proposed Uncertainty Aware Gap Estimation, which takes the uncertainty of the meta-learner into account when estimating the generalization gap
- Demonstrated that incorporating uncertainty improves performance on classification, and provided further analysis
ColorSwitch
- Links: [code, info]
- Description: Java implementation of the Android game ColorSwitch, employing different design patterns and with bonus features!
- Type: CSE201: Advanced Programming Course Project, Monsoon’20
- Guide: Dr Vivek Kumar
- Duration: Sept’20 - Nov’20
- Work:
- Implemented the game in JavaFX from scratch, while retaining gameplay smoothness by employing various physics principles
- Utilized various design patterns to handle different parts of the game.
- Implemented various bonus components like cheat codes, other multi-functional shapes and modules etc.
- Used (and learned about) Object-Oriented Programming, Design Principles, JavaFX etc.
SudokuSolver
- Links: [code, info]
- Description: Automatically detecting, recognizing and solving sudokus using OpenCV, PIL and TensorFlow.
- Type: Self-Project
- Duration: Feb’21 - Mar’21
- Work:
- Designed an automated framework for solving sudokus (from either an image or a video) using image processing techniques and CNNs
- Integrated support for generating and overlaying the solved sudoku on top of the input, as well as generating a digital copy of the solved sudoku
- Extracted the location of the digits using various image processing strategies like thresholding, contour detection, homography, perspective transformation and perspective warping, amongst others
- Combined the MNIST dataset with a custom artificially generated dataset using PIL and OpenCV to train the recognition model
- Implemented the recognition of digits using an ensemble of CNNs built on top of the standard LeNet architecture
- Employed Peter Norvig’s algorithm for solving the sudoku
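For readers unfamiliar with it, Norvig's approach combines constraint propagation with depth-first search. The compact sketch below follows the structure of his published solver; it is an illustration and not the code used in this project. It assumes the recognized grid arrives as an 81-character string with `0` or `.` for empty cells.

```python
def cross(a, b):
    return [x + y for x in a for y in b]


digits = "123456789"
rows, cols = "ABCDEFGHI", digits
squares = cross(rows, cols)
unitlist = ([cross(rows, c) for c in cols] +
            [cross(r, cols) for r in rows] +
            [cross(rs, cs) for rs in ("ABC", "DEF", "GHI")
             for cs in ("123", "456", "789")])
units = {s: [u for u in unitlist if s in u] for s in squares}
peers = {s: set(sum(units[s], [])) - {s} for s in squares}


def assign(values, s, d):
    """Fix square s to digit d by eliminating every other candidate."""
    others = values[s].replace(d, "")
    return values if all(eliminate(values, s, d2) for d2 in others) else False


def eliminate(values, s, d):
    """Remove d from s and propagate; return False on contradiction."""
    if d not in values[s]:
        return values
    values[s] = values[s].replace(d, "")
    if not values[s]:
        return False  # no candidates left
    if len(values[s]) == 1:
        # s is now decided, so its digit cannot appear in any peer.
        if not all(eliminate(values, p, values[s]) for p in peers[s]):
            return False
    for unit in units[s]:
        places = [sq for sq in unit if d in values[sq]]
        if not places:
            return False  # d has nowhere to go in this unit
        if len(places) == 1 and not assign(values, places[0], d):
            return False
    return values


def parse_grid(grid):
    """Turn the puzzle string into a dict of candidate digits per square."""
    values = {s: digits for s in squares}
    for s, c in zip(squares, grid):
        if c in digits and not assign(values, s, c):
            return False
    return values


def search(values):
    """Depth-first search, always branching on the square with fewest candidates."""
    if values is False:
        return False
    if all(len(values[s]) == 1 for s in squares):
        return values
    _, s = min((len(values[s]), s) for s in squares if len(values[s]) > 1)
    for d in values[s]:
        result = search(assign(values.copy(), s, d))
        if result:
            return result
    return False


def solve(grid):
    return search(parse_grid(grid))


puzzle = ("003020600900305001001806400008102900700000008"
          "006708200002609500800203009005010300")
solved = solve(puzzle)
```

`solve` returns a dict mapping each square (`"A1"` to `"I9"`) to its digit, or `False` if the grid is contradictory, which can also flag a misrecognized digit upstream.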
IPL Score Prediction
- Links: [website, code, info]
- Description: Prediction of the score of an IPL match using Machine Learning
- Type: Self-Project
- Duration: Nov’20 - Dec’20
- Work:
- Designed a website to predict the first-innings score of an IPL match
- Collected and processed the dataset from 2008-2020 to train the regression model
- Used feature engineering to derive new features like toss decision and score in the last six overs, among others, to boost the performance of the regressor
- Experimented with various regression models for the task and chose XGBoostRegressor, as it achieved the lowest $MSE$ and the highest $R^2$ score on the validation set
- Designed the responsive site from scratch using HTML, CSS and JavaScript (and got to learn these tools)
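As a reminder of how the two validation metrics behave, here is a small pure-Python sketch. The scores and the two sets of predictions are made-up numbers for illustration only, not results from this project. A lower MSE is better, while a higher $R^2$ is better.

```python
def mse(y_true, y_pred):
    """Mean squared error: average of squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical first-innings scores and predictions from two candidate models.
y_val = [150, 160, 170, 180]
predictions = {
    "model_a": [152, 158, 171, 179],
    "model_b": [140, 170, 160, 190],
}

metrics = {name: (mse(y_val, pred), r2(y_val, pred))
           for name, pred in predictions.items()}
# Pick the model with the lowest validation MSE.
best = min(metrics, key=lambda name: metrics[name][0])
```

On a fixed validation set the two metrics always rank models the same way, since $R^2 = 1 - n \cdot MSE / SS_{tot}$ and $SS_{tot}$ depends only on the true values.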
Safeya
- Description: An app for digitizing various maintenance tasks at IIIT-D
- Type: CSE202: Fundamentals of Database Management Systems Course Project, Winter’21
- Guide: Dr Mukesh Mohania
- Duration: Feb’21 - Apr’21
- Work:
- Designed a model application for digitizing various maintenance and cleaning tasks at IIIT-D.
- Responsible for conceptualization, data population, table creation, composing queries and implementing the backend (using Flask and MySQL).
- Used and learned about various fundamental database concepts like ER Diagrams, Normalization, Query Composition etc.
