Projects

READ: This is a list of some of the projects I’ve done. Recently, most of my projects have been either research-based or company-based, and thus haven’t been included here for obvious reasons. The research projects that have been completed and led to publications are listed in a separate category (publications). Some of the listed projects were self-projects, done either to tackle real-world problems or to learn and experiment with new tools and technologies. The others were done as part of courses (info) I’ve taken at IIIT-Delhi. Wherever applicable, I’ve added links to the demo/website, the code and a blog post describing the project in detail.

Plotex

  • Links: [website, code, PyPI]
  • Description: A minimal wrapper over matplotlib for rapid prototyping and elegant plots for publications
  • Type: Self-Project
  • Duration: Jan’23 - Present
  • Work:
    • Details coming soon!

TrenDetect

  • Links: [website, code, blog]
  • Description: A source to keep up with all the trending news topics, in an instant.
  • Type: Self-Project
  • Duration: Jun’21 - Sep’21
  • Work:
    • Detected trending keywords based on custom-designed metrics
    • Clustered keywords into six different categories
    • Created a connection graph to visualize the connections between keywords, with relevance-based vertex and edge sizes
    • Applied weighted heuristics such as source credibility and time since publishing on top of NER to assign weights to the keywords
    • Employed the Newsemble API for collecting the news data
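
The weighting idea above can be sketched as follows. This is a minimal illustration, not the project's actual metric: the credibility table, the 12-hour half-life and the function names are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical credibility weights per source (illustrative values only).
SOURCE_CREDIBILITY = {"source_a": 1.0, "source_b": 0.6}
# Assumed: a mention loses half of its weight every 12 hours.
HALF_LIFE_HOURS = 12.0


def recency_weight(hours_since_publishing):
    """Exponential decay: 1.0 for a fresh article, 0.5 after one half-life."""
    return 0.5 ** (hours_since_publishing / HALF_LIFE_HOURS)


def score_keywords(mentions):
    """Sum credibility * recency over every mention of each keyword.

    `mentions` is an iterable of (keyword, source, hours_since_publishing)
    tuples, e.g. the entities an NER step extracted from each article.
    """
    scores = defaultdict(float)
    for keyword, source, hours in mentions:
        credibility = SOURCE_CREDIBILITY.get(source, 0.5)  # assumed default
        scores[keyword] += credibility * recency_weight(hours)
    return dict(scores)


def top_trending(mentions, k=3):
    """Return the k keywords with the highest accumulated score."""
    scores = score_keywords(mentions)
    return sorted(scores, key=lambda kw: scores[kw], reverse=True)[:k]
```

A keyword mentioned by a credible source an hour ago therefore outranks one mentioned by a less credible source a day ago, which is the behaviour the heuristics aim for.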

NewsHelper

  • Links: [demo]
  • Description: Automated summary and headline generation as well as source classification for news articles, specifically designed for Indian news.
  • Type: CSE556: Natural Language Processing Course Project, Monsoon’21
  • Guide: Dr Md Shad Akhtar
  • Duration: Jun’21 - Sep’21
  • Work:
    • Utilized transfer learning by fine-tuning the transformer models BART, T5 and BERT on a novel dataset of over 60k Indian news articles
    • Performed two-stage fine-tuning to increase the performance of the generation model by first training on an existing large-scale news corpus
    • Open-sourced the custom-trained models: 1, 2, 3
    • Hosted the demo application on HuggingFace Spaces using Gradio

Newsemble

  • Links: [api, code, blog]
  • Description: API for real-time fetching of complete content and meta-data of news articles.
  • Type: Self-Project
  • Duration: May’21 - Jul’21
  • Work:
    • Developed a REST API using Flask with custom scrapers for each selected news source (currently, a total of six news sources are defined)
    • Utilized a cron scheduler to run the scrapers every hour in the cloud to improve performance
    • To reduce the latency caused by scraping, the articles are fetched directly from the database whenever a request is received
    • Employed tools like Flask, BeautifulSoup, PyMongo, MongoDB, Heroku etc. to facilitate this API.
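
The split between scheduled scraping and request-time reads can be sketched as below. This is a simplified illustration under stated assumptions: an in-memory dictionary stands in for the MongoDB collection, `fake_scraper` is a hypothetical placeholder for the per-source scrapers, and none of the names come from the actual API.

```python
from datetime import datetime, timezone

# In-memory stand-in for the MongoDB collection (assumption for this sketch).
ARTICLE_STORE = {}


def fake_scraper(source):
    """Hypothetical scraper; a real one would fetch and parse the source's pages."""
    return [{
        "url": f"https://example.com/{source}/1",
        "source": source,
        "title": f"Headline from {source}",
        "content": "...",
    }]


def refresh_store(sources, scraper=fake_scraper):
    """Job meant to run on a schedule (e.g. hourly); upserts articles by URL."""
    for source in sources:
        for article in scraper(source):
            article["fetched_at"] = datetime.now(timezone.utc).isoformat()
            ARTICLE_STORE[article["url"]] = article  # upsert keyed on the URL
    return len(ARTICLE_STORE)


def handle_request(source=None):
    """Request path: reads only from the store and never triggers scraping."""
    articles = list(ARTICLE_STORE.values())
    if source is not None:
        articles = [a for a in articles if a["source"] == source]
    return articles
```

Because `handle_request` never calls a scraper, response time depends only on the store lookup; the trade-off is that results can be up to one scheduling interval old.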

Neural Complexity Measures (Implementation)

  • Links: [code, writeup]
  • Description: An implementation of the NeurIPS’20 paper, Neural Complexity Measures, on a new multitask dataset
  • Type: CSE663A: Meta-Learning Course Project, Winter’22
  • Guide: Dr Gautam Shroff
  • Duration: Feb’22 - Apr’22
  • Work:
    • The key idea of the paper is to estimate the generalization capabilities of a task-learner by meta-learning a model that incorporates its predictions
    • Implemented the paper (in PyTorch) on a multitask dataset with multiple regression tasks and multiple multiclass classification tasks
    • Along with the basic implementation, proposed Uncertainty Aware Gap Estimation, which takes the uncertainty of the meta-learner into account when estimating the generalization gap
    • Demonstrated that incorporating uncertainty improves performance on classification, and provided further analysis
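
For readers unfamiliar with the quantity being estimated, here is a small sketch of the generalization gap and of one plausible way to fold uncertainty into an estimate of it. The second function is an assumption made for illustration (mean plus `k` standard deviations over several sampled predictions); it is not necessarily how Uncertainty Aware Gap Estimation is defined in the project.

```python
from statistics import mean, pstdev


def generalization_gap(train_losses, test_losses):
    """Gap = average held-out loss minus average training loss."""
    return mean(test_losses) - mean(train_losses)


def uncertainty_aware_gap(sampled_gap_predictions, k=1.0):
    """Combine several gap predictions into a conservative estimate.

    `sampled_gap_predictions` could come from an ensemble or from repeated
    stochastic forward passes of a meta-learner (an assumption for this
    sketch). Returns (mean, std, mean + k * std).
    """
    mu = mean(sampled_gap_predictions)
    sigma = pstdev(sampled_gap_predictions)
    return mu, sigma, mu + k * sigma
```

When the sampled predictions disagree, the standard deviation grows and the conservative estimate moves further above the mean, so an uncertain meta-learner yields a more cautious gap estimate.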

ColorSwitch

  • Links: [code, info]
  • Description: Java implementation of the Android game ColorSwitch, employing different design patterns and with bonus features!
  • Type: CSE201: Advanced Programming Course Project, Monsoon’20
  • Guide: Dr Vivek Kumar
  • Duration: Sept’20 - Nov’20
  • Work:
    • Implemented the game in JavaFX from scratch, while retaining gameplay smoothness by employing various physics principles
    • Utilized various design patterns to handle different parts of the game.
    • Implemented various bonus components like cheat codes, other multi-functional shapes and modules etc.
    • Used (and learned about) Object-Oriented Programming, Design Principles, JavaFX etc.

SudokuSolver

  • Links: [code, info]
  • Description: Automatically detecting, recognizing and solving sudokus using OpenCV, PIL and TensorFlow.
  • Type: Self-Project
  • Duration: Feb’21 - Mar’21
  • Work:
    • Designed an automated framework for solving sudokus (from either images or video) using image processing techniques and CNNs
    • Integrated support for generating and overlaying the solved sudoku on top of the input, as well as generating a digital copy of the solved sudoku
    • Extracted the locations of the digits using image processing strategies such as thresholding, contour detection, homography, perspective transformation and perspective warping, among others
    • Combined the MNIST dataset with a custom artificially generated dataset using PIL and OpenCV to train the recognition model
    • Implemented the recognition of digits using an ensemble of CNNs built on top of the standard LeNet architecture
    • Employed Peter Norvig’s algorithm for solving the sudoku
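
The solving step can be sketched as a condensed adaptation of Peter Norvig's published approach: constraint propagation followed by depth-first search on the square with the fewest remaining candidates. The names below are chosen for this sketch, and the project's own code may differ in its details.

```python
DIGITS = "123456789"
ROWS = "ABCDEFGHI"


def cross(a, b):
    return [x + y for x in a for y in b]


SQUARES = cross(ROWS, DIGITS)
UNITLIST = ([cross(ROWS, c) for c in DIGITS] +
            [cross(r, DIGITS) for r in ROWS] +
            [cross(rs, cs) for rs in ("ABC", "DEF", "GHI")
             for cs in ("123", "456", "789")])
UNITS = {s: [u for u in UNITLIST if s in u] for s in SQUARES}
PEERS = {s: set(sum(UNITS[s], [])) - {s} for s in SQUARES}


def assign(values, s, d):
    """Fix square s to digit d by eliminating every other candidate."""
    others = values[s].replace(d, "")
    return values if all(eliminate(values, s, d2) for d2 in others) else False


def eliminate(values, s, d):
    """Remove d from the candidates of s and propagate; False on contradiction."""
    if d not in values[s]:
        return values
    values[s] = values[s].replace(d, "")
    if not values[s]:
        return False  # no candidates left for s
    if len(values[s]) == 1:
        # s is solved, so its digit cannot appear in any peer.
        solved_digit = values[s]
        if not all(eliminate(values, p, solved_digit) for p in PEERS[s]):
            return False
    for unit in UNITS[s]:
        places = [sq for sq in unit if d in values[sq]]
        if not places:
            return False  # d has nowhere left to go in this unit
        if len(places) == 1 and not assign(values, places[0], d):
            return False
    return values


def parse_grid(grid):
    """Build the candidate map from an 81-character grid ('0' or '.' = empty)."""
    values = {s: DIGITS for s in SQUARES}
    chars = [c for c in grid if c in DIGITS or c in "0."]
    assert len(chars) == 81
    for s, d in zip(SQUARES, chars):
        if d in DIGITS and not assign(values, s, d):
            return False
    return values


def search(values):
    """Depth-first search, branching on the square with the fewest candidates."""
    if values is False:
        return False
    if all(len(values[s]) == 1 for s in SQUARES):
        return values
    _, s = min((len(values[s]), s) for s in SQUARES if len(values[s]) > 1)
    for d in values[s]:
        solution = search(assign(values.copy(), s, d))
        if solution:
            return solution
    return False


def solve(grid):
    return search(parse_grid(grid))
```

In a pipeline like the one described, the 81-character input string would be produced by the digit recognition step, and `solve` returns a mapping from square names such as `"A1"` to digits, or `False` if the recognized grid is inconsistent.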

IPL Score Prediction

  • Links: [website, code, info]
  • Description: Prediction of the score of an IPL match using Machine Learning
  • Type: Self-Project
  • Duration: Nov’20 - Dec’20
  • Work:
    • Designed a website to predict the first-innings score of an IPL match
    • Collected and processed the dataset from 2008-2020 to train the regression model
    • Utilized feature engineering to derive new features such as the toss decision and the score in the last six overs, among others, to boost the performance of the regressor
    • Experimented with several regression models for the task and chose XGBoost’s regressor (XGBRegressor), as it achieved the lowest $MSE$ and the highest $R^2$ score on the validation set
    • Designed the responsive site from scratch using HTML, CSS and JavaScript (and got to learn these tools)
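
The model comparison can be illustrated with a small sketch of the two validation metrics. The helper names and the example numbers are made up for illustration; in practice the predictions would come from the fitted regressors.

```python
def mse(y_true, y_pred):
    """Mean squared error: lower is better."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: higher is better, 1.0 is a perfect fit."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def pick_best(y_true, predictions_by_model):
    """Return the name of the model with the lowest validation MSE."""
    return min(predictions_by_model,
               key=lambda name: mse(y_true, predictions_by_model[name]))
```

For example, with validation scores `[150, 160, 170]`, a model predicting them exactly has an MSE of 0 and an $R^2$ of 1, while a model that always predicts 160 has an $R^2$ of 0. On a fixed validation set the total sum of squares is the same for every model, so the model with the lowest $MSE$ is also the one with the highest $R^2$.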

Safeya

  • Description: An app for digitizing various maintenance tasks at IIIT-D
  • Type: CSE202: Fundamentals of Database Management Systems Course Project, Winter’21
  • Guide: Dr Mukesh Mohania
  • Duration: Feb’21 - Apr’21
  • Work:
    • Designed a model application for digitizing various maintenance and cleaning tasks at IIIT-D.
    • Responsible for conceptualization, data population, table creation, composing queries and implementing the backend (using Flask and MySQL).
    • Used and learned about various fundamental database concepts like ER Diagrams, Normalization, Query Composition etc.