Projects
Here is an outline of my projects, which are primarily NLP/Deep Learning-focused or university course projects
(publications are separately described here).
Graph Neural Networks (with NLP)
- Open Link Prediction in Open Knowledge Graphs (in-progress)
2024
- Demonstrates how inductive graph neural networks enhance entity-mention node vectors to improve reasoning in the Open Link Prediction task in OLPBench.
- Investigates the effect of the hashing trick and of FP-16 precision.
- Conducts ablation studies with semantic node features.
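As a rough illustration of the hashing trick mentioned above, here is a minimal sketch that maps the tokens of an entity mention into a fixed-size count vector. The function name `hashed_features`, the bucket count and the use of MD5 are illustrative assumptions, not the project's actual implementation.

```python
import hashlib


def hashed_features(tokens, num_buckets=16):
    """Map a list of tokens to a fixed-size count vector (hashing trick).

    Hypothetical helper: the bucket count and hash function are assumptions.
    """
    vec = [0] * num_buckets
    for tok in tokens:
        # A stable hash (unlike built-in hash(), which is salted per process).
        digest = hashlib.md5(tok.encode("utf-8")).digest()
        idx = int.from_bytes(digest[:8], "big") % num_buckets
        vec[idx] += 1  # colliding tokens simply share a bucket
    return vec
```

The vector size stays constant no matter how many distinct tokens appear, which is what makes the trick attractive for open vocabularies; the cost is that unrelated tokens can collide in the same bucket.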
Natural Language Processing
- Retrieval-Augmented Generation [GitHub]
Neo4j & LLMs course
Oct ‘24
- Made a GenAI RAG agent that retrieves context from a Neo4j vector DB and stores chat history. Explored prompting strategies.
- Deployed a sample PDF RAG chatbot on Streamlit, with chunking and embedding in the pipeline using the LangChain framework.
- Statistical Machine Translation [PPT]
Language Technologies Resource Center, International Institute of Information Technology - Hyderabad
June ‘22
This project was done and presented at IASNLP’22, IIIT-H’s Advanced Summer School on Natural Language Processing.
- Investigated the effects of tuning English to Hindi, Telugu and Tamil SMT models on different evaluation metrics (BLEU, chrF, TER, etc.).
The following NLP projects were done as part of the Natural Language Processing team at IvLabs, VNIT’s AI & Robotics Lab.
- Sentiment Analysis (Text Classification) [GitHub]
January ‘22
- Aimed at classifying the sentiment polarity of a text.
- Read papers on different architectures for Sentiment Analysis and Text Classification.
- Benchmarked the following landmark architectures in PyTorch on the IMDb Movie Reviews dataset.
- LSTM - Test Accuracy: 87.62%
- FastText - Test Accuracy: 86.28%
- BERT - Test Accuracy: 91.44%
- Neural Machine Translation [GitHub]
December ‘21
- Read papers on novel architectures for translation among languages.
- Implemented landmark architectures in PyTorch using the Multi30k Dataset for German-English.
- Attention Is All You Need - BLEU Score: 37.5
- Neural Machine Translation by Jointly Learning to Align and Translate - BLEU Score: 31
- Sequence to Sequence Learning with Neural Networks - BLEU Score: 20
- Text Generation [GitHub]
June ‘21
- Generated dinosaur names by building a character-level Language Model.
- Compared the results of different sequence models such as Vanilla RNN, LSTM and GRU, in PyTorch.
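To give a feel for what a character-level language model does, here is a minimal sketch that swaps the RNN/LSTM/GRU models for a much simpler count-based bigram model. It is not the project's model: the `^`/`$` boundary markers and the function names are assumptions made for illustration, and the training list is assumed to be non-empty.

```python
import random
from collections import defaultdict

START, END = "^", "$"  # assumed boundary markers, not part of any name


def train_bigram(names):
    """Count how often each character follows another across all names."""
    counts = defaultdict(lambda: defaultdict(int))
    for name in names:
        chars = [START] + list(name.lower()) + [END]
        for prev, nxt in zip(chars, chars[1:]):
            counts[prev][nxt] += 1
    return counts


def sample_name(counts, rng, max_len=20):
    """Sample one name character by character, proportional to the counts."""
    out, prev = [], START
    while len(out) < max_len:
        followers = counts[prev]
        chars = list(followers)
        weights = [followers[c] for c in chars]
        nxt = rng.choices(chars, weights=weights, k=1)[0]
        if nxt == END:
            break
        out.append(nxt)
        prev = nxt
    return "".join(out)
```

A neural sequence model replaces the count table with a learned hidden state, so it can condition on more than the single previous character, but the sampling loop looks much the same.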
CS Course Projects
The following were done as assignments for various Computer Science undergraduate courses at VNIT.
- Traveling Salesman Problem Solver using A* Search [GitHub]
Artificial Intelligence
October ‘22
- Implemented the A* search algorithm to solve the TSP, using the minimum spanning tree heuristic function.
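The idea behind A* with a minimum spanning tree heuristic can be sketched as follows. This is a minimal version under stated assumptions: a symmetric distance matrix, a state made of the current city plus the set of visited cities, and a heuristic equal to the MST weight over the unvisited cities together with the current and start cities. The names `mst_weight` and `tsp_astar` are illustrative and not taken from the repository.

```python
import heapq
from itertools import count


def mst_weight(nodes, dist):
    """Weight of a minimum spanning tree over `nodes` (Prim's algorithm)."""
    nodes = list(nodes)
    if len(nodes) < 2:
        return 0
    # Cheapest known edge from the growing tree to each outside node.
    best = {v: dist[nodes[0]][v] for v in nodes[1:]}
    total = 0
    while best:
        v = min(best, key=best.get)
        total += best.pop(v)
        for u in best:
            best[u] = min(best[u], dist[v][u])
    return total


def tsp_astar(dist, start=0):
    """Return (cost, tour) of a shortest tour that starts and ends at `start`."""
    n = len(dist)
    all_cities = frozenset(range(n))
    tie = count()  # tiebreaker so the heap never compares sets or lists

    def h(city, visited):
        # The rest of the tour is a path spanning these nodes, so the MST
        # weight never overestimates its cost.
        return mst_weight((all_cities - visited) | {city, start}, dist)

    first = frozenset([start])
    frontier = [(h(start, first), next(tie), 0, start, first, False, [start])]
    best_g = {(start, first, False): 0}
    while frontier:
        _, _, g, city, visited, closed, path = heapq.heappop(frontier)
        if closed:  # tour has returned to the start city
            return g, path
        if g > best_g[(city, visited, closed)]:
            continue  # stale heap entry
        if visited == all_cities:
            moves = [(start, visited, True, dist[city][start])]
        else:
            moves = [(c, visited | {c}, False, dist[city][c])
                     for c in all_cities - visited]
        for nxt, nvisited, nclosed, step in moves:
            ng = g + step
            key = (nxt, nvisited, nclosed)
            if ng < best_g.get(key, float("inf")):
                best_g[key] = ng
                nh = 0 if nclosed else h(nxt, nvisited)
                heapq.heappush(frontier, (ng + nh, next(tie), ng, nxt,
                                          nvisited, nclosed, path + [nxt]))
    return None
```

The state space still grows exponentially with the number of cities, so this sketch is only practical for small instances; the heuristic mainly prunes how much of that space gets expanded.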
- N-tile Puzzle Solver using Bidirectional Search [GitHub]
Artificial Intelligence
August ‘22
- Implemented the bidirectional search algorithm to solve the puzzle, using breadth-first search in both directions.
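A minimal sketch of bidirectional breadth-first search on a sliding-tile puzzle is shown below. It assumes boards are stored as flat tuples with `0` as the blank and returns only the number of moves; the function names and this representation are assumptions for illustration, not the repository's code.

```python
def neighbors(state, width):
    """Yield boards reachable by sliding one tile into the blank (0)."""
    blank = state.index(0)
    row, col = divmod(blank, width)
    height = len(state) // width
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        if 0 <= r < height and 0 <= c < width:
            swap = r * width + c
            nxt = list(state)
            nxt[blank], nxt[swap] = nxt[swap], nxt[blank]
            yield tuple(nxt)


def bidirectional_bfs(start, goal, width):
    """Return the minimum number of moves from start to goal, or None."""
    if start == goal:
        return 0
    dist_f, dist_b = {start: 0}, {goal: 0}
    frontier_f, frontier_b = [start], [goal]
    while frontier_f and frontier_b:
        # Expand one full layer of the smaller frontier.
        forward = len(frontier_f) <= len(frontier_b)
        frontier = frontier_f if forward else frontier_b
        seen, other = (dist_f, dist_b) if forward else (dist_b, dist_f)
        next_frontier = []
        for state in frontier:
            for nxt in neighbors(state, width):
                if nxt in seen:
                    continue
                seen[nxt] = seen[state] + 1
                if nxt in other:  # the two searches have met
                    return seen[nxt] + other[nxt]
                next_frontier.append(nxt)
        if forward:
            frontier_f = next_frontier
        else:
            frontier_b = next_frontier
    return None  # one side ran out of states: the goal is unreachable
```

Because moves are reversible, the backward search can reuse the same `neighbors` function. Each search only needs to reach about half the solution depth, which is where the savings over a single breadth-first search come from.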
- Linux Command Shell using Multi-threading [GitHub]
Operating Systems
September ‘21
- Implemented a basic Linux shell using multithreading in C POSIX.
- Built to handle multiple serial/parallel commands, output redirection, change of directory and signal interrupts.
- Weather Tracking Network using AVL BST [GitHub]
Data Structures
April ‘21
- Developed a weather data repository that keeps a record of data collected from a (hypothetical) sensor grid spanning a city, and tracks all the sensors.
- Implemented AVL Binary Search Trees for the Weather Data and Sensor Network data structures, with various operations for interaction and management.
- Heap Memory Manager [GitHub]
Concepts in Programming Languages
February ‘21
- Implemented the malloc and free functions in C, on a heap, using the first-fit allocation strategy.
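The first-fit strategy can be illustrated with a small simulation. The project itself is in C; this Python sketch only models the bookkeeping, with offsets standing in for pointers. The class name `FirstFitHeap`, the block layout and the merging of adjacent free blocks on `free` are assumptions, not details confirmed by the project.

```python
class FirstFitHeap:
    """Toy model of a heap managed with first-fit allocation."""

    def __init__(self, size):
        # Each block is [offset, size, is_free]; the list stays sorted by offset.
        self.blocks = [[0, size, True]]

    def malloc(self, size):
        """Return the offset of the first free block that fits, or None."""
        if size <= 0:
            return None
        for i, (offset, block_size, is_free) in enumerate(self.blocks):
            if is_free and block_size >= size:
                self.blocks[i] = [offset, size, False]
                if block_size > size:
                    # Split: the leftover space stays free right after it.
                    self.blocks.insert(
                        i + 1, [offset + size, block_size - size, True])
                return offset
        return None

    def free(self, offset):
        """Free the allocated block at `offset`; return False if invalid."""
        for i, block in enumerate(self.blocks):
            if block[0] == offset and not block[2]:
                block[2] = True
                self._coalesce(i)
                return True
        return False

    def _coalesce(self, i):
        # Merge with the next block first so index i stays valid.
        if i + 1 < len(self.blocks) and self.blocks[i + 1][2]:
            self.blocks[i][1] += self.blocks[i + 1][1]
            del self.blocks[i + 1]
        if i > 0 and self.blocks[i - 1][2]:
            self.blocks[i - 1][1] += self.blocks[i][1]
            del self.blocks[i]
```

First fit scans from the lowest address and stops at the first hole that is large enough, which keeps allocation fast but tends to leave small fragments near the start of the heap.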
Miscellaneous
- Handwritten Digits Recognizer [GitHub]
December ‘20
- Aimed at building a handwritten digits classifier, trained on the MNIST dataset.
- Implemented the LeNet-5 architecture in PyTorch.
- Startup Simulation
Introduction to Entrepreneurship Course Project, VNIT
May - Aug ‘21
- Led a team of 5 to simulate a startup idea based on sustainability.
- Developed a roadmap for the business model and partnerships.
- Strategized pitching, marketing channels and the full client journey in detailed documentation and presentations.
- Made detailed financial statements for CapEx, OpEx, revenue and cash flow sheets of up to 3 years in spreadsheets.
- This hands-on course was taught personally by the awesome Shashikant Chauhary, a serial entrepreneur and angel investor.
- Google Sheets / MS Excel-based functional trackers
July ‘21
- Used advanced spreadsheet features such as macros, Apps Script, cross-sheet queries, pivot tables, conditional formatting, VLOOKUP, charts and formulae.
- With the above know-how, built a wide range of useful custom interfaces:
- Personal Finance Workbook: Made a complete functional custom workbook to track expenses, income and upcoming expenses across months throughout a year, with category-wise budgets and a consolidated dashboard to visualize the overall health of your wealth.
- Courses-and-GPA Sheet: Tracks and visualizes academic performance, records test marks, and calculates elective credits and GPAs to plan for the remaining requirements of the degree.
- Time Tracker: Visualizes time spent throughout the day. Flexibly records time spent by category, shows how useful each part of the day was and the hours of productivity (or otherwise, like unavoidable work) for the day, and elegantly consolidates everything to the week level.
- PS: All three of them are highly personalized to my needs and schedule; the customization has been necessary to make these meta-trackers actually useful. Thus, I highly encourage everyone to learn basic Excel/Sheets (not asking AI helps you value and use your sheets more), and you'd be surprised by how useful it can get.