Hello, I'm Mayank Vyas
Brewing Software with AI Solutions. ☕️
Work Experience

Software Engineer, Research at Coral Labs
Tempe, Arizona
- Engineered a scalable and fault-tolerant data ingestion pipeline on AWS using Apache Spark, processing 1.2 TB of data (NQ tables) while reducing system latency by 3x
- Optimized system performance by designing pruning algorithms to discard irrelevant data segments, improving information retrieval recall and reducing resource consumption
- Developed and tested a distributed, 3-stage meta-reasoning engine, achieving an 11% performance improvement over existing baselines on temporal-reasoning benchmarks

Founding Software Engineer at Job-Hunt AI
Tempe, Arizona (Self-Employed)
- Architected an end-to-end AI automation pipeline using Elasticsearch for high-speed semantic search and a knowledge graph to map complex relationships between job requirements and user skills
- Developed an LLM-based agentic workflow that autonomously parses job descriptions and aligns them with candidate resumes, improving job match relevance by 92% and reducing manual search time by 98%

Software Engineer, Machine Learning Architecture at Indian Institute of Information Technology
Chennai, Tamil Nadu
- Architected and deployed a resilient end-to-end IoT system on resource-constrained hardware (Raspberry Pi), automating deployment and monitoring processes
- Optimized C++ inference kernels to achieve 35% lower latency, enabling real-time anomaly detection and data streaming to AWS
- Designed an on-device predictive filtering algorithm that reduced sensor data transmission by 95%, significantly lowering operational costs and network load
Projects

Transforms educational questions into interactive, story-based visualizations using AI. A 4-layer pipeline routes content from documents (PDF/DOCX) to 18 distinct game templates, with intelligent caching and real-time progress tracking.

Research project investigating efficiency, scalability, and linguistic adaptability of Fine-Tuned LLMs for code generation. Explores LoRA rank optimization, data scaling effects, and cross-language generalization using GPT-2.

Python pipeline to convert inspection JSON data into populated TREC (Texas Real Estate Commission) HTML reports. Features smart mapping across 6 TREC sections, automatic empty section removal, and proper formatting for comments, images, and videos.
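
The "automatic empty section removal" step can be illustrated with a minimal sketch. The JSON schema (`sections`, `title`, `comments`) and the function name `render_report` are hypothetical, chosen only for illustration; the project's actual schema and TREC template are not shown here.

```python
import html
import json


def render_report(inspection_json):
    """Render the non-empty sections of an inspection as an HTML fragment.

    Assumes a hypothetical schema:
    {"sections": [{"title": str, "comments": [str, ...]}, ...]}
    """
    data = json.loads(inspection_json)
    parts = []
    for section in data.get("sections", []):
        # Keep only comments that contain something other than whitespace.
        comments = [c for c in section.get("comments", []) if c.strip()]
        if not comments:
            continue  # empty section: leave it out of the report entirely
        parts.append(f"<h2>{html.escape(section['title'])}</h2>")
        parts.append("<ul>")
        # Escape user text so characters like "<" cannot break the markup.
        parts.extend(f"<li>{html.escape(c)}</li>" for c in comments)
        parts.append("</ul>")
    return "\n".join(parts)


# Sample input (illustrative data only).
SAMPLE = json.dumps({
    "sections": [
        {"title": "Structural Systems", "comments": ["Crack < 1 inch in slab"]},
        {"title": "Electrical Systems", "comments": ["   "]},
    ]
})
REPORT = render_report(SAMPLE)
```

Here the second section holds only whitespace, so it is dropped, and the comment text is escaped before being placed in the markup.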

Automated data extraction and real-time visualization pipeline for Intel's retail edge computing platform. Built Python scripts to extract metrics from results logs and publish to an MQTT broker, integrated with Grafana dashboards via the MQTT plugin. Created custom Docker images for Grafana and MQTT configured to communicate on the same Docker network using Docker Compose.
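
A rough sketch of the extract-and-publish step, under stated assumptions: the log line format (`key=number` pairs), the topic prefix `retail/metrics`, and both function names are hypothetical. The broker call is passed in as a plain callable so the sketch runs without a broker; with paho-mqtt, a connected client's `publish(topic, payload)` method could be supplied instead.

```python
import json
import re

# Hypothetical log line, e.g. "2024-01-01T00:00:00 fps=14.9 latency_ms=67.2"
METRIC_RE = re.compile(r"(\w+)=(-?\d+(?:\.\d+)?)")


def extract_metrics(line):
    """Return {name: value} for every key=number pair found in one log line."""
    return {name: float(value) for name, value in METRIC_RE.findall(line)}


def publish_metrics(lines, publish, topic_prefix="retail/metrics"):
    """Publish each metric as a JSON payload on '<topic_prefix>/<metric name>'.

    `publish` is any callable taking (topic, payload). Returns the number
    of messages sent.
    """
    count = 0
    for line in lines:
        for name, value in extract_metrics(line).items():
            publish(f"{topic_prefix}/{name}", json.dumps({"value": value}))
            count += 1
    return count


# Usage with an in-memory stand-in for the broker.
sent = []
total = publish_metrics(
    ["2024-01-01T00:00:00 fps=14.9 latency_ms=67.2"],
    lambda topic, payload: sent.append((topic, payload)),
)
```

Publishing one metric per topic is a design choice made for this sketch; a single topic carrying all metrics in one JSON object would work equally well.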

This project is part of a Bachelor's Research Thesis, aiming to detect and segment primary roots in plant images using a customized version of the Mask R-CNN model adapted for TensorFlow 2.0 and Keras 2.2.8. The original codebase from Matterport's Mask R-CNN was modified for compatibility and to support training and inference on annotated root datasets.

This project implements a Multi-Layer Perceptron (MLP) from scratch in Python. It demonstrates the fundamental concepts of building and training a neural network, including forward propagation, backward propagation, and parameter optimization.
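
The three steps named above can be sketched in pure Python as follows. This is an illustrative toy, not the project's code: the class name `TinyMLP`, the single hidden layer, sigmoid activations, mean squared error loss, and the XOR data are all assumptions made for the example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyMLP:
    """One hidden layer, sigmoid activations, mean squared error loss."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        # Forward propagation: input -> hidden activations -> scalar output.
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = sigmoid(sum(w * hj for w, hj in zip(self.w2, h)) + self.b2)
        return h, y

    def loss(self, data):
        return sum((self.forward(x)[1] - t) ** 2 for x, t in data) / len(data)

    def train_step(self, data, lr):
        # Backward propagation: accumulate gradients over the whole batch.
        n = len(data)
        gw1 = [[0.0] * len(row) for row in self.w1]
        gb1 = [0.0] * len(self.b1)
        gw2 = [0.0] * len(self.w2)
        gb2 = 0.0
        for x, t in data:
            h, y = self.forward(x)
            # Gradient of the loss w.r.t. the output pre-activation.
            d_out = 2.0 * (y - t) * y * (1.0 - y) / n
            gb2 += d_out
            for j, hj in enumerate(h):
                gw2[j] += d_out * hj
                # Chain rule back through hidden unit j.
                d_hid = d_out * self.w2[j] * hj * (1.0 - hj)
                gb1[j] += d_hid
                for i, xi in enumerate(x):
                    gw1[j][i] += d_hid * xi
        # Parameter optimization: plain gradient descent update.
        for j in range(len(self.w2)):
            self.w2[j] -= lr * gw2[j]
            self.b1[j] -= lr * gb1[j]
            for i in range(len(self.w1[j])):
                self.w1[j][i] -= lr * gw1[j][i]
        self.b2 -= lr * gb2


XOR = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]


def train_xor(epochs=2000, lr=0.5):
    """Train on XOR and return (network, loss before, loss after)."""
    net = TinyMLP(n_in=2, n_hidden=4, seed=0)
    before = net.loss(XOR)
    for _ in range(epochs):
        net.train_step(XOR, lr)
    return net, before, net.loss(XOR)
```

Calling `train_xor()` returns the loss before and after training, which gives a quick check that the updates move the parameters in a useful direction.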
Hackathons

Transform educational questions into interactive, story-based visualizations using AI. Built a 4-layer pipeline with 18 game templates, intelligent caching reducing processing time by 80%, and real-time progress tracking.
Built an agentic AI system using LangChain and LangGraph for automated, personalized interview prep, cutting manual effort by 90% through modular orchestration, relevant question generation, and evaluation with feedback.

Gamify
Zoom App Hackathon
Zoom
Developed a Zoom application leveraging real-time transcription to automatically generate interactive quizzes using Gemini AI, with seamless platform integration.

TwinGenius
Devils Invent Hackathon
Honeywell & Arizona State University
Revolutionized industrial digital twin creation by generating complete environments from natural language prompts in under 60 seconds using Gemini AI and AWS IoT TwinMaker.
About Me
As a Data Science master's student at ASU, I architect intelligent systems by specializing in RAG (Retrieval-Augmented Generation) pipelines for LLMs and developing sophisticated AI Agents. My core expertise lies in Natural Language Processing, where I design high-performance retrieval algorithms to power next-generation AI applications.
I translate complex theory into real-world impact. My project experience includes analyzing time-series data to build robust IoT pipelines for smart agriculture and engineering a production-ready, Dockerized pipeline for Intel's automated self-checkout system that visualizes critical data in Grafana.
I also engineered a Mask R-CNN pipeline to detect the primary root length of plant species such as wheat, Brassica napus, and Arabidopsis thaliana, enabling biologists to study the root phenome more effectively.
Education

Master of Science in Data Science
Aug 2024 - May 2026
Tempe, Arizona

Bachelor of Science in Electrical Engineering
Aug 2020 - May 2024
Ahmedabad, India

