Samyak Rajesh Jain
About Me
Hello! I'm Samyak, a Machine Learning Engineer with over 5 years of experience in Natural Language Processing (NLP), computer vision, and on-device ML. Currently pursuing my Master's in NLP at UC Santa Cruz, I'm passionate about bridging the gap between theoretical AI and real-world applications. My journey in AI has taken me through innovative projects at Coherent and Samsung Research.
My current focus areas include:
- Enhancing LLM reasoning with graph-structured plans (collaboration with Meta AI)
- Multimodal QA systems for technical data using LLMs and VLMs (with Bosch Research)
- Efficient on-device ML solutions
I'm always excited to collaborate on challenging AI projects. Let's connect and innovate together!
Work Experience
Coherent - Machine Learning Intern
Santa Clara, CA | Jun 2024 - Sept 2024
- Developed a GPT-4o-based multimodal chatbot to automate optical and electrical simulations, achieving 90% error-free simulations.
Samsung Research - Lead Machine Learning Engineer
Bengaluru, India | Jun 2018 - Jun 2023
- Led development of a real-time low-light video restoration model for Galaxy S23, achieving 4x better temporal consistency and 6% higher SSIM than the state of the art.
- Optimized multi-frame image processing pipeline for Galaxy S23 FE, boosting performance by 10%.
Siemens - Research Intern
Bengaluru, India | Jul 2017 - Dec 2017
- Developed an algorithm to extract mathematical formulae and charts from technical PDFs with 76% accuracy.
- Built a chart classifier using transfer learning on GoogLeNet, achieving 91% accuracy.
Samsung Research - Software Intern
Bengaluru, India | May 2017 - Jul 2017
- Developed a closed-domain QA system for home appliances using Dialogflow and a seq2seq LSTM.
DataPhi Labs - Data Science Intern
Bengaluru, India | May 2016 - Jun 2016
- Built a customer churn prediction engine using random forest, achieving an F-score of 0.76.
- Developed a dynamic insights mining wrapper for user-specific retention strategies.
Research Projects
Closed-Domain Multimodal QA using LLMs and VLMs
Santa Clara, CA | Jan - Jun 2024 (collaboration with Bosch Research)
- Fine-tuned VLMs such as Phi-3-vision, Idefics2, and LLaVA-NeXT using LoRA for a multimodal QA system, achieving a 12.4% improvement in multi-hop QA performance on technical graphs and tables.
Graph-based Planning System for Large Language Models (LLMs)
Santa Clara, CA | May 2024 - Present (collaboration with Meta AI)
- Trained a text encoder and a Graph Neural Network (GNN) in PyTorch to retrieve graph-structured plans, enhancing LLMs' complex reasoning and long-term planning capabilities for multi-hop Question Answering (QA).
Question Answering using Retrieval Augmented Generation (RAG)
Santa Clara, CA | Mar 2024 | Code
- Developed a retrieval-augmented QA system using Google's FLAN-T5 model and the Facebook AI Similarity Search (FAISS) vector database in LangChain.
Publications
- Legal Answer Validation using Few-Shot Multi-Choice QA (SemEval-2024)
- System, Apparatus and Method of Managing Knowledge Generated from Technical Data (US Patent 2019)
Education
- University of California, Santa Cruz (UCSC), Santa Clara, CA
  - Master of Science in Natural Language Processing (Sept 2023 - Dec 2024)
- National Institute of Technology Karnataka (NITK), Surathkal, India
  - Bachelor of Technology in Information Technology (Aug 2014 - May 2018)
Technical Skills
- Languages: Python, C++, C, SQL
- Frameworks: PyTorch, TensorFlow, HuggingFace transformers, LoRA fine-tuning, LangChain, pandas, numpy, scipy, scikit-learn, OpenCV, Gradio, Jupyter
- Tools & Technologies: Git, Linux, Embedded ML, Large Language Model (LLM), Retrieval Augmented Generation (RAG), Vision-Language Model (VLM), Data Generation
Contact
- Email: srajeshj@ucsc.edu / samyak24jain@gmail.com
- GitHub: github.com/samyak24jain
- LinkedIn: linkedin.com/in/samyak24jain