Samyak Rajesh Jain

About Me

👋 Hello! I'm Samyak, a Machine Learning Engineer with over 5 years of experience in Natural Language Processing (NLP), computer vision, and on-device ML. Currently pursuing my Master's in NLP at UC Santa Cruz, I'm passionate about bridging the gap between theoretical AI and real-world applications. 💼 My journey in AI has taken me through innovative projects at Coherent and Samsung Research.

🚀 My current focus areas include:

  • Enhancing LLM reasoning with graph-structured plans (collaboration with Meta AI)
  • Multimodal QA systems for technical data using LLMs and VLMs (with Bosch Research)
  • Efficient on-device ML solutions

๐Ÿค I'm always excited to collaborate on challenging AI projects. Let's connect and innovate together!

Download Resume 🔗

Work Experience

Coherent - Machine Learning Intern

Santa Clara, CA | Jun 2024 - Sept 2024

  • Developed a GPT-4o-based multimodal chatbot to automate optical and electrical simulations, with 90% of simulations running error-free.

Samsung Research - Lead Machine Learning Engineer

Bengaluru, India | Jun 2018 - Jun 2023

  • Led development of a real-time low-light video restoration model for Galaxy S23, achieving 4x better temporal consistency and 6% higher SSIM than the state of the art.
  • Optimized multi-frame image processing pipeline for Galaxy S23 FE, boosting performance by 10%.

Siemens - Research Intern

Bengaluru, India | Jul 2017 - Dec 2017

  • Developed an algorithm to extract mathematical formulae & charts from technical PDFs with 76% accuracy.
  • Built a chart classifier using transfer learning on GoogLeNet, achieving 91% accuracy.

Samsung Research - Software Intern

Bengaluru, India | May 2017 - Jul 2017

  • Developed a closed-domain QA system for home appliances using Dialogflow and a seq2seq LSTM.

DataPhi Labs - Data Science Intern

Bengaluru, India | May 2016 - Jun 2016

  • Built a customer churn prediction engine using random forest, achieving an F-score of 0.76.
  • Developed a dynamic insights mining wrapper for user-specific retention strategies.

Research Projects

Closed Domain Multimodal QA using LLMs and VLMs

Santa Clara, CA | Jan - Jun 2024 (collaboration with Bosch Research)

  • Fine-tuned VLMs like Phi-3-vision, Idefics2, and LLaVA-NeXT using LoRA for a multimodal QA system, achieving a 12.4% improvement in multi-hop QA performance on technical graphs and tables.

Graph-based Planning System for Large Language Models (LLMs)

Santa Clara, CA | May 2024 - Present (collaboration with Meta AI)

  • Trained a text encoder and a Graph Neural Network (GNN) in PyTorch to retrieve graph-structured plans, enhancing LLMs' complex reasoning and long-term planning capabilities for multi-hop Question Answering (QA).

Question Answering using Retrieval Augmented Generation (RAG)

Santa Clara, CA | Mar 2024 | Code

  • Developed a retrieval-augmented QA system using Google's FLAN-T5 model and a Facebook AI Similarity Search (FAISS) vector database in LangChain.

Publications

Education

  • University of California, Santa Cruz (UCSC), Santa Clara, CA
    • Master of Science in Natural Language Processing (Sept 2023 - Dec 2024)
  • National Institute of Technology Karnataka (NITK), Surathkal, India
    • Bachelor of Technology in Information Technology (Aug 2014 - May 2018)

Technical Skills

  • Languages: Python, C++, C, SQL
  • Frameworks: PyTorch, TensorFlow, HuggingFace transformers, LoRA fine-tuning, LangChain, pandas, numpy, scipy, scikit-learn, OpenCV, Gradio, Jupyter
  • Tools & Technologies: Git, Linux, Embedded ML, Large Language Model (LLM), Retrieval Augmented Generation (RAG), Vision-Language Model (VLM), Data Generation

Contact