Rishabh Jain | AI/ML Researcher & Engineer

<about>

About Me

Available for AI Consulting & Collaboration

Hi, I'm Rishabh Jain

Postdoctoral Researcher at Trinity College Dublin specializing in Multimodal AI

I completed my PhD in AI at the University of Galway, Ireland, focusing on speech technologies (TTS/STT). With a strong foundation in Machine Learning, Deep Learning, and Large Language Models, I'm passionate about advancing AI research and its applications.

My work spans speech recognition, text-to-speech synthesis, multimodal learning, and LLM development. I'm actively seeking opportunities to contribute to ML/AI projects as a Machine Learning Engineer, Research Scientist, or Data Scientist.

AI Research, Speech Technologies, LLMs, PyTorch, Multimodal AI, Deep Learning

Technical Arsenal

AI/ML Core

Machine Learning / Deep Learning 95%

PyTorch & TensorFlow 90%

Speech Technologies (TTS/STT) 95%

Large Language Models 88%

Programming

Python

Java

C/C++

JavaScript

SQL

Bash/Shell

YAML/JSON

Tools & Platforms

Docker

Git/GitHub

AWS

GCP

Linux/Slurm

Visualization

Unity

Jupyter/Colab

Research & Soft Skills

Research Methodology & Publication
Team Collaboration & Leadership
Problem Solving & Critical Thinking
Communication & Presentation
Project Management
Multilingual (English, Hindi, Spanish)

AI Frameworks & Libraries

Hugging Face

Transformers

Keras

scikit-learn

NumPy/Pandas

Weights & Biases

LangChain

ESPnet

Research Interests

Audio-Visual AI

Gesture Recognition

Facial Animation

Face Detection

Child Speech Tech

Low-Resource NLP

Academic Background

2021 - 2024

PhD in Sound Processing and Artificial Intelligence

University of Galway, Ireland

Thesis: Child Speech Understanding and Generation via Neural ASR and TTS Models

Enhanced child speech technologies in TTS and STT for the DAVID project
Optimized Tacotron 2 and FastPitch models for realistic synthetic child speech
Improved child ASR performance using wav2vec2, Whisper, and Conformer models
Applied data augmentation techniques to boost ASR accuracy for child speech
Integrated TTS and ASR technologies into interactive smart toys
Developed facial animation pipelines for synthetic-speaking children

2019 - 2020

MSc in Data Analytics

University of Galway, Ireland

Grade: 1.1 Honours

Thesis: Toolkit for Facial Landmarks Identification and Recognition based on Video Data Analysis

Key Modules:

Machine Learning, Deep Learning, NLP, Applied Regression, Data Visualization, Web & Network Science, Information Retrieval, Python & R

2015 - 2019

B.Tech in Computer Science (Bioinformatics)

VIT University, India

CGPA: 8.73 / 10.0

Thesis: Patient Health Monitoring System and Detection of Atrial Fibrillation, Fall, and Air Pollutants using IoT Technologies

Professional Journey

Nov 2024 - Present

Research Fellow

Trinity College Dublin, Ireland

Conducting research on multimodal large language models
Developing tools for gesture recognition and audio-visual modalities
Implementing state-of-the-art audio-visual speech recognition techniques
Collaborating with cross-functional teams on large-scale multimodal models

Multimodal AI, Audio-Visual, LLMs, PyTorch

Feb 2024 - Jun 2024

Research Scientist

Oxford Wave Research, UK

Specialized in STT and LLM training for low-resource languages
Fine-tuned LLMs including Mistral, Llama, Gemma, and Phi-3
Applied LLM quantization to reduce model size and inference cost
Developed speaker diarization systems with Whisper ASR
Used LangChain, Ollama, and LM Studio for LLM deployment

LLMs, Whisper, Quantization, LangChain

Oct 2020 - May 2024

Research Intern

Xperi Corporation, Ireland

Led the DTIF-DAVID project, an AI platform for voice-enabled toys developed in collaboration between Xperi, SoapBox Labs, and the University of Galway and funded by Enterprise Ireland
Developed a low-cost, low-power multimodal (sound and vision) AI processing platform for smart toys
Implemented embedded (on-device) processing to deliver a high-quality AI experience with long battery life
Built object and gesture detection and recognition, automatic speech recognition, and text-to-speech synthesis technologies
Integrated TTS/ASR technologies into interactive smart toys for children

Embedded AI, Object Detection, Gesture Recognition, Smart Toys

Oct 2020 - Feb 2024

Research Assistant

University of Galway, Ireland

Conducted research on speech and audio technology for the DTIF-DAVID project
Developed child speech synthesis models using Tacotron 2 and FastPitch for TTS applications
Enhanced child ASR performance with wav2vec2, Whisper, and Conformer models
Explored synthetic data generation from TTS models to improve ASR technologies for child speech
Implemented data augmentation techniques to boost ASR accuracy for children's voices
Published multiple research papers on child speech recognition and synthesis

TTS, ASR, Child Speech, Tacotron 2, wav2vec2

Jan 2020 - Apr 2020

Teaching Assistant

University of Galway, Ireland

Taught OOP, Network Communications, and Software Design courses
Mentored students in Java, C#, SQL, XML, and Linux
Evaluated assignments and conducted lab sessions

Java, C#, Teaching

Apr 2018 - Jul 2018

Software Engineer Intern

Enbake Consulting, India

Engineered geocoding application middleware using Python
Worked with NodeJS, MongoDB, and Google Cloud Platform
Developed data mining and modeling solutions

Python, NodeJS, GCP

Research & Publications

A Text-to-Speech Pipeline, Evaluation Methodology, and Initial Fine-Tuning Results for Child Speech Synthesis

R. Jain, M. Y. Yiwere, D. Bigioi, P. Corcoran and H. Cucu

IEEE Access, vol. 10, pp. 47628-47642, 2022

Exploring Native and Non-Native English Child Speech Recognition With Whisper

R. Jain, A. Barcovschi, M. Y. Yiwere, P. Corcoran and H. Cucu

IEEE Access, vol. 12, pp. 41601-41610, 2024

A WAV2VEC2-Based Experimental Study on Self-Supervised Learning Methods to Improve Child Speech Recognition

R. Jain, A. Barcovschi, M. Y. Yiwere, D. Bigioi, P. Corcoran and H. Cucu

IEEE Access, vol. 11, pp. 46938-46948, 2023

Augmentation Techniques for Adult-Speech to Generate Child-Like Speech Data Samples at Scale

M. Y. Yiwere, A. Barcovschi, R. Jain, H. Cucu and P. Corcoran

IEEE Access, 2023

Pose-Aware Speech Driven Facial Landmark Animation Pipeline for Automated Dubbing

D. Bigioi, H. Jordan, R. Jain, R. McDonnell and P. Corcoran

IEEE Access, vol. 10, pp. 133357-133369, 2022

Adaptation of Whisper models to child speech recognition

R. Jain, A. Barcovschi, M. Yiwere, P. Corcoran and H. Cucu

Interspeech 2023, pp. 5242-5246

Improved Child Text-to-Speech Synthesis through Fastpitch-based Transfer Learning

R. Jain and P. Corcoran

SpeD 2023, Bucharest, Romania, pp. 54-59

A comparative analysis between Conformer-Transducer, Whisper, and wav2vec2 for improving child speech recognition

A. Barcovschi, R. Jain and P. Corcoran

SpeD 2023, Bucharest, Romania, pp. 42-47

Data Center Audio/Video Intelligence on Device (DAVID) - An Edge-AI Platform for Smart-Toys

G. Cosache, F. Salgado, R. Jain, C. Rotariu, G. Sterpu and P. Corcoran

SpeD 2023, Bucharest, Romania, pp. 66-71

Synthetic Speaking Children – Why We Need Them and How to Make Them

M. Ali Farooq, D. Bigioi, R. Jain, W. Yao, M. Yiwere and P. Corcoran

SpeD 2023, Bucharest, Romania, pp. 36-41

Child Speech Understanding and Generation via Neural ASR and TTS Models

Rishabh Jain

PhD Thesis, University of Galway, Ireland (2021-2024)

Comprehensive research on enhancing child speech technologies through advanced TTS and ASR models, including Tacotron 2, FastPitch, wav2vec2, Whisper, and Conformer. Applied data augmentation techniques and developed ethical applications for interactive smart toys and facial animation pipelines.

Let's Connect

Get In Touch

I'm always open to discussing new projects, creative ideas, or opportunities to be part of your vision. Whether you're looking for research collaboration or have questions about AI/ML, feel free to reach out!

Email

j_rishabh@outlook.com

GitHub

@rishabhjain16

                            
                            
                            
connect.py

class Researcher:
    """Contact card for Rishabh Jain, written as a small Python class."""

    def __init__(self):
        self.name = "Rishabh Jain"
        self.role = "AI/ML Researcher"
        self.location = "Dublin, Ireland"
        self.interests = [
            "Multimodal AI",
            "Speech Technologies",
            "Large Language Models",
            "Deep Learning",
        ]

    def get_in_touch(self):
        """Return contact handles keyed by channel."""
        return {
            "email": "j_rishabh@outlook.com",
            "github": "@rishabhjain16",
            "linkedin": "rishabhjain16",
        }


# Let's collaborate!
rishabh = Researcher()
contact = rishabh.get_in_touch()
