Research Scientist Intern, Multimodal AI

Meta

Posted 22 days ago

Internship

Redmond, Washington

In Person

Smart Summary

Responsibilities include designing, implementing, and maintaining comprehensive evaluation protocols for large language models, developing high-quality datasets and benchmarks, and analyzing model outputs to provide actionable insights. The role also involves designing novel algorithms for audio research problems and collaborating with teams building Meta’s language AI products.

Meta is seeking a Research Scientist Intern with experience in audio signal processing, machine learning, and audio-visual learning. The ideal candidate should be pursuing a Ph.D. in Computer Science, AI, or a related field and have strong skills in Python, Matlab, and machine learning platforms like PyTorch or TensorFlow. Experience building novel audio computational models and LLMs is required.

Must Have Skills for ATS

Audio Signal Processing

Machine Learning

Audio Visual Learning

Multimodal Representation Learning

Audio Visual Scene Analysis

Egocentric Audio Visual Learning

Multi-sensory Speech Enhancement

Acoustic Activity Localization

Large Language Models

Python

Matlab

PyTorch

TensorFlow

Algorithm Design

Evaluation Protocols

Data Curation

Job Description

The Meta Reality Labs Research Team brings together a world-class team of researchers, developers, and engineers to create the future of virtual and augmented reality, which together will become as universal and essential as smartphones and personal computers are today. And just as personal computers have done over the past 45 years, AR, VR and MR will ultimately change everything about how we work, play, and connect. We are developing all the technologies needed to enable breakthrough AR glasses and VR headsets, including optics and displays, computer vision, audio, graphics, brain-computer interfaces, haptic interaction, eye/hand/face/body tracking, perception science, and true telepresence. Some of those will advance much faster than others, but they all need to happen to enable AR, VR and MR that are so compelling that they become an integral part of our lives.

In particular, the Meta Reality Labs Research audio team is focused on two goals: creating virtual sounds that are perceptually indistinguishable from reality, and redefining human hearing. See more about our work in "Inside Facebook Reality Labs Research: The future of audio" and "Filter Out the Noise With Conversation Focus". These two initiatives will connect people by allowing them to feel together despite being physically apart, and allow them to converse in even the most difficult listening environments.

Meta Reality Labs Research is looking for experienced interns who are passionate about groundbreaking research in audio signal processing, machine learning and audio-visual learning to solve important audio-driven problems for AR/VR applications. We currently have open positions for a range of projects in multimodal representation learning, audio-visual scene analysis, egocentric audio-visual learning, multi-sensory speech enhancement and acoustic activity localization. Our internships are twelve (12) to twenty-four (24) weeks long and we have various start dates throughout the year.

Responsibilities
  • Design, implement, and maintain comprehensive evaluation protocols for large language models, including both automated and human-in-the-loop assessments
  • Develop and curate high-quality datasets and benchmarks to measure model performance, safety, fairness, and robustness across a variety of tasks and modalities
  • Analyze model outputs to identify strengths, weaknesses, and failure modes, and provide actionable insights to research and engineering teams
  • Design and implement novel algorithms to solve audio research problems
  • Collaborate with teams building Meta’s language AI products, as well as with researchers, engineers, and cross-functional partners, to define evaluation goals, communicate findings, and drive improvements in model quality
  • Develop tools and infrastructure to streamline and scale evaluation processes, including dashboards, annotation platforms, and reporting systems
  • Stay up to date with the latest research in audio LLM evaluation, benchmarking, and responsible AI, and incorporate best practices into Meta’s workflows
  • Disseminate evaluation results through internal reports, presentations, and, when appropriate, external publications


Minimum Qualifications
  • Currently has, or is in the process of obtaining, a PhD degree in the field of Computer Science, Artificial Intelligence, Generative AI, Transformer Models, Machine Learning, Signal Processing or Computer Vision
  • 3+ years of experience with Python, Matlab, or similar
  • 3+ years of experience with machine learning software platforms such as PyTorch or TensorFlow
  • Experience building novel audio computational models and LLMs
  • Must obtain work authorization in the country of employment at the time of hire, and maintain ongoing work authorization during employment


Preferred Qualifications
  • Demonstrated software engineering experience via an internship, work experience, coding competitions, or widely used contributions in open source repositories (e.g. GitHub)
  • Experience in advancing AI techniques, including core contributions to open source libraries and frameworks in computer vision or audio processing
  • Experience with audio and speech quality assessment
  • Experience with multichannel audio processing
  • Experience in visual and acoustic scene analysis
  • Experience manipulating and analyzing complex, large scale, high-dimensionality data from varying sources
  • Proven track record of achieving significant results as demonstrated by grants, fellowships, and patents, as well as first-authored publications at leading workshops or top computer vision and machine learning conferences such as NeurIPS, ICML, ICLR, ACL, EMNLP, CVPR, ICCV, ECCV, ICASSP, Interspeech or similar
  • Experience in utilizing theoretical and empirical research to solve problems
  • Experience working and communicating cross functionally in a team environment
  • Intent to return to a degree program after the completion of the internship/co-op


$7,650/month to $12,134/month + benefits

Meta

Meta's mission is to build the future of human connection and the technology that makes it possible. Our technologies help people connect, find communities, and grow businesses. When Facebook launched in 2004, it changed the way people connect. Apps like Messenger, Instagram and WhatsApp further empowered billions around the world. Now, Meta is moving beyond 2D screens toward immersive experiences like augmented and virtual reality to help build the next evolution in social technology.

To help create a safe and respectful online space, we encourage constructive conversations on this page. Please note the following:
  • Start with an open mind. Whether you agree or disagree, engage with empathy.
  • Comments violating our Community Standards will be removed or hidden. Please treat everybody with respect.
  • Keep it constructive. Use your interactions here to learn about and grow your understanding of others.
  • Our moderators are here to uphold these guidelines for the benefit of everyone, every day.
  • If you are seeking support for issues related to your Facebook account, please reference our Help Center (https://www.facebook.com/help) or Help Community (https://www.facebook.com/help/community).

For a full listing of our jobs, visit https://www.metacareers.com
