Audio Trainer - $21/hr (Last Day)

Source: reddit-r-forhire

About the Role

In this position, you’ll record short spoken descriptions of images to help train next-generation AI systems that understand both vision and audio. Your voice work will directly support cutting-edge research in AI.

Responsibilities

View images and generate natural-sounding spoken descriptions.
Record short audio clips (2–3 minutes each) using provided tools.
Ensure recordings are high-quality (no background noise or distortion).
Follow stylistic/linguistic guidelines from the research team.
Collaborate with QA/researchers on improving dataset quality.

Qualifications

Excellent verbal communication and enunciation.
Native or near-native fluency in English (other languages are a plus).
Strong attention to detail; ability to follow guidelines.
Prior experience with voice recording/annotation is helpful but not required.
Comfortable with repetitive, independent work.

What You’ll Gain

$21/hour on an hourly contract.
Flexible, remote-friendly work.
Contribute to foundational AI research.
Experience at the intersection of audio, language, and computer vision.

Interview Process

15-minute AI interview plus a short availability form.
Responses are typically sent within a week.

Apply Here

https://work.mercor.com/jobs/list_AAABmF1oddizkrET0sdOqoLG?referralCode=22ad5755-7386-433a-8bce-3a817719fab4&utm_source=referral&utm_medium=share&utm_campaign=job_referral