Are you a Research Engineer with expertise in training custom AI?

Reply is a company that specialises in Consulting, Systems Integration and Digital Services, with a focus on the conception, design and implementation of solutions based on new communication channels and digital media. Reply partners with key industrial groups in defining and developing business models made possible by new technological and communication paradigms such as Artificial Intelligence, Big Data, Cloud Computing, Digital Communication, the Internet of Things, and Mobile and Social Networking.

You will design synthetic dataset generation pipelines from task descriptions, teacher-student distillation or production traces. You will build auto-research harnesses using AI coding to navigate the design space across recipes, architectural design, dataset mixes and ablations. You will run post-training workflows, with a focus on RL and agentic reasoning, via either RL or distillation, on custom RL environments across domains for professional use on the Pareto frontier. You will schedule jobs on a distributed cluster spanning local and cloud compute, and optimize serving configurations for online RL, evaluation or synthetic generation.

Technologies

You will work with cloud compute clusters (Lambda, Nebius, AWS, GCP), local compute clusters (NVIDIA DGX), AI workload managers (SkyPilot), RL environments and stack (NeMo-RL, SkyRL, TRL), synthetic dataset generation pipelines (DataFlow, DataTrove), experiment trackers (MLflow), model and dataset hubs (Hugging Face Hub), evaluation metrics and environments (LM Evaluation Harness), kernels (Triton), and serving (vLLM, Speculators).

You will collaborate with a young, dynamic team of engineers and scientists in a hybrid setup. You will participate in code reviews, share knowledge, and work in a friendly, informal, and fun office environment that values creativity and continuous learning.
We are looking for people with a degree in Computer Engineering, AI, Math, Software Engineering, or related quantitative fields. A strong foundation in programming, especially Python, Triton and Rust, is essential for handling training runs as well as training infrastructure.

Valuable expertise

You should have a track record of training LLMs, with practical experience in mid-training or post-training. We value candidates who have in-depth experience in a specific training direction, such as tiny multimodal architectures for computer use, hybrid architectures, or large-scale long-context architectures, rather than generic, surface-level notebook experimentation, and who have experience crafting datasets and ablations rather than relying on ready-made benchmarks and recipes. PhD in a related field and a proven track record of publications at Tier 1 conferences. Claude Code and OpenClaw are your best friends. Clear communication in English and attention to detail will help you succeed in this role.

Hundreds of small units with their own projects and teams. Even though it still may look unreal.

Reply is committed to embracing diversity and creating an inclusive work environment by valuing the uniqueness of people regardless of age, gender, sexual orientation, religion, nationality, or disabilities as protected by Italian Law (L. Furthermore, Reply is committed to ensuring a fair and accessible selection process: to help you during the recruitment process, please let us know of any kind of support you may need.
Experienced Research Engineer
REPLY
Sant'Ambrogio di Torino, Piemonte
Published 9 days ago