WHAT YOU DO AT AMD CHANGES EVERYTHING
We care deeply about transforming lives with AMD technology to enrich our industry, our communities, and the world. Our mission is to build great products that accelerate next-generation computing experiences - the building blocks for the data center, artificial intelligence, PCs, gaming and embedded. Underpinning our mission is the AMD culture. We push the limits of innovation to solve the world’s most important challenges. We strive for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives.
AMD together we advance_
Join AMD Silo AI's evaluation team as a hands-on evaluation engineer. We need a strong engineer to implement, scale, and operationalize our evaluation frameworks for large-scale language model development in multilingual settings.
You'll be the technical implementation backbone of our evaluation strategy, translating research insights into robust, scalable evaluation systems. Working closely with the pre-training and post-training teams, you'll focus on the engineering execution that makes high-quality LLM evaluation possible at scale.
The role offers significant technical ownership and the chance to shape how evaluation is done. You'll have the opportunity to work on cutting-edge LLM evaluation challenges while building systems and creating benchmarks that directly impact open-source model development decisions.
Main Responsibilities
- Extend and modernize our benchmark suite so that we use the most relevant evaluations for base models and post-trained models, with additional emphasis on expanding evaluation coverage for European and low-resource languages
- Publish code, benchmark datasets, and analysis notebooks under permissive licenses; engage with upstream tools and contribute fixes or extensions
- Optimize evaluation pipelines for distributed computing environments and multi-GPU setups
- Develop lightweight proxy tasks and ablation protocols that surface issues early in long training runs
- Work closely with pre-training and post-training teams to surface the right information, and help drive decision making for training techniques, data mixes, and data pipelines
- Coordinate with dev infra on experiment tracking, reporting and logging, establishing requirements and driving needed changes
- Collaborate with the OpenEuroLLM project on evaluations for European languages
- Audit the current evaluation infrastructure and identify technical bottlenecks and scalability issues
- Framework analysis: assess existing evaluation tools and frameworks, documenting gaps between research needs and current technical capabilities
- Take ownership of the technical maintenance of the existing evaluation framework
- Define a roadmap for extended experiment tracking capabilities
Qualifications
- Python programming and software engineering best practices
- Experience with PyTorch/Transformers ecosystem
- Experience with evaluation of large machine learning models
- MLOps familiarity: experiment tracking, model versioning, automated pipelines
- Computer Science or Engineering background: BS/MS in related field
- We welcome candidates from mid-level to senior level depending on experience and demonstrated capabilities
- Multilingual evaluation experience, particularly European languages
- Distributed computing experience (multi-GPU evaluation pipelines)
- Academic publications or industry blog posts on ML evaluation
- Experience with EleutherAI LM Evaluation Harness or similar frameworks
- Working knowledge of more than one language
- Strong communicator and collaborator
AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants’ needs under the respective laws throughout all stages of the recruitment and selection process.
Key Skills
Ranked by relevance
artificial intelligence
distributed computing
machine learning
embedded
- Posted: Jul 08, 2025
- Type: Full-time
- Level: Mid-Senior
- Location: Helsinki
- Company: AMD
- Industries: Semiconductor Manufacturing
- Categories: Engineering