Job Description
DATA ENGINEERING
What We Do
Our mission is to provide a world-class platform that empowers the business to use data to enhance, monitor, and support our products. We are responsible for data ingestion systems, processing pipelines, and a variety of data stores, all operating in the cloud. We operate at petabyte scale and support near real-time use cases as well as more traditional batch approaches.
What You'll Do
Epic Games is seeking a data specialist with a solid background in data science who is skilled in conducting comprehensive, end-to-end data collection projects, including data labeling, synthetic data generation, and data quality assessments, to support the training and evaluation of machine learning models. The ideal candidate should be able to leverage their experience to work with large teams of data labelers, establish robust data quality metrics, develop an efficient remediation process, and define meaningful strategies to drive data cleansing activities where needed.
In this role, you will:
• Lead the end-to-end logistics of data labeling, ensuring that we're hitting dataset volume targets for training/evaluation runs
• Own our data collection efforts, managing an in-house team of data labelers
• Own our synthetic data generation efforts, working with a variety of tools to produce large-scale synthetic training data
• Work closely with ML and data engineers to define data requirements and to support large-scale data sourcing, data labeling, and quality control
• Develop and enforce stringent data quality standards, implementing QA processes to ensure data meets ML-readiness benchmarks
What we're looking for
• Strong analytical background: BSc or MSc in data science, machine learning, or a related field. Candidates without a degree are welcome if they have proven hands-on experience.
• Programming experience with Python.
• Experience creating datasets for machine learning, including establishing quality metrics (e.g., inter-annotator agreement).
• Experience working with teams of data labelers to label datasets.
• Experience with processes to generate synthetic data where real-world data is unavailable.
• Experience with data labeling tools such as Human Signal / Label Studio.
Note to Recruitment Agencies: Epic does not accept any unsolicited resumes or approaches from any unauthorized third party (including recruitment or placement agencies) (i.e., a third party with whom we do not have a negotiated and validly executed agreement). We will not pay any fees to any unauthorized third party. Further details on these matters can be found here.
Jobcode: Reference SBJ-rb4z16-35-215-180-183-42 in your application.