Job Detail

Senior Machine Learning Operations Engineer (AWS/GCP)

SG

Job Description

The Senior Machine Learning Operations Engineer will build our machine learning platform from inception through to a scalable set of services across millions of customers and machines. We are a start-up within Dyson, with the benefit of Dyson's resources, a solid understanding of innovation and a large customer base. This is an opportunity you rarely see on the market.

This role is formative for machine learning operations at Dyson. We’re looking for someone who can bring solid experience leading and running production-grade machine learning services at scale. While experience is a must for this role, we need someone who is comfortable with uncertainty and innovation and able to drive into new territory, learning, growing and continuing to deliver along with the wider team.

You will work closely with Google and Google-certified partners to build out these capabilities, which are crucial for our 5-year plan, where we seek to double the business and grow the D2C proposition.

You will work amongst a growing and multi-skilled set of teams, ranging from app engineers, data engineers and data scientists to a top-class product team, to deliver a scalable machine learning platform and services. The majority of the projects you will work on will be greenfield. Examples of what you can expect day to day:

  • Productionising recommender systems

  • Working with new product data scientists to move from POC through to production-grade services

  • Developing cookie-cutter jobs and services to enable the wider data science community

  • Establishing and monitoring clear metrics for service and model performance

  • Developing runbooks and fallback solutions

  • Working with technical and non-technical stakeholder groups to support feature or platform capability

Accountabilities:

  • Accountable for building and deploying models into resilient, scalable machine learning pipelines.

  • Accountable for the reliability of pipelines and services in production. Developing and maintaining clear SLIs, SLOs and data contracts for services you maintain. This includes fallback solutions and alerting. It involves a combination of owning service reliability and working with a centralised SRE team.

  • A principle of ownership: you build it, you own it.

  • Accountable for maintaining, augmenting and improving the machine learning platform. We are in the early/growth phase, and new team members should expect to be key drivers of and contributors to our machine learning operations.

  • Accountable for contributing towards, and encouraging among the team, a supportive and safe environment.


Job Requirement

About you

  • You can communicate clearly with both technical and non-technical colleagues. You accurately judge the level of detail required in different situations.  

  • Advanced SQL

  • Advanced Python and TensorFlow

  • Experience with GCP or other cloud environments. GCP preferred.

  • Experience working across multiple engineering and data science teams to deliver production machine learning systems

  • Experience with Kubeflow, MLflow or other similar machine learning workflow frameworks

  • Experience with development environments and tooling, e.g. Git, Bash, CI/CD, AWS, Kubernetes

  • Understanding of Agile methodologies, and the ability to apply these for machine learning systems development
