Full-Time Data Engineer III
Job Description
US citizens and Green Card holders preferred
Job Summary
The Direct to Consumer Group is a technology company within Client. We are building a global streaming video (OTT) platform that covers search, recommendation, personalization, catalogue, video transcoding, global subscriptions, and much more. We build user experiences ranging from classic lean-back viewing to interactive learning applications, and we build for connected TVs, web, mobile phones, tablets, and consoles across a large footprint of Client-owned networks (Client, Food Network, Golf TV, MotorTrend, Eurosport, Client Play, and many more). This is a growing, global engineering group crucial to Client's future.
We are hiring Senior Software Engineers to join the Personalization, Recommendation and Search team. As part of a rapidly growing team, you will own complex systems that provide a personalized and unique experience to millions of users in over 200 countries across all Client brands. You will be responsible for building scalable, distributed data pipelines and will contribute to the design of our data platform and infrastructure.
You will handle big data, both structured and unstructured, at the scale of millions of users.
You will lead by example, define best practices, and set high standards for the entire team and for the rest of the organization. You have a successful track record of delivering ambitious projects across cross-functional teams. You are passionate and results-oriented. You strive for technical excellence and are very hands-on. Your co-workers love working with you. You have earned respect throughout your career through concrete accomplishments.
Qualifications:
- 5+ years of experience designing, building, deploying, testing, maintaining, monitoring and owning scalable, resilient and distributed data pipelines.
- High proficiency in at least two of Scala, Python, Spark, or Flink, applied to large-scale data sets (see the PySpark sketch after this list).
- Strong understanding of workflow management platforms (Airflow or similar; see the Airflow sketch after this list).
- Familiarity with advanced SQL.
- Expertise with big data technologies (Spark, Flink, Data Lake, Presto, Hive, Apache Beam, NoSQL, …).
- Knowledge of batch and streaming data processing techniques.
- Obsession with service observability, instrumentation, monitoring, and alerting.
- Understanding of the Data Lifecycle Management process used to collect, access, use, store, transfer, and delete data.
- Strong knowledge of AWS or similar cloud platforms.
- Expertise with CI/CD tools (CircleCI, Jenkins, or similar) to automate the building, testing, and deployment of data pipelines and to manage infrastructure as code (Pulumi, Terraform, or CloudFormation; see the Pulumi sketch after this list).
- Understanding of relational databases (e.g., MySQL, PostgreSQL), NoSQL databases (e.g., key-value stores like Redis, DynamoDB, RocksDB), and Search Engines (e.g., Elasticsearch). Ability to decide, based on the use case, when to use one over the other.
- Familiarity with recommendation and search systems used to personalize the experience for millions of users across millions of items.
- Master's degree in Computer Science or a related discipline.
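To give a sense of the large-scale data work referenced above, here is a minimal PySpark batch sketch; the session settings, sample rows, and column names are hypothetical and stand in for real watch-history data.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Hypothetical local session; a production job would run on a cluster.
spark = SparkSession.builder.appName("watch_history_rollup").getOrCreate()

# Toy watch-history events: one row per (user, title, minutes watched).
events = spark.createDataFrame(
    [(1, "show_a", 30), (1, "show_b", 12), (2, "show_a", 55)],
    ["user_id", "title", "minutes_watched"],
)

# Batch aggregation: total minutes watched per user.
per_user = events.groupBy("user_id").agg(
    F.sum("minutes_watched").alias("total_minutes")
)

per_user.show()
spark.stop()

In practice the same aggregation would read from the data lake and write to a warehouse table rather than using in-memory sample rows.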
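Likewise, a minimal Airflow sketch of the daily orchestration implied above might look like the following, assuming Airflow 2.4+ with the TaskFlow API; the DAG id, schedule, and task bodies are hypothetical placeholders.

from datetime import datetime

from airflow.decorators import dag, task


@dag(
    dag_id="daily_watch_history_rollup",  # hypothetical pipeline name
    schedule="@daily",
    start_date=datetime(2024, 1, 1),
    catchup=False,
)
def daily_watch_history_rollup():
    @task
    def extract() -> list[dict]:
        # Placeholder for reading raw events from the data lake or a stream.
        return [{"user_id": 1, "minutes_watched": 42}]

    @task
    def aggregate(rows: list[dict]) -> dict:
        # Placeholder for a Spark/Flink job that rolls the events up.
        return {"total_minutes": sum(r["minutes_watched"] for r in rows)}

    @task
    def load(summary: dict) -> None:
        # Placeholder for writing results to a warehouse or key-value store.
        print(summary)

    load(aggregate(extract()))


daily_watch_history_rollup()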
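And for the infrastructure-as-code side, a minimal Pulumi program in Python might look like the sketch below; the resource name is hypothetical, and Terraform or CloudFormation would express the same thing in their own syntax.

import pulumi
import pulumi_aws as aws

# Hypothetical bucket for raw pipeline events; a real stack would add
# encryption, lifecycle rules, and access policies.
raw_events = aws.s3.Bucket("raw-events")

pulumi.export("raw_events_bucket", raw_events.id)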
If you are motivated to succeed, self-driven, and excited by the idea that your work will define Client's success and the daily viewing experience for millions of users, please connect with us; we would love to chat with you!
- The role has four levels, and the client is considering candidates at all of them; compensation will depend on the level at which a candidate is placed. Lower-level candidates must be located in the Seattle, WA area. These are the minimum expectations. Go experience is a HUGE PLUS!
Will transfer H-1B visas and even sponsor candidates!
- Will consider candidates who are willing to work at the San Francisco, CA location.
- MUST HAVE:
- 5+ years of experience designing, building, deploying, testing, maintaining, monitoring and owning scalable, resilient and distributed data pipelines.
- High proficiency in at least two of Scala, Python, Spark, or Flink, applied to large-scale data sets.
- Strong understanding of workflow management platforms (Airflow or similar).