Data Engineer (P784)
About Us:
As a Mid-level Data Engineer at Kenility, you’ll join a tight-knit family of creative developers, engineers, and designers who strive to develop and deliver the highest-quality products to market.
Technical Requirements:
- Bachelor's degree in Engineering, Computer Science, or a related field.
- 4+ years of experience with AWS Cloud services: S3, Glue, Athena, Lake Formation, DataZone, Redshift, Lambda, ECS, EKS, API Gateway, EMR, Step Functions, Airflow.
- 3+ years of experience with Infrastructure as Code (IaC) and DevOps.
- 3+ years building data lakes using Data Mesh or Lakehouse architectures, with data governance policies.
- 4+ years of experience using Python for data engineering.
- 3+ years configuring CI/CD pipelines using tools such as GitLab CI, Azure DevOps, or Jenkins.
- 3+ years working with PySpark or similar frameworks.
- 3+ years using Docker and container orchestration.
- 3+ years developing microservices using containers or serverless architectures.
- 4+ years working with Amazon Redshift or other data warehouse solutions.
- 3+ years using Terraform for infrastructure management.
- 3+ years working with NoSQL databases such as DynamoDB or DocumentDB.
- Knowledge of data quality and data governance frameworks.
- Experience with data catalogs and metadata solutions.
- Certifications: AWS Solutions Architect and AWS Data Engineer or Data Analytics (required); a DevOps certification is desirable.
- Minimum B2 (Upper-Intermediate) or C1 (Advanced) level of English.
Tasks and Responsibilities:
- Lead and manage projects assigned to the team (cell), maintaining direct communication with clients to define scope and requirements.
- Provide technical guidance to junior and mid-level team members and help define technology strategies aligned with organizational goals.
- Design and build robust data processing solutions using Python and PySpark as the main tools.
- Write high-quality code with unit tests using pytest, build microservices using serverless or container-based architectures, and optimize data transformations on platforms like Redshift.
- Design and implement data architectures based on the Data Mesh model, ensuring a balance between decentralization and governance.
- Build Lakehouse architectures and real-time processing systems that make efficient use of cloud capabilities.
- Ensure all designs are scalable, cost-effective, and aligned with industry best practices.
- Use Terraform for Infrastructure as Code (IaC) and build CI/CD pipelines using tools like GitLab CI or Azure DevOps.
- Set up monitoring dashboards to detect and resolve data pipeline issues, ensuring service reliability and availability.
- Optimize systems to reduce costs and improve performance; integrate generative AI into data pipelines and explore advanced data cataloging methods.
- Define and implement data quality and governance frameworks to ensure integrity and trust in data assets.
- Document technical solutions clearly, present outcomes to stakeholders at all levels, and collaborate with cross-functional teams.
- Participate in internal communities of practice, sharing knowledge and best practices with peers.
Soft Skills:
- Responsibility
- Proactivity
- Flexibility
- Great communication skills