
Data Engineer (P784)

About Us:

As a Mid-level Data Engineer at Kenility, you’ll join a tight-knit family of creative developers, engineers, and designers who strive to develop and deliver the highest-quality products to the market.


Technical Requirements:

  • Bachelor's degree in Engineering, Computer Science, or a related field.
  • 4+ years of experience with AWS cloud services: S3, Glue, Athena, Lake Formation, DataZone, Redshift, Lambda, ECS, EKS, API Gateway, EMR, Step Functions, and Airflow.
  • 3+ years of experience with Infrastructure as Code (IaC) and DevOps.
  • 3+ years building data lakes using Data Mesh or Lakehouse architectures, with data governance policies.
  • 4+ years of experience using Python for data engineering.
  • 3+ years configuring CI/CD pipelines using tools such as GitLab CI, Azure DevOps, or Jenkins.
  • 3+ years working with PySpark or similar frameworks.
  • 3+ years using Docker and container orchestration.
  • 3+ years developing microservices using containers or serverless architectures.
  • 4+ years working with Amazon Redshift or other data warehouse solutions.
  • 3+ years using Terraform for infrastructure management.
  • 3+ years working with NoSQL databases such as DynamoDB or DocumentDB.
  • Knowledge of data quality and data governance frameworks.
  • Experience with data catalogs and metadata solutions.
  • Certifications: AWS Solutions Architect and AWS Data Engineer or AWS Data Analytics (required); DevOps certification (desirable).
  • Minimum B2 (Upper-Intermediate) or C1 (Proficient) level in English.


Tasks and Responsibilities:

  • Lead and manage projects assigned to the team (cell), maintaining direct communication with clients to define scope and requirements.
  • Provide technical guidance to junior and mid-level team members and help define technology strategies aligned with organizational goals.
  • Design and build robust data processing solutions using Python and PySpark as the main tools.
  • Write high-quality code with unit testing using PyTest, build microservices using serverless or container-based architectures, and optimize data transformations on platforms like Redshift.
  • Design and implement data architectures based on the Data Mesh model, ensuring a balance between decentralization and governance.
  • Build Lakehouse architectures and real-time processing systems that make efficient use of cloud capabilities.
  • Ensure all designs are scalable, cost-effective, and aligned with industry best practices.
  • Use Terraform for Infrastructure as Code (IaC) and build CI/CD pipelines using tools like GitLab CI or Azure DevOps.
  • Set up monitoring dashboards to detect and resolve data pipeline issues, ensuring service reliability and availability.
  • Optimize systems to reduce costs and improve performance; integrate generative AI into data pipelines and explore advanced data cataloging methods.
  • Define and implement data quality and governance frameworks to ensure integrity and trust in data assets.
  • Document technical solutions clearly, present outcomes to stakeholders at all levels, and collaborate with cross-functional teams.
  • Participate in internal communities of practice, sharing knowledge and best practices with peers.


Soft Skills:

  • Responsibility
  • Proactivity
  • Flexibility
  • Great communication skills