Description
We are seeking an experienced Data Engineer to lead the design, development, and optimization of end-to-end data pipelines and cloud-based solutions. You will be responsible for architecting scalable data and analytics systems, ensuring data integrity, and implementing software engineering best practices and patterns. The ideal candidate has a strong background in ETL, big data technologies, and cloud services, with a proven ability to drive complex projects from concept to production.
Primary Responsibilities / Essential Functions
This job description in no way states or implies that these are the only duties to be performed by the teammate occupying this position. The selected candidate may perform other related duties assigned to meet the ongoing needs of the business.
Key Responsibilities
Data Architecture and Engineering
- Design and implement scalable data pipelines for data ingestion, transformation, and storage.
- Architect and optimize data lakes and data warehouses to support analytics and reporting needs.
- Develop robust ETL processes to integrate structured and unstructured data from diverse sources.
- Ensure high data quality through cleaning, validation, and transformation techniques.
Cloud and Big Data Solutions
- Lead the implementation of big data frameworks such as Hadoop and Spark for processing large datasets.
- Develop and optimize solutions on cloud platforms, including AWS S3, Azure Data Lake, Google BigQuery, and Snowflake.
- Manage data lakes to facilitate efficient data access and processing for downstream applications.
Database and Data Warehousing
- Design, implement, and manage relational (SQL) and non-relational (NoSQL) database systems.
- Lead database architecture efforts, including schema design, query optimization, and performance tuning.
- Oversee the design and management of data warehouses, ensuring reliability, scalability, and security.
Software Development and Automation
- Utilize Python and SQL to develop efficient, production-ready code for data pipelines and integrations.
- Implement scripting automation using Bash and PowerShell to streamline workflows.
- Leverage version control (Git) and follow best practices in code optimization, unit testing, and debugging.
Collaboration and Leadership
- Act as a technical leader, providing guidance on best practices.
- Collaborate with cross-functional teams (Data Scientists, Software Engineers, Analysts) to meet business objectives.
- Drive innovation by evaluating and integrating emerging tools, technologies, and frameworks.
- Establish and maintain CI/CD pipelines to ensure efficient deployment and system reliability.
Qualifications
Required:
- Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field.
- 5+ years of experience as a Data Engineer with expertise in building large-scale data solutions.
- Proficiency in Python, SQL, and scripting languages (Bash, PowerShell).
- Deep understanding of big data tools (Hadoop, Spark) and ETL processes.
- Hands-on experience with cloud platforms (AWS S3, Azure Data Lake, Google BigQuery, Snowflake).
- Strong knowledge of database systems (SQL, NoSQL), database design, and query optimization.
- Experience designing and managing data warehouses for performance and scalability.
- Proficiency in software engineering practices: version control (Git), CI/CD pipelines, and unit testing.
Preferred:
- Strong experience in software architecture, design patterns, and code optimization.
- Expertise in Python-based pipelines and ETL frameworks.
- Experience with Azure Data Services and Databricks.
- Excellent problem-solving, analytical, and communication skills.
- Experience working in agile environments and collaborating with diverse teams.