Talent.com
Data Engineer
Qloo • New York City, New York, USA

About Us

At Qloo, we harness large-scale behavioral and catalog data to power recommendations and insights across entertainment, dining, travel, retail, and more. Our platform is built on a modern AWS data stack and supports analytics, APIs, and machine-learning models used by leading brands. We are looking for an experienced Data Engineer to help evolve and scale this platform.

Role Overview

As a Data Engineer at Qloo, you will design, build, and operate the pipelines that move data from external vendors, internal systems, and public sources into our S3-based data lake and downstream services. You'll work across AWS Glue, EMR (Spark), Athena / Hive, and Airflow (MWAA) to ensure that our data is accurate, well-modeled, and efficiently accessible for analytics, indexing, and machine-learning workloads.

You should be comfortable owning end-to-end data flows, from ingestion and transformation to quality checks, monitoring, and performance tuning.

Responsibilities

  • Design, develop, and maintain batch data pipelines using Python, Spark (EMR), and AWS Glue, loading data from S3, RDS, and external sources into Hive / Athena tables.
  • Model datasets in our S3 / Hive data lake to support analytics (Hex), API use cases, Elasticsearch indexes, and ML models.
  • Implement and operate workflows in Airflow (MWAA), including dependency management, scheduling, retries, and alerting via Slack.
  • Build robust data quality and validation checks (schema validation, freshness / volume checks, anomaly detection) and ensure issues are surfaced quickly with monitoring and alerts.
  • Optimize jobs for cost and performance (partitioning, file formats, join strategies, proper use of EMR / Glue resources).
  • Collaborate closely with data scientists, ML engineers, and application engineers to understand data requirements and design schemas and pipelines that serve multiple use cases.
  • Contribute to internal tooling and shared libraries that make working with our data platform faster, safer, and more consistent.
  • Document pipelines, datasets, and best practices so the broader team can easily understand and work with our data.

Qualifications

  • Bachelor's degree in Computer Science, Software Engineering, or a related field, or equivalent practical experience.
  • Experience with Python and distributed data processing using Spark (PySpark) on EMR or a similar environment.
  • Hands-on experience with core AWS data services, ideally including:
      • S3 (data lake, partitioning, lifecycle management)
      • AWS Glue (jobs, crawlers, catalogs)
      • EMR or other managed Spark platforms
      • Athena / Hive and SQL for querying large datasets
      • Relational databases such as RDS (PostgreSQL / MySQL or similar)
  • Experience building and operating workflows in Airflow (MWAA experience is a plus).
  • Strong SQL skills and familiarity with data modeling concepts for analytics and APIs.
  • Solid understanding of data quality practices (testing, validation frameworks, monitoring / observability).
  • Comfortable working in a collaborative environment, managing multiple projects, and owning systems end-to-end.

We Offer

  • Competitive salary and benefits package, including health insurance, retirement plan, and paid time off.
  • The opportunity to shape a modern cloud-based data platform that powers real products and ML experiences.
  • A collaborative, low-ego work environment where your ideas are valued and your contributions are visible.
  • Flexible work arrangements (remote and hybrid options) and a healthy respect for work-life balance.

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.

Required Experience: IC

Key Skills: Apache Hive, S3, Hadoop, Redshift, Spark, AWS, Apache Pig, NoSQL, Big Data, Data Warehouse, Kafka, Scala

Employment Type: Full-Time

Department / Functional Area: Engineering

Experience: years

Vacancy: 1