DCM Infotech

Data Engineer

Dallas

Contract

Job Description

The client is looking for a Data Engineer to join its Advanced Analytics team under the Wholesale organization. This engineer will design and deliver data solutions for the analytics and data science teams in the consumer organization, working with a wide variety of data: fiber network and outage data, billing, digital behavior, customer interactions, and customer demographic data.
In this role, you will design and build out modern data infrastructure to streamline workflows for the data science team and the broader advanced analytics organization within the consumer business. These teams are primarily responsible for building predictive models and driving retention and sales strategy for the consumer base.

Responsibilities:
Develop and implement ETL pipelines to efficiently transfer data from SQL Server to Databricks, ensuring accuracy, scalability, and optimal performance.
Validate and transform data within Databricks, including testing for data integrity, consistency, and quality across ingestion and processing stages.
Collaborate with cross-functional teams to design, build, and optimize curated tables in Databricks for analytics and reporting purposes.
Monitor and troubleshoot data workflows, addressing any issues related to performance, reliability, or data anomalies.
Document data engineering processes and provide technical support for ongoing maintenance and enhancements to the Databricks environment.

Required Qualifications:
2+ years of experience as a Data Engineer or in a similar role
Experience with data modeling, data warehousing, and building data pipelines
Proven experience in designing and implementing comprehensive data pipelines for a variety of flows (data integration across systems, ETL processes, machine learning infrastructures)
Proficient in SQL and Python
Experience building ETL pipelines in Databricks

Preferred Qualifications:
Experience with the AWS ecosystem (e.g., Glue, Athena, S3, Lambda, IAM, SageMaker, CloudWatch, API Gateway), Delta Lake, PySpark, Apache Spark, Airflow, APIs (REST, SOAP, RPC), and streaming event data; the more experience, the better
Understanding of system architecture and experience with large distributed systems, including familiarity with the Apache Spark ecosystem
Prior experience working with Teradata SQL, MS SQL, and IBM DB2 or similar dialects
Prior experience with continuous integration and continuous deployment (CI/CD) of large, scalable data systems
Prior experience working with and supporting a data science team
Familiarity with data from telecoms, mobile providers, ISPs, or cable companies