Job Description
Position Overview: We are seeking a Senior Data Engineer to lead the design and optimization of large-scale data processing systems. You will focus on building resilient infrastructure that supports high-velocity data ingestion and complex downstream analytics.
Detailed Responsibilities:
Big Data Technologies: Build, maintain, and scale distributed data processing systems using Spark, Hadoop, and Kafka.
ETL/ELT Architecture: Design and automate robust ETL/ELT pipelines and data warehousing solutions capable of handling terabytes of structured and unstructured data.
Pipeline Optimization: Fine-tune data flows to ensure low-latency consumption by machine learning models and executive-level BI reporting.
Data Integrity: Implement end-to-end automated data quality checks and validation logic to maintain a single source of truth.
Scalability: Partner with cloud architects to ensure the data infrastructure is cost-effective and horizontally scalable.