Key Responsibilities
- Data Pipeline Development: Design and implement end-to-end data pipelines and workflows using Fabric Pipelines, Dataflows Gen2, and Notebooks.
- Fabric Lakehouse/Warehouse: Create, manage, and optimize Lakehouse/Warehouse structures and Delta Lake tables.
- Data Processing & Transformation: Use Python, PySpark, and Spark SQL for complex data ingestion, transformation, and processing (an illustrative sketch follows this list).
- Performance Optimization: Tune SQL queries and PySpark jobs for performance, efficiency, and minimal compute usage.
- Data Governance & Quality: Ensure data quality, integrity, and security using Microsoft Purview, role-based access control (RBAC), and OneLake governance capabilities.
- Power BI Integration: Collaborate with analytics teams to build semantic models, Direct Lake and DirectQuery datasets, and DAX calculations.
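
For illustration only, a minimal PySpark sketch of the kind of lakehouse ingestion-and-transformation work described above. The file path, column names, and table name (Files/raw/orders, order_date, amount, orders_silver) are hypothetical, and Delta Lake support is assumed to be available, as it is in a Fabric notebook.

```python
from pyspark.sql import SparkSession, functions as F

# In a Fabric notebook a SparkSession named `spark` is already provided;
# getOrCreate() reuses it (or builds a local one when testing elsewhere).
spark = SparkSession.builder.getOrCreate()

# Hypothetical raw CSV files landed in the lakehouse Files area.
raw = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv("Files/raw/orders/*.csv")
)

# Light cleansing: normalise column names, cast types, drop exact duplicates.
clean = (
    raw.toDF(*[c.strip().lower().replace(" ", "_") for c in raw.columns])
       .withColumn("order_date", F.to_date("order_date"))
       .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
       .dropDuplicates()
)

# Persist as a managed Delta table in the lakehouse; overwrite keeps the example idempotent.
clean.write.format("delta").mode("overwrite").saveAsTable("orders_silver")
```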
Required Skills and Qualifications
- Experience: 3 to 10+ years of data engineering experience, including hands-on work with Microsoft Fabric.
- Platform Expertise: Strong knowledge of Microsoft Fabric components (OneLake, Data Factory, Synapse Data Engineering, Synapse Data Warehouse) and Azure data services (ADLS Gen2, Azure SQL).
- Technical Skills: Proficient in SQL, Python/PySpark, Spark SQL, and Delta Lake.
- Architecture: Experience with modern data warehousing, the Medallion Architecture, and ETL/ELT methodologies (a minimal layering sketch follows the list below).
- Certifications: Microsoft Certified: Fabric Data Engineer Associate (DP-700) or Fabric Analytics Engineer Associate (DP-600) is highly preferred.
- Soft Skills: Strong collaboration, communication, and problem-solving skills.
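
For illustration only, a minimal sketch of Medallion Architecture layering (bronze, silver, gold) in PySpark, of the kind referenced in the Architecture requirement. The source path, column names, and table names (Files/raw/events, event_id, event_ts, bronze_events, silver_events, gold_daily_events) are hypothetical, and Delta Lake support is assumed, as in a Fabric lakehouse.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Bronze: land source data as-is (append-only) into a Delta table.
raw_events = spark.read.json("Files/raw/events/")
raw_events.write.format("delta").mode("append").saveAsTable("bronze_events")

# Silver: conform types and deduplicate the bronze data.
silver = (
    spark.table("bronze_events")
         .withColumn("event_ts", F.to_timestamp("event_ts"))
         .dropDuplicates(["event_id"])
)
silver.write.format("delta").mode("overwrite").saveAsTable("silver_events")

# Gold: business-level aggregate ready for a Power BI semantic model.
gold = (
    spark.table("silver_events")
         .groupBy(F.to_date("event_ts").alias("event_date"))
         .agg(F.count("*").alias("event_count"))
)
gold.write.format("delta").mode("overwrite").saveAsTable("gold_daily_events")
```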