PySpark Engineer

Details of the offer

Project description: We are looking for skilled PySpark Engineers to join our team, working on a high-impact data engineering project.
The project involves processing large datasets, optimizing ETL pipelines, and building scalable solutions to manage complex data workflows.
The ideal candidate will collaborate closely with data scientists, data analysts, and software engineers to deliver robust, data-driven insights that support business decisions.
Responsibilities

- Design, develop, and maintain ETL pipelines using PySpark, optimizing for performance and scalability (an illustrative sketch follows the skills list below).
- Work with large volumes of structured and unstructured data, transforming it to meet business needs.
- Integrate data from multiple sources into the data platform, ensuring data integrity and quality.
- Collaborate with cross-functional teams to understand data requirements and translate them into efficient data workflows.
- Implement best practices for data governance, monitoring, and data security.
- Debug and troubleshoot issues across ETL pipelines and data workflows.
- Continuously improve the performance, scalability, and reliability of existing data pipelines.
- Provide documentation and training for data workflows and processes.

Skills

Must have:

- Proficiency in PySpark: in-depth experience with PySpark for data processing and transformation tasks.
- SQL knowledge: strong command of SQL for querying and processing data.
- Data warehousing concepts: familiarity with data warehousing, data lakes, and data integration principles.
- Cloud platforms: experience with cloud environments such as AWS, GCP, or Azure for data storage and processing.
- Big data technologies: hands-on experience with Hadoop and the Spark ecosystem (Spark SQL, Spark Streaming).
- Data modeling: experience in designing and implementing efficient data models.
- Python programming: strong Python skills, particularly in data manipulation and analysis.

Nice to have:

- Experience with Airflow or other orchestration tools: knowledge of workflow orchestration tools for scheduling and monitoring data pipelines.
- Knowledge of Apache Kafka: understanding of Kafka for real-time data streaming and integration.
- Familiarity with data visualization tools: knowledge of tools such as Tableau, Power BI, or similar.
- Machine learning exposure: familiarity with machine learning concepts, particularly integrating ML models into data workflows.
- Agile methodology: experience working in Agile/Scrum environments.
- Data governance and compliance knowledge: understanding of data governance frameworks and compliance standards, such as GDPR.

Other languages: English (C1 Advanced)
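For context on the day-to-day work, here is a minimal PySpark ETL sketch. It is purely illustrative: the bucket paths, column names (customer_id, amount, event_ts), and the aggregation are hypothetical examples, not details of this project's actual pipelines.

```python
# Minimal illustrative PySpark ETL sketch. All paths, column names,
# and the aggregation itself are hypothetical examples, not details
# of this project's actual pipelines.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("example_etl").getOrCreate()

# Extract: read raw events from a (hypothetical) landing zone.
raw = spark.read.option("header", "true").csv("s3://example-bucket/landing/events/")

# Transform: drop malformed rows, cast types, and aggregate per customer per day.
clean = (
    raw.dropna(subset=["customer_id", "amount"])
       .withColumn("amount", F.col("amount").cast("double"))
)
daily_totals = (
    clean.groupBy("customer_id", F.to_date("event_ts").alias("event_date"))
         .agg(F.sum("amount").alias("total_amount"))
)

# Load: write partitioned Parquet to a (hypothetical) curated zone.
daily_totals.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3://example-bucket/curated/daily_totals/"
)

spark.stop()
```

The same extract-transform-load shape scales from this toy example to the large, partitioned datasets described above; in a production pipeline the transform step would typically also handle schema enforcement, deduplication, and data-quality checks.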
Seniority: Senior

Nominal Salary: To be agreed
