Senior Data Engineer
We are looking for a Senior Data Engineer with strong experience in building and optimising data pipelines using Databricks, Apache Spark, and PySpark. The ideal candidate is passionate about data architecture, performance optimisation, and working with high-scale distributed data systems.
You will play a key role in designing and developing scalable data ingestion, transformation, and processing pipelines, enabling reliable and timely data for downstream analytics, reporting, and machine learning.
About the Client
Our client is the world's largest human resources consulting firm, headquartered in New York City with main branches in 40+ countries. Its more than 20,500 employees serve clients in over 130 countries, and its services are used by 97% of Fortune 500 companies.
What You’ll Do
- Design, develop, and maintain scalable and efficient data pipelines using Databricks, Apache Spark, and PySpark
- Collaborate with data scientists, analysts, and product teams to understand data requirements and ensure reliable data delivery
- Implement ETL/ELT workflows to extract, cleanse, transform, and load data from various structured and unstructured sources
- Optimise Spark jobs and workflows for performance, scalability, and cost-efficiency
- Develop reusable components, frameworks, and libraries to accelerate pipeline development
- Monitor data quality and pipeline health; implement data validation and error-handling mechanisms
- Ensure compliance with security, privacy, and governance policies
- Contribute to best practices in data engineering and cloud-native data architecture
What You Bring
- 3–6+ years of experience in data engineering or software engineering with a focus on large-scale data processing
- Strong hands-on experience with Apache Spark and PySpark
- Proficiency with the Databricks platform (including notebooks, jobs, clusters, and workspace management)
- Solid knowledge of data formats (Parquet, Avro, JSON, etc.) and data modeling concepts
- Experience building and orchestrating ETL/ELT pipelines (e.g., Airflow, Databricks Workflows, or Azure Data Factory)
- Familiarity with cloud platforms (Azure, AWS, or GCP) and their data services
- Strong programming skills in Python; SQL expertise is a must
- Understanding of CI/CD practices and version control (Git)
- Ability to work in Agile development environments and collaborate with cross-functional teams
Nice to Have
- Experience with Delta Lake or other transactional data lake technologies
- Familiarity with data lakehouse architecture
- Exposure to data warehousing tools and MPP databases (Snowflake, Redshift, BigQuery, etc.)
- Knowledge of data governance, lineage, and cataloging tools (e.g., Unity Catalog, DataHub, Collibra)
- Experience with streaming data (Kafka, Spark Structured Streaming)
- English level: Upper-Intermediate
- Department: Data Engineering
- Locations: Armenia, Bulgaria, Latvia, Poland, Serbia, Spain, Turkey, Uzbekistan
- Remote status: Fully Remote
About Bonapolia
For job seekers, Bonapolia offers a gateway to exciting career prospects and the chance to thrive in a fulfilling work environment. We believe that the right job can transform lives, and we are committed to making that happen for you.