Senior Software Developer - ETL
Overview
The Health Services Cluster, under the Ministry of Public and Business Service Delivery and Procurement, is seeking a Senior Software Developer – ETL to join its centralized data platform team in Toronto. The successful consultant will design, build, and optimize complex ETL/ELT pipelines and Medallion Data Lakehouse architectures that support large-scale, mission-critical health services systems. This is a hands-on senior technical role with a strong emphasis on Azure-based data engineering, pipeline automation, and data quality assurance.
Key Responsibilities
- Design technical solutions for data acquisition and storage into a centralized data repository
- Develop ELT scripts, implement data-driven logic, and conduct unit testing to ensure solution integrity
- Design and build scalable Medallion Data Lakehouse architectures using Delta Lake and Databricks
- Build, automate, and optimize complex ETL/ELT pipelines using Azure Data Factory (ADF), Databricks (PySpark, SQL, Delta Live Tables), and dbt
- Model and design databases using star and snowflake schemas to improve overall performance
- Implement orchestrated workflows and job scheduling within Azure environments
- Investigate and resolve data incidents, determining whether issues originate from loading code or upstream data providers
- Execute service requests related to routine and ad-hoc data loads
- Perform data quality checks and report on data quality issues
- Implement data lineage, cataloging, metadata management, and data quality frameworks, including Unity Catalog for managing Databricks permissions
- Develop and maintain system design models, technical documentation, and specifications
- Produce design artifacts and documentation to support the long-term maintenance of implemented solutions
Must-Have Requirements
- 10+ years of experience designing and developing scalable Medallion Data Lakehouse architectures
- 10+ years of expertise in data ingestion, transformation, and curation using Delta Lake and Databricks
- 10+ years of experience building, automating, and optimizing complex ETL/ELT pipelines using Azure Data Factory (ADF), Databricks (PySpark, SQL, Delta Live Tables), and dbt
- 10+ years of experience integrating structured and unstructured data sources into star/snowflake schemas
- 10+ years of experience with relational databases (SQL Server, Synapse, PostgreSQL) and dimensional modeling
- 10+ years of experience with advanced SQL query optimization, indexing, partitioning, and data replication strategies
- 10+ years of experience with Apache Spark, Delta Lake, and distributed computing frameworks in Azure Databricks
- 10+ years of deep expertise with Azure Data Lake Storage (ADLS), Azure Synapse Analytics, Azure SQL, Event Hubs, and Azure Functions
- 10+ years of proficiency in Python (PySpark), SQL, and PowerShell for data engineering workflows
- 10+ years of experience working with Parquet, ORC, and JSON formats for optimized storage and retrieval
- 10+ years of experience in technical systems specifications and translating them into working, tested applications for large, complex, mission-critical systems
- 10+ years of experience in technical analysis, program design, writing and/or generating code, and conducting unit tests
- 10+ years of experience with software across various computing platforms, operating systems, database technologies, communication protocols, middleware, and gateways
- 10+ years of experience developing and maintaining system design models, technical documentation, and specifications
- 5+ years of experience with CI/CD automation (Azure DevOps, GitHub Actions) for data pipelines
- 5+ years of experience conducting technical evaluation and assessment of options for design issues, integration capabilities, and gap analysis
- Experience implementing cloud security, RBAC, and data governance best practices
- Experience with Unity Catalog for managing permissions in Databricks environments
- Expertise in Power BI, including DAX, data modeling, performance tuning, and integration with Azure Synapse and Databricks SQL Warehouses
- Familiarity with MLflow, AutoML, and embedding AI-driven insights into data pipelines
- End-to-end SDLC experience
- Security clearance: CRJMC required
Nice-to-Have Skills
- Knowledge of Public Sector Enterprise Architecture artifacts, processes, and practices, with the ability to produce standards-compliant technical documentation
- Knowledge of Project Management Institute (PMI) and Public Sector I&IT project management methodologies
- Knowledge and understanding of Ministry policy and IT project approval processes
- Experience with large, complex IT health-related projects
Work Environment
This is a fully onsite role based at 5700 Yonge Street, Toronto. The consultant is expected to work 7.25 hours per day between 8:00 AM and 5:00 PM, Monday to Friday. A CRJMC security clearance is required. The role operates within an Ontario Public Service (OPS) environment supporting the Health Services Cluster.