Job Description
The Data Engineer will collaborate with other IT groups, business partners, and external service providers, and will play a key role in the design, development, and operations of a new data analytics and data integration platform, which includes the Enterprise Data Lake and Enterprise Data Warehouse.

Duties and Responsibilities:
- Participate in requirements gathering: work with key business partner groups and other Data Engineering personnel to understand department-level data requirements.
- Provide a unified strategy for GCP transformation governance, security, implementation, and operations.
- Drive innovation within Data Engineering by playing a lead role in technology decisions for the future of our data science, analysis, and reporting needs.
- Design and implement the enterprise data lake, data warehouse, and data platforms, ensuring reliable data infrastructure and creating data solutions for business partners.
- Design data pipelines: work with other Data Engineering personnel on an overall design for flowing data from various internal and external sources into the data platform.
- Build data pipelines: leverage the standard toolset and develop ETL/ELT code to move data from various internal and external sources into the data platform.
- Support the data quality program: work with the Data QA Engineer to identify automated QA checks and associated monitoring and alerting, ensuring the data platform maintains consistently high-quality data.
- Migrate data from the current state to new enterprise solutions.
- Serve as a thought leader for data stewardship and data governance.
- Support operations: triage alerts channeled to you and remediate as necessary.
Key contributor to defining, implementing, and supporting:
- Data services
- Data dictionary
- Tool standards
- Best practices
- Data lineage
- User training

Qualifications:
- 5+ years' experience with data warehouse design, build, QA, and upkeep
- Bachelor's degree in Computer Science, a related field, or equivalent experience
- Experience working with data lake architecture in the data domain
- Strong ELT/ETL designer/developer
- Strong SQL and NoSQL database development skills
- Strong Python
- Structured and unstructured data expertise
- Cloud environment development and operations experience with GCP

Preference for candidates experienced with:
- Google Cloud Platform (GCP) and associated services, e.g. BigQuery, GCS, Cloud Composer, Dataproc, Dataflow, Dataprep, Cloud Pub/Sub, Metadata DB, Data Studio, Datalab, and others
- Other important tools: Apache Airflow (scheduling), Bitbucket and Git (version control), alert notification tools, Docker
- Data visualization tools (e.g. Tableau, D3.js, Python, and R)
- Real-time data replication/streaming tools
- Data modeling
- Excellent verbal and written communication
- Basic understanding of GCP AI and machine learning
- Strong team player

Education Requirements
Bachelor's degree in Computer Science, Digital Strategy combined with technology execution, or similar. Additional consideration for demonstrated post-degree continuing education or self-study credentials.

Communication Skills
Ability to communicate effectively with technical and non-technical colleagues, in person and remotely. Demonstrated ability to create the full lifecycle of documentation (enterprise data strategy, data architecture, logical/conceptual/physical data model documents, metadata and business glossary documents).