Description:
• Collaborate with business groups at data.world to ask thoughtful questions and gather requirements around key insights and reporting needs.
• Engineer lasting solutions that turn application data and streaming events into self-service reports and insights using AWS Cloud Services and SaaS tools.
• Develop and maintain a suite of production SQL data models using dbt to transform our streaming and transactional data into a form suitable and efficient for analytics and data science work.
• Administer and optimize our Snowflake data warehouse and BI infrastructure.
• Evolve our star schema and data flows to ensure utility and performance. Diagram and document as needed to support understanding and troubleshooting.
• Write and maintain SQL jobs in support of ETL/ELT and BI analysis, reporting, and visualization, and troubleshoot those jobs as required.
• Build and maintain internal data catalog including data dictionaries, glossary, and curated datasets in support of easy self-service by the rest of the company.
• Develop BI data reports, visualizations, and queries to measure our KPIs and support the success of our SaaS business.
• Implement new productized data and analytics capabilities that help customers understand their usage and improve their data governance.
• Conduct evidence-based investigations and draw actionable conclusions in support of company and team goals and overall product success.
• Be a steward and evangelist for a data-driven culture and data best practices within the company.
• Be customer zero, leveraging our product and providing feedback as one of the key target users for whom data.world is intended.
Our data stack:
• Segment as our customer data platform
• AWS cloud services and Stitch/Fivetran for extraction and load jobs
• Dagster for workflow orchestration
• Snowflake and dbt for in-warehouse transformations
• data.world for data catalog, governance, and collaboration
• Tableau as our data serving and BI layer
• CircleCI for CI/CD
Experience and capabilities you have:
• 2+ years of core data engineering experience writing production-grade ELT jobs using scripting languages such as Python and workflow orchestration tools such as Airflow, Dagster, Prefect, or Luigi.
• 2+ years of experience writing production-grade SQL and working with modern data warehouses such as Snowflake, BigQuery, or Redshift.
• 2+ years of experience deploying cloud resources using Infrastructure as Code (IaC) tools such as AWS CloudFormation or Terraform.
• Familiarity with dbt (data build tool) for cloud data warehouse transformations.
• Strong data modeling experience using star schema or other data modeling patterns.
• Strong interpersonal skills and experience interfacing with stakeholders both inside and outside the company.
• Good communication and presentation skills with the ability to explain concepts and conclusions around data and insights in a clear, concise, and compelling way.
Big pluses:
• Experience working with dbt (data build tool) in either dbt Cloud or self-hosted environments.
• Experience developing data visualizations and choosing the best way to present information using BI tools such as Tableau, Looker, etc.
• Experience maintaining production-grade CI/CD pipelines.
• Experience with operational analytics or reverse ETL tools such as Hightouch, Census, etc.
• Experience working with streaming data infrastructure such as AWS Kinesis, Kafka, Materialize, etc.
• Experience working in SaaS or enterprise software companies in the data or analytics space.