Determine best practices and architecture for all product data, including images, machine learning classifications, user labels, metadata, and more
Develop and implement data pipelines, ranging from real-time streaming to connected users to long-term, queryable warehousing for retrieval and/or machine learning training
Implement solutions in AWS, leveraging tools such as S3, Athena, Redshift, RDS, or others
Determine the need for and/or drive partnerships with third-party data processing and/or warehousing companies such as Snowflake, Databricks, or others
Work closely with the security team to ensure stringent data privacy requirements are met
Work closely with backend, DevOps, and machine learning teams to integrate architectural solutions into the end-user product
Develop cost-effective data practices that meet all stakeholders' needs
Minimum Qualifications
Bachelor’s Degree in Computer Science or a related field
5+ years of experience using AWS for large-scale production big data problems
3+ years of experience developing data pipelines including relational databases and long-term warehouse-level storage
3+ years of experience with Postgres
Experience with AWS RDS
Experience with AWS Athena
Experience with data warehousing and processing solutions such as Redshift, Snowflake, and/or Databricks