As a Data Engineer on the Askuity team, you will be working to understand and improve the primary dataset and processing architecture. You will develop efficient data capture and transformation processes and complex data models that form the core data foundation to enable analysts and data scientists, extending to self-serve solutions for the broader business teams.

Askuity is a Toronto-based retail analytics software company operating as an innovative new division within The Home Depot. Askuity’s mission is to transform merchant-vendor collaboration between The Home Depot and its product suppliers by enabling best-in-class data-driven decision-making and real-time retail execution.

You will work closely with the COO, VP of Client Operations, and external experts to build out Askuity’s Data Science capabilities. We’re looking for self-starters with a strong sense of urgency who thrive when operating in a fast-paced environment, enjoy the challenge of highly complex technical contexts working with hundreds of terabytes of data, and, above all else, are passionate about data and analytics.

Position Responsibilities:

  • Design, construct, test, and maintain scalable data pipelines and environments to support future analytics
  • Assemble large, complex data sets that meet business requirements
  • Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
  • Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS ‘big data’ technologies
  • Build analytics tools that utilize the data pipeline to provide actionable insights for vendors, merchants, and other stakeholders
  • Work closely with experts (whether from BlackLocus, The Home Depot, or an outside consultancy) to cleanse, analyze, and interpret data
  • Create data tools that help analyst and data scientist team members build and optimize our product into an innovative industry leader

Experience/Knowledge Required:

Advanced working knowledge of SQL, including query authoring and hands-on experience with a variety of relational databases.

  • Experience building and optimizing ‘big data’ data pipelines, architectures, and data sets
  • Strong analytic skills related to working with unstructured datasets
  • Experience building processes supporting data transformation, data structures, metadata, dependency management, and workload management
  • Strong written and verbal communication skills to support the clear communication of your objectives and findings
  • Strong project management and organizational skills
  • A degree in Computer Science, Statistics, Informatics, Information Systems, or another quantitative field, as well as experience with the following software/tools:
  • Experience with big data tools: Hadoop, Spark, Kafka, etc.
  • Experience with relational SQL and NoSQL databases, including Postgres and Cassandra.
  • Experience with data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
  • Experience with AWS cloud services: EC2, EMR, RDS, Redshift
  • Experience with stream-processing systems: Storm, Spark Streaming, etc.
  • Experience with object-oriented/functional scripting languages: Python, Java, C++, Scala, etc.

If you’re interested in working for a fast-growing Toronto startup, then drop us a line at