We are looking for a skilled Data Scientist to join our analytics team. The ideal candidate has an eye for building and optimizing data systems and will work closely with our systems architects, data scientists, and analysts to help direct the flow of data within the pipeline and ensure consistency of data delivery and utilization across multiple projects.
- Work closely with other data and analytics team members to optimize the company’s data systems and pipeline architecture
- Design and build the infrastructure for extracting, preparing, and loading data from a variety of sources using technologies such as SQL
- Build data and analytics tools that will offer deeper insight into the pipeline, allowing for critical discoveries surrounding key performance indicators and customer activity
- Continually look for ways to improve efficiency across all company data systems
- Graduate degree in Computer Science, Information Systems, or an equivalent quantitative field, and 2+ years of experience in a similar Data Scientist role
- Experience working with and extracting value from large, disconnected and/or unstructured datasets
- Demonstrated ability to build processes that support data transformation, data structures, metadata, dependency and workload management
- Strong interpersonal skills and the ability to manage projects and work with cross-functional teams
- Advanced working knowledge of SQL, including query authoring, experience with relational databases, and working familiarity with a variety of database systems
- Experience building and optimizing big data pipelines, architectures, and data sets
- Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement
- Experience with the following tools and technologies:
  - Hadoop, Spark, Kafka, MapReduce, HDFS
  - Relational SQL and NoSQL databases
  - Data pipeline/workflow management tools such as Azkaban and Airflow
  - Stream-processing systems such as Storm and Spark Streaming
  - Object-oriented and functional programming or scripting languages such as Python, Java, and C++