Data Scientist

Cambridge, MA

 
The Data Scientist is responsible for modeling complex Institute problems, discovering Institute insights, and identifying opportunities through the use of statistical, algorithmic, data mining, and visualization techniques. In addition to advanced analytic skills, this role requires proficiency in integrating and preparing large, varied datasets, architecting specialized database and computing environments, and communicating results.
 
Data Scientists work closely with clients, data stewards, project/program managers, and other IT teams to turn data into critical information and knowledge that can be used to make sound organizational decisions. Other responsibilities include providing data that is consistent and reliable. They need to be creative thinkers who propose innovative ways to look at problems by applying data mining (the process of discovering new patterns in large datasets) to the information available. They will need to validate their findings using an experimental and iterative approach. Data Scientists will also need to present their findings to the business, exposing their assumptions and validation work in a way that their business counterparts can easily understand.
 
These professionals will need a combination of business focus, strong analytical and problem-solving skills, and programming knowledge to quickly cycle hypotheses through the discovery phase of the project. Excellent written and verbal communication skills are required to report findings in a clear, structured manner.
 
Essential Functions: 
 
  • Designs experiments, tests hypotheses, and builds models.
  • Conducts advanced data analysis and designs complex algorithms.
  • Works with Institute stakeholders to identify the business requirements and the expected outcome.
  • Models and frames business scenarios that are meaningful and that affect critical business processes and/or decisions.
  • Identifies what data is available and relevant, including internal and external data sources, leveraging new data collection processes.
  • Collaborates with Institute subject matter experts to select the relevant sources of information.
  • Works with IT teams to support data collection, integration, and retention requirements based on input collected from the business.
  • Solves client analytics problems and communicates results and methodologies.
  • Works in iterative processes with the client and validates findings.
  • Develops experimental design approaches to validate findings or test hypotheses.
  • Validates analysis by comparing appropriate samples.
  • Defines the validity of the information, how long the information is meaningful, and what other information it is related to.
  • Works with the data steward to ensure that the information used is in compliance with the regulatory and security policies in place.
  • Qualifies where information can be stored or what information, external to the organization, may be used in support of the use case.
  • Identifies and analyzes patterns in the volume of data supporting the initiative, the type of data (e.g., images, text, clickstream or metering data) and the speed or sudden variations in data collection.
  • Partners with the data stewards to define the data quality expectation in the context of the specific use case.
  • Recommends ongoing improvements to methods and algorithms that lead to findings, including new information.
  • Presents and explains the rationale of their findings in easy-to-understand terms for the business.
  • Presents results that contradict common belief, if needed.
  • Educates the organization both from IT and the business perspectives on new approaches, such as testing hypotheses and statistical validation of results. Helps the organization understand the principles and the math behind the process to drive organizational buy-in.
  • Provides business metrics for the overall project to show improvements (contribution to the improvement should be monitored initially and over multiple iterations).
  • Demonstrates the following scientist qualities: clarity, accuracy, precision, relevance, depth, breadth, logic, significance, and fairness.
  • Provides ongoing tracking and monitoring of the performance of decision systems and statistical models.
 
Qualifications & Technical Skills:
 
  • Bachelor's degree in mathematics, statistics, computer science or related field.
  • Typically requires 3-5 years of relevant quantitative and qualitative research and analytics experience.
  • Solid knowledge of statistical techniques.
  • Knowledge of Hadoop, Hive and/or MapReduce.
  • Strong programming skills (e.g., Python, R, Java, SQL) and experience with statistical modeling software (e.g., SAS).
  • Experience using machine-learning algorithms.
  • Proficiency in the use of statistical packages.
  • Proficiency in statistical analysis, quantitative analytics, forecasting/predictive analytics, multivariate testing, and optimization algorithms.
  • The ability to come up with solutions to loosely defined business problems by leveraging pattern detection over potentially large datasets.
  • Enjoys discovering and solving problems.
  • Strong communication and interpersonal skills.
  • Knowledge of one or more business/functional areas.
  • Ability to work in teams and collaborate with others to clarify requirements.
