Senior Data Engineer
USA, Remote
$110,560 to $155,840
Job Description
You are a driven and motivated problem solver ready to pursue meaningful work. You strive to make an impact every day, not only at work but also in your personal life and community. If that sounds like you, then you've landed in the right place.
The Data Science AI Factory team is committed to exploring new ways to use data and analytics to solve business problems. The team utilizes a variety of data sources, with a strong focus on unstructured and semi-structured text using NLP to enhance outcomes related to claim, underwriting, operations and the customer experience.
As a Sr. Data Engineer, you will act as an established thought leader, partnering closely with subject-matter experts to design, develop, and implement data assets for a wide range of new initiatives across multiple lines of business. The role involves heavy data exploration, proficiency with SQL and Python, knowledge of service-based deployments and APIs, and the ability to discover and learn quickly through collaboration. You will need to think analytically and outside the box, question current processes, and continue to build your business acumen.
The work combines team collaboration with independent effort. We seek candidates with a strong quantitative background and excellent analytical and problem-solving skills. This position combines business and technical skills and involves interaction with business customers, data science partners, internal and external data suppliers, and information technology partners.
Responsibilities
Identify and validate internal and external data sources for availability and quality. Work with SMEs to describe and understand data lineage and suitability for a use case.
Create data assets and build data pipelines that align with modern software development principles for further analytical consumption. Perform data analysis to ensure the quality of data assets.
Create summary statistics/reports from data warehouses, marts, and operational data stores.
Extract data from source systems and data warehouses, and deliver it in a predefined format using standard database query and parsing tools.
Understand ways to link or compare information already in our systems with new information.
Perform preliminary exploratory analysis to evaluate nulls, duplicates and other issues with data sources.
Work with data scientists and knowledge engineers to understand the requirements and propose and identify data sources and alternatives.
Produce code artifacts and documentation using GitHub for reproducible results and hand-off to other data science teams.
Propose ways to improve and standardize processes to enable new data and capability assessment and to enable pivoting to new projects.
Understand data classification and adhere to the information protection and privacy restrictions on data.
Collaborate closely with data scientists, business partners, data suppliers, and IT resources.
Experience & Skills
Candidates must have the technical skills to transform, manipulate, and store data; the analytical skills to relate the data to the business processes that generate it; and the communication skills to document and disseminate information regarding the availability, quality, and other characteristics of the data to a diverse audience. These varied skills may be demonstrated through the following:
Bachelor’s degree or equivalent experience in a related quantitative field
5+ years of experience accessing and retrieving data from disparate large data sources by creating and tuning SQL queries. Understanding of data modeling concepts, data warehousing tools, and databases (e.g. Oracle, AWS, Snowflake, Spark/PySpark, ETL, Big Data, and Hive)
Demonstrated ability to create and deliver high-quality Python code using software engineering best practices. Experience with object-oriented programming and software development a plus. Proficiency with GitHub and Linux highly desired.
Ability to analyze data sources and provide technical solutions. Strong exploratory and problem-solving skills to check for data quality issues.
Ability to determine business recommendations and translate them into actionable steps
Self-starter with curiosity and a willingness to become a data expert
Demonstrated passion for both learning new skills and leading data research and discovery
Results-oriented, with the ability to multitask and adjust priorities when necessary
Ability to work both independently and in a team environment with internal customers
Ability to articulate technical concepts regarding data to both data scientists and partners, and to train them on those concepts