Location: Pune, India
Roles and Responsibilities
● Solicit business, functional, non-functional, and technical requirements through interviews and
the requirements-gathering process.
● Analyze and document the above requirements and data definitions, perform data analysis, and
assist in change management, training, and testing efforts.
● Work with stakeholders to gather requirements on merging, de-duplicating, and standardizing data.
● Develop, support, and refine new data pipelines, data models, business logic, data schemas as
code, and analytics to product specifications.
● Prototype and optimize data type checks to ensure data uniformity prior to load (a minimal sketch follows this list).
● Develop and refine batch-processing data pipeline frameworks.
● Maintain, improve, and develop expertise in existing production data models and algorithms.
● Learn and apply business data-domain knowledge and its correlation to underlying data
sources.
● Define, document, and maintain a data dictionary including data definitions, data sources,
business meaning and usage of information.
● Identify and validate opportunities to reuse existing data and algorithms.
● Collaborate on design and implementation of data standardization procedures.
● Share team responsibilities, such as contributing to the development of data warehouses and
productizing algorithms created by Data Science team members.
● Participate in on-call and weekly shift rotation.
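The pre-load data type checks mentioned above could be implemented in many ways; below is a minimal sketch in Python using pandas, assuming a hypothetical expected schema. The column names and dtypes are illustrative assumptions, not part of this posting.

    import pandas as pd

    # Hypothetical expected schema (column name -> pandas dtype).
    # These names and types are illustrative only.
    EXPECTED_SCHEMA = {
        "customer_id": "int64",
        "signup_date": "datetime64[ns]",
        "monthly_spend": "float64",
    }

    def check_types(df: pd.DataFrame) -> list:
        """Return a list of schema/type mismatches found before load."""
        problems = []
        for column, expected in EXPECTED_SCHEMA.items():
            if column not in df.columns:
                problems.append("missing column: %s" % column)
            elif str(df[column].dtype) != expected:
                problems.append(
                    "%s: expected %s, got %s"
                    % (column, expected, df[column].dtype)
                )
        return problems

    if __name__ == "__main__":
        df = pd.DataFrame({
            "customer_id": [1, 2],
            "signup_date": pd.to_datetime(["2024-01-01", "2024-02-15"]),
            "monthly_spend": [19.99, 42.50],
        })
        print(check_types(df) or "all type checks passed")

A check like this would typically run as a pipeline step before the load stage, failing fast on mismatches rather than loading non-uniform data.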
Skill Set
● 4-6 years of experience building data pipelines and using ETL tools (must-have).
● 2+ years of experience with ETL tools such as Talend or Jaspersoft ETL (must-have).
● 2+ years of experience with the SQL programming language (must-have).
● Strong skills in writing stored procedures and SQL queries (must-have).
● 2+ years of experience in Python programming (must-have).
● Sound knowledge of distributed systems and data processing with Spark.
● Knowledge of a tool for scheduling and orchestrating data pipelines or workflows
(must-have; Airflow preferred). A minimal example follows this list.
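For the orchestration requirement, the sketch below shows a minimal, hypothetical Airflow DAG wiring a daily extract-transform-load sequence. The DAG id, task names, and schedule are illustrative assumptions, and the `schedule` argument assumes Airflow 2.4 or later.

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    # Hypothetical pipeline steps; the bodies are placeholders.
    def extract():
        print("pull source data")

    def transform():
        print("apply type checks and standardization")

    def load():
        print("load into the warehouse")

    # Minimal daily DAG; the `schedule` keyword assumes Airflow 2.4+.
    with DAG(
        dag_id="daily_batch_pipeline",
        start_date=datetime(2024, 1, 1),
        schedule="@daily",
        catchup=False,
    ) as dag:
        t_extract = PythonOperator(task_id="extract", python_callable=extract)
        t_transform = PythonOperator(task_id="transform", python_callable=transform)
        t_load = PythonOperator(task_id="load", python_callable=load)

        # Run extract -> transform -> load in order.
        t_extract >> t_transform >> t_load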
