- Location: Pune, Pune, India
- Industry: FMCG, Retail & E-Commerce
KEY ACCOUNTABILITIES/OUTCOMES
- Building, monitoring, and optimizing data pipelines
- Learning and using modern data preparation, integration, and metadata management tools and techniques
- Monitoring and remediating data quality issues
- Tracking data consumption patterns
- Monitoring schema changes
KNOWLEDGE/SKILLS/EXPERIENCE
A bachelor's or master's degree in computer science, statistics, data management, information systems, or a related quantitative field is preferred.
Candidates will ideally have a combination of IT skills, data governance skills, and analytics skills with a technical or computer science degree. Experience with data management disciplines is strongly preferred, including data integration, modeling, optimization, data quality, and other areas directly relevant to data engineering responsibilities and tasks.
TECHNICAL SKILLS
- Must be hands-on and well-versed in Python/PySpark coding and scripting
- Strong expertise and hands-on experience with Azure Data Factory and Azure Databricks
- Databricks Lakehouse architecture experience strongly preferred
- Programming and coding skills, with familiarity with software and system engineering design principles and standards (e.g., TDD)
- Foundational knowledge of data management architectures such as Data Warehouse, Data Lake, and Data Hub, and of supporting processes such as Data Integration, Governance, and Metadata Management
- Ability to design, build and manage data pipelines for data structures encompassing data transformation, data models, data quality and observability, schemas, metadata, and job management
- Understanding of structured, semi-structured, and unstructured data sources
- Experience with data integration technologies such as ETL/ELT, data replication/CDC, message-oriented data movement, API design and access, stream data integration, and data virtualization
- Basic understanding of machine learning algorithms and approaches
- Knowledge of, or proficiency in, one or more popular languages and frameworks such as SQL, Python, Scala, Spark, and the command line
- Familiarity with Open-Source and Commercial (e.g., Azure Cloud Services) technology
- Ability to automate development via CI/CD patterns and processes
- Experience leveraging version control and repository management systems (Git experience recommended)
- Experience leveraging IDE and source code editors (e.g., Visual Studio Code)
- Data visualization (e.g., Power BI, Tableau, QlikSense)
BUSINESS SKILLS
- Self-driven and action-oriented
- Ability to work in diverse, cross-functional teams
- Strong collaboration skills with business & technical personas
- Strong moderation and communication skills
- Strong presentation skills, including storytelling and other techniques to guide and inspire
- Ability to understand business and functional needs and translate them into data analytics problems
- Ability to refine requirements and design and develop data deliverables accordingly
- Display drive and curiosity to understand the business process to its core and its intersections with data
- Network with internal and external partners
- Continuous learning and upskilling (conferences, publications/research, courses, meetups, online resources, etc.)
- Educating and training business and technical counterparts
- Cultivating innovation and creativity
- Familiarity with design thinking techniques and agile & lean methodologies