Job description
Our Client is looking for a Data Engineer who will primarily focus on building data pipelines. They will be expected to leverage a variety of advanced tools and technologies, such as Kafka or Kinesis for real-time data processing and streaming, relational and NoSQL databases for robust data storage and management, and other integration tools for seamless data flow across cloud and on-premises platforms. The Data Engineer will also use ETL processes to extract data from various sources, transform it to fit operational and business needs, and load it into an end target. In addition, they will be expected to be familiar with Pub-Sub messaging patterns or similar data dissemination models to ensure efficient data distribution and consumption. A primary goal is to create a real-time, bidirectional data pipeline from Oracle transactional databases to a data lake in the cloud.
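To make the Pub-Sub pattern referenced above concrete, here is a minimal sketch using the open-source kafka-python client; the broker address, topic name, and payload are illustrative placeholders, not details of the Client's actual stack.

```python
# Minimal pub/sub sketch using the kafka-python client.
# Broker address, topic name, and payload are hypothetical placeholders.
import json
from kafka import KafkaProducer, KafkaConsumer

# Producer side: publish change events from the transactional source.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
producer.send("orders.changes", {"order_id": 42, "status": "SHIPPED"})
producer.flush()

# Consumer side: subscribe and stream events toward the data lake.
consumer = KafkaConsumer(
    "orders.changes",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    auto_offset_reset="earliest",
)
for message in consumer:
    print(message.value)  # in practice: write to cloud object storage
```

A production pipeline would typically layer change-data-capture, schema management, and error handling on top of this basic publish/subscribe loop.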
Responsibilities
- Develop, construct, test, and maintain data architectures and pipelines
- Create best-practice ETL frameworks: repeatable, reliable data pipelines that convert raw data into powerful signals and features (a minimal sketch follows this list)
- Handle raw data (structured, unstructured, and semi-structured) and shape it into a more usable, structured format better suited for reporting and analytics
- Work with the cloud solutions architect to ensure data solutions align with the company's platform architecture and its underlying infrastructure
- Collaborate with business teams to improve data models that feed business intelligence tools, increasing data accessibility and fostering data-driven decision making across the organization
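As an illustration of the kind of repeatable ETL pipeline described in the list above, the following is a minimal extract-transform-load skeleton, assuming the python-oracledb driver and pandas; the connection details, table, and data-lake path are hypothetical placeholders.

```python
# Minimal ETL skeleton: Oracle source -> transform -> Parquet in the data lake.
# All connection details, table names, and paths are hypothetical placeholders.
import oracledb
import pandas as pd

def extract(user: str, password: str, dsn: str) -> pd.DataFrame:
    # Extract: pull raw rows from the Oracle transactional source.
    with oracledb.connect(user=user, password=password, dsn=dsn) as conn:
        return pd.read_sql("SELECT * FROM orders", conn)

def transform(df: pd.DataFrame) -> pd.DataFrame:
    # Transform: align raw data into a shape better suited for reporting.
    df = df.dropna(subset=["ORDER_ID"])
    df["ORDER_DATE"] = pd.to_datetime(df["ORDER_DATE"])
    return df

def load(df: pd.DataFrame, target: str) -> None:
    # Load: land the curated dataset in the data lake as Parquet.
    df.to_parquet(target, index=False)

if __name__ == "__main__":
    raw = extract("etl_user", "secret", "dbhost/orclpdb1")
    load(transform(raw), "s3://data-lake/curated/orders.parquet")
```

Writing directly to an s3:// path requires an installed filesystem adapter such as s3fs; a local path works out of the box.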
Qualifications
The qualified candidate must possess skills and experience in the following areas:
- A bachelor’s degree in computer science, data science, software or computer engineering, or a related field.
- Proven experience as a data engineer or in a similar role, with a track record of manipulating, processing, and extracting value from large, disconnected datasets.
- Demonstrated technical proficiency with data architecture, databases, and processing large data sets.
- Proficient in Oracle databases and comprehensive understanding of ETL processes, including creating and implementing custom ETL processes.
- Experience with cloud services (AWS, Azure), and understanding of distributed systems, such as Hadoop/MapReduce, Spark, or equivalent technologies.
- Knowledge of Kafka, Kinesis, OCI Data Integration, Azure Service Bus, or similar technologies for real-time data processing and streaming.
- Data-driven mindset, with the ability to translate business requirements into data solutions.
- Experience with version control systems such as Git, and with agile/Scrum methodologies.
- Strong organizational, critical-thinking, and problem-solving skills, with a clear understanding of high-performance algorithms and Python scripting.
- Certifications in a related field would be an added advantage (e.g., Google Certified Professional Data Engineer, AWS Certified Big Data, etc.).