Available for opportunities

Jabulani Mcineka

Data Engineer | AWS Certified | ETL Pipelines & Data Warehousing | Python, SQL | AWS (S3, Glue, Lambda)
AWS Certified Data Engineer with hands-on experience building data pipelines, cloud-based data solutions, and analytical systems using Python, SQL, and AWS.

About Me

I'm an AWS Certified Data Engineer based in Durban, South Africa, currently working at Africa Health Research Institute (AHRI), where I design and maintain production-grade ETL pipelines, Power BI dashboards, and clinical data systems that support real-world health research across Africa.

My work spans the full data lifecycle, from ingesting raw data and designing cloud-based data solutions on AWS to transforming, validating, and delivering data for reporting and analytics. I focus on building reliable, scalable systems that turn complex, messy data into meaningful insights.

I have hands-on experience with Python, SQL, and AWS services including S3, Lambda, Glue, and RDS, applied in both professional and project-based environments.

I continuously strengthen my data engineering skills through hands-on projects focused on data pipelines, cloud architectures, and scalable data systems, and I am exploring real-time data pipelines and AWS-based data platforms.

Location

Durban, KwaZulu-Natal, South Africa

Current Role

Data Manager · Africa Health Research Institute

Education

PGDip Computer Science · Tshwane University of Technology

Certifications

AWS Cloud Practitioner · AWS Data Engineer Associate

View Projects → · GitHub Profile · LinkedIn
15+ Projects Built
2 Certifications
AWS Cloud Platform
100% Free Tier

Skills & Technologies

AWS Cloud

Building scalable cloud infrastructure using AWS Free Tier services for real-world data engineering workloads.

S3 · Glue · Athena · IAM · EC2

Python Engineering

Writing production-grade Python for data ingestion, transformation, and automated quality validation pipelines.

Python 3.12 · Pandas · PyArrow · Boto3 · Requests

Pipeline Orchestration

Designing and deploying automated DAGs with Apache Airflow for scheduled, monitored, and reliable data pipelines.

Apache Airflow · Docker · DAGs · BashOperator

Data Architecture

Implementing Medallion Architecture (Raw → Silver → Gold) for clean, scalable, and queryable data lake designs.

Medallion Architecture · Parquet · Data Lake · ETL

Data Analysis

Querying and analysing large datasets using SQL on Athena and Python for delivery, sales, and e-commerce insights.

SQL · Athena · Pandas · Data Quality

DevOps & Tools

Containerising services with Docker Compose and managing code with Git and GitHub for version control and CI/CD.

Docker · Docker Compose · Git · GitHub

Projects

Fake Store API → S3 Raw (JSON) → Glue ETL → S3 Silver (Parquet) → Athena SQL
01

E-Commerce Real-Time Data Pipeline

Production-grade data engineering pipeline on AWS Free Tier. Ingests e-commerce data from Fake Store API, transforms JSON to Parquet via Medallion Architecture, catalogs with Glue, and queries with Athena, all orchestrated by Apache Airflow DAGs running in Docker.

AWS S3 · Glue · Athena · Airflow · Docker · Python
Airflow DAG - E-Commerce Pipeline
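
The Raw → Silver → Gold flow described for this project can be illustrated with a minimal, standard-library-only sketch. It is not the project's actual code: the sample records, field names, and the helper names to_silver and to_gold are assumptions made for illustration, and plain Python lists stand in for the S3 layers and Parquet files.

```python
import json
from collections import defaultdict

# Hypothetical raw payload, shaped loosely like product records from a store API.
RAW_JSON = """[
  {"id": 1, "title": " Backpack ", "price": "109.95", "category": "bags"},
  {"id": 2, "title": "T-Shirt", "price": "22.30", "category": "clothing"},
  {"id": 2, "title": "T-Shirt", "price": "22.30", "category": "clothing"},
  {"id": 3, "title": "Jacket", "price": null, "category": "clothing"},
  {"id": 4, "title": "Raincoat", "price": "39.70", "category": "clothing"}
]"""


def to_silver(raw_records):
    """Clean raw records: drop rows without a price, deduplicate on id, fix types."""
    seen, silver = set(), []
    for rec in raw_records:
        if rec.get("price") is None or rec["id"] in seen:
            continue
        seen.add(rec["id"])
        silver.append({
            "id": int(rec["id"]),
            "title": rec["title"].strip(),
            "price": float(rec["price"]),
            "category": rec["category"],
        })
    return silver


def to_gold(silver_records):
    """Aggregate silver records into per-category counts and average prices."""
    totals = defaultdict(lambda: {"count": 0, "sum": 0.0})
    for rec in silver_records:
        totals[rec["category"]]["count"] += 1
        totals[rec["category"]]["sum"] += rec["price"]
    return {
        cat: {"count": t["count"], "avg_price": round(t["sum"] / t["count"], 2)}
        for cat, t in totals.items()
    }


raw = json.loads(RAW_JSON)   # Raw layer: data exactly as ingested
silver = to_silver(raw)      # Silver layer: validated, typed, deduplicated
gold = to_gold(silver)       # Gold layer: query-ready aggregates
```

Keeping each layer as a separate function mirrors the idea that every stage can be rerun or inspected on its own, which is also how a scheduler such as Airflow would typically treat them as separate tasks.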
02

Delivery Pipeline

Full delivery data pipeline project: ingesting, processing, and analysing delivery data through an automated data engineering workflow built for real-world logistics use cases.

Python · Data Pipeline · ETL
Delivery Pipeline Architecture
03

Delivery Analysis

In-depth analysis of delivery data: exploring patterns, performance metrics, and operational insights using Python and data analysis techniques to drive business decisions.

Python · Pandas · Data Analysis
Delivery Analysis Dashboard
04

Sales Data Warehouse

Designed and implemented a sales data warehouse structured for efficient querying and reporting on sales performance, trends, and KPIs using modern data warehousing principles.

SQL · Data Warehouse · Python
05

Facility Visits Analysis

Analysis of facility visit data, uncovering usage patterns, peak times, and operational insights to support data-driven facility management and resource planning decisions.

Python · Data Analysis · Pandas
Health Dashboard
06

AWS EC2 Node + Nginx Setup

Cloud infrastructure project deploying a Node.js application on AWS EC2 with Nginx as a reverse proxy, demonstrating cloud deployment, server configuration, and DevOps skills.

AWS EC2 · Nginx · Node.js · Linux
In Progress
07

Data Warehouse ETL Toolkit

Studying ETL design principles and data warehouse architecture patterns based on The Data Warehouse ETL Toolkit by Ralph Kimball & Joe Caserta. Focus areas include ETL design patterns, dimensional modeling foundations, and best practices for building scalable data pipelines โ€” applied alongside Python self-study.

ETL Design · Dimensional Modeling · Data Warehouse · Python
Apr 2026 – Present
In Progress
08

Python Automation & Data Projects: 100 Days of Code

Hands-on Python development focused on building automation scripts and data processing solutions through daily structured practice. Focus areas include Python fundamentals, file handling and data transformation, automation workflows, and clean and maintainable code practices. All exercises documented on GitHub.

Python · Automation · File Handling · Data Processing
Mar 2026 – Present
In Progress
09

Data Engineering Self-Study: Fundamentals of Data Engineering

Self-directed learning grounded in core principles of modern data systems, translating theory into practical implementations through hands-on exercises. Focus areas include data pipeline design, ETL processes, data architecture fundamentals, and end-to-end data flow understanding. Progress documented on GitHub through small projects that bridge concept and practice.

Data Pipelines · ETL · Data Architecture · Python · End-to-End Flows
Mar 2026 – Present

Live Dashboard

03

Project Spotlight

SA Artists Multi-Source Data Lake

End-to-end data lake ingesting South African and African artist data from YouTube API, Last.fm, and MusicBrainz, featuring automated ETL scripts and a live Power BI dashboard.

Python · YouTube API · Last.fm API · Power BI · ETL
View on GitHub
~ python fetch_all.py
SA Artists Multi-Source Data Lake
=====================================
Fetching: Kabza De Small
YouTube... 43 videos
Last.fm... 35,672 listeners
MusicBrainz... South Africa
Fetching: Burna Boy
Last.fm... 807,223 listeners
YouTube videos: 206
Total views: 480M

Experience

Data Manager (Data Engineering)

Africa Health Research Institute (AHRI) · Full-time

Durban, KwaZulu-Natal · Hybrid

Jan 2023 – Present · 3 yrs
  • Designed and maintained ETL pipelines using Python and SQL to ingest and transform clinical and research datasets.
  • Automated data quality checks to ensure accuracy and integrity across multiple systems.
  • Optimised SQL queries to improve reporting performance and reduce execution time.
  • Developed Power BI dashboards supporting operational monitoring and research analytics.
  • Led a clinical data migration project, consolidating data from multiple systems into a single database, standardising formats, and improving reporting accuracy.
  • Documented pipeline architecture and workflow processes to improve maintainability.
  • Collaborated cross-functionally to ensure governed and reliable data access.
Python · SQL · ETL · Power BI · Data Governance
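
As an illustration of the automated data quality checks mentioned above, the following is a minimal sketch using only the Python standard library. It does not reflect AHRI's actual systems: the field names (participant_id, visit_date, age), the validation rules, and the helper names check_record and run_checks are all hypothetical.

```python
from datetime import date


def check_record(rec, today):
    """Return a list of human-readable issues found in one record."""
    issues = []
    # Rule 1 (assumed): every row needs a non-empty identifier.
    if not rec.get("participant_id"):
        issues.append("missing participant_id")
    # Rule 2 (assumed): visit_date must be a valid ISO date, not in the future.
    try:
        visit = date.fromisoformat(rec.get("visit_date") or "")
        if visit > today:
            issues.append("visit_date in the future")
    except ValueError:
        issues.append("invalid visit_date")
    # Rule 3 (assumed): age must be an integer between 0 and 120.
    age = rec.get("age")
    if not isinstance(age, int) or not 0 <= age <= 120:
        issues.append("age out of range")
    return issues


def run_checks(records, today):
    """Map each failing row index to its issues; clean rows are omitted."""
    report = {}
    for i, rec in enumerate(records):
        issues = check_record(rec, today)
        if issues:
            report[i] = issues
    return report


# Hypothetical sample rows: one clean, two with problems.
sample = [
    {"participant_id": "P001", "visit_date": "2024-03-01", "age": 34},
    {"participant_id": "", "visit_date": "2024-13-01", "age": 34},
    {"participant_id": "P003", "visit_date": "2024-03-05", "age": 150},
]
report = run_checks(sample, today=date(2024, 6, 1))
```

Passing today in as a parameter, rather than calling date.today() inside the check, keeps the rules deterministic and easy to test.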

Software Developer (Internship)

CSG · Internship

Centurion, Gauteng · Hybrid

Jan 2022 – Jan 2023 · 1 yr
  • Contributed to enterprise-level software development in an agile environment.
  • Developed and maintained backend components using Python and SQL.
  • Participated in code reviews and worked with Git-based version control across collaborative development workflows.
Python · SQL · Git · Agile

Certifications & Education

AWS Certified Cloud Practitioner

Amazon Web Services

Verify ↗

Data Engineering Certified

Data Engineering Certification

Verify ↗

Postgraduate Diploma

Computer Science

BTech in Software Development

Computer Science

National Diploma in Software Development

Computer Science

Let's Work Together

Open to Data Engineer, Cloud Engineer, and Data Analyst roles. Let's connect and build something great.