Data Engineer Consultant

Total-TECH Co.

Job Description

1- Enterprise Data Ingestion and Data Engineering:

Design and build scalable, reusable ingestion pipelines (real-time and batch) on GCP (GCS, BigQuery, Dataflow).

Develop parameterized pipelines using Google-native services and/or Informatica IDMC mappings and taskflows.

Implement CDC patterns, idempotent loads, late‑arriving data handling, and schema evolution.

Optimize BigQuery ingestion strategies (batch vs. streaming, partitioning, clustering).

Establish version control, CI/CD, and environment promotion (dev/test/prod).

Results:

GCP-native pipelines (Dataflow, Composer, Cloud Run) and/or IDMC mappings/taskflows.

BigQuery schema designs (raw / stage / curated zones with partitioning & clustering).

CI/CD pipelines, deployment artifacts, and operational runbooks.

2- Data Mapping & Transformation Design:

Profile source data and analyze data quality and patterns.

Design field‑level mappings, business rules, joins, aggregations, and derivations.

Implement transformations using BigQuery SQL, Dataflow, or IDMC transformation logic.

Results:

Source‑to‑Target Mapping (STM) document.

Transformation mappings.

3- Orchestration, Automation & Reliability Engineering:

Build and parameterize end‑to‑end workflows using Cloud Composer (Airflow) and/or IDMC taskflows.

Define job dependencies, schedules, SLAs, and failure handling strategies.

Implement retries, backoff strategies, checkpoints, and restartability.

Integrate monitoring, logging, and alerting using Cloud Monitoring, Cloud Logging, and ChatOps tools.

Results:

Production‑ready orchestration workflows with environment‑aware parameters.

Scheduling calendar, dependency diagrams, and SLA matrix.

Alerting, monitoring dashboards, and operational SOPs.

4- Data Pipeline Monitoring, Performance & Cost Optimization:

Monitor pipeline health, data freshness, volumes, and anomaly patterns.

Support and monitor data pipelines during off‑hours and weekends, troubleshooting and resolving issues to ensure SLA compliance.

Track and manage SLAs related to runtime, failures, cost, and data latency.

Optimize BigQuery performance (query refactoring, partition pruning, materialized views).

Manage and optimize GCP costs (storage lifecycle, slot usage, query optimization, caching).

Results:

Daily/weekly operational and SLA reports.

Performance and cost optimization plans with before/after metrics.

BigQuery tuning guidelines and data materialization strategy.

Requirements:

  • 5+ years of experience as a Data Engineer or ETL Developer.
  • Strong experience with Google Cloud Platform: BigQuery, GCS, Datastream, Dataflow, Cloud Composer, IAM, Cloud Monitoring.
  • Proficiency in SQL (BigQuery‑optimized) and data modeling for analytics.
  • Hands‑on experience building production‑grade, scalable data pipelines.
  • Experience with CI/CD, Git‑based version control, and DevOps practices.
  • [Optional] Experience with Informatica Intelligent Data Management Cloud (IDMC): CDI, Mass Ingestion, Taskflows.
  • Bachelor’s degree in Computer Science, Information Technology, Information Systems, or a related field.

 
