Total-TECH Co.
Job Description:
- Manage and oversee the operation of critical applications across production and test environments, ensuring optimal uptime, reliability, and support.
- Own the complete application lifecycle including planning, deployment, upgrades, patching, decommissioning, and documentation.
- Ensure high availability and disaster recovery strategies are in place for applications hosted on OCI VMs and Oracle Kubernetes Engine (OKE).
- Collaborate with cross-functional teams (DBAs, Network, Security, DevOps) to ensure seamless delivery and issue resolution.
- Lead incident management and root cause analysis for major application-related outages.
- Coordinate change and release management processes to ensure safe and controlled deployments.
- Monitor performance metrics, application logs, and usage trends to identify improvement opportunities.
- Drive the implementation of automation and DevOps practices to reduce manual errors and accelerate time to production.
- Maintain detailed, up-to-date technical documentation for each application.
- Work closely with business stakeholders to capture evolving requirements and align application capabilities with business objectives.
- Lead vendor engagement and enforce SLAs for third-party and managed services.
- Conduct periodic DR drills and ensure business continuity readiness.
- Supervise, mentor, and guide application engineers and support staff.
- Prepare regular reports on service levels, incidents, and performance indicators for IT leadership.
Requirements:
- Bachelor’s degree in Information Technology, Computer Science, or a related field.
- A Master’s degree or a relevant certification (e.g., ITIL, PMP, TOGAF) is a plus.
- 10+ years of experience in enterprise application management, with at least 3 years in a leadership capacity.
- Strong hands-on experience managing both commercial and custom applications in production environments.
- Proven experience working with Oracle Cloud Infrastructure (OCI) and Kubernetes / OKE.
- Experience handling high availability, backup/recovery, and disaster recovery operations.
- Proficiency in application monitoring tools, performance tuning, and root cause analysis.
- Familiarity with DevOps concepts and tools used for CI/CD, container orchestration, and automation.
- Strong understanding of network, security, and database integration with enterprise applications.
- Strong stakeholder management, team leadership, and vendor coordination skills.
- Excellent communication skills to work effectively with both technical and business teams.
- Fluency in English and Arabic.
