Site Reliability Engineer IP4G

Puebla, Mexico

JOB SUMMARY

The IP4G Site Reliability Engineer is a key member of the team, responsible for ensuring the reliability, scalability, and performance of IP4G workloads. The role combines expertise in IBM Power Systems, cloud infrastructure, and site reliability engineering principles to design, implement, and maintain resilient and efficient solutions for clients. The engineer collaborates closely with cross-functional teams to monitor, optimize, and automate IP4G operations, striving for continuous improvement and operational excellence.


Job responsibilities that are specific to the position:   

  • Design, implement, and maintain highly available and resilient architectures for IP4G workloads on Google Cloud Platform, leveraging fault-tolerant designs and redundancy strategies. 
  • Monitor system performance, availability, and reliability metrics to proactively identify and address potential issues before they impact service uptime or performance. 
  • Implement disaster recovery solutions and failover mechanisms to ensure business continuity and minimize service disruptions. 
  • Optimize IP4G workloads for performance, scalability, and cost-efficiency in the Google Cloud environment, leveraging auto-scaling, load balancing, and caching strategies. 
  • Conduct capacity planning exercises and performance tuning activities to ensure optimal resource utilization and performance of IP4G systems and applications. 
  • Collaborate with cloud architects and DevOps teams to implement CI/CD pipelines and automation workflows for seamless deployment and scaling of IP4G workloads. 
  • Respond to and resolve critical incidents impacting the availability or performance of IP4G systems and applications on Google Cloud, following established incident response procedures and SLAs. 
  • Document incident response procedures, post-mortem reports, and lessons learned to improve incident management processes and enhance system reliability. 
  • Develop automation scripts and infrastructure as code (IaC) templates to automate routine tasks, streamline deployment processes, and improve operational efficiency. 
  • Continuously evaluate and adopt emerging technologies and best practices in automation and DevOps to enhance the reliability and scalability of IBM Power environments.
  • Implement comprehensive monitoring and alerting solutions for IP4G workloads on Google Cloud, utilizing monitoring tools such as Google Cloud Monitoring (formerly Stackdriver), Prometheus, and Grafana.
  • Define and configure alerting thresholds, notifications, and escalation policies to ensure timely detection of and response to anomalous behavior or performance degradation.


TECHNICAL SKILLS

  • Strong knowledge of IBM Power architecture, AIX/Linux operating systems, virtualization technologies (e.g., PowerVM), and storage solutions (e.g., IBM Spectrum Storage). 
  • Proficiency in cloud monitoring and observability tools such as Google Cloud Monitoring (formerly Stackdriver), Prometheus, Grafana, and the ELK Stack.
  • Excellent analytical and problem-solving skills, with the ability to troubleshoot complex technical issues in a dynamic, fast-paced environment.

Education:

    • Bachelor’s degree in Computer Science, Information Technology, or a related field.
    • Certification in IBM Power Systems (e.g., IBM Certified System Administrator - AIX, IBM Certified Technical Sales Specialist) and Google Cloud Platform (e.g., Google Cloud Certified - Professional Cloud Architect) is highly preferred.
    • Proven experience designing, implementing, and supporting IBM Power workloads in cloud environments, with a focus on reliability, scalability, and performance optimization.

PROFESSIONAL SKILLS

    • Excellent verbal and written communication skills.
    • Ethical and critical thinking. 
    • Excellent interpersonal and customer service skills. 
    • Excellent sales and customer service skills. 
    • Excellent organizational skills and attention to detail. 
    • Excellent time management skills with a proven ability to meet deadlines. 
    • Strong analytical and problem-solving skills. 
    • Strong supervisory and leadership skills. 
    • Ability to prioritize tasks and to delegate them when appropriate. 
    • Ability to function well in a high-paced and at times stressful environment. 
    • Proficient with Microsoft Office Suite or related software. 


What We Offer


Each employee has a chance to see the impact of their work. You can make a real contribution to the success of the company.
Activities are organized throughout the year, such as weekly sports sessions, team-building events, monthly drinks, and much more.

 Benefits

Healthcare, dental, life insurance, savings fund, Christmas bonus, grocery bonus, annual bonus.

 Save On Commute

Paid office parking.


 Sport Activity

Join your colleagues in various sports activities in the area.



 Discount Programs

Medical-related discounts.


 Prime Location

In the heart of Puebla, with views of the Popocatépetl volcano and with restaurants and amenities close by.


 PTO

Vacation, sick leave, holidays, and paid leave.


 Sponsored Events

Team social events and a holiday dinner.


 Eat & Drink

Enjoy a kitchen stocked with coffee and snacks at low cost.