Senior SRE (Hybrid)
An insurance platform company is seeking a Senior Site Reliability Engineer to join their Taiwan-based team, supporting business operations across Hong Kong and Taiwan.
What you'll do:
As a Senior Site Reliability Engineer based in Taiwan, you will play a pivotal role in developing robust distributed systems that underpin digital insurance services for users in Hong Kong. Your day-to-day responsibilities will involve implementing advanced monitoring solutions on public cloud platforms, participating in capacity planning sessions, and optimising system performance through careful analysis. You will take ownership of streamlining CI/CD pipelines using GitLab, ensuring smooth software delivery from development through production. By defining key reliability metrics and collaborating with engineering teams throughout the SDLC, you will help foster a culture of dependability and operational excellence. Troubleshooting infrastructure challenges, automating routine tasks with tools like Ansible and Terraform, and orchestrating containers via Kubernetes will be central to your success. Your efforts will directly contribute to creating secure, scalable environments that support business growth while maintaining high standards of reliability.
- Implement monitoring, alerting, and automation tools on public cloud platforms to enhance system reliability, availability, scalability, performance, and efficiency.
- Participate actively in capacity planning by analysing software performance data and fine-tuning systems to ensure optimal operation across all environments.
- Develop and improve GitLab CI/CD processes and toolsets to streamline software delivery and deployment for maximum efficiency.
- Define key metrics for system reliability and monitor them consistently to identify areas for improvement.
- Collaborate closely with engineering teams at every stage of the software development life cycle to foster reliability and operational efficiency.
- Troubleshoot infrastructure issues promptly, optimise existing setups, and automate repetitive tasks to increase overall effectiveness.
- Contribute to the design of distributed systems that support business operations with a focus on security and network concepts.
- Enhance container orchestration using Kubernetes while maintaining robust Docker environments for seamless application deployment.
- Utilise automation platforms like Ansible and Terraform to manage infrastructure as code efficiently.
- Maintain version control integrity through Git source code management while supporting continuous integration initiatives.
What you bring:
The ideal Senior Site Reliability Engineer brings proven experience working with distributed systems in cloud environments such as Azure or GCP. Your proficiency in scripting languages like Bash or Python allows you to automate complex tasks efficiently while maintaining clarity in troubleshooting procedures. Advanced familiarity with monitoring solutions such as Prometheus or ELK equips you to proactively identify potential issues before they impact operations. You have demonstrated expertise implementing CI/CD pipelines using GitLab CI alongside managing container orchestration via Kubernetes. Your deep understanding of network security principles ensures that all deployed systems remain resilient against threats. Experience across the full SDLC means you are comfortable collaborating with engineers at every phase—from initial design through deployment—while maintaining version control integrity using Git. Your ability to communicate technical concepts clearly makes you an invaluable member of any team striving for operational excellence.
- Proficiency in programming languages such as Bash, Python or Go enables you to develop scripts for automation and troubleshooting tasks effectively.
- Advanced knowledge of monitoring solutions including Prometheus, Grafana, and ELK (Elasticsearch, Logstash, Kibana) allows you to assess system health accurately.
- Expertise in cloud technologies with hands-on experience specifically in Azure and GCP ensures you can deploy scalable solutions confidently.
- Experience across the complete software development life cycle (SDLC) equips you to collaborate seamlessly with engineering teams at every stage.
- In-depth understanding of network concepts with a strong focus on security empowers you to design resilient distributed systems.
- Hands-on experience implementing CI/CD processes using tools like GitLab CI supports efficient software delivery workflows.
- Proficiency in automation platforms such as Ansible and Terraform enables you to manage infrastructure as code reliably.
- Knowledge of orchestration tools like Kubernetes helps maintain robust containerised environments for application deployment.
- Familiarity with container technologies including Docker ensures seamless integration into modern DevOps practices.
- Experience managing source code with Git version control helps maintain integrity throughout continuous integration initiatives.
- Your systematic approach to problem-solving combined with effective communication skills fosters collaboration within diverse teams.
About the job
Contract Type: Perm
Specialism: IT & Digital Transformation
Focus: Infra/Network/System
Industry: IT
Salary: Negotiable
Workplace Type: Hybrid
Experience Level: Associate
Location: Taipei
Employment Type: Full-time
Job Reference: THUD9O-1EE672C6
Date posted: 20 March 2026
Consultant: Amy Lin
Robert Walters: https://www.robertwalters.com.tw