Senior SRE (On-Premise)

We are currently seeking a Site Reliability Engineer to join a reliability-first platform team. This role puts you at the forefront of stabilising Windows-based services, enhancing observability, and driving containerisation into Kubernetes.

Key responsibilities:

As a Site Reliability Engineer based in Taiwan, you will immerse yourself in a dynamic environment focused on building a reliability-first platform. Your daily activities will revolve around automating operational tasks for Windows services using tools like AWX or Rundeck, implementing robust workflows with Ansible or PowerShell DSC, and standardising observability practices through Prometheus, Grafana, OpenTelemetry, ELK or Loki.

  • Develop comprehensive self-service runbooks for Windows services using AWX or Rundeck to streamline operational processes and empower teams with automated solutions.
  • Implement Ansible or PowerShell DSC workflows that facilitate health checks, safe rollbacks, and efficient incident response mechanisms across critical systems.
  • Standardise metrics, logs, and traces utilising Prometheus, Grafana, windows_exporter, OpenTelemetry, ELK, or Loki to create actionable dashboards and alerts that drive informed decision-making.
  • Participate actively in on-call rotations to handle incidents promptly, conduct thorough post-incident reviews (PIR), and lead game days to institutionalise standard operating procedures (SOPs).
  • Design and execute backup strategies and disaster recovery plans that ensure business continuity while optimising capacity planning and performance tuning for all services.
  • Drive long-term service containerisation initiatives by adopting Kubernetes tooling such as Helm, Kustomize, Argo CD and Flux, together with native resources such as ConfigMaps and Secrets, with a strong emphasis on security and compliance.
  • Collaborate closely with cross-functional teams to embed reliability engineering principles throughout the software delivery lifecycle.
  • Champion automation practices that reduce manual intervention and enhance operational efficiency across the platform.
  • Contribute to the development of golden-signal dashboards that provide real-time visibility into system health and performance.
  • Support the integration of compliance requirements into delivery pipelines by understanding RBAC principles and least privilege access models.

Candidate profile:

To excel as a Site Reliability Engineer in this organisation’s Taiwan office, you bring proven experience from SRE or DevOps roles where you managed production on-call duties for critical systems. Your technical proficiency covers Windows/Linux administration alongside deep knowledge of networking fundamentals such as DNS configuration, TCP/IP protocol management, TLS implementation and load balancing strategies. You are adept at deploying Ansible or PowerShell DSC workflows within CI/CD environments powered by Jenkins or Argo CD.

  • Mandatory: You have over four years of experience in SRE, DevOps or Platform Engineering roles with at least two years spent managing production on-call responsibilities for mission-critical systems.
  • Mandatory: Your hands-on expertise spans Windows/Linux administration as well as core networking concepts including DNS, TCP/IP protocols, TLS encryption standards and load balancing techniques.
  • Mandatory: You possess practical experience implementing Ansible or PowerShell DSC workflows alongside CI/CD pipelines using Jenkins or Argo CD.
  • Mandatory: You demonstrate solid understanding of monitoring frameworks such as Prometheus, Grafana or OpenTelemetry coupled with centralised logging solutions like ELK or Loki.
  • Mandatory: Familiarity with containers/Kubernetes is essential; you have built up environments from scratch and operated tooling including Helm/Kustomize/Argo CD/Flux within production settings.
  • Mandatory: Your communication skills are exceptional; you collaborate effectively within teams while maintaining accountability for deliverables through attention to detail and a bias towards automation.
  • Nice to have: Experience managing secrets using Vault or Key Vault enhances your ability to secure sensitive information within distributed systems.
  • Nice to have: Exposure to cloud-based DNS and content delivery platforms such as Cloudflare, AWS Route 53 or CloudFront (or equivalent stacks) supports scalable infrastructure design.
  • Nice to have: Familiarity with incident command structures (for example, an Incident Manager On Call, IMOC) and conducting post-incident reviews (PIR) strengthens your incident response capabilities.
  • Nice to have: Understanding RBAC models, least privilege access controls and embedding compliance/audit requirements into delivery pipelines demonstrates your commitment to secure operations.

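Contract type: Permanent

Specialism: Information Technology and Digital Transformation

Focus: IT Infrastructure / Network / Systems

Industry: Information Technology

Salary: Negotiable

Workplace type: On-site

Experience level: Associate

Location: Taipei

Job reference: HJ2KKJ-3B397C92

Date posted: 21 May 2026

Consultant: Amy Lin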
