Staff Site Reliability Engineer

Visier
$9,000 - $14,200 a month
Singapore
2 weeks ago

Visier gives organizations a Workforce AI Edge: a set of AI-powered capabilities that help leaders understand the relationship between people and work, elevate the productivity of their employees, and win by adapting to change faster. The company is the global leader in AI-powered people analytics, workforce planning, and compensation management solutions. All Visier technology is underpinned by its Real-time People Data Platform, which uses AI to unlock the business-transforming potential of people data, work data, and the fusion of both.


Founded in 2010 by the pioneers of business intelligence, Visier has over 60,000 customers in 75 countries, including enterprises like BASF, Panasonic, Experian, Amgen, eBay, and Ford Motor Company.


Visier’s Shared Services SRE team is responsible for operating the cloud infrastructure underlying our technology platform and for working with the development teams to effectively use these technologies in production environments. We are also responsible for our AWS integration, API gateway, Cassandra, Kafka, Vault, and Consul implementations, data science workbench, and the network infrastructure and security that tie everything together.

Our job is to provide the infrastructure that lets our analytics platform and services scale.

What you'll be doing...

  • Managing LLM deployments on major cloud platforms such as AWS and Azure
  • Managing AI application observability platforms such as LangSmith
  • Deploying and maintaining highly available services in AWS using Terraform, CloudFormation, and Jenkins
  • Debugging production issues at any level, from the hardware layers and the OS kernels all the way up to working hand in hand with the developers to improve our application behaviour
  • Working with the Kong API gateway to provide secure & reliable API access to our customers and partners
  • Writing secure code to safeguard Visier and our customers' data, including developing our application security infrastructure
  • Optimizing our diagnostics infrastructure components like Splunk, CloudWatch, and Prometheus
  • Supporting large clusters of third-party systems like Cassandra, Postgres, and Kafka
  • Preparing for and simulating disasters of all sorts. We’re mission-critical for our customers and need to stay up, no matter what
  • Working closely with other development teams to design the infrastructure to support application features

What you'll bring to the table...

  • Extensive experience in enterprise-level scalability of services
  • Strong coding skills in Java, Scala, Python, or Groovy
  • Deep expertise in networking, network security, firewalls, routing, DNS, and advanced Linux systems and security
  • Hands-on proficiency with AWS services including EC2, S3, RDS, IAM, Lambda, and VPC
  • Strong experience with containerization technologies (Kubernetes, ECS) and managing them via Infrastructure as Code
  • Skilled in Infrastructure as Code (IaC) practices, with deep experience developing and maintaining Terraform code and modules
  • Strong knowledge of deployment and configuration management tools
  • Proven ability to perform deep troubleshooting and root cause analysis to resolve complex system issues
  • Experience with system security patching to maintain infrastructure integrity and resilience

Most importantly, you share our values...

  • You roll up your sleeves
  • You make it easy
  • You are proud
  • You never stop learning
  • You play to win