Jonathan Ng

Site Reliability Engineer

Toronto, Ontario

Systems Design Engineering @ University of Waterloo


Skills

Tools

ArgoCD, Atlassian, AWS, Datadog, FluxCD, GCP, Git, Grafana, Helm, Kubernetes, LaunchDarkly, Loki, Nx, Prometheus, Terraform

Languages

Bash, CSS, HTML, JavaScript, Python

Work Experience

SRE II @ Tempo Software

(2022-01-05 - present)


  • Architected internal tooling enabling developers to spin up fully scoped, prod-like ephemeral environments for feature testing, used by 50+ developers across 9+ teams, improving release confidence and increasing release frequency by 718%
  • Joined initiative as a developer to refactor auth service, enabling feature management via LaunchDarkly
  • Modularized legacy IaC to simplify provisioning of environments, incorporating GitOps by leveraging Terragrunt and GitHub Actions to promote changes between environments
  • Co-led the migration of code base and artifacts from GitLab to GitHub, consolidating 2 product lines under one organization
  • Directed the knowledge-sharing process for onboarding new hires, growing the team to 6 members
  • Implemented Datadog monitoring for k8s services, improving coverage by enabling support for OpenTelemetry to capture custom metrics
  • Improved stability and consistency of production rollouts with readiness gates and pod disruption budgets, reducing release downtime to zero for services behind ALB ingress controllers
  • Refactored release communication to align more closely with the GitOps strategy, enabling consistent notifications to stakeholders as features get promoted into production and increasing service coverage to 100%
  • Improved security of CI workflows across the organization by leveraging OIDC roles for AWS authentication, eliminating the need for access keys in CI and creating a standard for repo- and service-based AWS access
  • Led initiative to refactor services to leverage internal communications where possible, increasing visibility into the product's inter-service communications
  • Directed new SSO permission strategy for standardizing AWS permissions for SREs and developers across the company, leaning into GitOps and management of IaC

Junior DevOps @ TimePlay (now Stream6ix)

(2021-04-18 - 2021-12-30)


  • Used Ansible to manage scaling game infrastructure on AWS for a projected 700% increase in player traffic
  • Deployed infrastructure using AWS API Gateway, Lambda and ECS to reduce resource spin-up times by 93%, down to 30s, for a new line of on-demand games
  • Created CI/CD pipelines for Unity projects using Unity Cloud Build to leverage tailored support and integrations

DevOps Cloud Developer Intern @ Cryptonumerics (now Snowflake)

(2019-09-03 - 2019-12-20)


  • Designed OAS dataset retrieval API to support popular cloud storage solutions
  • Refactored Spectron testing suite, improving test consistency by 100% and cutting test time by 89%

Innovation Engineer Intern @ VIA Rail Canada

(2019-01-08 - 2019-04-30)


  • Developed telemetry solution using Kibana to monitor train health and activity, leveraging data from existing sensors installed throughout the train and enabling engineers to diagnose issues remotely
  • Developed a Bash script to automate setup of the analytics solution across a scalable fleet of train cars
  • Applied beacon technology to map customer journeys through the train station via device pings

DevOps Engineer Intern @ TD Bank

(2017-09-05 - 2018-08-31)


  • Generated a daily health report for stakeholders by compiling Jenkins build and SonarQube results with Groovy
  • Coordinated migration of over 80 projects from several outdated instances of Jenkins, proposing a workflow to reduce the planned work by over 90%
  • Architected and introduced an experimental shared library system on Jenkins to streamline continuous integration pipeline for over 5 project teams
  • Administered the Atlassian toolstack, providing support and provisioning across all cloud platforms

Backend Engineer Intern @ Rave

(2017-01-03 - 2017-04-28)


  • Migrated video service from Postgres to Google Cloud Datastore to improve consistency of transactions
  • Refactored location microservice, reducing Redis queries by 50%

Projects

Home Lab

(2023-07-01 - present)


  • A home lab server for learning and hosting passion projects

Pomodoro Timer

(2023-05-01 - 2023-05-02)


  • A simple pomodoro app to learn React

Free Games Notifier

(2021-06-01 - present)


  • A script to pull freebies from Epic, supporting Docker, k8s and GitHub Actions

Personal Discord Bots

(2021-01-01 - 2022-02-01)


  • A personal Discord bot developed to provide information from various game and movie APIs

Luxify

(2020-01-01 - 2021-01-01)


  • A restock notification service using Facebook's messaging API and various online store APIs to notify the user base as soon as highly coveted items are in stock