Resources

A curated set of tools, references, and reading materials that Alex Hsieh (謝明宏) uses and recommends. Nothing here is sponsored; everything listed has been evaluated or used in practice.

By Topic

Cloud & Kubernetes

DevOps & CI/CD

  • DORA Metrics (dora.dev) — Research-backed engineering performance metrics.
  • Terraform (terraform.io) — Infrastructure-as-code standard for multi-cloud.
  • GitHub Actions Documentation (docs.github.com/actions) — Well-maintained CI documentation.

Observability & SRE

  • Google SRE Book (sre.google/sre-book) — Foundational reading.
  • Prometheus Documentation (prometheus.io/docs) — Metrics collection standard in cloud-native environments.
  • OpenTelemetry (opentelemetry.io) — Vendor-neutral observability instrumentation.
  • Grafana Loki (grafana.com/loki) — Log aggregation that pairs well with Prometheus and Grafana.

AI for Infrastructure

  • LangChain Documentation (docs.langchain.com) — Framework for building LLM-powered agents.
  • OpenAI API Docs (platform.openai.com/docs) — Reference for AI-assisted tooling.
  • Alex's Articles — Articles covering LLMs applied to infrastructure diagnostics and automation.

Writing & Communication

  1. Site Reliability Engineering — Beyer et al.
  2. The Phoenix Project — Kim et al.
  3. Team Topologies — Skelton & Pais.
  4. Building Microservices — Sam Newman.
  5. Designing Data-Intensive Applications — Martin Kleppmann.

Found something useful? Share it with Alex via the contact page.

Built with VitePress. Deployed on Cloudflare Pages.