Jared (Yusef) Amen

SRE & Platform Engineering Consultant

I help teams build reliability practices and ship observability tools that actually work.

About

I'm an SRE and Platform Engineering consultant with about 10 years of experience building reliable systems and observability tools. I don't just advise; I build. The open source tools in my portfolio are the same ones I use in consulting engagements.

My approach combines deep technical expertise in Kubernetes, Prometheus, Grafana, and cloud infrastructure with practical experience shipping production systems at scale. I previously worked at Salesforce (Pardot Division, SRE capacity), Magnus Technologies (observability platform across EKS clusters), and UpdatePromise (platform reliability and infrastructure).

I'm based in the Austin, TX area and hold a B.S. in Computational Mathematics from UC Riverside. When I'm not debugging distributed systems, I'm building tools that make other engineers' lives easier.

Projects & Portfolio

Reflex

Commercial Product

Generates production-grade SLO definitions, burn-rate alerts, and incident runbooks in minutes. AI-recommended targets with guardrails against unrealistic configs.

Python OpenAI Prometheus Kubernetes
Launch →

prom-slo-analyzer

Free & Open Source

Point it at your Prometheus server and see which services are ready for SLO-based alerting. It identifies SLI candidates and monitoring gaps.
prom-slo-analyzer --url http://prometheus:9090

Python Prometheus CLI
GitHub →

Basecamp

Live App

Learn anything, your way. Paste a URL or text, and it generates flashcards and expert audio lessons using AI. Pay only for what you use.

React TypeScript OpenAI Stripe Vercel
Launch App →

charge-IQ-core

Free & Open Source

Analytics and fraud detection for Stripe payment data. Analyze payment patterns and detect anomalies in transaction flows.

Python Stripe API
GitHub →

Work With Me

I help engineering teams build reliability practices that actually work. Here's what I do:

SLO Strategy & Implementation

Define meaningful SLOs, implement burn-rate alerting, and build incident response workflows that reduce MTTR.
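
To make "burn-rate alerting" concrete, here is a minimal Python sketch of the underlying arithmetic. It is an illustration only, not the implementation used in any of the tools above. The function names and sample numbers are hypothetical. The 14.4 threshold is the commonly cited fast-burn value for a 30-day SLO window: at that rate, about 2% of the error budget is spent in one hour.

```python
def burn_rate(error_ratio: float, slo_target: float) -> float:
    """Return how many times faster than 'exactly on budget' errors are occurring.

    error_ratio: observed fraction of failed requests in a window (0.0 to 1.0).
    slo_target:  availability objective, e.g. 0.999 for "three nines".
    """
    budget = 1.0 - slo_target  # allowed fraction of failures
    if budget <= 0:
        raise ValueError("slo_target must be below 1.0")
    return error_ratio / budget


def should_page(long_window_ratio: float,
                short_window_ratio: float,
                slo_target: float,
                threshold: float = 14.4) -> bool:
    """Multi-window check: page only if BOTH windows burn above the threshold.

    The long window (e.g. 1h) shows the burn is significant; the short
    window (e.g. 5m) shows it is still happening, which cuts stale alerts.
    The window lengths and threshold are assumptions for this sketch.
    """
    return (burn_rate(long_window_ratio, slo_target) >= threshold
            and burn_rate(short_window_ratio, slo_target) >= threshold)


if __name__ == "__main__":
    # Hypothetical numbers: 2% errors over 1h, 3% over 5m, against a 99.9% SLO.
    print(should_page(0.02, 0.03, 0.999))
    # Same long window, but the short window has recovered.
    print(should_page(0.02, 0.0005, 0.999))
```

Requiring both windows to exceed the threshold is a design choice: the alert fires on sustained, ongoing budget burn and clears soon after the problem stops, which helps keep paging noise down.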

Observability Architecture

Design and implement Prometheus, Grafana, and alerting pipelines that give you the visibility you need without the noise.

Kubernetes Platform Reliability

Audit cluster configurations, implement monitoring, and establish reliability practices for your K8s platform.

Alert Fatigue Reduction

Overhaul monitoring and alerting to eliminate false positives while ensuring real issues get immediate attention.

Platform Engineering

Design and build developer platforms, CI/CD pipelines, and infrastructure automation to improve engineering velocity and reliability.

Let's Talk

Previously: Salesforce, Magnus Technologies