FixFlow
AI-Powered SRE Platform
An AI-powered Site Reliability Engineering (SRE) platform that automates incident lifecycle logging, root cause analysis (RCA) generation, and real-time alert streaming over Socket.io.

Technologies
Overview
FixFlow is an automated Site Reliability Engineering (SRE) monitoring and alert management platform. It helps engineering teams detect critical outages, isolate root causes, and resolve incidents through automated pipelines, with minimal manual effort.
Architecture & Decisions
FixFlow is built on a modern web stack: React and Framer Motion on the frontend for high-fidelity animations, and Node.js with Express on the backend. The backend calls the Gemini API to drive diagnostic analysis and uses Socket.io to push system status alerts from server to client as they occur. Data persists in MongoDB.
Key Features
Challenges
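The main challenge was keeping the real-time Socket.io stream from overloading React state when dozens of microservice alerts fire simultaneously. I solved this by batching state updates, so that a burst of alerts results in one state update instead of one update per alert.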
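The batching idea can be sketched as a small buffer that sits between the socket listener and the state setter. This is a minimal illustration, not FixFlow's actual code: the Alert shape, the AlertBatcher class, and the 100 ms default delay are all assumptions made for the example.

```typescript
// Hypothetical alert shape; the field names are assumptions for illustration.
interface Alert {
  service: string;
  severity: "info" | "warning" | "critical";
  message: string;
}

// Buffers incoming items and hands them to `onFlush` as one array,
// so the consumer performs a single state update per batch.
class AlertBatcher<T> {
  private buffer: T[] = [];
  private timer: ReturnType<typeof setTimeout> | null = null;
  private readonly onFlush: (batch: T[]) => void;
  private readonly delayMs: number;

  constructor(onFlush: (batch: T[]) => void, delayMs: number = 100) {
    this.onFlush = onFlush;
    this.delayMs = delayMs;
  }

  // Queue one item and schedule a flush if none is already pending.
  push(item: T): void {
    this.buffer.push(item);
    if (this.timer === null) {
      this.timer = setTimeout(() => this.flush(), this.delayMs);
    }
  }

  // Cancel any pending timer and deliver everything buffered in one call.
  flush(): void {
    if (this.timer !== null) {
      clearTimeout(this.timer);
      this.timer = null;
    }
    if (this.buffer.length === 0) return;
    const batch = this.buffer;
    this.buffer = [];
    this.onFlush(batch);
  }
}
```

In a React component, onFlush would typically wrap a functional state update such as setAlerts((prev) => [...prev, ...batch]), and the Socket.io listener would call batcher.push(alert) for each incoming event. Calling flush() in the effect cleanup avoids a timer firing after unmount.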
Lessons Learned
I learned how to construct a resilient multi-stage fallback pipeline (Gemini Pro falling back to LLaMA via Groq) that keeps RCA script generation available during rate limits or provider downtime.
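A fallback chain of this kind can be sketched as an ordered list of providers tried one after another. The sketch below makes several assumptions: the Provider type, generateWithFallback, and the two stub providers are hypothetical names, and the stubs stand in for real Gemini and Groq SDK calls, which are not shown here.

```typescript
// A provider is any async function that turns a prompt into text.
type Provider = {
  name: string;
  generate: (prompt: string) => Promise<string>;
};

// Try each provider in order and return the first successful result.
// If every provider fails, throw one error listing each failure.
async function generateWithFallback(
  prompt: string,
  providers: Provider[],
): Promise<{ provider: string; text: string }> {
  const errors: string[] = [];
  for (const p of providers) {
    try {
      const text = await p.generate(prompt);
      return { provider: p.name, text };
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      errors.push(`${p.name}: ${reason}`);
    }
  }
  throw new Error(`All providers failed: ${errors.join("; ")}`);
}

// Stub standing in for the primary provider; it simulates a rate limit.
const stubPrimary: Provider = {
  name: "gemini-pro",
  generate: async () => {
    throw new Error("429 rate limited");
  },
};

// Stub standing in for the secondary provider; it always succeeds.
const stubSecondary: Provider = {
  name: "llama-via-groq",
  generate: async (prompt) => `RCA draft for: ${prompt}`,
};
```

Calling generateWithFallback("checkout-service 5xx spike", [stubPrimary, stubSecondary]) resolves with the secondary provider's output, because the primary stub rejects first. Keeping the providers in a plain array makes the order explicit and lets a further stage be added without touching the loop.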