Posts

Browse the latest posts in this category.

Stop Buffering File Uploads. Here's What Streaming Actually Does to Your Memory.

Most file upload code loads the entire file into RAM twice: once in the browser and once on the server. For small files nobody notices. For large files, your server dies silently. Here's the full picture from browser stream to S3.

Muhammad Umar Aziz

Jun 4, 2026

99% of people are using Claude, Codex, Gemini and GPT like it's still 2023

I've watched hundreds of devs and founders use these tools. The gap between the top 1% and everyone else isn't the model. It's how they drive it.

Muhammad Umar Aziz

May 23, 2026

Why Your AI Background Job Is Probably Lying to You

Most AI processing pipelines look like they work until one crash reveals they have been silently duplicating data, swallowing failures, and pretending retries are safe. This is the full story of breaking a production pipeline and rebuilding it the right way.

Muhammad Umar Aziz

May 23, 2026

The Security Layer AI Agents Actually Need

AI agents are making real decisions — calling APIs, moving money, filing compliance reports. Most of them run with a static API key that never expires and has no scope limits. Caracal is the open-source system built to fix this: pre-execution authority enforcement, short-lived tokens, real-time revocation, and a tamper-proof audit trail built on Merkle trees. Here is a deep technical look at how it works.

Muhammad Umar Aziz

May 18, 2026

From REST Calls to Event Streams: How I Stopped Fighting My Microservices and Started Designing Them

Most developers reach for HTTP and call it microservices. But request-response, message queues, and event streaming are not the same thing: they carry different guarantees, different failure modes, and different operational costs. Here's how to actually tell them apart, and when to use which.

Muhammad Umar Aziz

May 12, 2026

From Retrieval to Production: Reranking, Caching, and the Streaming Architecture Behind Real-Scale RAG

Part 1 covered what to store and how to retrieve it. Part 2 covers what breaks when real users arrive, and how production systems like Perplexity and ChatGPT are actually wired to handle it.

Muhammad Umar Aziz

Apr 27, 2026

The Engineering Behind a RAG (Retrieval-Augmented Generation)

Most RAG tutorials show you how to build something that works in a notebook. This one shows you what it takes to make it work when a real user shows up.

Muhammad Umar Aziz

Apr 19, 2026

Schema-Based Multi-Tenancy with PostgreSQL & Supabase (A Practical SaaS Foundation)

Designing multi-tenant systems isn’t just about scaling; it’s about isolation, structure, and long-term maintainability. In this post, I break down how I built a schema-based multi-tenancy system using PostgreSQL and Supabase, with automated migrations, tenant isolation, and a reusable backend foundation.

Muhammad Umar Aziz

Mar 31, 2026

Building a Scalable Portfolio with Next.js & Supabase (Beyond Static Sites)

Most developer portfolios are static and forgettable. I built mine as a scalable system using Next.js and Supabase with a custom CMS, dynamic content, and real-world architecture decisions. Here’s why this stack changed everything.

Muhammad Umar Aziz

Mar 30, 2026