Mirpri's Blog

home post tags categories about

Why Decoding is memory-bound for LLMs and how to optimize it

Research May 14, 2026

Breaking down the bottlenecks in LLM decoding and how speculative decoding can help optimize performance.

Latest Updates

  • The Big O Lie: Why You Should (Almost) Always Default to Vector

    Why vectors usually beat lists in practice despite Big O theory....

    Algorithm Apr 15, 2026
  • Notes on Programming the Nexys 4 DDR on Windows ARM

    Notes on programming the Nexys 4 DDR from Windows ARM using WSL2 and openFPGALoader....

    Course Mar 23, 2026
  • Introduction to Lyapunov Functions and Stochastic Network Optimization

    A primer on Lyapunov functions and the drift-plus-penalty framework for stochastic networks....

    Research Mar 3, 2026

Index by Tags

#LLM #GPU #ARM #Verilog #luogu #C++ #WordPress #Markdown #Astro #mdx

Categories

Research Algorithm Course Uncategorized Development

More

  • The Knapsack Problem: From 0/1 to Advanced

    Algorithm Feb 25, 2026
  • P5960 [Template] Difference Constraints

    Algorithm Feb 23, 2026
  • P2803 School Site Selection II

    Algorithm Feb 22, 2026
  • Steam DLC

    Uncategorized Feb 22, 2026
  • P1020 Missile Interception

    Algorithm Feb 21, 2026
  • The New std::print in C++23: A Modern Replacement for cout and printf

    Development Feb 20, 2026
  • Set Up a WordPress Website

    Development Feb 19, 2026
  • Mastering React and Vue, Why I Still Give WordPress a Try

    Development Feb 19, 2026
  • Markdown Style Guide

    Uncategorized Jun 19, 2024
  • Using MDX

    Uncategorized Jun 1, 2024
root@mirpri:~$ echo "© 2026 All rights reserved."
[GitHub] [Home Page]