Starred Posts ⭐
Linear Transformers, Mamba2, and many ramblings
I walk through the main architectures used in sequence modelling (FFN, CNN, RNN, SSM, and Transformers), along with many attempts at optimizing their efficiency. I build an intuitive understanding of how they work and analyze their strengths and weaknesses, all while paying special attention (pun intended) to their computational and memory complexities.
Understanding Transformers
ML Basics
Miscellaneous