Vishal Rajput
Jan 24, 2024

Mamba's architecture still processes data sequentially, but it makes better use of the cache, and that makes it faster. It can even do more computation than Attention and still come out faster.
