
Taming the Transformer: How Perceiver IO and PaCa-ViT Conquer Quadratic Complexity

A deep dive into two architectures, Perceiver IO and PaCa-ViT, that avoid the O(N^2) cost of self-attention, enabling Transformers to process very large inputs efficiently.

October 2025 · Saeed Mehrang

The Image is a Sequence: Dissecting the Vision Transformer (ViT)

An in-depth look at ‘An Image is Worth 16x16 Words,’ the paper that introduced the pure Vision Transformer: its architecture, novelty, and limitations, and how modern models like Swin Transformer evolved from it.

October 2025 · Saeed Mehrang