A polynomial autoencoder beats PCA on transformer embeddings
A PCA encoder paired with a closed-form quadratic decoder (ridge-regularized least squares on a polynomial lift) consistently outperforms PCA alone on FiQA/BEIR retrieval metrics, enabling 4× compression with a modest NDCG@10 loss. Performance depends on model isotropy, dimensionality, and corpus size. The method requires corpus statistics and becomes impractical beyond ~200–256 dimensions.
May 8, 2026
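
To make the summary concrete, here is a minimal NumPy sketch of the general idea: encode with PCA, lift the codes to degree-2 polynomial features, and fit the decoder in closed form with ridge regression. It is an illustration under stated assumptions, not the author's implementation. The function names (`quadratic_lift`, `fit_poly_autoencoder`, `encode`, `decode`), the ridge strength `1e-3`, and the choice to penalize the intercept column along with the rest are all hypothetical.

```python
import numpy as np

def quadratic_lift(z):
    """Lift codes z (n, k) to [1, z, z_i * z_j for i <= j]."""
    n, k = z.shape
    iu, ju = np.triu_indices(k)          # index pairs with i <= j
    quad = z[:, iu] * z[:, ju]           # (n, k * (k + 1) / 2)
    return np.hstack([np.ones((n, 1)), z, quad])

def fit_poly_autoencoder(X, k, ridge=1e-3):
    """Fit a PCA encoder and a ridge-regularized quadratic decoder.

    ridge=1e-3 is an illustrative default, not a value from the article.
    """
    mean = X.mean(axis=0)                # corpus statistics: the mean...
    Xc = X - mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:k].T                         # ...and the top-k principal axes (d, k)
    Z = Xc @ W                           # PCA codes (n, k)
    Phi = quadratic_lift(Z)              # lifted codes (n, p)
    # Closed-form ridge solution: B = (Phi^T Phi + ridge * I)^{-1} Phi^T Xc
    A = Phi.T @ Phi + ridge * np.eye(Phi.shape[1])
    B = np.linalg.solve(A, Phi.T @ Xc)   # decoder weights (p, d)
    return mean, W, B

def encode(X, mean, W):
    """Project centered embeddings onto the top-k principal axes."""
    return (X - mean) @ W

def decode(Z, mean, B):
    """Reconstruct embeddings from codes with the quadratic decoder."""
    return quadratic_lift(Z) @ B + mean

if __name__ == "__main__":
    # Synthetic stand-in for embeddings: 3 latent factors plus their squares.
    rng = np.random.default_rng(0)
    t = rng.normal(size=(500, 3))
    X = np.hstack([t, 0.5 * t**2])
    mean, W, B = fit_poly_autoencoder(X, k=3)
    Z = encode(X, mean, W)
    print("PCA MSE :", np.mean((X - (Z @ W.T + mean)) ** 2))
    print("poly MSE:", np.mean((X - decode(Z, mean, B)) ** 2))
```

The lift has 1 + k + k(k + 1)/2 columns, so the decoder fit grows quadratically in the code dimension k (33,153 columns at k = 256), which is consistent with the practical ceiling mentioned in the summary.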