The latest research from Google
Jun 30, 2026
Introducing TabFM: A zero-shot foundation model for tabular data · The latest research from Google
Science, Technology & Innovation · Jun 30, 2026
TabFM's table-specific architecture treats a table as two-dimensional, order-invariant data. It alternates attention across columns and across rows, compresses each row into dense vectors, and then applies a Transformer over those compressed rows. Working on compressed rows cuts computation, which enables scalable zero-shot tabular prediction without manual feature engineering.
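The attention pattern described above can be illustrated with a small NumPy sketch. It is not TabFM's implementation. The weight-free single-head attention, the mean pooling used to compress a row, and the names `self_attention`, `table_block`, and `encode_table` are all assumptions made for illustration. Because no positional encoding is used, shuffling columns leaves the row vectors unchanged and shuffling rows only reorders them, which is one way to read "order-invariant".

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    # Weight-free single-head self-attention over the second-to-last axis.
    # x has shape (..., n, d); a real model would use learned Q/K/V projections.
    d = x.shape[-1]
    scores = x @ np.swapaxes(x, -1, -2) / np.sqrt(d)   # (..., n, n)
    return softmax(scores, axis=-1) @ x

def table_block(cells):
    # cells: (rows, cols, d) array of cell embeddings.
    # 1) Attention across columns: cells in the same row attend to each other.
    cells = cells + self_attention(cells)
    # 2) Attention across rows: cells in the same column attend to each other.
    by_col = np.swapaxes(cells, 0, 1)                  # (cols, rows, d)
    by_col = by_col + self_attention(by_col)
    return np.swapaxes(by_col, 0, 1)                   # back to (rows, cols, d)

def encode_table(cells, n_blocks=2):
    for _ in range(n_blocks):
        cells = table_block(cells)
    # Compress each row into one dense vector (mean pooling is an assumption).
    row_vecs = cells.mean(axis=1)                      # (rows, d)
    # Final attention runs over rows only, not over every cell.
    return row_vecs + self_attention(row_vecs)
```

In this sketch the final stage compares R rows with each other, so its score matrix has R x R entries, whereas attention over all R x C cells at once would need (R*C) x (R*C) entries.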
Google's TabFM was benchmarked on TabArena, which covers 38 classification and 13 regression datasets of 700 to 150k samples and scores models by Elo ratings and head-to-head comparisons. The model offers two modes: a no-tuning base mode that predicts in a single forward pass, and a calibrated 32-way ensemble that uses cross and SVD features with Platt scaling. Google claims TabFM consistently outperforms heavily tuned supervised baselines. The model is also being embedded into BigQuery through a simple AI.PREDICT SQL command, which suggests tabular ML may shift toward SQL-native inference and away from separate ML engineering stacks.
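Platt scaling, the calibration step named above, fits a sigmoid that maps raw model scores to probabilities. The sketch below shows the general technique only. How TabFM applies it inside its ensemble is not described here, and the function names `fit_platt` and `apply_platt`, the plain gradient-descent fit, and the hyperparameters are assumptions. Platt's original method also smooths the target labels, which this version omits.

```python
import numpy as np

def fit_platt(scores, labels, lr=0.1, steps=2000):
    # Fit p = sigmoid(a * score + b) by minimizing log loss on held-out data.
    # scores: raw classifier scores (logits); labels: 0/1 array of the same length.
    a, b = 1.0, 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(a * scores + b)))
        grad = p - labels                    # derivative of log loss w.r.t. the logit
        a -= lr * np.mean(grad * scores)
        b -= lr * np.mean(grad)
    return a, b

def apply_platt(scores, a, b):
    # Map raw scores to calibrated probabilities.
    return 1.0 / (1.0 + np.exp(-(a * scores + b)))
```

A fitted slope `a` below 1 means the raw scores were overconfident and are being pulled toward 0.5. A slope above 1 means they were underconfident.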
Introducing TabFM: A zero-shot foundation model for tabular data · The latest research from Google
Business, Finance & Industries · Jun 30, 2026
Google treats synthetic data as the essential basis for pretraining industrial-scale tabular models, not as an augmentation. TabFM was trained entirely on hundreds of millions of synthetically generated datasets, produced with structural causal models and random functions, because high-quality real enterprise tables are scarce and sensitive. Google claims that pretraining on this synthetic corpus generalizes to unseen real-world tables. If that holds, control over the quality of synthetic-data generation may become a key competitive moat for tabular AI.
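One common way to draw a synthetic table from a structural causal model with random functions is sketched below. It is an illustration of the idea, not Google's generator. The function name `sample_scm_dataset`, the edge probability, the pool of activation functions, and the Gaussian noise are all assumptions.

```python
import numpy as np

def sample_scm_dataset(n_rows, n_nodes, rng, edge_prob=0.4):
    # A strictly upper-triangular mask means node j can only depend on nodes
    # i < j, so the random graph is guaranteed to be acyclic.
    adj = np.triu(rng.random((n_nodes, n_nodes)) < edge_prob, k=1)
    weights = rng.normal(size=(n_nodes, n_nodes)) * adj

    # Pool of "random functions"; the choice of these four is arbitrary.
    activations = [np.tanh, np.sin, np.abs, lambda v: v]

    values = np.zeros((n_rows, n_nodes))
    for j in range(n_nodes):
        noise = rng.normal(size=n_rows)
        parent_signal = values @ weights[:, j]      # zero for root nodes
        f = activations[rng.integers(len(activations))]
        values[:, j] = f(parent_signal + noise)

    # Pick one node as the prediction target; the rest become feature columns.
    target = rng.integers(n_nodes)
    X = np.delete(values, target, axis=1)
    y = values[:, target]
    return X, y
```

Each call with a fresh random state yields a new table with its own causal structure, so a pretraining corpus is built by repeating the call many times.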
Introducing TabFM: A zero-shot foundation model for tabular data · The latest research from Google
Science, Technology & Innovation · Jun 30, 2026
TabFM replaces per-dataset training for tabular ML with zero-shot, in-context inference. It ingests an entire table, both the historical examples and the target rows, as a single prompt, and classifies or regresses on unseen tables in a single pass. This reduces manual training, tuning, and feature engineering, which speeds up deployment.
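The shape of this workflow can be sketched as follows: labeled history and unlabeled target rows are stacked into one input, and predictions come out of one call with no fitting step. The names `build_prompt` and `predict_in_context` are hypothetical. The distance-weighted average used as the predictor is only a stand-in for the model's forward pass and is not how TabFM computes its outputs.

```python
import numpy as np

def build_prompt(X_context, y_context, X_target):
    # Stack historical rows and target rows into a single table.
    X = np.vstack([X_context, X_target])
    # Target rows carry no label; NaN marks the cells to be predicted.
    y = np.concatenate([y_context, np.full(len(X_target), np.nan)])
    is_target = np.arange(len(X)) >= len(X_context)
    return X, y, is_target

def predict_in_context(X, y, is_target, temperature=1.0):
    # Stand-in for a single forward pass: each target row takes a
    # softmax-weighted average of the context labels, with weights based on
    # squared distance to the context rows. No parameters are trained.
    ctx, tgt = X[~is_target], X[is_target]
    d2 = ((tgt[:, None, :] - ctx[None, :, :]) ** 2).sum(axis=-1)  # (n_tgt, n_ctx)
    logits = -d2 / temperature
    logits = logits - logits.max(axis=1, keepdims=True)
    w = np.exp(logits)
    w = w / w.sum(axis=1, keepdims=True)
    return w @ y[~is_target]
```

The point of the sketch is the interface: a new table needs only `build_prompt` followed by one prediction call, with no per-dataset training loop in between.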