Meta Adaptive Ranking Model: Bending the Inference Scaling Curve to Serve LLM-Scale Models for Ads · Engineering at Meta
Science, Technology & Innovation · Mar 31, 2026
Meta argues that the real bottleneck for serving trillion-parameter recommenders is recommendation-scale embedding tables, not just large transformer layers. It addresses this with a distributed-memory architecture plus embedding-specific controls (feature-adaptive hashing, pruning, unified embeddings, and multi-GPU sharding with hardware-aware communication), reaching terabyte, O(1T)-parameter scale with single-card performance parity, fast loading, autoscaling, and production reliability. The competitive edge shifts to memory topology and embedding economics.
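A minimal sketch of what feature-adaptive hashing into a unified embedding table could look like. All feature names, bucket sizes, and the sharding rule are illustrative assumptions, not Meta's implementation:

```python
import hashlib

# Illustrative per-feature hash-space sizes: high-cardinality features get
# more rows; pruned or low-value features get fewer (feature-adaptive hashing).
FEATURE_BUCKETS = {
    "user_id": 1 << 20,   # large hash space for a high-cardinality feature
    "ad_id": 1 << 16,
    "country": 1 << 8,    # small space for a low-cardinality feature
}

# Unified embedding: every feature maps into one contiguous row space via
# per-feature offsets, instead of keeping a separate table per feature.
offsets, total_rows = {}, 0
for name, buckets in FEATURE_BUCKETS.items():
    offsets[name] = total_rows
    total_rows += buckets

def unified_row(feature: str, value: str) -> int:
    """Hash a feature value into its feature-adaptive bucket range,
    then shift by the feature's offset into the unified table."""
    h = int(hashlib.md5(f"{feature}={value}".encode()).hexdigest(), 16)
    return offsets[feature] + h % FEATURE_BUCKETS[feature]

def shard_for(row: int, num_gpus: int = 8) -> int:
    """Row-wise sharding of the unified table across GPUs (illustrative)."""
    return row % num_gpus
```

In a real system the sharding function would be topology-aware; the modulo rule here only shows where such a policy plugs in.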
Meta boosted recommendation-model inference by co-designing architecture and hardware. Micro-benchmarked, layer-wise selective FP8 post-training quantization and operator/kernel fusion (Grouped GEMM, horizontal fusion) cut small-operator and memory-movement overheads, raising FLOPs utilization to roughly 35% across diverse hardware and showing that heterogeneous fleets can run with per-layer precision and execution policies.
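The selective part of the quantization scheme can be sketched as a per-layer micro-benchmark that keeps low precision only where it is numerically safe. This is a toy model in pure Python: the fake 8-bit quantizer, the error tolerance, and the layer names are assumptions standing in for real FP8 kernels and calibration:

```python
import random

def quantize_dequantize(xs, levels=256):
    """Symmetric fake-quantization to `levels` steps, standing in for FP8."""
    scale = max(abs(x) for x in xs) / (levels / 2 - 1) or 1.0
    return [round(x / scale) * scale for x in xs]

def rel_error(xs, ys):
    """Relative RMS error of ys against xs."""
    num = sum((x - y) ** 2 for x, y in zip(xs, ys))
    den = sum(x * x for x in xs) or 1.0
    return (num / den) ** 0.5

def select_precision(layers, tol=0.02):
    """Micro-benchmark each layer: keep FP8 only where quantization error
    stays under `tol`; fall back to higher precision otherwise."""
    policy = {}
    for name, weights in layers.items():
        err = rel_error(weights, quantize_dequantize(weights))
        policy[name] = "fp8" if err < tol else "bf16"
    return policy

random.seed(0)
layers = {
    "mlp.0": [random.gauss(0, 1) for _ in range(512)],            # well-behaved
    "attn.qkv": [random.gauss(0, 1) for _ in range(512)] + [40.0],  # outlier blows up the scale
}
policy = select_precision(layers)
```

The outlier in `attn.qkv` inflates the quantization scale, so that layer fails the tolerance check and keeps higher precision, which is the behavior a layer-wise selective scheme is after.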
Business, Finance & Industries · Mar 31, 2026
Meta launched an Adaptive Ranking Model on Instagram in Q4 2025 that routes each request to the most effective model while preserving strict sub-second latency. For targeted users it produced +3% ad conversions and +5% ad CTR, highlighting that the ROI of larger ranking models depends on low-latency, cost-efficient serving rather than model size alone.
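One way such per-request routing could work is picking the most capable model tier that still fits the request's remaining latency budget. The tier names, latency numbers, and eligibility flag below are hypothetical, not details from the post:

```python
# Illustrative model tiers, largest first: (name, estimated p99 latency in ms).
MODELS = [
    ("xl_ranker", 320.0),
    ("large_ranker", 140.0),
    ("base_ranker", 45.0),
]

def route(remaining_budget_ms: float, eligible_for_large: bool) -> str:
    """Pick the most capable model that still fits the request's
    remaining latency budget; ineligible traffic gets the base tier."""
    for name, p99_ms in MODELS:
        if not eligible_for_large and name != "base_ranker":
            continue
        if p99_ms <= remaining_budget_ms:
            return name
    return "base_ranker"  # always fall back to the cheapest tier
```

A production router would also weigh expected value uplift and fleet load, but the budget check is the part that preserves the strict latency guarantee.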
Meta's Adaptive Ranking Model shifts from per-candidate to per-request inference: dense user signals are computed once per request and shared across ads via request-oriented computation sharing, in-kernel broadcast, and a centralized key-value log. Meta claims sub-linear cost scaling and LLM-level model complexity at roughly 100 ms latency.
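The per-candidate-to-per-request shift can be illustrated with a toy two-tower scorer: the heavy user-side computation runs once per request and its result is broadcast across all ad candidates. The function names and the scoring arithmetic are invented for the sketch:

```python
def user_tower(user_features):
    """Stand-in for the expensive dense user-side computation.
    A call counter makes the 'once per request' property observable."""
    user_tower.calls += 1
    return sum(user_features) / len(user_features)

user_tower.calls = 0

def score(user_repr, ad_features):
    """Cheap per-candidate interaction with the shared user representation."""
    return user_repr * sum(ad_features)

def rank_request(user_features, candidates):
    user_repr = user_tower(user_features)  # computed once per request
    # "Broadcast" the shared representation across every ad candidate,
    # instead of recomputing the user tower per candidate.
    return sorted(candidates, key=lambda ad: score(user_repr, ad), reverse=True)

ads = [[0.1, 0.2], [0.9, 0.4], [0.3, 0.3]]
ranked = rank_request([1.0, 2.0, 3.0], ads)
```

Per-candidate inference would call `user_tower` once per ad; here it runs once regardless of candidate count, which is where the claimed sub-linear cost scaling in the number of ads comes from.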