News Feed

★ The Supreme Court Rules That Law Enforcement’s Use of ‘Geofence Warrant’ Was a ‘Search’ (But May Be Moot, Technically, Since 2024)

Daring Fireball · Jun 30, 2026

The Supreme Court ruled 6-3 that geofence warrants—requests for location data tied to a time/place—constitute a Fourth Amendment 'search,' sending the Chatrie case back to evaluate reasonableness. The piece emphasizes the privacy implications of location-history data (noting Google’s shift to on-device storage with end-to-end encryption and Apple’s lack of geofence data) and argues this may limit such warrants going forward. It also advocates for clearer, accessible explanations of how location data is collected and used, beyond sensational assumptions about surveillance.

Expanding our Heat Resilience data to 50+ global cities

The latest research from Google · Jun 30, 2026

Google Research announces an expanded dataset of building-level rooftop reflectivity for 50+ global cities and a new Heat Resilience Earth Engine App to help urban planners deploy cool-roof solutions and reduce extreme heat. The project fuses Sentinel-2 data with 30-cm imagery to produce high-resolution albedo maps, validated against airborne measurements (RMSE 0.04), and indicates targeted cool-roof planning could cut urban temperatures by up to about 0.5°C globally. The dataset is open access across 9 countries and aims to prioritize retrofit efforts in vulnerable neighborhoods, with details published alongside related Nature Communications research.

Have your agent record video demos of its work with shot-scraper video

Simon Willison's Weblog · Jun 30, 2026

shot-scraper 1.10 adds a new 'video' command that uses a storyboard.yml to automate Playwright-driven demos of web apps and record them as video. The article walks through a concrete Datasette example and explains how the YAML-based storyboard is authored (including GPT-assisted generation) and integrated with Playwright’s screencast and Pydantic validation. It also discusses the development rationale and workflow, highlighting how coding agents can produce reproducible demos and the design decisions behind the feature.

Grant Sanderson – AI and the future of math

Dwarkesh Podcast · Jun 30, 2026

An interview with Grant Sanderson about AI's progress in mathematics, arguing that advances are highly domain-specific and unlikely to produce a sudden AGI moment; the transformative potential lies in AI's ability to generate conjectures and new definitions that could unify fields and alter the economy beyond solving existing problems. Using examples from the IMO, geometry versus combinatorics, and classic problems like the Riemann hypothesis and Fermat’s Last Theorem, he discusses where AI already shines, where it struggles, and why the next milestone might be the creation of new mathematical concepts or frameworks rather than mere problem-solving.

Charts of the Summer: Featuring Deel

a16z News · Jun 30, 2026

Using data from Deel’s portfolio of startups and tech/remote-first firms, the piece maps how people take vacation across regions and seasons, highlighting that North America tends to offer and use fewer vacation days than Europe. It covers patterns such as Brazil’s block-taking rule, last-minute India bookings, long weekends and Summer Fridays, and the fact that most time off is taken as single days rather than long holidays, with caveats about formal vs informal vacation usage. The article uses these data-driven insights to compare regional norms and seasonal trends in vacation behavior, making it a substantive, analytical read rather than a simple promotional update.

Introducing TabFM: A zero-shot foundation model for tabular data

The latest research from Google · Jun 30, 2026

TabFM is a zero-shot foundation model for tabular data that treats classification and regression as in-context learning, eliminating the need for dataset-specific training, hyperparameter tuning, and feature engineering. It employs a hybrid architecture with alternating row and column attention, row compression, and an in-context learning transformer, and is trained on hundreds of millions of synthetic tables generated via structural causal models so that it generalizes to real-world data. On TabArena it outperforms traditional, heavily tuned algorithms, and an ensemble variant performs better still. Google BigQuery integration is planned, enabling AI.PREDICT-based predictions without ML expertise.

The 2026 Summer Reading List

a16z News · Jun 29, 2026

The article is a curated 2026 Summer Reading List from a16z New Media, offering annotated picks across literature, media theory, and psychology. For each title, it provides why it matters, key themes, notable quotes, and guidance on reading, rereading, and pairing with other works, emphasizing how these books illuminate culture, technology, and human behavior.

Quoting Jon Udell

Simon Willison's Weblog · Jun 28, 2026

Jon Udell reframes 'human in the loop' as 'our loop,' urging developers to invite AI agents to join the team and collaborate rather than cede authority to machines. He advocates transparent, agent-assisted software development where workflows aren’t black boxes, avoiding unreviewable outputs like opaque pull requests. The piece presents a concise, opinionated stance on rethinking human–agent collaboration in development workflows.

★ Bernie Sanders: Ideologue and Economic Ignoramus

Daring Fireball · Jun 27, 2026

The article critiques Bernie Sanders’ tweet blaming Apple’s pricing on corporate greed, disputing his cited figure of $310 billion in stock buybacks and arguing that such a number is inconsistent with Apple’s reported buybacks and profits. It then uses economic reasoning to show that when input costs rise, firms must choose between raising prices and absorbing costs, and argues that Sanders treats different cost-shock scenarios inconsistently, revealing ideological bias. Overall, it’s a substantive, data-informed critique rather than mere commentary.

Microsoft Raises Xbox Prices, Drops High-End Storage Model From Lineup

Daring Fireball · Jun 27, 2026

Microsoft will raise Xbox console prices worldwide starting August 1, 2026, with a $100 increase for 512 GB models and $150 for 1 TB models, and will sunset the 2 TB variant. The company attributes the price hike to sharply rising storage and memory costs amid a broader components crisis, noting that these costs have grown by more than 2.5x and could double again by 2027. To improve accessibility, it’s also rolling out Buy Now, Pay Later, 0% APR financing up to 12 months, trade-in programs for previously played consoles, and certified refurbished units at discounts.

Saying the obvious thing

seangoedecke.com RSS feed · Jun 27, 2026

Stating the obvious—articulating truths people already tacitly know—can sharpen thinking, clarify why we dislike certain things, and help others feel validated when they realize they're not crazy. The piece argues that highlighting obvious points is hard but valuable in engineering and technical communication, revealing tacit dynamics (like workload, ego, and how engineers are evaluated) and enabling deeper analysis by then drawing out the subtleties behind those obvious claims. It also offers practical guidance: write down the obvious even if it's dangerous or seems trivial, because it anchors shared understanding and unlocks further exploration of the less obvious parts.

Quoting Dean W. Ball

Simon Willison's Weblog · Jun 26, 2026

The piece quotes Dean W. Ball on the economics of frontier AI models, arguing that these models incur enormous training costs and have only a short post-release window in which to recoup them before margins compress as competition grows. It also critiques the US AI infrastructure strategy, suggesting that the envisioned global market and the scale of data-center investments aren’t being realized due to access controls and deployment realities. The overall point is that strategic policy and investment are needed to sustain US AI competitiveness in the face of tightening margins and global dynamics.

What happened after 2,000 people tried to hack my AI assistant

Simon Willison's Weblog · Jun 26, 2026

Fernando Irarrázaval ran a hackmyclaw.com challenge to test whether secrets could be leaked from his OpenClaw test instance via email. After 6,000 attempts and about $500 in token spend, no secrets were leaked, thanks to anti-prompt-injection rules in Opus 4.6. The article notes that such defenses are improving but warns production systems remain vulnerable to more sophisticated attacks, and it cites a skeptical Hacker News thread for further discussion.

Accelerating Gemini Nano models on Pixel with frozen Multi-Token Prediction

The latest research from Google · Jun 26, 2026

Researchers retrofit a Multi-Token Prediction (MTP) head onto frozen Gemini Nano v3 models to accelerate on-device LLM inference on Pixel phones, avoiding retraining and preserving the base model’s behavior through a verification step. The approach employs a zero-copy design that cross-attends to the main model’s frozen KV cache, eliminating the need for a separate drafter and delivering speedups of roughly 50% or more with reduced energy use for tasks like AI Notification Summaries and Proofread, while maintaining bit-for-bit identical outputs. The work also notes future directions such as parallel decoding and relaxed verification to further boost edge efficiency.

Apple’s Full Statement on Yesterday’s Price Increases

Daring Fireball · Jun 26, 2026

Apple has raised prices across many products—including Macs, iPads, Apple TV, HomePod, HomePod mini, and Vision Pro—citing a memory-chip shortage driven by the expansion of AI data centers. Price increases range from $30 for the HomePod mini to $1,300 for the Mac Studio, with iPhone, Apple Watch, and AirPods unchanged for now. The article notes Apple’s language suggests more price moves could occur as the chip shortage persists and situates the move within a broader industry trend of rising memory costs.

The next big breakthrough will be AIs learning on the job

Dwarkesh Podcast · Jun 26, 2026

The article argues that a path to AGI is to train AIs with reinforcement learning with verifiable rewards (RLVR) on millions of verifiable tasks across diverse RL environments, developing general problem-solving skills that work within a session and potentially reducing the need for continual online learning. It analyzes the limits of data efficiency, long-horizon generalization, and the difficulty of creating replayable real-world training targets, and discusses approaches like on-policy self-distillation (OPSD) and architectural/loss-design innovations as ways to improve continual learning without heavy weight updates. It remains speculative about how well RLVR will generalize to domains like politics or entrepreneurship, but emphasizes that breakthroughs in representations and learning objectives are needed for true on-the-job intelligence.

Charts of the Week: Cycles, different but the same

a16z News · Jun 26, 2026

The article contrasts the current tech cycle with prior ones, arguing tech remains the winner but leadership has inverted across sectors: asset-heavy industries like energy, materials, and construction rally while asset-light ones lag, driven by AI infrastructure and a shift from “bits” to “atoms.” It cites private-market signals (robotics/physical AI investment) and research showing that AI’s productivity gains hinge on upstream discovery and rethinking processes, not merely adopting AI, with startups tending to run lean and solopreneurs expanding. It also uses a grocery-retail productivity discussion to illustrate how technology’s impact is nuanced and depends on measurement and business-model shifts.

AI inference is obviously profitable

seangoedecke.com RSS feed · Jun 26, 2026

The article contends that AI inference is profitable at scale, rejecting the belief that it must be subsidized or is inherently unprofitable. It runs rough cost calculations (about $1 per million output tokens and roughly 13 cents per hour for a 70B dense model on four A100s) and cites frontier margins of 70-80% as well as open-weight providers like DeepSeek to argue that serving models at scale can be profitable. It also explains that inference revenue currently subsidizes training costs for AI labs, yet the inference business would likely survive even if major labs fail because models can be licensed or acquired by others.
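
A back-of-envelope estimate of this kind can be sketched in a few lines of Python. The function name and the sample GPU price and throughput below are illustrative assumptions, not figures taken from the article; the sketch only shows how a serving cost per million tokens follows from hardware cost and aggregate throughput.

```python
def cost_per_million_tokens(gpu_hourly_usd: float,
                            num_gpus: int,
                            tokens_per_second: float) -> float:
    """Serving cost in USD per one million output tokens.

    All three inputs are hypothetical placeholders supplied by the caller.
    """
    node_hourly_usd = gpu_hourly_usd * num_gpus   # cost of the whole node per hour
    tokens_per_hour = tokens_per_second * 3600    # aggregate output across all batched requests
    return node_hourly_usd / tokens_per_hour * 1_000_000


if __name__ == "__main__":
    # Placeholder inputs: 4 GPUs at $2/hour each, 2,000 output tokens/s in aggregate.
    estimate = cost_per_million_tokens(2.0, 4, 2000.0)
    print(f"${estimate:.2f} per million output tokens")
```

The estimate is dominated by aggregate throughput, so assumptions about batching and utilization matter far more than small differences in GPU price.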

Blink if you’re human

DYNOMIGHT · Jun 26, 2026

"Blink if you’re human" examines whether blog posts are truly human-written and argues for transparency about AI involvement along a spectrum rather than a binary AI-free stance. It reframes writing as a human–AI collaboration in the so-called centaur era, urging authors to set and disclose their personal AI-use boundaries, while noting that improving detectors and reputational risk will shape how openly people report their AI usage.

Privacy-Aware Infrastructure in the AI-Native Era: An Asset Classification Case Study

Engineering at Meta · Jun 25, 2026

This article outlines privacy-aware infrastructure (PAI) and asset classification as the foundation for enforcing data governance in AI-native systems. It presents a practical pattern—build rich context before prompting, use LLMs narrowly for ambiguity, keep human-reviewed labels separate, and distill stable behavior into deterministic rules—along with a seven-stage process demonstrated at Meta. It emphasizes challenges like noisy signals, distributed context, and evolving requirements, arguing that production enforcement should be deterministic and auditable while leveraging model reasoning only for novel cases.

AI and Liability

Simon Willison's Weblog · Jun 25, 2026

The post discusses Bruce Schneier's take on a German ruling that Google should be liable for errors in AI-generated overviews. Schneier argues that AI agents are agents of the entities that deploy them and should be treated as such under the law, meaning companies should be liable for AI-produced inaccuracies just as they would be for human-written content. It warns that allowing firms to hide behind faulty AI would incentivize corporate misbehavior and undermine accountability.

Q&A: How KRAFTON Built PUBG Ally, a Co-Playable Character Powered by NVIDIA ACE

NVIDIA Technical Blog · Jun 25, 2026

PUBG Ally is a co-playable AI teammate for PUBG: BATTLEGROUNDS, built with NVIDIA ACE to listen to player voice, reason about game state, and respond in real time using an on-device 2B-parameter small language model, ASR, and custom TTS. The system uses a fast system for immediate in-game actions and a slower language-model-driven system for deliberate reasoning, grounding responses in a closed PUBG context (Sanhok AI Duo) and live game state via tool calls. KRAFTON emphasizes multilingual support, long- and short-term memory, latency optimization through prompt design and caching, and an iterative, large-scale playtest feedback loop to tune the teammate before and during arcade beta.

NASA’s Artemis II Mission

a16z News · Jun 25, 2026

Artemis II is NASA's first crewed test flight of the Artemis program, looping around the Moon on a figure-eight trajectory and reaching the farthest distance humans have traveled from Earth, a milestone on the path to a sustainable lunar presence and future Mars missions. The piece places Artemis II within a detailed historical arc—from Apollo through Shuttle, Constellation, and the Space Launch System/Orion—explaining how funding, politics, and program architecture shaped NASA's ability to return to the Moon. It also outlines the mission crew, the large industrial coalition behind SLS/Orion, the costs involved, and how earlier heat-shield issues were addressed to enable Artemis II.

Investing in Netris

a16z News · Jun 25, 2026

Guido Appenzeller argues that the AI data center’s back-end interconnects require new management beyond classic multi-layer networks. Netris offers a scalable network management platform with switch agents, automation, multi-tenancy, and day-0/1 tooling, enabling reliable configuration and diagnostics across tens of thousands of GPUs. The piece highlights the company’s eight-year track record and strong engineering team as reasons for a strategic investment to advance networking in AI infrastructure.

How Notion used the Cursor SDK to embed coding agents

Cursor Blog · Jun 25, 2026

Notion has integrated the Cursor SDK to embed autonomous coding agents inside Notion, enabling users to delegate tasks in docs, threads, or databases with Cursor handling planning, building, testing, and PRs end-to-end. The integration was completed in weeks because the Cursor SDK provides a full-stack coding agent and infrastructure, allowing Notion to focus on product UX rather than agent infrastructure. It supports live streaming of agent runs, remote MCPs for real-time workspace access, and customizable templates and triggers to tailor agents to common workflows.

Accelerating Gemini Nano models on Pixel with frozen Multi-Token Prediction

The latest research from Google

Jun 26, 2026

On-Device Latency Reduction Through Freezing The Base Model And Training A Lightweight Drafting Head With Verification While Preserving Bit-For-Bit Output

Science, Technology & Innovation · Jun 26, 2026

Google retrofitted Multi-Token Prediction onto Gemini Nano v3 by freezing the backbone and training a lightweight drafting head plus verifier, yielding verified, bit-for-bit identical outputs with out-of-the-box latency speedups on Pixel devices without retraining or requalifying the base model.
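
The "bit-for-bit identical" property comes from verification: a drafted token is kept only when it matches what the base model would have produced anyway. The toy below is a minimal sketch of generic speculative decoding with greedy verification, not Google's MTP implementation. `target_next` and `draft_next` are hypothetical stand-ins for the frozen base model and the drafting head, and the verification loop calls the target once per position where a real system would score all drafted positions in one batched pass.

```python
from typing import List, Tuple

VOCAB = 50  # toy vocabulary size (assumption)


def target_next(context: List[int]) -> int:
    """Hypothetical frozen base model: a deterministic greedy next-token rule."""
    return (sum(context) * 31 + len(context)) % VOCAB


def draft_next(context: List[int]) -> int:
    """Hypothetical drafting head: agrees with the target except when
    the context length is a multiple of 4, where it guesses wrong."""
    guess = target_next(context)
    return guess if len(context) % 4 else (guess + 1) % VOCAB


def greedy_decode(prompt: List[int], n_new: int) -> List[int]:
    """Baseline: one base-model pass per generated token."""
    out = list(prompt)
    for _ in range(n_new):
        out.append(target_next(out))
    return out


def speculative_decode(prompt: List[int], n_new: int, k: int = 3) -> Tuple[List[int], int]:
    """Draft k tokens, verify them against the target, keep the matching prefix.

    Returns the generated sequence and the number of verification passes.
    """
    out = list(prompt)
    target_len = len(prompt) + n_new
    passes = 0
    while len(out) < target_len:
        # 1) Draft k tokens cheaply.
        draft: List[int] = []
        for _ in range(k):
            draft.append(draft_next(out + draft))
        # 2) Verify (counted as one pass of the base model).
        passes += 1
        accepted: List[int] = []
        for tok in draft:
            expected = target_next(out + accepted)
            if tok == expected:
                accepted.append(tok)
            else:
                accepted.append(expected)  # replace the first mismatch with the target's token
                break
        else:
            # Every draft matched: the same pass also yields one more target token.
            accepted.append(target_next(out + accepted))
        out.extend(accepted)
    return out[:target_len], passes


if __name__ == "__main__":
    prompt = [1, 2, 3]
    spec, passes = speculative_decode(prompt, 20)
    print(spec == greedy_decode(prompt, 20), passes)
```

Because every appended token is the base model's own greedy choice, the drafter only affects how many passes are needed, never what is generated.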

Zero-Copy Drafter Reuses Main Model KV Cache To Eliminate Prefill Latency And Reduce Runtime Memory For Edge AI

Science, Technology & Innovation · Jun 26, 2026

A “zero-copy” drafter design reuses the main model’s frozen key-value cache via an MTP head that cross-attends to it. This eliminates drafter prefill latency and cuts ~130 MB of runtime memory per instance versus a standalone drafter, addressing the mobile dynamic-memory bottleneck and showing that memory architecture can be as decisive as model quality for edge AI deployment.

MTP Improves On-Device Inference Efficiency By Increasing Tokens Per Pass And Reducing Verification Frequency, Improving Battery Life

Science, Technology & Innovation · Jun 26, 2026

Google's MTP uses speculative verification with richer drafts and a redesigned on-device inference stack to validate nearly two extra tokens per pass in production features (e.g., AI Notification Summaries, Proofread), cutting verification frequency, reducing how often heavy processors wake, lowering energy use and improving battery life—making on-device AI more usable and commercially defensible.
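
The link between tokens per pass and verification frequency is simple arithmetic, sketched below. It assumes that "extra" is measured against the one token per pass of plain autoregressive decoding, which the summary does not state explicitly; the function name and the 300-token response length are illustrative.

```python
import math


def verification_passes(n_tokens: int, extra_accepted_per_pass: float) -> int:
    """Base-model passes needed to emit n_tokens.

    Each pass yields one token from the base model plus the accepted drafts.
    """
    tokens_per_pass = 1.0 + extra_accepted_per_pass
    return math.ceil(n_tokens / tokens_per_pass)


if __name__ == "__main__":
    n = 300  # hypothetical response length
    print(verification_passes(n, 0.0))  # plain decoding: one pass per token
    print(verification_passes(n, 2.0))  # upper bound for "nearly two extra tokens"
```

Fewer passes does not translate one-to-one into wall-clock speedup, since drafting and verifying several positions per pass carry their own cost.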

Integrated MTP Heads Outperform Standalone Drafters By Reusing Backbone Activations

Science, Technology & Innovation · Jun 26, 2026

Google found that attaching a lightweight MTP drafter head to a model’s final hidden states (a late-exit design) outperforms similarly sized standalone drafters in speculative decoding, giving speedups of roughly 50% or more on Pixel 9 and up to 55% higher token acceptance. This implies that reusing the backbone’s internal state matters more than mere parameter parity.