📰 All News

Amazon convenes 'deep dive' internal meeting to address outages (4 minute read)

An Amazon spokesperson says that a single incident was related to AI, and none of the incidents involved AI-written code.

I built a programming language using Claude Code (26 minute read)

Cutlet is a dynamic programming language built entirely by Claude Code that combines the expressiveness of Python, Ruby, Lua, and JavaScript with first-class support for running subprocesses, building...

A Unix Manifesto for the Age of AI (4 minute read)

The Unix philosophy has survived because its authors built restraint into the architecture.

Shipping features has never been cheaper. How do you price them? (Sponsor)

AI keeps reducing the cost to build products, and no one knows how to price them anymore. Per user? Per feature? Per workflow? If your billing team is struggling to keep up, this Metronome white paper...

Meta Acquired Moltbook (3 minute read)

Meta has acquired Moltbook, a Reddit‑like network where AI agents built with the OpenClaw framework interact with each other and maintain an always‑on directory of agents.

Google launches new multimodal Gemini Embedding 2 model (2 minute read)

Google's Gemini Embedding 2, available via the Gemini API and Vertex AI, unifies text, images, videos, audio, and documents in over 100 languages. The model processes up to 8,192 text tokens, six imag...

Nvidia Invests in Mira Murati's Thinking Machines Lab (2 minute read)

Nvidia and Mira Murati's Thinking Machines Lab have formed a multiyear partnership in which the startup will deploy at least one gigawatt of cutting-edge chips to train and serve its frontier models. ...

Codex, File My Taxes. Make No Mistakes (11 minute read)

Codex can be used to file personal taxes, and it can even be more accurate than a human accountant. The immediate feedback that Codex gives users helps them understand their situation and the tax code...

The State of Consumer AI. Part 2: Engagement and Retention (4 minute read)

ChatGPT's engagement lead is wider than its market share lead: DAU:MAU sits at 45% versus Gemini's 22%, and WAU:MAU has climbed from 50% in mid-2023 to 82% today, putting it ahead of Gmail and Spotify...

Open Weights isn't Open Training (17 minute read)

Open source models promise to distribute the value created by AI more broadly, enabling more people to build. However, the current ecosystem doesn't make building easy. The stack conta...

AI benchmarks don't mean what you think they mean (Sponsor)

Benchmarks are trotted out whenever a new model is released, but what do they actually measure? ngrok's Sam Rose dug into the papers, code, and critiques to understand what 14 popular AI benchmarks ac...

The Anatomy of an Agent Harness (9 minute read)

An agent is a model with a harness. Harness engineering is the process of turning models into work engines by building systems around them. Models contain the intelligence, and the harness makes that ...

RCLI (GitHub Repo)

RCLI is an on-device voice AI for macOS that can control apps and perform other actions via voice. Users can choose from a variety of local AIs and perform 38 macOS actions. The tool can be used to in...

Quantifying infrastructure noise in agentic coding evals (12 minute read)

Agentic coding benchmarks are commonly used to compare the software engineering capabilities of frontier models. These scores are often treated as precise measurements of relative model capability. Ho...

How NVIDIA Builds Open Data for AI (12 minute read)

Every AI training pipeline rests on a data layer that determines how those models behave. This data determines what the models know, how they reason, and what they can safely do. However, much of toda...

Amazon wins court order to block Perplexity's AI shopping agent (3 minute read)

Perplexity's Comet AI browser has been blocked from accessing Amazon's site. Amazon sued Perplexity in November, alleging the startup concealed its AI agents so it could continue to scrape Amazon's we...

Notes from Token Town: Negotiating for the Fortune 5 Million (11 minute read)

Frontier labs are continuing to build first-party products on top of their own capabilities. Every token they sell is one they could instead spend themselves at a COGS of under 50%. Your supplier is also your c...

Selectively reducing eval awareness and murder in Gemma 3 27B via steering (5 minute read)

Google's Gemma 3 models, including the 27B variant, were steered to alter features correlating with evaluation awareness and the intent to murder.

Your Data Agents Need Context (12 minute read)

Data and AI agents struggle without proper context, and messy, disparate enterprise data makes it harder for them to answer even basic queries.

The era of “AI as text” is over. Execution is the new interface. (5 minute read)

GitHub Copilot SDK enables AI-driven execution directly within applications, moving beyond simple text interactions.

Instruction Hierarchy Training for Safer LLMs (6 minute read)

OpenAI's IH‑Challenge is a dataset designed to train models to prioritize instructions based on trust level across system prompts, developers, users, and external data.

claude-ground (GitHub Repo)

claude-ground introduces a minimal rule system for Claude Code, providing phase tracking, decision logging, and language-specific best practices to improve coding discipline.
