Generative AI: 9th September 2025
Headlines
⢠The DuckDuckGo subscription now offers GPTâ4o/GPTâ5, Claude Sonnet 4 and Llama Maverick access for $9.99/month, giving users direct access to multiple cuttingâedge LLMs.
⢠OpenAI acquired experimentation platform Statsig for $1.1B and named its CEO CTO of Applications, signaling tighter A/B testing and realâtime decisioning integration across OpenAI products.
⢠Companies are advancing AI doppelgängers and video avatars to scale personal knowledge and meetings while Synthesia's Expressâ2 avatars mimic real speakers and could enable realâtime interactivity, pushing lifelike agent use in sales, coaching and media.
⢠New research and reporting highlight risks: chatbots and AI companions can be manipulative and raise fairness concerns, experts warn of mentalâhealth harms linked to AI use and legal scrutiny follows with calls for better parental controls and concerns about therapists secretly using ChatGPT; safety researchers also sounded alarms about broader systemic risks (https://www.theguardian.com/technology/2025/sep/08/chatbots-mental-health-warning-super-intelligent-ai-nate-soares).
⢠Security analysts warn that AIâdriven development could make subtle backdoors in openâsource projects harder to detect, prompting calls for stronger maintainer support and supplyâchain defenses.
⢠Firefox Nightly added Microsoft Copilot to its sidebar, bringing voice, image and document analysis modes into the browser for users and developers to test.
⢠VCs are pouring hundreds of millions into AIâpowered customer service startups, accelerating automation of support workflows and the deployment of agentic AI in CX.
⢠Researchers are weighing the pros and cons of synthetic data for privacy, bias mitigation and model testing, noting toolchains like Synthetic Data Vault and tradeoffs around validation and realism.
Company Engineering Blogs
Using AI to perceive the universe in greater depth (deepmind.google). Deep Loop Shaping uses reinforcement learning with frequency-domain rewards to reduce control noise in LIGO's mirror systems, improving gravitational-wave measurement
A New Ranking Framework for Better Notification Quality on Instagram (engineering.fb.com). Diversity-aware notification ranking using multiplicative demotion, MMR-based similarity across content, author, type, and product surface, with adjustable weights and potential for LLM integration
Building Sustainable Enterprise AI Adoption: Cultural Strategies That Achieved 95% Developer Engagement (engineering.salesforce.com). Salesforce shares how to scale AI adoption beyond code generation, tackling monolithic codebases, modular loading, and enterprise-wide cultural change
Spec-driven development with AI: Get started with a new open source toolkit (github.blog). Spec Kit enables spec-driven development with GitHub Copilot, Claude Code, and Gemini CLI to turn specs into executable artifacts
Welcome EmbeddingGemma, Google's new efficient embedding model (huggingface.co). EmbeddingGemma: Google's 308M multilingual on-device text embeddings, MMTEB/MMTEB v2 benchmarks, MRL truncation, 2K context, on-device RAG, Sentence Transformers, LangChain, LlamaIndex, Haystack, txtai, TEI, ONNX, FAISS
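For readers unfamiliar with the "MRL truncation" named in the EmbeddingGemma entry: with Matryoshka-style embeddings, a shorter vector is usually obtained by keeping the leading dimensions and re-normalizing. The sketch below shows only that general idea on a toy vector. The function names and the numbers are illustrative assumptions, not the model's actual API or output.

```python
import math

def truncate_embedding(vec, dim):
    """Keep the first `dim` components and rescale to unit L2 norm.

    Hypothetical helper for illustration; real libraries expose their own
    options for this.
    """
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)  # nothing to rescale
    return [x / norm for x in head]

def cosine(a, b):
    # Inputs are assumed to be unit vectors, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))

# Toy 4-dimensional vector (made up, not real model output).
full = [3.0, 4.0, 0.5, -0.5]
small = truncate_embedding(full, 2)
print(small)
```

The trade-off is storage and search cost against retrieval quality: shorter vectors are cheaper to index, and how much accuracy they lose depends on the model and task.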
Applied AI: creative, education, and genomics
When Machines that Simulate Intelligence Seemed Like a Summer Project (tensorlabbet.com). Explores Dartmouth 1956 proposal, seven themes, and how early AI ideas compare with modern LLMs, diffusion, and self-improvement concepts
Stumbling into AI: Part 2 - Models (rmoff.net). Overview of LLMs, tokens, context windows, weights, clients, tools (MCP), and routers like OpenRouter and Raycast in the AI ecosystem
Conversations with Large Language Models: Battle Decks (aaronland.info). Generative systems in museums: revisiting collections, storytelling, vibes, and playful infrastructure using artifacts, Muppets, and lava-lamp metaphors
DNA Foundation Models and Their Applications (aditharun.com). DNA Foundation Models generate DNA sequences and predict genomic properties; Evo2, AlphaGenome, Caduceus; tissue-specific promoters; in silico mutagenesis; VUS resolution; biosecurity; benchmarking; data quality; RC-equivariance
From Static Textbooks to Living Systems: How I Tried to Turn My Brain into AI Agents (blog.crackinglanguage.com). Living systems for learning: RAG, edge tools, BYOK, Thai syllable analysis, and a dynamic, personalized teaching platform
Infra, LLMOps, and hardware trends
A Technical History of Generative Media - with Gorkem and Batuhan from Fal.ai (latent.space). Fal.ai's pivot from a Python cloud runtime to optimized diffusion inference, CUDA kernels, and multi-model hosting for 2M developers and 350 models
AI Operations Under the Hood: Challenges and Best Practices (towardsdatascience.com). A practical framework for LLMOps and GenAI, focusing on data prep, RAG, evaluation, monitoring, and safety
Google's Nano Banana is the start of a Massive AI Trend [Markets] (artificialintelligencemadesimple.substack.com). Nano Banana diffusion models, four choke points, memory/packaging, HBM/CoWoS, p99 latency, ASICs, porting tax, CUDA moat, deterministic silicon, edge, video, supply chains
Build Production-Ready Agentic-RAG Applications From Scratch Course: What we are going to build (newsletter.theaiedge.io). Hands-on course building production-ready Agentic-RAG apps with LangGraph, FastAPI, React, Pinecone, Langsmith on GCP
Evals, embeddings, and model quality
How big are our embeddings now and why? (newsletter.vickiboykis.com). Trends in embedding sizes from 300 to 1536+; BERT 768 baseline; GPT-3/2/CLIP; HuggingFace; OpenAI matryoshka; vector databases; MTEB benchmarks
llm-eval-simple a simple way to evaluate LLM for your use case (grigio.org). Evaluate OpenAI-compatible APIs with prompts and metrics across models like gemma-3-27b-it-qat-q4_0-q3_k_m, gpt-oss-20b-mxfp4, and Qwen3-4B-IQ4_NL
Gemini AI in Gmail is terrible (nelsonslog.wordpress.com). Gemini-in-Gmail shows limited email access, poor RAG retrieval, and disruptive AI UI in Gmail
In Defense of AI Evals, for Everyone (sh-reya.com). Defends AI evals as systematic, continuous quality measurements across post-training and practical dogfooding, with examples in coding, document processing, and policing data
RAG engineering and retrieval systems
How Dropbox Built an AI Product Dash with RAG and AI Agents (blog.bytebytego.com). Dropbox Dash uses RAG and AI Agents to unify data across Gmail, Slack, Notion, Jira, and Dropbox with a custom interpreter for safe AI execution
How to Scale Your AI Search to Handle 10M Queries with 5 Powerful Techniques (towardsdatascience.com). Scaling AI search with RAG, contextual retrieval, BM25, router agents, and evaluations for 10M queries
Generate Dataframe Summaries With Python (fundor333.com). Generate dataframe summaries with Python, LangChain, Ollama, Mistral, Pandas, and custom context-driven reports for Cirrhosis patient data analysis
Chroma: RAG is Dead; Long Live Context Engineering (cto4.ai). Chroma shifts focus from RAG to context engineering for grounding AI with embeddings and metadata
The AI Architect's Guide to RAG Debugging: A 3-Step Process to Fix Hallucinations in Minutes, Not Days (mikulskibartosz.name). 3-step RAG debugging guide: retrieval cascade, hybrid search, reranking, prompt engineering, HyDE, RRF, BM25, bi-encoders, cross-encoders, and observability for LLMs
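Several entries above mention hybrid search and RRF. As a rough illustration of reciprocal rank fusion, the sketch below merges a keyword ranking and a dense ranking by summing 1/(k + rank) per document. It is a generic sketch and not code from any of the linked articles. The document IDs are made up, and k = 60 is a commonly used default, assumed here.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Ties are broken by document ID for determinism.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical results from a BM25 retriever and an embedding retriever.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
dense_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
print(fused)
```

Because RRF uses only ranks, it avoids having to calibrate BM25 scores against cosine similarities, which is one reason it is a common first choice for hybrid retrieval.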
LLM internals: scaling, training, and architecture
The wall confronting large language models (arxiv.org). Analysis of barriers to scaling LLMs, alignment, safety, computation, data, and governance with practical mitigations
Understanding and Implementing Qwen3 From Scratch (sebastianraschka.com). Hands-on Qwen3 from scratch in PyTorch: architecture, components, and building blocks for open-weight models
Gemma 3 Explained (opencv.org). Gemma 3 introduces multimodal vision, 128k context, GQA, RoPE, local-global attention, and a decoder-only Transformer with post-training and API call capabilities
Online versus Offline RL for LLMs (cameronrwolfe.substack.com). Online vs offline RL for LLMs; analyzes PPO-based RLHF online training, offline DPO, SFT variants, rejection sampling, and semi-online approaches across Llama-2 and SafeRLHF data
The Physics of AI Hallucination: New Research Reveals the Tipping Point for Large Language Models (firstprinciples.org). Physicist Neil Johnson maps a tipping point in LLMs, using a spin model, gap cooling, and attention head dynamics to predict hallucinations
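The "offline DPO" named in the online-versus-offline RL entry can be made concrete with the standard DPO objective for a single preference pair: the loss is -log sigmoid(beta * margin), where the margin compares how far the policy has moved from a frozen reference model on the chosen versus the rejected response. The sketch below is a minimal illustration. The log-probabilities and beta = 0.1 are invented, and it is not taken from the linked article.

```python
import math

def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one (chosen, rejected) pair.

    Arguments are sequence log-probabilities under the trained policy and
    the frozen reference model. Returns -log(sigmoid(beta * margin)).
    """
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    z = beta * margin
    # Numerically stable form of log(1 + exp(-z)).
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))

# Policy identical to the reference: zero margin, so the loss is log(2).
baseline = dpo_loss(-10.0, -10.0, -10.0, -10.0)

# Policy that raised the chosen response and lowered the rejected one.
improved = dpo_loss(-5.0, -12.0, -8.0, -8.0)

print(baseline, improved)
```

The training data here is a fixed set of preference pairs, which is what makes the method offline: no new responses are sampled from the policy during optimization.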
Academic Research
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey (arxiv:cs). Survey of Agentic RL for LLMs: planning, tool use, memory, reasoning, self-improvement, perception, POMDPs, benchmarks, open-source frameworks, and five hundred works
OmniActor: A Generalist GUI and Embodied Agent for 2D&3D Worlds (arxiv:cs). OmniActor: Layer-heterogeneity MoE, GUI and embodied data synergy, 2D GUI and 3D embodied worlds, generalist agent, cross-domain training
Symbolic Graphics Programming with Large Language Models (arxiv:cs). RL with verifiable rewards improves SVG generation for symbolic graphics programming using SVGs with SigLIP and DINO encoders
Aligning Large Vision-Language Models by Deep Reinforcement Learning and Direct Preference Optimization (arxiv:cs). Overview of aligning large vision-language models via Deep Reinforcement Learning and Direct Preference Optimization for human-aligned multimodal systems
KVCompose: Efficient Structured KV Cache Compression with Composite Tokens (arxiv:cs). KV cache compression for long-context LLMs using attention-guided composite tokens and layer-adaptive allocation
Before you go
I've got a big favor to ask - keeping Blaze running isn't expensive, but it does all add up, so I'm asking readers like you to help, if you can.
That's why I'm launching a Patreon page! Nothing flashy, just a way for folks who find value in these newsletters to chip in a little each month. In return, you'll get:
- Real say in how Blaze evolves: vote on new topics, features, topic curation ideas
- First dibs on merch (details still cooking)
- That warm fuzzy feeling knowing you're supporting something that saves you time and keeps you plugged into great tech writing
If you are getting value from Blaze, checking this out would mean the world. And if you can't contribute, no worries: the newsletters keep coming either way, and you can follow along on Patreon for free.
Thanks for reading and being part of this nerdy corner of the internet. All the best - Alastair.
About Generative AI
Our Generative AI newsletter covers the latest developments, trends, tools, and insights in AI research, LLMs and agentic applications. Each week, we curate the most important content from over 50,000 blogs and news sites so you don't have to spend hours searching.
Whether you're a beginner or expert in generative AI, our newsletter provides valuable information to keep you informed and ahead of the curve in this rapidly evolving field.