🤖

Machine Learning Engineer: 16th September 2025

Published 16th September 2025

🔧 Company Engineering Blogs

Next Gen Data Processing at Massive Scale At Pinterest With Moka (Part 2 of 2) (medium.com/pinterest-engineering). Deploying EKS clusters, Fluent Bit logging, OTEL metrics pipelines, image management, and a custom Moka UI for Spark on Kubernetes

🎓 Research & academia: TDA, AI safety, self-assembly, neural rendering, audio ML, lab news

Conference on TDA: Recent Developments and Applications, University of Missouri – Columbia, November 22-24, 2025 (appliedtopology.org). Conference at the University of Missouri on topological data analysis, covering theory, algorithms, and real-world applications

CS 2881: AI Safety (windowsontheory.org). CS 2881 AI Safety course page by Boaz Barak; includes lecture video, slides, homework zero, LessWrong posts

RenderFormer: How neural networks are reshaping 3D rendering (microsoft.com). RenderFormer learns a full graphics pipeline with triangle tokens, dual-transformers, and ray bundle tokens to render arbitrary 3D scenes with global illumination

Analysis and Synthesis of Audio with AI: from Neurological Disease to Accented Speech and Music (dorienherremans.com). Automated oral diadochokinesis assessment, accent-converted Text-to-Speech, and controllable music generation with MidiCaps, MusicBench, Mustango, and SonicMaster

Self-Assembly Gets Automated in Reverse of ‘Game of Life’ (quantamagazine.org). Neural cellular automata learn rules to self-assemble shapes, enabling regeneration and distributed computation

Congratulations Dr. Jan on graduating! (dorienherremans.com). Dr. Jan Melechovsky’s PhD journey at AMAAI lab includes dysarthric speech analysis, text-to-music, datasets, and audio AI tools like Mustango and SonicMaster

🛠️ Practice & ops: ML team process, security, field notes, hackathon builds

Link Graveyard: A snapshot of my abandoned browser tabs (timkellogg.me). Snapshot of abandoned browser tabs covering AI, LLMs, data curation, GLM-4.5, prompts, embeddings debates, and infrastructure papers

Web Directions Engineering AI - Notes (halans.com). Notes on Web Directions Engineering AI: talks on copilots, agents, MCPs, context engineering, and human-in-the-loop practice

Extreme Programming for ML Teams: Faster Delivery, Reliable Results (probableodyssey.blog). Extends XP to ML teams, emphasizing CI, TDD-like data-driven testing, simple design, collaboration, and treating experiments as releases

Brian (bex) Exelbierd: Day 1: Microsoft Hackathon — Building a Focused Summarizer for Upstream Linux (winglemeyer.org). Lightweight LLM-driven Debian mailing-list summarizer MVP; agentic coding in Python; data collection from August 2025; memory-focused architecture; avoids full vector DB

Data Poisoning Attacks (infosecwriteups.com). Overview of data poisoning dimensions: objective, goal, attacker knowledge, stealthiness, scope, impact, and variability

🧱 Data platforms for ML: Polars, Spark-on-K8s, Kafka/Flink, Postgres vectors, AWS

AI in Production: Gen AI and Agentic AI on AWS at scale (edwarddonner.com). Gen AI on AWS at scale: Bedrock, SageMaker, Lambda, App Runner, RAG pipelines, and multi-agent MCP deployment for enterprise-grade AI

Polars at Decathlon: Ready to Play? (pola.rs). Decathlon uses Polars on Kubernetes for faster, memory-efficient pipelines, replacing pandas for smaller datasets and enabling streaming engines

Online Feature Store for AI and Machine Learning with Apache Kafka and Flink (kai-waehner.de). Real-time feature store with Apache Kafka and Flink powering Wix personalization and AI-driven experiences

🐥 Vector embeddings with Ash, OpenAI, and PostgreSQL (yellowduck.be). AshAi with OpenAI embeddings stored in PostgreSQL's vector extension for semantic search and recommendations in Elixir apps

⚡ LLM systems performance, kernels & GPU serving

Network and Storage Benchmarks for LLM Training on the Cloud (maknee.github.io). Network and storage benchmarks for distributed LLM training with SkyPilot and Nebius, comparing InfiniBand vs Ethernet and various storage tiers

Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers (huggingface.co). OpenAI gpt-oss techniques in transformers: MXFP4 quantization, custom Hub kernels, Flash Attention 3, TP/EP, dynamic KV cache, continuous batching, and load-time optimizations

Nvidia's context-optimized Rubin CPX GPUs were inevitable (go.theregister.com). Nvidia Rubin CPX uses GDDR7 memory to disaggregate prefill workloads from decode for long-context AI workflows

Efficient LLM Serving with MTP: DeepSeek V3 and SGLang on AMD Instinct GPUs (rocm.blogs.amd.com). Speed up LLM inference with Multi-Token Prediction (MTP) in DeepSeek V3 using SGLang on AMD Instinct GPUs, detailing NextN draft model and EAGLE speculative decoding

Exploring Use Cases for Scalable AI: Implementing Ray with ROCm Support for Efficient ML Workflows (rocm.blogs.amd.com). Ray with ROCm enables scalable AI on AMD GPUs for LLM training, inference, serving, and RL via RayTrain and RayServe examples

Supercharge ML performance on xPUs with the new XProf profiler and Cloud Diagnostics XProf library (cloud.google.com). Profile ML models on xPUs with XProf and Cloud Diagnostics XProf library to identify bottlenecks and optimize performance

Defeating Nondeterminism in LLM Inference (simonwillison.net). Nondeterminism in LLM inference arises mainly from varying load and batch size; paper proposes batch-invariant kernels in PyTorch to achieve determinism

🧠 Modeling mechanics: embeddings, tokenization, from-scratch Transformers, recsys hybrids

How to Train an LLM-RecSys Hybrid for Steerable Recs with Semantic IDs (eugeneyan.com). LLM-Recsys hybrid with semantic IDs using RQ-VAE, SASRec, Qwen models; train on Amazon Video Games data; steerable, conversational recommendations

Qwen-8B Embeddings: Near-SOTA Performance at 600x the Speed (alexdong.com). Qwen-8B embeddings enable near-SOTA text classification, 600x faster than LLM classifiers, achieving MAP ~0.944 on Kaggle with simple MLP

lecture three (aarnphm.xyz). Lecture three on tokenizers, LLMs, alignment, sparse autoencoders, residual streams, and speculative decoding for efficient inference

numpy implementation of Transformer (aarnphm.xyz). NumPy-based Transformer with forward/backward passes, causal attention masking, gradient checks, and TinyStories dataset training guidance

🔢 Math of ML: activations, kernels, invariants, double descent

Out of Distribution Data, and other experiments for 'ML and vanishing order' Paper (davidlowryduda.com). Machine learning experiments on L-functions, PCA/LDA, and out-of-distribution data using Dirichlet coefficients and primes, with Python code excerpts

A Slotted Hash Cons for Alpha Invariance (philipzucker.com). Slotted e-graphs for alpha-invariant hashing, canonical forms, and lazy permutations in hash-consing

Maxout Activation Function (blog.sparsh.dev). Explains Maxout activation, its math, a 2-group example, and implementations in NumPy, PyTorch, and TensorFlow with applications and comparisons

Reimagining Equity Solvency Capital Requirement Approximation (one of my Master’s Thesis subjects): From Bilinear Interpolation to Probabilistic Machine Learning (thierrymoudiki.github.io). Probabilistic SCR equity approximation using RVFL and conformal prediction with R and Python implementations

More is More: Double Descent and HTE (gojiberries.io). Double descent in treatment effect estimation: wide models with minimum-norm RFFs improve prediction while preserving orthogonal inference

“Kernel Ridge Regression with Stochastic Gradient Descent Training Using JavaScript” in Visual Studio Magazine (jamesmccaffrey.wpcomstaging.com). JavaScript demo of kernel ridge regression trained with SGD, using an RBF kernel gamma and alpha regularization, published in Visual Studio Magazine

📚 Academic Research

Enhancing ML Models Interpretability for Credit Scoring (arxiv:q-fin). Hybrid approach: SHAP-guided feature selection and glass-box models (EBM, PLTR) for interpretable credit scoring with 10 features

Machine Learning with Multitype Protected Attributes: Intersectional Fairness through Regularisation (arxiv:stat). Regularisation via distance covariance for intersectional fairness across multiple protected attributes in regression and classification

An Information-Theoretic Framework for Credit Risk Modeling: Unifying Industry Practice with Statistical Theory for Fair and Interpretable Scorecards (arxiv:stat). Unified information-theoretic framework for credit risk: IV/PSI as divergences, WoE transitions, and fair, interpretable scorecards with binning and MIP Pareto optimization

"A 6 or a 9?": Ensemble Learning Through the Multiplicity of Performant Models and Explanations (arxiv:cs). Ensemble learning using Rashomon diversity: select high-performing models with explanations to boost generalization and robustness

Comparative Analysis of Global and Local Probabilistic Time Series Forecasting for Contiguous Spatial Demand Regions (arxiv:stat). Global LightGBM with station identifiers outperforms cluster- and station-level models for probabilistic demand forecasting across homogeneous to heterogeneous Divvy data

👋 Before you go

I've got a big favor to ask - keeping Blaze running isn't expensive, but it does all add up, so I'm asking readers like you to help, if you can.
That's why I'm launching a Patreon page! Nothing flashy, just a way for folks who find value in these newsletters to chip in a little each month. In return, you'll get:

Real say in how Blaze evolves — vote on new topics, features, topic curation ideas
First dibs on merch (details still cooking)
That warm fuzzy feeling knowing you're supporting something that saves you time and keeps you plugged into great tech writing

If you are getting value from Blaze, checking this out would mean the world. And if you can't contribute, no worries: the newsletters keep coming either way, and you can follow along on Patreon for free.
Thanks for reading and being part of this nerdy corner of the internet. All the best - Alastair.

About Machine Learning Engineer

Our Machine Learning Engineer newsletter covers the latest developments, research papers, tools, and techniques in ML engineering and deployment. Each week, we curate the most important content so you don't have to spend hours searching.

Whether you're a beginner or expert in machine learning engineering, our newsletter provides valuable information to keep you informed and ahead of the curve in this technically challenging field.
