🤖 Machine Learning Engineer: 2nd September 2025

Published 2nd September 2025

🔧 Company Engineering Blogs

What the interns have wrought, 2025 edition (blog.janestreet.com). Intern projects include faster JSQL evaluation, better Torch bindings via OxCaml, Memtrace memory leaks, and ref-counted shared memory with OxCaml modes

Moving ahead faster with fallbacks (booking.ai). Fallbacks in ranking service enable fast experimentation, reliability, and ML-induced innovation without outages

Engineering stories behind the Medium Daily Digest Algorithm: Part 1 (medium.engineering). How accounting for Apple Mail Privacy Protection, adjusting filtering rules, and A/B testing boosted digest quality and engagement

Simplifying Large-Scale LLM Processing across Instacart with Maple (tech.instacart.com). Maple: Instacart’s batch LLM processing service for scalable, cost-efficient, auditable prompts across catalogs, fulfillment, and search

Revolutionizing warehouse automation with scientific simulation (amazon.science). Sensor Workbench (SWB) on NVIDIA Isaac Sim enables parallel GPU-based sensor simulations, CAD-to-OpenUSD pipeline, and OpenUSD ground-truth for barcode detection in warehouses

🔬 Applied ML Research & Domain Applications

Building a YOLOX Plate Detector – Setup, Fine-Tuning, Metrics, Dashcam inference (poeticoding.com). YOLOX plate detector setup, fine-tuning, COCO annotations, ONNX export, dashcam inference, and evaluation

Two Papers Accepted at APSIPA 2025 in Singapore (bagustris.blogspot.com). Two papers on dementia prediction from optimized prosodic features and longitudinal cough TB detection for APSIPA 2025 Singapore

SURP Student Spotlight: Nava Wolfish (dunlap.utoronto.ca). SURP spotlight on Nava Wolfish: ML-based stellar abundances from JWST NIRSpec and APOGEE data with contrastive learning

Team Brings Lung Cancer Into Focus with 3D Imaging Innovation (cmu.edu). NIH-funded CMU collaboration uses Magnify expansion microscopy and omni-mesoscopes to map 3D tumor microenvironments at nanoscale for lung cancer research

Empowering air quality research with secure, ML-driven predictive analytics (aws.amazon.com). Low-code SageMaker Canvas-based PM2.5 imputation using AWS AI services, Lambda, Step Functions, and RDS in Africa’s sensor networks

🎯 Classification, NLP & Feature Engineering

Interestingness First Classifiers (data-processing.club). EUREKA: selecting interesting features with pairwise LLM comparisons to build rule-based classifiers on tabular data

Engineering a Scalable Topic Pipeline: A BERTopic and GenAI Case Study (medium.com/gumgum-tech). Hybrid BERTopic + Agglomerative Clustering pipeline with cuML GPU acceleration, post-processing, and GenAI for topic merging and naming

Symmetry in subword segmentation (languagelog.ldc.upenn.edu). Symmetry in subword segmentation across 32 languages; comparison of manual, probabilistic, and BPE-MR methods for minimizing text redundancy

Algebraic approach reveals how to restore complex altered gene networks (phys.org). KAIST uses Boolean networks, semi-tensor product, and Taylor approximation to identify gene control targets restoring altered stimulus–response patterns

📈 Statistical Learning & Forecasting Methods

external regressors in ahead::dynrmf’s interface for Machine learning forecasting (thierrymoudiki.github.io). External regressors in ahead::dynrmf interface demonstrated with USAccDeaths, AirPassengers, fpp2 a10, fdeaths; xreg creation; runs with ridge and glmnet cv.glmnet

Historical notes on semi-parametric theory and estimation (herbsusmann.com). Historical notes on von Mises functionals, sample splitting, one-step estimation, and key references in semiparametric theory

PyData Berlin 2025: Introduction to Stochastic Variational Inference with NumPyro (juanitorduz.github.io). Intro to Stochastic Variational Inference with NumPyro: SVI concepts, Gamma toy, AutoNormal, Predictive, BNNs, Flax NNX integration, SVI with AutoGuides

A One-Slide Summary of Tree-Based Classification and Regression (jamesmccaffrey.wpcomstaging.com). One-slide refresher on tree-based classification and regression: bagging, bootstrap aggregation, boosting, random forests, weak learners, and memory aids

Marginal Effect of Hyperparameter Tuning with XGBoost (towardsdatascience.com). Bayesian hyperparameter optimization with hyperopt (TPE), SMBO, EI, and broader vs narrower XGBoost search spaces

⚡ ML Systems & GPU Optimization

Draft - Efficient RL Training - Optimizing Weight Sync in slime (hebiao064.github.io). Weight synchronization in slime: CUDA IPC, asynchronous tensor gathering, tensor bucketing, SGLang server calls, and 120s→7s optimizations for RL training with Megatron, PPO/GRPO

The Parallelism Mesh Zoo (blog.ezyang.com). Overview of device mesh concepts and parallelism strategies: DP, FSDP, HSDP, TP, SP, Ulysses, CP, PP, EP across multi-dimensional device meshes

Why are CUDA kernels hard to optimize? (johndcook.com). Investigates GPU kernel optimization challenges, memory hierarchies, tiling, block size, prefetching, caching, and autotuning across eight GPUs with PTX/SASS exploration

Deploying DeepSeek on 96 H100 GPUs (lmsys.org). Deploying DeepSeek with PD Disaggregation and Large-Scale Expert Parallelism on 96 H100 GPUs

📊 Embeddings, Quantization & Information Theory

GeoTessera Python library released for geospatial embeddings (anil.recoil.org). GeoTessera Python library for accessing 128-band, 10 m resolution geospatial embeddings from Sentinel data and GIS workflows

quantisation basics (aarnphm.xyz). Quantization, uniform/non-uniform, MSQE; KV cache pruning; KV quantization (KVQuant, SKVQ, KIVI, AdaKV, PyramidKV); multi-head attention, per-token KV; RoPE conflicts; DeepSeek KV compression; two-batch overlap (TBO); RDMA/NIXL; KV-aware routing; prefill/decode timing

How big are our embeddings now and why? (veekaybee.github.io). Embedding sizes grow from 768 to 4096+, OpenAI’s 1536 norm, HuggingFace standardization, MTEB benchmarks, matryoshka representations, vector database commoditization

The Theoretical Limitations of Embedding-Based Retrieval (arxiv.org). Theoretical limitations of embedding-based retrieval, kernelized similarity, retrieval error bounds, and implications for practical IR systems

🧠 Deep Learning Theory & Mathematical Foundations

Cracking the Black Box: Six Lenses for Understanding Deep Learning (kalhansblog.blogspot.com). Six lenses—NTK, Information Bottleneck, Mean-Field theory, Loss landscape, Geometric Deep Learning, PAC-Bayes—explain deep learning generalisation

Information bottleneck method (aarnphm.xyz). Information bottleneck principle for representing X with compressed T to maximize I(T;Y) while minimizing I(X;T), via Lagrangian β, information plane, mutual information, conditional entropy, and TBP visualization

Notes: A Brief History of Intelligence (Bennett) (scyy.fi). Notes: A Brief History of Intelligence (Bennett) traces steering, affect, TD learning, pattern recognition, neocortex function, vicarious trial and error, episodic memory, theory of mind, language, and morality across nematodes to humans

The Price of Unearned Knowledge: Jung’s Warning and the Crisis of Modern Machine Learning (medium.com/intuitionmachine). Jungian warning on unearned knowledge applied to ML: FER, UFR, compression-decompression, temporal ordering, curriculums, meta-learning, and evolvability

📐 Mathematical Structures & Computational Geometry

Million Point Sculptures: an exploration tool written in Metal (hunsley.io). Ynfold: a Metal-based macOS/iPadOS explorer for Million Point Sculptures, real-time MPS rendering, hashing, RNG options, and focus-based geometric invariants

Linkage (11011110.github.io). 3d and layered QR codes, developable surfaces from flat strips, AI slop in knowledge, matroid parity, cubical spheres, topological book embeddings, LATIN call, Wikipedia search critique

Multilinear polynomials: survival kit (blog.lambdaclass.com). Multilinear polynomials, hypercube interpolation, Lagrange basis, coordinates via evaluations, tensor product structure, and variable-dependence tests for products p_k(X)

Five-arc fractal (11011110.github.io). Five-arc fractal replaces arcs with five congruent sub-arcs; a C1 smooth curve containing no convex arcs, connected to a preprint on stabbing faces by a convex curve

Deliberate play (koaning.io). Deliberate play concept, interactive Matrix widget (wigglystuff), PCA demo, reactive notebooks, curiosity-driven exploration in linear algebra

The biggest math symbol (johndcook.com). Riemann P-symbol (Papperitz) for solutions to Riemann’s differential equation with three regular singular points a, b, c and Möbius-transformation behavior

📚 Academic Research

Interestingness First Classifiers (arxiv:stat). Introduces EUREKA, which selects interesting features through pairwise LLM comparisons and uses them to build rule-based classifiers on tabular data

FORGE: Foundational Optimization Representations from Graph Embeddings (arxiv:cs). Introduces a pre-trained graph autoencoder for mixed-integer programming instances, enabling transfer learning across optimization problems. Significant for ML engineers working on optimization problems and combinatorial challenges in production systems

Fast and Scalable Mixed Precision Euclidean Distance Calculations Using GPU Tensor Cores (arxiv:cs). Achieves 2.5-51× speedup in Euclidean distance calculations using GPU tensor cores with mixed precision arithmetic. Critical for ML engineers optimizing similarity searches, clustering, and nearest neighbor algorithms at scale

A Mixture of Experts Gating Network for Enhanced Surrogate Modeling in External Aerodynamics (arxiv:cs). NVIDIA researchers combine three specialized neural architectures using mixture-of-experts with entropy regularization for CFD surrogate modeling. Demonstrates practical MoE implementation patterns valuable for complex multi-domain prediction tasks

MERIT: Maximum-normalized Element-wise Ratio for Language Model Large-batch Training (arxiv:cs). Proposes new optimizer using max-norm trust ratios and element-wise scaling for stable large-batch training of neural networks. Addresses fundamental optimization challenges in distributed training with rigorous mathematical foundations

Efficient Best-of-Both-Worlds Algorithms for Contextual Combinatorial Semi-Bandits (arxiv:stat). Develops first algorithm guaranteeing optimal regret bounds in both adversarial and stochastic settings with efficient KKT-based projections. Advances mathematical foundations of online learning algorithms with practical computational improvements

👋 Before you go

I've got a big favor to ask - keeping Blaze running isn't expensive, but it does all add up, so I'm asking readers like you to help, if you can.
That's why I'm launching a Patreon page! Nothing flashy, just a way for folks who find value in these newsletters to chip in a little each month. In return, you'll get:

Real say in how Blaze evolves — vote on new topics, features, topic curation ideas
First dibs on merch (details still cooking)
That warm fuzzy feeling knowing you're supporting something that saves you time and keeps you plugged into great tech writing

If you are getting value from Blaze, checking this out would mean the world. And if you can't contribute, no worries: the newsletters keep coming either way, and you can follow along on Patreon for free.
Thanks for reading and being part of this nerdy corner of the internet. All the best - Alastair.

About Machine Learning Engineer

Our Machine Learning Engineer newsletter covers the latest developments, research papers, tools, and techniques in ML engineering and deployment. Each week, we curate the most important content so you don't have to spend hours searching.

Whether you're a beginner or expert in machine learning engineering, our newsletter provides valuable information to keep you informed and ahead of the curve in this technically challenging field.

Subscribe now to join thousands of professionals who receive our weekly updates!