🤖

Machine Learning Engineer: 7th October 2025

Published 7th October 2025

🔧 Company Engineering Blogs

Revolutionizing Data Cloud: Unleashing the Power of the New ML Recommendations System (engineering​.salesforce​.com). Data Cloud-native ML recommendations system; flexible abstract schemas; multi-cluster architecture; CI/CD NDCG evaluation; Cursor AI-assisted development

SOTA OCR on-device with Core ML and dots.ocr (huggingface​.co). On-device OCR with Core ML and dots.ocr: converting a 3B parameter model via CoreML/MLX, debugging, and benchmarking on Apple Neural Engine

Compute-Optimal Quantization-Aware Training (machinelearning​.apple​.com). Compute-Optimal Quantization-Aware Training improves QAT efficiency by modeling FP and QAT compute trade-offs and deriving a scaling law

AI as a research partner: Advancing theoretical computer science with AlphaEvolve (research​.google). AI-assisted theory via AlphaEvolve evolves finite gadgets to improve MAX-4-CUT inapproximability and Ramanujan graphs for average-case hardness with rigorous verification

📈 Applied modeling and anomalies

Real-time pricing with a pretrained probabilistic stock return model (thierrymoudiki​.github​.io). Real-time pricing with a pretrained probabilistic stock return model using Python FastAPI and R Plumber

NGBoost (Natural Gradient Boosting) for Regression, Classification, Time Series forecasting and Reserving (thierrymoudiki​.github​.io). NGBoost-based regression, classification, time series forecasting, and reserving using cybooster with multiple base learners and sklearn integrations

Geological Modeling based on Machine Learning with Python and hatariTools - Tutorial (hatarilabs.com). Geological unit modeling with Python, hatariTools, 3D visualization, and ML classification on Queen Mary Reservoir data

Advancing Anomaly Detection for Industry Applications with NVIDIA NV-Tesseract-AD (developer​.nvidia​.com). NV-Tesseract-AD uses diffusion modeling, curriculum learning, and adaptive thresholds to enhance multivariate time-series anomaly detection

Smarter Anomaly Detection in Semiconductor Manufacturing with NVIDIA NV-Tesseract and NVIDIA NIM (developer​.nvidia​.com). NV-Tesseract anomaly detection for semiconductor fabs; multivariate time-series, anomaly localization, and NIM deployment

Order from disordered proteins: Physics-based algorithm designs biomolecules with custom properties (phys​.org). Physics-based gradient optimization designs intrinsically disordered proteins with tailored properties using molecular dynamics and automatic differentiation

🏗️ MLOps and data platforms

Restoring Reliability in the AI-Aided Software Development Life Cycle (cacm​.acm​.org). AI-generated code boosts velocity; SRE-led risk models, testing, and observability drive reliability and resilience

Fine-Tuning Local Models with Docker Offload and Unsloth (docker​.com). Fine-tuning local models with Docker Offload and Unsloth to create LoRA-based GGUF artifacts for PII masking

Revolutionizing Car Measurement Data Storage and Analysis: Mercedes-Benz's Petabyte-Scale Solution on the Databricks Intelligence Platform (databricks​.com). Mercedes-Benz and Databricks benchmark petabyte-scale automotive time series using RLE, Liquid Clustering, and hierarchical metadata for optimized storage and analytics

Data & AI Infrastructure Are Fusing (tomtunguz​.com). Unified AI-ready data stacks with vector databases, context layers, and real-time observability from Netflix and Stripe

Modernize fraud prevention: GraphStorm v0.5 for real-time inference (aws​.amazon​.com). GraphStorm v0.5 enables sub-second real-time inference and streamlined SageMaker deployment for enterprise-scale GNN fraud prevention

How Hapag-Lloyd improved schedule reliability with ML-powered vessel schedule predictions using Amazon SageMaker (aws​.amazon​.com). ML-powered vessel ETA predictions with hierarchical XGBoost models on SageMaker, orchestrated by SageMaker Pipelines and Step Functions at Hapag-Lloyd

🚀 ML systems and acceleration

Machine Learning in Trading : the CPU-GPU latency problem (quantblog​.wordpress​.com). Latency-driven ML trading: GPU inference, CPU simulation, and CPU-GPU colocated architectures like AMD Strix Halo for low-latency decision making

DiLoCo: Data Parallelism for the Datacenter Poor (hackbot​.dad). Data parallelism basics, gradient accumulation, and DiLoCo for training large LLMs across heterogeneous, non-densely connected compute

Optimizing Drug Discovery Tools on AMD MI300s Part 2: 3D Molecular Generation with SemlaFlow (rocm.blogs.amd.com). SemlaFlow on AMD MI300X: speedups of two orders of magnitude in 3D molecular generation, plus training optimizations with ROCm/PyTorch

CAP4D: 4D Avatars with Morphable Multi-View Diffusion Models (opencv​.org). CAP4D merges Morphable Multi-View Diffusion Models with 3D Gaussian Splatting to render 4D avatars from few inputs in real time

Elevating 3D Scene Rendering with GSplat (rocm​.blogs​.amd​.com). GPU-accelerated GSplat port for AMD ROCm; train and render 3D Gaussian splatting scenes on MI300X with multi-GPU support

🧠 LLM attention and evaluation

A practical blueprint for evaluating conversational AI at scale (dropbox​.tech). Structured evaluation blueprint for conversational AI at scale: datasets, LLM judges, Braintrust, gated QA pipelines, and production-grade metrics

Attention in LLMs and Extrapolation (data-processing​.club). Attention heads in LLMs: syntactic, streaming, retrieval, induction, function vectors, and iteration heads underpin in-context learning and extrapolation

Why do LLMs freak out over the seahorse emoji? (vgel​.me). Investigation of seahorse emoji belief in LLMs using logit lens, lm_head mechanics, and cross-model behaviors

Iterating some sample data (kieranhealy​.org). Iterates sample data to illustrate LLM evaluation via confusion matrices, R code, and tibble-based data frames

Evidence that Recent AI Gains are Mostly from Inference-Scaling (tobyord.com). Inference-scaling dominates gains over RL post-training on the MATH Level 5, GPQA Diamond, and OTIS Mock AIME benchmarks, per Sonnet 3.7 vs Sonnet 3.6 data

About DeepSeek Sparse Attention (sibellavia.lol). DeepSeek Sparse Attention (DSA): a lightning indexer and top-k token selection give each query its own adaptive context

📚 Indie math and NLP

CAUSAL CONCEPT-BASED EXPLANATIONS (medium​.com/feedzaitech). Post-hoc, concept-based explanations using DiConStruct: a causal, SCM-enabled explainer for CBMs with counterfactual reasoning

APLearn: The Winning APL Forge 2025 Project (dyalog​.com). Borna Ahmadzadeh presents APLearn, an open-source ML toolkit in APL inspired by scikit-learn, detailing design, features, and future improvements

Math Academy, update 3: I completed Linear Algebra (frankhecker​.com). Progress update: completes Linear Algebra course, learns eigenvectors; reflects on Math Academy system and future courses

Latent Semantic Scale based on Word2vec (blog​.koheiw​.net). Latent Semantic Scaling with Word2vec: probabilistic LSS using seed words and quanteda tokens

Linkage with feijoas (11011110​.github​.io). Explores feijoas, Wikipedia entries, CSS features, geometric polyhedra, mesher concepts, and mean curvature flow in a playful blog post

Let the LLM Write the Prompts: An Intro to DSPy in Compound AI Pipelines (simonwillison.net). DSPy with MIPROv2 optimizes prompts for smaller models like Qwen3-0.6B to improve GIS conflation, and makes switching models easy

📐 Learning theory and scaling

Hyperparameter Optimization in Machine Learning (nowpublishers​.com). Hyperparameter optimization techniques including random, bandit-, model-, population-, and gradient-based methods for ML, with online, constrained, and multi-objective extensions

Learning to act in generative settings (danmackinlay​.name). Survey of optimizing agents vs. replicating persisters; with curiosity, empowerment, and POET as open-ended generators

the Harvard and Brown school of computer science (xianblog​.wordpress​.com). Harvard-Brown school vs LeCun's neural networks; Bayesian inference and Markov random fields in pattern learning

connectionist networks (aarnphm​.xyz). Explores connectionist networks, representations, backpropagation, universal approximation, inductive bias, tensor product representations, SMT/NTK, attention, and emergent cognition

lecture five (aarnphm.xyz). Lecture five covers scaling laws, power-law relations, and muP (maximal update parametrization) for Transformers with practical takeaways

📚 Academic Research

FairContrast: Enhancing Fairness through Contrastive learning and Customized Augmenting Methods on Tabular Data (arxiv:cs). FairContrast: contrastive learning with customized augmentations to mitigate bias in tabular data while preserving accuracy

CardioForest: An Explainable Ensemble Learning Model for Automatic Wide QRS Complex Tachycardia Diagnosis from ECG (arxiv:cs). CardioForest: an optimized Random Forest ensemble with XGBoost/LightGBM for explainable WCT detection from MIMIC-IV ECG using SHAP explanations

C2AL: Cohort-Contrastive Auxiliary Learning for Large-scale Recommendation Systems (arxiv:cs). Cohort-Contrastive Auxiliary Learning (C2AL) enhances attention in factorization machines to preserve minority cohorts in large-scale recommendations

fev-bench: A Realistic Benchmark for Time Series Forecasting (arxiv:cs). fev-bench: a 100-task time series forecasting benchmark with covariates and bootstrapped evaluation via fev library

Private and Fair Machine Learning: Revisiting the Disparate Impact of Differentially Private SGD (arxiv:cs). Analyzes how DPSGD affects fairness across metrics and hyperparameter choices, examining DP leakage, utility-fairness trade-offs, and DPSGD-Global-Adapt

👋 Before you go

I've got a big favor to ask - keeping Blaze running isn't expensive, but it does all add up, so I'm asking readers like you to help, if you can.
That's why I'm launching a Patreon page! Nothing flashy, just a way for folks who find value in these newsletters to chip in a little each month. In return, you'll get:

  • Real say in how Blaze evolves — vote on new topics, features, topic curation ideas
  • First dibs on merch (details still cooking)
  • That warm fuzzy feeling knowing you're supporting something that saves you time and keeps you plugged into great tech writing

If you are getting value from Blaze, checking this out would mean the world. And if you can't contribute, no worries: the newsletters keep coming either way, and you can follow along on Patreon for free.
Thanks for reading and being part of this nerdy corner of the internet. All the best - Alastair.

About Machine Learning Engineer

Our Machine Learning Engineer newsletter covers the latest developments, research papers, tools, and techniques in ML engineering and deployment. Each week, we curate the most important content so you don't have to spend hours searching.

Whether you're a beginner or expert in machine learning engineering, our newsletter provides valuable information to keep you informed and ahead of the curve in this technically challenging field.