Machine Learning Engineer: 7th October 2025
🔧 Company Engineering Blogs
Revolutionizing Data Cloud: Unleashing the Power of the New ML Recommendations System (engineering.salesforce.com). Data Cloud-native ML recommendations system; flexible abstract schemas; multi-cluster architecture; CI/CD NDCG evaluation; Cursor AI-assisted development
SOTA OCR on-device with Core ML and dots.ocr (huggingface.co). On-device OCR with Core ML and dots.ocr: converting a 3B parameter model via CoreML/MLX, debugging, and benchmarking on Apple Neural Engine
Compute-Optimal Quantization-Aware Training (machinelearning.apple.com). Improves QAT efficiency by modeling the trade-off between full-precision and QAT compute and deriving a scaling law; a generic fake-quantization sketch follows this section
AI as a research partner: Advancing theoretical computer science with AlphaEvolve (research.google). AI-assisted theory via AlphaEvolve evolves finite gadgets to improve MAX-4-CUT inapproximability and Ramanujan graphs for average-case hardness with rigorous verification
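Side note on the QAT item above: as a refresher on the mechanics that kind of work builds on, here is a minimal fake-quantization sketch with a straight-through estimator in PyTorch. It is a generic QAT building block under my own assumptions (4-bit symmetric per-tensor quantization), not Apple's compute-optimal method or its scaling law.

```python
import torch

def fake_quantize(w: torch.Tensor, bits: int = 4) -> torch.Tensor:
    """Symmetric per-tensor fake quantization with a straight-through estimator.
    Forward: snap weights to a b-bit grid. Backward: pass gradients through
    unchanged (round() has zero gradient almost everywhere)."""
    qmax = 2 ** (bits - 1) - 1
    scale = w.detach().abs().max().clamp(min=1e-8) / qmax
    w_q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax) * scale
    # Straight-through estimator: w_q in the forward pass, identity gradient in backward.
    return w + (w_q - w).detach()

# Toy QAT step: a linear layer whose weights are fake-quantized on the fly.
lin = torch.nn.Linear(16, 4)
opt = torch.optim.SGD(lin.parameters(), lr=1e-2)
x, y = torch.randn(8, 16), torch.randn(8, 4)

out = torch.nn.functional.linear(x, fake_quantize(lin.weight, bits=4), lin.bias)
loss = torch.nn.functional.mse_loss(out, y)
loss.backward()   # gradients flow to the full-precision "shadow" weights
opt.step()
```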
📈 Applied modeling and anomalies
Real-time pricing with a pretrained probabilistic stock return model (thierrymoudiki.github.io). Real-time pricing with a pretrained probabilistic stock return model using Python FastAPI and R Plumber
NGBoost (Natural Gradient Boosting) for Regression, Classification, Time Series forecasting and Reserving (thierrymoudiki.github.io). NGBoost-based regression, classification, time series forecasting, and reserving using cybooster with multiple base learners and sklearn integrations; a minimal probabilistic-regression sketch follows this section
Geological Modeling based on Machine Learning with Python and hatariTools - Tutorial (hatarilabs.com). Geological unit modeling with Python, hatariTools, 3D visualization, and ML classification on Queens Mary Reservoir data
Advancing Anomaly Detection for Industry Applications with NVIDIA NV-Tesseract-AD (developer.nvidia.com). NV-Tesseract-AD uses diffusion modeling, curriculum learning, and adaptive thresholds to enhance multivariate time-series anomaly detection
Smarter Anomaly Detection in Semiconductor Manufacturing with NVIDIA NV-Tesseract and NVIDIA NIM (developer.nvidia.com). NV-Tesseract anomaly detection for semiconductor fabs; multivariate time-series, anomaly localization, and NIM deployment
Order from disordered proteins: Physics-based algorithm designs biomolecules with custom properties (phys.org). Physics-based gradient optimization designs intrinsically disordered proteins with tailored properties using molecular dynamics and automatic differentiation
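As promised above, a minimal NGBoost sketch. The post itself uses cybooster with several base learners; the standard ngboost package below is a stand-in I chose to illustrate the probabilistic-regression idea: point predictions plus a full predictive distribution per row.

```python
# Minimal probabilistic regression with the `ngboost` package, used here as a
# generic stand-in for the cybooster-based workflow described in the post.
from ngboost import NGBRegressor
from ngboost.distns import Normal
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Each boosting stage fits the natural gradient of the Normal negative log-likelihood.
ngb = NGBRegressor(Dist=Normal, n_estimators=300, learning_rate=0.03, verbose=False)
ngb.fit(X_tr, y_tr)

point = ngb.predict(X_te)      # mean predictions
dist = ngb.pred_dist(X_te)     # full predictive distributions
mu, sigma = dist.params["loc"], dist.params["scale"]
print(point[:3], mu[:3], sigma[:3])
```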
🏗️ MLOps and data platforms
Restoring Reliability in the AI-Aided Software Development Life Cycle (cacm.acm.org). AI-generated code boosts velocity; SRE-led risk models, testing, and observability drive reliability and resilience
Fine-Tuning Local Models with Docker Offload and Unsloth (docker.com). Fine-tuning local models with Docker Offload and Unsloth to create LoRA-based GGUF artifacts for PII masking
Revolutionizing Car Measurement Data Storage and Analysis: Mercedes-Benz's Petabyte-Scale Solution on the Databricks Intelligence Platform (databricks.com). Mercedes-Benz and Databricks benchmark petabyte-scale automotive time series using RLE, Liquid Clustering, and hierarchical metadata for optimized storage and analytics
Data & AI Infrastructure Are Fusing (tomtunguz.com). Unified AI-ready data stacks with vector databases, context layers, and real-time observability from Netflix and Stripe
Modernize fraud prevention: GraphStorm v0.5 for real-time inference (aws.amazon.com). GraphStorm v0.5 enables sub-second real-time inference and streamlined SageMaker deployment for enterprise-scale GNN fraud prevention
How Hapag-Lloyd improved schedule reliability with ML-powered vessel schedule predictions using Amazon SageMaker (aws.amazon.com). ML-powered vessel ETA predictions with hierarchical XGBoost models on SageMaker, orchestrated by SageMaker Pipelines and Step Functions at Hapag-Lloyd
🚀 ML systems and acceleration
Machine Learning in Trading: the CPU-GPU latency problem (quantblog.wordpress.com). Latency-driven ML trading: GPU inference, CPU simulation, and CPU-GPU colocated architectures like AMD Strix Halo for low-latency decision making
DiLoCo: Data Parallelism for the Datacenter Poor (hackbot.dad). Data parallelism basics, gradient accumulation, and DiLoCo for training large LLMs across heterogeneous, non-densely connected compute; a gradient-accumulation sketch follows this section
SOTA OCR on-device with Core ML and dots.ocr (huggingface.co). On-device OCR with Core ML and dots.ocr: converting a 3B parameter model via CoreML/MLX, debugging, and benchmarking on Apple Neural Engine
Optimizing Drug Discovery Tools on AMD MI300s Part 2: 3D Molecular Generation with SemlaFlow (rocm.blogs.amd.com). SemlaFlow on AMD MI300X enables two-order speedups in 3D molecular generation and training optimizations with ROCm/PyTorch
CAP4D: 4D Avatars with Morphable Multi-View Diffusion Models (opencv.org). CAP4D merges Morphable Multi-View Diffusion Models with 3D Gaussian Splatting to render 4D avatars from few inputs in real time
Elevating 3D Scene Rendering with GSplat (rocm.blogs.amd.com). GPU-accelerated GSplat port for AMD ROCm; train and render 3D Gaussian splatting scenes on MI300X with multi-GPU support
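As flagged in the DiLoCo item, here is the single-node building block it starts from: gradient accumulation in PyTorch. This is a generic sketch of simulating a larger batch on limited memory, not DiLoCo's inner/outer-optimizer scheme.

```python
import torch

model = torch.nn.Linear(32, 2)
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)
loss_fn = torch.nn.CrossEntropyLoss()
accum_steps = 8  # simulate an 8x larger batch without 8x the memory

opt.zero_grad()
for step in range(64):
    x = torch.randn(16, 32)
    y = torch.randint(0, 2, (16,))
    loss = loss_fn(model(x), y) / accum_steps  # scale so summed grads average correctly
    loss.backward()                            # gradients accumulate in .grad
    if (step + 1) % accum_steps == 0:
        opt.step()                             # one optimizer update per effective batch
        opt.zero_grad()
```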
🧠 LLM attention and evaluation
A practical blueprint for evaluating conversational AI at scale (dropbox.tech). Structured evaluation blueprint for conversational AI at scale: datasets, LLM judges, Braintrust, gated QA pipelines, and production-grade metrics
Attention in LLMs and Extrapolation (data-processing.club). Attention heads in LLMs: syntactic, streaming, retrieval, induction, function vectors, and iteration heads underpin in-context learning and extrapolation
Why do LLMs freak out over the seahorse emoji? (vgel.me). Investigation of seahorse emoji belief in LLMs using logit lens, lm_head mechanics, and cross-model behaviors
Iterating some sample data (kieranhealy.org). Iterates sample data to illustrate LLM evaluation via confusion matrices, R code, and tibble-based data frames
Evidence that Recent AI Gains are Mostly from Inference-Scaling (tobyord.com). Inference-scaling dominates gains over RL post-training in MATH 5, GPQA Diamond, and OTIS AIME benchmarks, per Sonnet 3.7 vs Sonnet 3.6 data
About DeepSeek Sparse Attention (sibellavia.lol). Dynamic sparse attention (DSA) with a lightning indexer and top-k token selection for query-specific adaptive context
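On the DeepSeek Sparse Attention item just above: a toy sketch of the top-k selection idea, where a cheap indexer scores keys and each query attends only to its k best matches. Everything here (shapes, the dot-product "indexer") is my own simplification for illustration, not DeepSeek's implementation.

```python
import torch
import torch.nn.functional as F

def topk_sparse_attention(q, k, v, indexer_scores, top_k=64):
    """q, k, v: (batch, seq, dim); indexer_scores: (batch, seq_q, seq_k) cheap relevance scores.
    Each query attends only to its top_k highest-scoring keys."""
    B, Tq, D = q.shape
    top_k = min(top_k, k.shape[1])
    idx = indexer_scores.topk(top_k, dim=-1).indices                  # (B, Tq, top_k)
    k_sel = torch.gather(k.unsqueeze(1).expand(B, Tq, -1, D), 2,
                         idx.unsqueeze(-1).expand(B, Tq, top_k, D))   # selected keys
    v_sel = torch.gather(v.unsqueeze(1).expand(B, Tq, -1, D), 2,
                         idx.unsqueeze(-1).expand(B, Tq, top_k, D))   # selected values
    attn = F.softmax((q.unsqueeze(2) * k_sel).sum(-1) / D ** 0.5, dim=-1)  # (B, Tq, top_k)
    return (attn.unsqueeze(-1) * v_sel).sum(2)                              # (B, Tq, D)

# Toy usage: here the "indexer" is just raw dot products; a real indexer is a much cheaper head.
q, k, v = (torch.randn(2, 128, 64) for _ in range(3))
scores = q @ k.transpose(-1, -2)
out = topk_sparse_attention(q, k, v, scores, top_k=16)
print(out.shape)  # torch.Size([2, 128, 64])
```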
📚 Indie math and NLP
CAUSAL CONCEPT-BASED EXPLANATIONS (medium.com/feedzaitech). Post-hoc, concept-based explanations using DiConStruct: a causal, SCM-enabled explainer for CBMs with counterfactual reasoning
APLearn: The Winning APL Forge 2025 Project (dyalog.com). Borna Ahmadzadeh presents APLearn, an open-source ML toolkit in APL inspired by scikit-learn, detailing design, features, and future improvements
Math Academy, update 3: I completed Linear Algebra (frankhecker.com). Progress update: completes Linear Algebra course, learns eigenvectors; reflects on Math Academy system and future courses
Latent Semantic Scale based on Word2vec (blog.koheiw.net). Latent Semantic Scaling with Word2vec: probabilistic LSS using seed words and quanteda tokens
Linkage with feijoas (11011110.github.io). Explores feijoas, Wikipedia entries, CSS features, geometric polyhedra, mesher concepts, and mean curvature flow in a playful blog post
Let the LLM Write the Prompts: An Intro to DSPy in Compound AI Pipelines (simonwillison.net). DSPy optimizes prompts for smaller models like Qwen3-0.6B to improve conflation in GIS with MIPROv2, enabling easy model switching
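On the DSPy item just above: a minimal sketch of a typed signature plus an (optional) MIPROv2 compile step. The model name, field names, and metric are placeholders, and the exact API can differ between DSPy versions, so treat this as the shape of the workflow rather than copy-paste code.

```python
import dspy

# Point DSPy at a small local model; the model identifier here is illustrative.
dspy.configure(lm=dspy.LM("ollama_chat/qwen3:0.6b"))

class MatchAddresses(dspy.Signature):
    """Decide whether two address strings refer to the same place."""
    address_a: str = dspy.InputField()
    address_b: str = dspy.InputField()
    same_place: bool = dspy.OutputField()

matcher = dspy.Predict(MatchAddresses)
print(matcher(address_a="12 Main St, Springfield", address_b="12 Main Street, Springfield"))

# Prompt optimization: MIPROv2 searches over instructions/demos against a metric.
# `trainset` would be a list of labeled dspy.Example pairs (omitted here).
# optimizer = dspy.MIPROv2(
#     metric=lambda gold, pred, trace=None: gold.same_place == pred.same_place,
#     auto="light",
# )
# matcher = optimizer.compile(matcher, trainset=trainset)
```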
📐 Learning theory and scaling
Hyperparameter Optimization in Machine Learning (nowpublishers.com). Hyperparameter optimization techniques including random, bandit-, model-, population-, and gradient-based methods for ML, with online, constrained, and multi-objective extensions
Learning to act in generative settings (danmackinlay.name). Survey of optimizing agents vs. replicating persisters, with curiosity, empowerment, and POET as open-ended generators
AI as a research partner: Advancing theoretical computer science with AlphaEvolve (research.google). AI-assisted theory via AlphaEvolve evolves finite gadgets to improve MAX-4-CUT inapproximability and Ramanujan graphs for average-case hardness with rigorous verification
the Harvard and Brown school of computer science (xianblog.wordpress.com). Harvard-Brown school vs LeCun's neural networks; Bayesian inference and Markov random fields in pattern learning
connectionist networks (aarnphm.xyz). Explores connectionist networks, representations, backpropagation, universal approximation, inductive bias, tensor product representations, SMT/NTK, attention, and emergent cognition
lecture five (aarnphm.xyz). Lecture five covers scaling laws, power-law relations, and MuP parameterization for Transformers with practical takeaways
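On the scaling-laws lecture just above: a small illustration of fitting the usual saturating power law, loss = a·N^(−α) + c, to synthetic loss-vs-parameter-count data with SciPy. The numbers are made up; this only shows the curve-fitting mechanics, not the lecture's material.

```python
import numpy as np
from scipy.optimize import curve_fit

def power_law(n, a, alpha, c):
    """Saturating scaling law: loss = a * N**(-alpha) + c (c = irreducible loss)."""
    return a * n ** (-alpha) + c

# Synthetic "loss vs. parameter count" data with the shape scaling plots usually show.
n_params = np.logspace(6, 10, 20)                       # 1M .. 10B parameters
rng = np.random.default_rng(0)
loss = power_law(n_params, a=400.0, alpha=0.32, c=1.7) * rng.normal(1.0, 0.01, 20)

(a, alpha, c), _ = curve_fit(power_law, n_params, loss, p0=(100.0, 0.3, 1.0), maxfev=10_000)
print(f"fitted exponent alpha={alpha:.3f}, irreducible loss c={c:.2f}")
```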
📚 Academic Research
FairContrast: Enhancing Fairness through Contrastive learning and Customized Augmenting Methods on Tabular Data (arxiv:cs). FairContrast: contrastive learning with customized augmentations to mitigate bias in tabular data while preserving accuracy
CardioForest: An Explainable Ensemble Learning Model for Automatic Wide QRS Complex Tachycardia Diagnosis from ECG (arxiv:cs). CardioForest: an optimized Random Forest ensemble with XGBoost/LightGBM for explainable WCT detection from MIMIC-IV ECG using SHAP explanations
C2AL: Cohort-Contrastive Auxiliary Learning for Large-scale Recommendation Systems (arxiv:cs). Cohort-Contrastive Auxiliary Learning (C2AL) enhances attention in factorization machines to preserve minority cohorts in large-scale recommendations
fev-bench: A Realistic Benchmark for Time Series Forecasting (arxiv:cs). fev-bench: a 100-task time series forecasting benchmark with covariates and bootstrapped evaluation via fev library
Private and Fair Machine Learning: Revisiting the Disparate Impact of Differentially Private SGD (arxiv:cs). Analyzes how DPSGD affects fairness across metrics and hyperparameter choices, examining DP leakage, utility-fairness trade-offs, and DPSGD-Global-Adapt
👋 Before you go
I've got a big favor to ask: keeping Blaze running isn't expensive, but it all adds up, so I'm asking readers like you to help if you can.
That's why I'm launching a Patreon page! Nothing flashy, just a way for folks who find value in these newsletters to chip in a little each month. In return, you'll get:
- A real say in how Blaze evolves: vote on new topics, features, and curation ideas
- First dibs on merch (details still cooking)
- That warm fuzzy feeling knowing you're supporting something that saves you time and keeps you plugged into great tech writing
If you're getting value from Blaze, checking this out would mean the world. And if you can't contribute, no worries: the newsletters keep coming either way, and you can follow along on Patreon for free.
Thanks for reading and being part of this nerdy corner of the internet. All the best - Alastair.
About Machine Learning Engineer
Our Machine Learning Engineer newsletter covers the latest developments, research papers, tools, and techniques in ML engineering and deployment. Each week, we curate the most important content so you don't have to spend hours searching.
Whether you're a beginner or expert in machine learning engineering, our newsletter provides valuable information to keep you informed and ahead of the curve in this technically challenging field.
Subscribe now to join thousands of professionals who receive our weekly updates!