Machine Learning Engineer: 15th July 2025

Published 15th July 2025

🔧 Company Engineering Blogs

AXLearn: Modular Large Model Training on Heterogeneous Infrastructure (machinelearning.apple.com). AXLearn offers a modular deep learning system for scalable model training on varied infrastructure, emphasizing performance, complexity management, and rapid experimentation

Next Gen Data Processing at Massive Scale At Pinterest With Moka (Part 1 of 2) (medium.com/pinterest-engineering). Pinterest's Big Data Platform team transitions from Hadoop to a Kubernetes-based architecture, introducing Moka to optimize Spark workloads and enhance data processing efficiency

How we built it: Jurisdiction resolution for Stripe Tax (stripe.com). Stripe introduces a jurisdiction resolution system (JRS) for accurate tax calculations amid the complex US tax landscape, using GIS data and bounding-box algorithms
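The core trick the post describes, cheap bounding-box rejection before exact point-in-polygon tests, is easy to picture in code. A minimal sketch using shapely, purely illustrative and not Stripe's actual JRS; names are made up:

```python
# Minimal sketch of bounding-box pre-filtering before exact point-in-polygon
# checks, the general pattern behind jurisdiction lookup. Illustrative only;
# not Stripe's JRS. Requires `pip install shapely`.
from shapely.geometry import Point, Polygon

def find_jurisdictions(lat, lon, jurisdictions):
    """jurisdictions: list of (name, shapely Polygon) pairs."""
    point = Point(lon, lat)  # shapely uses (x=lon, y=lat) ordering
    matches = []
    for name, polygon in jurisdictions:
        minx, miny, maxx, maxy = polygon.bounds
        # Cheap bounding-box rejection first...
        if not (minx <= point.x <= maxx and miny <= point.y <= maxy):
            continue
        # ...then the exact (more expensive) containment test.
        if polygon.contains(point):
            matches.append(name)
    return matches

# Toy example: a square "district" around the origin.
district = Polygon([(-1, -1), (1, -1), (1, 1), (-1, 1)])
print(find_jurisdictions(0.5, 0.2, [("district-a", district)]))  # ['district-a']
```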

🏗️ Data Engineering & MLOps

Next Gen Data Processing at Massive Scale At Pinterest With Moka (Part 1 of 2) (medium.com/pinterest-engineering). Pinterest's Big Data Platform team transitions from Hadoop to a Kubernetes-based architecture, introducing Moka to optimize Spark workloads and enhance data processing efficiency

The Data Engineering Toolkit: Infrastructure, DevOps, and Beyond (ssp​.sh). Explore advanced data engineering tools: SQL, Python libraries, orchestration, DevOps, Infrastructure as Code, and soft skills for effective data platform management

Graph foundation models for relational data (research.google). Google Research shows how graph foundation models leverage interconnected relational tables for better ML generalization, combining GNN and Transformer architectures

Feature Store Architecture (blog​.devgenius​.io). Explore the importance of feature stores in MLOps, architectural choices, data layers, and tools like ClickHouse and Snowflake for machine learning

🔧 ML Algorithms & Model Optimization

Diffusion Elites: surprisingly good, simple and embarrassingly parallel (blog​.christianperone​.com). Diffusion Elites leverages pre-trained diffusion models and the Cross-Entropy Method for efficient, parallelized search in high-dimensional problem spaces
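If you haven't met the Cross-Entropy Method before, here is a bare-bones sketch of the loop; in Diffusion Elites the Gaussian proposal below is (roughly) replaced by sampling from a pre-trained diffusion model, but the elite-selection idea is the same. Illustrative only, not the post's code:

```python
# Bare-bones Cross-Entropy Method loop. Each candidate is scored independently,
# which is why this family of methods is embarrassingly parallel.
import numpy as np

def cem(objective, dim, iters=50, pop_size=256, elite_frac=0.1, seed=0):
    rng = np.random.default_rng(seed)
    mean, std = np.zeros(dim), np.ones(dim)
    n_elite = max(1, int(pop_size * elite_frac))
    for _ in range(iters):
        # Sample a population from the current proposal distribution.
        candidates = rng.normal(mean, std, size=(pop_size, dim))
        scores = np.array([objective(c) for c in candidates])
        elites = candidates[np.argsort(scores)[-n_elite:]]  # keep the best
        # Refit the proposal to the elites.
        mean, std = elites.mean(axis=0), elites.std(axis=0) + 1e-6
    return mean

# Toy objective: maximize -||x - 3||^2 (optimum at x = 3).
best = cem(lambda x: -np.sum((x - 3.0) ** 2), dim=5)
print(best.round(2))
```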

mlsauce (home to a model-agnostic gradient boosting algorithm) can now be installed from PyPI. (thierrymoudiki​.github​.io). mlsauce, a model-agnostic gradient boosting algorithm, is now available for installation from PyPI, enhancing ease of use and flexibility with various base learners

(Better) Split Decisions: More Stable CART By Using Subsampled Node-Level CV (gojiberries​.io). Enhancing stability in CART models using subsampled node-level cross-validation to minimize sensitivity to training data variations

Boosting Stability: Fixing XGBoost Instability Under Row Permutation (gojiberries.io). XGBoost's sensitivity to row permutation can lead to prediction variations; the post investigates histogram binning, stability issues, and fixes for consistent model performance

Predicting the NASDAQ 100 with Hyperparameter Tuning (datageeek​.com). Modeling NASDAQ 100 using boosted trees, hyperparameter tuning, and economic data like the Federal Funds Effective Rate and Unemployment Rate
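The original post works in R; for readers who live in Python, a hedged analogue of the same workflow (boosted trees plus time-series-aware hyperparameter search) looks roughly like this, with the file and column names as placeholders:

```python
# Hedged Python analogue of tuning a boosted-tree regressor on macro features;
# the original post uses R. "nasdaq_macro.csv" and the column names
# ("fed_funds_rate", "unemployment_rate") are placeholders.
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

df = pd.read_csv("nasdaq_macro.csv", parse_dates=["date"]).sort_values("date")
X = df[["fed_funds_rate", "unemployment_rate"]]
y = df["nasdaq_100"]

param_grid = {
    "n_estimators": [200, 500],
    "learning_rate": [0.01, 0.05, 0.1],
    "max_depth": [2, 3, 4],
}
# TimeSeriesSplit avoids leaking future observations into training folds.
search = GridSearchCV(
    GradientBoostingRegressor(random_state=42),
    param_grid,
    cv=TimeSeriesSplit(n_splits=5),
    scoring="neg_root_mean_squared_error",
)
search.fit(X, y)
print(search.best_params_, -search.best_score_)
```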

KNN and K-means in Gini Prametric Spaces, at ECAI 2025, in Bologna, Italy (freakonometrics​.hypotheses​.org). Innovative Gini-based K-means and KNN algorithms enhance data classification and clustering, demonstrating resilience to noise at ECAI 2025 in Bologna, Italy

⚡ Hardware Performance & GPU Optimization

Outperform compiled PyTorch code using QuACK 🦆 (veitner​.bearblog​.dev). Implement efficient reduction methods using QuACK and CuTeDSL on modern GPUs for LLMs

what’s going on in the black clouds? (oklo​.org). Exploring Landauer's limit in thermodynamics, image generation, and entropy in diffusion models with implications for data processing efficiency

Creating custom kernels for the AMD MI300 (huggingface.co). Custom kernel development for AMD MI300 GPUs boosts performance for Llama 3.1 405B in FP8 within vLLM, focusing on optimized RMS norm and SwiGLU kernels
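For context on what such a kernel replaces, here is the reference (unfused) RMSNorm in plain PyTorch; the post's hand-written MI300 kernel fuses this work, so treat this as a sketch of the computation, not the optimized implementation:

```python
# Reference (unfused) RMSNorm in plain PyTorch; a custom kernel fuses these
# elementwise ops for Llama-style models. Sketch only.
import torch

def rms_norm(x: torch.Tensor, weight: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # RMSNorm: scale by the root-mean-square over the last dim; no mean subtraction.
    rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + eps)
    return x * rms * weight

x = torch.randn(2, 4, 4096)          # (batch, seq, hidden)
weight = torch.ones(4096)            # learned per-channel scale
print(rms_norm(x, weight).shape)     # torch.Size([2, 4, 4096])
```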

The Crucial Role of NUMA Awareness in High-Performance Deep Learning (towardsdatascience​.com). NUMA architecture significantly impacts deep learning performance; understanding system topology and implementing NUMA-aware PyTorch scripts are essential for optimization
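As a taste of what "NUMA-aware" means in practice, here is a minimal Linux-only sketch that pins the current process to the CPUs of one NUMA node so compute and memory stay local. Node numbering and sysfs layout are system-dependent, and `numactl --cpunodebind=0 --membind=0 ...` is the fuller command-line equivalent:

```python
# Minimal Linux-only sketch: restrict this process to the CPUs of NUMA node 0.
import os

def cpus_of_numa_node(node: int) -> set[int]:
    with open(f"/sys/devices/system/node/node{node}/cpulist") as f:
        cpulist = f.read().strip()          # e.g. "0-15,32-47"
    cpus = set()
    for part in cpulist.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus

node0_cpus = cpus_of_numa_node(0)
os.sched_setaffinity(0, node0_cpus)         # pid 0 means "this process"
print(f"Pinned to {len(node0_cpus)} CPUs on NUMA node 0")
```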

🧠 LLM Architecture & Deep Learning

From Words to Worlds: from LLMs to VLAs (geoffreywoo​.com). Geoffrey Woo discusses the evolution from LLMs to VLAs, emphasizing the need for data, supervision, physics, safety, and edge autonomy in robotics

The Simulation Hypothesis for Other Language Model Architectures (thedissonance​.net). Explores the Simulation Hypothesis in LLMs, focusing on dLLMs and CoT models, examining their unique architectures and implications for AI behavior and alignment

Reading and Writing with Projections (mccormickml.com). Understanding feature directionality in Transformer models, using projections to encode and decode speaker settings and to work around dimensionality challenges

Writing an LLM from scratch, part 16 -- layer normalisation (gilesthomas​.com). Explains layer normalization in LLMs, its importance in gradient propagation, and its implementation using mean and variance adjustments
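The mean-and-variance adjustment the post walks through fits in a few lines. A from-scratch sketch that mirrors torch.nn.LayerNorm (not necessarily the post's exact code):

```python
# From-scratch layer normalisation: each token's activations are shifted to
# zero mean and unit variance, then rescaled by learned gamma/beta.
import torch

def layer_norm(x, gamma, beta, eps=1e-5):
    mean = x.mean(dim=-1, keepdim=True)
    var = x.var(dim=-1, keepdim=True, unbiased=False)   # biased variance, as LayerNorm uses
    return gamma * (x - mean) / torch.sqrt(var + eps) + beta

x = torch.randn(2, 3, 8)                   # (batch, tokens, embedding_dim)
gamma, beta = torch.ones(8), torch.zeros(8)
out = layer_norm(x, gamma, beta)
print(torch.allclose(out, torch.nn.functional.layer_norm(x, (8,))))  # True
```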

H-Nets - the Past (goombalab​.github​.io). Exploring H-Nets, hierarchy in deep learning, chunking in cognition, the evolution from RNNs to S4 models, and insights on tokenization

H-Nets - the Future (goombalab​.github​.io). H-Nets introduce dynamic segmentation for improved modeling in complex languages and data modalities, enhancing efficiency in multimodal applications and language reasoning

Understand Neural Nets better, post 5 of N -- Code Assistant shootout (addxorrol.blogspot.com). Neural network training optimization with CUDA, hashing during GPU forward passes, and a performance comparison of the code assistants Gemini and Claude

🔬 Specialized Analysis & Research

Dipping my toes into the ducklake: Exploring gene expression data with R and python (tomsing1​.github​.io). Exploring gene expression data management and analysis using DuckDB, R, Python, RNA-seq, and the limma workflow for differential expression

Specialized R packages for spatial cross-validation: sperrorest and blockCV (geocompx​.org). Overview of R packages sperrorest and blockCV for spatial cross-validation using temperature data in Spain, focusing on model performance and variable importance
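The post itself covers the R packages sperrorest and blockCV; the underlying idea (keep spatially correlated points on the same side of a fold boundary) translates to any stack. A rough Python analogue using grid-cell IDs as CV groups, with all numbers purely illustrative:

```python
# Python analogue of spatial block cross-validation: grid-cell IDs become CV
# groups so nearby points never straddle train and test folds.
import numpy as np
from sklearn.model_selection import GroupKFold

rng = np.random.default_rng(0)
lon, lat = rng.uniform(-9, 3, 500), rng.uniform(36, 44, 500)   # roughly Spain
y = np.sin(lon) + 0.1 * rng.normal(size=500)                    # toy target
X = np.column_stack([lon, lat])

block_size = 2.0  # degrees; one ID per grid cell
blocks = (np.floor(lon / block_size).astype(int) * 1000
          + np.floor(lat / block_size).astype(int))

for train_idx, test_idx in GroupKFold(n_splits=5).split(X, y, groups=blocks):
    # Fit/evaluate any model here; every point from a block stays on one side.
    print(len(train_idx), len(test_idx))
```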

Carnegie Mellon University at ICML 2025 (blog​.ml​.cmu​.edu). CMU researchers present 127 papers on expected variational inequalities, adversarial voting, high-dimensional prediction, and scientific equation discovery at ICML 2025

Aseem Baranwal receives 2025 Cheriton Distinguished Dissertation Award (uwaterloo​.ca). Aseem Baranwal awarded the 2025 Cheriton Distinguished Dissertation Award for groundbreaking research in graph neural networks and machine learning at the University of Waterloo

Matthew Regehr awarded 2025 CPI Graduate Excellence Scholarship (uwaterloo​.ca). Matthew Regehr, a PhD candidate at Waterloo, receives the 2025 CPI Graduate Excellence Scholarship to advance research in privacy accounting and machine learning technology

Rising on arXiv - 2025-07-11 (blog​.rinesi​.com). Exploring advancements in multi-target tracking, spatial audio, and autoregressive modeling on arXiv, highlighting recent research trends

📚 Academic Research

CDC: Causal Domain Clustering for Multi-Domain Recommendation (arxiv:cs). Causal Domain Clustering enhances multi-domain recommendations using affinity matrices and causal discovery, improving performance in over 50 domains by 4.9%

Heterogeneity-Aware Regression with Nonparametric Estimation and Structured Selection for Hospital Readmission Prediction (arxiv:stat). Novel hierarchical-group structure kernel enables interpretable hospital readmission prediction, utilizing sparsity-inducing selection and enhancing modeling of patient heterogeneity and interactions

PromiseTune: Unveiling Causally Promising and Explainable Configuration Tuning (arxiv:cs). PromiseTune enhances configuration tuning by using causally purified rules, improving performance and explainability in software systems, outperforming 11 state-of-the-art tuners

Greedy Low-Rank Gradient Compression for Distributed Learning with Convergence Guarantees (arxiv:cs). GreedyLore enables low-rank gradient compression in distributed learning, achieving linear convergence rates with error feedback and semi-lazy updates, outperforming existing methods
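GreedyLore itself adds greedy basis selection and semi-lazy updates; the building block it rests on, rank-r gradient compression with error feedback, can be sketched in a few lines (illustrative, not the paper's algorithm):

```python
# Core idea only: compress a gradient matrix to rank r via truncated SVD and
# carry the residual forward as error feedback for the next step.
import numpy as np

def compress_gradient(grad, residual, rank=4):
    corrected = grad + residual                    # re-add what was lost last step
    U, S, Vt = np.linalg.svd(corrected, full_matrices=False)
    low_rank = U[:, :rank] @ np.diag(S[:rank]) @ Vt[:rank, :]
    new_residual = corrected - low_rank            # the compression error
    return low_rank, new_residual

grad = np.random.default_rng(0).normal(size=(256, 128))
low_rank, residual = compress_gradient(grad, np.zeros_like(grad), rank=8)
print(np.linalg.matrix_rank(low_rank), np.linalg.norm(residual) / np.linalg.norm(grad))
```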

Topic Modeling and Link-Prediction for Material Property Discovery (arxiv:cs). AI-driven framework combines Hierarchical and Boolean matrix factorization for link prediction in material science, uncovering hidden associations in TMD literature

Late Fusion Multi-task Learning for Semiparametric Inference with Nuisance Parameters (arxiv:stat). Late fusion framework for multi-task learning enhances semiparametric inference, estimating heterogeneous treatment effects with privacy-preserving nuisance parameters from diverse data sources

When Graph Contrastive Learning Backfires: Spectral Vulnerability and Defense in Recommendation (arxiv:cs). Graph Contrastive Learning enhances recommender systems but increases vulnerability to targeted promotion attacks; introduces CLeaR attack method and SIM defense framework

Small Batch Size Training for Language Models: When Vanilla SGD Works, and Why Gradient Accumulation Is Wasteful (arxiv:cs). Explores stable training of language models with small batch sizes, recommends against gradient accumulation, and offers guidelines for optimizer hyperparameter scaling
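To make the contrast concrete, here is a hedged sketch of the two training loops the paper compares, plain small-batch SGD versus gradient accumulation that simulates a larger batch. The model, data, and hyperparameters are toy stand-ins, not the paper's setup:

```python
# Contrast sketch: (a) one optimizer step per small batch vs. (b) gradient
# accumulation over k small batches before each step.
import torch

model = torch.nn.Linear(512, 512)
loss_fn = torch.nn.MSELoss()
opt = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

def batches(n, batch_size=4):
    for _ in range(n):
        x = torch.randn(batch_size, 512)
        yield x, x  # toy autoencoding target

# (a) Vanilla small-batch SGD.
for x, y in batches(8):
    opt.zero_grad()
    loss_fn(model(x), y).backward()
    opt.step()

# (b) Gradient accumulation: emulates batch_size * accum_steps per update,
#     at the cost of extra forward/backward passes held in between steps.
accum_steps = 4
opt.zero_grad()
for i, (x, y) in enumerate(batches(8), start=1):
    (loss_fn(model(x), y) / accum_steps).backward()
    if i % accum_steps == 0:
        opt.step()
        opt.zero_grad()
```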

Ensemble of Weak Spectral Total Variation Learners: a PET-CT Case Study (arxiv:cs). Ensemble of weak learners based on spectral total-variation features improves PET-CT predictions for skeletal metastases over deep learning and Radiomics

Multilayer GNN for Predictive Maintenance and Clustering in Power Grids (arxiv:cs). Multilayer GNN framework enhances predictive maintenance in power grids, integrating Graph Attention, Convolutional, and Isomorphism Networks for improved performance and clustering

Gradient boosted multi-population mortality modelling with high-frequency data (arxiv:stat). Novel gradient boosting integration in stochastic mortality models enhances forecast accuracy using high-frequency mortality data from 30 countries, addressing clustering challenges

Compressive Imaging Reconstruction via Tensor Decomposed Multi-Resolution Grid Encoding (arxiv:cs). Tensor Decomposed multi-resolution Grid encoding enhances compressive imaging reconstruction through efficient neural network optimization and hierarchical modeling for diverse imaging applications

Electricity Market Predictability: Virtues of Machine Learning and Links to the Macroeconomy (arxiv:econ). Comparative analysis of machine learning models for forecasting electricity prices in Singapore, emphasizing non-linearity, macroeconomic links, and time-series predictability

👋 Before you go

I've got a big favor to ask - keeping Blaze running isn't expensive, but it does all add up, so I'm asking readers like you to help, if you can.
That's why I'm launching a Patreon page! Nothing flashy, just a way for folks who find value in these newsletters to chip in a little each month. In return, you'll get:

  • Real say in how Blaze evolves — vote on new topics, features, topic curation ideas
  • First dibs on merch (details still cooking)
  • That warm fuzzy feeling knowing you're supporting something that saves you time and keeps you plugged into great tech writing

If you're getting value from Blaze, checking this out would mean the world. And if you can't contribute, no worries; the newsletters keep coming either way, and you can follow along on Patreon for free.
Thanks for reading and being part of this nerdy corner of the internet. All the best - Alastair.

About Machine Learning Engineer

Our Machine Learning Engineer newsletter covers the latest developments, research papers, tools, and techniques in ML engineering and deployment. Each week, we curate the most important content so you don't have to spend hours searching.

Whether you're a beginner or expert in machine learning engineering, our newsletter provides valuable information to keep you informed and ahead of the curve in this technically challenging field.

Subscribe now to join thousands of professionals who receive our weekly updates!