Machine Learning Engineer: 24th June 2025

Published 24th June 2025

🔧 Company Engineering Blogs

(LoRA) Fine-Tuning FLUX.1-dev on Consumer Hardware (huggingface​.co). Learn efficient fine-tuning of the FLUX.1-dev model on consumer hardware using QLoRA with techniques like low-rank adaptation, gradient checkpointing, and 8-bit optimization for improved performance and reduced memory usage
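
A minimal sketch of the QLoRA ingredients the post covers (4-bit frozen base weights, LoRA adapters, gradient checkpointing), shown on a generic transformers model for brevity; the model id and target modules below are placeholders, not the article's FLUX.1-dev settings.

```python
# Hedged QLoRA sketch: quantize the frozen base model, then train only small
# LoRA adapters on top. Placeholder model id and target modules, not the
# article's exact FLUX.1-dev configuration.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                       # quantize the frozen base weights
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B",               # placeholder model id
    quantization_config=bnb_config,
)
model = prepare_model_for_kbit_training(model)
model.gradient_checkpointing_enable()        # trade recompute for activation memory

lora_config = LoraConfig(
    r=16, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],     # placeholder target modules
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()           # only the small LoRA adapters train
```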

Unlocking the Power of Customization: How Our Enrichment System Transforms Recommendation Data… (medium​.com/booking-com-development). The Enrichment System at Booking.com leverages field masks and GraphQL to streamline data retrieval and enhance recommendation processes, enabling developers to implement custom enrichments with improved efficiency and performance

🌍 Scientific Computing & Applications

Can multi-sensor foundation models be more than the sum of their parts? (fnands​.com). Geospatial foundation models (GFMs) like DOFA and TerraMind are examined for their effectiveness and generalization abilities across sensors, highlighting challenges in specialization versus generalization for Earth Observation applications

A deep dive into vector data cubes in Python (martinfleischmann​.net). Explore vector data cubes in Python using Xvec, Xarray, Shapely, and GeoPandas with real-world applications and multi-dimensional data analysis
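
To make the "vector data cube" idea concrete, here is a minimal sketch using only xarray and shapely: a cube indexed by geometries along one dimension and time along another. The post layers Xvec's dedicated geometry index on top of this pattern; the locations below are made up.

```python
# Minimal vector data cube sketch: an xarray cube whose "geometry" dimension
# is coordinatized by shapely points (hypothetical locations). The post uses
# Xvec's GeometryIndex for a proper spatial index on top of this idea.
import numpy as np
import pandas as pd
import shapely
import xarray as xr

geoms = shapely.points([(4.9, 52.4), (2.4, 48.9), (13.4, 52.5)])  # made-up points
times = pd.date_range("2025-01-01", periods=12, freq="MS")

cube = xr.DataArray(
    np.random.rand(len(geoms), len(times)),
    coords={"geometry": geoms, "time": times},
    dims=("geometry", "time"),
    name="observed_value",
)

# Select the full time series for one geometry, as with any labelled dimension.
print(cube.sel(geometry=geoms[0]))
```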

Flow Matching Meets PDEs (ge​.in​.tum​.de). A novel Physics-Based Flow Matching framework integrates physical constraints with probabilistic models to enhance accuracy in generative modeling, showcasing significant improvements in physical residuals and distributional accuracy across PDE problems

Breaking bonds, breaking ground: Advancing the accuracy of computational chemistry with deep learning (microsoft​.com). Microsoft Research advances computational chemistry by using deep learning to improve the accuracy of density functional theory (DFT), with a new functional called Skala, enhancing predictions for chemical reactions and materials design

🔧 Machine Learning Algorithms & Methods

Greedy Over Longer Horizons: Look Ahead CART (gojiberries​.io). Look Ahead CART enhances decision tree algorithms by extending split evaluation horizons, utilizing bounded search and pruning techniques. This method effectively captures feature interactions, significantly improving predictive accuracy while managing computational costs
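
As a rough illustration of the lookahead idea (my sketch, not the post's implementation): score each candidate root split not by its own impurity drop, as greedy CART does, but by the best impurity achievable after one further split in each child.

```python
# One-step lookahead for a regression split, scored by SSE. Greedy CART would
# pick the split with the lowest immediate SSE; the lookahead version credits
# a split with the best follow-up splits its children admit.
import numpy as np

def sse(y):
    return float(((y - y.mean()) ** 2).sum()) if len(y) else 0.0

def best_child_sse(X, y):
    """Lowest total SSE reachable with at most one further split."""
    best = sse(y)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:
            m = X[:, j] <= t
            best = min(best, sse(y[m]) + sse(y[~m]))
    return best

def lookahead_split(X, y):
    """Choose the root split by the quality of its children's best splits."""
    best = (None, None, np.inf)                 # (feature, threshold, score)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:
            m = X[:, j] <= t
            score = best_child_sse(X[m], y[m]) + best_child_sse(X[~m], y[~m])
            if score < best[2]:
                best = (j, t, score)
    return best
```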

Greedy is Good. Less Greedy May be Better. (gojiberries​.io). Greedy algorithms like forward stepwise regression and CART may be efficient, but less greedy approaches using beam search and lookahead can lead to better global solutions in statistical and machine learning contexts

Stacked generalization (Machine Learning model stacking) + conformal prediction for forecasting with ahead::mlf (thierrymoudiki​.github​.io). This guide demonstrates using ahead::mlf for univariate probabilistic time series forecasting with machine learning, particularly Elastic Net, Stacked Generalization, and Conformal Prediction techniques
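
The post works in R with ahead::mlf; purely as a language-agnostic illustration of the conformal step, here is a minimal split-conformal interval wrapped around a point forecaster (scikit-learn's ElasticNet as a stand-in, synthetic data).

```python
# Minimal split-conformal sketch: calibrate a symmetric interval from absolute
# residuals on a held-out set. A stand-in for the conformal layer the post
# applies on top of stacked forecasters; not the ahead::mlf API. For time
# series, the calibration split should respect temporal order.
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = X @ rng.normal(size=5) + rng.normal(scale=0.5, size=300)

X_fit, X_cal, y_fit, y_cal = train_test_split(X, y, test_size=0.3, random_state=0)
model = ElasticNet(alpha=0.1).fit(X_fit, y_fit)

alpha = 0.1                                        # target 90% coverage
scores = np.abs(y_cal - model.predict(X_cal))      # calibration residuals
n = len(scores)
q = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n)

x_new = rng.normal(size=(1, 5))
pred = model.predict(x_new)[0]
print(f"point {pred:.2f}, interval [{pred - q:.2f}, {pred + q:.2f}]")
```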

Time Series Forecasting with Graph Transformers (kumo​.ai). Explore time series forecasting using Graph Transformers, leveraging graph structures representing relational data, and advanced techniques such as Relational Deep Learning and generative modeling for improved accuracy and probabilistic outcomes

⚙️ Systems, Security & Engineering

An introduction to the Neural Network Watermarking Call for Technologies (blog​.chiariglione​.org). MPAI's Neural Network Watermarking initiative seeks to establish traceability and authenticity through watermarking and fingerprinting, addressing the growing demands for certified service quality in neural networks

Yuzhe You wins best student paper award at GI 2025 for novel cybersecurity tool (uwaterloo​.ca). PhD student Yuzhe You won the Michael A. J. Sweeney Award for a paper on VATRA, a visual analytics tool designed to improve adversarial training in machine learning and evaluate model trade-offs

CAP theorem in ML: Consistency vs. availability (aiacceleratorinstitute​.com). Explores the CAP theorem's impact on machine learning pipelines, examining trade-offs in data ingestion, training processes, and model serving, featuring tools like Kafka, TensorFlow, and strategies for graceful degradation

Hadoop, up, and away (sciencespot​.co​.uk). Hadoop-OptiStor enhances data processing efficiency on Hadoop by optimizing data distribution, replica management, and task scheduling, resulting in significant performance improvements for large-scale data systems

NVIDIA Tensor Core Evolution: From Volta To Blackwell (semianalysis​.com). NVIDIA's Tensor Cores have evolved significantly from Volta to Blackwell, optimizing AI operations and performance through innovations such as mixed-precision training, asynchronous data copy, and the utilization of specialized matrix multiplication techniques
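
Mixed-precision training, one of the Tensor Core features the piece traces, looks roughly like this in PyTorch; a generic sketch (requires a CUDA device), not tied to any particular Tensor Core generation.

```python
# Generic mixed-precision loop: matmuls run in fp16 on Tensor Cores while
# master weights stay in fp32 and the loss is scaled to avoid underflow.
import torch

model = torch.nn.Linear(1024, 1024).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
scaler = torch.cuda.amp.GradScaler()

for _ in range(10):
    x = torch.randn(64, 1024, device="cuda")
    optimizer.zero_grad(set_to_none=True)
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        loss = model(x).square().mean()      # forward pass in reduced precision
    scaler.scale(loss).backward()            # scaled loss to keep small grads
    scaler.step(optimizer)
    scaler.update()
```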

🧠 Neural Networks & Deep Learning Research

Research Update: Applications of Local Volume Measurement (blog​.eleuther​.ai). Research explores local volume measurement in neural networks for detecting model behaviors, utilizing the tyche library and contrasting findings in model misalignment detection and mechanistic anomaly detection

Efficient RL Training - Optimizing Memory Usage in veRL (Draft) (hebiao064​.github​.io). Biao He and Ata Fatahi discuss optimizing memory usage in veRL, a reinforcement learning library, utilizing techniques like Fully Sharded Data Parallel, Megatron-LM, and the torch_memory_saver library to address GPU memory challenges
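
One of the memory levers discussed is Fully Sharded Data Parallel; here is a bare-bones sketch of the underlying PyTorch primitive (run under torchrun so a process group exists), not veRL's actual integration.

```python
# Bare-bones FSDP wrapping: parameters, gradients, and optimizer state are
# sharded across ranks instead of replicated on every GPU. Launch with
# torchrun; this is the PyTorch primitive, not veRL's setup.
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

dist.init_process_group(backend="nccl")
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

model = torch.nn.Sequential(
    torch.nn.Linear(4096, 4096), torch.nn.ReLU(), torch.nn.Linear(4096, 4096)
).cuda()

model = FSDP(model)                       # shard params/grads/optimizer state
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

x = torch.randn(8, 4096, device="cuda")
loss = model(x).square().mean()
loss.backward()
optimizer.step()
```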

How is Spiky Superhuman AI trained? (medium​.com/@danieldkang). Spiky superhuman AI (SSAI) leverages reinforcement learning (RL) and search techniques, exemplified by Google's AlphaEvolve and other models, to solve hard problems, such as AIME competition mathematics, more effectively than humans

Brains in the Machine: The Rise of Neuromorphic Computing (Ep. 285) (datascienceathome​.podbean​.com). This episode explores neuromorphic computing, highlighting spiking neural networks (SNNs), their efficiency, and applications in devices like low-power drones, hearing aids, and event-based cameras

Backpropagating quasi-randomized neural networks (thierrymoudiki​.github​.io). Explore backpropagating quasi-randomized neural networks using FiniteDiffRegressor, a Python package blending finite difference-based training with supervised machine learning, convenient for data-driven decisions through the Techtonique web app
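
The core idea of replacing analytic backpropagation with finite-difference gradients can be sketched generically; this toy central-difference gradient descent on a linear model is my illustration, not the FiniteDiffRegressor API.

```python
# Toy finite-difference training: estimate the loss gradient w.r.t. the
# weights by central differences instead of autodiff, then take plain
# gradient descent steps. Not the FiniteDiffRegressor API, just the idea.
import numpy as np

def loss(w, X, y):
    return np.mean((X @ w - y) ** 2)

def fd_gradient(w, X, y, eps=1e-6):
    grad = np.zeros_like(w)
    for i in range(len(w)):
        e = np.zeros_like(w)
        e[i] = eps
        grad[i] = (loss(w + e, X, y) - loss(w - e, X, y)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)

w = np.zeros(3)
for _ in range(500):
    w -= 0.05 * fd_gradient(w, X, y)   # gradient descent with estimated grads
print(w)                               # should land near [1.0, -2.0, 0.5]
```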

📊 Bayesian Methods & Econometrics

SCOR Foundation for Science Webinar, ML and Econometrics (freakonometrics​.hypotheses​.org). A discussion on econometrics vs machine learning highlights their different focuses on causality and prediction. Key concepts include loss functions, maximum likelihood estimation, and models like logistic regression and ordinary least squares

BayesComp 2025.4 (xianblog​.wordpress​.com). Emtiyaz Khan presented on adaptive Bayesian intelligence at BayesComp 2025.4, discussing variational Bayes, Adam optimization, federated learning, and applications in cancer research, while speakers highlighted privacy and simulation-based inference techniques

BayesComp 2025.3 (xianblog​.wordpress​.com). BayesComp 2025.3 features talks on horseshoe priors, Langevin algorithms, and PDMPs, including topics like change-point detection, Wasserstein gradient flows, and approximate Bayesian methods, with discussions of advanced statistical techniques

Understanding the Link Between Uncertainty and Imports by glmnet (datageeek​.com). Using glmnet for modeling, the analysis links increased economic uncertainty in late 2024 to a rise in imports, and traces the negative correlation it finds mainly to gold imports

📐 Mathematical Foundations

Polarities (Part 6) (johncarlosbaez​.wordpress​.com). John Baez and Adittya Chaudhuri explore polarities in graph theory, presenting directed graphs with edges labeled by a monoid to model positive and negative effects, and studying feedback loops via homology

An Explicit Computation in Derived Algebraic Geometry (grossack​.site). Exploring derived algebraic geometry through examples, highlighting intersections of curves, derived tensor products, and the significance of flatness and smoothness in mathematical computations

How to Explicitly Compute Charts for a Levelset Submanifold (grossack​.site). This blog post explores explicit chart computations for levelset submanifolds, focusing on $SL_2(\mathbb{R})$ and hyperboloids, highlighting the use of Jacobians and charts chosen according to which coordinates are nonzero
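
For a flavour of the setup: $SL_2(\mathbb{R})$ is the level set $\det^{-1}(1)$, and the implicit function theorem produces charts wherever the differential of $\det$ is surjective. A standard statement of that computation (my summary, not the post's notation):

```latex
% SL_2(R) as a level set of the determinant, and one explicit chart.
SL_2(\mathbb{R}) = \{ A \in M_2(\mathbb{R}) : \det A = 1 \},
\qquad
\det \begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc .

% Differential of det at A in direction H:
d(\det)_A(H) = \det(A)\,\operatorname{tr}(A^{-1}H),
% surjective onto R whenever det(A) != 0, so 1 is a regular value and
% SL_2(R) is a smooth 3-dimensional submanifold of M_2(R) \cong R^4.

% On the open set where a != 0, solving ad - bc = 1 for d gives the chart
(a, b, c) \longmapsto \begin{pmatrix} a & b \\ c & (1 + bc)/a \end{pmatrix}.
```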

Animating Linear Transformations with Quiver (towardsdatascience​.com). Explore how animated quiver plots in Python's Matplotlib help visualize linear transformations and concepts like Singular Value Decomposition, by understanding vector movements and transformations through code snippets
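
A minimal static version of the idea, as a sketch: draw a grid of vectors with quiver before and after applying a matrix. The article goes further and animates the interpolation between the two states.

```python
# Static quiver sketch: plot grid vectors before and after a linear map A.
# The example matrix is an arbitrary shear-and-stretch, not from the article.
import numpy as np
import matplotlib.pyplot as plt

A = np.array([[2.0, 1.0],
              [0.0, 1.0]])

xs, ys = np.meshgrid(np.linspace(-1, 1, 5), np.linspace(-1, 1, 5))
V = np.stack([xs.ravel(), ys.ravel()])          # original vectors as columns
W = A @ V                                       # transformed vectors

fig, ax = plt.subplots()
origin = np.zeros_like(V)
ax.quiver(*origin, V[0], V[1], color="gray",
          angles="xy", scale_units="xy", scale=1, label="original")
ax.quiver(*origin, W[0], W[1], color="tab:blue",
          angles="xy", scale_units="xy", scale=1, label="A @ v")
ax.set_xlim(-3, 3)
ax.set_ylim(-3, 3)
ax.set_aspect("equal")
ax.legend()
plt.show()
```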

🎯 Specialized Applications & Research

The Little Learner and hotel room hacking (dustycloud​.org). Christine Lemmer-Webber explores 'The Little Learner' for deep learning insights, experimenting with Racket and Guile, while juggling hotel dorm distractions and attending a friend's wedding

[128] LinkedOut: The Best Published Audit Study, And Its Interesting Shortcoming (datacolada​.org). A LinkedIn audit study comparing responses to networking requests from Black and White young men is praised for its clever methodology but criticized for a lack of diversity in the stimuli used, affecting the generalizability of results

Community Spotlight: Kirill Brodt (drivendata​.co). Kirill Brodt, a PhD student in computer graphics, discusses using machine learning for automating animation tasks, including inbetweening and 3D pose estimation, while highlighting his competition successes on DrivenData

Autoformalization: Bridging Human Mathematical Intuition and Machine Precision (medium​.com/intuitionmachine). Autoformalization aims to bridge human mathematical intuition with machine precision by translating informal mathematical content into formal proofs using semantic embedding spaces, guiding exploration, and verification systems to address inherent challenges and ambiguities

Methodologies to Improve the Role of White Matter Hyperintensities As Neuroimaging Biomarker Of Alzheimer’s Disease (aliceinstatisticsland​.wordpress​.com). Valentina Bordin explores the use of white matter hyperintensities and deep learning models, like BIANCA, to improve Alzheimer’s disease diagnostics, enhancing early detection and treatment through data from sources like the UK Biobank

📚 Academic Research

RocketStack: A level-aware deep recursive ensemble learning framework with exploratory feature fusion and model pruning dynamics (arxiv:cs). RocketStack introduces a level-aware recursive ensemble framework utilizing mild Gaussian noise, attention-based selection, SFE filters, and autoencoders to achieve deeper stacking, significantly improving accuracy while reducing complexity and runtime across various datasets

TabArena: A Living Benchmark for Machine Learning on Tabular Data (arxiv:cs). TabArena introduces a continuously maintained benchmarking system for machine learning on tabular data, showcasing ensembles and deep learning's competitiveness, alongside gradient-boosted trees. It features a public leaderboard and reproducible code

Revisiting Randomization in Greedy Model Search (arxiv:cs). Proposes a randomized ensemble of greedy forward selection estimators for sparse linear regression, improving computational efficiency and outperforming lasso and elastic net, while reshaping the bias-variance trade-off through dynamic programming

Comparative analysis of machine learning techniques for feature selection and classification of Fast Radio Bursts (arxiv:astro). Machine learning techniques for classifying Fast Radio Bursts using PCA, t-SNE, and clustering methods, enhancing understanding of repeating and non-repeating FRBs

SIDE: Semantic ID Embedding for effective learning from sequences (arxiv:cs). A novel Semantic ID (SID) embedding method uses vector quantization and a multi-task VQ-VAE framework to enhance industrial ad-recommendation systems, achieving 2.4X normalized entropy gain and 3X data footprint reduction

HARMONY: A Scalable Distributed Vector Database for High-Throughput Approximate Nearest Neighbor Search (arxiv:cs). Harmony is a distributed ANNS system employing a novel multi-granularity partition strategy, achieving 4.63 times throughput improvement and 58% better performance over traditional methods in skewed workloads through efficient load balancing and early-stop pruning

Advancing Loss Functions in Recommender Systems: A Comparative Study with a Rényi Divergence-Based Solution (arxiv:cs). Loss functions such as Softmax Loss and Cosine Contrastive Loss are analyzed for their strengths and limitations. A new loss function, DrRL, utilizing Rényi divergence, enhances recommendation accuracy and robustness

Sequential Policy Gradient for Adaptive Hyperparameter Optimization (arxiv:cs). Sequential Policy Gradient (SPG) offers a novel trajectory generation approach for lightweight online hyperparameter optimization, extending models with temporary modules, demonstrating consistent performance gains across diverse datasets with low computational costs

Manifold Learning for Personalized and Label-Free Detection of Cardiac Arrhythmias (arxiv:cs). Nonlinear dimensionality reduction techniques, like t-SNE and UMAP, effectively detect cardiac arrhythmias in ECG data, achieving over 90% accuracy without labels, thereby enhancing personalized health monitoring
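
As a generic sketch of the label-free pipeline (synthetic stand-in data, not the paper's preprocessing or results): embed fixed-length beat segments with t-SNE, then cluster the embedding without labels.

```python
# Hedged sketch of the label-free idea: embed beat segments with t-SNE and
# cluster the 2-D embedding; labels would only be used for evaluation.
# Random vectors below stand in for real ECG beats.
import numpy as np
from sklearn.manifold import TSNE
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
normal_beats = rng.normal(0.0, 1.0, size=(200, 180))   # placeholder "beats"
ectopic_beats = rng.normal(0.8, 1.2, size=(40, 180))
X = np.vstack([normal_beats, ectopic_beats])

emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(emb)

# A small, well-separated cluster would be flagged as a possible arrhythmic
# morphology for personalized review.
print(np.bincount(clusters))
```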

Bayesian Hybrid Machine Learning of Gallstone Risk (arxiv:stat). A hybrid machine learning framework integrates Adaptive LASSO for variable selection and Bayesian Additive Regression Trees for interaction detection, enhancing gallstone disease prediction and offering actionable clinical insights

A Scalable Factorization Approach for High-Order Structured Tensor Recovery (arxiv:math). This study introduces a scalable Riemannian gradient descent method for high-order tensor recovery, utilizing orthonormal factor constraints to ensure convergence to the ground-truth tensor at a linear rate under mild conditions

Optimal Embedding Learning Rate in LLMs: The Effect of Vocabulary Size (arxiv:stat). This research analyzes the impact of vocabulary size on training dynamics in large language models, leading to a new optimal embedding learning rate ratio that scales as Θ(√width), contrasting with previous predictions by Maximal Update Parametrization

Scalable Machine Learning Algorithms using Path Signatures (arxiv:stat). Path signatures provide robust, scalable solutions in machine learning, enhancing time series modeling, deep learning, and graph neural networks through innovative kernel and tensor methods

👋 Before you go

I've got a big favor to ask - keeping Blaze running isn't expensive, but it does all add up, so I'm asking readers like you to help, if you can.
That's why I'm launching a Patreon page! Nothing flashy, just a way for folks who find value in these newsletters to chip in a little each month. In return, you'll get:

  • Real say in how Blaze evolves — vote on new topics, features, topic curation ideas
  • First dibs on merch (details still cooking)
  • That warm fuzzy feeling knowing you're supporting something that saves you time and keeps you plugged into great tech writing

If you're getting value from Blaze, checking this out would mean the world. And if you can't contribute, no worries: the newsletters keep coming either way, and you can follow along on Patreon for free.
Thanks for reading and being part of this nerdy corner of the internet. All the best - Alastair.

About Machine Learning Engineer

Our Machine Learning Engineer newsletter covers the latest developments, research papers, tools, and techniques in ML engineering and deployment. Each week, we curate the most important content so you don't have to spend hours searching.

Whether you're a beginner or expert in machine learning engineering, our newsletter provides valuable information to keep you informed and ahead of the curve in this technically challenging field.

Subscribe now to join thousands of professionals who receive our weekly updates!