Data Scientist (with R): 7th October 2025

Published 7th October 2025

🌐 Community news and updates

The Sovereign Tech Fund invests $450,000 in R Foundation to Enhance R’s Sustainability and Security (r-consortium​.org). Sovereign Tech Fund funds R Foundation to modernize core infrastructure, strengthen supply chain security, and boost reproducibility

Weekly recap (Oct 3, 2025) (blog​.stephenturner​.us). Weekly recap of R updates, Slidecrafting, AI trends, structural variation, biotech, RAG, tinytable, and related papers and talks

posit::conf(2025) Recap (posit​.co). Posit conf highlights: Positron IDE, AI-powered workflows, Snowflake and Databricks partnerships, AWS demos, open-source updates, and enterprise deployments

R Weekly 2025-W40 Ducklake, Slidecrafting, Shiny & LLMs (rweekly​.org). R Weekly highlights ducklake, Slidecrafting, Shiny & LLMs with ggplot2 styling and rOpenSci updates

🛠️ IDEs, Quarto and automation

Real-time pricing with a pretrained probabilistic stock return model (thierrymoudiki​.github​.io). Real-time pricing with a pretrained probabilistic stock return model using Python FastAPI and R Plumber

Automating the GitHub Copilot Agent from the command line with Copilot CLI (seascapemodels.org). Using Copilot CLI from R to automate agent runs, tool permissions, and directory isolation for reproducible experiments

Resolved: Bug affecting neonUtilities in latest RStudio version on Windows (neonscience​.org). Bug fixed in neonUtilities on Windows with latest RStudio 2025.09.1; download package conflict resolved

2025 MAPOR Fall Webinar Series (mapor.org). MAPOR's Fall Webinar Series covers career transitions, Quarto automation, and improved questionnaire design for survey research

Create a Quarto Document in Positron (posit​.co). Positron offers integrated Quarto support with pre-installed extension, YAML validation, code cells, and a built-in terminal for publishing and collaboration

📊 ggplot2 mapping and visuals

Rising Fastball Velocities are Suppressing the Home Run (conormclaughlin.net). Hard fastballs suppress home runs; analysis uses 325k pitches from 2025, velocity bins, and R-like plotting with baseballr, tidyverse, and stringr

Mapping locations related to the Amelia Earhart disappearance (freerangestats​.info). Mapping key Earhart/Noonan locations in the Pacific with R, ggplot2, and custom map-building code

Still here. Still writing occasional posts for a tiny audience. (nsaunders​.wordpress​.com). Explores AFL jumpers, data wrangling in R with dplyr and fitzRoy, and rare cases of players wearing different numbers in consecutive Grand Finals

ggplot2 styling (tidyverse​.org). Styling ggplot2 with complete themes, theme elements, and extensions for typography, grids, panels, strips, and axis customization

European Basketball Success by Nation (stevenponce​.netlify​.app). Greece leads with 27 Final Four appearances and 10 titles in a faceted bar chart analysis using tidytuesday data

Double y-axis plots with ggplot2 and purrr (pacha​.dev). Double y-axis plotting in ggplot2 using spuriouscorrelations data, scaling with purrr and tintin palette

📘 Statistics, text, and reporting

2025(1) The leisurely cruise begins: Excerpt from Excursion 1 Tour 1 of Statistical Inference as Severe Testing (SIST) (errorstatistics.com). Explores severity testing in statistics, anti-pseudoscience philosophy, and the 'severe testing' framework for evaluating evidence

Latent Semantic Scale based on Word2vec (blog​.koheiw​.net). Latent Semantic Scaling with Word2vec: probabilistic LSS using seed words and quanteda tokens

Welcome to Missing Data Solutions (missingdatasolutions​.rbind​.io). Missing Data Solutions covers missing data handling, pooling methods, and R packages like psfmi for Rubin’s Rules, D1-D3, and median pooling

Recreating APA Manual Table 7.23 in R with apa7 (wjschne​.github​.io). Recreating APA Table 7.23 in R using apa7, flextable, ftExtra, and tidyverse with hanging indents and decimal alignment

Recreating APA Manual Table 7.24 in R with apa7 (wjschne.github.io). Recreating APA Table 7.24 in R using apa7, flextable, ftExtra, tidyverse, and easystats with lme4 for data visualization

🧩 Iteration, simulation, idiomatic R

Iterating some sample data (kieranhealy​.org). Iterates sample data to illustrate LLM evaluation via confusion matrices, R code, and tibble-based data frames

Mapply: When You Need to Iterate Over Multiple Inputs (drmowinckels​.io). Using mapply to pair multiple varying inputs in R, with examples of scaling, labeling, and handling constants

A new simstudy function to make simulating replications easier (rdatagen​.net). New simstudy function scenario_list generates all combinations for simulation setups, with grouping and replication via each

Construct objects with idiomatic R code (blog​.stephenturner​.us). Construct human-readable R objects with the constructive package using construct() for reproducible examples

Building a Command-Line Quiz Application in R (towardsdatascience​.com). Step-by-step guide to building a command-line quiz in R using readline, trimws, tolower, lists, and functions

📚 Academic Research

False Discovery Rate Control via Bayesian Mirror Statistic (arxiv:stat). Bayesian Mirror Statistics for FDR control in high-dimensional variable selection using ADVI without data splitting

Compressed Bayesian Tensor Regression (arxiv:stat). Generalized tensor random projection with Bayesian inference for compressed tensor regression using low-rank representations and model averaging

Forecasting intraday particle number size distribution: A functional time series approach (arxiv:stat). Multilevel functional time series with a functional factor model for one-day-ahead forecasting of 51 intraday particle size curves in London

Total Robustness in Bayesian Nonlinear Regression for Measurement Error Problems under Model Misspecification (arxiv:stat). Bayesian nonparametric total robustness for measurement error in nonlinear regression using Dirichlet process priors and latent input pseudo-samples

One-shot variable-ratio matching with fine balance (arxiv:stat). One-shot variable-ratio matching with fine balance using one-shot optimization to achieve exact covariate balance in observational studies

👋 Before you go

I've got a big favor to ask - keeping Blaze running isn't expensive, but it does all add up, so I'm asking readers like you to help, if you can.
That's why I'm launching a Patreon page! Nothing flashy, just a way for folks who find value in these newsletters to chip in a little each month. In return, you'll get:

  • Real say in how Blaze evolves: vote on new topics, features, and topic curation ideas
  • First dibs on merch (details still cooking)
  • That warm fuzzy feeling knowing you're supporting something that saves you time and keeps you plugged into great tech writing

If you are getting value from Blaze, checking this out would mean the world. And if you can't contribute, no worries: the newsletters keep coming either way, and you can follow along on Patreon for free.
Thanks for reading and being part of this nerdy corner of the internet. All the best - Alastair.

About Data Scientist (with R)

Our Data Scientist newsletter covers the latest developments, packages, techniques, and insights in R programming and data science. Each week, we curate the most important content from your favourite R blogs so you don't have to spend hours searching.

Whether you're a beginner or expert in data science with R, our newsletter provides valuable information to keep you informed.
