Data Scientist (with R): 7th October 2025
Published 7th October 2025
🌐 Community news and updates
The Sovereign Tech Fund invests $450,000 in R Foundation to Enhance R’s Sustainability and Security (r-consortium.org). Sovereign Tech Fund backs R Foundation to modernize core infrastructure, strengthen supply chain security, and boost reproducibility
Weekly recap (Oct 3, 2025) (blog.stephenturner.us). Weekly recap of R updates, Slidecrafting, AI trends, structural variation, biotech, RAG, tinytable, and related papers and talks
posit::conf(2025) Recap (posit.co). Posit conf highlights: Positron IDE, AI-powered workflows, Snowflake and Databricks partnerships, AWS demos, open-source updates, and enterprise deployments
R Weekly 2025-W40 Ducklake, Slidecrafting, Shiny & LLMs (rweekly.org). R Weekly highlights ducklake, Slidecrafting, Shiny & LLMs with ggplot2 styling and rOpenSci updates
🛠️ IDEs, Quarto and automation
Real-time pricing with a pretrained probabilistic stock return model (thierrymoudiki.github.io). Real-time pricing with a pretrained probabilistic stock return model using Python FastAPI and R Plumber
Automating the Github Copilot Agent from the command line with Copilot CLI (seascapemodels.org). Using Copilot CLI from R to automate agent runs, tool permissions, and directory isolation for reproducible experiments
Resolved: Bug affecting neonUtilities in latest RStudio version on Windows (neonscience.org). Bug fixed in neonUtilities on Windows with latest RStudio 2025.09.1; download package conflict resolved
2025 MAPOR Fall Webinar Series (mapor.org). MAPOR's Fall Webinar Series covers career transitions, Quarto automation, and improved questionnaire design for survey research
Create a Quarto Document in Positron (posit.co). Positron offers integrated Quarto support with pre-installed extension, YAML validation, code cells, and a built-in terminal for publishing and collaboration
📊 ggplot2 mapping and visuals
Rising Fastball Velocities are Surpressing the Home Run (conormclaughlin.net). Hard fastballs suppress home runs; analysis uses 325k pitches from 2025, velocity bins, and R-like plotting with baseballr, tidyverse, and stringr
Mapping locations related to the Amelia Earhart disappearance (freerangestats.info). Mapping key Earhart/Noonan locations in the Pacific with R, ggplot2, and custom map-building code
Still here. Still writing occasional posts for a tiny audience. (nsaunders.wordpress.com). Explores AFL jumpers, data wrangling in R with dplyr and fitzRoy, and rare cases of players wearing different numbers in consecutive Grand Finals
ggplot2 styling (tidyverse.org). Styling ggplot2 with complete themes, theme elements, and extensions for typography, grids, panels, strips, and axis customization
European Basketball Success by Nation (stevenponce.netlify.app). Greece leads with 27 Final Four appearances and 10 titles in a faceted bar chart analysis using tidytuesday data
Double y-axis plots with ggplot2 and purrr (pacha.dev). Double y-axis plotting in ggplot2 using spuriouscorrelations data, scaling with purrr and tintin palette
📘 Statistics, text, and reporting
2025(1) The leisurely cruise begins: Excerpt from Excursion 1 Tour 1 of Statistical Inference as Severe Testing (SIST) (errorstatistics.com). Explores severity testing in statistics, anti-pseudoscience philosophy, and the 'severe testing' framework for evaluating evidence
Latent Semantic Scale based on Word2vec (blog.koheiw.net). Latent Semantic Scaling with Word2vec: probabilistic LSS using seed words and quanteda tokens
Welcome to Missing Data Solutions (missingdatasolutions.rbind.io). Missing Data Solutions covers missing data handling, pooling methods, and R packages like psfmi for Rubin’s Rules, D1-D3, and median pooling
Recreating APA Manual Table 7.23 in R with apa7 (wjschne.github.io). Recreating APA Table 7.23 in R using apa7, flextable, ftExtra, and tidyverse with hanging indents and decimal alignment
Recreating APA Manual Table 7.24 in R with apa7 (wjschne.github.io). Recreating APA Table 7.24 in R using apa7, flextable, ftExtra, tidyverse, and easystats with lme4 for data visualization
🧩 Iteration, simulation, idiomatic R
Iterating some sample data (kieranhealy.org). Iterates sample data to illustrate LLM evaluation via confusion matrices, R code, and tibble-based data frames
Mapply: When You Need to Iterate Over Multiple Inputs (drmowinckels.io). Using mapply to pair multiple varying inputs in R, with examples of scaling, labeling, and handling constants
A new simstudy function to make simulating replications easier (rdatagen.net). New simstudy function scenario_list generates all combinations for simulation setups, with grouping and replication via each
Construct objects with idiomatic R code (blog.stephenturner.us). Construct human-readable R objects with the constructive package using construct() for reproducible examples
Building a Command-Line Quiz Application in R (towardsdatascience.com). Step-by-step guide to building a command-line quiz in R using readline, trimws, tolower, lists, and functions
📚 Academic Research
False Discovery Rate Control via Bayesian Mirror Statistic (arxiv:stat). Bayesian Mirror Statistics for FDR control in high-dimensional variable selection using ADVI without data splitting
Compressed Bayesian Tensor Regression (arxiv:stat). Generalized tensor random projection with Bayesian inference for compressed tensor regression using low-rank representations and model averaging
Forecasting intraday particle number size distribution: A functional time series approach (arxiv:stat). Multilevel functional time series with a functional factor model for one-day-ahead forecasting of 51 intraday particle size curves in London
Total Robustness in Bayesian Nonlinear Regression for Measurement Error Problems under Model Misspecification (arxiv:stat). Bayesian nonparametric total robustness for measurement error in nonlinear regression using Dirichlet process priors and latent input pseudo-samples
One-shot variable-ratio matching with fine balance (arxiv:stat). Variable-ratio matching with fine balance using one-shot optimization to achieve exact covariate balance in observational studies
👋 Before you go
I've got a big favor to ask - keeping Blaze running isn't expensive, but it does all add up, so I'm asking readers like you to help, if you can.
That's why I'm launching a Patreon page! Nothing flashy, just a way for folks who find value in these newsletters to chip in a little each month. In return, you'll get:
- Real say in how Blaze evolves — vote on new topics, features, topic curation ideas
- First dibs on merch (details still cooking)
- That warm fuzzy feeling knowing you're supporting something that saves you time and keeps you plugged into great tech writing
If you are getting value from Blaze, checking this out would mean the world. And if you can't contribute, no worries: the newsletters keep coming either way, and you can follow along on Patreon for free.
Thanks for reading and being part of this nerdy corner of the internet. All the best - Alastair.
About Data Scientist (with R)
Our Data Scientist newsletter covers the latest developments, packages, techniques, and insights in R programming and data science. Each week, we curate the most important content from your favourite R blogs so you don't have to spend hours searching.
Whether you're a beginner or expert in data science with R, our newsletter provides valuable information to keep you informed.
Subscribe now to join thousands of professionals who receive our weekly updates!