Data Scientist (with R): 8th July 2025
📚 Academic Research
clustra: A multi-platform k-means clustering algorithm for analysis of longitudinal trajectories in large electronic health records data (arxiv:stat). A multi-platform k-means algorithm, using R and SAS, for clustering longitudinal trajectories in electronic health records
rdhte: Conditional Average Treatment Effects in RD Designs (arxiv:econ). Estimation of heterogeneous treatment effects in sharp RD designs using rdhte software, featuring automatic bandwidth selection and robust inference methods
Tensor-product interactions in Markov-switching models (arxiv:stat). Tensor-product interactions extend Markov-switching models for ecological time series, allowing complex multidimensional effects and efficient smoothing procedures for latent behavior inference
🌐 Community & Open Science
Our July Issue is out now! (methodsblog.com). July issue highlights microbial ecology R package MicroEcoTools, pnetr ecosystem modeling, novel leaf-wood classification, SWELL phenology model, and affordable TinyCO2 system
Postdoctoral Fellow at Purdue Statistics (bayesian.org). Postdoctoral Fellow position at Purdue University focusing on Bayesian analysis, generative models, and multivariate processes, supervised by Dr. Anindya Bhadra
Open Science with a Latin American Identity: Meet the New Cohort of the rOpenSci Champions Program (ropensci.org). New rOpenSci Champions from Latin America focus on open science, R programming, and diverse projects addressing real-world challenges in various fields
R Weekly 2025-W28 Quarto syntax, BDD, scalarized bug hunting (rweekly.org). R Weekly highlights Quarto syntax, Behavior-Driven Development in R, new R packages, and statistical learning for UEFA Women’s Euro 2025 prediction
Social Coworking and Office Hours - Research Software Engineering and R (ropensci.org). Join Saranjeet Kaur Bhogal and Yanina Bellini Saibene for online coworking focused on Research Software Engineering and R on July 1, 2025
🔧 Development Tools & Programming
Hash-tag baby (openanalytics.eu). Explore how UUIDs and hashes such as MD5 can help expectant parents avoid baby name clashes while keeping names confidential
Dive()ing into the hunt #rstats (milesmcbain.micro.blog). Covers debugging in R with the tidyverse: functions, vectorisation, browser debugging, statistical calculations, and interactive evaluation in data frames
Open files in external programs with Positron or Visual Studio Code (andrewheiss.com). Utilize macOS writing tools with Positron and Visual Studio Code to open Quarto files in external editors like Typora via custom tasks
Use Positron to run R inside a Docker image through SSH (andrewheiss.com). Use Positron to SSH into Docker containers running R, improving reproducibility for researchers and simplifying development across different environments
📝 Quarto, Documentation & Workflows
Generating quarto syntax within R (blog.djnavarro.net). Data visualization in R using babynames and generated Quarto syntax for data analysis and presentation workflows
Creating tutorial worksheets: Quarto profiles for the win! (remlapmot.github.io). Create tutorial worksheets with Quarto profiles for R, Python, Stata, and Julia; implement conditional content for questions and solutions documents
High quality figures from R Markdown or Quarto to Word (riinu.me). Improve figure quality in R Markdown or Quarto output for HTML and Word by setting the DPI to 200 or 300
R Package Quality: Documentation (jumpingrivers.com). Evaluating R package documentation quality using a scoring framework assessing exported objects, examples, NEWS files, vignettes, and websites
📈 Data Visualization & Bioinformatics
Data Strips Experiment (rawdatastudies.com). Data Strips app explores new graphic representations for summarizing variable distributions using techniques like Grubbs's outlier test and shortest intervals
From R to Tableau - Leverage Both Tools for Effective Dashboards (codingthepast.com). Explore visualizing Chilean dictatorship data using the pinochet R package and Tableau Public, comparing strengths of R and Tableau for effective dashboard creation
Plot Data Along a Genome with karyoploteR (blog.stephenturner.us). Visualize genome data with karyoploteR: genome density, coverage, structural variation, GWAS plots, and gene expression from DESeq2
📊 Statistical Analysis & Methods
Understanding SSE in Regression (marsja.se). Learn about Sum of Squared Errors (SSE) in regression analysis, its significance, and R code for calculation
A Primer on Power Simulations when Evaluating Experimental Designs (joachim-gassen.github.io). Understanding power simulations for experimental design: effect sizes, sample sizes, Monte Carlo simulations, and confidence intervals using R’s pwr package
Continuous diff in diff, regressions and the 2x2 (causalinf.substack.com). Exploration of continuous diff-in-diff methods, differences in regression approaches, implications for causal inference, and personal reflections on econometric practices
Mastering ARMA, ARIMA and SARIMA Models for Time Series Forecasting in R (medium.datadriveninvestor.com). Step-by-step guide for forecasting page views using ARMA, ARIMA, and SARIMA models in R with practical data preparation and analysis
👋 Before you go
I've got a big favor to ask - keeping Blaze running isn't expensive, but it does all add up, so I'm asking readers like you to help, if you can.
That's why I'm launching a Patreon page! Nothing flashy, just a way for folks who find value in these newsletters to chip in a little each month. In return, you'll get:
- Real say in how Blaze evolves — vote on new topics, features, topic curation ideas
- First dibs on merch (details still cooking)
- That warm fuzzy feeling knowing you're supporting something that saves you time and keeps you plugged into great tech writing
If you are getting value from Blaze, checking this out would mean the world. And if you can't contribute, no worries: the newsletters keep coming either way, and you can follow along on Patreon for free.
Thanks for reading and being part of this nerdy corner of the internet. All the best - Alastair.
About Data Scientist (with R)
Our Data Scientist newsletter covers the latest developments, packages, techniques, and insights in R programming and data science. Each week, we curate the most important content from your favourite R blogs so you don't have to spend hours searching.
Whether you're a beginner or expert in data science with R, our newsletter provides valuable information to keep you informed.
Subscribe now to join thousands of professionals who receive our weekly updates!