Data Scientist (with R): 15th July 2025
Published 15th July 2025
🏢 Industry Applications & Community
Embracing open-source data science for smarter financial risk decisions (posit.co). Open-source data science empowers financial risk management through transparency, agility, and advanced tools like R, Python, Shiny, and Git for robust risk analysis
R for Health Technology Assessment (HTA): Identifying Needs, Streamlining Processes, Building Bridges (r-consortium.org). R Consortium's HTA working group explores R's role in health technology assessment, streamlining processes, and leveraging AI to enhance efficiency and collaboration
Elevate Your Skills and Boost Your Career with Jumping Rivers Free Monthly Webinars (jumpingrivers.com). Free monthly webinars by Jumping Rivers on R, Python, Shiny, and Posit for data professionals to enhance skills and address real-world challenges
🛠️ Development & Workflow Tools
Kimi K2 and R Coding (simonpcouch.com). Kimi K2 from Moonshot AI evaluated on R coding tasks, leveraging the vitals package for LLM performance assessment against competitors like Claude and Gemini
How to use Rmarkdown (jackauty.com). Quick guide to setting up R Markdown in RStudio using essential code for reproducible reports with ggplot2 and automated package handling
How to use Positron’s Connections Pane with DuckDB (andrewheiss.com). Connecting DuckDB with R using Positron's Connections Pane enhances data management and querying for large datasets, optimizing performance and efficiency
How to use natural language data science with RStudio and Amazon SageMaker (posit.co). Explore integrating natural language data science with RStudio and Amazon SageMaker for effective data insights and application management
📦 R Packages & Tools
Specialized R packages for spatial cross-validation: sperrorest and blockCV (geocompx.org). Overview of R packages sperrorest and blockCV for spatial cross-validation using temperature data in Spain, focusing on model performance and variable importance
httr2 1.2.0 (tidyverse.org). httr2 1.2.0 brings improved security, new URL handling, and new debugging tools, along with some deprecations, to enhance the HTTP client experience
Parallel processing in purrr 1.1.0 (tidyverse.org). purrr 1.1.0 introduces parallel processing with the in_parallel() function, enabling efficient multi-core operations for functional programming in R
R Package Quality: Maintainer Criteria (jumpingrivers.com). Evaluating R package maintenance: bug closure rates, maintainers, source control, and contributor analysis using Litmus for quality assurance
📚 Teaching & Learning Resources
Introducing the Qatar Cars Dataset (musgrave.substack.com). Introduction of the Qatar Cars Dataset for teaching statistics, featuring car prices, specifications, and metrics relevant for diverse global classrooms
Making messy data: creating more realistic, synthetic data for teaching and testing (nrennie.rbind.io). Nicola Rennie introduces the 'messy' R package for creating synthetic, realistic datasets, enhancing data wrangling skills for students in statistics and data science
Rouse, Russel, & Campbell (2025) is basically a curated list of Psi Chi journals that are perfect for Intro Stats. (notawfulandboring.blogspot.com). Curated list of paywall-free Psi Chi articles suitable for teaching Intro Statistics, with a focus on analysing results sections
A modern introduction to probability and statistics [book review] (xianblog.wordpress.com). Graham Upton's book covers probability and statistics and integrates R commands, but lacks depth in modern computational methods and Bayesian principles
🎨 Data Visualization & Analysis
How do we take inspiration from colour palettes? (questionsindataviz.com). Explore color palette inspiration from historical works, particularly Emily Vanderpoel's 'Color Problems,' and modern tools like Vanderbot for visualisations
Dipping my toes into the ducklake: Exploring gene expression data with R and python (tomsing1.github.io). Exploring gene expression data management and analysis using DuckDB, R, Python, RNA-seq, and the limma workflow for differential expression
Standard Deviation vs. Standard Error: Meaning, Misuse, and the Math Behind the Confusion (mfatihtuzen.netlify.app). Explore the critical differences between standard deviation and standard error, their impact on data interpretation, and how to visualize them using R
Reality is a number – A number is not reality (danumbers.substack.com). Data exhibition explores the nature of numbers, reality representation, and the use of R programming for creating expressive graphics in art
🔬 Bayesian Methods & Statistical Theory
Using Bayesian tools to be a better frequentist (martinmodrak.cz). Explore the effectiveness of Bayesian methods in frequentist contexts, highlighting negative binomial regression, confidence intervals, and simulation techniques
Are We Listening? Part II of “Sennsible significance” Commentary on Senn’s Guest Post (errorstatistics.com). Discusses Senn's concept of statistical significance, terminological clarity, and critiques the 'redefine significance' movement including Bayes factor tests
Bayes predictive framework (danmackinlay.name). Exploration of Bayes predictive framework, emphasizing predictive Bayesianism, exchangeability, and foundational contributions from Bruno de Finetti and modern Bayesian nonparametrics
Learn Stan with brms, Part II (solomonkurz.netlify.app). Explores modeling weight based on height using brms and Stan, discussing priors, continuous predictors, and workflow in R with practical code examples
A probability puzzle involving random fractions (mathematicaloddsandends.wordpress.com). Analyzing a probability puzzle involving independent uniform draws and rounded fractions, using Monte Carlo simulation to find the probability of an even integer
Stable Distributions (bruceediger.com). Exploring stable distributions, particularly Cauchy distributions, through programming, statistical analysis, and understanding probability density functions using R and Go
📚 Academic Research
Method: Using generalized additive models in the animal sciences (arxiv:stat). Generalized additive models (GAMs) address nonlinear relationships in animal science, enhancing fit via smooth functions and penalized splines across diverse data examples
Robust Spatiotemporal Epidemic Modeling with Integrated Adaptive Outlier Detection (arxiv:stat). Robust spatiotemporal epidemic modeling with RST-GAM integrates adaptive outlier detection and is applied to COVID-19 infection data to inform public health decisions
👋 Before you go
I've got a big favor to ask - keeping Blaze running isn't expensive, but it does all add up, so I'm asking readers like you to help, if you can.
That's why I'm launching a Patreon page! Nothing flashy, just a way for folks who find value in these newsletters to chip in a little each month. In return, you'll get:
- Real say in how Blaze evolves — vote on new topics, features, topic curation ideas
- First dibs on merch (details still cooking)
- That warm fuzzy feeling knowing you're supporting something that saves you time and keeps you plugged into great tech writing
If you are getting value from Blaze, checking this out would mean the world. And if you can't contribute, no worries: the newsletters keep coming either way, and you can follow along on Patreon for free.
Thanks for reading and being part of this nerdy corner of the internet. All the best - Alastair.
About Data Scientist (with R)
Our Data Scientist newsletter covers the latest developments, packages, techniques, and insights in R programming and data science. Each week, we curate the most important content from your favourite R blogs so you don't have to spend hours searching.
Whether you're a beginner or expert in data science with R, our newsletter provides valuable information to keep you informed.
Subscribe now to join thousands of professionals who receive our weekly updates!