Data Scientist (with R): 10th June 2025
Community & Events
Putting Open Science Into Practice: Reflections from the 2025 RT2 Training (bitss.org, 2025-06-06). The RT2 training at UC Berkeley engaged 28 early-career researchers in open science practices, using tools like OSF and GitHub, tackling issues of reproducibility, pre-registration, and data transparency across various disciplines
New Mentoring Team, Same Open Science Spirit (ropensci.org, 2025-06-05). The rOpenSci Champions Program introduces a new team of mentors from Latin America, focusing on R package development, open science, and community engagement through tools like R and statistical analysis techniques
R Weekly 2025-W24 Containerizing Shiny, ggplot2 explorer, quickr package (rweekly.org, 2025-06-09). Explore R Weekly 2025-W24 featuring a guide on containerizing Shiny apps with shiny2docker, the new quickr 0.1.0 compiler, and diverse package updates such as Rdatasets and bssbinom
EOI R Shiny Workshop (combine.org.au, 2025-06-04). The workshop focuses on R Shiny, teaching participants to create reactive elements and incorporate interactive tables and figures, aimed at students and early career researchers with prior R experience
Social Coworking and Office Hours - Getting to know the DSLC (ropensci.org, 2025-06-03). Join Jon Harmon and Steffi LaZerte for a two-hour online coworking event on June 3, 2025, focused on the Data Science Learning Community, featuring Zoom integration and collaborative note-taking
Tools & Development
CBIO's EuroBioc2024 posters and talks (lgatto.github.io, 2025-06-09). CBIO presents contributions to EuroBioc2024, including analyses in single-cell proteomics, differential correlation, peptide identification optimization, and the development of the RforMassSpectrometry ecosystem for data processing
Containerizing Shiny Apps with {shiny2docker}: A Step-by-Step Guide (rtask.thinkr.fr, 2025-06-03). Containerizing Shiny applications with shiny2docker simplifies Dockerfile creation, ensuring reproducibility and ease of deployment by managing R package versions and automating the CI/CD pipeline for Shiny app deployment
KI in der R-Programmierung: Neue R-Pakete [AI in R Programming: New R Packages] (statistik-dresden.de, 2025-06-09). New R packages improve integration with AI, including ellmer for working with LLMs, chores for automating repetitive coding tasks, gander for environment-aware coding assistance, ragnar for RAG, and vitals for LLM evaluation
Our June Issue is out now! (methodsblog.com, 2025-06-04). The June issue features the ECKOchain blockchain database for ecological data, new R packages for spatial analysis, and the CEPHALOPOD framework for marine habitat modelling, highlighting advances in open science and ecological methods
Posit and Typst (posit.co, 2025-06-03). Posit supports Typst, a modern typesetting engine that simplifies PDF creation, enhancing scientific communication. It integrates with Quarto, promoting open-source tools for data science and facilitating dynamic research reporting
Introducing Portable Linux R Binary Packages (posit.co, 2025-06-04). Discover Portable Linux R Binary Packages for streamlined sharing and management of R and Python data insights using RStudio, Jupyter, and VS Code with centralized package repositories and fast project sharing
Data Analysis & Visualization
Correlation vs Causation: Understanding the Difference (mfatihtuzen.netlify.app, 2025-06-03). Explore the difference between correlation and causation through real-world examples, R code simulations, and insights from Judea Pearl and David Freedman, highlighting critical thinking in data analysis
LA County Population (kieranhealy.org, 2025-06-09). LA County has approximately 9.66 million residents; the post retrieves population data and geometry information from the Census Bureau API using the R packages tidyverse and tidycensus
Power and 'fragile' p-values (freerangestats.info, 2025-06-08). The analysis finds that 26% of significant p-values are 'fragile' (between 0.01 and 0.05), drawing on power calculations and simulated studies with consistent methodologies
Let It Flow: recreating a FACS plot with ggplot (quantixed.org, 2025-06-05). Stephen Royle recreates a FACS plot using R and ggplot, detailing data processing with flowCore, ggcyto, and dplyr, culminating in several histogram and line visualization techniques after adjusting binwidth and scaling methods
Tidyverse with GitHub Copilot for Healthcare Analytics - Part 1 (rworks.dev, 2025-06-04). Explore healthcare analytics using Tidyverse and GitHub Copilot to analyze complex data sets, including diabetes data for improved patient outcomes and AI-assisted data manipulation
The ggplot2 package popularity index (erikgahner.dk, 2025-06-09). Explore the ggplot2 package popularity ranking based on CRAN downloads, focusing on extensions that adhere to the grammar of graphics, with insights on useful tools like cranlogs and packages such as cowplot and ggpubr
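A note on the 'fragile' p-values item above: the 26% figure can be reproduced with a short power calculation. The sketch below is not the post's own code (the post works in R). It assumes a two-sided z-test at a 0.05 significance level with a single fixed true effect, and the function names are hypothetical. It finds the effect size that gives a target power, then computes the share of significant results whose p-value falls between 0.01 and 0.05.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def power_two_sided(delta, alpha):
    """Probability that a two-sided z-test rejects at level alpha
    when the test statistic is distributed Normal(delta, 1)."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    upper_tail = 1 - STD_NORMAL.cdf(z_crit - delta)
    lower_tail = STD_NORMAL.cdf(-z_crit - delta)
    return upper_tail + lower_tail


def fragile_share(target_power, alpha=0.05, strict=0.01):
    """Share of significant results (p < alpha) with strict <= p < alpha,
    for the effect size that yields target_power at level alpha."""
    # Power increases with delta for delta >= 0, so bisection is enough.
    lo, hi = 0.0, 10.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if power_two_sided(mid, alpha) < target_power:
            lo = mid
        else:
            hi = mid
    delta = (lo + hi) / 2
    p_significant = power_two_sided(delta, alpha)
    p_strict = power_two_sided(delta, strict)
    return (p_significant - p_strict) / p_significant


if __name__ == "__main__":
    for power in (0.5, 0.8, 0.95):
        print(power, round(fragile_share(power), 3))
```

Under these assumptions, 80% power gives a fragile share of about 0.26, and the share shrinks as power rises, so the figure depends on how well powered the studies are.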
Statistical Methods & Modeling
Why parameter estimates in fitPagel can hit upper bounds, even if they don't in fitMk for each trait modeled separately.... (blog.phytools.org, 2025-06-04). Demonstrates how fitPagel can hit parameter upper bounds in correlated binary character models, even when fitMk does not, using simulated traits and emphasizing limitations of rate matrix values
The Hypothesis-Testing Philosophy of Harold Jeffreys Expressed As a 13-Word Slogan (bayesianspectacles.org, 2025-06-04). Harold Jeffreys' Bayesian philosophy on hypothesis testing emphasizes caution, supporting point-null hypotheses until alternatives offer improved predictive power, encapsulating his approach in a succinct 13-word slogan related to scientific reasoning
AI fails again (statsinthewild.com, 2025-06-05). A user attempted to fit a mixed effects model in R using the lmer function from the lme4 package but encountered issues specifying the random effects while considering ANOVA for model comparison
tenets of quantile-based inference in Bayesian models (xianblog.wordpress.com, 2025-06-07). This 2023 paper explores quantile-based Bayesian inference using cumulative distribution functions, emphasizing connections to ABC and highlighting computational challenges with quantile parameterization and numerical inversion in sampling distributions
R version of Probabilistic Machine Learning (for longitudinal data) Reserving (work in progress) (thierrymoudiki.github.io, 2025-06-05). Explore an R implementation of Probabilistic Machine Learning for longitudinal data reserving using libraries like reticulate, sklearn, and modeling techniques such as RidgeCV, ExtraTreesRegressor, and RandomForestRegressor
fairmetrics: An R package for group fairness evaluation (arxiv:cs, 2025-06-06). The fairmetrics R package provides a user-friendly framework for evaluating group-based fairness criteria in machine learning, featuring metrics like statistical parity and equalized odds, alongside example datasets from the MIMIC-II database
About Data Scientist (with R)
Our Data Scientist newsletter covers the latest developments, packages, techniques, and insights in R programming and data science. Each week, we curate the most important content from your favourite R blogs so you don't have to spend hours searching.
Whether you're a beginner or expert in data science with R, our newsletter provides valuable information to keep you informed.