📊

Data Scientist (with R): 10th June 2025

Published 10th June 2025

🌟 Community & Events

Putting Open Science Into Practice: Reflections from the 2025 RT2 Training (bitss.org, 2025-06-06). The RT2 training at UC Berkeley engaged 28 early-career researchers in open science practices, using tools like OSF and GitHub, tackling issues of reproducibility, pre-registration, and data transparency across various disciplines

New Mentoring Team, Same Open Science Spirit (ropensci.org, 2025-06-05). The rOpenSci Champions Program introduces a new team of mentors from Latin America, focusing on R package development, open science, and community engagement through tools like R and statistical analysis techniques

R Weekly 2025-W24 Containerizing Shiny, ggplot2 explorer, quickr package (rweekly.org, 2025-06-09). Explore R Weekly 2025-W24 featuring a guide on containerizing Shiny apps with shiny2docker, the new quickr 0.1.0 compiler, and diverse package updates such as Rdatasets and bssbinom

EOI R Shiny Workshop (combine.org.au, 2025-06-04). The workshop focuses on R Shiny, teaching participants to create reactive elements and incorporate interactive tables and figures, aimed at students and early career researchers with prior R experience

Social Coworking and Office Hours - Getting to know the DSLC (ropensci.org, 2025-06-03). Join Jon Harmon and Steffi LaZerte for a two-hour online coworking event on June 3, 2025, focused on the Data Science Learning Community, featuring Zoom integration and collaborative note-taking

🛠️ Tools & Development

CBIO's EuroBioc2024 posters and talks (lgatto.github.io, 2025-06-09). CBIO presents contributions to EuroBioc2024, including analyses in single-cell proteomics, differential correlation, peptide identification optimization, and the development of the RforMassSpectrometry ecosystem for data processing

Containerizing Shiny Apps with {shiny2docker}: A Step-by-Step Guide (rtask.thinkr.fr, 2025-06-03). Containerizing Shiny applications with shiny2docker simplifies Dockerfile creation, ensuring reproducibility and ease of deployment by managing R package versions and automating the CI/CD pipeline for Shiny app deployment

KI in der R-Programmierung: Neue R-Pakete (statistik-dresden.de, 2025-06-09). New R packages enhance integration with AI, including ellmer for LLMs, chores for task automation, gander for environment optimization, ragnar for RAG, and vitals for LLM evaluation, revolutionizing R programming

Our June Issue is out now! (methodsblog.com, 2025-06-04). The June issue features the ECKOchain blockchain database for ecological data, new R packages for spatial analysis, and the CEPHALOPOD framework for marine habitat modelling, highlighting advances in open science and ecological methodologies

Posit and Typst (posit.co, 2025-06-03). Posit supports Typst, a modern typesetting engine that simplifies PDF creation, enhancing scientific communication. It integrates with Quarto, promoting open-source tools for data science and facilitating dynamic research reporting

Introducing Portable Linux R Binary Packages (posit.co, 2025-06-04). Discover Portable Linux R Binary Packages for streamlined sharing and management of R and Python data insights using RStudio, Jupyter, and VS Code with centralized package repositories and fast project sharing

📊 Data Analysis & Visualization

Correlation vs Causation: Understanding the Difference (mfatihtuzen.netlify.app, 2025-06-03). Explore the difference between correlation and causation through real-world examples, R code simulations, and insights from Judea Pearl and David Freedman, highlighting critical thinking in data analysis

LA County Population (kieranhealy.org, 2025-06-09). LA County has approximately 9.66 million residents; the post uses the Census Bureau API and the R packages tidyverse and tidycensus to access population data and geometry information

Power and 'fragile' p-values (freerangestats.info, 2025-06-08). An analysis finds that 26% of significant p-values are 'fragile', meaning they fall between 0.01 and 0.05, using power calculations and simulated studies with consistent methodologies

Let It Flow: recreating a FACS plot with ggplot (quantixed.org, 2025-06-05). Stephen Royle recreates a FACS plot using R and ggplot, detailing data processing with flowCore, ggcyto, and dplyr, culminating in several histogram and line visualization techniques after adjusting binwidth and scaling methods

Tidyverse with GitHub Copilot for Healthcare Analytics – Part 1 (rworks.dev, 2025-06-04). Explore healthcare analytics using Tidyverse and GitHub Copilot to analyze complex data sets, including diabetes data for improved patient outcomes and AI-assisted data manipulation

The ggplot2 package popularity index (erikgahner.dk, 2025-06-09). Explore the ggplot2 package popularity ranking based on CRAN downloads, focusing on extensions that adhere to the grammar of graphics, with insights on useful tools like cranlogs and packages such as cowplot and ggpubr

📈 Statistical Methods & Modeling

Why parameter estimates in fitPagel can hit upper bounds, even if they don't in fitMk for each trait modeled separately.... (blog.phytools.org, 2025-06-04). Demonstrates how fitPagel can hit parameter upper bounds in correlated binary character models, even when fitMk does not, using simulated traits and emphasizing limitations of rate matrix values

The Hypothesis-Testing Philosophy of Harold Jeffreys Expressed As a 13-Word Slogan (bayesianspectacles.org, 2025-06-04). Harold Jeffreys' Bayesian philosophy on hypothesis testing emphasizes caution, supporting point-null hypotheses until alternatives offer improved predictive power, encapsulating his approach in a succinct 13-word slogan related to scientific reasoning

AI fails again (statsinthewild.com, 2025-06-05). A user attempted to fit a mixed effects model in R using the lmer function from the lme4 package but encountered issues specifying the random effects while considering ANOVA for model comparison

tenets of quantile-based inference in Bayesian models (xianblog.wordpress.com, 2025-06-07). This 2023 paper explores quantile-based Bayesian inference using cumulative distribution functions, emphasizing connections to ABC and highlighting computational challenges with quantile parameterization and numerical inversion in sampling distributions

R version of Probabilistic Machine Learning (for longitudinal data) Reserving (work in progress) (thierrymoudiki.github.io, 2025-06-05). Explore an R implementation of Probabilistic Machine Learning for longitudinal data reserving using libraries like reticulate, sklearn, and modeling techniques such as RidgeCV, ExtraTreesRegressor, and RandomForestRegressor

fairmetrics: An R package for group fairness evaluation (arxiv:cs, 2025-06-06). The fairmetrics R package provides a user-friendly framework for evaluating group-based fairness criteria in machine learning, featuring metrics like statistical parity and equalized odds, alongside example datasets from the MIMIC-II database

About Data Scientist (with R)

Our Data Scientist newsletter covers the latest developments, packages, techniques, and insights in R programming and data science. Each week, we curate the most important content from your favourite R blogs so you don't have to spend hours searching.

Whether you're a beginner or expert in data science with R, our newsletter provides valuable information to keep you informed.
