Data Scientist (with R): 13th May 2025
Published 13th May 2025
📣 R Community & Ecosystem Updates
Mapping research landscapes and dynamics: Some basic bibliometric analyses with R (geekcologist.wordpress.com, 2025-05-06). Explore bibliometric analysis with R to map research dynamics using techniques like word clouds, co-occurrence networks, thematic maps, and keyword evolution, enabling deeper understanding of scientific communication
Posit with Snowflake: A Sweet Spot for Data Scientists (posit.co, 2025-05-09). Posit integrates RStudio and Jupyter with Snowflake's data ecosystem, enhancing data science workflows through secure package management, quick application sharing, and innovative AI tool development via the Posit Workbench Native App
R Weekly 2025-W20 R ecosystem, GitHub Actions weather, PIN code India (rweekly.org, 2025-05-12). This week features R's data analysis ecosystem versus Python, GitHub Actions for weather graphs, and data.gov.in's PIN Code boundaries in India. Updates on RStudio IDE and R coding with Gemini 2.5 Pro are included
RStudio IDE and Posit Workbench 2025.05.0: What’s New (posit.co, 2025-05-08). RStudio IDE and Posit Workbench 2025.05.0 introduce centralized management for RStudio, Jupyter, and VS Code, support secure package repositories, and enable rapid sharing of Python and R applications alongside enhanced data insights
📊 Data Visualization with R
svglite 2.2.0 (tidyverse.org, 2025-05-07). svglite 2.2.0 introduces full R graphics engine support, enhanced text rendering, font handling improvements, and the ability to create complex SVG paths, embedding web fonts for better visual representations
Fonts in R (tidyverse.org, 2025-05-12). Explore font concepts in R including typefaces, font files, and important formats like TrueType, OpenType, and WOFF. Gain insight into digital typography and how systemfonts makes font data accessible in R
Fixing Bad Charts: EV Acceleration as a Function of Battery Charge Level (conormclaughlin.net, 2025-05-10). A redesign of EV acceleration charts using ggplot highlights performance trends across battery charge levels, revealing linear declines for some models and a flat performance curve for the Hyundai Ioniq 5
Trend-Anomaly Analysis: Ethereum’s Pectra Upgrade (datageeek.com, 2025-05-07). The post argues that Ethereum's Pectra upgrade may drive a 40% price increase, based on a trend-anomaly analysis using R packages like tidyverse, tidyquant, and timetk
🛠️ R Tools & Tutorials
A Data Scientist’s View of Running R in Visual Studio Code (rworks.dev, 2025-05-08). Explore running R in Visual Studio Code with integration options like Jupyter notebooks, R and Python interoperability via rpy2, and Quarto for document creation
Using Infomap for Food Web Modularity (lsaravia.github.io, 2025-05-07). Calculate food web modularity with the multiweb R package using Infomap; demonstrates its application on the Potter Cove food web and includes network randomization techniques for null model comparison
https with {plumber} using Caddy (josiahparry.com, 2025-05-12). Set up HTTPS for a Plumber API using Caddy, a simple reverse proxy, and run it with Docker on platforms like DigitalOcean or AWS ECS while demonstrating API functionality with R
pkgdown.offline: Build pkgdown websites without an internet connection (nanx.me, 2025-05-10). pkgdown.offline is an R package for building pkgdown websites offline, bundling essential frontend assets and providing alternative core functions to bypass internet requirements for CI/CD pipelines and air-gapped environments
Roaringly Acknowledge Organizations with ROR IDs in DESCRIPTION (ropensci.org, 2025-05-09). Organizations can now be identified with ROR IDs in the DESCRIPTION files of R packages, paralleling the use of ORCID for individual authors, with support from the devtools ecosystem and the desc, roxygen2, and pkgdown packages
Use use() in R #2 (erikgahner.dk, 2025-05-11). Erik Gahner Larsen discusses base::use() in R, covering limitations such as single-package loading, the importance of explicit syntax, and the advantages of box::use() in certain contexts
Hack your way to a good git history (ropensci.org, 2025-05-13). Learn best practices for git, including small, atomic commits and informative messaging, to enhance coding workflows. Discover the saperlipopette R package for safe practice, and share git tips in this online session
R/exams Presents: Fun with Flags (R-exams.org, 2025-05-07). An interactive quiz utilizing R/exams and Quarto to test knowledge of country flags and their neighbors, featuring source files and coding elements like exams2forms for generating assessments
🧑‍🔬 Academic & Scholarly Articles
CHOIR: A principled approach to clustering single-cell data (tomsing1.github.io, 2025-05-11). CHOIR offers a principled method for clustering single-cell data, utilizing random forest classifiers to distinguish sibling clusters based on gene expression, while minimizing overclustering risks through various implemented techniques
May 2025: Interactive charts and fast Bayesian inference (edinbr.org, 2025-05-09). Nicola Rennie discusses interactive charts in R, while Jordan Richards presents NeuralEstimators for fast Bayesian inference, emphasizing likelihood-free methods and a user-friendly R interface to the underlying Julia package for advanced statistical analysis
Review of the Rostock Open Science Workshop 2025 (demogr.mpg.de, 2025-05-09). The Rostock Open Science Workshop 2025 discussed open science practices, emphasizing transparency, reproducibility, and hands-on sessions. Keynote speakers shared insights on the impact of open science amid current political challenges affecting academic freedom
OWABI@BioInference2025 [29 May] (xianblog.wordpress.com, 2025-05-12). OWABI hosts a webinar featuring two talks on Bayesian inference in biological modeling, using tools like stochastic differential equations and Gaussian mixture models, livestreamed from BioInference 2025 on May 29
RCOMPSs: A Scalable Runtime System for R Code Execution on Manycore Systems (arxiv:stat, 2025-05-11). RCOMPSs is a scalable runtime system for efficient parallel execution of R applications on manycore systems, enabling automated task management and dependency tracking, demonstrated through KNN, K-means, and linear regression performance evaluations
GLOSSA: a user-friendly R Shiny application for Bayesian machine learning analysis of marine species distribution (arxiv:stat, 2025-05-09). GLOSSA is an open-source R Shiny app for species distribution modeling using Bayesian Additive Regression Trees, featuring data processing, model fitting, visualization, and variable importance calculations for marine species analysis
A Unified Approach to Covariate Adjustment for Survival Endpoints in Randomized Clinical Trials (arxiv:stat, 2025-05-08). A covariate adjustment approach for survival endpoints enhances statistical efficiency using an augmentation method, optimizing treatment effect estimators with various statistical and machine learning techniques, implemented in the R package 'sleete'
Variable Selection for Fixed and Random Effects in Multilevel Functional Mixed Effects Models (arxiv:stat, 2025-05-08). A new multilevel functional mixed effects selection (MuFuMES) method identifies age- and race-specific heterogeneity in physical activity patterns, utilizing spike-and-slab group lasso priors and an ECM algorithm for variable selection
About Data Scientist (with R)
Our Data Scientist newsletter covers the latest developments, packages, techniques, and insights in R programming and data science. Each week, we curate the most important content from your favourite R blogs so you don't have to spend hours searching.
Whether you're a beginner or expert in data science with R, our newsletter provides valuable information to keep you informed.