Data Scientist (with R): 3rd June 2025

Published 3rd June 2025

🎯 Applied Analysis & Community

Vivre ses Valeurs : Comment la Science Ouverte Renforce nos Liens / Walking the Talk: How Open Science Strengthens Connections (isabellelaforestlapointe.wordpress.com, 2025-06-02). Isabelle Laforest-Lapointe emphasizes the importance of open science through free R/RStudio workshops, fostering collaboration and breaking academic isolation while advocating for accessible scientific tools and community support

A Pace Far Different: finding best running pace with R (quantixed.org, 2025-05-27). The article uses R to find personal-best running times, analyzing segments over distances such as 5K and the marathon, with libraries for data processing and visualization

Unmasking Long Covid: PCA & Clustering Analysis of Symptom Syndromes (drmowinckels.io, 2025-06-02). Exploring personal Long Covid data through PCA, Dr. Mowinckel identifies four symptom clusters: Menstruation, Exertion, Emotional, and Neurological, offering data-driven insights into managing symptoms with clear visualizations

The Dynamics of the “Gentle Way”: Exploring Judo Attack Combinations as Networks in R (geekcologist.wordpress.com, 2025-05-27). Explore Judo techniques using R for network analysis. Discover crucial attack combinations, influential moves, and how to visualize throwing techniques as networks with applications in martial arts and ecology

Nordic-RSE 2025 event blog (digitalflapjack.com, 2025-05-31). The Nordic-RSE 2025 conference in Gothenburg explored open science, design patterns in software, 3D visualization, and data collection methods using tools like Google Takeout, with a focus on reproducibility in research

🔧 R Development & Infrastructure

R-Version 4.5.0: Was ist neu? (statistik-dresden.de, 2025-05-27). R 4.5.0 introduces faster package installations with libcurl, enhanced linear algebra routines via updated BLAS and LAPACK libraries, and new datasets including 'penguins' (adapted from the palmerpenguins package) and 'gait' for improved data analysis

April 2025 Top 40 New CRAN Packages (rworks.dev, 2025-05-28). Explore 40 new R packages across diverse fields such as AI, biology, epidemiology, and machine learning, featuring tools like PacketLLM, clockSim, and RRgeo for enhanced data analysis and model simulations

Using renv in R (erikgahner.dk, 2025-05-31). Utilize the R package 'renv' to manage R environments effectively, ensuring reproducibility by capturing package dependencies and versions needed for projects, thus mitigating issues arising from package updates

STICr: An open-source package and workflow for stream temperature, intermittency, and conductivity (STIC) data (samzipper.com, 2025-05-30). STICr is an open-source R package for processing Stream Temperature, Intermittency, and Conductivity (STIC) data, facilitating FAIR data practices in hydrology through sensor calibration and reproducible workflows across interdisciplinary studies

R Package Repository Snapshots for Clinical Trial Submissions (posit.co, 2025-05-28). Explore R Package Repository Snapshots for efficient clinical trial submissions, leveraging centralized management for RStudio, Jupyter, and VS Code, while facilitating quick publishing of R and Python applications

📊 Statistical Methods & Modeling

AMA OLS vs Poisson regression (andrewpwheeler.com, 2025-05-28). Andrew Wheeler discusses the differences between OLS and Poisson regression for evaluating place-based interventions in crime analysis, emphasizing model selection based on dose-response relationships and implications for extrapolation

Continuous Variable? You Can Categorize it! (dethwench.com, 2025-06-02). Categorizing continuous variables enhances descriptive analysis. Using R programming, the blog demonstrates transforming continuous data from UK electric vehicle charge points into a categorical variable for clearer interpretation and understanding

A Bayesian Network model of pregnancy outcomes for England and Wales (wherearethenumbers.substack.com, 2025-05-27). A Bayesian Network model uses large-scale data to accurately predict pregnancy outcomes in England and Wales, offering a decision-aid tool for clinicians and pregnant women for risk assessment throughout pregnancy

Continuous diff in diff (causalinf.substack.com, 2025-05-31). Scott Cunningham discusses the rapid rise of continuous diff-in-diff methods, highlighting Brant Callaway's R package, 'contdid,' and the technique's role in causal inference, particularly focusing on continuous treatments

🤖 AI & Modern R Applications

The Modern R Stack for Production AI (blog.stephenturner.us, 2025-06-02). R is now a strong contender in AI development, leveraging tools like ellmer, ollamar, gander, and ragnar for LLM interactions, NLP, and retrieval-augmented generation, revitalizing its role in modern data science

Computer vision with LLMs in R (posit.co, 2025-05-30). Explore computer vision applications using R and LLMs, enabling rapid development, data insights, and streamlined sharing through RStudio, Jupyter, and secure package management for Python and R environments

Natural language data science with RStudio and Databricks (posit.co, 2025-05-30). Integrate RStudio and Databricks for natural language data science, enabling centralized management, secure package repositories, rapid app sharing, and faster report delivery with engaging tools like Python, R, and Shiny applications

📚 Academic & Research Articles

Not your everyday meta-analysis: venturing into mixed methods, machine learning and expert elicitation (aliceinstatisticsland.wordpress.com, 2025-05-30). Associate Professor Samantha Low-Choy presented innovative meta-analysis techniques, including mixed methods, multiverse analysis, and expert elicitation using bespoke Bayesian models during a talk at RSFAS in Canberra

a novel discrepancy measure (xianblog.wordpress.com, 2025-05-31). EJ Wagenmakers and Raoul Grasman propose a novel discrepancy measure, D_EP, providing a symmetric alternative to Kullback-Leibler divergence for comparing distributions, emphasizing Bayesian approaches without requiring absolute continuity

Ordinal regression for meta-analysis of test accuracy: a flexible approach for utilising all threshold data (arxiv:stat, 2025-05-29). Ordinal regression models enhance meta-analysis of test accuracy by utilizing all threshold data, providing summary estimates through a flexible framework. Implemented in the MetaOrdDTA R package, it supports MCMC summaries and meta-regression

About Data Scientist (with R)

Our Data Scientist newsletter covers the latest developments, packages, techniques, and insights in R programming and data science. Each week, we curate the most important content from your favourite R blogs so you don't have to spend hours searching.

Whether you're a beginner or expert in data science with R, our newsletter provides valuable information to keep you informed.