Data Scientist (with R): 3rd June 2025
Applied Analysis & Community
Vivre ses Valeurs : Comment la Science Ouverte Renforce nos Liens / Walking the Talk: How Open Science Strengthens Connections (isabellelaforestlapointe.wordpress.com, 2025-06-02). Isabelle Laforest-Lapointe emphasizes the importance of open science through free R/RStudio workshops, fostering collaboration and breaking academic isolation while advocating for accessible scientific tools and community support
A Pace Far Different: finding best running pace with R (quantixed.org, 2025-05-27). Using R to track personal-best running times, the article analyzes segments across distances such as 5K and the marathon, with R libraries handling data processing and visualization
Unmasking Long Covid: PCA & Clustering Analysis of Symptom Syndromes (drmowinckels.io, 2025-06-02). Exploring personal Long Covid data through PCA, Dr. Mowinckel identifies four symptom clusters: Menstruation, Exertion, Emotional, and Neurological, offering data-driven insights into managing symptoms with clear visualizations
The Dynamics of the “Gentle Way”: Exploring Judo Attack Combinations as Networks in R (geekcologist.wordpress.com, 2025-05-27). Explore Judo techniques using R for network analysis. Discover crucial attack combinations, influential moves, and how to visualize throwing techniques as networks with applications in martial arts and ecology
Nordic-RSE 2025 event blog (digitalflapjack.com, 2025-05-31). The Nordic-RSE 2025 conference in Gothenburg explored open science, design patterns in software, 3D visualization, and data collection methods using tools like Google Takeout, with a focus on reproducibility in research
R Development & Infrastructure
R-Version 4.5.0: Was ist neu? / R Version 4.5.0: What's New? (statistik-dresden.de, 2025-05-27). R 4.5.0 introduces faster package installation through parallel downloads with libcurl, enhanced linear algebra routines via updated BLAS and LAPACK libraries, and new built-in datasets including 'penguins' (from the palmerpenguins package) and 'gait'
April 2025 Top 40 New CRAN Packages (rworks.dev, 2025-05-28). Explore 40 new R packages across diverse fields such as AI, biology, epidemiology, and machine learning, featuring tools like PacketLLM, clockSim, and RRgeo for enhanced data analysis and model simulations
Using renv in R (erikgahner.dk, 2025-05-31). Utilize the R package 'renv' to manage R environments effectively, ensuring reproducibility by capturing package dependencies and versions needed for projects, thus mitigating issues arising from package updates
STICr: An open-source package and workflow for stream temperature, intermittency, and conductivity (STIC) data (samzipper.com, 2025-05-30). STICr is an open-source R package for processing Stream Temperature, Intermittency, and Conductivity (STIC) data, facilitating FAIR data practices in hydrology through sensor calibration and reproducible workflows across interdisciplinary studies
R Package Repository Snapshots for Clinical Trial Submissions (posit.co, 2025-05-28). Explore R Package Repository Snapshots for efficient clinical trial submissions, leveraging centralized management for RStudio, Jupyter, and VS Code, while facilitating quick publishing of R and Python applications
Statistical Methods & Modeling
AMA OLS vs Poisson regression (andrewpwheeler.com, 2025-05-28). Andrew Wheeler discusses the differences between OLS and Poisson regression for evaluating place-based interventions in crime analysis, emphasizing model selection based on dose-response relationships and implications for extrapolation
Continuous Variable? You Can Categorize it! (dethwench.com, 2025-06-02). Categorizing continuous variables enhances descriptive analysis. Using R programming, the blog demonstrates transforming continuous data from UK electric vehicle charge points into a categorical variable for clearer interpretation and understanding
A Bayesian Network model of pregnancy outcomes for England and Wales (wherearethenumbers.substack.com, 2025-05-27). A Bayesian Network model uses large-scale data to accurately predict pregnancy outcomes in England and Wales, offering a decision-aid tool for clinicians and pregnant women for risk assessment throughout pregnancy
Continuous diff in diff (causalinf.substack.com, 2025-05-31). Scott Cunningham discusses the rapid rise of continuous diff-in-diff methods, highlighting Brant Callaway's R package, 'contdid,' and the technique's role in causal inference, particularly focusing on continuous treatments
AI & Modern R Applications
The Modern R Stack for Production AI (blog.stephenturner.us, 2025-06-02). R is now a strong contender in AI development, leveraging tools like ellmer, ollamar, gander, and ragnar for LLM interactions, NLP, and retrieval-augmented generation, revitalizing its role in modern data science
Computer vision with LLMs in R (posit.co, 2025-05-30). Explore computer vision applications using R and LLMs, enabling rapid development, data insights, and streamlined sharing through RStudio, Jupyter, and secure package management for Python and R environments
Natural language data science with RStudio and Databricks (posit.co, 2025-05-30). Integrate RStudio and Databricks for natural language data science, enabling centralized management, secure package repositories, rapid app sharing, and faster report delivery with Python, R, and Shiny applications
Academic & Research Articles
Not your everyday meta-analysis: venturing into mixed methods, machine learning and expert elicitation (aliceinstatisticsland.wordpress.com, 2025-05-30). Associate Professor Samantha Low-Choy presented innovative meta-analysis techniques, including mixed methods, multiverse analysis, and expert elicitation using bespoke Bayesian models during a talk at RSFAS in Canberra
a novel discrepancy measure (xianblog.wordpress.com, 2025-05-31). EJ Wagenmakers and Raoul Grasman propose a novel discrepancy measure, D_EP, providing a symmetric alternative to Kullback-Leibler divergence for comparing distributions, emphasizing Bayesian approaches without requiring absolute continuity
Ordinal regression for meta-analysis of test accuracy: a flexible approach for utilising all threshold data (arxiv:stat, 2025-05-29). Ordinal regression models enhance meta-analysis of test accuracy by utilizing all threshold data, providing summary estimates through a flexible framework. Implemented in the MetaOrdDTA R package, it supports MCMC summaries and meta-regression
About Data Scientist (with R)
Our Data Scientist newsletter covers the latest developments, packages, techniques, and insights in R programming and data science. Each week, we curate the most important content from your favourite R blogs so you don't have to spend hours searching.
Whether you're a beginner or expert in data science with R, our newsletter provides valuable information to keep you informed.