R & Python Interoperability in Data Science Teams | Dave Gruenewald | Data Science Hangout
ADD THE DATA SCIENCE HANGOUT TO YOUR CALENDAR HERE: https://pos.it/dsh - All are welcome! We'd love to see you!
We were recently joined by Dave Gruenewald, Senior Director of Data Science at Centene, to chat about polyglot teams, data science best practices, right-sizing development efforts, and process automation.
In this Hangout, we explore working in a polyglot team and fostering interoperability (a word that Libby loves but struggles to pronounce out loud). Dave Gruenewald emphasizes that teams should use the tools they are comfortable with, whether that's R or Python. His strategies for collaborating across languages include using Quarto to run R and Python code in the same report, and setting up data science checkpoints: saving intermediate outputs in platform-agnostic file formats like Parquet so that any language can read them. REST APIs let R processes be called programmatically from Python (and vice versa), which can be a real game-changer. The newly released nanonext package was also highlighted as a promising development for improved interoperability.
Resources mentioned in the video and Zoom chat:
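For anyone curious what the REST API idea looks like in practice, here is a minimal sketch rather than anything shown in the Hangout. It assumes a hypothetical /summary endpoint that accepts JSON and returns JSON. Both sides are written in Python with only the standard library so the example runs on its own, but because the contract is just HTTP and JSON, the server side could equally be an R process. The names SummaryHandler, call_summary, and demo are made up for illustration.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, Request, build_opener


class SummaryHandler(BaseHTTPRequestHandler):
    """Hypothetical endpoint: POST /summary with {"values": [...]}."""

    def do_POST(self):
        if self.path != "/summary":
            self.send_error(404)
            return
        # Read and decode the JSON request body.
        length = int(self.headers.get("Content-Length", 0))
        values = json.loads(self.rfile.read(length))["values"]
        # Compute a small summary (assumes a non-empty list of numbers).
        result = {"n": len(values), "mean": sum(values) / len(values)}
        body = json.dumps(result).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def call_summary(base_url, values):
    """Client side: any language with an HTTP client could do the same."""
    request = Request(
        base_url + "/summary",
        data=json.dumps({"values": values}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    opener = build_opener(ProxyHandler({}))  # talk to localhost directly
    with opener.open(request, timeout=5) as response:
        return json.loads(response.read())


def demo():
    # Port 0 asks the operating system for any free port.
    server = HTTPServer(("127.0.0.1", 0), SummaryHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        host, port = server.server_address
        return call_summary(f"http://{host}:{port}", [1.0, 2.0, 3.0, 6.0])
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

The caller only needs to know the URL and the JSON shape, not the language behind the endpoint, which is what makes this pattern useful on a mixed R and Python team.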
Posit Conf 2025 Table and Plotnine Contests → https://posit.co/blog/announcing-the-2025-table-and-plotnine-contests/
nanonext 1.7.0 Tidyverse Blog Post → https://www.tidyverse.org/blog/2025/09/nanonext-1-7-0/
If you didn’t join live, one great discussion you missed from the Zoom chat was about pivoting away from academia, including leaving PhD programs. Many attendees shared their personal experiences of making the difficult decision to leave a PhD program. The community suggested alternatives to the term "drop out," such as "pivot," "reallocating your resources," or being a "refugee fleeing academia." Dave Gruenewald shared that he himself left a PhD program but has "no regrets about that." Did you leave a PhD program? You're not alone!
► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu
Follow Us Here:
Website: https://www.posit.co
Hangout: https://pos.it/dsh
LinkedIn: https://www.linkedin.com/company/posit-software
Bluesky: https://bsky.app/profile/posit.co
Thanks for hanging out with us!
Timestamps:
00:00 Introduction
02:21 "What types of data do your teams use?"
06:53 "Which of the three pillars you mentioned is your personal favorite to work on?"
09:26 "How do you avoid or divert scope creep?"
11:41 "How much of the project should be 'planning' before any code happens?"
13:53 "Do you feel like people are just hopping in and going, hey, LLM, make me a POC?"
14:28 "Do you give them what they say they want, or do you give them what they need?"
16:40 "I'm wondering what public data do you wish existed?"
18:48 "Why not Positron yet?"
20:43 "How do you unify as a team and make it so that I can always read everybody else's code?"
23:10 "Could you talk a little bit about how R and Python work together?"
27:28 "How to start package development with a team who are very new to package development."
33:01 "What's your greatest regret career-wise?"
35:53 "What about your biggest wins, specifically in your early career?"
39:40 "How would you recommend building a data science culture and community from scratch?"
41:49 "Would you set a specific timeline for EDA, exploratory analysis, to scope the project better?"
45:15 "How do you define fun projects, and how much time do you allocate for exploration in those?"
48:21 "Does your team use DVC or something similar for data version control?"
50:00 "Can you talk a bit more about your pivot from academia into data science?"
51:31 "Any advice on where to look for opportunities in data science after getting a master's degree?"