Why R?

In the flier for this workshop, I say R is:


The lingua franca of statistics


But what does this really mean?

Definition: Lingua Franca

From the Merriam-Webster Online Dictionary (2016-10-19)

  1. A common language consisting of Italian mixed with French, Spanish, Greek, and Arabic that was formerly spoken in Mediterranean ports
  2. Any of various languages used as common or commercial tongues among peoples of diverse speech
  3. Something resembling a common language "movies are the lingua franca of the twentieth century" — Gore Vidal

R Is The Common Language of Data Science

Programming Languages
  • C
  • Java
  • JMP
  • Mathematica
  • MATLAB
  • Python
  • SAS
  • SPSS
  • Statistica
  • Tableau
Database Vendors
  • Hadoop
  • Microsoft SQL Server
  • Oracle RDBMS
  • PostgreSQL
  • Vertica
Business Intelligence
  • Alteryx
  • Microsoft Azure Machine Learning
  • Jaspersoft
  • Oracle Business Intelligence Enterprise Edition
  • Pentaho
  • SAP (and SAP HANA)

Packages

An ever-expanding collection of professional tools

The Ecosystem Is The Difference

  • R is not a tool, it is an ecosystem
  • A p-value from SAS is just as good (or just as useless) as a p-value from R
  • The scale of the community and ecosystem is the fundamental difference between legacy tools like SAS and SPSS and modern tools like R and Python
  • R isn't just a tool to calculate a statistic, it is a means of communicating scientific methodology
  • And ANYONE can use it, anywhere in the world

A Personal Anecdote:

Diabetes Self Management
  • This could not have happened if I had published my work in SAS syntax
  • Open-source tools, including R, democratize data science