In the flier for this workshop, I say R is:
The lingua franca of statistics
But what does this really mean?
From the Merriam-Webster Online Dictionary (2016-10-19)
- A common language consisting of Italian mixed with French, Spanish, Greek, and Arabic that was formerly spoken in Mediterranean ports
- Any of various languages used as common or commercial tongues among peoples of diverse speech
- Something resembling a common language: "movies are the lingua franca of the twentieth century" (Gore Vidal)
R Is The Common Language of Data Science
Programming Languages
- C
- Java
- JMP
- Mathematica
- MATLAB
- Python
- SAS
- SPSS
- Statistica
- Tableau
Database Vendors
- Hadoop
- Microsoft SQL Server
- Oracle RDBMS
- PostgreSQL
- Vertica
Business Intelligence
- Alteryx
- Microsoft Azure Machine Learning
- Jaspersoft
- Oracle Business Intelligence Enterprise Edition
- Pentaho
- SAP (and SAP HANA)
Packages
An ever expanding collection of professional tools
The Ecosystem Is The Difference
- R is not a tool, it is an ecosystem
- A p-value from SAS is as good (or as useless) as a p-value from R
- The scale of the community and ecosystem is the fundamental difference between legacy tools like SAS and SPSS and modern tools like R and Python
- R isn't just a tool to calculate a statistic, it is a means of communicating scientific methodology
- And ANYONE can use it, anywhere in the world
A Personal Anecdote: Diabetes Self-Management
- This could not have happened if I had published my work in SAS syntax
- Open Source tools, including R, democratize data science