Data Science in Public Affairs
We are compiling resources for faculty and students in public policy working with R. These include pedagogical resources for students doing graduate-level work in MPA and MPP programs, PhD seminars that share data programming resources, and data science methods within the field of public policy that use R.
Example Courses in R at Public Affairs Schools
Arizona State: [ Data Science for the Social Sector ]
Brigham Young: [ Data Science for Public Management ] [ Data Viz ]
Georgetown: Data Science for Public Policy
Syracuse University: Data-Driven Management for Public Organizations
Syracuse iSchool: Applied Data Science
Carnegie Mellon: [ Data Analytics Course ] [ PhD in Machine Learning & PP ]
SUNY Albany: PhD Seminar in Social Network Analysis
Hertie School of Governance: [ Data ] [ Collaboration ] [ Text ]
University of Washington: [ course ]
Carleton University, Canada: [ Big Data & Society ]
Georgia Tech: Elective Workshop in R (Juan Rogers)
Example Programs in Public Policy and Data Analytics
Arizona State University: MS in Policy Analytics and Program Evaluation
University of Southern California: Civic Tech USC
University of Chicago: MS in Computational Science and Public Policy
Examples of Public Policy Faculty Teaching Courses in R
Jeff Chen & Dan Hammer (Georgetown)
Michael Siciliano (University of Illinois Chicago)
Karl Rethemeyer (SUNY Albany)
Juan Rogers (Georgia Tech)
Jesse Lecy (Arizona State University)
Alexandra Chouldechova (Carnegie Mellon)
José Manuel Magallanes (Univ. of Washington)
Data Science in Government
US Digital Service Overview
Inside Obama's Stealth Startup [ link ]
Why I Joined the US Digital Service [ link ]
Five Examples of How Federal Agencies Use Big Data [ link ]
Data Science Training in Government
Datapolitan [ link ]
San Francisco [ link ]
New York [ link ]
Department of Commerce [ link ]
Health and Human Services [ link ]
Open Data for Government
The Data Transparency Act [ overview ] [ link ] [ link ] [ link ]
Project Open Data [ link ] [ principles ]
Open North standards [ link ]
Keynote Speech on Importance of DATA Act [ link ]
40 Brilliant Open Data Projects for Smart Cities [ link ]
US Cities Open Data Census [ link ]
Ben Wellington's TED Talk on Open Data in NYC [ link ]
Background on the Open Data Movement [ link ]
Sunlight Foundation's Open Data Guidelines [ link ]
Global Impact of Open Data Book: GovLab / O'Reilly [ link ]
Progress Tracker on Federal Open Data Compliance [ link ]
How to Make Government Data Sites Better [ link ] [ link ]
Statewide Portal Tested in California [ link ]
Five Largest Cities Now Have Open Data Policies [ link ]
The Hidden Cost (and Benefits) of Open Data [ link ]
Realizing the Promise of Big Data: IBM Center for Gov. [ link ]
Data Used in 2017 Public Policy Dissertations [ link ]
Examples of Good Local Government Portals
Washington DC [ site ] [ shapefiles on github ] [ data community dc ]
Chattanooga Tableau Site [ link ]
Artificial Intelligence’s Impact on Government
AI to Transform Government [ link ]
Brookings Center report on automation [ link ]
Developing AI for federal government [ link ]
Examples of Data-Driven Journalism Project Portals
BBC creates graphics cookbook [ link ] [ cookbook ]
Buzzfeed [ all projects on GitHub ]
LA Times [ datadesk on GitHub ]
Washington Post [ projects on GitHub ]
Associated Press [ GitHub ] [ project template ] [ example ]
The Economist [ GitHub ]
Useful Data APIs
Awesome Public Datasets Page [ GitHub ]
Quandl API (many data sources) [ link ] [ r package ]
Census Data API [ acs package ] [ census api ]
Fun Data for Teaching [ link ]
Forbes: 35 Open Data Sources of Note [ link ]
100 Interesting Datasets [ link ]
TwitteR Package API [ link ]
19 Free Public Datasets (Springboard blog) [ link ]
ckanr [ github ] [ vignette ]
RSocrata [ github ]
censusapi Package [ github ] [ slides ] [ tutorial ]
@unitedstates [ about ] [ github ]
Data USA [ link ] [ documentation ]
Data Science Toolkit [ link ] [ rpackage ]
Federal Government APIs [ link ]
Strava GPS Data of Athletes by City [ blog ]
Performance Management Tools
Visualization
Compendium of Clean Graphs in R [ link ]
The Data Viz Project [ link ]
Gallery of ggplot geoms [ link ]
Creating More Effective Graphs [ book ] [ gallery ]
Data + Design: Ebook On Data [ pdf ]
An Economist's Guide to Visualizing Data [ pdf ]
Visuals for Teaching Statistics [ link ] [ link ]
Bl.ocks.org Graphics Gallery [ link ]
Help Me Viz Graphics Gallery [ link ]
What Makes a Map Beautiful? [ link ]
Tableau: Which Chart or Graph Is Right for You? [ link ]
Flowing Data [ link ]
Graphics in R Tutorial [ FlowingData ]
ChartsNThings: A Blog by the NYT Graphics Dept [ link ]
Data Viz Syllabus by Quealy & Carter [ link ]
Junk Charts: Blog on Making Graphics Better [ link ]
Primer on Making Great Graphs in R [ download ]
10 Tips for Making R Graphics Look Good [ link ]
Data USA [ link ]
Citi Bike Data Visualized [ link ]
Arms Sales Visualized [ link ]
Pedestrian & Routes in US Cities Visualized [ link ] & Europe [ link ]
Winners of Infographic Awards [ link ]
Visual Essays [ link ]
Bad Graphs
How to Display Data Badly [ link ]
Clowns [ link ]
Worst of 2017 [ link ]
More Worst [ link ]
Calling Bullshit [ Misleading Axes ] [ Proportional Ink ]
Label Your Axes [ link ]
Pie Charts [ link ] [ link ]
Foreign Aid as Missile Attacks [ link ]
Dashboard Design
R Shiny Showcase [ link ]
R Shiny Widgets Gallery [ link ]
Nonprofit Dashboard Design [ webinar ] [ slides ]
Tableau: 6 Best Practices of Effective Dashboards [ download ]
Dashboard Examples
Pittsburgh Building Permits [ link ]
Government Performance in Chattanooga [ link ]
Fundraising Dashboard in R [ link ]
DataUSA [ link ]
Census Reporter [ link ]
Teacher Dashboard on Student Performance [ link ]
Vehicle collisions in Edinburgh [ link ]
Traffic accidents in London [ link ]
Life Expectancy Charts [ link ] [ link ]
Rise of Inequality [ link ]
World Development Indicators [ link ]
Demographics in Catalonia, Spain [ link ]
Tableau Gallery [ link ]
Predictive Analytics Models
Food Inspection Forecasting: case study on predictive analytics for food violations in Chicago [ link ]
Optimizing Infrastructure Repair [ measurement ] [ model ] [ news ]
Pretrial Criminal Risk Assessment for Judges [ link ]
Predicting Fire Hazards [ link ] [ model ]
Why the Bronx Really Burned - Predictive Analytics Fail [ link ]
Use Machine Learning to Predict Infrastructure Failure [ link ]
Using Prediction to Prioritize Water Infrastructure Maintenance [ link ]
Using RFIDs to Regulate Marijuana Distribution in Colorado [ link ]
Crowd-Sourced Solutions [ about DrivenData ] [ current competitions ]
State and National Presidential Poll Aggregation [ link ]
Open Innovation
The Data-Driven Justice Initiative [ link ]
Next Stage in the Open Data Movement [ link ]
Challenge.gov: Using Competitions to Spur Innovation [ link ]
Data for Democracy [ link ]
Text Analysis Tools
Quanteda [ link ]
Who Wrote the Anonymous Op-Ed? [ link ] [ link ]
R for Open Source Data Analytics
Open source data programming languages have evolved rapidly and are quickly becoming the industry standard for data scientists. Public affairs programs are adopting these technologies because they are free. A language like R can perform statistical analysis, dynamic reporting, GIS, analysis for qualitative research, and other functions, so it can replace several expensive software licenses. That makes it a good choice for public sector and nonprofit organizations that don't have large technology budgets. It also lends itself to open innovation: analytical solutions to public sector problems can be easily shared and adopted across localities, which encourages collaboration and supports an ecosystem of performance.
Job Growth for R Skills [ link ] [ current positions & here ] [ blog ]
Learning R
Tutorials and Curricula
Resources for Learning R [ link ]
Courses in Data Programming [ link ] [ link ]
Spatial Analysis (GIS) [ link ]
Network Analysis [ link ]
Collect Social Media Data [ link ]
Online Courses
Coursera: R Programming
Datacamp: Free Introduction to R
Code School: Try R
Graphics in R: FlowingData
Learn R in R: Swirl and Swirl Course List
Useful Cheat Sheets and References
R Style Guides [ Google's ] [ Hadley Wickham's ] [ datacamp ]
R Cheat Sheet Library [ link ]
Short Reference Card [ link ] [ link ]
Project Management Guide [ download ] [ link ]
GitHub is Going Mainstream [ link ]
Data Science Toolkit [ link ] [ rpackage ]
Recommended Textbooks
Kabacoff (2015), R in Action [ github ]
Teetor (2011), R Cookbook
Chang (2013), The R Graphics Cookbook [ github ]
Matloff (2012), The Art of R Programming [ github ]
Spector (2008), Data Manipulation with R [ github ]
Stanton (2013), Introduction to Data Science [ free download ]
Wickham (2015), Advanced R [ free online ]
Schwarzer et al. (2015), Meta-Analysis with R
Chen & Peace (2013), Applied Meta-Analysis with R
Communities
R Open Science [ link ]
Blogs and Listservs
R-Bloggers [ link ]
R Weekly [ link ]
Stack Overflow [ link ]
Flowing Data [ link ]
Data Science Podcasts
Data Points by GovEx [ link ]
Partially Derivative [ link ]
DMV Nation [ link ]
Becoming a Data Scientist [ link ]
Data Stories [ link ]
Talking Machines [ link ]
Not So Standard Deviations [ link ]
Data Skeptic [ link ]
More Or Less [ link ]
Linear Digressions [ link ]
R-Podcast [ link ]
Data Journalists, Bloggers & Civic Groups
Trend CT [ link ] [ github ] [ style guide ]
Todd Schneider [ blog ] [ github ]
I Quant NY [ blog ]
ChartsNThings: A Blog by the NYT Graphics Dept [ link ]
Data for Democracy [ link ]
Jobs
Fellowships:
Data Incubator [ fellowship overview ]
Data Science for the Public Good [ link ]
Job Boards:
Indeed + “R Statistics” [ link ]
R Users [ link ]
Zip Recruiter [ link ]