diff --git a/.gitignore b/.gitignore index 058dd6c2..d17e5544 100644 --- a/.gitignore +++ b/.gitignore @@ -1,3 +1,4 @@ _site .DS_Store .Rhistory +.Rproj.user diff --git a/README.html b/README.html deleted file mode 100644 index 99239569..00000000 --- a/README.html +++ /dev/null @@ -1,102 +0,0 @@ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
- - - - -
-

DSS Community Site

-

Since the beginning of the Data Science Specialization we’ve noticed the unbelievable passion students have about our courses and the generosity they show toward each other on the course forums. A couple students have created quality content around the subjects we discuss, and many of these materials are so good we feel that they should be shared with all of our students. This site is meant to serve as a central directory for community created content.

-
-

Contributing

-

If you’ve created a web page, video, sideshow, or any other kind of media you think should be shared through this directory you should:

-
  1. Fork this repository.
  2. Add a link to your content on the appropriate course page.
  3. Commit your changes.
  4. Submit a pull request.
-

We’ve created a sample pull request to show you what we would like to see in a pull request. If we think your creation is well made, informative, and adds something new to this repository of content then we’ll merge your request and add you to our list of contributors. If you happen to notice any inaccuracies or idiosyncrasies on this site or in this site’s content, please let us know by opening an issue.

-

If you are not the author of the content you are submitting you are welcome to add your link to the Curated Knowledge page. We’ve created this page specifically so that you can share data science resources that you’ve found useful.

-

Otherwise if you are the author of the content you’re submitting you should ask yourself the following questions:

-
  1. Does my contribution teach?
  2. Does the content of my contribution clearly address topics in the Data Science Specialization?
  3. Could my contribution be seamlessly integrated into the canonical course materials?
-

If you’re on the fence about any of these, err on the side of sending a pull request!

-
-
- - -
- - - - - - - - diff --git a/README.md b/README.md index 9c326a21..83b0568d 100644 --- a/README.md +++ b/README.md @@ -4,7 +4,7 @@ Since the beginning of the Data Science Specialization we've noticed the unbelie ## Contributing -If you've created a web page, video, sideshow, or any other kind of media you think should be shared through this directory you should: +If you've created a web page, video, slideshow, or any other kind of media you think should be shared through this directory you should: 1. Fork this repository. 2. Add a link to your content on the appropriate course page. diff --git a/about.md b/about.md index 1685c410..37ecc9da 100644 --- a/about.md +++ b/about.md @@ -19,7 +19,10 @@ The [Data Science Specialization](https://www.coursera.org/specialization/jhudat - [Kevin Markham](http://www.dataschool.io/) - Derek Franks - David Hood +- [Leonard Greski](https://github.com/lgreski) - Michael Sachs - Allan Inocêncio de Souza Costa - [stepds](https://github.com/stepds) -- Bastiaan Quast \ No newline at end of file +- Bastiaan Quast +- [Xing Su](http://sux13.github.io/DataScienceSpCourseNotes/) +- [Edmund julian Ofilada](https://github.com/DocOfi) diff --git a/capstone.md b/capstone.md new file mode 100644 index 00000000..6285e422 --- /dev/null +++ b/capstone.md @@ -0,0 +1,14 @@ +--- +title: "Capstone" +permalink: /capstone/ +layout: page +--- +## Reference Material + +- [Speech and Language Processing, 3rd Edition](https://web.stanford.edu/~jurafsky/slp3/) Working version of the Jurafsky et al. book on natural language processing, whose content on n-grams is helpful for the capstone. + +## Course Project + +- [n-gram Computations and Computer Capacity](http://bit.ly/2couvxh) Explains the amount of memory required to convert the text files for the course project into n-grams, using the quanteda package. +- [Capstone Strategy](http://bit.ly/2rGcgc6) Describes a general strategy to get through the Capstone: use the simplest approaches possible. 
+- [Choosing a Text Analysis Package](http://bit.ly/2qagsPa) Reviews pros and cons of various R packages used for natural language processing, in the context of requirements for the Capstone project. diff --git a/curated.md b/curated.md index 67da2130..8c806fd8 100644 --- a/curated.md +++ b/curated.md @@ -6,20 +6,48 @@ permalink: /curated/ ### Analytics +- [Huge Trello Board Collection of Data Science Resources](https://trello.com/b/rbpEfMld/data-science) +- [Diving Into Data Science Flipboard](https://flipboard.com/@thiakx/diving-into-data-science-5823ectuy) - [OLAP Operation in R](http://architects.dzone.com/articles/olap-operation-r) - [Journal of Statistical Software: Tidy data](http://www.jstatsoft.org/v59/i10/paper) +- [Verzani: simpleR – Using R for Introductory Statistics](http://cran.r-project.org/doc/contrib/Verzani-SimpleR.pdf) +- [Data Visualization packages](http://www.datavis.ca/R/) +- [Visualization hints: plotting numeric data by groups](http://www.r-bloggers.com/visualization-series-insight-from-cleveland-and-tufte-on-plotting-numeric-data-by-groups/) +- [Matrix rotation for image and contour plots in R](http://blog.snap.uaf.edu/2012/06/08/matrix-rotation-for-image-and-contour-plots-in-r/) +- [Fig Data: 11 Tips on How to Handle Big Data in R (and 1 Bad Pun)](http://theodi.org/blog/fig-data-11-tips-how-handle-big-data-r-and-1-bad-pun) +- [Data from 538](https://github.com/fivethirtyeight/data) +- [Getting started with python notebook](https://medium.com/@adhira_deo/the-environment-for-building-machine-learning-models-a1552116b355) ### Command Line - [explainshell.com - match command-line arguments to their help text](http://explainshell.com/) - [The Command Line Crash Course - Quick course in using the command line](http://cli.learncodethehardway.org/book/) +- [Mastering the command line, in one page](https://github.com/jlevy/the-art-of-command-line/blob/master/README.md) ### R - [Try R](http://tryr.codeschool.com/) +- [The R Book by Michael J. 
Crawley](https://archive.org/details/TheRBook/) +- [Univ. of Calif. Riverside R Programming](http://manuals.bioinformatics.ucr.edu/home/programming-in-r#TOC-R-Basics) +- [G. Sanchez - Strings in R](http://gastonsanchez.com/Handling_and_Processing_Strings_in_R.pdf) - [The Lubridate Package](http://www.jstatsoft.org/v40/i03/paper) - [Google Developers R Programming Video Lectures](http://www.r-bloggers.com/google-developers-r-programming-video-lectures/) +- [awesome R](https://github.com/qinwf/awesome-R) - A curated list of awesome R frameworks, packages and software. +- [awesome machine learning](https://github.com/josephmisiti/awesome-machine-learning#r) - A curated list of awesome Machine Learning frameworks, libraries and software. +- [Google's R Style Guide](https://google-styleguide.googlecode.com/svn/trunk/Rguide.xml) +- [Tufte-style HTML in rmarkdown](http://sachsmc.github.io/tufterhandout/) +- [Creating an R Package](http://hilaryparker.com/2014/04/29/writing-an-r-package-from-scratch/) +- [R Packages (Hadley online book)](http://r-pkgs.had.co.nz/) - How to write your own R packages. 
+- [Beautiful ggplot2 Cheatsheet](http://zevross.com/blog/2014/08/04/beautiful-plotting-in-r-a-ggplot2-cheatsheet-3/) +- [Intro to Graphics](http://bcb.dfci.harvard.edu/~aedin/courses/Bioconductor/2.Plotting.pdf) +- [data.table cheat sheet](https://s3.amazonaws.com/assets.datacamp.com/img/blog/data+table+cheat+sheet.pdf) +- [Exploratory Data Analysis with data.table](http://varianceexplained.org/RData/lessons/lesson4/) +- [Fast summary statistics in R with data.table](http://blog.yhathq.com/posts/fast-summary-statistics-with-data-dot-table.html) +- [R online in r-fiddle.org](http://www.r-fiddle.org/) +### Probability and Statistics + +- [Probability and Statistics Cookbook](http://matthias.vallentin.net/probability-and-statistics-cookbook/) ### GitHub @@ -28,9 +56,16 @@ permalink: /curated/ - [Git Immersion - A guided tour through the fundamentals of Git](http://gitimmersion.com/) - [GitHub - Dealing with Multiple Accounts](http://hmkcode.com/git-tutorial/how-to-deal-with-multiple-github-accounts-on-one-computer/) - [Try Git](https://try.github.io/levels/1/challenges/1) +- [Learn Git Branching: Interactive Game](http://pcottle.github.com/learnGitBranching/) +- [Atlassian Git Tutorials - Branches](https://www.atlassian.com/git/tutorials/using-branches/) ### Reproducible Research - [Markdown live demo](http://markdown-here.com/livedemo.html) +- [Boosting Slides by Ron Meir](https://github.com/Aratinga/Misc/blob/master/BoostingTutorial.pdf) +- [Reproducible Research website](http://reproducibleresearch.net/) + +### Machine Learning +- [UC Irvine Machine Learning Data Repository](http://archive.ics.uci.edu/ml/) ### Textbooks - [OpenIntro textbook](https://www.openintro.org/stat/textbook.php) diff --git a/ddp.md b/ddp.md index b3d009e2..0af67104 100644 --- a/ddp.md +++ b/ddp.md @@ -5,6 +5,21 @@ permalink: /ddp/ --- - [Slidify to Github walkthrough](http://rpubs.com/thoughtfulbloke/25103) -- [ggvis and rmarkdown slides with interactive 
plots](http://qua.st/ggvis-shiny-html5-slides/) +- [ggvis and rmarkdown slides with interactive plots](http://qua.st/ggvis-shiny-html5-slides) + +## Shiny +- Choropleth of PBS WARN Distribution of Wireless Emergency Alerts + - [Code for Shiny App](https://github.com/amsilvr/shiny_choropleth) + - [App running on shinyapps.io](https://silverman.shinyapps.io/warn_wea/) - [Shiny app to simulate 401K growth with interactive plots](http://www.mephistosoftware.com/shiny/401k_simulator/) - [Shiny Video Tutorials Playlist on Youtube](http://www.youtube.com/playlist?list=PL6wLL_RojB5xNOhe2OTSd-DPkMLVY9DfB) +- [Tutorial on writing Shiny simulation apps](https://github.com/homerhanumat/shinyTutorials) +- [Dockerize a Shiny App](http://www.rmining.net/2015/04/30/dockerizing-a-shiny-app/) +- [Git pushing Shiny Apps with Docker/Dokku](http://www.rmining.net/2015/05/11/git-pushing-shiny-apps-with-docker-dokku/) +- [Share your Shiny Apps with Docker and Kitematic](http://www.rmining.net/2015/08/10/share-your-shiny-apps-with-docker-and-kitematic/) +- [Shinyapps.io: Configuring Application Timeout](https://github.com/lgreski/datasciencectacontent/blob/master/markdown/dataProd-shinyTimeoutConfig.md) +- [Plotting Natural Disasters](http://www.rpubs.com/DocOfi/367052) + +## Comprehensive Notes + +- Complete notes for [Developing Data Products](http://sux13.github.io/DataScienceSpCourseNotes/) diff --git a/eda.md b/eda.md index 0ba7d760..1f56ac70 100644 --- a/eda.md +++ b/eda.md @@ -3,6 +3,14 @@ layout: page title: Exploratory Data Analysis permalink: /eda/ --- + - [Creating a Kite Graph](http://rpubs.com/thoughtfulbloke/kitegraph) +- [Analyzing Top/Green500 Supercomputer Technology Trends](http://github.com/ww44ss/Exascalar-Analysis-) +- [Emissions Choropleth Maps](https://github.com/BillSeliger/ExData_Plotting2) +- [Data Analysis using Twitter API and Python](http://blog.impiyush.com/2015/03/data-analysis-using-twitter-api-and.html) +- [Exploratory Data Analysis using 
Flexdashboard](http://rpubs.com/DocOfi/350830) +- [Plotting using Metricsgraphics](http://www.rpubs.com/DocOfi/352947) + +## Comprehensive Notes -- [Analyzing Top/Green500 Supercomputer Technology Trends](http://github.com/ww44ss/Exascalar-Analysis-) \ No newline at end of file +- Complete notes for [Exploratory Data Analysis](http://sux13.github.io/DataScienceSpCourseNotes/) diff --git a/getclean.md b/getclean.md index 380a0515..deeccc56 100644 --- a/getclean.md +++ b/getclean.md @@ -6,8 +6,22 @@ permalink: /getclean/ - [Subsetting example walkthrough](http://rpubs.com/thoughtfulbloke/subset) - [Apples to Oranges Data Organisation Challenge](https://github.com/thoughtfulbloke/faoexample) -- [dplyr Video Tutorial](https://www.youtube.com/watch?v=jWjqLW-u3hc) and [R Markdown document](http://rpubs.com/justmarkham/dplyr-tutorial): An [update](http://blog.rstudio.org/2014/01/17/introducing-dplyr/) to the plyr package, useful for subsetting, sorting, summarizing, and merging data using a more intuitive syntax than plyr or base R. +- [dplyr introductory tutorial](https://www.youtube.com/watch?v=jWjqLW-u3hc) and [R Markdown document](http://rpubs.com/justmarkham/dplyr-tutorial): A 39-minute video tutorial that covers the five basic dplyr "verbs" and a dozen other dplyr functions. dplyr is an [update](http://blog.rstudio.org/2014/01/17/introducing-dplyr/) to the plyr package, useful for subsetting, sorting, summarizing, and merging data using a more intuitive syntax than plyr or base R. +- [dplyr "going deeper" tutorial](https://www.youtube.com/watch?v=2mh1PqfsXVI) and [R Markdown document](http://rpubs.com/justmarkham/dplyr-tutorial-part-2): A 37-minute video tutorial that covers the new functionality in dplyr versions 0.3 and 0.4. 
- [Downloading files general advice](http://rpubs.com/thoughtfulbloke/downloadtips) - [Codebook sample](https://gist.github.com/kirstenfrank/218c36a1938055d0f4e4) - [Second Codebook sample](https://gist.github.com/kirstenfrank/699abe3e16fd1dc36e5d) - [Query string (and other fields-within-fields) unrolling](http://rpubs.com/schnee/32988) +- [Pre-processing Excel files before loading them into R](https://github.com/alkashef/cleaningexceldata) +- [Codebook template that can be used in the Getting and Cleaning Data project](https://gist.github.com/JorisSchut/dbc1fc0402f28cad9b41) +- ["Real world" example - reading American Community Survey 2000 PUMS Data:](https://github.com/lgreski/acsexample) Demonstrates how to extract records of a given type from a data file containing multiple record types, and how to use an Excel-based code book to specify arguments for reading a fixed-width file. +- [18 Months of CTA advice](https://thoughtfulbloke.wordpress.com/2015/08/31/hello-world) +- [Common Problems: Quiz 1 - Missing Java Runtime](http://bit.ly/2jjtyXM) Explains how to solve the problem of a missing Java Runtime for the question that requires students to process a Microsoft Excel spreadsheet. +- [Strategy for Reading Files & APIs / Quiz 2](http://bit.ly/2e4L5oF) +- [Common Problems: Quiz 2 - sqldf() driver fails to connect](http://bit.ly/2kD2KTY) +- [Tutorial: Downloading Files](http://bit.ly/2iP2suj) Illustrates various ways of downloading files, including binary and text files. 
+- [Creating dataframes from xml data](https://www.dropbox.com/s/7bbzzp4bwsmfl5y/CreatingDataframesfrom%20XmlFiles.odt?dl=0) + +## Comprehensive Notes + +- Complete notes for [Getting and Cleaning Data](http://sux13.github.io/DataScienceSpCourseNotes/) diff --git a/index.md b/index.md index 6c035e70..761f3e41 100644 --- a/index.md +++ b/index.md @@ -4,7 +4,7 @@ layout: page ## Table of Contents -This is site is meant to serve as a directory for the amazing content the +This site is meant to serve as a directory for the amazing content the community has created around the Data Science Specialization. If you are interested in contributing [click here](https://github.com/DataScienceSpecialization/DataScienceSpecialization.github.io#contributing). @@ -17,6 +17,7 @@ interested in contributing [click here](https://github.com/DataScienceSpecializa 7. [Regression Models](/regmod/) 8. [Practical Machine Learning](/pml/) 9. [Developing Data Products](/ddp/) +10. [Capstone](/capstone/) - [Other Resources](/other/) - [Curated Pages](/curated/) diff --git a/other.md b/other.md index bb521d05..ddb49135 100644 --- a/other.md +++ b/other.md @@ -7,9 +7,11 @@ permalink: /other/ ## Configuring R and RStudio (Linux) - [Installing xlsx and XML packages on Debian Wheezy](http://allanino.me/blog/programming/installing-some-r-packages/) +- [Rscript to customize R environment](http://bit.ly/r-customize-script) - Installs packages used in the specialization. 
- [Installing Some Basic R Packages in Ubuntu; Ibrahim El Merehbi](http://elmerehbi.wordpress.com/2014/09/09/installing-some-basic-r-packages-in-ubuntu) - [Using Projects in RStudio](https://support.rstudio.com/hc/en-us/articles/200526207-Using-Projects) - [Using Version Control with RStudio](https://support.rstudio.com/hc/en-us/articles/200532077-Version-Control-with-Git-and-SVN) +- [Using R behind HTTP/HTTPS Proxy](https://support.rstudio.com/hc/en-us/articles/200488488-Configuring-R-to-Use-an-HTTP-or-HTTPS-Proxy) ### Ignoring R & RStudio files - [gitignore template for R](https://github.com/github/gitignore/blob/master/R.gitignore) (source:[gitignore](https://github.com/github/gitignore)) @@ -21,8 +23,11 @@ permalink: /other/ ## Pre-built virtual machines for R development. - [Here's a pre-built lightweight Linux machine with R and RStudio already installed](https://github.com/queirozfcom/r-box). You just need to install [vagrant](https://www.vagrantup.com/downloads.html), download (or clone) the github repository and you'll get a clean ubuntu machine with the tools you'll need for the assignments. -- [Data Science Appliance](http://datascienceappliance.com/) - A perfectly provisioned virtual machine for data scientists. - - [Data Science Toolbox](http://datasciencetoolbox.org/) - A virtual environment that allows you to start doing data science in a matter of minutes. 
-- [Virtual machine with RStudio server and github setup](https://github.com/tboloo/vagrant-rstudio) - A VirtualBox, Vagrant & chef-solo managed virtual machine which provides RStudio server with git & github setup \ No newline at end of file +- [Virtual machine with RStudio server and github setup](https://github.com/tboloo/vagrant-rstudio) - A VirtualBox, Vagrant & chef-solo managed virtual machine which provides RStudio server with git & github setup + +## Deploying and sharing Shiny Apps with Docker +- [Dockerize a Shiny App](http://www.rmining.net/2015/04/30/dockerizing-a-shiny-app/) +- [Git pushing Shiny Apps with Docker/Dokku](http://www.rmining.net/2015/05/11/git-pushing-shiny-apps-with-docker-dokku/) +- [Share your Shiny Apps with Docker and Kitematic](http://www.rmining.net/2015/08/10/share-your-shiny-apps-with-docker-and-kitematic/) diff --git a/pml.md b/pml.md index dd7974f2..1054002d 100644 --- a/pml.md +++ b/pml.md @@ -11,8 +11,27 @@ permalink: /pml/ ## Supplementary Videos +- [What is machine learning, and how does it work?](https://www.youtube.com/watch?v=elojMnjn4kk): A high-level overview of machine learning in a 10-minute video - [Video lectures from "An Introduction to Statistical Learning"](http://www.dataschool.io/15-hours-of-expert-machine-learning-videos/): Videos for Chapters 4, 5, 6, 8, and 10 can help to deepen your understanding of the topics presented in this course. ## Machine Learning Competitions - [Participating in Kaggle's Allstate Purchase Prediction Challenge](http://www.dataschool.io/kaggle-allstate-purchase-prediction-challenge/): Description of what it's like to compete in a Kaggle competition, including links to a project paper, R code, presentation slides, and a presentation video. 
+ +## Choosing a Machine Learning Model + +- [Comparing Supervised Learning Algorithms](http://www.dataschool.io/comparing-supervised-learning-algorithms/): Comparing 8 common supervised learning algorithms (for regression and classification) on 13 different dimensions. + +## Content Related to the Lectures + +- Complete notes for [Practical Machine Learning](http://sux13.github.io/DataScienceSpCourseNotes/) +- [Week 4: Combining Predictors -- Math Explained](https://github.com/lgreski/datasciencectacontent/blob/master/markdown/pml-combiningPredictorsBinomial.md) + +## Configuring Github Pages with RStudio for PML Project + +- Step by step instructions to [Configure Github Pages with RStudio](https://github.com/lgreski/datasciencectacontent/blob/master/markdown/pml-ghPagesSetup.md) to support the PML course project. + +## Improving Runtime Performance of Caret + +- Step by step instructions to [implement parallel processing in caret::train()](https://github.com/lgreski/datasciencectacontent/blob/master/markdown/pml-randomForestPerformance.md) on a random forest model, along with runtime performance analysis for a variety of laptops, ranging from an Intel Atom-based tablet to a quad-core i7 processor. + diff --git a/regmod.md b/regmod.md index a18c17c7..1445c83d 100644 --- a/regmod.md +++ b/regmod.md @@ -7,3 +7,7 @@ permalink: /regmod/ ## Supplementary Videos - [Video lectures from "An Introduction to Statistical Learning"](http://www.dataschool.io/15-hours-of-expert-machine-learning-videos/): Videos for Chapter 3 can help to deepen your understanding of regression. 
+ +## Comprehensive Notes + +- Complete notes for [Regression Models](http://sux13.github.io/DataScienceSpCourseNotes/) diff --git a/repres.md b/repres.md index ead7eac4..cba776f9 100644 --- a/repres.md +++ b/repres.md @@ -7,3 +7,10 @@ permalink: /repres/ - [Turning a RPubs document into a Github website walkthrough](https://github.com/thoughtfulbloke/appleorange) - [Introduction to knitr with rmarkdown](https://sachsmc.github.io/knit-git-markr-guide/knitr/knit.html) - [Trends and severity of Data Breaches](http://rpubs.com/ww44ss/29389) +- [Benefit-cost analysis of a park user fee](https://rstudio-pubs-static.s3.amazonaws.com/72135_dc45211d976842c2a9a8c8b5f2472ff0.html) +- [Data Lake Integrity](http://rpubs.com/rshane/81297) +- [ProjectTemplate in RStudio with Git](http://padamson.github.io/r/rstudio/projecttemplate/git/2016/01/17/projecttemplate-in-rstudio-with-git.html) + +## Comprehensive Notes + +- Complete notes for [Reproducible Research](http://sux13.github.io/DataScienceSpCourseNotes/) diff --git a/rprog.md b/rprog.md index 082f8ff5..47df54d1 100644 --- a/rprog.md +++ b/rprog.md @@ -1,20 +1,57 @@ --- -layout: page -title: R Programming +title: "R Programming" permalink: /rprog/ +layout: page --- +## Getting Started +- [Resources for R Programming](http://bit.ly/2dhZ8Dy) +- [References for R Programming](http://bit.ly/2b8AxhF) +- [Data Science Specialization Value Proposition](http://bit.ly/2j3EcCn) +- [R Onboarding for SAS Users](http://bit.ly/2dr7yum) + ## Programming Assignments +- [Strategy for Coding the Programming Assignments](http://bit.ly/2ddFh9A) - [Tutorial for those struggling with Programming Assignment 1](https://github.com/derekfranks/practice_assignment) +- [Breaking Down pollutantmean](http://bit.ly/2cHyiCl) +- [Assignment 1: A More Elegant Solution](http://bit.ly/2kwBBlK) +- [A SAS Version of pollutantmean?](http://bit.ly/2d3DR4e) +- [Tutorial for those struggling with Programming Assignment 
2](https://github.com/DanieleP/PA2-clarifying_instructions) +- [Tutorial for those struggling with Programming Assignment 3](https://github.com/DanieleP/PA3-tutorial) - [PA1-test: `testthat`, Unit Tests for Programming Assignment 1](https://github.com/cbryant1000/pa1test) - [PA3-test: `testthat`, Unit Tests for Programming Assignment 3](https://github.com/cbryant1000/pa3test) +- [Alternative submit script for Programming Assignment 1 that makes submitting more convenient by allowing selection of multiple parts plus prompting if user wants to submit another part before exiting](https://github.com/rchampoux/coursera/blob/master/rprog-scripts-submitscript1.R) +- [Grading the SHA-1 Hash Code](http://bit.ly/2iUWoB6) +- [Assignment 2: Demystifying makeVector](http://bit.ly/2bTXXfq) +- [Assignment 2: makeCacheMatrix as an Object](http://bit.ly/2byUe4e) ## R Language - [Some notes on the R Language](http://lopezrj.github.io) +- [A Data Frame is Also a List](http://bit.ly/2fmMRAp) +- [S Objects, R Objects, and Lexical Scoping](http://bit.ly/2dtOSXi) +- [Common R Mistakes: Overwriting Functions with Data Objects](http://bit.ly/2i3gmoA) +- [Forms of the Extract Operator](http://bit.ly/2bzLYTL) +- [Functions to Sort Data Frames](http://bit.ly/2dxItzw) +- [Creative Use of R: Downloading Course Lectures](http://bit.ly/2bGlI7R) Article illustrating how to use R to automate the download of lectures from *Data Science Specialization* courses, such as *R Programming*. Techniques used in this article are helpful to make research reproducible, as required for courses like *Getting and Cleaning Data* and *Reproducible Research*. +- [Lexical Scoping and Statistical Computing](http://bit.ly/2cmqAPy) Article by Robert Gentleman and Ross Ihaka at the University of Auckland describing how lexical scoping works, and why it is valuable in statistical computing. 
+- [Data Science Job Report 2017: R Passes SAS, But Python Leaves Them Both Behind](http://bit.ly/2oCHulX) Bob Muenchen's take on the job market for various data science languages. + + ## R language cheatsheet + - [R cheatsheet covering all lectures](https://github.com/startupjing/Tech_Notes/blob/master/R/R_language.md) +## R and Commercial Statistics Packages + +- [R Onboarding for SAS Users](http://bit.ly/2dr7yum) Provides an overview and links to a variety of resources to help people with SAS experience make the transition to R +- [Commercial Statistics Packages: An Historical Perspective](http://bit.ly/2fPj2qN) +- [Why is R More Difficult than SAS?](http://bit.ly/2erxk3A) +- [Thinking in R versus Thinking in SAS](http://bit.ly/2cH3u8x) + +## Comprehensive Notes + +- Complete notes for [R Programming](http://sux13.github.io/DataScienceSpCourseNotes/) diff --git a/statinf.md b/statinf.md index 51014339..19592a27 100644 --- a/statinf.md +++ b/statinf.md @@ -3,3 +3,20 @@ layout: page title: Statistical Inference permalink: /statinf/ --- + +- [Why degrees of freedom decrease for sample variance](https://github.com/Manu58/bias/blob/master/bias.pdf) +- [CONCEPTS: Calculating Area for a Point on the Normal Curve](http://bit.ly/2hw5AMF) Reviews the mathematics that explain why one cannot calculate the exact probability for a specific value within a distribution for a continuous variable, and illustrates how to calculate a quantile for a point on the curve. 
+- [Analysis of exponential distribution of births data set from the CDC](https://gist.github.com/ProgramErgoSum/5316008387746fcd84de) +- [Exponential Distribution / Central Limit Theorem - Assignment Checklist](https://github.com/lgreski/datasciencectacontent/blob/master/markdown/statinf-expDistChecklist.md) +- [ToothGrowth Analysis - Assignment Checklist](https://github.com/lgreski/datasciencectacontent/blob/master/markdown/ToothGrowthChecklist.md) +- [Exploratory Data Analysis in ToothGrowth Assignment](https://github.com/lgreski/datasciencectacontent/blob/master/markdown/edaInToothGrowthAnalysis.md), explaining the exploratory data analysis requirement for students who have not taken the *Exploratory Data Analysis* course prior to taking *Statistical Inference*. +- [Using MathJax with Discussion Forums, R Markdown, and Github Pages](https://github.com/lgreski/datasciencectacontent/blob/master/markdown/mathjaxWithGithubMarkdown.md) +- [Kable Tables with Data Frames](https://github.com/lgreski/datasciencectacontent/blob/master/markdown/kableDataFrameTable.md) illustrates how to display a custom table in a `knitr()` document by creating a data frame to contain the information to be rendered with `kable()`. 
+- [Interactive Confidence Interval Visualization](https://github.com/amcadie/interactive_CI) +- [Installing MiKTeX on Windows 10 / Generate a PDF from knitr](https://github.com/lgreski/datasciencectacontent/blob/master/markdown/statinf-generatePDF.md) +- [Power calculations: optimal sample size](https://github.com/lgreski/datasciencectacontent/blob/master/markdown/statinf-optimalSampleSize.md) +- [Permutation Tests Explained](https://github.com/lgreski/datasciencectacontent/blob/master/markdown/statinf-permutationTests.md) + +## Comprehensive Notes + +- Complete notes for [Statistical Inference](http://sux13.github.io/DataScienceSpCourseNotes/) diff --git a/toolbox.md b/toolbox.md index 2f9a8134..3c2dfc68 100644 --- a/toolbox.md +++ b/toolbox.md @@ -6,6 +6,8 @@ permalink: /toolbox/ ## Command Line +- [Working with files in Bash](http://edgarsh.es/ins/working-with-files-in-bash/) + ## Git/GitHub - [Git & GitHub Video Playlist](https://www.youtube.com/playlist?list=PL5-da3qGB5IBLMp7LtN8Nc3Efd4hJq0kD) (also available for [download](https://drive.google.com/folderview?id=0BxRfg0msVmAoRlZFQjJ3T3VTOUE&usp=sharing) as mp4 files) @@ -13,3 +15,12 @@ permalink: /toolbox/ - [Understanding the Relationship Between Git and GitHub](http://www.dataschool.io/github-is-just-dropbox-for-git/) - [Simple Guide to GitHub Forks](http://www.dataschool.io/simple-guide-to-forks-in-github-and-git/) - [Github Repo Tutorial How to fork a repo, download it to your local drive and commit changes ](https://www.youtube.com/watch?v=MY94AIplcaU) +- [Configuring RStudio to work with Git / Github - Mac OSX](https://github.com/lgreski/datasciencectacontent/blob/master/markdown/configureRStudioGitOSXVersion.md) +- [Configuring RStudio to work with Git / Github - Windows](https://github.com/lgreski/datasciencectacontent/blob/master/markdown/configureRStudioGitWindowsVersion.md) + +## Comprehensive Notes + +- Complete notes for [The Data Scientist's 
Toolbox](http://sux13.github.io/DataScienceSpCourseNotes/) + +## Miscellaneous +- [Using Editor Modes in Coursera Discussion Forum Posts](https://github.com/lgreski/datasciencectacontent/blob/master/markdown/usingMarkdownInForumPosts.md)