2014-08-27

Creative Commons License

Outline

  • Preface
  • HTML Documents
  • Markdown Basics
  • R Code Chunks
  • Deployment
  • Miscellaneous
  • Appendix
    • R Presentation
    • ioslides

Preface

It happens every day

  1. Data preparation
  2. Modeling
  3. Generating the report
  4. Something is wrong in your data
  5. Repeat steps 1 to 4

Local data scientists need a better tool!

Research Pipeline

R Markdown

A convenient tool for generating reproducible documents.

  • Markdown
    • Removes HTML tags for better readability.
    • Inline HTML is available.
  • R Markdown
    • Markdown + embedded R code chunks
    • Rmd -> md -> html (or docx, pdf)
  • Why R Markdown
    • Consolidates your code and document into a single file.
    • Easy to put under version control.
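
The Rmd -> md step can be tried directly from the console. The sketch below is only an illustration and not part of the original slides: the file name pipeline.Rmd is made up, and the file is written to a temporary directory. knitr::knit runs the chunk and writes plain Markdown; turning that Markdown into HTML, docx, or PDF is then left to pandoc.

```r
library(knitr)

fence <- strrep("`", 3)  # chunk fence, built here to avoid typing it literally

# A two-part R Markdown source: a heading and one R chunk.
rmd <- file.path(tempdir(), "pipeline.Rmd")  # hypothetical file name
writeLines(c(
  "# Speed and distance",
  "",
  paste0(fence, "{r}"),
  "mean(cars$dist)",
  fence
), rmd)

# Rmd -> md: knit() evaluates the chunk and returns the path of the .md file.
md <- knit(rmd, output = file.path(tempdir(), "pipeline.md"), quiet = TRUE)
cat(readLines(md), sep = "\n")
```

The printed Markdown contains both the chunk source and its result, which is what the later conversion step works from.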

Version

  • v1: based on package knitr and markdown
  • v2: based on knitr and pandoc, integrated into RStudio v0.98.932 or later
    • default bootstrap template
    • docx and pdf
    • slides with ioslides and Beamer
    • embedded shiny apps
  • To keep using R Markdown v1, add this comment to the document:
<!-- rmarkdown v1 -->

Installing R Markdown

Overview

  • YAML Metadata: YAML document options
  • Markdown: Article text
  • R Code Chunk: Executable R code

Rendering Output

  • RStudio: "Knit" command (Ctrl+Shift+K)
  • Command line: rmarkdown::render function

    rmarkdown::render("input.Rmd")
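
A slightly fuller sketch of the command-line route follows. It is an assumption-laden example rather than the tutorial's own code: it creates a throwaway input.Rmd in a temporary directory and renders it only when both the rmarkdown package and pandoc are available on the machine.

```r
fence <- strrep("`", 3)  # chunk fence, built here to avoid typing it literally

# Write a minimal input.Rmd into a temporary directory.
rmd_file <- file.path(tempdir(), "input.Rmd")
writeLines(c(
  "---",
  "title: \"Render demo\"",
  "---",
  "",
  paste0(fence, "{r}"),
  "summary(cars$dist)",
  fence
), rmd_file)

# Render only if rmarkdown and pandoc are both installed.
can_render <- requireNamespace("rmarkdown", quietly = TRUE) &&
  rmarkdown::pandoc_available()

if (can_render) {
  # output_format selects the target; render() returns the output file path.
  out_file <- rmarkdown::render(rmd_file,
                                output_format = "html_document",
                                quiet = TRUE)
  print(out_file)
}
```

Passing a different output_format, such as "word_document", should produce the corresponding file type from the same source.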

Let's rock with R Markdown!

ggvis

shiny + googleVis

HTML Documents

Create an HTML Document

Currently, YAML does not work well with Chinese characters on Windows. Avoid Chinese titles if you work on Windows.

Output Option

With RStudio, you can edit various output options through a friendly UI.

---
title: "R Markdown Exercise"
author: "Mansun Kuo"
date: "July 24, 2014"
output:
  html_document:
    css: assets/css/custom.css
    fig_caption: no
    fig_height: 5
    fig_width: 7
    highlight: default
    keep_md: no
    number_sections: no
    theme: default
    toc: yes
---
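
The same options can also be supplied from R instead of YAML. The following is a rough sketch under the assumption that the rmarkdown package is installed; the argument names mirror the YAML keys above, and "input.Rmd" in the commented call is a placeholder file name.

```r
# The YAML options above, written as R arguments (css, highlight, theme omitted).
opts <- list(
  toc = TRUE,
  number_sections = FALSE,
  fig_width = 7,
  fig_height = 5,
  fig_caption = FALSE,
  keep_md = FALSE
)

if (requireNamespace("rmarkdown", quietly = TRUE)) {
  # html_document() builds an output format object that render() accepts.
  fmt <- do.call(rmarkdown::html_document, opts)
  print(class(fmt))
  # rmarkdown::render("input.Rmd", output_format = fmt)  # placeholder file name
}
```

Keeping the options in YAML is usually more convenient; the R form is mainly useful when rendering from scripts.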

Apply CSS Style

CSS:

#nextsteps {
   color: blue;
}

.emphasized {
   font-size: 1.2em;
}

Apply to Whole Section:

### Apply to Whole Section {#nextsteps .emphasized}

Exercise

Use RStudio to create an HTML document:

  • Title it R Markdown Exercise

  • Include a table of contents

Markdown Basics

Markdown Quick Reference

Emphasis

*italic*   **bold**

_italic_   __bold__

I am italic

I am bold

Headers

Setext:

Header 1
=============

Header 2
-------------

atx:

# Header 1

## Header 2

### Header 3

Exercise

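Add the report's section headings (a headline asking whether oil and power industry pay, at nearly 90,000, is four times that of education services, a section for the Top 3 & Last 3 chart, and a section for the full salary table):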

## 油電業薪資近9萬,是教服業的4倍?

### 行業別每人每月薪資 - Top 3 & Last 3

### 行業別每人每月薪資總表

Manual Line Breaks

End a line with two or more spaces:

Roses are red,  
Violets are blue.

Lists

Unordered List:

* Item 1
* Item 2
    + Item 2a
    - Item 2b

Ordered List:

1. Item 1
2. Item 2
3. Item 3
    + Item 3a
    + Item 3b



Unordered List:

  • Item 1
  • Item 2
    • Item 2a
    • Item 2b

Ordered List:

  1. Item 1
  2. Item 2
  3. Item 3
    • Item 3a
    • Item 3b

The Four-space Rule

Subsequent paragraphs within a list item must be preceded by a blank line and indented by four spaces or a tab.

* fruits

    delicious!!!
    + apples
        - macintosh
        - red delicious
    + pears
    + peaches
* vegetables  
  healthy!!!
    + broccoli
    + chard



  • fruits

    delicious!!!
    • apples
      • macintosh
      • red delicious
    • pears
    • peaches
  • vegetables
    healthy!!!
    • broccoli
    • chard

Links

Images

Inline Image:

![logo](assets/img/Taiwan-R-logo.png)



Reference Image:

![logo][R]

[R]: assets/img/Taiwan-R-logo.png "R logo"

Exercise

Under the heading 「油電業薪資近9萬,是教服業的4倍?」, add a list of links to the news sources:

News links:

- [自由時報](https://tw.news.yahoo.com/%E6%B2%B9%E9%9B%BB%E6%A5%AD%E8%96%AA%E8%B3%87%E8%BF%919%E8%90%AC-%E6%95%99%E6%9C%8D%E6%A5%AD%E7%9A%844%E5%80%8D-221333602.html)
- [經濟部](http://www.moea.gov.tw/Mns/populace/news/News.aspx?kind=1&menu_id=40&news_id=35719)

Blockquotes

A friend once said:

> It's always better to give
> than to receive.

Plain Code Blocks

  • Plain code blocks are displayed in a fixed-width font but not evaluated.

    ```
    This text is displayed verbatim / preformatted
    ```
    
  • You can also specify the language of the block.

    ```r
    x = rnorm(10)
    ```
    x = rnorm(10)

Embedding Equations

Horizontal Rule / Page Break

Three or more asterisks or dashes:

******

------

Tables

First Header  | Second Header
------------- | -------------
Content Cell  | Content Cell
Content Cell  | Content Cell


First Header    Second Header
Content Cell    Content Cell
Content Cell    Content Cell

R Code Chunks

Overview

R code will be evaluated and printed

```{r}
summary(cars$dist)
```
summary(cars$dist)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##       2      26      36      43      56     120

Named R code chunk.

```{r plot}
summary(cars)
plot(cars)
```
  • Easy Navigation in RStudio

Basic Chunk Options

  • echo(TRUE): whether to include R source code in the output file
  • eval(TRUE): whether to evaluate the code chunk
  • message(TRUE): whether to preserve messages emitted by message()
  • include(TRUE): if include=FALSE, nothing will be written into the output document, but the code is still evaluated and plot files are generated
  • warning(TRUE): whether to preserve warnings in the output

Set global chunk options:

knitr::opts_chunk$set()
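
As a small illustration, and assuming knitr is installed, the global options can be set, read back, and reset from the console. The particular option values here are only examples.

```r
library(knitr)

# Set options for every chunk that follows in the document.
opts_chunk$set(echo = FALSE, warning = FALSE, message = FALSE)

# Read a single option back.
echo_now <- opts_chunk$get("echo")
print(echo_now)

# Put all chunk options back to knitr's defaults.
opts_chunk$restore()
echo_default <- opts_chunk$get("echo")
print(echo_default)
```

Options given in an individual chunk header still override the global values for that chunk.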

Exercise

Set the global chunk options before the heading 「油電業薪資近9萬,是教服業的4倍?」:

```{r, include=FALSE}
# load("data/salary.RData")
data(salary, package="DSC2014Tutorial")
knitr::opts_chunk$set(warning = FALSE,
                      echo = FALSE,
                      message = FALSE)
```

Plots

  • dev('png'): figure format(png, jpeg, tiff, svg, …)
  • fig.path('figure/'): figure path
  • fig.width(7): figure width
  • fig.height(7): figure height
  • dpi(72): DPI (dots per inch)
```{r dev='svg', fig.path='myplot', fig.height=8}
plot(iris)
```

Exercise

Add the plot practiced earlier in the course under the heading 「行業別每人每月薪資 - Top 3 & Last 3」:

```{r plot, dpi=75, fig.width=10}
a = order(salary_2013$每人每月薪資)
salary_news = matrix(salary_2013$每人每月薪資[c(head(a,3),tail(a,3))],ncol = 6)
colnames(salary_news) = salary_2013$行業[c(head(a,3),tail(a,3))]

# On a Mac, a font must be set to display Chinese characters
# http://equation85.github.io/blog/graph-font-of-r-in-mac-os-x/
par(family='STKaiti')
mp = barplot(salary_news) # x-axis positions of the bars
text(mp,10000,salary_news) # label each bar with its salary
```

Table Output

Set results='asis' to write raw results from R into the output document

  • knitr::kable

    ```{r, results='asis'}
    knitr::kable(women)
    ```
    
  • xtable::xtable

    ```{r, results='asis'}
    print(xtable::xtable(women), 
          type="html", 
          include.rownames=FALSE)
    ```
    
height  weight
    58     115
    59     117
    60     120
    61     123
    62     126
    63     129
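
To see what results='asis' passes through, the raw text that kable produces can be inspected at the console. This is a minimal sketch assuming knitr is installed; head() is used only to keep the table to the six rows shown above.

```r
library(knitr)

# women ships with R; keep the first six rows.
tbl <- kable(head(women), format = "markdown", row.names = FALSE)

# tbl holds Markdown table source, one element per line.
print(tbl)
```

With results='asis', these lines are written to the document unchanged, so the converter turns them into a real table instead of showing them as console output.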

Exercise

Add a table under the heading 「行業別每人每月薪資總表」:

```{r results='asis'}
index = order(salary_2013$每人每月薪資, decreasing = TRUE)
knitr::kable(salary_2013[index, c("行業", "每人每月薪資")], 
             row.names=FALSE)
```

Caching

  • cache(FALSE): whether to cache a code chunk to improve performance for expensive computations

  • If you run into problems with cached output, you can always clear the knitr cache by removing the folder named with a _cache suffix.
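
Clearing the cache by hand amounts to deleting that folder. The sketch below is illustrative only: the folder name input_cache is an assumption (for a hypothetical document called input.Rmd), and a stand-in folder is created in a temporary directory so that nothing real is deleted.

```r
# Stand-in for a knitr cache folder; the name is hypothetical.
cache_dir <- file.path(tempdir(), "input_cache")
dir.create(cache_dir, showWarnings = FALSE)
file.create(file.path(cache_dir, "chunk.RData"))  # dummy cached object

# Remove the folder and everything in it to clear the cache.
unlink(cache_dir, recursive = TRUE)
cache_cleared <- !dir.exists(cache_dir)
print(cache_cleared)
```

The next knit then re-runs every cached chunk from scratch.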

Language Engines

  • engine('R'): the language name of the code chunk
    • 'bash'
    • 'python'
    • 'Rcpp'
```{r engine='bash'}
whoami
```
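
Which engines a given knitr installation knows about can be checked from R. This short sketch assumes knitr is installed; whether an engine actually runs still depends on the corresponding program being available on the system.

```r
library(knitr)

# knit_engines$get() returns a named list of engine functions.
engines <- names(knit_engines$get())
print(sort(engines))

# Check the three engines named above.
wanted <- c("bash", "python", "Rcpp")
has_engine <- wanted %in% engines
print(setNames(has_engine, wanted))
```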

Inline R Code


I counted 2 red trucks on the highway.
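
The sentence above is what a reader sees after knitting; in the source, the number comes from an inline expression. The sketch below tries to reproduce that at the console, assuming knitr is installed. The variable name n_trucks is invented for the example.

```r
library(knitr)

n_trucks <- 2  # hypothetical value computed earlier in the document

# Inline R code is written as a backtick, the letter r, a space, an
# expression, and a closing backtick.
src <- "I counted `r n_trucks` red trucks on the highway."

# knit(text = ...) evaluates the inline expression and returns the result.
out <- knit(text = src, quiet = TRUE)
cat(out, "\n")
```

Because the value is computed at knit time, the sentence stays correct when the underlying data changes.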

Exercise

Use inline R code to add some basic description below the plot:

The chart above clearly shows that `r colnames(salary_news)[which.max(salary_news)]` has a monthly salary per person of
`r salary_news[which.max(salary_news)]` TWD,
the highest of all industries.
Second-place `r colnames(salary_news)[length(salary_news)-1]` also reaches
`r salary_news[length(salary_news)-1]`,
and third-place `r colnames(salary_news)[length(salary_news)-2]` trails second place by
`r salary_news[length(salary_news)-1]  - salary_news[length(salary_news)-2]` but still reaches
`r salary_news[length(salary_news)-2]` TWD.

The lowest overall pay levels are found in `r colnames(salary_news)[1]`,
`r colnames(salary_news)[2]`, and `r colnames(salary_news)[3]`.
Among them, `r colnames(salary_news)[1]` has a monthly salary per person of
`r salary_news[1]`,
only about `r round(salary_news[1] / salary_news[length(salary_news)] * 100)`% of that of
`r colnames(salary_news)[which.max(salary_news)]`.

Result

Extra: Generate a Word document with Knit Word

Deployment

Publish

Exercise

Deploy your HTML document on the web, for example on RPubs.

Miscellaneous

How to Use

  • Use R Markdown to generate reproducible reports
  • Slides + R Markdown
  • Config file + R Markdown

About Document Content

You can add R Markdown and HTML in the YAML content.

Generate Markdown and HTML

```{r results='asis', echo=FALSE}
library(whisker)
temp = '<span class="{{color}}{{number}}">{{color}}{{number}}</span>'
numbers = c("", "2", "3")
colors = c("red", "blue", "green", "yellow", "gray")
for (color in colors){
    cat("- ")
    for (number in numbers){
        out = whisker.render(temp)
        cat(out)
    }
    cat("\n")
}
```

Some Useful HTML

  • iframe: displaying a web page within a web page

    <iframe src="http://twconf.data-sci.org/" height=600 width=800></iframe>
  • img: inserting images into an HTML document.

    Much easier for adjusting width and height.

    <img src="assets/img/Taiwan-R-logo.png" alt="logo" height="42" width="42">

ggvis code

library(knitr)
library(ggvis)
mtcars %>%
    ggvis(x = ~wt, y = ~mpg) %>%
    layer_smooths(se=TRUE, opacity := 0.5, opacity.hover := 0.75) %>% 
    layer_points(fill = ~factor(cyl), size := 50, size.hover := 200) %>%
    set_options(hover_duration = 250)

Interactive Documents

It’s possible to embed a Shiny application within a document.

References

Appendix

R Presentation

Overview

  • A feature of RStudio that enables easy authoring of HTML5 presentations based on R Markdown.
  • Special features
    • Extensive support for authoring and previewing presentations within the RStudio IDE
    • Flexible two column layouts
  • Getting Started

Slide Basics

Set global options in the first slide.

R Presentation
========================================================
author: Mansun Kuo
date: June 22, 2014

Slides automatically display their titles unless title: false is specified.

Slide 1
====================================
title: false

- Bullet 1
- Bullet 2
- Bullet 3

Preview

  • Every time you save your presentation the preview is refreshed and navigated to whatever slide you were editing.
  • Within the preview pane, you can press the Edit button on the toolbar to jump immediately to the slide's location in the source file.

Two Column Layout

Two-Column Slide
====================================
left: 70%

First column
***
Second column

Transitions

  • transition: transition style
    • none, linear, rotate, fade, zoom, concave
  • transition-speed: transition speed
    • default, slow, fast
R Presentation
========================================================
author: Mansun Kuo
date: June 22, 2014
transition: rotate
transition-speed: fast

Slide Type

  • type: slide appearance
    • section, sub-section, prompt, alert
Type
========================================================
type: sub-section
incremental: true

Incremental Display

  • incremental: display content incrementally
    • true, false
Incremental Display
========================================================
incremental: true

- Bullet 1
- Bullet 2
- Bullet 3

Preview -> More

  • Clear Knitr Cache: clear knitr cache for this presentation
  • View in Browser: view the presentation in an external web browser
  • Save As Web Page: save the presentation as a standalone web page
  • Publish to RPubs: publish the presentation to RPubs

Custom CSS

R Presentation
========================================================
author: Mansun Kuo
date: June 22, 2014
css: assets/css/rpres.css

Applying Styles

Apply to individual slides:

My Slide
===================================
class: illustration

Apply to spans of text:

My Slide
================================== 
<span class="emphasized">Pay attention to this!</span>

ioslides

Overview

  • A feature of RStudio that creates ioslides presentations.
  • Special features
    • Code Highlighting
    • Presenter mode
  • Getting Started

Section

  • #: create a section
  • ##: create a new slide
  • ---: create a new slide without a header (horizontal rule)
  • |: add a subtitle
# section

## slide 1

---

## slide 2 | with subtitle

Display Modes

  • 'f' enable fullscreen mode

  • 'w' toggle widescreen mode

  • 'o' enable overview mode

  • 'h' enable code highlight mode

  • 'p' show presenter notes

  • 'Esc' exits all of these modes.

Incremental Bullets

---
output:
  ioslides_presentation:
    incremental: true
---

Render bullets incrementally for a specific slide:

> - Eat eggs
> - Drink coffee
  • Eat eggs
  • Drink coffee

Presentation Size

  • widescreen: widescreen mode
  • smaller: smaller text
---
output:
  ioslides_presentation:
    widescreen: true
    smaller: true
---

Set smaller text for a specific slide:

## Getting up {.smaller}

Transition Speed

transition: default, slower, faster

---
output:
  ioslides_presentation:
    transition: slower
---

Build Slides

Let content be displayed incrementally.

## Build Slides {.build}

Adding a Logo

Code Highlighting

Use ### <b> and ### </b> to enclose the lines you want to highlight.

x <- 10
y <- 20

Center

To center content on a slide:

## Center {.flexbox .vcenter}

To horizontally center content:

<div class="centered">This text is centered.</div>

Two-column

Note that the content will flow across the columns.

<div class="columns-2">
  ![Image](assets/img/Taiwan-R-logo.png)

  - Bullet 1
  - Bullet 2
  - Bullet 3
</div>

Text Color

You can color content using base color classes red, blue, green, yellow, and gray (or variations of them e.g. red2, red3, blue2, blue3, etc.).

<div class="red2">
This text is red
</div>
  • red red2 red3
  • blue blue2 blue3
  • green green2 green3
  • yellow yellow2 yellow3
  • gray gray2 gray3

Presenter Mode

To enable presenter mode:

mypresentation.html?presentme=true

To disable presenter mode:

mypresentation.html?presentme=false

To add presenter notes:

<div class="notes">
This is my note.
- It can contain markdown
- like this list
</div>