library(dplyr); library(stringr); library(purrr); library(lubridate)
In computer programming, an application programming interface (API) is a set of subroutine definitions, communication protocols, and tools for building software. In general terms, it is a set of clearly defined methods of communication among various components.
Representational State Transfer (REST) is a software architectural style that defines a set of constraints to be used for creating web services. Web services that conform to the REST architectural style, termed RESTful web services, provide interoperability between computer systems on the Internet. RESTful web services allow the requesting systems to access and manipulate textual representations of web resources by using a uniform and predefined set of stateless operations.
stateless operation: The client–server communication is constrained by no client context being stored on the server between requests. Each request from any client contains all the information necessary to service the request, and session state is held in the client.
HTTPS:
an extension of the Hypertext Transfer Protocol (HTTP). It is used for secure communication over a computer network, and is widely used on the Internet.
HTTP:
an application protocol for distributed, collaborative, hypermedia information systems. HTTP is the foundation of data communication for the World Wide Web, where hypertext documents include hyperlinks to other resources that the user can easily access, for example by a mouse click or by tapping the screen. HTTP was developed to facilitate hypertext and the World Wide Web.
A typical request message that follows HTTP should have:
a request line (e.g., GET /images/logo.png HTTP/1.1, which requests a resource called /images/logo.png from the server.)
Example:
Resource path: /orgs/octokit/repos
Resource server URL: https://api.github.com
Visiting https://api.github.com/orgs/octokit/repos retrieves the requested resource.
The resource server URL is also called the root endpoint.
request header fields (e.g., Accept-Language: en).
an empty line
an optional message body
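To make the four parts concrete, here is a minimal sketch in base R that assembles the text of a request message. The helper name `build_request_message` is hypothetical and exists only for illustration; in practice an HTTP client builds this text for you. On the wire the server name travels in a `Host` header, which is why it appears among the header fields here.

```r
# Hypothetical helper: assemble the text of an HTTP/1.1 request message.
build_request_message <- function(method, path, headers = character(), body = NULL) {
  # 1. request line: method, resource path, protocol version
  request_line <- paste(method, path, "HTTP/1.1")
  # 2. header fields, one "Name: value" per line
  header_lines <- if (length(headers) > 0) {
    paste0(names(headers), ": ", headers)
  } else {
    character()
  }
  head <- paste(c(request_line, header_lines), collapse = "\r\n")
  # 3. an empty line, then 4. the optional message body
  paste0(head, "\r\n\r\n", if (is.null(body)) "" else body)
}

msg <- build_request_message(
  "GET", "/orgs/octokit/repos",
  headers = c(Host = "api.github.com", `Accept-Language` = "en")
)
cat(msg)
```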
HTTP defines several request methods, including GET, PUT, POST, and DELETE. Among them:
GET: retrieve a resource (data)
POST: create a resource
A typical response message has:
a status line which includes the status code and reason message (e.g., HTTP/1.1 200 OK, which indicates that the client’s request succeeded.)
response header fields (e.g., Content-Type: text/html)
an empty line
an optional message body
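The same structure can be read back from a response. The following base R sketch splits a made-up response text into its status line, header fields, and body; the response content is invented for illustration only.

```r
# A made-up raw response used only for illustration.
raw_response <- paste(
  "HTTP/1.1 200 OK",
  "Content-Type: text/html",
  "",
  "<html>hello</html>",
  sep = "\r\n"
)

# The first empty line separates the head (status line + headers) from the body.
split_at <- as.integer(regexpr("\r\n\r\n", raw_response, fixed = TRUE))
head_lines <- strsplit(substr(raw_response, 1, split_at - 1), "\r\n", fixed = TRUE)[[1]]
body <- substr(raw_response, split_at + 4, nchar(raw_response))

# The status line reads "<version> <status code> <reason message>".
status_parts <- strsplit(head_lines[1], " ", fixed = TRUE)[[1]]
status_code <- as.integer(status_parts[2])
reason <- paste(status_parts[-(1:2)], collapse = " ")

# The remaining head lines are "Name: value" header fields.
header_fields <- sub("^[^:]+:\\s*", "", head_lines[-1])
names(header_fields) <- sub(":.*$", "", head_lines[-1])
```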
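From the description above, a REST API is a way for a client (your computer) and a server (GitHub.com) to communicate under HTTP rules. The communication is RESTful: it only manipulates textual representations of resources, and the server does not keep any client context between requests.
Because it follows HTTP, a request message must contain the parts listed earlier:
a request line: the location of the data server and the path of the requested data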
request header fields
an empty line
an optional message body
Reference: https://developer.github.com/v3/
Read the schema of a REST API carefully. GitHub states:
All API access is over HTTPS, and accessed from https://api.github.com.
Data server: https://api.github.com
All data is sent and received as JSON.
The response message is in JSON format.
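Three ways to authenticate: https://developer.github.com/v3/#authentication
Requests that involve personal data require authentication. There are three methods, and the credentials can go either in a header or in a parameter.
Query parameters:
The documentation describes two ways to pass parameters, depending on the request method:
For GET requests, any parameters not specified as a segment in the path (the resource path) can be passed as an HTTP query string parameter. That is, append ?key1=value1&key2=value2... to the resource path, where key is the name of the parameter and value is the value to match. Note that the query string starts with a question mark.
Example: https://api.github.com/repos/vmg/redcarpet/issues?state=closed
For POST, PATCH, PUT, and DELETE requests, parameters not included in the URL should be encoded as JSON with a Content-Type of 'application/json'. That is, describe them as JSON and put them in the message body of the request message.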
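For the GET case, the query string can be built from a named list instead of being typed by hand. The following is a minimal base R sketch; the helper name `add_query` is hypothetical, and the `per_page` value is only an example.

```r
# Hypothetical helper: append a named list of parameters to a URL as a query string.
add_query <- function(url, params) {
  if (length(params) == 0) return(url)
  encode <- function(x) utils::URLencode(as.character(x), reserved = TRUE)
  keys <- vapply(names(params), encode, character(1))
  values <- vapply(params, encode, character(1))
  # key=value pairs are joined with "&" and introduced by "?"
  paste0(url, "?", paste0(keys, "=", values, collapse = "&"))
}

closedIssues <- add_query(
  "https://api.github.com/repos/vmg/redcarpet/issues",
  list(state = "closed", per_page = 50)
)
closedIssues
```

Percent-encoding each key and value keeps characters such as spaces or `&` inside a value from being read as part of the query syntax.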
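Host: the server address. Think of it as the location of the National Taipei University library.
Path: the location of the book you want inside the library.
Query String: starts with ? and gives a further search condition applied to the resource at Path. Suppose Host/Path takes you to the economics shelf of English-language journals in the National Taipei University library; the query string is how you pick out all the journals on industrial economics from that shelf.
Mapping this onto the parts of the request message named earlier:
The first line is the request line. It says what to do once you reach the library (the Method and the Query String) and where to do it (the Path).
The block that follows consists of header fields.
An empty line then acts as a separator.
Last comes the optional message body (the payload in the figure).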
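The Host, Path, and Query String split can be sketched in base R as follows. The helper name `split_url` is hypothetical, and the regular expression only covers plain http(s) URLs of the kind used in these notes.

```r
# Hypothetical helper: split a URL into host, path, and query parameters.
split_url <- function(url) {
  pattern <- "^(https?)://([^/?#]+)([^?#]*)(\\?([^#]*))?"
  m <- regmatches(url, regexec(pattern, url))[[1]]
  if (length(m) == 0) stop("Not an http(s) URL: ", url)
  query <- list()
  if (nzchar(m[6])) {
    # "k1=v1&k2=v2" -> list(k1 = "v1", k2 = "v2")
    pairs <- strsplit(strsplit(m[6], "&", fixed = TRUE)[[1]], "=", fixed = TRUE)
    query <- setNames(
      lapply(pairs, function(p) if (length(p) > 1) p[2] else ""),
      vapply(pairs, function(p) p[1], character(1))
    )
  }
  list(host = paste0(m[2], "://", m[3]), path = m[4], query = query)
}

parts <- split_url("https://api.github.com/repos/vmg/redcarpet/issues?state=closed")
str(parts)
```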
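Since a REST API is just network messaging under the HTTP specification, all we need is a package that can carry out HTTP communication. Here we use httr (built on top of the curl package; you could also use the lower-level curl package directly).
httr supports all HTTP methods through the corresponding functions GET(), PUT(), POST(), DELETE(), and so on.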
Reference:
httr: https://cran.r-project.org/web/packages/httr/vignettes/quickstart.html
curl: https://ec.haxx.se/
library(httr)
GET, PUT, and so on are sometimes called HTTP verbs. Below, METHOD() stands for any of these verb functions. Its main usage is:
METHOD(
hostPath,
add_headers(...),
body
)
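The three input arguments correspond to the three blocks of the request message.
hostPath: the host joined with the path. For example, given
GET /events
HOST https://api.github.com
hostPath is "https://api.github.com/events".
Note: if there are query parameters, extend hostPath directly, starting with ?, as in "http://example.com/path/to/page?name=ferret&color=purple", which asks for the information with name=ferret and color=purple.
add_headers(...): the header fields; ... is a comma-separated series of name=value settings.
body: the optional message body, given as a named list. The message to attach must be a list object created with list(...), where ... is again a comma-separated series of name=value entries.
Only hostPath is mandatory. Whether the other two are needed depends on the METHOD (read the documentation of the REST API).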
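Almost every cloud platform requires you to register on its developer site before using its API. Please register on the following platform first (using your GitHub account): https://developer.github.com/program/
Some API functions touch on user privacy and therefore require authentication. For the exercises below, first generate a personal access token by following: https://help.github.com/articles/creating-a-personal-access-token-for-the-command-line/
A token is effectively the key to your personal data, so keep it safe and avoid writing it directly in your practice Rmd. One approach is to run Sys.setenv(github_token="your access token value") in the Console; afterwards, retrieve it with Sys.getenv("github_token") whenever you need it.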
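Sys.getenv() returns an empty string when the variable has not been set, which can lead to confusing authentication errors later on. A small wrapper can fail early instead. The helper name `get_token` is hypothetical, and `demo_token_var` is a throwaway variable name used only for this demonstration.

```r
# Hypothetical helper: read a token from an environment variable and
# fail loudly if it has not been set in this session.
get_token <- function(var = "github_token") {
  token <- Sys.getenv(var, unset = "")
  if (!nzchar(token)) {
    stop("Environment variable '", var, "' is not set; call Sys.setenv() first.")
  }
  token
}

# Demonstration with a throwaway variable (not a real token).
Sys.setenv(demo_token_var = "abc123")
get_token("demo_token_var")
Sys.unsetenv("demo_token_var")
```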
Every action on GitHub produces an event, such as a commit, a push, a pull request, or the creation of an issue.
This example lists the (public) events generated by a given user.
List events performed by a user: https://developer.github.com/v3/activity/events/#list-events-performed-by-a-user
Description:
GET /users/:username/events
:username is the user name. Replace username with your own.
username<-"tpemartin"
"/users/:username/events" %>%
str_replace("(:username)",username) %>%
paste0("https://api.github.com",.) -> hostPath
hostPath
response <- GET(hostPath) # send the request
response # inspect the result
The response has two main parts:
Response [https://api.github.com/users/tpemartin/events]
Date: 2019-02-14 21:35
Status: 200
Content-Type: application/json; charset=utf-8
Size: 135 kB
{
"id": "9032479754",
"type": "WatchEvent",
"actor": {
"id": 6549594,
"login": "tpemartin",
"display_login": "tpemartin",
"gravatar_id": "",
"url": "https://api.github.com/users/...
...
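The information returned over HTTP is called the payload.
Payload descriptions for the different event types: https://developer.github.com/v3/activity/events/types/
content() converts the response into a list object.
response %>% content() -> responseContent # inspect the content
Count the number of events of each type: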
responseContent %>%
map(~.x$type) %>%
unlist %>%
table
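The same count can be tried without a network connection. The sketch below uses a made-up list shaped like the output of content(), with only the fields needed here, and base R's vapply() in place of map() and unlist.

```r
# Made-up stand-in for content(response): a list of events, each a named list.
mockEvents <- list(
  list(id = "1", type = "WatchEvent"),
  list(id = "2", type = "PushEvent"),
  list(id = "3", type = "PushEvent")
)

# vapply() checks that every event yields exactly one character value.
eventTypes <- vapply(mockEvents, function(event) event$type, character(1))

# Tabulate and list the most frequent event type first.
typeCounts <- sort(table(eventTypes), decreasing = TRUE)
typeCounts
```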
Limits: The fixed page size is 30 items. Fetching up to ten pages is supported, for a total of 300 events. Only events created within the past 90 days will be included in timelines.
As the limits state, each response returns only 30 items (one page), so anything beyond that has to be requested page by page.
Pagination reference: https://developer.github.com/v3/#pagination
Requests that return multiple items will be paginated to 30 items by default. You can specify further pages with the ?page parameter. For some resources, you can also set a custom page size up to 100 with the ?per_page parameter.
Here ?page and ?per_page mean that you simply append a query string after the path.
To get page 2:
# Store the extended URL; otherwise GET() would request page 1 again
hostPath %>%
paste0("?page=2") -> hostPathPage2
response2 <- GET(hostPathPage2)
response2 %>% content() -> responseContent2
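To fetch several pages, the page URLs can be generated in one step and then requested in a loop. This is a minimal sketch: both helper names are hypothetical, and `fetch_pages` needs the httr package and network access, so it is defined here but not called.

```r
# Hypothetical helper: build the URL of each page of a paginated resource.
page_urls <- function(hostPath, pages, per_page = NULL) {
  query <- paste0("page=", pages)
  if (!is.null(per_page)) query <- paste0(query, "&per_page=", per_page)
  paste0(hostPath, "?", query)
}

# Hypothetical helper: request every page and combine the items into one list.
# Requires httr and network access; defined but not called in this sketch.
fetch_pages <- function(hostPath, pages) {
  results <- lapply(
    page_urls(hostPath, pages),
    function(url) httr::content(httr::GET(url))
  )
  do.call(c, results)
}

page_urls("https://api.github.com/users/tpemartin/events", 1:3)
```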
This example lists all issues assigned to a given user.
If you are not sure what a GitHub issue is, see https://github.com/Jane0901/hackathon-practice/issues
Reference: https://developer.github.com/v3/issues/
GET /issues
"/issues" %>%
paste0("https://api.github.com",.) -> hostPath
hostPath
hostPath %>%
paste0("?filter=assigned") -> hostPath
GET(hostPath)
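This returns a 404 error, because the issues documentation states:
List all issues assigned to the authenticated user ….
In other words, the request message must include user authentication.
We continue the previous example, this time adding user authorization.
The access token setup shown earlier has to be rerun every time you open RStudio. To set it once and for all, open RStudio through an RProject (if you do not know how to do that, you can skip the next two steps).
Step 1: create a .Rprofile file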
file.edit(".Rprofile")
Put the following line in the file, then save and close it.
Sys.setenv(github_token="Your_token")
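Step 2: reopen RStudio through the RProject.
Authorize through the query string. See: https://developer.github.com/v3/#authentication
The only difference from before is that hostPath is extended with an access_token query parameter: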
# HostPath
"/issues" %>%
paste0("https://api.github.com",.) -> hostPath
hostPath
# Adding query
hostPath %>%
paste0("?filter=assigned&access_token=",Sys.getenv("github_token")) -> hostPath
GET(hostPath)-> responseOAuth
responseOAuth %>% content() -> responseOAuthList
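As noted earlier, the credentials can also go in a header. GitHub has since deprecated passing tokens in the query string, so the header form is the safer choice if the request above is rejected. The sketch below is one way to do it with httr; both helper names are hypothetical, and `get_assigned_issues` needs httr and network access, so it is defined but not called.

```r
# Hypothetical helper: build the Authorization header for a personal access token.
github_auth_header <- function(token) {
  c(Authorization = paste("token", token))
}

# Hypothetical helper: list the issues assigned to the authenticated user.
# Requires httr and network access; defined but not called in this sketch.
get_assigned_issues <- function(token) {
  response <- httr::GET(
    "https://api.github.com/issues?filter=assigned",
    httr::add_headers(.headers = github_auth_header(token))
  )
  httr::stop_for_status(response)  # turn 4xx/5xx responses into R errors
  httr::content(response)
}

# Usage, assuming the token is stored as described above:
# issues <- get_assigned_issues(Sys.getenv("github_token"))
```

Keeping the token out of the URL also keeps it out of anything that records URLs, such as printed response objects.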
List the issues in a specific repo
GET /repos/:owner/:repo/issues
# HostPath setup
"/repos/:owner/:repo/issues" %>%
str_replace_all(
c(
"(:owner)"="Jane0901",
"(:repo)"="hackathon-practice"
)
) %>%
paste0("https://api.github.com",.) -> hostPath
hostPath
# Adding query
hostPath %>%
paste0("?access_token=",Sys.getenv("github_token")) -> hostPath
# Request
hostPath %>%
GET -> responseOAuth2
responseOAuth2 %>% content() -> responseOAuthList2
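The instructions below use a token. The instructor stores it in .Rprofile with Sys.setenv(…) and retrieves it with Sys.getenv(…). If that is unclear, you can replace the Sys.getenv(…) call with your token pasted in directly.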
gitter.im chatroom
Wherever $token appears, replace it with your token. Reference: https://developer.gitter.im/docs/messages-resource
HTTP verb expression
POST /v1/rooms/:roomId/chatMessages
curl complete command: the documentation also lists the equivalent curl command:
$ curl -X POST -i -H "Content-Type: application/json" -H "Accept: application/json" -H "Authorization: Bearer $token" "https://api.gitter.im/v1/rooms/:roomId/chatMessages" -d '{"text":"You should also check https://irc.gitter.im/"}'
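We do not use curl directly, but the command above spells out every element:
Method: POST
hostPath: "https://api.gitter.im/v1/rooms/:roomId/chatMessages"
:roomId is the ID of the hackathon-practice chatroom (it has to be looked up first).
headers: there are three,
"Content-Type: application/json", "Accept: application/json", "Authorization: Bearer $token"
HTTP packages generally handle the JSON content type for you, so the first two usually do not need to be set by hand (in httr, encode = "json" sets the Content-Type).
body message: {"text":"You should also check https://irc.gitter.im/"}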
# :roomId is still missing below, so this cannot be run yet
hostPath<-"https://api.gitter.im/v1/rooms/:roomId/chatMessages"
POST(hostPath,
add_headers(
Authorization=paste0("Bearer ",Sys.getenv("gitter_token"))
),
body=list(
text="Practice message sent from '03-REST-API.Rmd'"
))
Reference: https://developer.gitter.im/docs/rooms-resource - scroll to List rooms section.
GET /v1/rooms
curl expression
$ curl -i -H "Accept: application/json" -H "Authorization: Bearer $token" "https://api.gitter.im/v1/rooms"
List all the chatrooms that I belong to
hostPath <-
"https://api.gitter.im/v1/rooms"
GET(hostPath,
add_headers(
Authorization=paste0("Bearer ",Sys.getenv("gitter_token"))
)
) %>% content -> allMyRooms
Find the one whose name contains hackathon-practice
allMyRooms %>%
map(function(.x) str_detect(.x$name,"hackathon-practice")) %>%
unlist %>% which %>%
allMyRooms[[.]]
The roomId turns out to be "5c3f54eed73408ce4fb4f63d"
# fill in :roomId
hostPath<-"https://api.gitter.im/v1/rooms/5c3f54eed73408ce4fb4f63d/chatMessages"
POST(hostPath,
add_headers(
Authorization=paste0("Bearer ",Sys.getenv("gitter_token"))
),
body=list(
text="Practice message sent from '03-REST-API.Rmd', please ignore"
),
encode = "json") %>% content
Note: encode="json" has been added here, so that the body message is sent in the JSON format the gitter API expects.
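To see what a JSON-encoded body looks like, the named list can be converted by hand with the jsonlite package. This is only an approximation of what httr does internally when encode = "json" is set, and the message text is a placeholder.

```r
library(jsonlite)

body <- list(text = "Practice message, please ignore")

# auto_unbox = TRUE writes length-one vectors as scalars ("text": "...")
# instead of one-element arrays ("text": ["..."]).
jsonBody <- toJSON(body, auto_unbox = TRUE)
jsonBody

# Parsing the JSON text gives back a named list with the same content.
roundTrip <- fromJSON(jsonBody)
```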
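This exercise fetches the latest weather information programmatically.
Obtain an authorization key by following the usage instructions.
API documentation: http://opendata.cwb.gov.tw/dist/opendata-swagger.html
Two-day weather forecast for Shulin District, New Taipei City
In the API documentation, find the two-day weather forecast for Shulin District, New Taipei City. Clicking it reveals a Try it out button; click that and a test form appears. Enter your $token, enter 樹林區 (Shulin District) as the township name for the county or city, and click Execute below the form. The following curl expression appears: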
curl -X GET "https://opendata.cwb.gov.tw/api/v1/rest/datastore/F-D0047-069?Authorization=$token&locationName=%E6%A8%B9%E6%9E%97%E5%8D%80," -H "accept: application/json"
Note: authorization is passed as a query parameter. locationName is the percent-encoded form of the Chinese name.
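The percent-encoded value does not have to be copied from the test form. Base R can produce it with utils::URLencode(); the sketch below writes the township name with Unicode escapes so that it does not depend on the encoding of the script file.

```r
# The township name, written with Unicode escapes (it reads 樹林區).
locationName <- "\u6a39\u6797\u5340"

# reserved = TRUE also encodes characters such as "&" and "=", which is
# what a query parameter value needs.
encoded <- utils::URLencode(enc2utf8(locationName), reserved = TRUE)
encoded

# Decoding reverses the step; mark the result as UTF-8 so it prints as Chinese text.
decoded <- utils::URLdecode(encoded)
Encoding(decoded) <- "UTF-8"
```

Each Chinese character becomes three %XX groups, one for each byte of its UTF-8 representation, which gives the locationName value seen in the curl expression.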
hostPath <- "https://opendata.cwb.gov.tw/api/v1/rest/datastore/F-D0047-069"
# Add query
hostPath %>%
paste0("?Authorization=",Sys.getenv("cwb_token"),"&locationName=%E6%A8%B9%E6%9E%97%E5%8D%80") -> hostPath
GET(hostPath) %>% content -> ntpuWeatherIn2Days
Query result:
ntpuWeatherIn2Days[["records"]][["locations"]][[1]][["location"]][[1]]$weatherElement
Combining exercises 1 and 2: use a program to announce today's weather in the chatroom:
ntpuWeatherIn2Days[["records"]][["locations"]][[1]][["location"]][[1]]$weatherElement %>% map(~.x$description)
todayWeather<-ntpuWeatherIn2Days[["records"]][["locations"]][[1]][["location"]][[1]]$weatherElement
todayWeather[[1]]$time[[1]]$elementValue[[1]]$value-> humidity
todayWeather[[4]]$time[[1]]$elementValue[[1]]$value-> temperature
todayWeather[[3]]$time[[1]]$elementValue[[1]]$value-> feelLike
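textMessage<-"Good morning everyone, I am the hackathon weather bot. The current temperature is _temperature_ degrees Celsius, it feels like _feelLike_ degrees Celsius, and the humidity is _humidity_%. To learn how I work, study 03-REST-API.Rmd and let your imagination run free to create your own practice examples. Cheers."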
textMessage %>%
str_replace_all(
c(
"_temperature_"=temperature,
"_feelLike_"=feelLike,
"_humidity_"=humidity
)
) -> textMessage
#textMessage
hostPath<-"https://api.gitter.im/v1/rooms/5c3f54eed73408ce4fb4f63d/chatMessages"
POST(hostPath,
add_headers(
Authorization=paste0("Bearer ",Sys.getenv("gitter_token"))
),
body=list(
text=textMessage
),
encode = "json") %>% content
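The positional lookups above (todayWeather[[1]], [[4]], [[3]]) break if the API ever returns the elements in a different order. A more defensive sketch looks elements up by name. It uses made-up data: the field name `elementName` and the codes "RH", "AT", and "T" are assumptions for illustration and should be checked against the actual response. The helper name `first_value` is hypothetical.

```r
# Made-up data shaped like the weatherElement list used above.
# The elementName field and its values are assumptions for illustration.
mockWeather <- list(
  list(elementName = "RH", time = list(list(elementValue = list(list(value = "80"))))),
  list(elementName = "AT", time = list(list(elementValue = list(list(value = "17"))))),
  list(elementName = "T",  time = list(list(elementValue = list(list(value = "18")))))
)

# Hypothetical helper: first forecast value of the element with the given name.
first_value <- function(weatherElement, name) {
  elementNames <- vapply(weatherElement, function(el) el$elementName, character(1))
  hit <- which(elementNames == name)
  if (length(hit) != 1) stop("Expected exactly one element named '", name, "'")
  weatherElement[[hit]]$time[[1]]$elementValue[[1]]$value
}

# Fill a message template with the looked-up values.
template <- "Temperature _temperature_, feels like _feelLike_, humidity _humidity_%."
values <- c(
  "_temperature_" = first_value(mockWeather, "T"),
  "_feelLike_" = first_value(mockWeather, "AT"),
  "_humidity_" = first_value(mockWeather, "RH")
)
messageText <- template
for (key in names(values)) {
  messageText <- gsub(key, values[[key]], messageText, fixed = TRUE)
}
messageText
```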