1.3 openxlsx
openxlsx是当我需要定制输出Excel表格或报表时常用R包。目前该包的版本4.2.3,通过使用Rcpp加速,包的读写速度在Excel的百万级下是可接受状态,包的相关函数功能完善且简易好用,并且正在积极开发中,相信它以后功能会越来越强大。
项目官方地址:https://ycphs.github.io/openxlsx/index.html
个人感觉主要优势:
- 不依赖java环境
- 读写速度可接受
- 可设置条件格式,与Excel中『开始』选项卡的条件格式功能接近
- 可批量插入ggplot2图
- 可插入公式
- 可渲染大部分Excel格式,并且效率相比部分python包高效
- 可添加页眉页脚以及其他格式,方便直接打印
- 功能稳定可用并且在积极开发中
版本信息查看
packageVersion("openxlsx")
本人公众号:宇飞的世界中有更加详细的阐述:https://mp.weixin.qq.com/s/ZD0dJb0y8fsWGI1dCPh2mQ
1.3.1 安装
稳定版
# 稳定版
install.packages("openxlsx", dependencies = TRUE, repos = "https://mirrors.tuna.tsinghua.edu.cn/CRAN/")
开发版
install.packages(c("Rcpp", "devtools"), dependencies = TRUE)
library(devtools)
install_github("ycphs/openxlsx")
1.3.2 基础功能
本文仅呈现基础功能部分,即读写EXCEL文件。其它功能,请查阅项目官方地址或微信公众号文章R包-openxlsx-学习笔记
1.3.2.1 读取Excel
read.xlsx()是读取函数,主要参数如下:
library(openxlsx)
read.xlsx(
xlsxFile,sheet = 1,
startRow = 1,
colNames = TRUE,
rowNames = FALSE,
detectDates = FALSE,
skipEmptyRows = TRUE,
skipEmptyCols = TRUE,
rows = NULL,
cols = NULL,
check.names = FALSE,
sep.names = ".",
namedRegion = NULL,
na.strings = "NA",
fillMergedCells = FALSE
)
以上参数中需要注意 :
detecDates参数,当你的Excel表格中带日期列时需要将参数设置为TRUE,不然将会把日期识别为数字读入。
fillMergedCells参数,当你读取的表格中存在合并单元格,将用值填充其他全部单元格,如下所示:
read.xlsx('./test.xlsx',detectDates = TRUE,fillMergedCells = TRUE)
读取后如下所示:
readWorkbook()也可以读取Excel表格数据,参数与read.xlsx基本一致。
<- system.file("extdata", "readTest.xlsx", package = "openxlsx")
xlsxFile <- readWorkbook(xlsxFile = xlsxFile, sheet = 1) df1
1.3.2.2 写入Excel
数据清洗完之后,或者是透视表已经完成,需要将结果从R导出到Excle,这时就利用函数将结果数据集data.frame写入Excle中。
write.xlsx()函数写入
write.xlsx(iris, file = "writeXLSX1.xlsx", colNames = TRUE, borders = "columns")
带格式输出
<- createStyle(
hs textDecoration = "BOLD", fontColour = "#FFFFFF", fontSize = 12,
fontName = "Arial Narrow", fgFill = "#4F80BD"
)## Not run:
write.xlsx(iris,
file = "writeXLSX3.xlsx",
colNames = TRUE, borders = "rows", headerStyle = hs
)
1.3.3 带格式输出
输出过程共四步,第一步创建workbook,第二步添加sheet,第三步写入数据,第四步保存workbook。在输出的过程中可以通过addStyle()
、createStyle()
或conditionalFormatting
添加格式或条件格式。
<- data.frame(a=1:10,b=1:10,d=1:10)
df <- createWorkbook(creator = 'zhongyf',title = 'test')
wb addWorksheet(wb,sheetName = 'test')
writeData(wb,sheet = 'test',x = df)
saveWorkbook(wb, "test.xlsx", overwrite = TRUE)
我们以上面四步输出的方式,查看包自带的例子。
createWorkbook()
addWorksheet()
writeData()
saveWorkbook()
<- createWorkbook("Fred")
wb
## Add 3 worksheets
addWorksheet(wb, "Sheet 1")
addWorksheet(wb, "Sheet 2", gridLines = FALSE)
addWorksheet(wb, "Sheet 3", tabColour = "red")
addWorksheet(wb, "Sheet 4", gridLines = FALSE, tabColour = "#4F81BD")
## Headers and Footers
addWorksheet(wb, "Sheet 5",
header = c("ODD HEAD LEFT", "ODD HEAD CENTER", "ODD HEAD RIGHT"),
footer = c("ODD FOOT RIGHT", "ODD FOOT CENTER", "ODD FOOT RIGHT"),
evenHeader = c("EVEN HEAD LEFT", "EVEN HEAD CENTER", "EVEN HEAD RIGHT"),
evenFooter = c("EVEN FOOT RIGHT", "EVEN FOOT CENTER", "EVEN FOOT RIGHT"),
firstHeader = c("TOP", "OF FIRST", "PAGE"),
firstFooter = c("BOTTOM", "OF FIRST", "PAGE")
)
addWorksheet(wb, "Sheet 6",
header = c("&[Date]", "ALL HEAD CENTER 2", "&[Page] / &[Pages]"),
footer = c("&[Path]&[File]", NA, "&[Tab]"),
firstHeader = c(NA, "Center Header of First Page", NA),
firstFooter = c(NA, "Center Footer of First Page", NA)
)
addWorksheet(wb, "Sheet 7",
header = c("ALL HEAD LEFT 2", "ALL HEAD CENTER 2", "ALL HEAD RIGHT 2"),
footer = c("ALL FOOT RIGHT 2", "ALL FOOT CENTER 2", "ALL FOOT RIGHT 2")
)
addWorksheet(wb, "Sheet 8",
firstHeader = c("FIRST ONLY L", NA, "FIRST ONLY R"),
firstFooter = c("FIRST ONLY L", NA, "FIRST ONLY R")
)
## Need data on worksheet to see all headers and footers
writeData(wb, sheet = 5, 1:400)
writeData(wb, sheet = 6, 1:400)
writeData(wb, sheet = 7, 1:400)
writeData(wb, sheet = 8, 1:400)
## Save workbook
## Not run:
saveWorkbook(wb, "addWorksheetExample.xlsx", overwrite = TRUE)
1.3.4 函数参数
输出Excel的过程分为四步,本小节主要拆解createWorkbook
,addWorksheet
,writeDataTable
,saveWorkbook
四个函数的参数以及用法。
- createWorkbook
createWorkbook(
creator = ifelse(.Platform$OS.type == "windows", Sys.getenv("USERNAME"),
Sys.getenv("USER")),
title = NULL,
subject = NULL,
category = NULL
)
<- createWorkbook(
wb creator = "宇飞的世界",
title = "标题",
subject = "主题",
category = "类别目录"
)
- addWorksheet
addWorksheet(
wb,
sheetName,gridLines = TRUE,
tabColour = NULL,
zoom = 100,
header = NULL,
footer = NULL,
evenHeader = NULL,
evenFooter = NULL,
firstHeader = NULL,
firstFooter = NULL,
visible = TRUE,
paperSize = getOption("openxlsx.paperSize", default = 9),
orientation = getOption("openxlsx.orientation", default = "portrait"),
vdpi = getOption("openxlsx.vdpi", default = getOption("openxlsx.dpi", default = 300)),
hdpi = getOption("openxlsx.hdpi", default = getOption("openxlsx.dpi", default = 300))
)
gridLines参数:表格中是否有网格线,在Excle『视图』选项卡下面的网格线去除打勾的效果一致
tabColour参数:输出表格sheet标签颜色
-400
zoom:发大缩小,默认是100,可选范围10
:页眉 长度为3的字符向量,左、中、右三个位置,用Na可跳过一位置,以下页眉页脚相同。
header
: 页脚
footer
: 每页页眉
evenHeader
: 每页页脚
evenFooter
: 第一页页眉
firstHeader
: 第一页页脚
firstFooter
:sheet是否隐藏,如果为否sheet将被隐藏
visible
:页面大小,详见 ?pageSetup
paperSize
:One of "portrait" or "landscape" 不清楚干嘛用
orientation
: 屏幕分辨率 默认值即可,不用调整
vdpi
: 屏幕分辨率 默认值即可,不用调整 hdpi
- writeDataTable
writeDataTable()函数将data.frame写入Excel。
wb:即createWorkbook()函数创建
- saveWorkbook
saveWorkbook(wb, file, overwrite = FALSE, returnValue = FALSE)
参数较为简单,wb即上文中的workbook对象,file即输出的文件名,overwrite即如果存在是否覆盖,returnValue如果设置为TRUE,返回TRUE代表保存成功
1.3.5 总结
openxlsx包功能较为强大,更多详细用法大家可自行探索,或关注我的语雀笔记,笔记会不定期持续更新。
R包openxlsx学习笔记:https://www.yuque.com/docs/share/7a768e6f-95e0-417c-a9b5-dfc8862dc6be?#