1.3 openxlsx

openxlsx是当我需要定制输出Excel表格或报表时常用R包。目前该包的版本4.2.3,通过使用Rcpp加速,包的读写速度在Excel的百万级下是可接受状态,包的相关函数功能完善且简易好用,并且正在积极开发中,相信它以后功能会越来越强大。

项目官方地址:https://ycphs.github.io/openxlsx/index.html

个人感觉主要优势:

  • 不依赖java环境
  • 读写速度可接受
  • 可设置条件格式,与Excel中『开始』选项卡的条件格式功能接近
  • 可批量插入ggplot2图
  • 可插入公式
  • 可渲染大部分Excel格式,并且效率相比部分python包高效
  • 可添加页眉页脚以及其他格式,方便直接打印
  • 功能稳定可用并且在积极开发中

版本信息查看

packageVersion("openxlsx")

本人公众号:宇飞的世界中有更加详细的阐述:https://mp.weixin.qq.com/s/ZD0dJb0y8fsWGI1dCPh2mQ

1.3.1 安装

稳定版

# 稳定版
install.packages("openxlsx", dependencies = TRUE, repos = "https://mirrors.tuna.tsinghua.edu.cn/CRAN/")

开发版

install.packages(c("Rcpp", "devtools"), dependencies = TRUE)
library(devtools)
install_github("ycphs/openxlsx")

1.3.2 基础功能

本文仅呈现基础功能部分,即读写EXCEL文件。其它功能,请查阅项目官方地址或微信公众号文章R包-openxlsx-学习笔记

1.3.2.1 读取Excel

read.xlsx()是读取函数,主要参数如下:

library(openxlsx)
read.xlsx(
  xlsxFile,
  sheet = 1,
  startRow = 1,
  colNames = TRUE,
  rowNames = FALSE,
  detectDates = FALSE,
  skipEmptyRows = TRUE,
  skipEmptyCols = TRUE,
  rows = NULL,
  cols = NULL,
  check.names = FALSE,
  sep.names = ".",
  namedRegion = NULL,
  na.strings = "NA",
  fillMergedCells = FALSE
)

以上参数中需要注意 :

detecDates参数,当你的Excel表格中带日期列时需要将参数设置为TRUE,不然将会把日期识别为数字读入。

fillMergedCells参数,当你读取的表格中存在合并单元格,将用值填充其他全部单元格,如下所示:

read.xlsx('./test.xlsx',detectDates = TRUE,fillMergedCells = TRUE)

读取后如下所示:

openxlsx-merge-xlsx

readWorkbook()也可以读取Excel表格数据,参数与read.xlsx基本一致。

xlsxFile <- system.file("extdata", "readTest.xlsx", package = "openxlsx")
df1 <- readWorkbook(xlsxFile = xlsxFile, sheet = 1)

1.3.2.2 写入Excel

数据清洗完之后,或者是透视表已经完成,需要将结果从R导出到Excle,这时就利用函数将结果数据集data.frame写入Excle中。

write.xlsx()函数写入

write.xlsx(iris, file = "writeXLSX1.xlsx", colNames = TRUE, borders = "columns")

带格式输出

hs <- createStyle(
  textDecoration = "BOLD", fontColour = "#FFFFFF", fontSize = 12,
  fontName = "Arial Narrow", fgFill = "#4F80BD"
)
## Not run: 
write.xlsx(iris,
  file = "writeXLSX3.xlsx",
  colNames = TRUE, borders = "rows", headerStyle = hs
)

1.3.3 带格式输出

输出过程共四步,第一步创建workbook,第二步添加sheet,第三步写入数据,第四步保存workbook。在输出的过程中可以通过addStyle()createStyle()conditionalFormatting添加格式或条件格式。

df <- data.frame(a=1:10,b=1:10,d=1:10)
wb <- createWorkbook(creator = 'zhongyf',title = 'test')
addWorksheet(wb,sheetName = 'test')
writeData(wb,sheet = 'test',x = df)
saveWorkbook(wb, "test.xlsx", overwrite = TRUE)

我们以上面四步输出的方式,查看包自带的例子。

  • createWorkbook()

  • addWorksheet()

  • writeData()

  • saveWorkbook()

wb <- createWorkbook("Fred")

## Add 3 worksheets
addWorksheet(wb, "Sheet 1")
addWorksheet(wb, "Sheet 2", gridLines = FALSE)
addWorksheet(wb, "Sheet 3", tabColour = "red")
addWorksheet(wb, "Sheet 4", gridLines = FALSE, tabColour = "#4F81BD")

## Headers and Footers
addWorksheet(wb, "Sheet 5",
  header = c("ODD HEAD LEFT", "ODD HEAD CENTER", "ODD HEAD RIGHT"),
  footer = c("ODD FOOT RIGHT", "ODD FOOT CENTER", "ODD FOOT RIGHT"),
  evenHeader = c("EVEN HEAD LEFT", "EVEN HEAD CENTER", "EVEN HEAD RIGHT"),
  evenFooter = c("EVEN FOOT RIGHT", "EVEN FOOT CENTER", "EVEN FOOT RIGHT"),
  firstHeader = c("TOP", "OF FIRST", "PAGE"),
  firstFooter = c("BOTTOM", "OF FIRST", "PAGE")
)

addWorksheet(wb, "Sheet 6",
  header = c("&[Date]", "ALL HEAD CENTER 2", "&[Page] / &[Pages]"),
  footer = c("&[Path]&[File]", NA, "&[Tab]"),
  firstHeader = c(NA, "Center Header of First Page", NA),
  firstFooter = c(NA, "Center Footer of First Page", NA)
)

addWorksheet(wb, "Sheet 7",
  header = c("ALL HEAD LEFT 2", "ALL HEAD CENTER 2", "ALL HEAD RIGHT 2"),
  footer = c("ALL FOOT RIGHT 2", "ALL FOOT CENTER 2", "ALL FOOT RIGHT 2")
)

addWorksheet(wb, "Sheet 8",
  firstHeader = c("FIRST ONLY L", NA, "FIRST ONLY R"),
  firstFooter = c("FIRST ONLY L", NA, "FIRST ONLY R")
)

## Need data on worksheet to see all headers and footers
writeData(wb, sheet = 5, 1:400)
writeData(wb, sheet = 6, 1:400)
writeData(wb, sheet = 7, 1:400)
writeData(wb, sheet = 8, 1:400)

## Save workbook
## Not run: 
saveWorkbook(wb, "addWorksheetExample.xlsx", overwrite = TRUE)

1.3.4 函数参数

输出Excel的过程分为四步,本小节主要拆解createWorkbookaddWorksheet,writeDataTable,saveWorkbook四个函数的参数以及用法。

  • createWorkbook
createWorkbook(
  creator = ifelse(.Platform$OS.type == "windows", Sys.getenv("USERNAME"),
    Sys.getenv("USER")),
  title = NULL,
  subject = NULL,
  category = NULL
)
wb <- createWorkbook(
  creator = "宇飞的世界",
  title = "标题",
  subject = "主题",
  category = "类别目录"
)
  • addWorksheet
addWorksheet(
  wb,
  sheetName,
  gridLines = TRUE,
  tabColour = NULL,
  zoom = 100,
  header = NULL,
  footer = NULL,
  evenHeader = NULL,
  evenFooter = NULL,
  firstHeader = NULL,
  firstFooter = NULL,
  visible = TRUE,
  paperSize = getOption("openxlsx.paperSize", default = 9),
  orientation = getOption("openxlsx.orientation", default = "portrait"),
  vdpi = getOption("openxlsx.vdpi", default = getOption("openxlsx.dpi", default = 300)),
  hdpi = getOption("openxlsx.hdpi", default = getOption("openxlsx.dpi", default = 300))
)
gridLines参数:表格中是否有网格线,在Excle『视图』选项卡下面的网格线去除打勾的效果一致

tabColour参数:输出表格sheet标签颜色

zoom:发大缩小,默认是100,可选范围10-400

header:页眉 长度为3的字符向量,左、中、右三个位置,用Na可跳过一位置,以下页眉页脚相同。

footer: 页脚

evenHeader: 每页页眉

evenFooter: 每页页脚

firstHeader: 第一页页眉

firstFooter: 第一页页脚

visible:sheet是否隐藏,如果为否sheet将被隐藏

paperSize:页面大小,详见 ?pageSetup 

orientation:One of "portrait" or "landscape" 不清楚干嘛用

vdpi: 屏幕分辨率 默认值即可,不用调整

hdpi: 屏幕分辨率 默认值即可,不用调整
  • writeDataTable

writeDataTable()函数将data.frame写入Excel。

wb:即createWorkbook()函数创建

  • saveWorkbook
saveWorkbook(wb, file, overwrite = FALSE, returnValue = FALSE)

参数较为简单,wb即上文中的workbook对象,file即输出的文件名,overwrite即如果存在是否覆盖,returnValue如果设置为TRUE,返回TRUE代表保存成功

1.3.5 总结

openxlsx包功能较为强大,更多详细用法大家可自行探索,或关注我的语雀笔记,笔记会不定期持续更新。

R包openxlsx学习笔记:https://www.yuque.com/docs/share/7a768e6f-95e0-417c-a9b5-dfc8862dc6be?#

个人主页:https://www.yuque.com/zyufei