4.14 Other operations

4.14.1 strwrap

strwrap(x, width = 0.9 * getOption("width"), indent = 0,
        exdent = 0, prefix = "", simplify = TRUE, initial = prefix)

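strwrap treats each input string as running text: it splits the string into paragraphs at blank lines (a single newline counts as ordinary whitespace), then breaks each paragraph into lines at word boundaries according to the requested width and indentation. Each output line becomes one string in the result.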
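As a small illustration of how paragraphs are detected, here is a minimal sketch; the input string and variable names are made up for this example. A single newline behaves like a space, while a blank line starts a new paragraph, which appears in the result as an empty string.

```r
# Hypothetical input: one newline inside the first paragraph,
# then a blank line before the second paragraph
txt <- "one two\nthree\n\nfour five"

wrapped <- strwrap(txt, width = 80)
wrapped
# The paragraph break shows up as an empty element between the two paragraphs

# simplify = FALSE returns a list with one element per input string
wrapped_list <- strwrap(c(txt, "six seven"), width = 80, simplify = FALSE)
length(wrapped_list)
```

With the default simplify = TRUE the pieces from all input strings are combined into a single character vector, which is convenient for writeLines.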

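# Read a block of text
x <- paste(readLines(file.path(R.home("doc"), "THANKS")), collapse = "\n")
# Split the text into paragraphs and drop the first three
x <- unlist(strsplit(x, "\n[ \t\n]*\n"))[-(1:3)]
# Rejoin the paragraphs, separating them with a blank line
x <- paste(x, collapse = "\n\n")
# Wrap the text so that every line is shorter than 60 characters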
writeLines(strwrap(x, width = 60))
## J. D. Beasley, David J. Best, Richard Brent, Kevin Buhr,
## Michael A. Covington, Bill Cleveland, Robert Cleveland,, G.
## W. Cran, C. G. Ding, Ulrich Drepper, Paul Eggert, J. O.
## Evans, David M. Gay, H. Frick, G. W. Hill, Richard H.
## Jones, Eric Grosse, Shelby Haberman, Bruno Haible, John
## Hartigan, Andrew Harvey, Trevor Hastie, Min Long Lam,
## George Marsaglia, K. J. Martin, Gordon Matzigkeit, C. R.
## Mckenzie, Jean McRae, Cyrus Mehta, Fionn Murtagh, John C.
## Nash, Finbarr O'Sullivan, R. E. Odeh, William Patefield,
## Nitin Patel, Alan Richardson, D. E. Roberts, Patrick
## Royston, Russell Lenth, Ming-Jen Shyu, Richard C.
## Singleton, S. G. Springer, Supoj Sutanthavibul, Irma
## Terpenning, G. E. Thomas, Rob Tibshirani, Wai Wan Tsang,
## Berwin Turlach, Gary V. Vaughan, Michael Wichura, Jingbo
## Wang, M. A. Wong, and the Free Software Foundation (for
## autoconf code and utilities). See also files under
## src/extras.
## 
## Many more, too numerous to mention here, have contributed
## by sending bug reports and suggesting various improvements.
## 
## Simon Davies whilst at the University of Auckland wrote the
## original version of glm().
## 
## Julian Harris and Wing Kwong (Tiki) Wan whilst at the
## University of Auckland assisted Ross Ihaka with the
## original Macintosh port.
## 
## R was inspired by the S environment which has been
## principally developed by John Chambers, with substantial
## input from Douglas Bates, Rick Becker, Bill Cleveland,
## Trevor Hastie, Daryl Pregibon and Allan Wilks.
## 
## A special debt is owed to John Chambers who has graciously
## contributed advice and encouragement in the early days of R
## and later became a member of the core team.
## 
## Stefano Iacus (up to 2014, a former member of R Core) and
## Simon Urbanek developed the macOS port, including the R.app
## GUI, toolchains and packaging.
## 
## The Windows port was originally developed by Guido
## Masarotto (for a while a member of R Core) and Brian
## Ripley, then further by Duncan Murdoch (a former member of
## R Core) and then Jeroen Ooms (base) and Uwe Ligges
## (packages).  Tomas Kalibera is the current main developer
## of the Windows port and provides assistance with package
## porting.
## 
## Tomas Kalibera's work has been sponsored by Jan Vitek and
## funded by his European Research Council grant "Evolving
## Language Ecosystems (ELE)".
## 
## Computing support (including hardware, hosting and
## infrastructure) has been provided/funded by the R
## Foundation, employers of R-Core members (notably WU Wien,
## ETH Zurich, U Oxford and U Iowa) and by Northeastern
## University and the University of Kent.
## 
## Distributions of R contain the recommended packages, whose
## authors/contributors are listed in their DESCRIPTION files.
# Indent the first line of each paragraph by 5 characters
writeLines(strwrap(x, width = 60, indent = 5))
##      J. D. Beasley, David J. Best, Richard Brent, Kevin
## Buhr, Michael A. Covington, Bill Cleveland, Robert
## Cleveland,, G. W. Cran, C. G. Ding, Ulrich Drepper, Paul
## Eggert, J. O. Evans, David M. Gay, H. Frick, G. W. Hill,
## Richard H. Jones, Eric Grosse, Shelby Haberman, Bruno
## Haible, John Hartigan, Andrew Harvey, Trevor Hastie, Min
## Long Lam, George Marsaglia, K. J. Martin, Gordon
## Matzigkeit, C. R. Mckenzie, Jean McRae, Cyrus Mehta, Fionn
## Murtagh, John C. Nash, Finbarr O'Sullivan, R. E. Odeh,
## William Patefield, Nitin Patel, Alan Richardson, D. E.
## Roberts, Patrick Royston, Russell Lenth, Ming-Jen Shyu,
## Richard C. Singleton, S. G. Springer, Supoj Sutanthavibul,
## Irma Terpenning, G. E. Thomas, Rob Tibshirani, Wai Wan
## Tsang, Berwin Turlach, Gary V. Vaughan, Michael Wichura,
## Jingbo Wang, M. A. Wong, and the Free Software Foundation
## (for autoconf code and utilities). See also files under
## src/extras.
## 
##      Many more, too numerous to mention here, have
## contributed by sending bug reports and suggesting various
## improvements.
## 
##      Simon Davies whilst at the University of Auckland
## wrote the original version of glm().
## 
##      Julian Harris and Wing Kwong (Tiki) Wan whilst at the
## University of Auckland assisted Ross Ihaka with the
## original Macintosh port.
## 
##      R was inspired by the S environment which has been
## principally developed by John Chambers, with substantial
## input from Douglas Bates, Rick Becker, Bill Cleveland,
## Trevor Hastie, Daryl Pregibon and Allan Wilks.
## 
##      A special debt is owed to John Chambers who has
## graciously contributed advice and encouragement in the
## early days of R and later became a member of the core team.
## 
##      Stefano Iacus (up to 2014, a former member of R Core)
## and Simon Urbanek developed the macOS port, including the
## R.app GUI, toolchains and packaging.
## 
##      The Windows port was originally developed by Guido
## Masarotto (for a while a member of R Core) and Brian
## Ripley, then further by Duncan Murdoch (a former member of
## R Core) and then Jeroen Ooms (base) and Uwe Ligges
## (packages).  Tomas Kalibera is the current main developer
## of the Windows port and provides assistance with package
## porting.
## 
##      Tomas Kalibera's work has been sponsored by Jan Vitek
## and funded by his European Research Council grant "Evolving
## Language Ecosystems (ELE)".
## 
##      Computing support (including hardware, hosting and
## infrastructure) has been provided/funded by the R
## Foundation, employers of R-Core members (notably WU Wien,
## ETH Zurich, U Oxford and U Iowa) and by Northeastern
## University and the University of Kent.
## 
##      Distributions of R contain the recommended packages,
## whose authors/contributors are listed in their DESCRIPTION
## files.
# Indent every line of each paragraph except the first by 5 characters
writeLines(strwrap(x, width = 60, exdent = 5))
## J. D. Beasley, David J. Best, Richard Brent, Kevin Buhr,
##      Michael A. Covington, Bill Cleveland, Robert
##      Cleveland,, G. W. Cran, C. G. Ding, Ulrich Drepper,
##      Paul Eggert, J. O. Evans, David M. Gay, H. Frick, G.
##      W. Hill, Richard H. Jones, Eric Grosse, Shelby
##      Haberman, Bruno Haible, John Hartigan, Andrew Harvey,
##      Trevor Hastie, Min Long Lam, George Marsaglia, K. J.
##      Martin, Gordon Matzigkeit, C. R. Mckenzie, Jean McRae,
##      Cyrus Mehta, Fionn Murtagh, John C. Nash, Finbarr
##      O'Sullivan, R. E. Odeh, William Patefield, Nitin
##      Patel, Alan Richardson, D. E. Roberts, Patrick
##      Royston, Russell Lenth, Ming-Jen Shyu, Richard C.
##      Singleton, S. G. Springer, Supoj Sutanthavibul, Irma
##      Terpenning, G. E. Thomas, Rob Tibshirani, Wai Wan
##      Tsang, Berwin Turlach, Gary V. Vaughan, Michael
##      Wichura, Jingbo Wang, M. A. Wong, and the Free
##      Software Foundation (for autoconf code and utilities).
##      See also files under src/extras.
## 
## Many more, too numerous to mention here, have contributed
##      by sending bug reports and suggesting various
##      improvements.
## 
## Simon Davies whilst at the University of Auckland wrote the
##      original version of glm().
## 
## Julian Harris and Wing Kwong (Tiki) Wan whilst at the
##      University of Auckland assisted Ross Ihaka with the
##      original Macintosh port.
## 
## R was inspired by the S environment which has been
##      principally developed by John Chambers, with
##      substantial input from Douglas Bates, Rick Becker,
##      Bill Cleveland, Trevor Hastie, Daryl Pregibon and
##      Allan Wilks.
## 
## A special debt is owed to John Chambers who has graciously
##      contributed advice and encouragement in the early days
##      of R and later became a member of the core team.
## 
## Stefano Iacus (up to 2014, a former member of R Core) and
##      Simon Urbanek developed the macOS port, including the
##      R.app GUI, toolchains and packaging.
## 
## The Windows port was originally developed by Guido
##      Masarotto (for a while a member of R Core) and Brian
##      Ripley, then further by Duncan Murdoch (a former
##      member of R Core) and then Jeroen Ooms (base) and Uwe
##      Ligges (packages).  Tomas Kalibera is the current main
##      developer of the Windows port and provides assistance
##      with package porting.
## 
## Tomas Kalibera's work has been sponsored by Jan Vitek and
##      funded by his European Research Council grant
##      "Evolving Language Ecosystems (ELE)".
## 
## Computing support (including hardware, hosting and
##      infrastructure) has been provided/funded by the R
##      Foundation, employers of R-Core members (notably WU
##      Wien, ETH Zurich, U Oxford and U Iowa) and by
##      Northeastern University and the University of Kent.
## 
## Distributions of R contain the recommended packages, whose
##      authors/contributors are listed in their DESCRIPTION
##      files.
# Prepend a prefix to every output line
writeLines(strwrap(x, prefix = "THANKS> "))
## THANKS> J. D. Beasley, David J. Best, Richard Brent, Kevin Buhr,
## THANKS> Michael A. Covington, Bill Cleveland, Robert Cleveland,, G. W.
## THANKS> Cran, C. G. Ding, Ulrich Drepper, Paul Eggert, J. O. Evans,
## THANKS> David M. Gay, H. Frick, G. W. Hill, Richard H. Jones, Eric
## THANKS> Grosse, Shelby Haberman, Bruno Haible, John Hartigan, Andrew
## THANKS> Harvey, Trevor Hastie, Min Long Lam, George Marsaglia, K. J.
## THANKS> Martin, Gordon Matzigkeit, C. R. Mckenzie, Jean McRae, Cyrus
## THANKS> Mehta, Fionn Murtagh, John C. Nash, Finbarr O'Sullivan, R. E.
## THANKS> Odeh, William Patefield, Nitin Patel, Alan Richardson, D. E.
## THANKS> Roberts, Patrick Royston, Russell Lenth, Ming-Jen Shyu, Richard
## THANKS> C. Singleton, S. G. Springer, Supoj Sutanthavibul, Irma
## THANKS> Terpenning, G. E. Thomas, Rob Tibshirani, Wai Wan Tsang, Berwin
## THANKS> Turlach, Gary V. Vaughan, Michael Wichura, Jingbo Wang, M. A.
## THANKS> Wong, and the Free Software Foundation (for autoconf code and
## THANKS> utilities). See also files under src/extras.
## THANKS> 
## THANKS> Many more, too numerous to mention here, have contributed by
## THANKS> sending bug reports and suggesting various improvements.
## THANKS> 
## THANKS> Simon Davies whilst at the University of Auckland wrote the
## THANKS> original version of glm().
## THANKS> 
## THANKS> Julian Harris and Wing Kwong (Tiki) Wan whilst at the
## THANKS> University of Auckland assisted Ross Ihaka with the original
## THANKS> Macintosh port.
## THANKS> 
## THANKS> R was inspired by the S environment which has been principally
## THANKS> developed by John Chambers, with substantial input from Douglas
## THANKS> Bates, Rick Becker, Bill Cleveland, Trevor Hastie, Daryl
## THANKS> Pregibon and Allan Wilks.
## THANKS> 
## THANKS> A special debt is owed to John Chambers who has graciously
## THANKS> contributed advice and encouragement in the early days of R and
## THANKS> later became a member of the core team.
## THANKS> 
## THANKS> Stefano Iacus (up to 2014, a former member of R Core) and Simon
## THANKS> Urbanek developed the macOS port, including the R.app GUI,
## THANKS> toolchains and packaging.
## THANKS> 
## THANKS> The Windows port was originally developed by Guido Masarotto
## THANKS> (for a while a member of R Core) and Brian Ripley, then further
## THANKS> by Duncan Murdoch (a former member of R Core) and then Jeroen
## THANKS> Ooms (base) and Uwe Ligges (packages).  Tomas Kalibera is the
## THANKS> current main developer of the Windows port and provides
## THANKS> assistance with package porting.
## THANKS> 
## THANKS> Tomas Kalibera's work has been sponsored by Jan Vitek and
## THANKS> funded by his European Research Council grant "Evolving
## THANKS> Language Ecosystems (ELE)".
## THANKS> 
## THANKS> Computing support (including hardware, hosting and
## THANKS> infrastructure) has been provided/funded by the R Foundation,
## THANKS> employers of R-Core members (notably WU Wien, ETH Zurich, U
## THANKS> Oxford and U Iowa) and by Northeastern University and the
## THANKS> University of Kent.
## THANKS> 
## THANKS> Distributions of R contain the recommended packages, whose
## THANKS> authors/contributors are listed in their DESCRIPTION files.

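Here is a trickier example. It compares the requested width (target) with the length of the longest line actually produced (actual): strwrap keeps lines strictly shorter than width, except that a single word is never split, which is why a target of 10 can still yield a 10-character line.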

x <- paste(sapply(
  sample(10, 100, replace = TRUE), # draw 100 values from 1:10 with replacement
  function(x) substring("aaaaaaaaaa", 1, x)
), collapse = " ")
sapply(
  10:40,
  function(m)
    c(target = m, actual = max(nchar(strwrap(x, m))))
)
##        [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
## target   10   11   12   13   14   15   16   17   18    19    20    21    22
## actual   10   10   11   12   13   14   15   16   17    18    19    20    21
##        [,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25]
## target    23    24    25    26    27    28    29    30    31    32    33    34
## actual    22    23    24    25    26    27    28    29    30    31    31    33
##        [,26] [,27] [,28] [,29] [,30] [,31]
## target    35    36    37    38    39    40
## actual    34    35    36    37    38    39
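The table suggests that wrapped lines stay strictly shorter than width unless a single word is already too long. A small deterministic check of that reading, using made-up strings, might look like this:

```r
words <- "aaaa bbbb cccc"

# "aaaa bbbb" has 9 characters, so it fits on one line only when width is at least 10
fits <- strwrap(words, width = 10)
splits <- strwrap(words, width = 9)
fits
splits

# A word longer than the width is not broken; it is kept whole on its own line
long <- strwrap("bb aaaaaaaaaa cc", width = 5)
max(nchar(long))
```

Unlike the random example, this one gives the same result on every run, so it is easier to reason about.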

4.14.2 strtrim

strtrim(x, width)

strtrim truncates each string in x to a given display width. The result is a character vector of the same length as x; width is an integer vector that is recycled when it is shorter than x.

strtrim(c("abcdef", "abcdef", "abcdef"), c(1, 5, 10))
## [1] "a"      "abcde"  "abcdef"
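To make the recycling of width concrete, here is a short sketch; the fruits vector is invented for illustration.

```r
fruits <- c("apple", "banana", "cherry", "damson")

# A single width applies to every element; shorter strings are left untouched
single <- strtrim(c("abcdef", "xy"), 3)
single

# Two widths are recycled over four strings: 2, 4, 2, 4
trimmed <- strtrim(fruits, c(2, 4))
trimmed
```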

4.14.3 strrep

strrep(x, times)

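strrep repeats each element of a character vector the given number of times and concatenates the copies. Both x and times are recycled as needed.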

strrep("ABC", 2)
## [1] "ABCABC"
strrep(c("A", "B", "C"), 1 : 3)
## [1] "A"   "BB"  "CCC"
# Build a character vector whose elements contain 1 to 5 spaces
strrep(" ", 1 : 5)
## [1] " "     "  "    "   "   "    "  "     "
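One common use of strrep is building padding. The sketch below defines pad_left, a hypothetical helper that is not part of base R, to right-align strings by prepending spaces.

```r
# Hypothetical helper: left-pad strings with spaces to a common width
pad_left <- function(s, width) {
  n_missing <- pmax(width - nchar(s), 0)  # never ask for a negative count
  paste0(strrep(" ", n_missing), s)
}

padded <- pad_left(c("a", "abc", "abcde"), 4)
padded

# A separator line made of 20 dashes
rule <- strrep("-", 20)
rule
```

Strings that are already longer than the requested width are returned unchanged, because the repeat count is clamped at zero.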

4.14.4 trimws

trimws(x, which = c("both", "left", "right"), whitespace = "[ \t\r\n]")

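trimws removes leading and/or trailing whitespace from strings. By default, whitespace means spaces, tabs, carriage returns and newlines; the whitespace argument is a regular expression that defines what is trimmed. The which argument selects the side to trim: "both" (the default), "left" or "right". For example: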

x <- "  Some text. "
x
## [1] "  Some text. "
trimws(x)
## [1] "Some text."
trimws(x, "l")
## [1] "Some text. "
trimws(x, "r")
## [1] "  Some text."
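The whitespace argument can be changed to trim other characters. A minimal sketch with made-up input follows; it assumes R 3.6.0 or later, where this argument is available.

```r
messy <- c("\t tabbed\n", "  spaces  ", "..dotted..")

# The default whitespace class removes spaces, tabs, carriage returns and newlines
default_trim <- trimws(messy)
default_trim

# whitespace is a regular expression, so other characters can be trimmed as well
dots_trim <- trimws(messy[3], whitespace = "[.]")
dots_trim

# Interior whitespace is never touched
left_trim <- trimws("  a  b  ", which = "left")
left_trim
```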
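# Replace the first digit in each element with "aa" using stringr
shopping_list <- c("apples x4", "bag of flour", "bag of sugar", "milk x2")

stringr::str_replace(string = shopping_list, pattern = "\\d", replacement = "aa")
## [1] "apples xaa"   "bag of flour" "bag of sugar" "milk xaa"
# A base R counterpart, adapted from https://github.com/hadley/stringb/issues/5
# x is a character vector; fun is applied to the matches found in each element
# (unlike stringr::str_replace, it replaces every match, not only the first)
str_replace <- function(x, pattern, fun, ...) {
  loc <- gregexpr(pattern, text = x, perl = TRUE)
  matches <- regmatches(x, loc)
  out <- lapply(matches, fun, ...)

  regmatches(x, loc) <- out
  x
}

# The same steps, one at a time
loc <- gregexpr(pattern = "\\d", text = shopping_list, perl = TRUE)

matches <- regmatches(x = shopping_list, loc)

matches

# One replacement string per match; elements without a match stay empty
to_aa <- function(m) rep("aa", length(m))
out <- lapply(matches, to_aa)

# Assign into a copy so that shopping_list itself is left unchanged
replaced <- shopping_list
regmatches(replaced, loc) <- out

replaced

str_replace(shopping_list, pattern = "\\d", fun = to_aa)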

4.14.5 tolower

tolower and toupper are a pair: tolower converts uppercase letters to lowercase, and toupper converts lowercase letters to uppercase.

simpleCap <- function(x) {
  x <- tolower(x)
  s <- strsplit(x, " ")[[1]]
  paste(toupper(substring(s, 1, 1)), substring(s, 2),
    sep = "", collapse = " "
  )
}
# Titles in bibliography entries need the first letter of every English word capitalized
simpleCap(x = "THE USE OF MULTIPLE MEASUREMENTS IN TAXONOMIC PROBLEMS")
## [1] "The Use Of Multiple Measurements In Taxonomic Problems"
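simpleCap only handles a single string, because it takes the first element of the strsplit result. As a possible vectorized alternative, the sketch below uses gsub with Perl-style case conversion; cap_words and the titles vector are names invented for this example.

```r
titles <- c("THE USE OF MULTIPLE MEASUREMENTS", "statistical learning")

lower <- tolower(titles)
upper <- toupper(titles)

# Capitalize the first letter of every word in each element:
# \\U upper-cases the first letter, \\L lower-cases the rest (perl = TRUE is required)
cap_words <- function(x) {
  gsub("(\\w)(\\w*)", "\\U\\1\\L\\2", x, perl = TRUE)
}

capped <- cap_words(titles)
capped
```

Like simpleCap, this capitalizes every word, including short words such as "of" that some citation styles keep in lowercase.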