7 fread
readDWD has the argument fread to read datasets through data.table::fread.
This is significantly faster than base unzip + read.table, especially for large historical files.
The default is fread=NA, which checks for availability of the R package data.table and the system command unzip.
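The two conditions checked by the default can be reproduced by hand. The following is a minimal sketch and not rdwd's actual implementation; the helper name can_use_fread is hypothetical and only illustrates the idea.

```r
# Hypothetical helper (not part of rdwd): are both prerequisites
# for the fast reading path present on this machine?
can_use_fread <- function() {
  # TRUE if the data.table package is installed and loadable
  has_datatable <- requireNamespace("data.table", quietly = TRUE)
  # Sys.which returns "" if the command is not found on PATH
  has_unzip <- unname(nzchar(Sys.which("unzip")))
  has_datatable && has_unzip
}

use_fread <- can_use_fread()
print(use_fread)
```

The result could be passed on explicitly, e.g. as readDWD("file.zip", fread=use_fread), if you prefer to make the decision visible in your script.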
7.1 windows unzip
On Windows, the system command unzip is not available by default, only through e.g. a git shell or Rtools.
If needed, install Rtools (directly at C:/Rtools since compiler paths may not have spaces, as there would be with C:/Program Files/R/Rtools/).
You might then need to add something like C:/rtools40/usr/bin to PATH, e.g. from within R, and then restart R (CTRL + SHIFT + F10 should suffice).
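One way to make such a PATH change survive the restart is an entry in the user-level .Renviron file, which R reads at startup. The sketch below is an assumption about how this could be done, not necessarily the code originally shown here. The helper name renviron_path_entry is hypothetical, and the Rtools directory must be adjusted to your installation.

```r
# Hypothetical helper: build an .Renviron entry that prepends a
# directory to PATH. R expands ${PATH} when it reads the file at startup.
renviron_path_entry <- function(dir) {
  sprintf('PATH="%s;${PATH}"', dir)  # ";" is the PATH separator on Windows
}

entry <- renviron_path_entry("C:/rtools40/usr/bin")
print(entry)

# To make the change persistent, append the entry to the user-level
# .Renviron (uncomment to run; this modifies a file in your home directory):
# write(entry, file = "~/.Renviron", append = TRUE)
```

The write call is left commented out on purpose, so that running the sketch does not change your configuration by accident.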
Then check whether unzip is now available from within R.
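A minimal sketch of such a check, using the base R function Sys.which (the variable names are only illustrative):

```r
# Look up the unzip executable on PATH.
# Sys.which returns the full path, or "" if the command is not found.
unzip_path <- Sys.which("unzip")
unzip_available <- unname(nzchar(unzip_path))

if (unzip_available) {
  message("unzip found at: ", unzip_path)
} else {
  message("unzip not found on PATH")
}
```

If the command is reported as missing right after changing PATH, restarting R is usually the first thing to try.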
Note that GitHub user Mightynasty did not get unzip in Rtools to work correctly.
For more background information, see rdwd issues 22, 23 and 24.
7.2 alternative
In case you do not need fast reading and don’t want the warning about unzip, simply use readDWD("file.zip", fread=FALSE).
7.3 error messages
The following error messages have been reported to me.
In these messages, [pfile] is the produkt file, e.g. produkt_klima_tag_19370101_19860630_00001.txt, for [path] DWDdata/daily_kl_historical_tageswerte_KL_00001_19370101_19860630_hist.zip, and [Rtempfile] is a temporary file created by R.
‘unzip’ is not recognized as an internal or external command, operable program or batch file.
'(unzip -p [path].zip [pfile].txt) > [Rtempfile]' execution failed with error code 1
File ‘[Rtempfile]’ has size 0. Returning a NULL data.frame. File contains no rows: [path].zip
Error in data.table::fread(paste("unzip -p", f, fp), na.strings = na9(), : File is empty: [Rtempfile]
In addition: Warning messages:
1: running command 'C:\Windows\system32\cmd.exe /c (unzip -p [path].zip [pfile].txt) > [Rtempfile]' had status 1
2: In shell(paste("(", input, ") > ", tt, sep = "")) : '(unzip -p [path].zip [pfile].txt) > [Rtempfile]' execution failed with error code 1
Der Befehl “unzip” ist entweder falsch geschrieben oder konnte nicht gefunden werden.
The command “unzip” is either misspelled or could not be found.