Parse Web of Science (WoS) export files in multiple formats and return a
tidy table. The function automatically dispatches to a specialized parser
based on the format argument and can also download from a URL if file
points to an http:// or https:// resource.
Arguments
- file
Character scalar or vector. Path(s) to a WoS export file, or a single URL (http:// or https://) pointing to a WoS export.
- format
Character scalar. Export format; one of "bib", "ris", "txt-plain-text", or "txt-tab-delimited".
- normalized_names
Logical. If TRUE (default), use standardized column names when possible; if FALSE, keep original WoS field tags.
Value
A tibble with the parsed WoS records. See Details for notes on
added/coerced columns (DI2, PY, DB) and column ordering.
Details
file may be a single path/URL or a vector of paths; multiple files are combined row-wise when applicable.
When file is a URL, the file is downloaded to a temporary path before parsing (a progress message is printed).
If normalized_names = TRUE, selected WoS tags are mapped to standardized names (e.g., AU → author, TI → title, PY → year, DI → doi, DE → keywords, SR → unique_id, etc.; the exact mapping depends on the format). Otherwise, original field tags are preserved.
The output includes:
- DI2: an uppercase, punctuation-stripped variant of DI (if present),
- PY: coerced to numeric (when present),
- DB: a provenance flag indicating the source/format and whether names were normalized.
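The DI2 and PY steps can be illustrated with a minimal base R sketch. The toy data frame, the [[:punct:]] pattern, and the use of as.numeric() are assumptions made for illustration; the package's own implementation may differ in detail.

```r
# Toy records using original WoS field tags; the values are made up.
records <- data.frame(
  DI = c("10.1000/abc-123", "10.1000/xyz.456"),
  PY = c("2019", "n/a"),
  stringsAsFactors = FALSE
)

# DI2: DI with punctuation removed, then uppercased.
# Assumption: "punctuation" means the POSIX [[:punct:]] class.
records$DI2 <- toupper(gsub("[[:punct:]]", "", records$DI))

# PY: coerce to numeric; entries that are not numbers become NA.
records$PY <- suppressWarnings(as.numeric(records$PY))

print(records)
```

A punctuation-free, uppercase DOI variant of this kind is convenient as a join or de-duplication key, because it ignores differences in case and separators.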
Columns with ALL-CAPS tags (e.g., AU, TI, PY) are placed first, followed by other columns, and DI2 is relocated just after DI.
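The column ordering described above can be sketched on a plain character vector of column names. The example names and the test used to recognise an ALL-CAPS tag (the name equals its own uppercase form) are assumptions for illustration, not the package's actual rule.

```r
# Made-up column names mixing WoS tags and other columns.
cols <- c("title_note", "DI", "AU", "source", "PY", "DI2", "TI")

# ALL-CAPS tags first, then the remaining columns; each group keeps
# its original relative order.
# Assumption: a tag is any name that is unchanged by toupper().
is_tag <- cols == toupper(cols)
ordered <- c(cols[is_tag], cols[!is_tag])

# Relocate DI2 so that it sits immediately after DI.
if (all(c("DI", "DI2") %in% ordered)) {
  ordered <- setdiff(ordered, "DI2")
  ordered <- append(ordered, "DI2", after = match("DI", ordered))
}

print(ordered)
```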
Supported formats
- "bib": BibTeX export
- "ris": RIS export
- "txt-plain-text": Plain-text export
- "txt-tab-delimited": Tab-delimited export
See also
Internal parsers used by this function:
read_wos_bib, read_wos_ris, read_wos_plain, read_wos_tab.
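As a rough sketch of how dispatch on format and URL detection might look, the following base R code maps each format to a parser name and checks whether file is a single http(s) URL. The helpers pick_parser() and is_url() are hypothetical and not part of the package; the pairing of formats with parsers is inferred from the parser names and should be read as an assumption.

```r
# Hypothetical helper: return the name of the parser for a given format.
# match.arg() rejects values outside the supported set.
pick_parser <- function(format = c("bib", "ris", "txt-plain-text",
                                   "txt-tab-delimited")) {
  format <- match.arg(format)
  switch(format,
    "bib"               = "read_wos_bib",
    "ris"               = "read_wos_ris",
    "txt-plain-text"    = "read_wos_plain",
    "txt-tab-delimited" = "read_wos_tab"
  )
}

# Hypothetical helper: TRUE when `file` is a single http:// or https:// URL.
is_url <- function(file) {
  length(file) == 1L && grepl("^https?://", file)
}

pick_parser("ris")
is_url("https://example.org/savedrecs.bib")
```

Returning the parser name rather than calling the parser keeps the sketch independent of the internal parsers' signatures, which are not documented here.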