Read Web of Science exported files

Parse Web of Science (WoS) export files in multiple formats and return a tidy table. The function automatically dispatches to a specialized parser based on the format argument and can also download from a URL if file points to an http:// or https:// resource.

Usage

read_wos(file, format = "bib", normalized_names = TRUE)

Arguments

file: Character scalar or vector. Path(s) to a WoS export file, or a single URL (http:// or https://) pointing to a WoS export.
format: Character scalar. Export format; one of "bib", "ris", "txt-plain-text", or "txt-tab-delimited".
normalized_names: Logical. If TRUE (default), use standardized column names when possible; if FALSE, keep original WoS field tags.

Value

A tibble with the parsed WoS records. See Details for notes on added/coerced columns (DI2, PY, DB) and column ordering.

Details

file may be a single path/URL or a vector of paths; multiple files will be combined row-wise when applicable.
When file is a URL, the file is downloaded to a temporary path before parsing (a progress message is printed).
If normalized_names = TRUE, selected WoS tags are mapped to standardized names (e.g., AU → author, TI → title, PY → year, DI → doi, DE → keywords, SR → unique_id, etc.; the exact mapping depends on the format). Otherwise, original field tags are preserved.
The output includes:
- DI2: an uppercase, punctuation-stripped variant of DI (if present),
- PY: coerced to numeric (when present),
- DB: a provenance flag indicating the source/format and whether names were normalized.
Columns with ALL-CAPS tags (e.g., AU, TI, PY) are placed first, followed by other columns, and DI2 is relocated just after DI.

Supported formats

"bib" — BibTeX export
"ris" — RIS export
"txt-plain-text" — Plain-text export
"txt-tab-delimited" — Tab-delimited export

Examples

if (FALSE) { # \dontrun{

# load data from websites
# M <- birddog::read_wos('http://yoursite/wos-savedrecs-plain-text.txt', format = "txt-plain-text")

 # load from local files
  M <- read_wos('~/Downloads/savedrecs.bib', format = "bib", normalized_names = TRUE)

} # }