Analyzing PubMed Metadata with VOSviewer and Biblioshiny || Bibliometric and Systematic Review

TL;DR

Build a PubMed query, apply inclusion filters (year range, article type, study category), export the results in PubMed format, then analyze the file in Biblioshiny and VOSviewer.

Briefing

PubMed records can be turned into a full bibliometric and systematic-review workflow by pulling author- and keyword-level data from NCBI, filtering it to a PRISMA-style inclusion set, and then running it through VOSviewer and Biblioshiny for network maps, thematic clustering, and publication trends. The practical payoff is that researchers can cross-check evidence coverage and generate visual outputs—keyword co-occurrence, author networks, density maps, and time-based publication counts—without manually cleaning spreadsheets or reformatting citations.

The process starts with creating an NCBI account and signing in to PubMed. After selecting the relevant search scope, the user runs a keyword query (the transcript uses “ECG signal analysis” as an example) and reviews the hit count. The interface then offers filters for publication year range, article type (e.g., reviews), study categories such as clinical trials, and sex (the transcript’s example is “female data only”). The transcript emphasizes that narrowing the dataset matters: in its example, a large result set shrinks step by step as a 10-year window, article types, and the optional female-only filter are applied, with the document count dropping accordingly. The filtered results can then be downloaded in a format suitable for bibliometric tools.
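
As a rough illustration of how these filters combine, the sketch below builds an equivalent query string in Python using PubMed field tags ([dp] for publication date, [pt] for publication type) and wraps it in an NCBI E-utilities esearch URL. The 2015 to 2024 window, the chosen article types, and the helper names are assumptions for illustration; the transcript applies its filters through the web interface rather than in code.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint (returns matching PMIDs and a hit count).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_query(keyword, start_year=None, end_year=None, publication_types=()):
    """Combine a keyword with optional year-range and article-type filters."""
    parts = [f"({keyword})"]
    if start_year is not None and end_year is not None:
        # [dp] = date of publication; "a:b" expresses a range.
        parts.append(f"{start_year}:{end_year}[dp]")
    if publication_types:
        # [pt] = publication type; alternatives are OR-ed together.
        types = " OR ".join(f'"{pt}"[pt]' for pt in publication_types)
        parts.append(f"({types})")
    return " AND ".join(parts)


def esearch_url(query, retmax=0):
    """Build (but do not send) an esearch request URL for the query."""
    params = {"db": "pubmed", "term": query, "retmode": "json", "retmax": retmax}
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Hypothetical example values: a 10-year window and two article types.
query = build_pubmed_query(
    "ECG signal analysis", 2015, 2024, ["Review", "Clinical Trial"]
)
print(query)
print(esearch_url(query))
```

Writing the query down as a string like this makes the inclusion criteria easy to record in a review protocol and to rerun later.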

Once the filtered PubMed results are selected, the workflow shifts to export: the transcript describes choosing the PubMed format and downloading the records as a text file. That file is then imported into Biblioshiny, the web interface to the R package bibliometrix. The transcript ties Biblioshiny to PRISMA through a flow-diagram concept whose identification, screening, and inclusion steps mirror systematic review reporting. It also notes that metadata quality affects downstream visuals: the keyword and Keywords Plus fields are rated acceptable, while missing cited references can weaken certain analyses. After loading the dataset, Biblioshiny produces outputs such as annual scientific production (year-wise publication counts), most relevant authors, and network-based views including keyword co-occurrence networks, thematic maps, and factorial analysis when the dataset is large enough.
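
To make the export format and the "annual scientific production" idea concrete, here is a minimal Python sketch that parses PubMed-format text (tagged lines such as PMID, DP, TI, and AU, with wrapped values indented by six spaces) and counts records per year and per author. It is an illustration only, not how Biblioshiny works internally, and the three sample records are invented.

```python
from collections import Counter, defaultdict

# Invented sample in PubMed (MEDLINE-style) tagged format.
SAMPLE = """\
PMID- 1
DP  - 2021 Mar
TI  - A made-up title that wraps onto
      a second line.
AU  - Author A
AU  - Author B

PMID- 2
DP  - 2021
TI  - Another made-up title.
AU  - Author A

PMID- 3
DP  - 2023 Jul 4
TI  - Third made-up title.
AU  - Author C
"""


def parse_medline(text):
    """Parse PubMed-format text into a list of {tag: [values]} dicts."""
    records, current, last_tag = [], defaultdict(list), None
    for line in text.splitlines():
        if not line.strip():  # a blank line ends the current record
            if current:
                records.append(dict(current))
            current, last_tag = defaultdict(list), None
        elif line.startswith("      ") and last_tag:  # wrapped value
            current[last_tag][-1] += " " + line.strip()
        elif len(line) > 5 and line[4] == "-":  # "TAG - value"
            last_tag = line[:4].strip()
            current[last_tag].append(line[6:].strip())
    if current:
        records.append(dict(current))
    return records


def annual_production(records):
    """Count records per publication year, taken from the DP field."""
    years = Counter()
    for rec in records:
        dp = rec.get("DP", [""])[0]
        if dp[:4].isdigit():
            years[int(dp[:4])] += 1
    return dict(sorted(years.items()))


records = parse_medline(SAMPLE)
authors = Counter(a for rec in records for a in rec.get("AU", []))
print(annual_production(records))  # {2021: 2, 2023: 1}
print(authors.most_common(1))      # [('Author A', 2)]
```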

The workflow then moves to VOSviewer for additional visualization. The PubMed export is loaded through VOSviewer’s “create” options, and the transcript describes both author network mapping and keyword mapping. It also covers a text-based map that extracts terms from titles and abstracts; the transcript mentions that structural information (apparently the section labels of structured abstracts) can be ignored or retained depending on settings. Users can adjust parameters such as the minimum occurrence threshold, scale, and overlay visualization, then save or share the resulting maps.
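
The effect of the minimum occurrence threshold can be sketched in a few lines of Python. The example below counts in how many documents each keyword appears, drops keywords below the threshold, and then counts how often the remaining keywords appear together. It only illustrates the counting idea; VOSviewer’s own normalization, layout, and clustering are not reproduced, and the keyword lists are invented.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(keyword_lists, min_occurrences=2):
    """Return (node occurrence counts, pairwise link counts) after thresholding."""
    # Count each keyword once per document.
    occurrences = Counter(kw for kws in keyword_lists for kw in set(kws))
    kept = {kw for kw, n in occurrences.items() if n >= min_occurrences}
    links = Counter()
    for kws in keyword_lists:
        present = sorted(set(kws) & kept)  # sorted so each pair has one key
        for a, b in combinations(present, 2):
            links[(a, b)] += 1
    return {kw: occurrences[kw] for kw in kept}, dict(links)


# Invented keyword lists, one per document.
docs = [
    ["ecg", "deep learning", "arrhythmia"],
    ["ecg", "arrhythmia"],
    ["ecg", "wavelet"],
    ["deep learning", "ecg"],
]

nodes, links = cooccurrence(docs, min_occurrences=2)
print(nodes)  # "wavelet" appears once, so it is dropped at threshold 2
print(links)
```

Raising the threshold removes rare terms and their links, which is why small datasets can end up with very sparse maps.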

Overall, the transcript frames the method as a repeatable pipeline: query PubMed, apply systematic-review style filters, export in a bibliometric-friendly format, run Biblioshiny for PRISMA-like reporting and bibliometric networks, and use VOSviewer for publication and keyword/author visualizations. The key constraint is data adequacy—if the dataset is too small or metadata is incomplete, some network and thematic outputs may fail to form cleanly.

Cornell Notes

The workflow turns PubMed search results into bibliometric networks and systematic-review reporting by combining NCBI/PubMed export with Biblioshiny (R) and VOSviewer. After logging into PubMed and running a keyword query, researchers filter by year range, article type (e.g., reviews), and study categories such as clinical trials, optionally adding gender-related filters. The filtered records are exported in a PubMed-compatible format, then imported into Biblioshiny to generate PRISMA-style identification/screening/inclusion counts and bibliometric outputs like annual publication trends, author relevance, and keyword networks. VOSviewer then reads the same bibliographic file to create author networks, keyword density maps, and text-based maps from titles and abstracts. Results depend heavily on dataset size and metadata completeness (e.g., cited references).

How does the workflow connect PubMed searching to systematic-review style inclusion?

It uses PubMed to generate an initial set of records from a keyword query, then applies filters that reduce the dataset (example filters include year range, article type such as reviews, and clinical-trial-related categories). The reduced set becomes the “included” corpus. Biblioshiny then provides a PRISMA-like flow concept—identification, screening, and inclusion—so the counts at each stage can be reported alongside the bibliometric outputs.
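
The stage counts can be tracked with a small helper like the Python sketch below, which applies named filters in order and records how many records survive each one. The records, the filter names, and the year window are invented for illustration; in the transcript these counts come from the PubMed interface, not from code.

```python
def prisma_counts(records, filters):
    """Apply (name, predicate) filters in order; return stage counts and survivors."""
    flow = [("identified", len(records))]
    remaining = list(records)
    for name, keep in filters:
        remaining = [r for r in remaining if keep(r)]
        flow.append((name, len(remaining)))
    return flow, remaining


# Invented records standing in for a PubMed result set.
records = [
    {"pmid": "1", "year": 2012, "type": "Review"},
    {"pmid": "2", "year": 2018, "type": "Review"},
    {"pmid": "3", "year": 2020, "type": "Journal Article"},
    {"pmid": "4", "year": 2022, "type": "Clinical Trial"},
    {"pmid": "5", "year": 2023, "type": "Review"},
]

# Hypothetical inclusion criteria: a 10-year window, then selected article types.
filters = [
    ("after year filter", lambda r: 2015 <= r["year"] <= 2024),
    ("after article-type filter", lambda r: r["type"] in {"Review", "Clinical Trial"}),
]

flow, included = prisma_counts(records, filters)
for stage, count in flow:
    print(f"{stage}: {count}")
```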

Why do filters like year range and article type matter for bibliometric maps?

Bibliometric networks (keyword co-occurrence, author coupling, thematic maps) need enough records and consistent metadata. The transcript describes reducing a large hit count to a smaller inclusion set by selecting a 10-year window and restricting article types, which cuts noise and helps the resulting clusters form more clearly. The trade-off is that filtering too aggressively can leave too few records for stable networks.

What role does metadata completeness play in Biblioshiny outputs?

Metadata quality affects which analyses can be generated reliably. The transcript notes that keyword and keyword-plus fields are usable, but missing cited references can reduce the strength of certain network or reference-based analyses. In short: more complete, well-structured metadata tends to produce clearer clustering and stronger visual outputs.
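
A quick way to check this before running any analysis is to measure how often each field is empty. The Python sketch below reports the share of records missing each field; the field tags follow the bibliometrix convention (AU authors, DE author keywords, ID Keywords Plus, CR cited references), and the four records are invented for illustration.

```python
def missing_share(records, fields):
    """Return the percentage of records with no value for each field."""
    total = len(records)
    report = {}
    for field in fields:
        missing = sum(1 for r in records if not r.get(field))
        report[field] = round(100 * missing / total, 1) if total else None
    return report


# Invented records; cited references (CR) are absent throughout.
records = [
    {"AU": ["Author A"], "DE": ["ecg"], "ID": ["Electrocardiography"], "CR": []},
    {"AU": ["Author B"], "DE": [], "ID": ["Arrhythmias, Cardiac"], "CR": []},
    {"AU": ["Author C"], "DE": ["wavelet"], "ID": ["Electrocardiography"]},
    {"AU": ["Author A"], "DE": ["ecg"], "ID": [], "CR": []},
]

report = missing_share(records, ["AU", "DE", "ID", "CR"])
for field, pct in report.items():
    print(f"{field}: {pct}% missing")
```

A field that is missing in every record, as CR is here, rules out any analysis built on it, while partially missing fields mainly thin out the corresponding networks.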

What kinds of visualizations come from Biblioshiny after importing the exported PubMed file?

After importing the PubMed-format file into Biblioshiny, the workflow produces time-based publication trends (annual scientific production), lists of most relevant authors, and network-based views such as keyword networks and thematic maps. When the dataset is large enough, it also supports additional outputs such as coupling and density-style keyword visualizations.

How does VOSviewer complement Biblioshiny in the pipeline?

VOSviewer focuses on visualization creation from bibliographic files. The transcript describes using VOSviewer’s create options to build author networks and keyword maps, adjusting parameters like minimum occurrences. It also includes a text-based mapping option that extracts terms from titles and abstracts (with settings such as whether to ignore structural information), enabling different perspectives on the same underlying corpus.

What parameter choices can change the appearance of VOSviewer maps?

The transcript highlights minimum occurrence thresholds (e.g., setting a minimum number of occurrences for terms/authors), scale and overlay visualization controls, and color/background settings. These choices affect which nodes appear and how dense or readable the resulting networks look.

Review Questions

If you apply a strict year filter and reduce the dataset size, which bibliometric outputs are most likely to become unstable or fail to form clear clusters—and why?
What are the main steps needed to reproduce the pipeline from PubMed export to Biblioshiny PRISMA-style counts and then to VOSviewer maps?
How would you decide between using keyword/keyword-plus fields versus title/abstract text extraction when building a VOSviewer map for your review topic?

Key Points

1. Create a PubMed query, then export results in a format compatible with bibliometric tools after applying inclusion filters.
2. Use systematic-review style filtering (year range, article type, and study category) to reduce noise before analysis.
3. Import the exported PubMed file into Biblioshiny (R) to generate PRISMA-like identification/screening/inclusion counts and bibliometric outputs.
4. Treat metadata completeness, especially cited references and keyword fields, as a key determinant of how strong and interpretable networks and thematic maps become.
5. Run VOSviewer after Biblioshiny to generate author networks, keyword density/co-occurrence maps, and text-based term maps from titles and abstracts.
6. Adjust VOSviewer parameters such as minimum occurrence thresholds and visualization settings to control which nodes appear and how readable the maps are.
7. Validate that the final dataset size and metadata quality are sufficient; otherwise, some clustering and thematic outputs may not appear cleanly.

Highlights

Filtering PubMed results by year range and article type can shrink a large hit set into a manageable inclusion corpus that produces clearer bibliometric networks.

Biblioshiny’s PRISMA-style flow counts (identification → screening → inclusion) can be paired directly with bibliometric outputs for evidence coverage and reporting.

VOSviewer can generate multiple map types from the same dataset—author networks, keyword maps, and title/abstract text-based maps—depending on the chosen “create” option.

Metadata completeness (e.g., availability of cited references) strongly influences whether thematic maps and network structures form well.

The pipeline is repeatable: PubMed export → Biblioshiny import → VOSviewer visualization, with parameter tuning at each stage.

Topics

PubMed Query
Biblioshiny Workflow
VOSviewer Networks
PRISMA Filtering
Bibliometric Analysis

Mentioned

NCBI
PRISMA
MeSH