Introduction to Bibliometrix || Manual of Biblioshiny R Studio || User Guide to Bibliometrix

TL;DR

Load raw bibliographic exports (e.g., Scopus data) in Biblioshiny, then run processing to generate structured tables and visualizations.

Briefing

Bibliometrix (used through Biblioshiny in RStudio) turns Scopus-style bibliographic exports into an interactive, filterable analysis dashboard—then expands that dashboard into author, journal, affiliation, keyword, and citation-network insights. The practical payoff is speed: once the raw file is loaded and processed, the system outputs structured tables and multiple visualizations that summarize publication volume over time, top contributors, and the intellectual structure of a research field.

After installing and opening Biblioshiny in a browser from RStudio, the workflow starts with loading “raw data” (the transcript references Scopus data). Processing reads the dataset and reports key counts (for example, the number of documents), then renders results in tabular form. From there, users can apply filters and use an Overview panel to quickly scan headline metrics such as annual scientific production, average citations per year, and author-related summaries. The transcript emphasizes that this kind of bibliometric analysis requires at least some familiarity with the field’s dimensions—especially when moving beyond basic counts into deeper mapping.
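The headline metrics mentioned above can be illustrated with a small sketch. Biblioshiny computes these internally in R; the Python below only mirrors the arithmetic on a hypothetical four-row export. The column names follow common Scopus CSV headers, and the definition of average citations per year (mean citations per document, divided by the years elapsed since publication) is an assumption made for illustration, not a statement of how Bibliometrix implements it.

```python
import csv
import io
from collections import defaultdict

# Hypothetical miniature export; a real Scopus CSV has many more columns.
RAW = """Authors,Title,Year,Source title,Cited by
"Lee A.; Kim B.",Paper one,2019,Journal X,12
"Kim B.",Paper two,2020,Journal Y,6
"Lee A.",Paper three,2020,Journal X,
"Roy C.",Paper four,2021,Journal Z,3
"""


def load_records(text):
    """Parse CSV text into dicts, treating a blank citation count as 0."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        records.append({
            "authors": [a.strip() for a in row["Authors"].split(";")],
            "title": row["Title"],
            "year": int(row["Year"]),
            "source": row["Source title"],
            "cited_by": int(row["Cited by"] or 0),
        })
    return records


def annual_production(records):
    """Count documents per publication year."""
    counts = defaultdict(int)
    for rec in records:
        counts[rec["year"]] += 1
    return dict(sorted(counts.items()))


def avg_citations_per_year(records, current_year):
    """Mean citations per document for each publication year,
    divided by the years elapsed since then (assumed definition)."""
    by_year = defaultdict(list)
    for rec in records:
        by_year[rec["year"]].append(rec["cited_by"])
    result = {}
    for year, cites in sorted(by_year.items()):
        citable_years = max(current_year - year, 1)
        result[year] = (sum(cites) / len(cites)) / citable_years
    return result


records = load_records(RAW)
print("documents:", len(records))
print("annual production:", annual_production(records))
print("avg citations per year:", avg_citations_per_year(records, 2023))
```

The document count printed first corresponds to the figure Biblioshiny reports after processing; the two dictionaries correspond to the per-year series shown in the Overview panel.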

The interface then branches into several analysis layers. A “three-field plot” links top authors, keywords, and journals, making it easier to see which authors publish in which outlets and what topics dominate. The system also surfaces “most relevant sources” (including highly cited documents and journals), plus author production over time, where each bubble's size is proportional to the number of documents an author published in a given year and the line traces that author's activity across the selected time span. Additional breakdowns connect output to affiliations and countries, including options to isolate or “disambiguate” specific institutions so users can focus on a single university or organization’s contribution.
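A three-field plot is essentially a flow diagram whose band widths are co-occurrence counts between adjacent fields. The following is a minimal Python sketch of that counting step, assuming keywords sit in the middle field with authors on the left and journals on the right. The records and the function name `three_field_links` are hypothetical and are not part of Bibliometrix.

```python
from collections import Counter

# Hypothetical records: (authors, author keywords, journal).
RECORDS = [
    (["Lee A.", "Kim B."], ["bibliometrics", "citation analysis"], "Journal X"),
    (["Kim B."], ["bibliometrics"], "Journal Y"),
    (["Lee A."], ["science mapping"], "Journal X"),
]


def three_field_links(records):
    """Count author-keyword and keyword-journal co-occurrences.

    Each count is the width of one band in the flow diagram.
    """
    author_keyword = Counter()
    keyword_journal = Counter()
    for authors, keywords, journal in records:
        for keyword in keywords:
            for author in authors:
                author_keyword[(author, keyword)] += 1
            keyword_journal[(keyword, journal)] += 1
    return author_keyword, keyword_journal


left, right = three_field_links(RECORDS)
for (author, keyword), n in sorted(left.items()):
    print(f"{author} -> {keyword}: {n}")
for (keyword, journal), n in sorted(right.items()):
    print(f"{keyword} -> {journal}: {n}")
```

Reading the two link tables together shows which authors are associated with which topics, and in which journals those topics appear.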

Beyond descriptive analytics, the transcript highlights knowledge synthesis and mapping. It references conceptual, intellectual, and social structure analysis, including network and factorial approaches such as Multiple Correspondence Analysis (MCA) and clustering. Visual outputs include factor maps and dendrograms that group similar terms or concepts; the height at which branches join in a dendrogram reflects how dissimilar the merged groups are, which helps decide where to cut the tree into partitions. Thematic mapping is described as a way to classify themes into categories like emerging, declining, transversal, motor, and highly developed/isolated, using one axis for development and one for centrality. Time-sliced views can show how themes shift across periods.
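The quadrant logic of a thematic map can be sketched in a few lines of Python. The theme names and scores below are hypothetical, and splitting each axis at its median is an assumption chosen for illustration; the tool may place the dividing lines differently.

```python
from statistics import median

# Hypothetical themes with (centrality, development) scores.
THEMES = {
    "machine learning": (0.9, 0.8),
    "peer review": (0.2, 0.9),
    "altmetrics": (0.1, 0.2),
    "citation analysis": (0.8, 0.3),
}


def classify_themes(themes):
    """Assign each theme to a quadrant by comparing its scores
    with the median centrality and median development."""
    centrality_cut = median(c for c, _ in themes.values())
    development_cut = median(d for _, d in themes.values())
    labels = {}
    for name, (centrality, development) in themes.items():
        central = centrality >= centrality_cut
        developed = development >= development_cut
        if central and developed:
            labels[name] = "motor"
        elif developed:
            labels[name] = "highly developed/isolated"
        elif central:
            labels[name] = "transversal"
        else:
            labels[name] = "emerging or declining"
    return labels


for theme, label in classify_themes(THEMES).items():
    print(f"{theme}: {label}")
```

Running the same classification on each time slice separately is one way to see a theme move between quadrants across periods.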

Finally, citation-based knowledge synthesis is presented as a way to trace research pathways. A citation network links documents and authors through directed “cites” relationships (past vs present), producing one-way paths that show how earlier work feeds into later publications. The transcript also notes that users can switch between different data views—authors, sources, and affiliations—so the same underlying dataset can be explored from multiple angles.
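The one-way pathways described here follow from treating citations as directed edges. Below is a minimal Python sketch, assuming each document lists the earlier documents it cites; the document IDs, years, and function names are hypothetical. Because a document can only cite work that already exists, the graph has no cycles and the recursive walk terminates.

```python
# Hypothetical documents: id -> (year, ids of the documents it cites).
DOCS = {
    "A2015": (2015, []),
    "B2018": (2018, ["A2015"]),
    "C2020": (2020, ["A2015", "B2018"]),
    "D2022": (2022, ["C2020"]),
}


def cited_by_index(docs):
    """Invert the 'cites' edges: map each document to the later ones citing it."""
    index = {doc_id: [] for doc_id in docs}
    for citing, (_, references) in docs.items():
        for cited in references:
            if cited in index:
                index[cited].append(citing)
    return index


def pathways(docs, start):
    """List every forward path from `start` to a document nobody cites yet."""
    index = cited_by_index(docs)
    paths = []

    def walk(node, trail):
        followers = sorted(index[node])
        if not followers:
            paths.append(trail)
            return
        for follower in followers:
            walk(follower, trail + [follower])

    walk(start, [start])
    return paths


for path in pathways(DOCS, "A2015"):
    print(" -> ".join(path))
```

Each printed line is one research pathway, read from the earliest document to the most recent one that builds on it.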

Overall, the core message is that Bibliometrix/Biblioshiny provides a structured pipeline: import → process → filter → visualize (descriptive and relational) → map intellectual structure (clusters, themes, networks). That pipeline is positioned as a foundation for broader research workflows, including related methods such as systematic literature reviews (SLR), meta-analysis, and diplomatic analysis, where bibliometric mapping helps identify key authors, institutions, journals, and evolving topics.

Cornell Notes

Bibliometrix (via Biblioshiny in RStudio) converts bibliographic exports (e.g., Scopus data) into processed, filterable tables and visualizations. After loading raw data and running “process,” users get overview metrics like annual scientific production and average citations per year, plus author and affiliation summaries. The tool then links entities through visuals such as a three-field plot (authors–keywords–journals), and provides source/journal rankings and author production over time. For deeper insight, it supports knowledge synthesis: clustering and MCA-based factor maps, dendrogram-based partitioning, thematic maps (emerging/declining/motor/transversal), and citation networks that trace directed research pathways. This matters because it helps researchers quickly identify who publishes what, where, and how the intellectual structure evolves over time.

What is the basic Biblioshiny workflow described in the transcript, from installation to first results?

The workflow starts by installing Bibliometrix and running Biblioshiny from RStudio. In RStudio, the user launches a browser-based interface, then loads a “raw data” file (the transcript references Scopus exports). After selecting the file, the user clicks process; Biblioshiny reads the dataset, reports document counts (e.g., 188 documents in the example), and outputs results in tabular form. From there, users can use filters and an Overview panel to inspect headline metrics before moving into deeper visualizations.

Which outputs help users understand publication trends and key contributors quickly?

The Overview and related panels provide annual scientific production and average citations per year, along with author information. The transcript also highlights author production over time, where a line traces each author's activity across the time span and bubble size is proportional to the number of documents published in a given year. For contributors, the system surfaces top authors and top affiliations/institutions, including the ability to disambiguate or isolate a specific institution’s documents.
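The data behind an author-production-over-time chart is a count of documents per author per year. Here is a small Python sketch of that tally, using hypothetical records and a hypothetical function name; each (year, count) pair would become one bubble on an author's line.

```python
from collections import Counter

# Hypothetical records: (authors, publication year).
RECORDS = [
    (["Lee A.", "Kim B."], 2019),
    (["Kim B."], 2020),
    (["Lee A."], 2020),
    (["Lee A."], 2020),
]


def production_over_time(records):
    """Return, per author, a chronological list of (year, document count)."""
    counts = Counter()
    for authors, year in records:
        for author in authors:
            counts[(author, year)] += 1
    timeline = {}
    for (author, year), n in sorted(counts.items()):
        timeline.setdefault(author, []).append((year, n))
    return timeline


for author, points in production_over_time(RECORDS).items():
    print(author, points)
```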

How does the “three-field plot” help connect authors, topics, and journals?

The three-field plot links three entities in one view: top authors, top keywords, and top journals. The transcript describes it as establishing a relationship between the most relevant author(s) and the keywords and journals they are associated with. It also notes that the plot is tied to the processed dataset’s elements (authors, keywords, and sources), making it easier to see which topics appear in which journals and who is driving them.

What role do clustering and MCA-style factor maps play in knowledge synthesis?

For intellectual structure, the transcript describes network-factorial analysis and clustering approaches. It references MCA and shows factor maps where colors represent clusters of related terms (similar words grouped together). Dendrograms can be plotted as well; the height/distance indicates how distinct clusters are, helping users choose where to cut the dendrogram to define partitions. The result is a structured grouping of concepts that supports interpretation of the field’s thematic organization.
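How the cut height determines the partitions can be shown with a toy example. The Python sketch below uses single-linkage clustering purely for illustration (an assumption; the linkage method used by the tool is not stated in the transcript). The term coordinates are hypothetical stand-ins for positions on a factor map.

```python
import math
from itertools import combinations

# Hypothetical 2-D coordinates for terms, e.g. positions on a factor map.
TERMS = {
    "citation": (0.0, 0.0),
    "impact": (0.1, 0.0),
    "network": (1.0, 1.0),
    "cluster": (1.1, 1.0),
    "policy": (3.0, 0.0),
}


def single_linkage_cut(points, height):
    """Partition terms as if a single-linkage dendrogram were cut at `height`.

    Two clusters are merged whenever their closest members are at most
    `height` apart; merging repeats until no such pair remains.
    """
    clusters = [{name} for name in points]
    merged = True
    while merged:
        merged = False
        for a, b in combinations(range(len(clusters)), 2):
            gap = min(
                math.dist(points[p], points[q])
                for p in clusters[a]
                for q in clusters[b]
            )
            if gap <= height:
                clusters[a] |= clusters[b]
                del clusters[b]
                merged = True
                break
    return sorted(sorted(cluster) for cluster in clusters)


print(single_linkage_cut(TERMS, 0.5))  # low cut: more, tighter groups
print(single_linkage_cut(TERMS, 2.0))  # high cut: fewer, broader groups
```

A lower cut keeps only closely related terms together, while a higher cut merges groups that are further apart, which is the trade-off a user weighs when choosing where to partition the tree.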

How are thematic maps interpreted in the transcript?

Thematic mapping classifies themes based on development and centrality. The transcript mentions categories such as emerging themes, declining themes, transversal themes, motor themes, and highly developed/isolated themes. It also describes axes: development degree on one axis and centrality on the other, so users can infer which topics are gaining momentum and which are becoming less connected to the broader field. Time-sliced thematic views can show how themes shift across periods.

What does the citation network add beyond descriptive bibliometrics?

Citation networks trace directed relationships between documents and authors. The transcript explains that each node represents a document, and edges represent citation direction (e.g., A cites B vs B cites A). It also distinguishes past vs present work by showing pathways where earlier documents feed into later ones. This supports understanding of research trajectories—how a topic’s core ideas propagate through time and which works form the backbone of subsequent studies.

Review Questions

When moving from Overview metrics to knowledge synthesis, what additional steps or visualizations become necessary (e.g., clustering, thematic mapping, citation networks)?
How would you use institution disambiguation to compare single-institution output against broader country or multi-institution patterns?
What information does a citation network provide that a three-field plot does not, and how does directionality matter?

Key Points

1. Load raw bibliographic exports (e.g., Scopus data) in Biblioshiny, then run processing to generate structured tables and visualizations.
2. Use Overview metrics like annual scientific production and average citations per year to establish baseline trends before deeper analysis.
3. Apply filters to focus on subsets of interest, then interpret results through entity-specific panels (authors, affiliations, countries, sources).
4. Use the three-field plot to connect top authors, keywords, and journals in a single relationship view.
5. Leverage clustering and MCA-style factor maps (plus dendrograms) to group related concepts and decide how to partition themes.
6. Employ thematic maps to classify topics by development and centrality, and use time slicing to track theme evolution.
7. Use citation networks to trace directed research pathways and distinguish how earlier work links to later publications.

Highlights

Biblioshiny’s “process” step converts raw Scopus-style exports into immediately usable, filterable tables and dashboards, including headline trend metrics.

The three-field plot provides a fast way to see how top authors align with dominant keywords and the journals where those topics appear.

Thematic mapping classifies research themes (emerging, declining, motor, transversal, and isolated) using development and centrality, and can be tracked across time slices.

Citation networks add a directional, pathway-based view of how past documents feed into present research, not just who publishes what.