How to download Bibliometric Data from SCOPUS
Based on My Research Guide's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
Getting bibliometric data from Scopus follows a three-step workflow: build a precise keyword query, narrow the results with Scopus filters, and export in a format that fits the analysis workflow. The payoff is a clean, research-ready dataset, often tens of thousands of records, tailored to a specific theme like “climate change” and a narrower impact area such as “flood” or “drought.”
The workflow starts with choosing the right database and access method. Scopus requires a subscription, so the process begins by opening the official Scopus site and using its search interface. From there, the query should be targeted to the fields where keyword matching matters most—specifically “article title, abstract and keywords.” That choice determines whether the search terms actually land in the scholarly content rather than in unrelated metadata.
Keyword construction is treated as the core technical skill. The query uses quoted terms so each keyword is matched as a phrase rather than as separate words, and it combines a broad theme with sub-themes using Boolean logic. In the example, “climate change” is the broad theme, and “flood” and “drought” are sub-themes. The query structure follows the pattern: “climate change” AND (“flood” OR “drought”), with each term placed inside double quotation marks. When additional sub-themes are added, the same Boolean pattern is extended: AND connects the main theme to the sub-themes, OR includes alternative sub-themes, and the Boolean operators stay in capital letters.
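The query pattern above can be sketched in a few lines of Python. This is a minimal illustration rather than anything shown in the video: the helper names `quote` and `build_query` are hypothetical, and wrapping the expression in the `TITLE-ABS-KEY` field code assumes the query will be pasted into Scopus's advanced search instead of the basic search box.

```python
from __future__ import annotations


def quote(term: str) -> str:
    """Wrap a keyword phrase in double quotation marks."""
    return '"' + term.strip().strip('"') + '"'


def build_query(main_theme: str, sub_themes: list[str],
                field: str = "TITLE-ABS-KEY") -> str:
    """Combine one main theme with alternative sub-themes.

    AND links the main theme to the group of sub-themes;
    OR joins the alternatives inside the parentheses.
    The field code default is an assumption (advanced search syntax).
    """
    if not sub_themes:
        return f"{field}({quote(main_theme)})"
    alternatives = " OR ".join(quote(t) for t in sub_themes)
    return f"{field}({quote(main_theme)} AND ({alternatives}))"


query = build_query("climate change", ["flood", "drought"])
print(query)
# TITLE-ABS-KEY("climate change" AND ("flood" OR "drought"))
```

Adding another sub-theme only extends the list passed to `build_query`, which keeps the parentheses and capitalized operators consistent as the query grows.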
After running the query, Scopus returns a large set of documents (the example cites more than 50,000 records for the climate change + flood/drought combination). The next phase is refinement using the left-side filters. These include subject area, document type, source title, publication stage, keywords, affiliation, funding sponsor, country/territory, and language. Subject area filtering can shrink the dataset dramatically: selecting “Environmental Science” reduces the count to roughly 26,000 in the example, and other subject areas can be selected depending on the research focus.
Document type and language filters are used to control quality and relevance. Selecting “original articles” reduces the results (the example notes a drop from about 50,000 to around 37,000 when limiting to original research articles). Language filtering is also applied, with English as the default preference for many bibliometric studies.
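If the filters were not applied before export, or need to be double-checked afterwards, the same document-type and language restriction can be reproduced on the exported file. The sketch below is only an illustration: the rows are invented placeholders, the function name `filter_records` is hypothetical, and the column names `Document Type` and `Language of Original Document` are assumptions about the CSV header that should be checked against the actual export.

```python
import csv
import io

# Toy stand-in for a Scopus CSV export; rows and column names are assumptions.
SAMPLE_CSV = """Title,Document Type,Language of Original Document
Example record A,Article,English
Example record B,Review,English
Example record C,Article,Spanish
Example record D,Article,English
"""


def filter_records(rows, doc_type="Article", language="English"):
    """Keep rows whose document type and language both match (case-insensitive)."""
    kept = []
    for row in rows:
        same_type = row.get("Document Type", "").strip().lower() == doc_type.lower()
        same_lang = (row.get("Language of Original Document", "").strip().lower()
                     == language.lower())
        if same_type and same_lang:
            kept.append(row)
    return kept


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
kept = filter_records(rows)
print(f"{len(rows)} records before filtering, {len(kept)} after")
# 4 records before filtering, 2 after
```

Comparing the count before and after filtering gives a quick sanity check against the numbers shown in the Scopus interface.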
Once the dataset is narrowed, exporting becomes the final gate to analysis. Export format matters because it determines how easily the records can be screened and processed in tools like Excel, R, or reference managers. The example recommends exporting in CSV for screening in spreadsheet software and also mentions BibTeX for workflows that integrate with bibliographic tools. In the example, each export is capped at 20,000 records, so a larger result set has to be downloaded in several batches (e.g., 1–20,000, 20,001–40,000, and so on) and the files then merged into one dataset. The export options also allow inclusion of citation and bibliographic fields, plus abstracts and keywords when needed for screening and thematic analysis.
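The batching and merging step can be sketched as follows. This is a hedged example, not the procedure from the video: `batch_ranges` and `merge_batches` are hypothetical helpers, the two tiny batches are invented, and using an `EID` column as the unique record key is an assumption about the export header.

```python
import csv
import io


def batch_ranges(total, batch_size=20_000):
    """Return inclusive (start, end) record ranges covering 1..total."""
    return [(start, min(start + batch_size - 1, total))
            for start in range(1, total + 1, batch_size)]


def merge_batches(csv_texts, key="EID"):
    """Concatenate exported batches, dropping rows whose key was already seen."""
    seen, merged = set(), []
    for text in csv_texts:
        for row in csv.DictReader(io.StringIO(text)):
            if row[key] not in seen:
                seen.add(row[key])
                merged.append(row)
    return merged


print(batch_ranges(50_000))
# [(1, 20000), (20001, 40000), (40001, 50000)]

# Two tiny invented batches that overlap on one record.
batch_1 = "EID,Title\nid-001,Example A\nid-002,Example B\n"
batch_2 = "EID,Title\nid-002,Example B\nid-003,Example C\n"
merged = merge_batches([batch_1, batch_2])
print(len(merged))  # 3
```

The example works on in-memory strings to stay self-contained; with real downloads, each batch would be read from its own file. De-duplicating on a record identifier guards against overlap if two batch ranges are entered with a shared boundary.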
Overall, the method is a repeatable pipeline: build precise Boolean queries with quoted terms, filter aggressively by subject area, document type, and language, and export in the right format (often CSV or BibTeX) in manageable batches for downstream bibliometric, co-citation, and network analysis.
Cornell Notes
Scopus bibliometric data collection starts with a subscription-based search, then moves to a disciplined keyword strategy and filtering. Keyword phrases should be placed in double quotation marks and combined with Boolean operators: use AND to connect a broad theme (e.g., “climate change”) to impact sub-themes, and OR to include alternatives (e.g., “flood” OR “drought”). After the initial search returns a large set (often 50,000+ records), filters on subject area, document type (such as “original articles”), and language (often English) narrow the dataset to what’s relevant. Exporting in the right format—commonly CSV for screening or BibTeX for bibliographic workflows—and downloading in batches of up to 20,000 records enables clean analysis and merging into a single file.
Why does the choice of Scopus search field (e.g., “article title, abstract and keywords”) matter for bibliometric accuracy?
How should a Boolean Scopus query be structured when combining a broad topic with multiple sub-themes?
What filters most effectively reduce a large Scopus result set to a study-ready dataset?
Why is export format a practical decision rather than a cosmetic one?
How does the 20,000-record export limit change the data collection workflow?
Review Questions
- When adding another sub-theme to an existing Scopus query, when should AND be used versus OR, and how should the terms be formatted?
- Which Scopus filters would you apply first if your initial search returns 50,000+ records, and what effect should each filter have on the count?
- What export formats are recommended for screening versus bibliographic/network workflows, and how do you handle the 20,000-record download cap?
Key Points
1. Build the Scopus query using exact keyword phrases in double quotation marks and combine terms with Boolean logic (AND for main-to-sub theme links, OR for alternative sub-themes).
2. Search within “article title, abstract and keywords” to ensure keyword matches reflect the scholarly content rather than unrelated fields.
3. Use left-side filters, especially subject area, document type (e.g., original articles), and language, to narrow tens of thousands of hits into a relevant dataset.
4. Export in a format aligned with the analysis workflow: CSV for spreadsheet screening and BibTeX for bibliographic processing and R-oriented workflows.
5. Plan for Scopus’s export limit by downloading results in batches of up to 20,000 records and merging them into one file.
6. Include the right fields during export (bibliographic/citation data, and optionally abstracts and keywords) depending on whether the study needs screening, thematic analysis, or co-citation/network work.