This repository contains the code, data processing scripts, and analysis pipelines used in the article:
A Scientometric Review of Practical Applications in Quantum Natural Language Processing (QNLP): Trends, Gaps, and Research Opportunities. Victor R. Silva et al., IEEE Access, 2025.
The goal of this repository is to ensure reproducibility, transparency, and extensibility of the scientometric study, enabling other researchers to replicate the analyses, adapt the pipeline to other domains, or extend the results with new data.
Quantum Natural Language Processing (QNLP) is an emerging interdisciplinary field at the intersection of quantum computing, machine learning, and natural language processing. Despite rapid growth, research remains fragmented.
This repository supports a large-scale scientometric and bibliometric analysis of QNLP literature, including:
- Temporal evolution of publications
- Author productivity and collaboration networks
- Core journals and sources (Bradford's Law)
- Country-level scientific production
- Keyword co-occurrence, thematic maps, and research gaps
- Identification of underexplored application domains (e.g., healthcare)
The analyses follow PRISMA guidelines and rely on data collected from Scopus and Web of Science (2014–2024).
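As a rough illustration of the keyword co-occurrence analysis listed above, the sketch below counts how often two keywords appear in the same record. It is not the code used in the study (those networks come from Bibliometrix and VOSviewer); the function name `cooccurrence_counts` and the toy records are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(keyword_lists):
    """Count how often each unordered keyword pair appears in the same record."""
    counts = Counter()
    for keywords in keyword_lists:
        # Lowercase and deduplicate within a record, then sort so that
        # (a, b) and (b, a) collapse into a single pair.
        unique = sorted({k.strip().lower() for k in keywords if k.strip()})
        counts.update(combinations(unique, 2))
    return counts


# Toy records, invented for illustration only.
records = [
    ["quantum computing", "natural language processing", "sentiment analysis"],
    ["Quantum Computing", "natural language processing"],
    ["quantum computing", "healthcare"],
]
pairs = cooccurrence_counts(records)
for pair, n in pairs.most_common(3):
    print(pair, n)
```

The resulting pair counts are the edge weights of a co-occurrence network; a tool such as VOSviewer adds clustering and layout on top of the same kind of table.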
qnlp-scientometrics/
├── LICENSE
├── README.md
├── requirements.txt
│
├── data/
│   ├── raw/                      # Raw bibliographic data (Scopus and Web of Science)
│   │   ├── scopus_base.csv
│   │   └── wos_base.xls
│   │
│   ├── processed/                # Intermediate datasets after filtering and normalization
│   │   ├── filtereds_merged.xlsx
│   │   ├── merged_exported_scopus_mapped.xlsx
│   │   ├── scopus_filtered.csv
│   │   ├── wos_filtered.csv
│   │   └── selection.csv
│   │
│   ├── ready/                    # Final dataset used in the scientometric analysis
│   │   └── final_dataset_merged.csv
│   │
│   └── controlled_vocabulary/    # Controlled vocabulary for keyword normalization
│       ├── tesauro.csv           # Thesaurus for term equivalence and normalization
│       └── words_remove.txt      # Domain-specific stoplist for keyword filtering
│
├── src/
│   └── bibliometrix_analysis.R   # Bibliometric and thematic analysis (Bibliometrix/Biblioshiny)
│
└── notebooks/
    └── tratamento_artigos.ipynb  # Data preprocessing
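The controlled vocabulary files suggest a two-step keyword cleanup: map variants to a preferred term, then drop stoplisted terms. The sketch below shows one way this could work. The file layouts are assumptions, not taken from the repository: `tesauro.csv` is treated as two columns (variant, preferred term) with no header, and `words_remove.txt` as one term per line. All function names are hypothetical, and in-memory strings stand in for the real files.

```python
import csv
import io


def load_thesaurus(handle):
    """Read (variant, preferred) rows into a lowercase lookup dict (assumed layout)."""
    return {
        row[0].strip().lower(): row[1].strip().lower()
        for row in csv.reader(handle)
        if len(row) >= 2
    }


def load_stoplist(handle):
    """Read one term per line, ignoring blank lines (assumed layout)."""
    return {line.strip().lower() for line in handle if line.strip()}


def normalize_keywords(raw, thesaurus, stoplist, sep=";"):
    """Split a keyword field, map variants to preferred terms, drop stoplisted terms."""
    result = []
    for term in raw.split(sep):
        term = term.strip().lower()
        if not term:
            continue
        term = thesaurus.get(term, term)
        # Skip stoplisted terms and terms already emitted for this record.
        if term not in stoplist and term not in result:
            result.append(term)
    return result


# In-memory stand-ins for the two vocabulary files; the contents are invented.
thesaurus = load_thesaurus(io.StringIO(
    "qnlp,quantum natural language processing\n"
    "nlp,natural language processing\n"
))
stoplist = load_stoplist(io.StringIO("article\nreview\n"))
cleaned = normalize_keywords(
    "QNLP; NLP; Article; quantum natural language processing", thesaurus, stoplist
)
print(cleaned)
```

To try it on the real files, replace the `io.StringIO` objects with `open(...)` handles, after checking that the actual column order and delimiter match these assumptions.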
The study combines Python-based data processing with specialized bibliometric tools:
- Python (pandas, numpy, matplotlib, seaborn)
- Bibliometrix / Biblioshiny (R): descriptive indicators and thematic maps
- VOSviewer: co-authorship, co-citation, and keyword networks
- PyBIBix: bibliographic data parsing
⚠️ Some visualizations (e.g., VOSviewer maps) are generated externally and imported into this repository.
- Clone the repository:
  git clone https://github.com/vic37get/scientometric_qnlp.git
  cd scientometric_qnlp
- Install dependencies:
  pip install -r requirements.txt
- Add bibliographic data: place Scopus and Web of Science exports (.bib, .csv, or .txt) into data/raw/
- Run the preprocessing and analysis notebook:
  notebooks/tratamento_artigos.ipynb
- (Optional) Run Biblioshiny from an R session with the bibliometrix package loaded:
  biblioshiny()
  Then upload the processed datasets from data/processed/.
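Combining two databases usually requires removing records indexed in both. The notebook's actual logic is not described here, so the following is only a minimal sketch of one common approach: deduplicate by DOI first, then by a normalized title. The column names `Title` and `DOI`, the function name, and the toy rows are assumptions, not the real export schema.

```python
import pandas as pd


def merge_and_deduplicate(scopus, wos):
    """Stack two record tables and drop duplicates by DOI, then by normalized title."""
    merged = pd.concat(
        [scopus.assign(source="scopus"), wos.assign(source="wos")],
        ignore_index=True,
    )
    # Case-insensitive DOI key; missing DOIs become empty strings.
    merged["doi_key"] = merged["DOI"].fillna("").astype(str).str.strip().str.lower()
    # Title key with punctuation and case differences removed.
    merged["title_key"] = (
        merged["Title"].fillna("").astype(str).str.lower()
        .str.replace(r"[^a-z0-9]+", " ", regex=True).str.strip()
    )
    has_doi = merged["doi_key"] != ""
    with_doi = merged[has_doi].drop_duplicates(subset="doi_key", keep="first")
    combined = pd.concat([with_doi, merged[~has_doi]], ignore_index=True)
    # Note: records with an empty title would also collapse into one here.
    combined = combined.drop_duplicates(subset="title_key", keep="first")
    return combined.drop(columns=["doi_key", "title_key"]).reset_index(drop=True)


# Toy rows, invented for illustration only.
scopus = pd.DataFrame({
    "Title": ["Quantum NLP for Sentiment", "Parsing with Circuits"],
    "DOI": ["10.1000/a1", None],
})
wos = pd.DataFrame({
    "Title": ["Quantum NLP for sentiment", "Parsing with circuits.", "A Third Paper"],
    "DOI": ["10.1000/A1", None, "10.1000/c3"],
})
deduplicated = merge_and_deduplicate(scopus, wos)
print(deduplicated)
```

Because `keep="first"` is used and Scopus rows are stacked first, the Scopus copy of a shared record is the one retained; swapping the concatenation order would prefer Web of Science instead.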
The pipeline produces:
- Annual publication growth curves
- Lotka's Law author productivity analysis
- Bradford's Law core journals
- Country-level SCP vs MCP analysis
- Keyword co-occurrence networks
- Thematic evolution and thematic maps
- Application-focused keyword networks (after concept pruning)
These outputs directly support the figures and tables presented in the paper.
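To make the Bradford's Law output more concrete, the sketch below ranks sources by article count and splits them into three zones that each hold roughly one third of the articles, with zone 1 as the core. This is a simplified illustration under that assumption; Bibliometrix may draw zone boundaries differently, and the function name and toy data are hypothetical.

```python
from collections import Counter


def bradford_zones(source_per_article, n_zones=3):
    """Assign each source to a Bradford zone (1 = core) by cumulative article share."""
    counts = Counter(source_per_article)
    total = sum(counts.values())
    zones = {}
    cumulative = 0
    # Walk sources from most to least productive.
    for source, n in counts.most_common():
        # The zone depends on the cumulative count before this source is added.
        zones[source] = min(n_zones, cumulative * n_zones // total + 1)
        cumulative += n
    return zones


# Toy data, invented: one entry per article, naming the source that published it.
articles = ["A"] * 6 + ["B"] * 3 + ["C"] * 2 + list("DEFGHIJ")
zones = bradford_zones(articles)
for zone in (1, 2, 3):
    members = [s for s, z in zones.items() if z == zone]
    print(f"Zone {zone}: {len(members)} sources -> {members}")
```

In this toy example, each zone holds six articles but needs progressively more sources (1, 3, then 6), which is the concentration pattern Bradford's Law describes.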
If you use this repository or build upon this work, please cite the article:
@ARTICLE{11271215,
author={Silva, Victor R. and Barbosa, Fábio R. and Silva, Jasson C. and Santos, Francisco J. and Rabelo, Ricardo A. L. and Rodrigues, Joel J. P. C.},
journal={IEEE Access},
title={A Scientometric Review of Practical Applications in Quantum Natural Language Processing (QNLP): Trends, Gaps, and Research Opportunities},
year={2025},
volume={13},
number={},
pages={210169-210184},
keywords={Natural language processing;Quantum computing;Bibliometrics;Computational modeling;Biological system modeling;Medical services;Databases;Market research;Data visualization;Training;Bibliometrics;natural language processing;quantum natural language processing;quantum computing;scientometrics},
doi={10.1109/ACCESS.2025.3638646}}
This repository is released under the Creative Commons Attribution 4.0 (CC BY 4.0) license, consistent with the IEEE Access publication.
Provided that proper attribution is given, you are free to:
- Share and adapt the material
- Use it for academic and commercial purposes
Victor R. Silva, Federal University of Piauí (UFPI), 📧 victor.silva@ufpi.edu.br
For questions, suggestions, or collaborations, feel free to open an issue or get in touch directly.