An Overview of Speaker Identification: Accuracy and Robustness Issues



Advances in Data-driven Computing and Intelligent Systems, pp 109–121

Speaker Identification Using Ensemble Learning With Deep Convolutional Features

  • Sandipan Dhar,
  • Sukonya Phukan,
  • Rajlakshmi Gogoi &
  • Nanda Dulal Jana
  • Conference paper
  • First Online: 22 June 2023


Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 653)

Speaker identification (SI) is an emerging area of research in the domain of digital speech processing. SI is the process of classifying speakers based on features extracted from their speech utterances. Following recent developments in deep learning (DL), deep convolutional neural networks (DCNNs) have been widely used for SI tasks. A CNN model consists of two main parts: deep convolutional feature extraction and classification. However, training a DCNN model is computationally expensive and time consuming. Therefore, to reduce the computational cost of training a DCNN model, an ensemble of machine learning (ML) models is proposed for the speaker identification task. The ensemble model classifies the speakers based on the deep convolutional features extracted from the input speech features. In this work, the deep convolutional features are extracted from mel-spectrograms as flattened vectors (FVs) using a pre-trained DCNN model (VGG-16). The ML models considered for the hard voting ensemble approach are random forest (RF), extreme gradient boosting (XGBoost), and support vector machine (SVM). The models are trained and tested with Voice Conversion Challenge (VCC) 2016 mono-lingual speech data, VCC 2020 multi-lingual speech data, and multi-lingual emotional speech data (ESD). Moreover, three data augmentation techniques, namely pitch scaling, amplitude scaling, and polarity inversion, are used to increase the number of speech samples. The accuracy obtained with the proposed approach is significantly higher than that of the individual ML models.
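The hard voting step mentioned in the abstract can be illustrated with a short sketch. This is not the authors' implementation: the function name `hard_vote`, the toy speaker labels, and the tie-breaking rule (the earliest model in the list wins) are all assumptions made for illustration, and the per-model predictions are taken as given rather than produced by trained RF, XGBoost, or SVM classifiers.

```python
from collections import Counter


def hard_vote(predictions_per_model):
    """Combine per-model label predictions by majority (hard) voting.

    predictions_per_model: list of equal-length label lists, one per model.
    Ties go to the label predicted by the earliest model in the list
    (an assumption; the abstract does not state a tie-breaking rule).
    """
    n_samples = len(predictions_per_model[0])
    if any(len(p) != n_samples for p in predictions_per_model):
        raise ValueError("all models must predict the same number of samples")

    voted = []
    for sample_votes in zip(*predictions_per_model):
        counts = Counter(sample_votes)
        best = max(counts.values())
        # First label, in model order, that reaches the top count wins.
        voted.append(next(v for v in sample_votes if counts[v] == best))
    return voted


# Hypothetical speaker labels predicted by three classifiers.
rf_pred = ["spk1", "spk2", "spk3", "spk1"]
xgb_pred = ["spk1", "spk3", "spk3", "spk2"]
svm_pred = ["spk2", "spk2", "spk3", "spk3"]

ensemble_pred = hard_vote([rf_pred, xgb_pred, svm_pred])
print(ensemble_pred)
```

In the last toy sample all three models disagree, so the result depends entirely on the assumed tie-breaking rule; a real system would need to choose that rule deliberately.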


Links of all the datasets are available at GitHub https://github.com/SandyPanda-MLDL/Speaker-Identification-Using-Ensemble-Learning-Approach-Considering-CNN-As-Feature-Extractor

Code implementation of the proposed work along with preprocessing part is available at GitHub https://github.com/SandyPanda-MLDL/Speaker-Identification-Using-Ensemble-Learning-Approach-Considering-CNN-As-Feature-Extractor


Author information

Authors and affiliations

National Institute of Technology Durgapur, Durgapur, 713209, India

Sandipan Dhar & Nanda Dulal Jana

Jorhat Engineering College, Jorhat, 785007, India

Sukonya Phukan & Rajlakshmi Gogoi


Corresponding author

Correspondence to Sandipan Dhar .

Editor information

Editors and affiliations

Department of Electronics and Communication Sciences, Indian Statistical Institute, Kolkata, West Bengal, India

Swagatam Das

Department of Computer Science and Information Systems, Birla Institute of Technology and Science, Goa, India

Snehanshu Saha

Department of Computer Science, CINVESTAV-IPN, Mexico, Mexico

Carlos A. Coello Coello

Department of Mathematics, South Asian University, New Delhi, Delhi, India

Jagdish Chand Bansal


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Dhar, S., Phukan, S., Gogoi, R., Jana, N.D. (2023). Speaker Identification Using Ensemble Learning With Deep Convolutional Features. In: Das, S., Saha, S., Coello Coello, C.A., Bansal, J.C. (eds) Advances in Data-driven Computing and Intelligent Systems. Lecture Notes in Networks and Systems, vol 653. Springer, Singapore. https://doi.org/10.1007/978-981-99-0981-0_9


DOI: https://doi.org/10.1007/978-981-99-0981-0_9

Published: 22 June 2023

Publisher Name: Springer, Singapore

Print ISBN: 978-981-99-0980-3

Online ISBN: 978-981-99-0981-0

eBook Packages: Engineering, Engineering (R0)



Published on 17.4.2024 in Vol 26 (2024)

This is a member publication of National University of Singapore

Comparing Open-Access Database and Traditional Intensive Care Studies Using Machine Learning: Bibliometric Analysis Study


Original Paper

  • Yuhe Ke 1*, MBBS;
  • Rui Yang 2*, MSc;
  • Nan Liu 2, PhD

1 Division of Anesthesiology and Perioperative Medicine, Singapore General Hospital, Singapore, Singapore

2 Centre for Quantitative Medicine, Duke-NUS Medical School, National University of Singapore, Singapore, Singapore

*these authors contributed equally

Corresponding Author:

Nan Liu, PhD

Centre for Quantitative Medicine

Duke-NUS Medical School

National University of Singapore

8 College Road

Singapore, 169857

Phone: 65 66016503

Email: [email protected]

Background: Intensive care research has predominantly relied on conventional methods like randomized controlled trials. However, the increasing popularity of open-access, free databases in the past decade has opened new avenues for research, offering fresh insights. Leveraging machine learning (ML) techniques enables the analysis of trends in a vast number of studies.

Objective: This study aims to conduct a comprehensive bibliometric analysis using ML to compare trends and research topics in traditional intensive care unit (ICU) studies and those done with open-access databases (OADs).

Methods: We used ML to analyze publications in the Web of Science database. Articles were categorized into “OAD” and “traditional intensive care” (TIC) studies. OAD studies were those using the Medical Information Mart for Intensive Care (MIMIC), eICU Collaborative Research Database (eICU-CRD), Amsterdam University Medical Centers Database (AmsterdamUMCdb), High Time Resolution ICU Dataset (HiRID), or Pediatric Intensive Care database. TIC studies included all other intensive care studies. Uniform manifold approximation and projection was used to visualize the corpus distribution. The BERTopic technique was used to generate 30 topic-unique identification numbers and to categorize topics into 22 topic families.

Results: A total of 227,893 records were extracted. After exclusions, 145,426 articles were identified as TIC and 1301 articles as OAD studies. TIC studies experienced exponential growth over the last 2 decades, culminating in a peak of 16,378 articles in 2021, while OAD studies demonstrated a consistent upsurge since 2018. Sepsis, ventilation-related research, and pediatric intensive care were the most frequently discussed topics. TIC studies exhibited broader coverage than OAD studies, suggesting a more extensive research scope.

Conclusions: This study analyzed ICU research, providing valuable insights from a large number of publications. OAD studies complement TIC studies, focusing on predictive modeling, while TIC studies capture essential qualitative information. Integrating both approaches in a complementary manner is the future direction for ICU research. Additionally, natural language processing techniques offer a transformative alternative for literature review and bibliometric analysis.

Introduction

The start of critical care as a medical subspecialty can be traced back to a polio epidemic during which a substantial number of patients needed prolonged mechanical ventilation [ 1 ]. Over time, the field of critical care has experienced significant growth and continual evolution. Research in this field has played a pivotal role in unraveling the complexities of numerous diseases and treatment modalities, driving substantial advancements in clinical practice over the past decades [ 2 ]. Groundbreaking studies have investigated critical areas such as sepsis, mechanical ventilation, acute lung and kidney injuries, intensive care unit (ICU) delirium, and sedation in critically ill patients [ 3 ].

These research studies have often been conducted in traditional ways such as prospective and randomized controlled trials [ 4 ], cohort and observational studies, clinical trials [ 5 ], and clinical and translational research [ 6 ]. These traditional methods have revolutionized patient care and improved outcomes significantly. For instance, the implementation of protocol-driven, goal-directed management of sepsis and appropriate fluid therapy has led to remarkable reductions in mortality rates [ 7 , 8 ], and these findings have been integral in developing evidence-based practice guidelines that are now the gold standard [ 9 , 10 ].

Despite their undeniable merits, traditional research methods in intensive care also come with several limitations [ 11 ]. Clinical trials are known for their high costs [ 12 ], stringent standardization requirements, and ethical oversight [ 13 ]. Data collection can be laborious, prone to human errors, and constrained in terms of quantity and granularity [ 14 ]. Moreover, obtaining patient consent for most randomized controlled trials in the ICU poses challenges [ 15 ], necessitating alternative consent models. These limitations have become increasingly apparent as medical complexity continues to grow exponentially [ 16 ].

The advent of electronic health records (EHRs) has heralded a new era in clinical research by facilitating the digitization of health care systems [ 17 ]. In this era of data science, a more integrated approach can be adopted, using machine learning (ML) algorithms to tackle the complexity of critical illness [ 18 ]. Open-access databases (OADs), such as the Medical Information Mart for Intensive Care (MIMIC) database [ 19 ] and the Philips eICU Collaborative Research Database (eICU-CRD) [ 20 ], have played a transformative role by enabling free data sharing.

Free and open databases play a pivotal role in promoting data sharing and advancing medical knowledge in accordance with the findable, accessible, interoperable, and reusable (FAIR) guiding principles. These principles are essential for fostering a collaborative and transparent scientific research environment [ 21 , 22 ]. By removing barriers to access, free and open databases allow researchers, regardless of their affiliations or resources, to contribute to and benefit from the collective pool of information. Accessibility fosters inclusivity and diversity in research, promoting a broader range of perspectives and approaches to medical challenges. This democratization of knowledge leads to a more equitable distribution of information. Researchers can now leverage these vast repositories of information for ML and artificial intelligence studies, marking a departure from traditional intensive care (TIC) research approaches.

Conducting a literature review [ 23 ] to investigate the disparities between traditional ICU research and studies based on open-access data sets holds significant importance as it provides a comprehensive understanding of the strengths and limitations of the latter. However, conventional methods of literature reviews and bibliometric analysis have their limitations, especially when dealing with large-scale literature due to computational complexity and the labor-intensive nature of manual interpretations [ 24 - 26 ]. To address these challenges, natural language processing (NLP) offers a promising avenue, while topic modeling techniques can be used to extract various topic themes from extensive data sets [ 27 , 28 ].

Built on the foundations of bidirectional encoder representations from transformers (BERT), BERTopic introduces a novel approach to topic modeling [ 29 , 30 ]. Unlike traditional unsupervised models such as latent Dirichlet allocation, which rely on a “bag-of-words” model [ 31 ], BERTopic overcomes the problem of semantic information loss. This significantly enhances the accuracy of the generated topics and provides more interpretable compositions for each topic, which greatly facilitates topic classification.

With the aid of BERTopic, this study aims to shed light on the disparities and commonalities between studies conducted through OADs and TIC research. By analyzing the overall trends and patterns in these 2 groups, we seek to identify knowledge gaps and explore avenues for complementary contributions between these research approaches.

Data Filtering

To conduct this literature mapping analysis, we performed an ML-based analysis of research abstracts in the Web of Science (WoS) database to automatically categorize the research papers. There was no limit on the year of publication. The search query consisted of the following keywords to identify all studies published under the umbrella of intensive care: (“ICU” OR “intensive care”). The search terms were deliberately kept broad to cover a wide spectrum of journals in the field.

The inclusion criteria were as follows: (1) written in English, (2) articles that had keywords related to intensive care, (3) articles that had the article type of “article” or “review.” We excluded articles with incomplete data fields (eg, title, abstract, publication year, and paper citation). The articles included were then further processed to identify if they were studies using OADs. These articles were labeled as “open-access database,” while the rest of the articles extracted were labeled as “traditional intensive care.”

The search for this study was performed in WoS on January 18, 2023. It generated 227,893 search results, which were subsequently filtered using Python. To check the accuracy of the results, an advanced search in PubMed was performed using the broad ICU search terms from a previous Cochrane ICU literature review [ 32 ]. The two counts agreed to within 4.9% (227,893 from the WoS keyword search vs 239,748 from the PubMed ICU keyword search).
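The reported 4.9% figure can be reproduced from the two counts. The short check below is only a worked calculation; the choice of denominator is an assumption, since the text does not say which count the discrepancy is relative to. The result matches 4.9% when the PubMed count is used as the reference.

```python
# Counts quoted in the text.
wos_count = 227_893      # WoS keyword search
pubmed_count = 239_748   # PubMed ICU keyword search

difference = pubmed_count - wos_count

# Assumption: the discrepancy is expressed relative to one of the two counts.
rel_to_pubmed = 100 * difference / pubmed_count
rel_to_wos = 100 * difference / wos_count

print(f"difference: {difference}")
print(f"relative to PubMed: {rel_to_pubmed:.1f}%")
print(f"relative to WoS: {rel_to_wos:.1f}%")
```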

Selection Criteria for OADs

A title search using keywords from all currently existing OADs was conducted to identify OAD studies. These include (1) MIMIC [ 19 ], (2) eICU-CRD [ 20 ], (3) Amsterdam University Medical Centers Database (AmsterdamUMCdb) [ 33 ], (4) High Time Resolution ICU Dataset (HiRID) [ 34 ], and (5) Pediatric Intensive Care database [ 35 ]. We did not rely on keywords alone: the search years were also restricted by the year in which each OAD was made publicly available, to reduce the inadvertent inclusion of incorrect articles due to keyword matches. For instance, the search for studies using the MIMIC database was a title keyword search with the terms (“MIMIC-IV” OR “MIMIC-III” OR “MIMIC-II” OR “MIMIC Dataset” OR “medical information mart for intensive care” OR “MIMIC IV” OR “MIMIC III” OR “MIMIC II”) in studies that were published after 2003. The title keywords and the cutoff year for each OAD are presented in Multimedia Appendix 1 .
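A title keyword filter with a year cutoff, as described for the MIMIC database, might look like the following sketch. The search terms and the 2003 cutoff come from the text; the function name `label_record`, the case-insensitive matching, the word-boundary handling, and the strict reading of "after 2003" are assumptions, and the example titles are hypothetical.

```python
import re

# Title terms quoted in the text for the MIMIC database.
MIMIC_TERMS = [
    "MIMIC-IV", "MIMIC-III", "MIMIC-II", "MIMIC Dataset",
    "medical information mart for intensive care",
    "MIMIC IV", "MIMIC III", "MIMIC II",
]
# "published after 2003" in the text; the strict inequality used below is an
# assumption about how that cutoff was applied.
MIMIC_CUTOFF_YEAR = 2003

# Word boundaries keep the terms from matching inside longer words;
# case-insensitive matching is also an assumption.
MIMIC_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(term) for term in MIMIC_TERMS) + r")\b",
    re.IGNORECASE,
)


def label_record(title, year):
    """Label one record by title keyword match plus publication-year cutoff."""
    if year > MIMIC_CUTOFF_YEAR and MIMIC_PATTERN.search(title):
        return "open-access database"
    return "traditional intensive care"


# Hypothetical titles for illustration only.
print(label_record("Sepsis mortality prediction using the MIMIC-III database", 2019))
print(label_record("Mimicking physiology in ventilated patients", 2019))
```

Combining the keyword match with the year cutoff is what guards against false positives from unrelated uses of a term before a database existed.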

Furthermore, to verify the accuracy of the supervised keyword classification, 2 critical care physicians manually reviewed 100 randomly selected articles from each category. The physicians independently labeled the extracted publications as OAD or TIC studies. An accuracy of 99% was achieved on independent review, and full agreement was reached after discussion of the discrepancies. The final labels matched the supervised keyword classification.

We performed a bibliometric analysis by directly extracting publication details from the WoS database using Python (Python Software Foundation). The analysis involved assessing the number of articles published per year, calculating total citation counts, and identifying the top journals that published intensive care-related articles. Comprehensive results are presented in Multimedia Appendix 2 .
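The three summary statistics named above (articles per year, total citations, and top journals) can be computed with simple counting. The sketch below uses a handful of hypothetical records; the field names `year`, `journal`, and `citations` are illustrative and are not the WoS export schema.

```python
from collections import Counter

# Hypothetical records for illustration only.
records = [
    {"year": 2019, "journal": "Critical Care Medicine", "citations": 12},
    {"year": 2021, "journal": "Critical Care", "citations": 3},
    {"year": 2021, "journal": "Critical Care Medicine", "citations": 7},
    {"year": 2021, "journal": "Scientific Reports", "citations": 0},
]

# Number of articles published per year.
articles_per_year = Counter(r["year"] for r in records)

# Total citation count across all records.
total_citations = sum(r["citations"] for r in records)

# Journals ranked by number of articles published.
top_journals = Counter(r["journal"] for r in records).most_common(2)

print(dict(articles_per_year))
print(total_citations)
print(top_journals[0])
```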

Data Analysis

Uniform Manifold Approximation and Projection

Uniform manifold approximation and projection (UMAP) is a manifold learning technique for dimension reduction, which can identify key structures in high-dimensional data space and map them to low-dimensional space to accomplish dimensionality reduction. Compared to other dimensionality reduction algorithms, such as principal component analysis [ 36 ], UMAP can retain more global features [ 37 ]. In this paper, we constructed a corpus consisting of abstract words from all studies. However, due to the massive size of the corpus, visualizing and analyzing the high-dimensional data to explore the differences in the vocabulary patterns between the OAD and TIC studies is a challenge. The UMAP package in Python, which implements the UMAP algorithm, was used to project the high-dimensional corpus to 4 dimensions. By cross plotting each dimension, we were able to investigate underlying differences in corpus distribution between OAD and TIC studies.

Topic modeling can help us explore the similarities and differences between research topics in OAD and TIC studies. Unlike conventional topic modeling models, BERTopic uses the BERT framework for embeddings, enabling a deeper understanding of semantic relationships [ 30 ]. The BERTopic model was implemented by the BERTopic package in Python and divided 146,727 studies into 30 topic IDs. We also performed latent Dirichlet allocation topic modeling through Python’s LdaModel package for comparison. Through the review of topic keywords by 2 critical care physicians, BERTopic exhibited superior accuracy and sophistication in topic identification, with enhanced interpretability and scientific rigor.

Consequently, the BERTopic model was used for the final analysis. Each of these topics was given a corresponding clinical research category. The overlapping categories were merged into topic families for easier comparisons. By using these advanced techniques, we were able to uncover hidden patterns and relationships within the literature and provide insights into the current state of intensive care research.

A total of 227,893 records were identified from the WoS database on January 18, 2023, of which 195,463 full records were subsequently processed. Records were excluded if they were not of type “article” or “review” or if they did not contain keywords related to intensive care. After exclusions, 145,426 articles were identified as TIC studies and 1301 articles were categorized as OAD studies ( Figure 1 ).


We examined the number of articles published per year to analyze the trends in TIC and OAD studies ( Figure 2 ). Over the past 2 decades, TIC studies have experienced exponential growth, culminating in a peak of 16,378 articles in 2021. A subsequent decline in the number of publications occurred in 2022, likely attributable to delayed indexing within the WoS database and a reduction in COVID-19–related studies as the pandemic stabilized [ 38 ]. In contrast, the first OAD study emerged in 2003, with its popularity experiencing a consistent upsurge since 2018. Nonetheless, the number of OAD publications remains markedly lower in comparison to TIC publications.


The OAD studies were published most frequently in newer open-access journals such as Frontiers in Medicine, Frontiers in Cardiovascular Medicine, and Scientific Reports, while the TIC studies were published most frequently in established journals such as Critical Care Medicine, Intensive Care Medicine, and Critical Care ( Multimedia Appendix 2 ). Further analysis of keywords from the abstracts showed that 2.4% (3492/145,426) of TIC studies were meta-analyses or systematic reviews, compared with only 0.08% (1/1301) of OAD studies. The keyword “cost” appeared in 5.61% (73/1301) of OAD studies and 7.43% (10,799/145,426) of TIC studies. Examples of the data fields that are available within OADs such as MIMIC and eICU-CRD are listed in Textbox 1 . Some information fields, such as end-of-life goals and values and health care provider psychology, are not available within the current EHRs extracted for OADs.
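The proportions quoted above follow directly from the stated counts. The check below is a worked calculation only; the helper name `pct` is an assumption, and the rounding precision is chosen to match the figures as printed in the text.

```python
TIC_TOTAL = 145_426  # articles identified as TIC studies
OAD_TOTAL = 1_301    # articles identified as OAD studies


def pct(count, total, digits):
    """Percentage of `count` in `total`, rounded to `digits` decimal places."""
    return round(100 * count / total, digits)


tic_reviews = pct(3_492, TIC_TOTAL, 1)   # meta-analyses or systematic reviews
oad_reviews = pct(1, OAD_TOTAL, 2)
oad_cost = pct(73, OAD_TOTAL, 2)         # abstracts containing "cost"
tic_cost = pct(10_799, TIC_TOTAL, 2)

print(tic_reviews, oad_reviews, oad_cost, tic_cost)

# The two groups together give the number of studies passed to topic modeling.
print(TIC_TOTAL + OAD_TOTAL)
```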

Examples of information that is available in current OADs

  • Patient information: demographics and social set-up
  • Hospital context: admission time and discharge time, intensive care unit (ICU) and hospital admissions, and pre-ICU admission
  • Diagnosis: physician-curated ICU diagnosis and data-driven phenotypes
  • Intervention: medications, procedures, and organ support
  • Diagnostics: blood test, microbiology, and scans
  • Clinical texts: clinical notes and diagnostic reports
  • Physiological monitoring: basic monitoring and waveforms

Examples of information that is not readily available in current OADs

  • Patient information: family set up and visiting, financial information, and special populations
  • Hospital context: post-ICU discharge details, delayed admission or discharge, and health personnel psychology
  • Diagnosis: pre-ICU history and diagnosis requiring clinical symptoms
  • Intervention: indications for interventions, complications, and intraoperative and postoperative
  • Diagnostics: pathology photographs, imaging, and molecular or genetic studies
  • Clinical texts: patient narratives, end-of-life goals and patient value, and health personnel behavior
  • Physiological monitoring: advanced monitoring

The UMAP algorithm was used to project the high-dimensional corpus to 4 dimensions, which allowed exploration of the vocabulary patterns in the OAD and TIC studies ( Figure 3 ). The projection values are represented by the x-axis, while the densities are represented by the y-axis. The considerable overlap between TIC studies and OAD studies suggests that they share a substantial number of common terms, which may correspond to similar research topics. Nonetheless, TIC studies exhibit more extensive coverage than OAD studies, which may stem from their broader research scope and longer research duration.


Subsequently, the BERTopic model was used to generate 30 topic IDs ( Figure 4 ). The internal commonalities of each topic ID were reviewed by critical care physicians, and each topic ID was assigned a specific subtopic in intensive care research. The model classified the topics automatically with high interpretability, and the topic components were easy to interpret. For instance, the components of topic ID 5 are, in decreasing order of weight: “learning,” “model,” “machine,” “machine learning,” “models,” “data,” “prediction,” and “performance.” This topic was consequently labeled “predictive model” (topic ID 5 in Multimedia Appendix 3 ).


The overall topic distribution in TIC studies was more uniform, while the OAD studies tended to be concentrated on several topics including topic ID 2 (kidney injury), 5 (predictive model), and 13 (sepsis). Some topics that were missing in OAD studies included 6 (pediatrics care), 21 (viral infections), 23 (health personnel and psychology), and 28 (nutrition and rehabilitation).

The similarity matrix shows that there was little overlap between the topics ( Multimedia Appendix 4 ). To facilitate the interpretability of the categories, the overlapping topic IDs were merged to form the final 22 topic families ( Multimedia Appendix 3 ).

Topics such as “healthcare associated infection,” “thoracic surgeries,” and “pregnancy related” research were among the 15 most frequently discussed topics in TIC studies but had limited publications in OAD studies. The topics of “predictive model,” “obesity,” and “fungal infections” were popular in OAD studies but not in TIC studies. Overall, topics in the TIC studies were distributed more evenly, with the topic family of sepsis accounting for a quarter of the studies, while publications in the OAD studies were heavily skewed toward the predictive model (>40%) and sepsis (>30%; Figure 5 ).


Principal Results

This study conducted a comprehensive review and bibliometric analysis of OAD and TIC studies. NLP was used to facilitate this large-scale literature review. Studies using OADs mainly concentrated on a few topics, such as predictive modeling, while TIC studies covered a wider range of topics with a more balanced distribution.

Advantages of OAD Studies

OAD studies offer several advantages that have contributed to their increasing popularity in intensive care research. The granularity of data and easy access to large-cohort databases, such as MIMIC [ 39 ], has enabled researchers to perform predictive modeling and conduct various secondary analyses efficiently [ 40 , 41 ]. This accessibility has provided valuable opportunities for exploring specific aspects of patient care, evident in studies investigating phenomena like “weekend effects” and circadian rhythms in ICU patients before discharge [ 42 - 46 ]. The vast amount of longitudinal and time series data available in OADs has also facilitated the implementation of complex ML and deep learning methods [ 47 ].

Limitations of OAD Studies

However, it is crucial to acknowledge the retrospective nature of OAD data, which inherently limits the assessment of confounding factors and the ability to draw strong causal conclusions. The observational design of OAD studies may result in lower-quality evidence according to the GRADE (Grading of Recommendations, Assessment, Development, and Evaluations) framework [ 48 , 49 ], and thus, the research from OAD studies has yet to be fully integrated into existing evidence-based guidelines, as exemplified by the omission of OAD studies in the 2021 sepsis guidelines [ 50 ]. Nevertheless, OADs remain a valuable resource for supplementing and complementing TIC studies, providing unique insights and enhanced predictive scores for intensive care settings.

Furthermore, approximately 50% of the published studies using OADs focused on predictive modeling. The increased use of ML methods in predictive modeling has not been without critique. Some medical prediction problems inherently possess linear characteristics, and the selection of features may predominantly focus on already known strong predictors, leading to limited improvements in prediction accuracy with ML [ 51 ]. Additionally, interstudy heterogeneity poses a challenge in comparing results obtained from different ML models applied to the same data sets [ 52 ]. The ethical implications of relying solely on ML models to make high-risk health care decisions instead of involving clinical expertise are also relevant considerations [ 51 , 53 ].

While OADs provide comprehensive patient data, there are certain limitations in their ability to capture specific information essential for certain critical care research areas. Notably, data fields related to qualitative aspects such as ethics and end-of-life care [ 54 , 55 ], and health care personnel psychology [ 56 ] may be challenging, if not impossible, to obtain through OADs generated from EHRs. Consequently, TIC studies have played a pivotal role in addressing these limitations by capturing critical information that is integral to understanding ethical considerations, patient experiences, and health care provider psychology in intensive care [ 57 , 58 ].

Synergy Between OAD and TIC Studies

The synergy between OAD and TIC studies is a promising approach to enhance the comprehensiveness and robustness of intensive care research. OADs, with their large cohort sizes, can serve as external validation cohorts for ML models developed from TIC studies, potentially reducing the sample sizes required for prospective research. Furthermore, OAD studies can corroborate the results of TIC studies, benefiting from larger sample sizes and real-world data, thus providing more practical insights for implementation in intensive care settings [ 43 ]. The integration of OAD and TIC studies presents an opportunity to bridge the gaps in data availability and research methodologies, ultimately enriching the understanding and practice of critical care medicine.

Potential Impact of NLP

The use of transformer-based topic modeling techniques such as BERTopic has proven valuable for large-scale literature review and topic extraction [ 58 ]. This approach has enabled accurate, reliable, and granular topic generation, offering clinicians a more effective means of interpreting data than traditional bag-of-words models [ 59 ]. The potential of NLP to analyze scientific articles and identify trends and knowledge gaps holds promise for shaping the future of research in critical care medicine. As the volume of publications in critical care continues to grow and large language modeling continues to advance in health care [ 60 ], AI technology will be crucial in efficiently identifying and predicting emerging trends.
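To make the contrast with plain bag-of-words counting concrete, the following is a minimal sketch of a class-based TF-IDF weighting of the kind BERTopic uses to describe topics. It is not the study's code. The weighting formula (term frequency in a cluster multiplied by log(1 + A / f), where A is the average number of words per cluster and f is the term's frequency across all clusters) is stated here as an assumption about the BERTopic procedure, and the toy documents, the cluster labels, and the function names `c_tf_idf` and `top_terms` are hypothetical. A real pipeline would first embed and cluster the documents; here the clusters are simply given.

```python
import math
from collections import Counter


def c_tf_idf(docs, labels):
    """Return {cluster: {term: weight}} using a class-based TF-IDF weighting."""
    # Merge all documents of a cluster into a single bag of words.
    per_class = {}
    for text, label in zip(docs, labels):
        per_class.setdefault(label, Counter()).update(text.lower().split())

    # Frequency of each term across all clusters.
    total = Counter()
    for counts in per_class.values():
        total.update(counts)

    # A = average number of words per cluster (assumed form of the weighting).
    avg_words = sum(total.values()) / len(per_class)

    return {
        label: {
            term: tf * math.log(1 + avg_words / total[term])
            for term, tf in counts.items()
        }
        for label, counts in per_class.items()
    }


def top_terms(weights, label, n=3):
    """Return the n highest-weighted terms of a cluster (ties broken alphabetically)."""
    ranked = sorted(weights[label].items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:n]]


# Hypothetical toy corpus with two pre-assigned clusters.
docs = [
    "sepsis mortality prediction model",
    "sepsis prediction with machine learning model",
    "family support palliative care",
    "ethics palliative care consultation",
]
labels = [0, 0, 1, 1]
weights = c_tf_idf(docs, labels)
```

Calling `top_terms(weights, 0)` then lists the terms that characterize the first cluster. Terms that are frequent inside one cluster but rare elsewhere receive the highest weights, which is what makes the resulting topic labels more specific than raw word counts.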

Future Directions

Future research in the field of critical care can explore novel applications of ML beyond predictive modeling. For instance, using ML to study patterns in how papers are cited, shared, and discussed on the web could help predict their potential impact on the scientific community. This analysis can aid in identifying highly influential papers and understanding the factors that contribute to their recognition. Additionally, investigations into methods for enhancing the interpretability and transparency of ML algorithms in critical care research would further facilitate the ethical and responsible use of AI technologies.

Strengths and Limitations

The study’s application of NLP-driven methods to analyze scientific articles and identify trends highlights the potential of AI technologies to streamline literature reviews and detect emerging trends more efficiently.

Another notable strength of this study is the usage of the WoS database, the world’s oldest and most extensively used repository of research publications and citations, encompassing approximately 34,000 journals [ 61 ]. The comprehensiveness of this database provides a robust representation of the literature in the field of intensive care research. Nevertheless, it is essential to acknowledge that some articles published in nonindexed journals might not have been captured, and future studies could benefit from considering additional databases to supplement our findings.

Another limitation lies in the classification of OAD and TIC studies, which may be subject to variations in the interpretation of keywords. However, we optimized the keyword combinations during the search process in the WoS database and applied Python filtering techniques, resulting in a relatively high level of accuracy in our classifications. The number of studies was further corroborated with a manual search on PubMed, and the classifications were reviewed by critical care physicians.
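As an illustration of what keyword-based filtering of this kind can look like, here is a minimal sketch. It does not reproduce the study's search terms or code. The keyword list is hypothetical (the database names are taken from the reference list), and the record fields `title` and `abstract` and the function name `classify` are assumptions made for the example.

```python
import re

# Hypothetical keyword list: these database names appear in the reference
# list, but the study's actual search terms are not reproduced here.
OAD_KEYWORDS = ["MIMIC", "eICU", "AmsterdamUMCdb", "HiRID"]

# Word-boundary, case-sensitive match, so that neither "mimics" nor the
# lowercase verb "mimic" is counted as the MIMIC database.
OAD_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in OAD_KEYWORDS) + r")\b"
)


def classify(record):
    """Label a record 'OAD' if its title or abstract names a known database, else 'TIC'."""
    text = f"{record.get('title', '')} {record.get('abstract', '')}"
    return "OAD" if OAD_PATTERN.search(text) else "TIC"


# Hypothetical records for demonstration.
records = [
    {"title": "An analysis of the MIMIC-III database"},
    {"title": "A randomized trial of a family-support intervention"},
    {"title": "Drugs that mimic insulin in critical illness"},
]
labels = [classify(r) for r in records]
```

A rule like this is only as good as its keyword list, which is why a manual check of the resulting classifications remains useful.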

Although no language restrictions were applied, the search terms were in English, which inadvertently excluded valuable contributions from non-English research. This may limit the generalizability of our findings to a broader international audience. In future investigations, the inclusion of articles in other languages could offer a more comprehensive and diverse perspective on intensive care research.

Conclusions

This study has provided valuable insights into the expanding landscape of intensive care research through a comprehensive bibliometric analysis of a large number of publications by leveraging NLP technologies. While OAD studies have demonstrated significant promise, it is essential to view them as a complementary approach rather than a replacement for TIC studies. The unique strength of TIC studies lies in their ability to capture crucial qualitative information, which is essential for comprehensive and ethical decision-making. The integration of both OAD and TIC studies offers a synergistic approach to enriching our understanding of critical care medicine and advancing patient care outcomes. As NLP technology continues to advance, it holds the potential to offer a feasible and transformative alternative for literature review and bibliometric analysis.

Acknowledgments

We thank Dr Nicholas Brian Shannon for assistance with the manual review of the supervised keyword classification. This work was supported by the Duke-NUS Signature Research Programme, funded by the Ministry of Health, Singapore.

Data Availability

The data sets generated during and/or analyzed during this study are available from the corresponding author on reasonable request. The complete set of code used in this study is readily available for download on GitHub [ 62 ].

Authors' Contributions

YK and NL played key roles in the conceptualization of the project. RY was responsible for formalizing the methodology and conducting data curation under the guidance of YK. YK contributed to the validation of the data, ensuring its relevance to the research objectives. RY took the lead in visualizing the data. Both YK and RY drafted the original manuscript. NL served as the project supervisor, overseeing the implementation and providing valuable input in the writing, review, and editing phases.

Conflicts of Interest

None declared.

Search terms for open-access database (OAD) studies with the cutoff by the years of publications.

Top 20 journals ranked by total citation in which the open-access database and traditional intensive care studies were published. The average citation per article was obtained with the total citation/total number of articles. The citation counts were obtained from Web of Science.

Topic ID and topic family and the components and weightage in each of the categories.

Similarity matrix of 30 topics.

References

  1. Kelly FE, Fong K, Hirsch N, Nolan JP. Intensive care medicine is 60 years old: the history and future of the intensive care unit. Clin Med (Lond). 2014;14(4):376-379.
  2. Cook D, Brower R, Cooper J, Brochard L, Vincent JL. Multicenter clinical research in adult critical care. Crit Care Med. 2002;30(7):1636-1643.
  3. Rosenberg AL, Tripathi RS, Blum J. The most influential articles in critical care medicine. J Crit Care. 2010;25(1):157-170.
  4. Granholm A, Alhazzani W, Derde LPG, Angus DC, Zampieri FG, Hammond NE, et al. Randomised clinical trials in critical care: past, present and future. Intensive Care Med. 2022;48(2):164-178.
  5. Markey KA, Ottridge R, Mitchell JL, Rick C, Woolley R, Ives N, et al. Assessing the efficacy and safety of an 11β-hydroxysteroid dehydrogenase type 1 inhibitor (AZD4017) in the idiopathic intracranial hypertension drug trial, IIH:DT: clinical methods and design for a phase II randomized controlled trial. JMIR Res Protoc. 2017;6(9):e181.
  6. Verdonk F, Feyaerts D, Badenes R, Bastarache JA, Bouglé A, Ely W, et al. Upcoming and urgent challenges in critical care research based on COVID-19 pandemic experience. Anaesth Crit Care Pain Med. 2022;41(5):101121.
  7. Gurnani PK, Patel GP, Crank CW, Vais D, Lateef O, Akimov S, et al. Impact of the implementation of a sepsis protocol for the management of fluid-refractory septic shock: a single-center, before-and-after study. Clin Ther. 2010;32(7):1285-1293.
  8. Wang JL, Chin CS, Chang MC, Yi CY, Shih SJ, Hsu JY, et al. Key process indicators of mortality in the implementation of protocol-driven therapy for severe sepsis. J Formos Med Assoc. 2009;108(10):778-787.
  9. Levy MM, Evans LE, Rhodes A. The surviving sepsis campaign bundle: 2018 update. Crit Care Med. 2018;46(6):997-1000.
  10. Fan E, Del Sorbo L, Goligher EC, Hodgson CL, Munshi L, Walkey AJ, et al. An official American Thoracic Society/European Society of Intensive Care Medicine/Society of Critical Care Medicine clinical practice guideline: mechanical ventilation in adult patients with acute respiratory distress syndrome. Am J Respir Crit Care Med. 2017;195(9):1253-1263.
  11. Goldfrad C, Vella K, Bion JF, Rowan KM, Black NA. Research priorities in critical care medicine in the UK. Intensive Care Med. 2000;26(10):1480-1488.
  12. Moore TJ, Heyward J, Anderson G, Alexander GC. Variation in the estimated costs of pivotal clinical benefit trials supporting the US approval of new therapeutic agents, 2015-2017: a cross-sectional study. BMJ Open. 2020;10(6):e038863.
  13. Umscheid CA, Margolis DJ, Grossman CE. Key concepts of clinical trials: a narrative review. Postgrad Med. 2011;123(5):194-204.
  14. Maré IA, Kramer B, Hazelhurst S, Nhlapho MD, Zent R, Harris PA, et al. Electronic data capture system (REDCap) for health care research and training in a resource-constrained environment: technology adoption case study. JMIR Med Inform. 2022;10(8):e33402.
  15. O'Hearn K, Gibson J, Krewulak K, Porteous R, Saigle V, Sampson M, et al. Consent models in Canadian critical care randomized controlled trials: a scoping review. Can J Anaesth. 2022;69(4):513-526.
  16. Ghassemi M, Celi LA, Stone DJ. State of the art review: the data revolution in critical care. Crit Care. 2015;19(1):118.
  17. Bates DW, Saria S, Ohno-Machado L, Shah A, Escobar G. Big data in health care: using analytics to identify and manage high-risk and high-cost patients. Health Aff (Millwood). 2014;33(7):1123-1131.
  18. Mlodzinski E, Wardi G, Viglione C, Nemati S, Crotty Alexander L, Malhotra A. Assessing barriers to implementation of machine learning and artificial intelligence-based tools in critical care: web-based survey study. JMIR Perioper Med. 2023;6:e41056.
  19. Johnson AEW, Bulgarelli L, Shen L, Gayles A, Shammout A, Horng S, et al. MIMIC-IV, a freely accessible electronic health record dataset. Sci Data. 2023;10(1):1.
  20. Pollard TJ, Johnson AEW, Raffa JD, Celi LA, Mark RG, Badawi O. The eICU collaborative research database, a freely available multi-center database for critical care research. Sci Data. 2018;5:180178.
  21. Inau ET, Sack J, Waltemath D, Zeleke AA. Initiatives, concepts, and implementation practices of FAIR (findable, accessible, interoperable, and reusable) data principles in health data stewardship practice: protocol for a scoping review. JMIR Res Protoc. 2021;10(2):e22505.
  22. Wilkinson MD, Dumontier M, Aalbersberg IJJ, Appleton G, Axton M, Baak A, et al. The FAIR guiding principles for scientific data management and stewardship. Sci Data. 2016;3:160018.
  23. Bahadoran Z, Mirmiran P, Kashfi K, Ghasemi A. Importance of systematic reviews and meta-analyses of animal studies: challenges for animal-to-human translation. J Am Assoc Lab Anim Sci. 2020;59(5):469-477.
  24. Haidich AB. Meta-analysis in medical research. Hippokratia. 2010;14(Suppl 1):29-37.
  25. Thompson DF, Walker CK. A descriptive and historical review of bibliometrics with applications to medical sciences. Pharmacotherapy. 2015;35(6):551-559.
  26. Donthu N, Kumar S, Mukherjee D, Pandey N, Lim WM. How to conduct a bibliometric analysis: an overview and guidelines. J Bus Res. 2021;133:285-296.
  27. Zhao W, Chen JJ, Perkins R, Liu Z, Ge W, Ding Y, et al. A heuristic approach to determine an appropriate number of topics in topic modeling. BMC Bioinformatics. 2015;16(Suppl 13):S8.
  28. Doanvo A, Qian X, Ramjee D, Piontkivska H, Desai A, Majumder M. Machine learning maps research needs in COVID-19 literature. Patterns (N Y). 2020;1(9):100123.
  29. Devlin J, Chang MW, Lee K, Toutanova K. BERT: pre-training of deep bidirectional transformers for language understanding. ArXiv. Preprint posted online on May 24 2019. 2018.
  30. Grootendorst M. BERTopic: neural topic modeling with a class-based TF-IDF procedure. ArXiv. Preprint posted online on March 11 2022. 2022.
  31. Blei DM, Ng AY, Jordan MI. Latent Dirichlet allocation. J Mach Learn Res. 2003;3:993-1022.
  32. Delaney A, Bagshaw SM, Ferland A, Manns B, Laupland KB, Doig CJ. A systematic evaluation of the quality of meta-analyses in the critical care literature. Crit Care. 2005;9(5):R575-R582.
  33. Thoral PJ, Peppink JM, Driessen RH, Sijbrands EJG, Kompanje EJO, Kaplan L, et al. Sharing ICU patient data responsibly under the Society of Critical Care Medicine/European Society of Intensive Care Medicine joint data science collaboration: the Amsterdam university medical centers database (AmsterdamUMCdb) example. Crit Care Med. 2021;49(6):e563-e577.
  34. Faltys M, Zimmermann M, Lyu X, Hüser M, Hyland S, Rätsch G, et al. HiRID, a high time-resolution ICU dataset. PhysioNet. 2021. URL: https://physionet.org/content/hirid/1.1.1/ [accessed 2024-04-02]
  35. Zeng X, Yu G, Lu Y, Tan L, Wu X, Shi S, et al. PIC, a paediatric-specific intensive care database. Sci Data. 2020;7(1):14.
  36. Maćkiewicz A, Ratajczak W. Principal components analysis (PCA). Comput Geosci. 1993;19(3):303-342.
  37. McInnes L, Healy J, Melville J. UMAP: uniform manifold approximation and projection for dimension reduction. ArXiv. Preprint posted online on September 18 2020. 2018.
  38. Murray CJL. COVID-19 will continue but the end of the pandemic is near. Lancet. 2022;399(10323):417-419.
  39. Mark R. The story of MIMIC. In: Secondary Analysis of Electronic Health Records. Cham, Switzerland. Springer International Publishing; 2016;43-49.
  40. Alghatani K, Ammar N, Rezgui A, Shaban-Nejad A. Predicting intensive care unit length of stay and mortality using patient vital signs: machine learning model development and validation. JMIR Med Inform. 2021;9(5):e21347.
  41. Liu D, Zheng M, Sepulveda NA. Using artificial neural network condensation to facilitate adaptation of machine learning in medical settings by reducing computational burden: model design and evaluation study. JMIR Form Res. 2021;5(12):e20767.
  42. Zhang Z, Ho KM, Hong Y. Machine learning for the prediction of volume responsiveness in patients with oliguric acute kidney injury in critical care. Crit Care. 2019;23(1):112.
  43. Meyer A, Zverinski D, Pfahringer B, Kempfert J, Kuehne T, Sündermann SH, et al. Machine learning for real-time prediction of complications in critical care: a retrospective study. Lancet Respir Med. 2018;6(12):905-914.
  44. Chen H, Zhu Z, Zhao C, Guo Y, Chen D, Wei Y, et al. Central venous pressure measurement is associated with improved outcomes in septic patients: an analysis of the MIMIC-III database. Crit Care. 2020;24(1):433.
  45. Faust L, Feldman K, Chawla NV. Examining the weekend effect across ICU performance metrics. Crit Care. 2019;23(1):207.
  46. Davidson S, Villarroel M, Harford M, Finnegan E, Jorge J, Young D, et al. Vital-sign circadian rhythms in patients prior to discharge from an ICU: a retrospective observational analysis of routinely recorded physiological data. Crit Care. 2020;24(1):181.
  47. Xie F, Yuan H, Ning Y, Ong MEH, Feng M, Hsu W, et al. Deep learning for temporal data representation in electronic health records: a systematic review of challenges and methodologies. J Biomed Inform. 2022;126:103980.
  48. Guyatt GH, Oxman AD, Kunz R, Vist GE, Falck-Ytter Y, Schünemann HJ, et al. GRADE Working Group. What is "quality of evidence" and why is it important to clinicians? BMJ. 2008;336(7651):995-998.
  49. Guyatt GH, Oxman AD, Vist GE, Kunz R, Falck-Ytter Y, Alonso-Coello P, et al. GRADE: an emerging consensus on rating quality of evidence and strength of recommendations. BMJ. 2008;336(7650):924-926.
  50. Evans L, Rhodes A, Alhazzani W, Antonelli M, Coopersmith CM, French C, et al. Surviving sepsis campaign: international guidelines for management of sepsis and septic shock 2021. Crit Care Med. 2021;49(11):e1063-e1143.
  51. Volovici V, Syn NL, Ercole A, Zhao JJ, Liu N. Steps to avoid overuse and misuse of machine learning in clinical research. Nat Med. 2022;28(10):1996-1999.
  52. Fleuren LM, Klausch TLT, Zwager CL, Schoonmade LJ, Guo T, Roggeveen LF, et al. Machine learning for the prediction of sepsis: a systematic review and meta-analysis of diagnostic test accuracy. Intensive Care Med. 2020;46(3):383-400.
  53. Yoon CH, Torrance R, Scheinerman N. Machine learning in medicine: should the pursuit of enhanced interpretability be abandoned? J Med Ethics. 2022;48(9):581-585.
  54. Gillett GR. Intensive care unit research ethics and trials on unconscious patients. Anaesth Intensive Care. 2015;43(3):309-312.
  55. Aulisio MP, Chaitin E, Arnold RM. Ethics and palliative care consultation in the intensive care unit. Crit Care Clin. 2004;20(3):505-523, x-xi.
  56. Raudenská J, Steinerová V, Javůrková A, Urits I, Kaye AD, Viswanath O, et al. Occupational burnout syndrome and post-traumatic stress among healthcare professionals during the novel coronavirus disease 2019 (COVID-19) pandemic. Best Pract Res Clin Anaesthesiol. 2020;34(3):553-560.
  57. Davidson JE, Jones C, Bienvenu OJ. Family response to critical illness: postintensive care syndrome-family. Crit Care Med. 2012;40(2):618-624.
  58. White DB, Angus DC, Shields AM, Buddadhumaruk P, Pidro C, Paner C, et al. A randomized trial of a family-support intervention in intensive care units. N Engl J Med. 2018;378(25):2365-2375.
  59. Popoff B, Occhiali É, Grangé S, Bergis A, Carpentier D, Tamion F, et al. Trends in major intensive care medicine journals: a machine learning approach. J Crit Care. 2022;72:154163.
  60. Yang R, Tan TF, Lu W, Thirunavukarasu AJ, Ting DSW, Liu N. Large language models in health care: development, applications, and challenges. Health Care Sci. 2023;2(4):255-263.
  61. Birkle C, Pendlebury DA, Schnell J, Adams J. Web of science as a data source for research on scientific and scholarly activity. Quant Sci Stud. 2020;1(1):363-376.
  62. GitHub. URL: https://github.com/YangRui525/Comparing-OAD-and-TIC-Studies

Edited by A Mavragani; submitted 19.04.23; peer-reviewed by D Chrimes, S Pesälä; comments to author 14.07.23; revised version received 01.08.23; accepted 14.01.24; published 17.04.24.

©Yuhe Ke, Rui Yang, Nan Liu. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 17.04.2024.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.

Enhancing BEV Energy Management: Neural Network-Based System Identification for Thermal Control Strategies 2024-01-3005

Modeling thermal systems in Battery Electric Vehicles (BEVs) is crucial for enhancing energy efficiency through predictive control strategies, thereby extending vehicle range. A major obstacle in this modeling is the often limited availability of detailed system information. This research introduces a methodology using neural networks for system identification, a powerful technique capable of approximating the physical behavior of thermal systems with minimal data requirements. By employing black-box models, this approach supports the creation of optimization-based operational strategies, such as Model Predictive Control (MPC) and Reinforcement Learning-based Control (RL). The system identification process is executed using MATLAB Simulink, with virtual training data produced by validated Simulink models to establish the method's feasibility. The neural networks utilized for system identification are implemented in MATLAB code. This study conducts a comparative analysis between validated white-box models and the generated black-box models, focusing on their precision and computational speed, to highlight the trade-offs and advantages inherent to each modeling approach. The findings from this study suggest that employing neural network-based black-box models can enhance the development of advanced control strategies, such as model predictive control (MPC) and reinforcement learning-based control in BEVs. As a forward-looking perspective, the research outlines a specific approach for the integration of these models into control strategy development. Furthermore, it discusses the potential for methodological enhancements and the application of the system identification process to additional thermal system components, with the overall goal of enhancing energy management in BEVs.
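The idea of black-box system identification can be illustrated with a much simpler stand-in than the neural networks and MATLAB Simulink models described in the abstract. The sketch below, written in Python, fits a first-order linear model T[k+1] = a*T[k] + b*u[k] to simulated data by least squares. The model structure, the parameter values 0.9 and 0.5, the input sequence, and the function names `simulate` and `fit_first_order` are all assumptions chosen for illustration, not details taken from the paper.

```python
def simulate(a, b, inputs, t0=0.0):
    """Simulate T[k+1] = a*T[k] + b*u[k] and return the temperature trace."""
    temps = [t0]
    for u in inputs:
        temps.append(a * temps[-1] + b * u)
    return temps


def fit_first_order(temps, inputs):
    """Least-squares estimate of (a, b) from recorded temperatures and inputs."""
    # Regressors x1 = T[k], x2 = u[k]; target y = T[k+1].
    s11 = s12 = s22 = r1 = r2 = 0.0
    for k, u in enumerate(inputs):
        x1, y = temps[k], temps[k + 1]
        s11 += x1 * x1
        s12 += x1 * u
        s22 += u * u
        r1 += x1 * y
        r2 += u * y
    # Solve the 2x2 normal equations with Cramer's rule.
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("input signal is not informative enough")
    a = (r1 * s22 - r2 * s12) / det
    b = (s11 * r2 - s12 * r1) / det
    return a, b


# "Virtual" training data from a known reference model (hypothetical values).
inputs = [1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0]
temps = simulate(0.9, 0.5, inputs)
a_hat, b_hat = fit_first_order(temps, inputs)
```

Because the training data here are noise-free and generated by a model of the same structure, the fit recovers the reference parameters up to floating-point error. A neural network plays the same role when the underlying behavior is nonlinear and no such simple structure can be assumed.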

Prefrontal Regulation of Safety Learning during Ethologically Relevant Thermal Threat

Join this interactive session as Anthony Burgos-Robles and Ada Felix-Ortiz discuss their paper, “ Prefrontal Regulation of Safety Learning during Ethologically Relevant Thermal Threat ”, with eNeuro Editor-in-Chief Christophe Bernard. Attendees can submit questions at registration and live during the webinar.

Below is the significance statement of the paper, published on January 25, 2024, in eNeuro and authored by Ada C. Felix-Ortiz, Jaelyn M. Terrell, Carolina Gonzalez, Hope D. Msengi, Miranda B. Boggan, Angelica R. Ramos, Gabrielle Magalhães, and Anthony Burgos-Robles.

This study provides new insights on the role of prefrontal cortical processing for threat and safety learning during thermal challenge. For this, a novel behavioral paradigm was implemented in which laboratory mice learned that a particular spatial zone was associated with either a noxious cold temperature (“thermal threat”) or a pleasant warm temperature (“thermal safety”). Manipulations of neuronal activity revealed that the prelimbic and infralimbic subregions of the medial prefrontal cortex bidirectionally regulated memory formation for the thermal safety zone, but not for the thermal threat zone. In addition, the influence of these cortical regions during safety memory formation was altered when mice underwent a stress treatment to produce a disease-like state.

Registration is now open for all upcoming webinars. The webinars are complimentary for SfN members and $15 for nonmembers. Activate your account to receive member access to webinars.

Who can attend these webinars? All webinars in this series are complimentary to SfN members. Join or renew for access. This webinar is $15 for nonmembers.

Will the webinars be available on-demand? Yes, all webinars will be available to watch on demand after the live broadcast.

How do I access the conference on the live day? After registering, you will receive a confirmation email with the event link and the option to download calendar reminders.

What are the technology requirements for attending?  These webinars are hosted on Zoom Webinar. Instructions for joining and participating in a webinar can be found  here .

Can I ask the presenters questions?  Yes! You can submit any questions before the webinar through the registration form. During the webinar, you can submit questions through the Q&A box.

Will a certificate of attendance be offered for this event? No, SfN does not provide certificates of attendance for webinars. 

I have other questions not answered here. Email [email protected]  with any other questions. 

Review SfN’s Code of Conduct , rules for virtual events in the Digital Learning Community Guidelines , and communications policies regarding dissemination of unpublished scientific data, listed below. SfN asks that conference attendees respect the sensitivity of information and data being presented that are not yet available to the public by following these guidelines:

  • Do not capture or publicly share details of any unpublished data presented.
  • If you are unsure whether data is unpublished, check with the presenter.
  • Respect presenters' wishes if they indicate that the information presented is not to be shared.

Webinar Refund Policy

What is the cancellation/refund policy for webinars?

  • If SfN changes fundamental details of the webinar (date, time, or speakers), nonmember registrants may request a registration refund.
  • To request a refund, please email [email protected] at least 48 hours before the event. Otherwise, refunds are not provided. All webinars are complimentary to SfN members.
  • SfN webinars can be watched on-demand if someone is unable to attend the live broadcast.

Copyright © 2019 Society for Neuroscience

Title: Speaker Identification Using Speech Recognition

Abstract: The volume of audio data is increasing daily across the globe with the growth of telephone conversations, video conferences, and voice messages. This research provides a mechanism for identifying a speaker in an audio file based on voice biometric features such as pitch, amplitude, and frequency. We propose an unsupervised learning model that can learn speech representations from a limited dataset. The LibriSpeech dataset was used in this research, and we achieved a word error rate of 1.8.
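For readers unfamiliar with the metric named in the abstract, word error rate is conventionally the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and a recognized transcript, divided by the number of reference words. The sketch below shows that standard definition. It is not the paper's evaluation code, the function name `word_error_rate` and the example sentences are hypothetical, and the abstract does not state whether its figure of 1.8 is a percentage.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance between the transcripts, divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one reference word is missing from the hypothesis.
wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
```

Because insertions are counted against the reference length, the value can exceed 1 when the hypothesis is much longer than the reference.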

Institute of Advanced Studies (IAS)

Call for Papers 'Precarity in Urban China: Surviving in Capitalist Ruins'

17 April 2024

We are inviting research papers for 15-minute presentations as part of an in-person only workshop at the Institute of Advanced Studies on 21st June, 2024. Deadline for submissions: 15th May, 2024

Keynote speakers

  • Prof Margaret Hillenbrand, University of Oxford
  • Dr Carwyn Morris, University of Leiden

The Chinese city now exists in a time and space where the economy slows, work intensifies, and Xi Jinping’s “Chinese Dream” of social mobility seems to dim. In this context, surviving and thriving in the city has become increasingly resource intensive and experiences of precarity have diversified. As Margaret Hillenbrand (2023) has recently demonstrated, states of precarity in China’s urban spaces have been largely underexplored by scholars. Yet exploring precarity in Chinese cities can help us scrutinise the “global city” (Saskia Sassen, 1991) with a local eye: international capitalism under state-managed conditions has created particular pressures and responses which call for academic investigation. 

Funded by the IAS Critical Area Studies Fund , this half-day workshop uses Anna Tsing’s (2015) The Mushroom at the End of the World as a gateway to invite participants from the humanities and social sciences to explore these local conditions, particularly in connection to the idea that meaningful lives and meaning are pieced together in the “ruins” of capitalism. The concept of capitalist ruins invokes images of what is left behind in the wake of capitalist advancement and reminds us that capitalism has boundaries and externality, domains of non-capitalist experience from which capitalism itself scavenges. Using Tsing’s work as an entry point, this workshop invites researchers to think of their work in China’s cities in connection to these notions of “salvage accumulation,” and to explore the “landscapes of unintentional design” that rapid development leaves behind, while also drawing attention to the global pull of supply chains and markets. Ranging from lived experiences of precarity and informal work, the gig economy and social media livelihoods, to urban exploration, to urban design and planning policy, through to play and rebellion in the city, this workshop aims to highlight the Chinese city as both a space of precarity and a space made out of the creative response to that precarity. 

We warmly welcome contributions from researchers at all career stages to participate in two panels of 3-4 papers and discussions. To apply, please submit the following information:

  • Presentation title and 300-word abstract
  • 100-word bio
  • Email to the organisers, Dr Alison Lamont (IOE, UCL) and Dr Annabella Massey (IOE, UCL) at [email protected] by 5pm BST on Wednesday 15th May, 2024

Please note this workshop will take place in-person at UCL, London. Refreshments will be provided during the day, and a speakers’ dinner will be provided after the event at a local restaurant.

IMAGES

  1. (PDF) Speaker Identification

  2. (PDF) A Novel Approach for Text-Independent Speaker Identification

  3. (PDF) Robust Speaker Identification Incorporating High Frequency Features

  4. (PDF) A Histogram Based Speaker Identification Technique

  5. (PDF) Neural Network based Speaker Identification using Hybrid Features

  6. (PDF) Multimodal Speaker Identification Based on Text and Speech

COMMENTS

  1. A review on speaker recognition: Technology and challenges

    This paper provides a comprehensive review of the literature on speaker recognition. It discusses the advances made in the last decade, including the challenges in this area of research. This paper also highlights the system and structure of speaker recognition as well as its feature extraction and classifiers.

  2. Speaker recognition based on deep learning: An overview

    Speaker recognition is a task of identifying persons from their voices. Recently, deep learning has dramatically revolutionized speaker recognition. However, there is lack of comprehensive reviews on the exciting progress. In this paper, we review several major subtasks of speaker recognition, including speaker verification, identification ...

  3. An investigation towards speaker identification using a single-sound

    Using Speaker Recognition (SR) technologies to identify the speaker from a given utterance by comparing voice biometrics of the given speaker is known as automatic Speaker Identification (SI). Particularly, it is the process to compare one user voice profile against many profiles and find the best or exact match. The most important aspect of using SI systems is for automating processes ...

  4. Deep Learning-Based End-to-End Speaker Identification Using ...

    Research on speaker identification started in the early 1960s. There has been a lot of work done and substantial progress made. Hidden Markov models and Gaussian mixture models (GMMs) are the most prominent stochastic techniques for speaker identification. The common template-based modeling approaches considered by researchers include vector quantization (VQ) and dynamic time warping ...

  5. Analysis and Investigation of Speaker Identification Problems ...

    The rapid momentum of deep neural networks (DNNs) in recent years has yielded state-of-the-art performance in various machine-learning tasks using speaker identification systems. Speaker identification is based on the speech signals and the features that can be extracted from them. In this article, we proposed a speaker identification system using the developed DNNs models. The system is based ...

  6. Speaker identification through artificial intelligence techniques: A

    The contributions of this paper are as follows: ... The details of different benchmark speech databases utilized by various research studies for speaker identification is presented below. VoxCeleb (Nagrani et al., 2017) is a large-scale SI database derived from YouTube videos. VoxCeleb is an English language-based speech database that includes ...

  7. A Deep Neural Network Model for Speaker Identification

    Speaker identification is a classification task which aims to identify a subject from a given time-series sequential data. Since the speech signal is a continuous one-dimensional time series, most of the current research methods are based on convolutional neural network (CNN) or recurrent neural network (RNN). Indeed, these methods perform well in many tasks, but there is no attempt to combine ...

  8. PDF arXiv:2012.00931v2 [eess.AS] 4 Apr 2021

    speaker recognition to a new level, even in wild environments [15, 16]. In this survey article, we give a comprehensive overview to the deep learning based speaker recognition methods in terms of the vital subtasks and research topics, including speaker verification, identification, diarization, and robust speaker recognition.

  9. Deep learning methods in speaker recognition: a review

    This paper summarizes the applied deep learning practices in the field of speaker recognition, both verification and identification. Speaker recognition has been a widely used field topic of ... Many research works have been carried out and little progress has been achieved in the past 5-6 years. However, as deep learning techniques do advance ...

  10. End-to-end speaker identification research based on multi ...

    Deep learning has improved the performance of speaker identification systems in recent years, but it has also presented significant challenges. Typically, data-driven modeling approaches based on DNNs rely on large-scale training data, but due to environmental constraints, large amounts of user speech data are not obtainable. As a result, this work proposes a new SincGAN speaker identification ...

  11. Deep Speaker Recognition: Process, Progress, and Challenges

    Abstract: Speaker recognition is related to human biometrics dealing with the identification of speakers from their speech. Speaker recognition is an active research area and being widely investigated using artificially intelligent mechanisms. Though speaker recognition systems were previously constructed using handcrafted statistical means of machine learning, currently it is being shifted to ...

  12. A review on Deep Learning approaches in Speaker Identification

    This paper is motivated to reduce this knowledge gap and to promote the research of implementing deep learning techniques for speaker identification. In this paper, we present a review of the DL methodologies used for speaker identification and surveys important DL algorithms that can potentially be explored for future works.

  13. (PDF) Analysis of Methods and Techniques Used for Speaker

    In this paper, research-related publications of the past 25 years (from 1996 to 2020) were studied and analysed. Our main focus was on speaker identification, speaker recognition, and speaker ...

  14. Speaker Recognition Based on Deep Learning: An Overview

    Speaker recognition is a task of identifying persons from their voices. Recently, deep learning has dramatically revolutionized speaker recognition. However, there is lack of comprehensive reviews on the exciting progress. In this paper, we review several major subtasks of speaker recognition, including speaker verification, identification, diarization, and robust speaker recognition, with a ...

  15. A Survey of Speaker Recognition: Fundamental Theories, Recognition

    Specifically, this literature survey gives a concise introduction to ASR and provides an overview of the general architectures dealing with speaker recognition technologies, and upholds the past, present, and future research trends in this area. This paper briefly describes all the main aspects of ASR, such as speaker identification ...

  16. (PDF) A Survey of Speaker Recognition: Fundamental Theories

    This paper briefly describes all the main aspects of ASR, such as speaker identification, verification, diarization etc. Further, the performance of current speaker recognition systems are ...

  17. Speaker identification through artificial intelligence techniques: A

    The contributions of this paper are as follows: ... we present a survey of different pre-processing techniques that were utilized by researchers in various research areas of speaker identification. Speech signals pre-processing is very critical phase in the systems where background-noise or silence is completely undesirable. Systems like SI and ...

  18. Speaker Identification using Speech Recognition

    This paper will mainly focus on speaker identification, and in some cases, speaker verification as well. This research can later on serve as a basis for multi-speaker identification as well, where we identify multiple speakers in an audio. Abdul Basit Mughal Department of Computer Science School of Engineering and Applied Sciences (SEAS) Bahria ...

  19. An Overview of Speaker Identification: Accuracy and Robustness Issues

    This paper presents the main paradigms for speaker identification, and recent work on missing data methods to increase robustness. The feature extraction, speaker modeling and system classification are discussed. Evaluations of speaker identification performance subject to environmental noise are presented. While performance is impressive in clean speech conditions, there is rapid degradation ...

  20. Speaker Identification Using Ensemble Learning With Deep ...

    Abstract. Speaker identification (SI) is an emerging area of research in the domain of digital speech processing. SI is the process of classification of speakers based on their speech features extracted from the speech utterances. After the recent developments of deep learning (DL) models, deep convolutional neural networks (DCNNs) have been ...

  21. (PDF) Speaker Identification

    In speaker identification the aim is to match input voice sample with available voice samples. And in speaker verification, from available voice sample to determine the person who is claiming ...

  22. Journal of Medical Internet Research

    Background: Intensive care research has predominantly relied on conventional methods like randomized controlled trials. However, the increasing popularity of open-access, free databases in the past decade has opened new avenues for research, offering fresh insights. Leveraging machine learning (ML) techniques enables the analysis of trends in a vast number of studies.

  23. Enhancing BEV Energy Management: Neural Network-Based System

    As a forward-looking perspective, the research outlines a specific approach for the integration of these models into control strategy development. Furthermore, it discusses the potential for methodological enhancements and the application of the system identification process to additional thermal system components, with the overall goal of ...

  24. Prefrontal Regulation of Safety Learning during Ethologically Relevant

    Join this interactive session as Anthony Burgos-Robles and Ada Felix-Ortiz discuss their paper, "Prefrontal Regulation of Safety Learning during Ethologically Relevant Thermal Threat", with eNeuro Editor-in-Chief Christophe Bernard. Attendees can submit questions at registration and live during the webinar. Below is the significance statement of the paper published on January 25, 2024 in ...

  25. Speaker identification features extraction methods: A systematic review

    Research identification: initial search using speaker recognition as the keyword without any filters resulted in 1710 papers from IEEE Xplore, 34,045 from ScienceDirect, 6678 from ACM digital library, and 36,676 from Springer Verlag. Resources with combined results from multiple sources like Google Scholar and DBLP resulted in 1,160,000 and 764 ...

  26. Research on Financial Fraud Identification Model Based on Privacy

    J. L. Wang. 2022, Exploration of corporate financial statement fraud identification method based on random forest algorithm. Investment and Entrepreneurship, 33(24):58-60. Google Scholar; Martin S, Ronan Z, Lukas F, Machine Learning Facial Emotion Classifiers in Psychotherapy Research: A Proof-of-Concept Study. Psychopathology, 2023, 11-10.

  27. [2205.14649] Speaker Identification using Speech Recognition

    Speaker Identification using Speech Recognition. Syeda Rabia Arshad, Syed Mujtaba Haider, Abdul Basit Mughal. The audio data is increasing day by day throughout the globe with the increase of telephonic conversations, video conferences and voice messages. This research provides a mechanism for identifying a speaker in an audio file, based on ...

  28. Call for Papers 'Precarity in Urban China: Surviving in ...

    We are inviting research papers for 15-minute presentations as part of an in-person only workshop at the Institute of Advanced Studies on 21st June, 2024. Deadline for submissions: 15th May, 2024 ... Keynote speakers. Prof Margaret Hillenbrand, University of Oxford