corpus linguistics research topics

Announcements

Open call for papers.

Articles falling within one of the four categories published in RiCL are welcome throughout the year and will be evaluated according to the journal's editorial policies.

Current Issue

Issue editor: Sara Laviosa

Book Reviews

ISSN: 2243-4712

Abstracting & Indexing

Google Scholar

Index Copernicus International

Internet Archive Scholar

Linguistic Bibliography Online

MLA International Bibliography

Norwegian List

OASPA 

Publication Forum

ScienceGate

Scimago Journal Rank

Ulrich's Periodicals Directory

Asociación Española de Lingüística de Corpus /  Spanish Association for Corpus Linguistics Departamento de Filología Inglesa Facultad de Letras | Campus de La Merced Universidad de Murcia, 30003 Murcia, Spain

Research in Corpus Linguistics

Research output : Chapter in Book/Report/Conference proceeding › Chapter

Corpus linguistics is a research approach that has developed over the past few decades to support empirical investigations of language variation and use, resulting in research findings that have much greater generalizability and validity than would otherwise be feasible. Corpus linguistics is not in itself a model of language. Rather, it can be regarded as primarily a methodological approach; it is empirical, analyzing the actual patterns of use in natural texts. It utilizes a large and principled collection of natural texts, known as a corpus, as the basis for analysis. At the same time, corpus linguistics is more than a methodological approach, because these methodological innovations have enabled researchers to ask fundamentally different kinds of research questions, sometimes resulting in radically different perspectives on language variation and use from those taken in previous research. Corpus linguistic research offers strong support for the view that language variation is systematic and can be described using empirical, quantitative methods.

  • Corpus linguistics
  • Empirical methods
  • Language variation
  • Natural texts
  • Quantitative methods

ASJC Scopus subject areas

  • General Arts and Humanities
  • General Social Sciences

Access to Document

  • 10.1093/oxfordhb/9780195384253.013.0038

Other files and links

  • Link to publication in Scopus

Fingerprint

  • Corpus Linguistics Arts & Humanities 100%
  • Language Variation Arts & Humanities 80%
  • linguistics Social Sciences 52%
  • Language Use Arts & Humanities 40%
  • Quantitative Methods Arts & Humanities 25%
  • quantitative method Social Sciences 19%
  • research approach Social Sciences 18%
  • Innovation Arts & Humanities 14%

T1 - Research in Corpus Linguistics

AU - Biber, Douglas

AU - Reppen, Randi

AU - Friginal, Eric

N1 - Publisher Copyright: © 2010 by Oxford University Press, Inc. All rights reserved.

PY - 2012/9/18

Y1 - 2012/9/18

KW - Corpus linguistics

KW - Empirical methods

KW - Language variation

KW - Natural texts

KW - Quantitative methods

KW - Research

UR - http://www.scopus.com/inward/record.url?scp=84923292922&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84923292922&partnerID=8YFLogxK

U2 - 10.1093/oxfordhb/9780195384253.013.0038

DO - 10.1093/oxfordhb/9780195384253.013.0038

M3 - Chapter

AN - SCOPUS:84923292922

SN - 9780195384253

BT - The Oxford Handbook of Applied Linguistics, (2 Ed.)

PB - Oxford University Press

Corpus linguistics

Corpus linguistics research

Explore our work in corpus linguistics

In corpus linguistics, we look at the way language is used in different regions, genres and situations.

Our research is based on huge datasets of natural language, often many billions of words. We ask questions about words and language choices, such as what leads journalists to refer to men and women differently in tabloid newspapers and what that implies, and what the word 'we' tells us about how online groups come together.

Through our work, we're investigating pressing issues and helping to solve problems. For example, our work on online bullying analyses the root causes of the abuse and its impact.

We are also looking at the language that's used when making business transactions. By researching this language, we can develop teaching materials to help organisations conduct these transactions more effectively.

The outputs from our research are frequently published in leading academic journals, such as Corpora and the International Journal of Management.

Our research covers the following topics

  • Corpus-assisted discourse studies
  • Corpus pragmatics
  • Corpus stylistics
  • Lexical priming
  • Lexical selection
  • Metaphor analysis

Our members

Dr Glenn Hadikin

Senior Lecturer

School of Education, Languages and Linguistics

Faculty of Humanities and Social Sciences

PhD Supervisor

Dr Mario Saraceni

Associate Professor in English Language and Linguistics

Mr John Williams

Dr Alessia Tranchese

Media ready expert
Methods and facilities

Different methods can be combined with corpus linguistics. Once the data set (or corpus) is built, we read concordance lines, run collocation and keyword analyses, and move between quantitative and qualitative techniques. We also have site licences for Sketch Engine and Lexis Nexis, which can be used to build corpora. Staff also have experience of developing bespoke tools for tasks such as scraping online texts and converting files. We have also developed the world’s largest corpus of online discussions about citizen science, with over 10 million words.
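As a rough illustration of what reading concordance lines involves, the sketch below prints a node word with a few words of context on either side (a keyword-in-context, or KWIC, view). It is a minimal stand-in written for this page, not the output or implementation of Sketch Engine or any other tool mentioned here; the function name `kwic`, the tokenisation rule and the sample sentence are all assumptions made for the example.

```python
import re

def kwic(text, node, width=3):
    """Return (left context, node, right context) for each hit of `node`.

    Tokenisation is deliberately naive: lowercase word characters,
    with an optional internal apostrophe. Real tools do much more.
    """
    tokens = re.findall(r"\w+(?:'\w+)?", text.lower())
    lines = []
    for i, token in enumerate(tokens):
        if token == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines

# Hypothetical sample text, invented for illustration only.
sample = ("We share our data. Together we can classify galaxies, "
          "and we learn as a group.")

lines = kwic(sample, "we", width=2)
for left, node, right in lines:
    # Right-align the left context so the node words line up in a column.
    print(f"{left:>20}  {node}  {right}")
```

Aligning the node word in a column is what lets an analyst scan the contexts quickly and spot recurring patterns before moving on to collocation or keyword statistics.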

Students and staff at the University of Portsmouth are offered free access to the following resources.

  • We have produced a guide to commonly needed Sketch Engine tasks.
  • COCA (Corpus of Contemporary American English)
  • COHA (Corpus of Historical American English)
  • TIME (TIME Magazine Corpus of American English)
  • BNC (BYU interface to the British National Corpus)
  • Corpus do Português
  • Corpus del Español
  • Michigan Corpus of Academic Spoken English (MICASE) – Another very useful resource for those interested in EAP.
  • WebCorp – An interface that lets you analyse the web using corpus linguistic tools.

Free software for corpus creation, annotation and interrogation

  • AntConc – Free concordance program for Windows, Macintosh OS X, and Linux. Runs on text-only files and is quite user-friendly.
  • XAIRA – Open source software package which supports indexing and analysis of large XML textual resources. This is a more powerful tool for concordancing and collocate analysis but only runs on XML texts.
  • BootCaT – Free software for creating web corpora. Very easy to use.
  • UAM CorpusTool – A free environment for annotation (and interrogation) of text corpora. Runs under Windows and Mac OS X.

Journals

  • International Journal of Corpus Linguistics
  • Corpus Linguistics and Linguistic Theory

Online conference proceedings

  • Corpus Linguistics Conference – The archive for the six conferences; full papers are available for many of the presentations.
  • Proceedings of The International Symposium on Using Corpora in Contrastive and Translation Studies 2010. Edited by R. Xiao. Full papers are available for several of the presentations.
  • David Lee's bookmarks for corpus-based linguistics – A great resource with over 1000 links.

Project highlights

  • Language in citizen science forums
  • Online misogyny: new media, old attitudes. Exploring online misogyny and how to fight against it
  • Life Solved podcast: The language of violence with Dr Alessia Tranchese

Discover our areas of expertise

Corpus linguistics is one of our six areas of expertise within our Linguistics research area. Explore the others below.

Translation

We're exploring how texts are translated and the practices around the translation of texts, including professional training, the use of technologies, and non-professional translation communities.

Discourse analysis

We're researching how ideas, concepts and people are represented through language, and exploring how language is used in real-life contexts.

Professional communication

Our research in professional communication explores how spoken and written language is used in workplaces to develop relationships and achieve institutional objectives.

Sociolinguistics

Through our work in sociolinguistics, we're studying the ways in which language can affect, and is affected by, social phenomena.

Teaching English to speakers of other languages (TESOL)

We're focusing on the learning and teaching of English as a second or foreign language, in primary, secondary and adult learning contexts.

Interested in a PhD in Languages and Linguistics?

Browse our postgraduate research degrees – including PhDs and MPhils – at our  Languages and Linguistics  postgraduate research degrees page.

A Practical Handbook of Corpus Linguistics, pp. 647–659

Writing up a Corpus-Linguistic Paper

  • Stefan Th. Gries (ORCID: orcid.org/0000-0002-6497-3958)
  • Magali Paquot (ORCID: orcid.org/0000-0001-5687-5074)
  • First Online: 05 May 2021

In this chapter, we provide a brief characterization of what we consider the best and most common structure that empirical corpus-linguistic papers can and should have. In particular, we first introduce the four major parts of a corpus linguistics paper: “Introduction”, “Methods”, “Results”, and “Discussion”. Since the nature of corpus data and corpus techniques makes the two sections very field-specific, we then focus more particularly on the “Methods” and “Discussion” sections of a typical quantitative corpus linguistic paper. We provide recommendations that span the research cycle from data description to analyzing the dataset and reporting the results of statistical tests.

This is also a means of bringing credit and recognition to all those involved in corpus compilation.

See Gries ( in press ) for more information about how to carry out the tasks of retrieval and annotation discussed above.

American Psychological Association. (2010). Publication manual of the American Psychological Association (6th ed.). Washington, DC: American Psychological Association.

Berez-Kroeker, A., Gawne, L., Kung, S., et al. (2017). Reproducible research in linguistics: A position statement on data citation and attribution in our field. Linguistics, 56(1), 1–18.

BNC Consortium. (2001). The British National Corpus, version 2 (BNC World). Distributed by Oxford University Computing Services on behalf of the BNC Consortium. http://www.natcorp.ox.ac.uk/. Accessed 30 August 2019.

Branco, A., Cohen, K. B., Vossen, P., Ide, N., & Calzolari, N. (2017). Replicability and reproducibility of research results for human language technology: Introducing an LRE special section. Language Resources and Evaluation, 51(1), 1–5.

Cleveland, W., & McGill, R. (1985). Graphical perception and graphical methods for analyzing scientific data. Science, 229(4716), 828–833.

Fox, J. (2003). Effect displays in R for generalised linear models. Journal of Statistical Software, 8(15), 1–27.

Fox, J., & Hong, J. (2009). Effect displays in R for multinomial and proportional-odds logit models: Extensions to the effects package. Journal of Statistical Software, 32(1), 1–24.

Fuoli, M., & Hommerberg, C. (2015). Optimising transparency, reliability and replicability: Annotation principles and inter-coder agreement in the quantification of evaluation expressions. Corpora, 10(3), 315–349.

Gries, S. Th. (2013). Statistics for linguistics with R (2nd rev. & ext. ed.). Boston/New York: De Gruyter Mouton.

Gries, S. Th. (2016a). Variationist analysis: Variability due to random effects and autocorrelation. In P. Baker & J. A. Egbert (Eds.), Triangulating methodological approaches in corpus linguistic research (pp. 108–123). New York: Routledge, Taylor and Francis.

Gries, S. Th. (2016b). Quantitative corpus linguistics with R (2nd rev. & ext. ed.). New York & London: Routledge, Taylor & Francis Group.

Gries, S. Th. (in press). Managing synchronic corpus data with the British National Corpus (BNC). In A. L. Berez-Kroeker, B. McDonnell, E. Koller, & L. Collister (Eds.), MIT open handbook of linguistic data management. Cambridge, MA: The MIT Press.

Kuhn, M., & Johnson, K. (2013). Applied predictive modeling. Berlin/New York: Springer.

Loewen, S., & Plonsky, L. (2015). An A-Z of applied linguistics research methods. New York: Palgrave.

Marsden, E., Mackey, A., & Plonsky, L. (2016). The IRIS repository: Advancing research practice and methodology. In A. Mackey & E. Marsden (Eds.), Advancing methodology and practice: The IRIS repository of instruments for research into second languages (pp. 1–21). New York: Routledge.

Paquot, M., & Plonsky, L. (2017). Quantitative research methods and study quality in learner corpus research. International Journal of Learner Corpus Research, 3(1), 61–94.

Plonsky, L. (2013). Study quality in SLA: An assessment of designs, analyses, and reporting practices in quantitative L2 research. Studies in Second Language Acquisition, 35(4), 655–687.

Porte, G. (2012). Replication research in applied linguistics. Cambridge: Cambridge University Press.

Schmid, H. (1994). Probabilistic part-of-speech tagging using decision trees. Proceedings of international conference on new methods in language processing, Manchester, UK.

Spooren, W., & Degand, L. (2010). Coding coherence relations: Reliability and validity. Corpus Linguistics and Linguistic Theory, 6(2), 241–266.

Tufte, E. (2001). The visual display of quantitative information (2nd ed.). Cheshire, CT: Graphics Press.

Wilkinson, L., & The Task Force on Statistical Inference. (1999). Statistical methods in psychology journals. American Psychologist, 54(8), 594–604.

Wulff, S., Gries, S. Th., & Lester, N. A. (2018). Optional that in complementation by German and Spanish learners: Where and how German and Spanish learners differ from native speakers. In A. Tyler, L. Huan, & H. Jan (Eds.), What does applied cognitive linguistics look like? Answers from the L2 classroom and SLA studies (pp. 97–118). Berlin & Boston: De Gruyter Mouton.

Zuur, A. F., Ieno, E. N., & Elphick, C. S. (2010). A protocol for data exploration to avoid common statistical problems. Methods in Ecology and Evolution, 1(1), 3–14.

Author information

Authors and affiliations

Stefan Th. Gries: University of California, Santa Barbara, Santa Barbara, CA, USA; Justus Liebig University Giessen, Giessen, Germany

Magali Paquot: FNRS - Université catholique de Louvain, Centre for English Corpus Linguistics, Louvain-la-Neuve, Belgium

Corresponding author

Correspondence to Stefan Th. Gries .

Editor information

Editors and affiliations.

FNRS Centre for English Corpus Linguistics, Language and Communication Institute, UCLouvain, Louvain-la-Neuve, Belgium

Department of Linguistics, University of California, Santa Barbara, CA, USA

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter.

Gries, S.T., Paquot, M. (2020). Writing up a Corpus-Linguistic Paper. In: Paquot, M., Gries, S.T. (eds) A Practical Handbook of Corpus Linguistics. Springer, Cham. https://doi.org/10.1007/978-3-030-46216-1_26

DOI : https://doi.org/10.1007/978-3-030-46216-1_26

Published : 05 May 2021

Publisher Name : Springer, Cham

Print ISBN : 978-3-030-46215-4

Online ISBN : 978-3-030-46216-1

eBook Packages : Religion and Philosophy Philosophy and Religion (R0)

ORIGINAL RESEARCH article

Trends and hot topics in linguistics studies from 2011 to 2021: a bibliometric analysis of highly cited papers

Sheng Yan

  • School of Foreign Languages, Central China Normal University, Wuhan, China

High citation counts most often characterize quality research that reflects the foci of the discipline. This study aims to spotlight the most recent hot topics and emerging trends in the highly cited papers (HCPs) in the Web of Science categories of linguistics and language & linguistics through bibliometric analysis. The bibliometric information of the 143 HCPs identified through Essential Science Indicators (ESI) was retrieved and used to identify and analyze influential contributors at the levels of journals, authors, and countries. The most frequently explored topics were identified by corpus analysis and manual checking. The retrieved topics can be grouped into five general categories: multilingual-related, language teaching and learning related, psycho/pathological/cognitive linguistics-related, methods and tools-related, and others. Topics such as bi/multilingual(ism), translanguaging, language/writing development, models, emotions, foreign language enjoyment (FLE), cognition, and anxiety are among the most frequently explored. Multilingual and positive trends are discerned from the investigated HCPs. The findings inform linguistic researchers of the publication characteristics of the HCPs in the linguistics field and help them pinpoint the research trends and directions on which to focus their efforts in future studies.

1. Introduction

Citations, as a rule, exhibit a skewed distributional pattern across academic publications: a few papers accumulate an overwhelmingly large number of citations while the majority are rarely, if ever, cited. Correspondingly, the highly cited papers (HCPs) receive the greatest amount of attention in academia, as citations are commonly regarded as a strong indicator of research excellence. For academic professionals, following HCPs is an efficient way to stay current with developments in a field and to make better-informed decisions about potential research topics and directions. Academic institutions, government and private agencies, and science policy makers more generally keep a close eye on this visible indicator and use it to make more informed decisions on research funding allocation and science policy formulation. Against the backdrop of ever-growing academic output, there is a noticeable shift of attention from publication quantity to publication quality. Many countries are developing research policies to identify “excellent” universities, research groups, and researchers (Danell, 2011). In short, HCPs showcase high-quality research, encompass significant themes, and constitute a critical reference point in a research field, as they are the “gold bullion of science” (Smith, 2007).

2. Literature review

Bibliometrics, a term coined by Pritchard (1969), refers to the application of mathematical methods to the analysis of academic publications. Essentially, it is a quantitative method for depicting publication patterns within a given field based on a body of literature. There are many bibliometric studies on the natural and social sciences in general (Hsu and Ho, 2014; Zhu and Lei, 2022) and on various specific disciplines such as management sciences (Liao et al., 2018), biomass research (Chen and Ho, 2015), computer sciences (Xie and Willett, 2013), and sport sciences (Mancebo et al., 2013; Ríos et al., 2013). In these studies, researchers tracked developments, weighed research impacts, and highlighted emerging scientific fronts with bibliometric methods. In the field of linguistics, bibliometric studies have all appeared in the past few years (van Doorslaer and Gambier, 2015; Lei and Liao, 2017; Gong et al., 2018; Lei and Liu, 2018, 2019). These studies mostly examined a sub-area of linguistics, such as corpus linguistics (Liao and Lei, 2017), translation studies (van Doorslaer and Gambier, 2015), or the teaching of Chinese as a second/foreign language (Gong et al., 2018), or a single academic journal such as System (Lei and Liu, 2018) or Porta Linguarum (Sabiote and Rodríguez, 2015). Although Lei and Liu (2019) set out to investigate the entire discipline of linguistics, their research focuses exclusively on applied linguistics and is restricted to a limited number of journals (42 in total), leaving publications in other linguistics disciplines and other qualified journals unexamined.

In recent years, a number of studies have been concerned with “excellent” papers or HCPs. For example, Small (2004) surveyed the authors of HCPs on why their papers are highly cited. Strong interest in the work, its novelty, its utility, and its high importance were among the most frequently mentioned reasons. Most authors also considered that their selected HCPs were indeed based on the most important work of their academic career. Aksnes (2003) investigated the characteristics of HCPs and found that they were generally authored by a large number of scientists, often involving international collaboration. Some researchers even attempted to predict HCPs by building mathematical models, pointing to “the first mover advantage in scientific publication” (Newman, 2008, 2014). In other words, papers published earlier in a field are generally more likely to accumulate citations than those published later. Although many papers addressed HCPs from different perspectives, they held a common belief that HCPs are very different from less cited or uncited papers and thus deserve the utmost attention in academic research (Aksnes, 2003; Blessinger and Hrycaj, 2010; Yan et al., 2022).

Although an increased focus on research quality can be observed in different fields, opinions diverge on the range and the inclusion criterion of excellent papers. Are they ‘highly cited’, ‘top cited’, or ‘most frequently cited’ papers? Aksnes (2003) noted two different approaches to defining a highly cited article, involving absolute and relative thresholds, respectively. An absolute threshold stipulates a minimum number of citations for identifying excellent papers, while a relative threshold employs percentile rank classes, for example, the top 10% most highly cited papers in a discipline, a publication year, or a publication set. It is important to note that citation counts differ significantly across fields and disciplines. An HCP in the natural sciences generally accumulates more citations than its counterpart in the social sciences. Thus, it is necessary to investigate HCPs from different fields separately or to adopt different inclusion criteria to ensure a valid comparison.
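The difference between the two kinds of threshold can be made concrete with a small sketch. The code below is only an illustration of the distinction described above, not a procedure taken from Aksnes (2003) or from this study; the function names, the cut-off values, and the citation counts are all hypothetical.

```python
import math

def highly_cited_absolute(citations, min_citations):
    """Absolute threshold: keep papers with at least `min_citations` citations."""
    return {paper for paper, count in citations.items() if count >= min_citations}

def highly_cited_relative(citations, top_fraction):
    """Relative threshold: keep the top `top_fraction` of papers by citation count.

    Ties at the cut-off are broken by input order here; a real analysis
    would need an explicit tie-handling rule.
    """
    ranked = sorted(citations, key=citations.get, reverse=True)
    keep = max(1, math.ceil(len(ranked) * top_fraction))
    return set(ranked[:keep])

# Hypothetical citation counts, invented for illustration only.
citations = {"p1": 250, "p2": 40, "p3": 12, "p4": 7, "p5": 3,
             "p6": 1, "p7": 0, "p8": 0, "p9": 0, "p10": 0}

absolute = highly_cited_absolute(citations, min_citations=30)
relative = highly_cited_relative(citations, top_fraction=0.10)
print(sorted(absolute), sorted(relative))
```

With these invented numbers the two definitions select different sets, which is why the choice of inclusion criterion has to be stated explicitly when results from different fields are compared.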

The present study has been motivated by two considerations. First, the sizable number of publications of varied qualities in a scientific field makes it difficult or even impossible to conduct any reliable and effective literature research. Focusing on the quality publications, the HCPs in particular, might lend more credibility to the findings on trends. Second, HCPs can serve as a great platform to discover potentially important information for the development of a discipline and understand the past, present, and future of the scientific structure. Therefore, the present study aims to investigate the hot topics and publication trends in the Web of Science category of linguistics or language & linguistics (shortened as linguistics in later references) with bibliometric methods. The study aims to answer the following three questions:

1. Who are the most productive and impactful contributors of the HCPs in WoS category of linguistics or language & linguistics in terms of publication venues, authors, and countries?

2. What are the most frequently explored topics in HCPs?

3. What are the general research trends revealed from the HCPs?

3. Materials and methods

Unlike previous studies, which used an arbitrary inclusion threshold (e.g., Blessinger and Hrycaj, 2010; Hsu and Ho, 2014), we rely on Essential Science Indicators (ESI) to identify the HCPs. Developed by Clarivate, a leading company in bibliometrics and scientometrics, ESI reveals emerging science trends as well as influential individuals, institutions, papers, journals, and countries in any scientific field of inquiry by drawing on the complete WoS databases. ESI was chosen for three reasons. First, ESI adopts a stricter inclusion criterion for HCP identification: a paper is selected as an HCP only when its citation count places it in the top 1% of papers in its field (one of the 22 ESI subject categories) and publication year. Second, ESI is widely used and recognized for its reliability and authority in identifying top-performing work, generating “excellent” metrics including hot and highly cited papers. Third, ESI automatically updates its database to generate the most recent HCPs, which makes it especially suitable for trend studies over a specified timeframe.

3.1. Data source

The data retrieval was completed via our university library portal on June 20, 2022. The retrieval strategies are described in Table 1. Bibliometric indicators for the important contributors at the journal, author, and country levels were obtained. Specifically, after the search was completed, we clicked the “Analyze Results” bar on the results page for a detailed descriptive analysis of the retrieved bibliometric data.

Table 1. Retrieval strategies.

Several points should be noted about the search strategies. First, we searched the bibliometric data from two sub-databases of the WoS Core Collection: the Social Sciences Citation Index (SSCI) and the Arts & Humanities Citation Index (A&HCI). There is no need to include the Science Citation Index Expanded (SCI-EXPANDED) sub-database because publications in the linguistics field are almost exclusively indexed in SSCI and A&HCI journals. The WoS Core Collection was chosen as the data source because it is one of the most comprehensive and authoritative databases of bibliometric information in the world. Many previous studies have used WoS to retrieve bibliometric data. van Oorschot et al. (2018) and Ruggeri et al. (2019) even argued that WoS meets the highest standards in terms of impact factor and citation counts and hence guarantees the validity of any bibliometric analysis. Second, we did not restrict the document types, as the ESI selection of HCPs only considers articles and reviews. Third, we did not set a date range, as the ESI dataset of HCPs is updated regularly and automatically to cover the most recent 10 years of publications.

The aforementioned query obtained a total of 143 HCPs published in 48 journals contributed by 352 authors of 226 institutions. We then downloaded the raw bibliometric parameters of the 143 HCPs for follow-up analysis including publication years, authors, publication titles, countries, affiliations, abstracts, citation reports, etc. A complete list of the 143 HCPs can be found in the Supplementary Material . We collected the most recent impact factor (IF) of each journal from the 2022 Journal Citation Reports (JCR).

3.2. Data analysis

3.2.1. Citation analysis

A citation threshold is the minimum number of citations obtained by ranking papers in a research field in descending order by citation count and then selecting the top fraction or percentage of papers. In ESI, the highly cited threshold is the minimum number of citations received by the top 1% of papers in each of the 10 database years. In other words, a paper has to meet a minimum citation threshold that varies by research field and by year to enter the HCP list. Of the 22 research fields in ESI, Social Sciences, General is a broad field covering a number of WoS categories, including linguistics and language & linguistics. We checked the ESI official website to obtain the yearly highly cited thresholds in the research field of Social Sciences, General, as shown in Figure 1 (https://esi.clarivate.com/ThresholdsAction.action). As Figure 1 shows, the longer ago a paper was published, the more citations it has to receive to meet the threshold. We then divided the raw citation count of each HCP by the highly cited threshold for the corresponding year to obtain its normalized citations.

Figure 1. Highly cited thresholds in the research field of Social Sciences, General.
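The normalization described in Section 3.2.1 is a simple quotient: raw citations divided by the highly cited threshold for the paper's publication year. A minimal sketch follows. The threshold values and paper records are hypothetical placeholders, since the actual yearly thresholds appear only in Figure 1, and the function name is an assumption made for the example.

```python
def normalized_citations(raw_citations, threshold):
    """Raw citation count divided by the highly cited threshold for that year."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return raw_citations / threshold

# Hypothetical yearly thresholds; the real values are those plotted in Figure 1.
thresholds = {2012: 200, 2016: 120, 2020: 40}

# Hypothetical papers as (label, publication year, raw citations).
papers = [("paper_a", 2012, 300), ("paper_b", 2020, 60)]

scores = {
    label: normalized_citations(raw, thresholds[year])
    for label, year, raw in papers
}
print(scores)
```

In this invented case the older paper has five times the raw citations of the newer one, yet both score 1.5 after normalization, which is the point of the adjustment: it makes papers from different years comparable.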

3.2.2. Corpus analysis and manual checking

To determine the most frequently explored topics in these HCPs, we combined corpus-based analysis of word frequency with manual checking. The underlying assumption is that the more frequently a word or phrase occurs in a purpose-built corpus, the more likely it is to constitute a research topic. In this study, we built an Abstract corpus from the abstracts of all 143 HCPs, totaling 24,800 tokens. The procedure for retrieving research topics from the Abstract corpus was as follows. First, the 143 abstracts were saved as separate .txt files in one folder. Second, AntConc (Anthony, 2022), a corpus analysis tool for concordancing and text analysis, was used to extract lists of n-grams (n = 2–4) in decreasing order of frequency. We also generated a list of individual nouns, because individual nouns can sometimes constitute research topics too. Given the small size of our corpus, we adopted both a frequency criterion (3) and a range criterion (3) for topic candidacy. That is, a candidate n-gram must occur at least 3 times and in at least 3 different abstract files. The frequency threshold ensures that the candidate topics are important, while the range threshold ensures that they are not concentrated in a small number of publications. We tested the frequency and range thresholds over several rounds so as to include all potential topics. In total, we obtained 531 nouns, 1,330 2-grams, 331 3-grams, and 81 4-grams. Third, because most of the retrieved n-grams cannot function as meaningful research topics, we manually checked all the candidate items and discussed their status as potential research topics until full agreement was reached. Finally, we read all the abstracts of the 143 HCPs to further validate the retained items as research topics. In the end, we obtained 118 topic items in total.
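The frequency and range criteria can be sketched in a few lines of Python. This is a plain-Python stand-in for illustration only, not AntConc's implementation: the tokenizer, the function names, and the toy abstracts are assumptions, and the n-grams here are allowed to run across sentence boundaries, which a real tool may or may not do.

```python
import re
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, joined with spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def candidate_ngrams(abstracts, sizes=(2, 3, 4), min_freq=3, min_range=3):
    """Return (n-gram, frequency, range) for n-grams meeting both thresholds.

    frequency = total occurrences across all abstracts
    range     = number of different abstracts containing the n-gram
    """
    freq = Counter()
    doc_range = Counter()
    for text in abstracts:
        tokens = re.findall(r"[a-z]+(?:[-'][a-z]+)*", text.lower())
        seen_in_this_abstract = set()
        for n in sizes:
            grams = ngrams(tokens, n)
            freq.update(grams)
            seen_in_this_abstract.update(grams)
        doc_range.update(seen_in_this_abstract)  # each abstract counts once
    kept = [(g, freq[g], doc_range[g]) for g in freq
            if freq[g] >= min_freq and doc_range[g] >= min_range]
    return sorted(kept, key=lambda item: (-item[1], item[0]))

# Toy abstracts, invented for illustration only.
abstracts = [
    "Foreign language enjoyment and anxiety in the classroom.",
    "We model foreign language enjoyment among learners.",
    "Foreign language enjoyment predicts achievement.",
    "Anxiety in the classroom. Anxiety in the classroom. Anxiety in the classroom.",
]

result = candidate_ngrams(abstracts)
for gram, frequency, spread in result:
    print(gram, frequency, spread)
```

In the toy data, "anxiety in" occurs four times but in only two abstracts, so it passes the frequency criterion and fails the range criterion. That is the situation the range threshold is meant to catch: a phrase that is frequent only because one or two texts repeat it.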

4.1. Main publication venues of HCPs

Of the 48 journals that published the 143 HCPs, 17 contributed at least 3 HCPs each (Table 2). Together they account for around 71.33% of the examined HCPs (102/143), indicating that HCPs tend to be highly concentrated in a limited number of journals. The three largest publication outlets of HCPs are Bilingualism Language and Cognition (16), International Journal of Bilingual Education and Bilingualism (11), and Modern Language Journal (10). Because journals vary greatly in the number of papers they publish per year, and the number of HCPs is associated with journal circulation, we divided the number of HCPs by the total number of papers (TP) published in the examined years (2011–2021) to obtain the HCP percentage for each journal (HCPs/TP). The three journals with the highest HCPs/TP percentage are Annual Review of Applied Linguistics (2.26), Modern Language Journal (2.08), and Bilingualism Language and Cognition (1.74), indicating that papers published in these journals have a higher probability of entering the HCP list.

Table 2. Top 17 publication venues of HCPs.

In terms of the general impact of the HCPs from each journal, we divided the total citations (TC) of a journal's HCPs by the number of its HCPs to obtain the average citations per HCP (TC/HCP). The three journals with the highest TC/HCP are Journal of Memory and Language (837.86), Computational Linguistics (533.75), and Journal of Pragmatics (303.75). This indicates that, even within the same WoS category, HCPs in different journals differ strikingly in their capacity to accumulate citations. For example, the TC/HCP in System is as low as 31.73, less than 4% of the highest TC/HCP, that of Journal of Memory and Language.
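The two journal-level indicators used above (HCPs/TP as a percentage and TC/HCP as an average) are simple ratios. The sketch below shows the arithmetic; the helper names are assumptions, and the last journal's counts are hypothetical, since the per-journal totals appear only in Table 2.

```python
def share_pct(part, whole):
    """Percentage of `whole` accounted for by `part`, rounded to 2 decimals."""
    return round(100 * part / whole, 2)


def citations_per_hcp(total_citations, n_hcps):
    """Average citations per highly cited paper (TC/HCP)."""
    return round(total_citations / n_hcps, 2)


if __name__ == "__main__":
    # Figures reported in the text.
    print(share_pct(102, 143))       # 71.33  (HCPs in the top 17 journals)
    print(share_pct(31.73, 837.86))  # 3.79   (System vs. the highest TC/HCP)

    # Hypothetical journal: 4 HCPs out of 200 papers, 1,000 total citations.
    print(share_pct(4, 200))           # 2.0   (HCPs/TP, in percent)
    print(citations_per_hcp(1000, 4))  # 250.0 (TC/HCP)
```

The second printed value confirms the statement that the TC/HCP of System is less than 4% of the highest value.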

With regard to the latest journal impact factor (IF) in 2022, the four journals with the highest IF are Computational Linguistics (7.778), Modern Language Journal (7.5), Computer Assisted Language Learning (5.964), and Language Learning (5.24). According to the Journal Citation Reports (JCR) quartile rankings in the WoS category of linguistics, all the journals on the list belong to Q1 (the top 25%), indicating that researchers are more likely to be attracted to publish in, and to cite papers from, these prestigious high-impact journals.

4.2. Authors of HCPs

A total of 352 authors had their names listed in the 143 HCPs, of whom 33 appeared in at least 2 HCPs, as shown in Table 3. Table 3 also provides other indicators of the authors' productivity and impact, including the total number of citations (TC), the number of citations per HCP, and the number of first-author or corresponding-author HCPs (FA/CA). We include the FA/CA indicator because first authors and corresponding authors are usually considered to contribute the most and to deserve a greater proportion of the credit in academic publications (Marušić et al., 2004; Dance, 2012).

Table 3. Authors with at least 2 HCPs.

In terms of the number of HCPs, Dewaele JM from Birkbeck Univ London tops the list with 7 HCPs (TC = 492), followed by Li C from Huazhong Univ Sci & Technol (#HCPs = 5; TC = 215) and Saito K from UCL (#HCPs = 5; TC = 576). It should be noted that both Li C and Saito K have close academic collaborations with Dewaele JM. For example, 3 of the 5 HCPs by Li C are co-authored with Dewaele JM. Their co-authored HCPs mostly concern foreign language learning emotions (e.g., boredom, anxiety, and enjoyment), their measurement, and positive psychology.

With regard to TC, Li, W. from UCL stands out as the most influential scholar among the listed authors, with 956 total citations from 2 HCPs, followed by Norton B from Univ British Columbia (TC = 915) and Vasishth S from Univ Potsdam (TC = 694). Their average citations per HCP are also the highest among the listed authors (478, 305, and 347, respectively). It is important to note that Li, W.'s 2 HCPs are his groundbreaking works on translanguaging, which have become near must-reads for anyone engaged in translanguaging research (Li, 2011, 2018). Moreover, Li, W. is the sole author of both HCPs, which is extremely rare, as HCPs are usually the product of multiple researchers. Norton B's HCPs explore core issues in applied linguistics such as identity and investment, language learning, and social change, and are considered foundational work in the field (Norton and Toohey, 2011; Darvin and Norton, 2015).

From the perspective of FA/CA papers, Li C from Huazhong Univ Sci & Technol is prominent because she is the first author of all 5 of her HCPs. Her research on language learning emotions in the Chinese context is gaining widespread recognition (Li et al., 2018, 2019, 2021; Li, 2019, 2021). However, because she is a newly emerging researcher, most of her HCPs were published very recently and have therefore accumulated relatively few citations (TC = 215). Mondada L from Univ Basel follows closely as the sole author of her 3 HCPs. Her work is mostly devoted to conversation analysis, multimodality, and social interaction (Mondada, 2016, 2018, 2019).

Several points should be noted regarding the productive authors of HCPs. First, when we calculated the number of HCPs for each author, only the papers published in journals indexed in the investigated WoS categories (linguistics; language & linguistics) were taken into account, a compromise made to preserve the linguistics-oriented nature of the HCPs. For example, Brysbaert M from Ghent University had a total of 8 HCPs at the time of data retrieval, of which 6 were published in the WoS category of psychology and were more psychologically oriented, and hence were not included in our study. In addition, all the authors on an author list were treated equally when we calculated the number of HCPs, regardless of author order. This implies that some influential authors may not enter the list because they have comparatively fewer publications. Second, because some authors reported different affiliations at different career stages, we provide only their most recent affiliation for convenience. Third, it is highly competitive to have one's work selected as an HCP. The fact that the majority of HCP authors do not appear in our productive author list does not diminish their great contributions to the field. The rankings in Table 3 do not necessarily reflect the recognition authors have earned in academia at large.

4.3. Productive countries of HCPs

In total, the 143 HCPs originated from 33 countries. The most productive countries, those that contributed at least three HCPs, are listed in Table 4. The USA took an overwhelming lead with 59 HCPs, followed distantly by England with 31 HCPs. These two countries also had the highest total citations (TC = 15,770 and 9,840, respectively), demonstrating their high productivity and strong influence as traditional powerhouses in linguistics research. With regard to the average citations per HCP, Germany, England, and the USA were the top three countries (TC/HCP = 281.67, 281.14, and 267.29, respectively). Although China held the third position with 19 HCPs, its TC/HCP is the third from the bottom (TC/HCP = 66.84). One important reason is that 13 of the 19 HCPs contributed by scholars in China were published in 2020 or 2021, and newly published HCPs may need more time to accumulate citations. In addition, 18 of the 19 HCPs from China have a first author and/or corresponding author based in China, indicating that scholars in China are becoming more independent and gaining a stronger voice in English linguistics research.

Table 4. Top 18 countries with at least 3 HCPs.

Two points should be noted regarding the productive countries. First, we calculated HCP contributions at the country level rather than the region level. In other words, HCP contributions from different regions of the same country were combined in the calculation. For example, HCPs from Scotland were added to the HCPs from England, and HCPs from Hong Kong, Macau, and Taiwan were combined with the HCPs from Mainland China. In this way, a clear picture of the HCPs at the country level can be painted. Second, we manually checked the address information of the first author and corresponding author of each HCP. In some cases, the first author or the corresponding author reports affiliations in more than one country. In such cases, every country in the address list was treated equally in the FA/CA calculation. In other words, an HCP may be assigned to more than one country because the first and/or corresponding author has affiliations in different countries.
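The country-level counting rules described above can be sketched as follows. The mapping reflects the two merges named in the text; the label "China" for the merged group, the function names, and the sample records are assumptions for illustration, not the authors' actual data or code.

```python
from collections import Counter

# Region-to-country merges stated in the text. The label "China" for the
# merged group is an assumption made for this sketch.
REGION_TO_COUNTRY = {
    "Scotland": "England",
    "Hong Kong": "China",
    "Macau": "China",
    "Taiwan": "China",
    "Mainland China": "China",
}


def to_country(region):
    # Regions not listed in the mapping are kept unchanged.
    return REGION_TO_COUNTRY.get(region, region)


def count_fa_ca_by_country(papers):
    """Count FA/CA papers per country.

    `papers` is a list in which each item holds the affiliation regions of one
    paper's first and/or corresponding author. Every country on that list gets
    equal credit, but a paper counts at most once per country.
    """
    counts = Counter()
    for affiliations in papers:
        counts.update({to_country(region) for region in affiliations})
    return counts


if __name__ == "__main__":
    # Hypothetical records, invented for illustration.
    sample = [
        ["Scotland", "England"],  # one paper, both regions map to England
        ["Hong Kong", "USA"],     # author with affiliations in two countries
        ["Taiwan"],
    ]
    print(count_fa_ca_by_country(sample))
```

Using a set per paper prevents double counting when two regions of the same country appear in one address list, while still allowing one paper to be credited to several countries.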

4.4. Top 20 HCPs

The top 20 HCPs with the highest normed citations are listed in decreasing order in Table 5 . The top cited publications can guide us to better understand the development and research topics in recent years.

Table 5. Top 20 HCPs.

By reading the titles and abstracts of these top HCPs, we categorized the topics of the 20 HCPs into the following five groups: (i) statistical and analytical methods in (psycho)linguistics, such as sentiment analysis, sentence simplification techniques, effect sizes, and linear mixed models (#1, 3, 4, 6, 9, 14); (ii) language learning/teaching emotions, such as enjoyment, anxiety, boredom, and stress (#11, 15, 16, 18, 19); (iii) translanguaging or multilingualism (#5, 13, 17, 20); (iv) language perception (#2, 7, 10); and (v) medium of instruction (#8, 12). It is no surprise that 6 of the top 20 HCPs are about statistical methods in linguistics, because language researchers aspire to employ statistics to make their research more scientific. We also noticed that the papers on language teaching/learning emotions on the list were all published in 2020 or 2021, indicating that these emerging topics may deserve more attention in future research. Two of them are Covid-19-related articles (#16, 19) that explored the emotions teachers and students experienced during the pandemic, a timely response to the urgent needs of the language learning and teaching community.

It is of special interest that papers from journals indexed in multiple JCR categories seem to accumulate more citations. For example, Journal of Memory and Language, American Journal of Speech-Language Pathology, and Computational Linguistics are indexed in both SSCI and SCIE and contribute the top 4 HCPs, demonstrating the advantage these hybrid journals have over conventional language journals in amassing citations. In addition, unlike Yan et al. (2022), who found that most of the top HCPs in radiology are review articles, we found that 19 of the top 20 HCPs are research articles; the only review is Macaro et al. (2018).

4.5. Most frequently explored topics of HCPs

After obtaining the corpus-based topic items, we read all the titles and abstracts of the 143 HCPs to further validate their status as research topics. Table 6 presents the top research topics with an observed frequency of 5 or above. We grouped these topics into five broad categories: bilingual-related, language learning/teaching-related, psycho/pathological/cognitive linguistics-related, methods and tools-related, and others. The observed frequency of each topic in the Abstract corpus is given in brackets. We found that about 34 of the 143 HCPs explore bilingual-related issues, the largest share among all the categorized topics, which testifies to the academic popularity of these issues in the examined timespan. In addition, 30 of the 143 HCPs investigate language learning/teaching-related issues, with topics ranging from learners (e.g., EFL learners, individual difference) to multiple learning variables (e.g., learning strategy, motivation, agency). The findings here will be validated by the analysis of the keywords.

Table 6. Categorization of the most explored research topics.

Several points should be mentioned regarding topic candidacy. First, for similar topic expressions, we used a cover term and summed the frequency counts. For example, multilingualism is a cover term for bilinguals, bilingualism, plurilingualism, and multilingualism. Second, for nouns with singular and plural forms (e.g., emotion and emotions) or items with variant forms (e.g., meta analysis and meta analyses), we combined the frequency counts. Third, we found that some longer items (3-grams and 4-grams) could be subsumed under shorter ones (2-grams or monograms) without loss of essential meaning (e.g., working memory from working memory capacity). In such cases, the shorter items were kept because of their higher frequency. Fourth, some highly frequent terms were discarded because they were too general to be valuable topics in language research, for example, applied linguistics, language use, and second language.
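The four merging rules above can be sketched as a lookup table plus a stop list. The entries below are the examples given in the text; the frequency counts in the usage example are invented, and the authors made these decisions manually rather than with code.

```python
from collections import Counter

# Variant -> cover term, using only the examples named in the text.
COVER_TERMS = {
    "bilinguals": "multilingualism",
    "bilingualism": "multilingualism",
    "plurilingualism": "multilingualism",
    "emotions": "emotion",                       # plural folded into singular
    "meta analyses": "meta analysis",            # variant form
    "working memory capacity": "working memory", # longer n-gram subsumed
}

# Terms the text describes as too general to be useful topics.
TOO_GENERAL = {"applied linguistics", "language use", "second language"}


def merge_topics(counts):
    """Apply the cover-term and stop-list rules to a {term: frequency} mapping."""
    merged = Counter()
    for term, freq in counts.items():
        if term in TOO_GENERAL:
            continue  # discard overly general terms
        merged[COVER_TERMS.get(term, term)] += freq
    return merged


if __name__ == "__main__":
    # Hypothetical frequency counts, for illustration only.
    raw = {
        "bilingualism": 10,
        "multilingualism": 6,
        "emotion": 4,
        "emotions": 5,
        "working memory capacity": 3,
        "working memory": 7,
        "applied linguistics": 20,
    }
    print(merge_topics(raw).most_common())
    # [('multilingualism', 16), ('working memory', 10), ('emotion', 9)]
```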

5. Discussion and implications

Based on 143 highly cited papers collected from the WoS categories of linguistics, the present study attempts to present a bird's-eye view of the publication landscape and the most up-to-date research themes reflected in the HCPs in the linguistics field. Specifically, we investigated the important contributors of HCPs in terms of journals, authors, and countries. We also identified the research topics through a corpus-based analysis of the abstracts and a detailed analysis of the top HCPs. The study has produced several findings that bear important implications.

The first finding is that the HCPs are highly concentrated in a limited number of journals and countries. With regard to journals, those in the spheres of bilingualism and applied linguistics (e.g., language teaching and learning) are likely to accumulate more citations and hence to produce more HCPs. Journals that focus on bilingualism from linguistic, psycholinguistic, and neuroscientific perspectives are the most frequent outlets of HCPs, as evidenced by the two most productive journals, Bilingualism Language and Cognition and International Journal of Bilingual Education and Bilingualism. This can be explained by the multidisciplinary nature of bilingual-related research and the development of cognitive measurement techniques. The merits of analyzing the publication venues of HCPs are twofold. On the one hand, the analysis shows readers where to look for high-quality publications in the field, as most of the significant and cutting-edge achievements are concentrated in these prestigious journals. On the other hand, it offers authors guidance on where to submit their work for higher visibility.

In terms of country distribution, the traditional powerhouses in linguistics research, such as the USA and England, are undoubtedly leading in both the number and the citations of HCPs. However, developing countries such as China and Iran are also becoming increasingly prominent, which may be traceable to the funding and support provided under national language and development policies, as reported in recent studies (Ping et al., 2009; Lei and Liu, 2019). Take China as an example. Along with its economic development, China has given more impetus to academic output through increased investment in scientific research (Lei and Liao, 2017). Researchers in China are therefore highly motivated to publish in high-quality journals, both to win recognition in international academia and to cope with the publish-or-perish pressure (Lee, 2014). These factors may explain the rise of China as an emerging research powerhouse in both the natural and social sciences, including English linguistics research.

The second finding is the multilingual trend in linguistics research. The dominant clustering of topics around multilingualism can be understood as a timely response to the surge of interest in multilingual research (May, 2014). Thirty-four of the 143 HCPs have words such as bilingualism, bilingual, multilingualism, and translanguaging in their titles, reflecting a strong multilingual tendency in the HCPs. Multilingual-related HCPs mainly involve three aspects: multilingualism from the perspectives of psycholinguistics and cognition (e.g., Luk et al., 2011; Leivada et al., 2020); multilingual teaching (e.g., Schissel et al., 2018; Ortega, 2019; Archila et al., 2021); and language policies related to multilingualism (e.g., Shen and Gao, 2018). Translanguaging, a term initially used to describe a pedagogical process in bilingual classroom practice and a frequently explored topic in the HCPs, has developed into an applied linguistics theory since Li's Translanguaging as a Practical Theory of Language (Li, 2018). The most common collocates of translanguaging in the Abstract corpus are pedagogy/pedagogies, practices, and space/spaces. There are two main reasons for this multilingual turn. First, the rapid development of globalization, immigration, and overseas study programs has greatly stimulated the use of, and research on, multiple languages in different linguistic contexts. Second, in many non-English-speaking countries, courses are delivered in languages (mostly English) other than the students' mother tongue (Clark, 2017). Students are required to use multiple languages as resources to learn and understand subjects and ideas. The burgeoning body of literature on English Medium Instruction in higher education is in line with the rising interest in multilingualism. Given its inherently multidisciplinary nature, multilingualism, the topic du jour, can be expected to attract even more attention in the future.
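The collocates of translanguaging mentioned above could be retrieved with a simple window-based count, sketched below. This is an assumption-laden illustration rather than the procedure used in the study: the text does not state the collocation settings, AntConc's collocate tool ranks candidates with association statistics rather than raw counts, and the function name and toy sentences are invented.

```python
import re
from collections import Counter


def collocates(texts, node, window=1):
    """Count words occurring within `window` tokens to the left or right of `node`."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z]+", text.lower())
        for i, token in enumerate(tokens):
            if token == node:
                left = tokens[max(0, i - window):i]
                right = tokens[i + 1:i + 1 + window]
                counts.update(left + right)
    return counts


if __name__ == "__main__":
    # Toy sentences, invented for illustration.
    toy = [
        "Translanguaging pedagogy in class",
        "a translanguaging space",
        "translanguaging practices and translanguaging pedagogy",
    ]
    print(collocates(toy, "translanguaging").most_common(1))  # [('pedagogy', 2)]
```

Singular and plural collocates (e.g., pedagogy and pedagogies) would still need to be combined afterwards, as the text does when it reports pedagogy/pedagogies and space/spaces together.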

The third finding is the application of Positive Psychology (PP) in second language acquisition (SLA), that is, the positive trend in linguistic research. In our analysis, 20 of the 143 HCPs have words or phrases such as emotions, enjoyment, boredom, anxiety, and positive psychology in their titles, which might signal a shift of interest toward the psychology of language learners and teachers in different linguistic environments. Our study shows that foreign language enjoyment (FLE) is the most frequently explored emotion, followed by foreign language classroom anxiety (FLCA), the learners' metaphorical left and right feet on their journey to acquiring the foreign language (Dewaele and MacIntyre, 2016). In fact, the topics of PP are not entirely new to SLA. For example, studies of language motivation, affect, and good language learners all provide roots for the emergence of PP in SLA (Naiman, 1978; Gardner, 2010). In recent years, both research on and teaching applications of PP in SLA have been growing rapidly, with a diversity of topics already being explored, such as positive education and PP interventions. It should be noted that SLA also feeds back into PP theories and concepts, in addition to drawing inspiration from them, which makes it “an area rich for interdisciplinary cross-fertilization of ideas” (Macintyre et al., 2019).

It should be noted that subjectivity is involved in selecting and categorizing the candidate topic items from the Abstract corpus. However, the frequency and range criteria ensure that these items are indeed explored in multiple HCPs, which indicates their value as topics for further investigation. Some highly frequent n-grams were discarded because they are too general or are not meaningful topics. For example, applied linguistics is too broad to be included, as most of the HCPs concern issues in this line of research rather than in theoretical linguistics. By meaningful topics, we mean topics that can help journal editors and readers quickly locate their fields of interest (Lei and Liu, 2019), as author keywords such as bilingualism, emotions, and individual differences do. The examination of the few 3/4-grams and monograms (mostly nouns) revealed that most of them either were not meaningful topics or could be subsumed under the 2-grams. In addition, there is inevitably some overlap in the topic categorizations. For example, some topics in the language teaching and learning category are situated and discussed within the context of multilingualism. The merits of topic categorization are twofold: it helps to monitor the overlap between the Abstract corpus-based topic items and the keywords, and it roughly delineates the research strands in the HCPs for future research.

It should also be noted that all the results are based on the retrieved HCPs only. The study did not aim to paint a comprehensive picture of the whole landscape of linguistic research. Rather, it focused specifically on the most popular literature in a specified timeframe, thus generating snapshots of trends in linguistic research. One important merit of this methodology is that newly emerging but highly cited researchers can be spotlighted and gain more academic attention, because only the metrics of HCPs are considered in the calculation. By contrast, the absence of some generally highly cited researchers, such as Rod Ellis and Ken Hyland, merely indicates that their highly cited publications fall outside our investigated timeframe and should not be interpreted as a sign of diminishing academic influence in the field. In addition, the study does not take collaboration into account when calculating the number of HCPs, for two reasons. First, although some researchers, such as Li C and Dewaele JM, are regular collaborators, their individual contributions should not be understated. Second, the study also provides the number of FA/CA HCPs for each listed author, which may help readers locate the research that interests them.

We acknowledge that our study has some limitations that should be addressed in future research. First, our study focuses on the HCPs extracted from WoS SSCI and A&HCI journals, which are regarded as the most celebrated papers in this field. Future studies may consider including data from other databases such as Scopus to verify the findings of the present study. Second, our Abstract corpus-based method for topic extraction involved human judgement. Although the final list was the result of several rounds of discussion among the authors, it is difficult or even impossible to avoid subjectivity, and some worthy topics may have been unintentionally missed. Future research may therefore consider employing automatic algorithms to extract topics. For example, a dependency-based machine learning approach can be used to identify research topics (Zhu and Lei, 2021).

Data availability statement

The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/supplementary material.

Author contributions

SY: conceptualization and methodology. SY and LZ: writing (original draft) and writing (review and editing). All authors contributed to the article and approved the submitted version.

This work was supported by the Humanities and Social Sciences Youth Fund of China MOE under grants 20YJC740076 and 18YJC740141.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2022.1052586/full#supplementary-material

Aksnes, D. W. (2003). Characteristics of highly cited papers. Res. Eval. 12, 159–170. doi: 10.3152/147154403781776645

Anthony, L. (2022). AntConc (version 4.0.5) Tokyo, Japan: Waseda University. Available at: https://www.laurenceanthony.net/software (Accessed June 20, 2022).

Archila, P. A., Molina, J., and Truscott de Mejía, A.-M. (2021). Fostering bilingual scientific writing through a systematic and purposeful code-switching pedagogical strategy. Int. J. Biling. Educ. Biling. 24, 785–803. doi: 10.1080/13670050.2018.1516189

Blessinger, K., and Hrycaj, P. (2010). Highly cited articles in library and information science: an analysis of content and authorship trends. Libr. Inf. Sci. Res. 32, 156–162. doi: 10.1016/j.lisr.2009.12.007

Chen, H., and Ho, Y. S. (2015). Highly cited articles in biomass research: a bibliometric analysis. Renew. Sust. Energ. Rev. 49, 12–20. doi: 10.1016/j.rser.2015.04.060

Clark, S. (2017). Translanguaging in higher education: beyond monolingual ideologies. Int. J. Biling. Educ. Biling. 22, 1048–1051. doi: 10.1080/13670050.2017.1322568

Dance, A. (2012). Authorship: Who’s on first? Nature 489, 591–593. doi: 10.1038/nj7417-591a

Danell, R. (2011). Can the quality of scientific work be predicted using information on the author’s track record? J. Am. Soc. Inf. Sci. Technol. 62, 50–60. doi: 10.1002/asi.21454

Darvin, R., and Norton, B. (2015). Identity and a model of Investment in Applied Linguistics. Annu. Rev. Appl. Linguist. 35, 36–56. doi: 10.1017/S0267190514000191

Dewaele, J.-M., and MacIntyre, P. D. (2016). “Foreign language enjoyment and foreign language classroom anxiety: the right and left feet of the language learner” in Positive psychology in SLA. eds. P. D. MacIntyre, T. Gregersen, and S. Mercer (Bristol, Blue Ridge Summit: Multilingual Matters), 215–236.

Gardner, R. (2010). Motivation and second language acquisition: The socio-educational model . New York: Peter Lang.

Gong, Y., Lyu, B., and Gao, X. (2018). Research on teaching Chinese as a second or foreign language in and outside mainland China: a bibliometric analysis. Asia Pac. Educ. Res. 27, 277–289. doi: 10.1007/s40299-018-0385-2

Hsu, Y., and Ho, Y. S. (2014). Highly cited articles in health care sciences and services field in science citation index Expanded. Methods Inf. Med. 53, 446–458. doi: 10.3414/ME14-01-0022

Lee, I. (2014). Publish or perish: the myth and reality of academic publishing. Lang. Teach. 47, 250–261. doi: 10.1017/S0261444811000504

Lei, L., and Liao, S. (2017). Publications in linguistics journals from mainland China, Hong Kong, Taiwan, and Macau (2003–2012): a bibliometric analysis. J. Quant. Ling. 24, 54–64. doi: 10.1080/09296174.2016.1260274

Lei, L., and Liu, D. (2018). The research trends and contributions of System’s publications over the past four decades (1973–2017): a bibliometric analysis. System 80, 1–13. doi: 10.1016/j.system.2018.10.003

Lei, L., and Liu, D. (2019). Research trends in applied linguistics from 2005 to 2016: a bibliometric analysis and its implications. Appl. Linguis. 40, 540–561. doi: 10.1093/applin/amy003

Leivada, E., Westergaard, M., Duñabeitia, J. A., and Rothman, J. (2020). On the phantom-like appearance of bilingualism effects on neurocognition: (how) should we proceed? Biling. Lang. Cogn. 24, 197–210. doi: 10.1017/S1366728920000358

Li, W. (2011). Moment analysis and translanguaging space: discursive construction of identities by multilingual Chinese youth in Britain. J. Pragmat. 43, 1222–1235. doi: 10.1016/j.pragma.2010.07.035

Li, W. (2018). Translanguaging as a practical theory of language. Appl. Linguis. 39, 9–30. doi: 10.1093/applin/amx039

Li, C. (2019). A positive psychology perspective on Chinese EFL students’ trait emotional intelligence, foreign language enjoyment and EFL learning achievement. J. Multiling. Multicult. Dev. 41, 246–263. doi: 10.1080/01434632.2019.1614187

Li, C. (2021). A control-value theory approach to boredom in English classes among university students in China. Mod. Lang. J. 105, 317–334. doi: 10.1111/modl.12693

Li, C., Dewaele, J. M., and Hu, Y. (2021). Foreign language learning boredom: conceptualization and measurement. Appl. Ling. Rev. doi: 10.1515/applirev-2020-0124

Li, C., Dewaele, J. M., and Jiang, G. (2019). The complex relationship between classroom emotions and EFL achievement in China. Appl. Ling. Rev. 11, 485–510. doi: 10.1515/applirev-2018-0043

Li, C., Jiang, G., and Dewaele, J.-M. (2018). Understanding Chinese high school students’ foreign language enjoyment: validation of the Chinese version of the foreign language enjoyment scale. System 76, 183–196. doi: 10.1016/j.system.2018.06.004

Liao, S., and Lei, L. (2017). What we talk about when we talk about corpus: a bibliometric analysis of corpus-related research in linguistics (2000-2015). Glottometrics 38, 1–20.

Liao, H., Tang, M., Li, Z., and Lev, B. (2018). Bibliometric analysis for highly cited papers in operations research and management science from 2008 to 2017 based on essential science indicators. Omega 88, 223–236. doi: 10.1016/j.omega.2018.11.005

Luk, G., Sa, E. D., and Bialystok, E. (2011). Is there a relation between onset age of bilingualism and enhancement of cognitive control? Biling. Lang. Cogn. 14, 588–595. doi: 10.1017/S1366728911000010

Macaro, E., Curle, S., Pun, J., and Dearden, J. (2018). A systematic review of English medium instruction in higher education. Lang. Teach. 51, 36–76. doi: 10.1017/S0261444817000350

Macintyre, P., Gregersen, T., and Mercer, S. (2019). Setting an agenda for positive psychology in SLA: theory, practice, and research. Mod. Lang. J. 103, 262–274. doi: 10.1111/modl.12544

Mancebo, F. P., Sapena, A. F., Herrera, M. V., González, L., Toca, H., and Benavent, R. A. (2013). Scientific literature analysis of judo in web of science. Arch. Budo 9, 81–91. doi: 10.12659/AOB.883883

Marušić, M., Božikov, J., Katavić, V., Hren, D., Kljaković-Gašpić, M., and Marušić, A. (2004). Authorship in a small medical journal: a study of contributorship statements by corresponding authors. Sci. Eng. Ethics 10, 493–502. doi: 10.1007/s11948-004-0007-7

May, S. (2014). The multilingual turn: Implications for SLA, TESOL and bilingual education . New York: Routledge.

Mondada, L. (2016). Challenges of multimodality: language and the body in social interaction. J. Socioling. 20, 336–366. doi: 10.1111/josl.1_12177

Mondada, L. (2018). Multiple temporalities of language and body in interaction: challenges for transcribing multimodality. Res. Lang. Soc. Interact. 51, 85–106. doi: 10.1080/08351813.2018.1413878

Mondada, L. (2019). Contemporary issues in conversation analysis: embodiment and materiality, multimodality and multisensoriality in social interaction. J. Pragmat. 145, 47–62. doi: 10.1016/j.pragma.2019.01.016

Naiman, N. (1978). The good language learner . Clevedon, UK: Multilingual Matters.

Newman, M. (2008). The first-mover advantage in scientific publication. EPL 86:68001. doi: 10.1209/0295-5075/86/68001

Newman, M. (2014). Prediction of highly cited papers. EPL 105:28002. doi: 10.1209/0295-5075/105/28002

Norton, B., and Toohey, K. (2011). Identity, language learning, and social change. Lang. Teach. 44, 412–446. doi: 10.1017/S0261444811000309

Ortega, L. (2019). SLA and the study of equitable multilingualism. Mod. Lang. J. 103, 23–38. doi: 10.1111/modl.12525

Ping, Z., Thijs, B., and Glänzel, W. (2009). Is China also becoming a giant in social sciences? Scientometrics 79, 593–621. doi: 10.1007/s11192-007-2068-x

Pritchard, A. (1969). Statistical bibliography or bibliometrics. J. Doc. 25, 348–349.

Ríos, L. J. C., Tamao, I. M., and Olmos, J. (2013). Bibliometric study (1922-2009) on rugby articles in research journals. South Afr. J. Res. Sport Phys. Educ. Rec. 17, 313–109. doi: 10.3176/tr.2013.3.06

Ruggeri, G., Orsi, L., and Corsi, S. (2019). A bibliometric analysis of the scientific literature on Fairtrade labelling. Int. J. Consum. Stud. 43, 134–152. doi: 10.1111/ijcs.12492

Sabiote, C. R., and Rodríguez, J. A. (2015). Bibliometric study and methodological quality indicators of the journal porta Linguarum during six year period 2008-2013. Porta Ling. 24, 135–150. doi: 10.30827/Digibug.53866

Schissel, J. L., De Korne, H., and López-Gopar, M. E. (2018). Grappling with translanguaging for teaching and assessment in culturally and linguistically diverse contexts: teacher perspectives from Oaxaca, Mexico. Int. J. Biling. Educ. Biling. 24, 340–356. doi: 10.1080/13670050.2018.1463965

Shen, Q., and Gao, X. (2018). Multilingualism and policy making in greater China: ideological and implementational spaces. Lang. Policy 18, 1–16. doi: 10.1007/s10993-018-9473-7

Small, H. (2004). Why authors think their papers are highly cited. Scientometrics 60, 305–316. doi: 10.1023/B:SCIE.0000034376.55800.18

Smith, D. R. (2007). The New Zealand timber economy, 1840–1935. N. Z. Med. J. 120, U2871–U2313. doi: 10.1016/0305-7488(90)90044-C

van Doorslaer, L., and Gambier, Y. (2015). Measuring relationships in translation studies. On affiliations and keyword frequencies in the translation studies bibliography. Perspectives 23, 305–319. doi: 10.1080/0907676X.2015.1026360

van Oorschot, J. A. W. H., Hofman, E., and Halman, J. (2018). A bibliometric review of the innovation adoption literature. Technol. Forecast. Soc. Chang. 134, 1–21. doi: 10.1016/j.techfore.2018.04.032

Xie, Z., and Willett, P. (2013). The development of computer science research in the People’s republic of China 2000–2009: a bibliometric study. Inf. Dev. 29, 251–264. doi: 10.1177/0266666912458515

Yan, S., Zhang, H., and Wang, J. (2022). Trends and hot topics in radiology, nuclear medicine and medical imaging from 2011–2021: a bibliometric analysis of highly cited papers. Jpn. J. Radiol. 40, 847–856. doi: 10.1007/s11604-022-01268-z

Zhu, H., and Lei, L. (2021). A dependency-based machine learning approach to the identification of research topics: a case in COVID-19 studies. Lib. Hi Tech 40, 495–515. doi: 10.1108/LHT-01-2021-0051

Zhu, H., and Lei, L. (2022). The research trends of text classification studies (2000–2020): a bibliometric analysis. SAGE Open 12, 215824402210899–215824402210816. doi: 10.1177/21582440221089963

Keywords: bibliometric analysis, linguistics, highly cited papers, corpus analysis, research trends

Citation: Yan S and Zhang L (2023) Trends and hot topics in linguistics studies from 2011 to 2021: A bibliometric analysis of highly cited papers. Front. Psychol. 13:1052586. doi: 10.3389/fpsyg.2022.1052586

Received: 24 September 2022; Accepted: 23 December 2022; Published: 11 January 2023.

Copyright © 2023 Yan and Zhang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Le Zhang

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.


Stefania M. Maci, Michele Sala: Book Review on Corpus Linguistics and Translation Tools for Digital Humanities: Research Methods and Applications

Fulu Liang is a PhD candidate in Translation Studies at College of Foreign Languages, Nankai University. With a strong background in in-house translation spanning five years, he has gained extensive translation experience in industries such as metallurgy, automobiles, and wind power. His research interests are diverse, with a particular focus on the established field of technical translation and cutting-edge topics such as computational translation studies, digital translation studies, and language technology.

Reviewed Publication:

Corpus Linguistics and Translation Tools for Digital Humanities: Research Methods and Applications, edited by Stefania M. Maci and Michele Sala. Bloomsbury, 2022, xiv+249 pp.

1 General introduction

As digital humanities (DH) gradually moves from the niche to the mainstream, its impact has been felt by an increasing number of disciplines in the humanities – including corpus linguistics and corpus-based translation studies. Although DH, corpus linguistics, and corpus-based translation studies all involve the use of computers, they developed independently of each other, with little interaction until after 2010. The second decade of the 21st century witnessed a boom in disruptive technologies such as artificial intelligence, big data, cloud computing, blockchain, and virtual reality, resulting in heightened awareness of applying computer technologies to humanities research (e.g., Zheng et al., 2022). This transformation gave impetus to DH, which then flourished worldwide. Many disciplines hastened to embrace it with a view to borrowing computational methods from DH to foster interdisciplinary activities, improve the digital literacy and data literacy of the humanities, or enhance computational thinking in the field. Arguably, DH will shed light on corpus linguistics and corpus-based translation studies and be reinforced by them in return. However, there is little consensus on the best practices of DH-informed corpus linguistics and corpus-based translation studies that can help us clarify where we are, where to go, and how to get there. Fortunately, Corpus Linguistics and Translation Tools for Digital Humanities: Research Methods and Applications, edited by Stefania M. Maci and Michele Sala, was published at an opportune time and will hopefully lay the foundation on which future research can be based.

2 Book introduction

This book brings together three strands: DH, corpus linguistics, and corpus-based translation studies. It mainly comprises case studies of various research topics from a variety of research fields. Its 10 chapters consist of an introductory chapter (Chapter 1), followed by Part 1 (Chapters 2–5), which focuses on corpus linguistics and DH, and Part 2 (Chapters 6–10), which focuses on corpus-based translation studies and DH.

In Chapter 1, the editors justify their reasons for choosing such a theme for the book by viewing corpus linguistics and DH as being in a part–whole relationship after a critical review of their differences and similarities. According to the editors, “DH is the overarching term for the macro-area of research which analyses texts” (p. 3), while corpus linguistics refers precisely to the ‘plethora of methods’ mentioned in DH, namely, “the set of principled approaches and tools” (p. 3). The editors then provide a brief introduction to Part 1 (the connection between DH and corpora) and Part 2 (the connection between corpora and translation studies). The rest of this chapter is devoted to briefing readers on the content of each chapter in order to enable them to better follow the thoughts of the authors.

Furthering the discussion, Chapter 2 by Paola Catenaccio discusses two main strands of DH – i.e., (i) the study of computer-mediated communication (CMC) in its various forms and (ii) the use of computer-based techniques for text analysis. It highlights that the traditional theories of CMC do not fully account for emergent technology-derived issues, such as multimodality and multisemiotics. Therefore, Catenaccio puts forward an “adaptive theory approach” to DH, which means that theory development in DH should be adaptive not only to capture the evolution of the object of analysis but also in the sense that it must rely on evidence emerging from corpus-driven (or data-driven) investigation.

In Chapter 3, Marina Bondi demonstrates the use of corpora in cross-cultural genre studies with a case study of Corporate Social Responsibility (CSR) reports. The author first lays the foundation for further discussion by arguing that lexical categories need to be integrated with semantic, functional, and pragmatic perspectives and that corpus linguistics should be employed. After a critical review of the cross-cultural analysis of CSR reports, which serves to elicit the research question and determine the type of corpora to be adopted (in this case, a full corpus and comparable subcorpora), Bondi reports on the language, size, representativeness, source, and comparability of the corpus. For the detailed analysis, a top-down lexico-grammatical analysis of the generic structure of CSR reports is adopted, followed by a bottom-up semantic and pragmatic analysis using keywords and concordances.

In Chapter 4, Miguel Fuster-Márquez discusses the application of corpora to the extraction and operationalization of lexical bundles (LBs), a line of research that broke new ground with the compilation of Biber et al.’s Longman Grammar of Spoken and Written English (1999). Fuster-Márquez distinguishes between the phraseological approach and the probabilistic approach in studies on LBs and then focuses on the latter. He highlights that the probabilistic approach is an inductive bottom-up approach to the identification of LBs, which relies entirely on corpus techniques. Further, the core features for the identification of LBs are reported. Fuster-Márquez ends the chapter with a discussion of bundle size, frequency threshold, and dispersion, which are shared criteria for the two main operational approaches to LB identification: frequency-defined bundles and association-defined bundles.
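The three criteria named above (bundle size, frequency threshold, and dispersion) can be illustrated with a minimal sketch of frequency-defined bundle extraction. This is not the procedure used in the chapter: the function name `lexical_bundles`, the whitespace tokenisation, and the default threshold values are all hypothetical choices made for illustration only.

```python
from collections import Counter, defaultdict


def lexical_bundles(texts, n=4, min_per_million=20.0, min_texts=2):
    """Return {bundle: (raw_count, number_of_texts)} for n-word sequences.

    n               -- bundle size in words (assumed default, not from the book)
    min_per_million -- normalised frequency threshold (assumed default)
    min_texts       -- dispersion threshold: minimum number of texts
                       in which the sequence must occur (assumed default)
    """
    counts = Counter()
    text_range = defaultdict(set)
    total_tokens = 0

    for text_id, text in enumerate(texts):
        tokens = text.lower().split()  # naive tokenisation, for illustration
        total_tokens += len(tokens)
        for i in range(len(tokens) - n + 1):
            gram = tuple(tokens[i:i + n])
            counts[gram] += 1
            text_range[gram].add(text_id)

    bundles = {}
    for gram, count in counts.items():
        per_million = count / total_tokens * 1_000_000
        if per_million >= min_per_million and len(text_range[gram]) >= min_texts:
            bundles[" ".join(gram)] = (count, len(text_range[gram]))
    return bundles


if __name__ == "__main__":
    toy_corpus = [
        "on the other hand it is",
        "on the other hand we see",
        "it is clear that we see",
    ]
    print(lexical_bundles(toy_corpus, n=4, min_per_million=0, min_texts=2))
```

On a real corpus the thresholds would be set according to the study design; an association-defined approach would replace the raw frequency filter with an association measure.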

Furthermore, Chapter 5 by Stefania M. Maci investigates the dissemination of the ketogenic diet (KD) discourse on Twitter. After reviewing related literature on the KD, Maci provides a detailed description of the methodology. The data generated during a designated period of time were collected by searching for keywords and hashtags using Social Bearing, a free Twitter analytics application. Then, quantitative-based analysis was performed on the data with Sketch Engine to identify typical linguistic characteristics. To triangulate the data, another quantitative-based analysis was performed on the data with WMatrix 4 to determine the semantic domains. The research findings exhibit Twitter users’ understanding of and attitudes towards KD, shedding light on the dissemination of e-health discourse on digital platforms.

Moving on to corpus linguistics and translation studies, Chapter 6 by Patrizia Anesa examines the use of digital corpora for professional legal translation from the perspective of DH. To begin with, Anesa extends DH from academia to the professional setting of legal translation by situating the area of practice between legilinguistics, translation studies, and corpus linguistics within the overarching concept of DH. She then gives an overview of some existing legal corpora employed in legilinguistics and legal translation and offers a glimpse into the evolving relationship between corpora and specialized legal translation. At the conclusion of this chapter, she discusses in detail the use of corpora in translation tools and processes, translation practice, and translator training.

In Chapter 7, Cinzia Spinzi and Anouska Zummo present a comparative study of emotive language in English and Italian migrant narratives to assess the intention and effect of linguistic choices. To be specific, they adopt Appraisal Theory and focus on its Affect dimension, which comprises five semantic domains for emotions: un/happiness, in/security, dis/satisfaction, surprise, and dis/inclination. After acknowledging the contributions of DH to the availability of data and software, among others, Spinzi and Zummo report on the data collection from digital museums and the design of the corpus. The corpus is interrogated with AntConc (version 3.5.9), focusing on the polarity and strategy of emotive language. The conclusion is drawn by comparing and interpreting the results from the English subcorpus and the Italian subcorpus.

Then, Chapter 8 by Francesca Bianchi et al. introduces a set of terminology management affordances with built-in learning analytics for interpreter training. Bianchi et al. first identify the need for a glossary tool linked to monitoring and self-monitoring tools (for teachers and students, respectively) and possibly supported by learning analytics technologies. Then, they provide an overview of the existing tools supporting terminology management. According to Bianchi et al., the affordances tailored to their needs comprise a glossary tool, web search tracking and logging functions, and a learning analytics system. Bianchi et al. then demonstrate the performance of the affordances in interpreter training at the University of Salento. They end the chapter with insights into the possible uses of the affordances in the future.

Chapter 9 by Gianmarco Vignozzi applies corpus linguistics to the analysis of the construction and translation of characters in the four English original films of Little Women and their Italian dubbed versions. Vignozzi paves the way for further exploration by reflecting on the efficacy of corpus linguistics in assessing the translation of multimedia texts. Following an overview of the big-screen adaptations of Little Women, he proceeds to detail the development of the corpus using Sketch Engine. The analysis of the March sisters’ speech is then conducted in two stages, with a focus on the implicit textual cues identified by Culpeper’s characterization model. First, a quantitative character-based analysis is performed by extracting keywords; the results are then subjected to a qualitative analysis. In addition, the concordance lines are examined to evaluate the translation of the dialogues.
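The idea behind keyword extraction, namely comparing a word's frequency in a target text (for example, one character's lines) against a reference text, can be sketched as follows. The review does not say which keyness statistic the chapter relies on, so log-likelihood is swapped in here simply because it is a widely used keyness measure; it should not be read as the scoring used by Sketch Engine or by the author. The function names and the toy tokens are hypothetical.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness for one word.

    a, b -- frequency of the word in the target and reference corpus
    c, d -- total number of tokens in the target and reference corpus
    """
    expected_target = c * (a + b) / (c + d)
    expected_reference = d * (a + b) / (c + d)
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / expected_target)
    if b > 0:
        ll += b * math.log(b / expected_reference)
    return 2 * ll


def keywords(target_tokens, reference_tokens, top=10):
    """Rank words that are relatively more frequent in the target corpus.

    Both token lists are assumed to be non-empty.
    """
    target = Counter(target_tokens)
    reference = Counter(reference_tokens)
    c, d = len(target_tokens), len(reference_tokens)
    scored = []
    for word, a in target.items():
        b = reference.get(word, 0)
        if a / c > b / d:  # keep only positively key words
            scored.append((word, log_likelihood(a, b, c, d)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top]


if __name__ == "__main__":
    # Toy token lists standing in for one character's lines and the rest.
    target = ["jo", "writes", "jo", "reads"]
    reference = ["amy", "paints", "amy", "draws"]
    for word, score in keywords(target, reference):
        print(f"{word}\t{score:.2f}")
```

A ranked list of this kind is only the quantitative first stage; as the chapter's design suggests, the keywords still need to be read in their concordance lines before anything is concluded about characterization.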

In the last chapter of this book, Alessandra Rizzo investigates the linguistic features of dialogues and subtitles of TV crime dramas and their translations. She begins the chapter by situating this study within DH. A parallel corpus made from three episodes of three different TV crime dramas set in different geographical locations was compiled for the analysis. She then undertakes a two-level analysis of the linguistic features of orality: one level examines language choices, drawing on the theoretical framework of Halliday’s Systemic Functional Linguistics (SFL), and the other examines the translation strategies applied to these linguistic features. Rizzo concludes with an interpretation of the outcomes and remarks on the constraints and prospects of this study.

3 Critical evaluation

This book presents corpus linguistics and corpus-based translation studies as being within the purview of DH. It provides not only theoretical reflections on the burgeoning fields of research nested in DH, corpus linguistics, and translation studies with computers as pivotal components but also concrete case studies covering a wide range of research topics. Answering the call for the humanities to engage with digitalization, this seminal work speaks directly to those seeking to operate in this interdisciplinary or even transdisciplinary realm and paves the way for further discussion.

The biggest merit of the book is that it brings to the fore the nexus between corpus and DH and promises to consolidate the area. Although “corpus” and “corpora” are widely used in the literature on DH, it is surprising that corpus linguistics has been slow to embrace DH. A simple search with the keywords “digital humanities” in the SSCI & AHCI journals International Journal of Corpus Linguistics and Corpus Linguistics and Linguistic Theory indexed in CNKI Scholar academic database returns no results. It was only in the third decade of the 21st century that the relevance of DH to corpus linguistics began to be recognized. As a matter of fact, The Routledge Handbook of Corpus Linguistics (2020) dedicates a new chapter to corpora and DH, while in the 2010 edition, no instance of “digital humanities” is found.

In fact, this trend is also true of corpus-based translation studies. Tanasescu (2021) astutely highlights that it was only as recently as 2018–2019 that DH-inflected research started to gain more and more ground (in Translation Studies). Her observation coincides with the situation in China. Hu (2018) wrote an introductory article titled “Progress and Prospects of Translation Studies from the Perspective of Digital Humanities”. In the same year, the Research Center for Digital Humanities led by Hongwu Qin, another distinguished scholar in corpus-based translation studies in China, was founded at Qufu Normal University. This edited volume resonates with the academic circle and will undoubtedly promote related research.

Another major merit is that it provides some case studies against which future research can be benchmarked. First, the wide-ranging case studies indicate what can be counted as DH. For example, Chapter 8 demonstrates the use of learning analytics in interpreter training, suggesting that the quantitative analysis of students’ learning data is also a relevant component of DH. As online learning is becoming one of the most significant trends in educational settings (Mei et al., 2022), viewing learning analytics as being part of DH will shed light on the analysis of students’ online learning data. Second, it provides procedural guidelines for this line of research, which usually comprise corpus design, data collection, data processing, concordance, and the interpretation of results with relevant theories, among others. Further, the issues encountered during these processes and their corresponding solutions will serve as a useful reference. For example, in Chapter 9, a mixed method of quantitative analysis and qualitative analysis is adopted to compensate for the limitations of each method. Third, it brings together many different areas of research such as legilinguistics, e-health, and films, allowing cross-referencing between these areas to adapt tools, methods, theories, and topics to their specific requirements. In summary, since the article by Jensen (2014) was published, other articles have touched upon the relationship between corpus linguistics and DH; however, books dedicated to this topic with specific case studies have been few and far between. From this perspective, this book is a pioneering scholarly work that promises to inspire the adoption of DH.

Despite the aforementioned merits, this book is not without certain flaws. First, the title is slightly confusing. It seems to suggest that translation tools are included, but in fact, by tools, it means corpus tools for translation studies rather than computer-aided translation tools. Second, the relevance of DH to each chapter should be articulated explicitly. While some chapters provide discussions on the significance of DH, there are several chapters whose connections with DH are not explicitly stated, resulting in a certain degree of disjointedness between respective chapters and the volume as a whole; this is especially true of Part 1. Third, the tools and methods used in this book are still very limited and mostly confined to traditional corpus linguistics, as opposed to kaleidoscopic data processing and analysis tools and methods in DH. In the future, importance should be placed on adapting advanced tools and methods from DH and computational linguistics, among others, to meet the needs of this interdisciplinary field of research. Fourth, the connection between case studies and DH needs to be deepened. To be specific, DH has evolved into a field of research with domain-specific discourse comprising research topics and terms, among others. Future research carried out with DH in mind should incorporate DH discourse for cross-fertilization. Nevertheless, this book combines the corpus linguistics approach and the DH approach with humanities research, making it a ground-breaking seminal work for scholars of humanities in the digital age.

References

Hu, K. (2018). 数字人文视域下翻译研究的进展与前景 [Progress and prospects of translation studies from the perspective of digital humanities]. Chinese Translators Journal, 39(6), 24–26. https://doi.org/10.21037/cco.2018.05.03

Jensen, K. E. (2014). Linguistics and the digital humanities: (Computational) corpus linguistics. Journal of Media and Communication Research, 30(57), 115–134. https://doi.org/10.7146/mediekultur.v30i57.15968

Mei, F., Lu, Y., & Ma, Q. (2022). Online language education courses: A Chinese case from an ecological perspective. Journal of China Computer-Assisted Language Learning, 2(2), 228–256. https://doi.org/10.1515/jccall-2022-0017

Tanasescu, R. (2021). Complexity and the place of translation in digital humanities: Post-disciplinary communities of practice in the translation studies network. In K. Marais & R. Meylaerts (Eds.), Exploring the implications of complexity thinking for translation studies (pp. 30–72). Routledge. https://doi.org/10.4324/9781003105114-3

Zheng, C., Yu, M., Guo, Z., Liu, H., Gao, M., & Chai, C. (2022). Review of the application of virtual reality in language education from 2010 to 2020. Journal of China Computer-Assisted Language Learning, 2(2), 299–335. https://doi.org/10.1515/jccall-2022-0014

© 2023 the author(s), published by De Gruyter, Berlin/Boston

This work is licensed under the Creative Commons Attribution 4.0 International License.


Journal of China Computer-Assisted Language Learning


Search for dissertations about: "corpus linguistics"

Showing results 1–5 of 86 Swedish dissertations containing the words "corpus linguistics".

1. Morphosyntactic Corpora and Tools for Persian

Author: Mojgan Seraji; Joakim Nivre; Carina Jahani; Jan Hajic; Uppsala universitet. Keywords: NATURAL SCIENCES; Persian; language technology; corpus; treebank; preprocessing; segmentation; part-of-speech tagging; dependency parsing; Computational Linguistics

Abstract: This thesis presents open source resources in the form of annotated corpora and modules for automatic morphosyntactic processing and analysis of Persian texts. More specifically, the resources consist of an improved part-of-speech tagged corpus and a dependency treebank, as well as tools for text normalization, sentence segmentation, tokenization, part-of-speech tagging, and dependency parsing for Persian.

2. Engagement in Medical Research Discourse: A Multisemiotic Discourse-Semantic Study of Dialogic Positioning

Author: Daniel Lees Fryer; Göteborgs universitet. Keywords: HUMANITIES; engagement; medical research discourse; social semiotics; systemic functional linguistics; dialogic theory; linguistics; semiosis; multisemiosis; multimodality; intersemiosis; intermodality; corpus linguistics; genre; disciplinarity; ideology

Abstract: This study investigates how medical researchers engage with a background of prior and anticipated utterances in a collection of highly cited English-language medical research articles. Taking a multisemiotic, systemic-functional approach, I examine the verbal, visual, and mathematical resources used by medical research writers to construe, engage with, and position themselves in relation to a dialogic background of different voices, positions, and propositions.

3. Clefts in English and Swedish: A contrastive study of IT-clefts and WH-clefts in original texts and translations

Author: Mats Johansson; English. Keywords: HUMANITIES; English language and literature; information structure; ground; focus; discourse topic; topic; theme; discourse; fronting; wh-clefts; it-clefts; pseudo-cleft constructions; cleft constructions; bidirectional translation corpus; translation; corpus linguistics; contrastive linguistics; Swedish; English; Scandinavian languages and literature; Linguistics

Abstract: This study investigates the use of cleft constructions in English and Swedish on the basis of a bidirectional translation corpus consisting of original English and Swedish texts and their translations into the other language. This design minimizes the problems inherent in corpora of original texts alone, viz. [...]

4. The Balochi Language of Turkmenistan : A corpus-based grammatical description

Author: Serge Axenov; Carina Jahani; Åke Viberg; Elena Bashir; Uppsala universitet. Keywords: Iranian languages; Balochi; dialectology; phonology; morphology; syntax; descriptive linguistics; sociolinguistics; unwritten languages; fieldwork; Linguistics

Abstract: This dissertation is a synchronic description of the Balochi language as spoken in Turkmenistan. The dissertation consists of three main parts: sound structure, word- and phrase-level morphosyntax, and clause structure.

5. Learning Idiomaticity : A Corpus-Based Study of Idiomatic Expressions in Learners' Written Production

Author: Maria Wiktorsson; English. Keywords: HUMANITIES; corpus linguistics; idiom principle; open choice principle; construction grammar; compositionality; conventionalisation; idiom; formulae; collocation; prefab; Swedish learners of English; L2; Idiomaticity; L1; English language and literature; Linguistics

Abstract: The aim of this study is to investigate how Swedish learners of English (at different levels of proficiency) master idiomaticity in their target language. I argue that idiomaticity can be related to the storage and use of multi-word expressions that are preferred by native speakers.


Trends and hot topics in linguistics studies from 2011 to 2021: A bibliometric analysis of highly cited papers

Associated data.

The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/supplementary material.

High citations most often characterize quality research that reflects the foci of the discipline. This study aims to spotlight the most recent hot topics and the trends emerging from the highly cited papers (HCPs) in the Web of Science categories of linguistics and language & linguistics by means of bibliometric analysis. The bibliometric information of the 143 HCPs based on Essential Citation Indicators was retrieved and used to identify and analyze influential contributors at the levels of journals, authors, and countries. The most frequently explored topics were identified by corpus analysis and manual checking. The retrieved topics can be grouped into five general categories: multilingual-related, language teaching and learning-related, psycho/pathological/cognitive linguistics-related, methods and tools-related, and others. Topics such as bi/multilingual(ism), translanguaging, language/writing development, models, emotions, foreign language enjoyment (FLE), cognition, and anxiety are among the most frequently explored. Multilingual and positive trends are discerned from the investigated HCPs. The findings inform linguistic researchers of the publication characteristics of the HCPs in the linguistics field and help them pinpoint the research trends and directions on which to focus their efforts in future studies.

1. Introduction

Citations, as a rule, exhibit a skewed distribution across academic publications: a few papers accumulate an overwhelmingly large number of citations, while the majority are rarely, if ever, cited. Correspondingly, the highly cited papers (HCPs) receive the greatest amount of attention in academia, as citations are commonly regarded as a strong indicator of research excellence. For academic professionals, following HCPs is an efficient way to stay current with developments in a field and to make better informed decisions about potential research topics and directions. Academic institutions, government and private agencies, and science policy makers in general keep a close eye on this visible indicator and use it to make more informed decisions on research funding allocation and science policy formulation. Against the backdrop of ever-growing academic output, there is a noticeable shift of attention from publication quantity to publication quality. Many countries are developing research policies to identify “excellent” universities, research groups, and researchers (Danell, 2011). In a word, HCPs showcase high-quality research, encompass significant themes, and constitute a critical reference point in a research field, as they are the “gold bullion of science” (Smith, 2007).

2. Literature review

Bibliometrics, a term coined by Pritchard (1969), refers to the application of mathematical methods to the analysis of academic publications. Essentially, it is a quantitative method for depicting publication patterns within a given field based on a body of literature. There are many bibliometric studies on the natural and social sciences in general (Hsu and Ho, 2014; Zhu and Lei, 2022) and on various specific disciplines such as management sciences (Liao et al., 2018), biomass research (Chen and Ho, 2015), computer sciences (Xie and Willett, 2013), and sport sciences (Mancebo et al., 2013; Ríos et al., 2013). In these studies, researchers tracked developments, weighed research impacts, and highlighted emerging scientific fronts with bibliometric methods. In the field of linguistics, bibliometric studies have all appeared in the past few years (van Doorslaer and Gambier, 2015; Lei and Liao, 2017; Gong et al., 2018; Lei and Liu, 2018, 2019). These studies mostly examined a sub-area of linguistics, such as corpus linguistics (Liao and Lei, 2017), translation studies (van Doorslaer and Gambier, 2015), or the teaching of Chinese as a second/foreign language (Gong et al., 2018), or an individual academic journal such as System (Lei and Liu, 2018) or Porta Linguarum (Sabiote and Rodríguez, 2015). Although Lei and Liu (2019) took the entire discipline of linguistics under investigation, their research focused exclusively on applied linguistics and was restricted to a limited number of journals (42 in total), leaving publications in other linguistics disciplines and qualified journals unexamined.

In recent years, a number of studies have been concerned with “excellent” papers or HCPs. For example, Small (2004) surveyed HCP authors’ opinions on why their papers are highly cited. The strong interest, novelty, utility, and high importance of the work were among the most frequently mentioned reasons. Most authors also considered their selected HCPs to be based on the most important work of their academic careers. Aksnes (2003) investigated the characteristics of HCPs and found that they were generally authored by a large number of scientists, often involving international collaboration. Some researchers even attempted to predict HCPs by building mathematical models, implying “the first mover advantage in scientific publication” (Newman, 2008, 2014). In other words, papers published earlier in a field are generally more likely to accumulate citations than those published later. Although many papers addressed HCPs from different perspectives, they held a common belief that HCPs are very different from less cited or uncited papers and thus deserve the utmost attention in academic research (Aksnes, 2003; Blessinger and Hrycaj, 2010; Yan et al., 2022).

Although an increased focus on research quality can be observed in different fields, opinions diverge on the range and the inclusion criteria of excellent papers. Are they ‘highly cited’, ‘top cited’, or ‘most frequently cited’ papers? Aksnes (2003) noted two different approaches to defining a highly cited article, involving absolute and relative thresholds, respectively. An absolute threshold stipulates a minimum number of citations for identifying excellent papers, while a relative threshold employs percentile rank classes, for example, the top 10% most highly cited papers in a discipline, in a publication year, or in a publication set. It is important to note that citation counts differ significantly across fields and disciplines. An HCP in the natural sciences generally accumulates more citations than its counterpart in the social sciences. Thus, it is necessary to investigate HCPs from different fields separately or to adopt different inclusion criteria to ensure a valid comparison.
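The difference between the two approaches can be illustrated with a minimal Python sketch. The function names, the citation counts, and the cut-off values below are hypothetical and chosen only for illustration; they are not taken from Aksnes (2003) or from the data of this study.

```python
import math


def select_by_absolute_threshold(citations, min_citations):
    """Keep papers whose citation count is at least min_citations."""
    return {paper: n for paper, n in citations.items() if n >= min_citations}


def select_by_relative_threshold(citations, top_fraction):
    """Keep the top `top_fraction` of papers ranked by citation count."""
    ranked = sorted(citations.items(), key=lambda item: item[1], reverse=True)
    k = max(1, math.ceil(len(ranked) * top_fraction))
    cutoff = ranked[k - 1][1]
    # Papers tied with the cut-off paper are kept as well.
    return {paper: n for paper, n in ranked if n >= cutoff}


# Hypothetical citation counts for ten papers in one field and year.
citations = {"P1": 250, "P2": 40, "P3": 12, "P4": 3, "P5": 0,
             "P6": 0, "P7": 1, "P8": 95, "P9": 7, "P10": 2}

absolute = select_by_absolute_threshold(citations, min_citations=50)
relative = select_by_relative_threshold(citations, top_fraction=0.10)
```

With these toy figures the absolute threshold of 50 citations admits two papers, whereas the top 10% rule admits only one, which shows why the choice of criterion has to be stated explicitly before results are compared.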

The present study has been motivated by two considerations. First, the sizable number of publications of varied qualities in a scientific field makes it difficult or even impossible to conduct any reliable and effective literature research. Focusing on the quality publications, the HCPs in particular, might lend more credibility to the findings on trends. Second, HCPs can serve as a great platform to discover potentially important information for the development of a discipline and understand the past, present, and future of the scientific structure. Therefore, the present study aims to investigate the hot topics and publication trends in the Web of Science category of linguistics or language & linguistics (shortened as linguistics in later references) with bibliometric methods. The study aims to answer the following three questions:

  • Who are the most productive and impactful contributors of the HCPs in WoS category of linguistics or language & linguistics in terms of publication venues, authors, and countries?
  • What are the most frequently explored topics in HCPs?
  • What are the general research trends revealed from the HCPs?

3. Materials and methods

Unlike previous studies, which used an arbitrary inclusion threshold (e.g., Blessinger and Hrycaj, 2010; Hsu and Ho, 2014), we rely on Essential Science Indicators (ESI) to identify the HCPs. Developed by Clarivate, a leading company in the areas of bibliometrics and scientometrics, ESI reveals emerging science trends as well as influential individuals, institutions, papers, journals, and countries in any scientific field of inquiry by drawing on the complete WoS databases. ESI was chosen for three reasons. First, ESI adopts a strict inclusion criterion for HCP identification: a paper is selected as an HCP only when its citation count meets the top 1% citation threshold for its field (one of the 22 ESI subject categories) and publication year. Second, ESI is widely used and recognized for its reliability and authority in identifying top-charting work and generating “excellent” metrics, including hot and highly cited papers. Third, ESI automatically updates its database to generate the most recent HCPs, which makes it especially suitable for trend studies over a specified timeframe.

3.1. Data source

The data retrieval was completed at the portal of our university library on June 20, 2022. The methods used to retrieve the data are described in Table 1. The bibliometric indicators for the important contributors at the journal, author, and country levels were obtained. Specifically, after the search was completed, we clicked the “Analyze Results” bar on the result page for a detailed descriptive analysis of the retrieved bibliometric data.

Table 1. Retrieval strategies.

Several points should be noted about the search strategies. First, we searched the bibliometric data in two sub-databases of the WoS Core Collection: the Social Science Citation Index (SSCI) and the Arts & Humanities Citation Index (A&HCI). There is no need to include the sub-database Science Citation Index Expanded (SCI-EXPANDED) because publications in the linguistics field are almost exclusively indexed in SSCI and A&HCI journals. The WoS Core Collection was chosen as the data source because it is one of the most comprehensive and authoritative databases of bibliometric information in the world. Many previous studies have used WoS to retrieve bibliometric data. van Oorschot et al. (2018) and Ruggeri et al. (2019) even indicated that WoS meets the highest standards in terms of impact factor and citation counts and hence guarantees the validity of any bibliometric analysis. Second, we did not restrict the document types, as the HCP selection informed by ESI only considers articles and reviews. Third, we did not set a date range, as the ESI-HCP dataset is automatically updated at regular intervals to include the most recent 10 years of publications.

The aforementioned query returned a total of 143 HCPs published in 48 journals and contributed by 352 authors from 226 institutions. We then downloaded the raw bibliometric parameters of the 143 HCPs for follow-up analysis, including publication years, authors, publication titles, countries, affiliations, abstracts, and citation reports. A complete list of the 143 HCPs can be found in the Supplementary Material. We collected the most recent impact factor (IF) of each journal from the 2022 Journal Citation Reports (JCR).

3.2. Data analysis

3.2.1. Citation analysis

A citation threshold is the minimum number of citations obtained by ranking papers in a research field in descending order by citation counts and then selecting the top fraction or percentage of papers. In ESI, the highly cited threshold is the minimum number of citations received by the top 1% of papers from each of the 10 database years. In other words, a paper has to meet a minimum citation threshold that varies by research field and by year to enter the HCP list. Of the 22 research fields in ESI, Social Sciences, General is a broad field covering a number of WoS categories, including linguistics and language & linguistics. We checked the ESI official website to obtain the yearly highly cited thresholds in the research field of Social Sciences, General, as shown in Figure 1 (https://esi.clarivate.com/ThresholdsAction.action). As can be seen, the longer ago a paper was published, the more citations it has to receive to meet the threshold. We then divided the raw citation count of each HCP by the highly cited threshold of the corresponding year to obtain the normalized citations for each HCP.
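The normalization step described above can be sketched in a few lines of Python. The yearly thresholds and the three papers below are hypothetical placeholders rather than the actual ESI values shown in Figure 1, and the function name is ours.

```python
def normalize_citations(raw_citations, year, thresholds):
    """Divide a paper's raw citations by the highly cited threshold of its year."""
    return raw_citations / thresholds[year]


# Hypothetical yearly thresholds (older years require more citations).
thresholds = {2011: 120, 2016: 60, 2021: 5}

# Hypothetical papers: (identifier, publication year, raw citations).
papers = [("A", 2011, 480), ("B", 2016, 90), ("C", 2021, 30)]

normalized = {pid: normalize_citations(cites, year, thresholds)
              for pid, year, cites in papers}

# Rank papers by normalized rather than raw citations.
ranked = sorted(normalized, key=normalized.get, reverse=True)
```

In this toy example paper C has by far the fewest raw citations but the highest normalized value, which is the effect the normalization is meant to capture: recent papers are compared against the lower threshold of their own publication year.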

Figure 1. Highly cited thresholds in the research field of Social Sciences, General.

3.2.2. Corpus analysis and manual checking

To determine the most frequently explored topics in these HCPs, we used both corpus-based analysis of word frequency and manual checking. The underlying assumption is that the more frequently a word or phrase occurs in a purpose-built corpus, the more likely it is to constitute a research topic. In this study, we built an Abstract corpus with all the abstracts of the 143 HCPs, totaling 24,800 tokens. The procedure for retrieving the research topics from the Abstract corpus was as follows. First, the 143 abstracts were saved as separate .txt files in one folder. Second, AntConc (Anthony, 2022), a corpus analysis tool for concordancing and text analysis, was employed to extract lists of n-grams (2–4) in decreasing order of frequency. We also generated a list of individual nouns because individual nouns can sometimes constitute research topics as well. Given the small size of our corpus, we adopted both a frequency criterion (3) and a range criterion (3) for topic candidacy. That is, a candidate n-gram must occur at least 3 times and in at least 3 different abstract files. The frequency threshold guarantees the importance of the candidate topics, while the range threshold guarantees that the topics are not concentrated in a small number of publications. We tested the frequency and range thresholds over several rounds to ensure the inclusion of all the potential topics. In total, we obtained 531 nouns, 1,330 2-grams, 331 3-grams, and 81 4-grams. Third, because most of the retrieved n-grams cannot function as meaningful research topics, we manually checked all the candidate items and discussed them extensively until full agreement was reached on their roles as potential research topics. Finally, we read all the abstracts of the 143 HCPs to further validate their roles as research topics. In the end, we obtained 118 topic items in total.
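The frequency and range criteria can be made concrete with a short Python sketch. This is not AntConc and does not reproduce its tokenization or output; it is a simplified stand-in, with a naive tokenizer, hypothetical function names, and four invented one-sentence "abstracts".

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z]+(?:[-'][a-z]+)*", text.lower())


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def candidate_ngrams(abstracts, n_values=(2, 3, 4), min_freq=3, min_range=3):
    """Keep n-grams occurring >= min_freq times in >= min_range abstracts."""
    freq = Counter()       # total occurrences across the corpus
    doc_range = Counter()  # number of abstracts containing the n-gram
    for text in abstracts:
        tokens = tokenize(text)
        grams = [g for n in n_values for g in ngrams(tokens, n)]
        freq.update(grams)
        doc_range.update(set(grams))
    kept = [(g, freq[g], doc_range[g]) for g in freq
            if freq[g] >= min_freq and doc_range[g] >= min_range]
    return sorted(kept, key=lambda item: (-item[1], item[0]))


# Invented toy corpus; the real corpus consists of 143 abstract files.
abstracts = [
    "Working memory predicts foreign language enjoyment.",
    "Foreign language enjoyment and working memory in learners.",
    "Working memory, working memory, working memory, working memory.",
    "Foreign language enjoyment among EFL learners.",
]

candidates = candidate_ngrams(abstracts)
```

In the toy corpus, "memory working" occurs three times but only in a single abstract, so it passes the frequency criterion and fails the range criterion. The surviving candidates would still need the manual checking described above before being accepted as topics.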

4. Results

4.1. Main publication venues of HCPs

Of the 48 journals that published the 143 HCPs, 17 contributed at least 3 HCPs each (Table 2), together accounting for around 71.33% of the examined HCPs (102/143). This indicates that HCPs tend to be highly concentrated in a limited number of journals. The three largest publication outlets of HCPs are Bilingualism Language and Cognition (16), International Journal of Bilingual Education and Bilingualism (11), and Modern Language Journal (10). Because journals vary greatly in the number of papers they publish per year, and the number of HCPs is associated with journal circulation, we divided the number of HCPs by the total number of papers (TP) published in the examined years (2011–2021) to obtain the HCP percentage for each journal (HCPs/TP). The three journals with the highest HCPs/TP percentage are Annual Review of Applied Linguistics (2.26), Modern Language Journal (2.08), and Bilingualism Language and Cognition (1.74), indicating that papers published in these journals have a higher probability of entering the HCP list.

Table 2. Top 17 publication venues of HCPs.

N: the number of HCPs in each journal; N%: the percentage of HCPs in each journal in the total of 143 HCPs; TP: the total number of papers in the examined timespan (2011–2021); N/TP %: the percentage of HCPs in the total journal publications in the examined time span; TC/HCP: average citations of each HCP; R: journal ranking for the designated indicator; IF: Impact Factor in the year of 2022.

To gauge the general impact of the HCPs from each journal, we divided their total citations (TC) by the number of HCPs to obtain the average citations per HCP (TC/HCP). The three journals with the highest TC/HCP are Journal of Memory and Language (837.86), Computational Linguistics (533.75), and Journal of Pragmatics (303.75). This indicates that, even within the same WoS category, HCPs in different journals differ strikingly in their capacity to accumulate citations. For example, the TC/HCP in System is as low as 31.73, which is less than 4% of the highest TC/HCP, that of Journal of Memory and Language.
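The journal-level indicators defined in the note to Table 2 are simple ratios, as the following Python sketch shows. The function name and the input figures are hypothetical; only the total of 143 HCPs comes from the study.

```python
def journal_indicators(n_hcps, total_papers, total_citations, all_hcps=143):
    """Compute the share, density, and average impact of a journal's HCPs."""
    return {
        # Share of the journal's HCPs among all examined HCPs.
        "N%": round(100 * n_hcps / all_hcps, 2),
        # Share of the journal's own output that became HCPs.
        "N/TP%": round(100 * n_hcps / total_papers, 2),
        # Average citations per HCP.
        "TC/HCP": round(total_citations / n_hcps, 2),
    }


# Hypothetical journal: 10 HCPs out of 500 papers, 1,500 citations in total.
indicators = journal_indicators(n_hcps=10, total_papers=500,
                                total_citations=1500)
```

Reporting N/TP% alongside the raw count N guards against favoring journals that simply publish more papers, while TC/HCP separates journals with many moderately cited HCPs from those with a few very highly cited ones.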

With regard to the latest journal impact factor (IF) in 2022, the four journals with the highest IF are Computational Linguistics (7.778), Modern Language Journal (7.5), Computer Assisted Language Learning (5.964), and Language Learning (5.24). According to the Journal Citation Reports (JCR) quartile rankings in the WoS category of linguistics, all the journals on the list belong to Q1 (the top 25%), indicating that researchers are more likely to submit to and cite papers from these prestigious high-impact journals.

4.2. Authors of HCPs

A total of 352 authors had their names listed in the 143 HCPs, of whom 33 appeared in at least 2 HCPs, as shown in Table 3. Table 3 also provides other indicators for evaluating the authors’ productivity and impact, including the total number of citations (TC), the number of citations per HCP, and the number of first-author or corresponding-author HCPs (FA/CA). We include the FA/CA indicator because first authors and corresponding authors are usually considered to contribute the most and should receive a greater proportion of credit in academic publications (Marui et al., 2004; Dance, 2012).

Table 3. Authors with at least 2 HCPs.

N: number of HCPs from each author; FA/CA: first author or corresponding author HCPs; TC: total citations of the HCPs from each author; C/HCP: average citations per HCP for each author.

In terms of the number of HCPs, Dewaele JM from Birkbeck Univ London tops the list with 7 HCPs and total citations of 492 (TC = 492), followed by Li C from Huazhong Univ Sci & Technol (#HCPs = 5; TC = 215) and Saito K from UCL (#HCPs = 5; TC = 576). It should be noted that both Li C and Saito K have close academic collaborations with Dewaele JM. For example, 3 of the 5 HCPs by Li C are co-authored with Dewaele JM. The topics of their co-authored HCPs mostly concern foreign language learning emotions, such as boredom, anxiety, enjoyment, the measurement, and positive psychology.

With regard to TC, Li W from UCL stands out as the most influential scholar among all the listed authors, with total citations of 956 from 2 HCPs, followed by Norton B from Univ British Columbia (TC = 915) and Vasishth S from Univ Potsdam (TC = 694). Their average citations per HCP are also the highest among the listed authors (478, 305, and 347, respectively). It is important to note that Li W’s 2 HCPs are his groundbreaking works on translanguaging, which have become near must-reads for anyone engaged in translanguaging research (Li, 2011, 2018). Moreover, Li W is the sole author of both HCPs, which is extremely rare, as HCPs usually result from collaboration among multiple researchers. Norton B’s HCPs explore core issues in applied linguistics such as identity and investment, language learning, and social change, and are considered foundational work in the field (Norton and Toohey, 2011; Darvin and Norton, 2015).

From the perspective of FA/CA papers, Li C from Huazhong Univ Sci & Technol is prominent because she is the first author of all 5 of her HCPs. Her research on language learning emotions in the Chinese context is gaining widespread recognition (Li et al., 2018, 2019, 2021; Li, 2019, 2021). However, as she is a newly emerging researcher, most of her HCPs were published very recently and have hence accumulated relatively few citations (TC = 215). Mondada L from Univ Basel follows closely as the sole author of her 3 HCPs. Her work is mostly devoted to conversation analysis, multimodality, and social interaction (Mondada, 2016, 2018, 2019).

Several points should be mentioned regarding the productive authors of HCPs. First, when we calculated the number of HCPs for each author, only the papers published in journals indexed in the investigated WoS categories (linguistics; language & linguistics) were taken into account, a compromise made to preserve the linguistics-oriented nature of the HCPs. For example, Brysbaert M from Ghent University had a total of 8 HCPs at the time of the data retrieval, of which 6 were published in the WoS category of psychology and were more psychologically oriented; these were therefore not included in our study. In addition, all the authors on the author list were treated equally when we calculated the number of HCPs, regardless of author order. This implies that some influential authors may not enter the list because their publications are comparatively fewer. Second, as some authors reported different affiliations at different career stages, we only provide their most recent affiliation for convenience. Third, it is highly competitive to have one’s work selected as an HCP. The fact that a majority of the HCP authors do not appear in our productive author list does not diminish their great contributions to the field. The rankings in Table 3 do not necessarily reflect the recognition authors have earned in academia at large.
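The author-level indicators in Table 3 (N, FA/CA, TC, and C/HCP) can be sketched as follows, assuming every listed author receives full credit for a paper regardless of author order, as described above. The record layout, the function name, and the three papers are hypothetical.

```python
from collections import defaultdict


def author_indicators(papers):
    """Compute N, FA/CA, TC, and C/HCP for every author in the paper list."""
    stats = defaultdict(lambda: {"N": 0, "FA/CA": 0, "TC": 0})
    for paper in papers:
        # The first author and the corresponding author count as FA/CA.
        lead = {paper["authors"][0], paper["corresponding"]}
        for author in set(paper["authors"]):
            stats[author]["N"] += 1                    # full counting
            stats[author]["TC"] += paper["citations"]
            if author in lead:
                stats[author]["FA/CA"] += 1
    for s in stats.values():
        s["C/HCP"] = round(s["TC"] / s["N"], 2)
    return dict(stats)


# Hypothetical records for three HCPs.
papers = [
    {"authors": ["Author A", "Author B"], "corresponding": "Author A",
     "citations": 100},
    {"authors": ["Author B", "Author A"], "corresponding": "Author A",
     "citations": 60},
    {"authors": ["Author C"], "corresponding": "Author C", "citations": 40},
]

stats = author_indicators(papers)
```

Under full counting, Author A and Author B end up with identical N and TC even though their roles differ, which is why the FA/CA column is reported separately.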

4.3. Productive countries of HCPs

In total, the 143 HCPs originated from 33 countries. The most productive countries, those that contributed at least three HCPs, are listed in Table 4. The USA took an overwhelming lead with 59 HCPs, followed distantly by England with 31 HCPs. They also had the highest total citations (TC = 15,770; TC = 9,840), demonstrating their high productivity and strong influence as traditional powerhouses in linguistics research. With regard to the average citations per HCP, Germany, England, and the USA were the top three countries (TC/HCP = 281.67, 281.14, and 267.29, respectively). Although China held the third position with 19 HCPs published, its TC/HCP is the third from the bottom (TC/HCP = 66.84). One important reason is that 13 of the 19 HCPs contributed by scholars in China were published in 2020 or 2021, and newly published HCPs may need more time to accumulate citations. In addition, for 18 of the 19 HCPs from China, the first and/or corresponding author is affiliated with China, indicating that scholars in China are becoming more independent and gaining more voice in English linguistics research.

Table 4. Top 18 countries with at least 3 HCPs.

Two points should be noted regarding the productive countries. First, we calculated the HCP contributions at the country level instead of the region level. In other words, HCP contributions from different regions of the same country were combined in the calculation. For example, HCPs from Scotland were added to the HCPs from England, and HCPs from Hong Kong, Macau, and Taiwan were put together with the HCPs from Mainland China. In this way, a clear picture of the HCPs at the country level can be painted. Second, we manually checked the address information of the first author and corresponding author for each HCP. In some cases, the first author or the corresponding author reported affiliations in more than one country. In such cases, every country in their address list was treated equally in the FA/CA calculation. In other words, an HCP may be assigned to more than one country because of the different country backgrounds of the first and/or corresponding author.
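The two counting rules can be sketched in Python as follows. The region-to-country mapping simply restates the merging examples given above; the function name and the four address lists are hypothetical.

```python
from collections import Counter

# Merging rules as stated in the text; unlisted names are kept unchanged.
REGION_TO_COUNTRY = {
    "Scotland": "England",
    "Hong Kong": "China",
    "Macau": "China",
    "Taiwan": "China",
    "Mainland China": "China",
}


def count_by_country(papers):
    """Credit each paper once to every country in its FA/CA address list."""
    counts = Counter()
    for addresses in papers:
        # Map regions to countries, then deduplicate within the paper.
        countries = {REGION_TO_COUNTRY.get(a, a) for a in addresses}
        counts.update(countries)
    return counts


# Hypothetical FA/CA address lists for four HCPs.
papers = [
    ["USA"],
    ["Scotland", "England"],   # two regions merged into one country
    ["Hong Kong", "USA"],      # credited to two countries
    ["Mainland China"],
]

counts = count_by_country(papers)
```

Because a paper with affiliations in two countries is credited to both, the country totals can add up to more than the number of papers (five credits for four papers in this toy example).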

4.4. Top 20 HCPs

The top 20 HCPs with the highest normed citations are listed in decreasing order in Table 5 . The top cited publications can guide us to better understand the development and research topics in recent years.

Table 5. Top 20 HCPs.

To save space, full information about the HCPs is not given. Article titles that are too lengthy have been abbreviated; for the authors, we report the first two authors and use “et al.” if there are three or more authors; RC: raw citations; NC: normalized citations.

By reading the titles and abstracts of these top HCPs, we categorized the topics of the 20 HCPs into the following five groups: (i) statistical and analytical methods in (psycho)linguistics, such as sentiment analysis, sentence simplification techniques, effect sizes, and linear mixed models (#1, 3, 4, 6, 9, 14); (ii) language learning/teaching emotions, such as enjoyment, anxiety, boredom, and stress (#11, 15, 16, 18, 19); (iii) translanguaging or multilingualism (#5, 13, 17, 20); (iv) language perception (#2, 7, 10); and (v) medium of instruction (#8, 12). It is no surprise that 6 of the top 20 HCPs are about statistical methods in linguistics, because language researchers aspire to employ statistics to make their research more scientific. We also noticed that the papers on language teaching/learning emotions on the list were all published in 2020 and 2021, indicating that these emerging topics may deserve more attention in future research. Two Covid-19-related articles (#16, 19) explored the emotions that teachers and students experienced during the pandemic, a timely response to the urgent needs of the language learning and teaching community.

It is of special interest that papers from journals indexed in multiple JCR categories seem to accumulate more citations. For example, Journal of Memory and Language, American Journal of Speech-Language Pathology, and Computational Linguistics are indexed in both SSCI and SCIE and contribute the top 4 HCPs, demonstrating the advantage these hybrid journals have over conventional language journals in amassing citations. In addition, in contrast to the finding of Yan et al. (2022) that most of the top HCPs in the field of radiology are reviews, 19 of the top 20 HCPs here are research articles; the only review is Macaro et al. (2018).

4.5. Most frequently explored topics of HCPs

After obtaining the corpus-based topic items, we read all the titles and abstracts of the 143 HCPs to further validate their roles as research topics. Table 6 presents the top research topics with an observed frequency of 5 or above. We grouped these topics into five broad categories: bilingual-related, language learning/teaching-related, psycho/pathological/cognitive linguistics-related, methods and tools-related, and others. The observed frequency of each topic in the Abstract corpus is given in brackets. We found that about 34 of the 143 HCPs explore bilingual-related issues, the largest share among all the topic categories, which testifies to the academic popularity of these issues in the examined timespan. In addition, 30 of the 143 HCPs investigate language learning/teaching-related issues, with topics ranging from learners (e.g., EFL learners, individual difference) to multiple learning variables (e.g., learning strategy, motivation, agency). The findings here will be validated by the analysis of the keywords.

Table 6. Categorization of the most explored research topics.

N: the number of the HCPs in each topic category; ELF: English as a lingua franca; CLIL: content and language integrated learning; FLE: foreign language enjoyment; FLCA: foreign language classroom anxiety

Several points should be mentioned regarding topic candidacy. First, for similar topic expressions, we used a cover term and summed the frequency counts. For example, multilingualism is a cover term for bilinguals, bilingualism, plurilingualism, and multilingualism. Second, for the singular and plural forms of a noun (e.g., emotion and emotions) or for items with different forms (e.g., meta analysis and meta analyses), we combined the frequency counts. Third, we found that some longer items (3-grams and 4-grams) could be subsumed under shorter ones (2-grams or monograms) without loss of essential meaning (e.g., working memory capacity under working memory). In such cases, the shorter items were kept because of their higher frequency. Fourth, some highly frequent terms were discarded because they were too general to be valuable topics in language research, for example, applied linguistics, language use, and second language.
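The four consolidation rules above can be sketched in Python. The term lists restate the examples given in the text; the frequency counts and the function name are hypothetical, and the choice of "emotions" as the merged form is ours.

```python
from collections import Counter

# Rules 1 and 2: variants mapped to a cover term, counts are summed.
COVER_TERMS = {
    "bilinguals": "multilingualism",
    "bilingualism": "multilingualism",
    "plurilingualism": "multilingualism",
    "emotion": "emotions",
    "meta analyses": "meta analysis",
}
# Rule 3: longer n-grams dropped in favor of the shorter item. Their
# occurrences are already part of the shorter item's own count.
SUBSUMED = {"working memory capacity"}
# Rule 4: terms too general to serve as topics.
TOO_GENERAL = {"applied linguistics", "language use", "second language"}


def merge_topics(raw_counts):
    """Apply the cover-term, subsumption, and exclusion rules to raw counts."""
    merged = Counter()
    for term, count in raw_counts.items():
        if term in SUBSUMED or term in TOO_GENERAL:
            continue
        merged[COVER_TERMS.get(term, term)] += count
    return merged


# Hypothetical raw frequencies from the candidate list.
raw_counts = {
    "bilingualism": 12, "multilingualism": 9, "bilinguals": 7,
    "plurilingualism": 3, "emotion": 4, "emotions": 6,
    "meta analysis": 3, "meta analyses": 2,
    "working memory": 8, "working memory capacity": 3,
    "applied linguistics": 15,
}

merged = merge_topics(raw_counts)
```

Note that the subsumed 3-gram is dropped rather than added to the 2-gram, since every occurrence of "working memory capacity" already contains an occurrence of "working memory" and adding it would count those occurrences twice.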

5. Discussion and implications

Based on 143 highly cited papers collected from the WoS categories of linguistics, the present study attempts to present a bird’s eye view of the publication landscape and the most up-to-date research themes reflected in the HCPs in the linguistics field. Specifically, we investigated the important contributors of HCPs in terms of journals, authors, and countries. We also spotlighted the research topics through corpus-based analysis of the abstracts and a detailed analysis of the top HCPs. The study has produced several findings that bear important implications.

The first finding is that the HCPs are highly concentrated in a limited number of journals and countries. With regard to journals, those in the spheres of bilingualism and applied linguistics (e.g., language teaching and learning) are likely to accumulate more citations and hence to produce more HCPs. Journals that focus on bilingualism from a linguistic, psycholinguistic, and neuroscientific perspective are the most frequent outlets of HCPs, as evidenced by the top two productive journals of HCPs, Bilingualism Language and Cognition and International Journal of Bilingual Education and Bilingualism. This can be explained by the multidisciplinary nature of bilingual-related research and the development of cognitive measurement techniques. The merits of analyzing the publication venues of HCPs are twofold. On the one hand, the analysis shows readers where to look for high-quality publications in this field, as most of the significant and cutting-edge achievements are concentrated in these prestigious journals. On the other hand, it provides essential guidance for authors on where to submit their work for higher visibility.

In terms of country distribution, the traditional powerhouses in linguistics research such as the USA and England are undoubtedly leading the HCP publications in both the number and the citations of the HCPs. However, developing countries such as China and Iran are also becoming increasingly prominent, which may be traced to the funding and support provided by national language policies and development policies, as reported in recent studies (Ping et al., 2009; Lei and Liu, 2019). Take China as an example. Along with its economic development, China has given more impetus to academic output through increased investment in scientific research (Lei and Liao, 2017). Researchers in China are therefore highly motivated to publish papers in high-quality journals to win recognition in international academia and to cope with the publish-or-perish pressure (Lee, 2014). These factors may explain the rise of China as a newly emerging research powerhouse in both the natural and social sciences, including English linguistics research.

The second finding is the multilingual trend in linguistics research. The dominant clustering of topics around multilingualism can be understood as a timely response to the multilingual research fever (May, 2014). Of the 143 HCPs, 34 have words such as bilingualism, bilingual, multilingualism, and translanguaging in their titles, reflecting a strong multilingual tendency in the HCPs. Multilingual-related HCPs mainly involve three aspects: multilingualism from the perspectives of psycholinguistics and cognition (e.g., Luk et al., 2011; Leivada et al., 2020); multilingual teaching (e.g., Schissel et al., 2018; Ortega, 2019; Archila et al., 2021); and language policies related to multilingualism (e.g., Shen and Gao, 2018). Translanguaging, a pedagogical process initially used to describe bilingual classroom practice and a frequently explored topic in the HCPs, has developed into an applied linguistics theory since Li’s Translanguaging as a Practical Theory of Language (Li, 2018). The most common collocates of translanguaging in the Abstract corpus are pedagogy/pedagogies, practices, and space/spaces. There are two main reasons for this multilingual turn. First, the rapid development of globalization, immigration, and overseas study programs has greatly stimulated the use of and research on multiple languages in different linguistic contexts. Second, in many non-English-speaking countries, courses are delivered through languages (mostly English) other than the mother tongue (Clark, 2017). Students are required to use multiple languages as resources to learn and understand subjects and ideas. The burgeoning body of English Medium Instruction literature in higher education is in line with the rising interest in multilingualism. Given its innately multidisciplinary nature, multilingualism, the topic du jour, can be expected to attract more attention in the future.

The third finding is the application of Positive Psychology (PP) in second language acquisition (SLA), that is, the positive trend in linguistic research. In our analysis, 20 of the 143 HCPs have words or phrases such as emotions, enjoyment, boredom, anxiety, and positive psychology in their titles, which might signal a shift of interest toward the psychology of language learners and teachers in different linguistic environments. Our study shows that foreign language enjoyment (FLE) is the most frequently explored emotion, followed by foreign language classroom anxiety (FLCA); together they are the learners’ metaphorical left and right feet on their journey to acquiring the foreign language (Dewaele and MacIntyre, 2016). In fact, the topics of PP are not entirely new to SLA. For example, studies of language motivation, affect, and good language learners all provide roots for the emergence of PP in SLA (Naiman, 1978; Gardner, 2010). In recent years, both research on and teaching applications of PP in SLA have been growing rapidly, with a diversity of topics already being explored, such as positive education and PP interventions. It should be noted that SLA also feeds back into PP theories and concepts besides drawing inspiration from them, which makes it “an area rich for interdisciplinary cross-fertilization of ideas” (MacIntyre et al., 2019).

It should be noted that subjectivity was involved when we selected and categorized the candidate topic items based on the Abstract corpus. However, the frequency and range criteria ensure that these items are explored in multiple HCPs, which indicates their value as topics for further investigation. Some high-frequency n-grams were discarded because they were too general or were not meaningful topics. For example, applied linguistics is too broad to be included, as most of the HCPs concern issues in this research line rather than theoretical linguistics. By meaningful topics, we mean topics that can help journal editors and readers quickly locate their fields of interest (Lei and Liu, 2019), as author keywords such as bilingualism, emotions, and individual differences do. The examination of the few 3/4-grams and monograms (mostly nouns) revealed that most of them were either not meaningful topics or could be subsumed under the 2-grams. In addition, there is inevitably some overlap in the topic categorization. For example, some topics in the language teaching and learning category are situated and discussed within the context of multilingualism. The merits of the topic categorization are twofold: it makes it easier to monitor the overlap between the Abstract corpus-based topic items and the keywords, and it roughly delineates the research strands in the HCPs for future research.
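The frequency and range criteria mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' actual pipeline: frequency is taken to be the total number of occurrences of an n-gram across all abstracts, and range the number of distinct abstracts containing it. The function name `candidate_ngrams`, the threshold defaults, and the sample abstracts are hypothetical; the study's own cut-off values are not restated here.

```python
from collections import Counter
import re


def candidate_ngrams(abstracts, n=2, min_freq=3, min_range=2):
    """Return n-grams that meet both a frequency and a range threshold.

    frequency = total occurrences across all abstracts
    range     = number of distinct abstracts containing the n-gram
    The default thresholds are illustrative assumptions.
    """
    freq = Counter()
    doc_range = Counter()
    for text in abstracts:
        # Lowercase; keep alphabetic tokens, allowing internal hyphens.
        tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
        grams = [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        freq.update(grams)            # every occurrence counts toward frequency
        doc_range.update(set(grams))  # each abstract counts once toward range
    return {
        g: (freq[g], doc_range[g])
        for g in freq
        if freq[g] >= min_freq and doc_range[g] >= min_range
    }


# Hypothetical sample abstracts, for illustration only.
sample_abstracts = [
    "Foreign language enjoyment and foreign language anxiety",
    "Foreign language enjoyment matters",
    "Language policy",
]
print(candidate_ngrams(sample_abstracts, n=2, min_freq=3, min_range=2))
```

The surviving n-grams would still need the manual screening described above, since a purely frequency-based filter cannot tell a meaningful topic from an overly general phrase.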

It should also be noted that all the results are based on the retrieved HCPs only. The study did not aim to paint a comprehensive picture of the whole landscape of linguistic research. Rather, it focused specifically on the most highly cited literature in a specified timeframe, thus generating snapshots of trends in linguistic research. One important merit of this methodology is that newly emerging but highly cited researchers can be spotlighted and gain more academic attention, because only the metrics of HCPs are considered in the calculation. Conversely, the absence of some researchers who are highly cited in general, such as Rod Ellis and Ken Hyland, indicates only that their highly cited publications fall outside our investigated timeframe; it cannot be interpreted as diminishing academic influence in the field. In addition, the study does not take collaboration into account when calculating the number of HCPs, for two reasons. First, although some researchers are regular collaborators, such as Li CC and Dewaele JM, their individual contributions should not be discounted. Second, the study also reports the number of FA/CA HCPs for each listed author, which may help readers locate research of interest.

We acknowledge that our study has some limitations that should be addressed in future research. First, our study focuses on the HCPs extracted from WoS SSCI and A&HCI journals, arguably the most celebrated papers in this field. Future studies may consider including data from other databases such as Scopus to verify the findings of the present study. Second, our Abstract corpus-based method for topic extraction involved human judgement. Although the final list was the result of several rounds of discussion among the authors, it is difficult or even impossible to avoid subjectivity, and some worthy topics may have been unintentionally missed. Therefore, future research may consider employing automatic algorithms to extract topics. For example, a dependency-based machine learning approach can be used to identify research topics (Zhu and Lei, 2021).

Data availability statement

Author contributions

SY: conceptualization and methodology. SY and LZ: writing-review and editing and writing-original draft. All authors contributed to the article and approved the submitted version.

Funding

This work was supported by the Humanities and Social Sciences Youth Fund of China MOE under grants 20YJC740076 and 18YJC740141.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2022.1052586/full#supplementary-material

References

  • Aksnes D. W. (2003). Characteristics of highly cited papers. Res. Eval. 12, 159–170. doi: 10.3152/147154403781776645
  • Anthony L. (2022). AntConc (version 4.0.5). Tokyo, Japan: Waseda University. Available at: https://www.laurenceanthony.net/software (Accessed June 20, 2022).
  • Archila P. A., Molina J., Truscott de Mejía A.-M. (2021). Fostering bilingual scientific writing through a systematic and purposeful code-switching pedagogical strategy. Int. J. Biling. Educ. Biling. 24, 785–803. doi: 10.1080/13670050.2018.1516189
  • Blessinger K., Hrycaj P. (2010). Highly cited articles in library and information science: an analysis of content and authorship trends. Libr. Inf. Sci. Res. 32, 156–162. doi: 10.1016/j.lisr.2009.12.007
  • Chen H., Ho Y. S. (2015). Highly cited articles in biomass research: a bibliometric analysis. Renew. Sust. Energ. Rev. 49, 12–20. doi: 10.1016/j.rser.2015.04.060
  • Clark S. (2017). Translanguaging in higher education: beyond monolingual ideologies. Int. J. Biling. Educ. Biling. 22, 1048–1051. doi: 10.1080/13670050.2017.1322568
  • Dance A. (2012). Authorship: Who’s on first? Nature 489, 591–593. doi: 10.1038/nj7417-591a
  • Danell R. (2011). Can the quality of scientific work be predicted using information on the author’s track record? J. Am. Soc. Inf. Sci. Technol. 62, 50–60. doi: 10.1002/asi.21454
  • Darvin R., Norton B. (2015). Identity and a model of investment in applied linguistics. Annu. Rev. Appl. Linguist. 35, 36–56. doi: 10.1017/S0267190514000191
  • Dewaele J.-M., MacIntyre P. D. (2016). “Foreign language enjoyment and foreign language classroom anxiety: the right and left feet of the language learner” in Positive psychology in SLA. eds. MacIntyre P. D., Gregersen T., Mercer S. (Bristol, Blue Ridge Summit: Multilingual Matters), 215–236.
  • Gardner R. (2010). Motivation and second language acquisition: The socio-educational model. New York: Peter Lang.
  • Gong Y., Lyu B., Gao X. (2018). Research on teaching Chinese as a second or foreign language in and outside mainland China: a bibliometric analysis. Asia Pac. Educ. Res. 27, 277–289. doi: 10.1007/s40299-018-0385-2
  • Hsu Y., Ho Y. S. (2014). Highly cited articles in health care sciences and services field in Science Citation Index Expanded. Methods Inf. Med. 53, 446–458. doi: 10.3414/ME14-01-0022
  • Lee I. (2014). Publish or perish: the myth and reality of academic publishing. Lang. Teach. 47, 250–261. doi: 10.1017/S0261444811000504
  • Lei L., Liao S. (2017). Publications in linguistics journals from mainland China, Hong Kong, Taiwan, and Macau (2003–2012): a bibliometric analysis. J. Quant. Ling. 24, 54–64. doi: 10.1080/09296174.2016.1260274
  • Lei L., Liu D. (2018). The research trends and contributions of System’s publications over the past four decades (1973–2017): a bibliometric analysis. System 80, 1–13. doi: 10.1016/j.system.2018.10.003
  • Lei L., Liu D. (2019). Research trends in applied linguistics from 2005 to 2016: a bibliometric analysis and its implications. Appl. Linguis. 40, 540–561. doi: 10.1093/applin/amy003
  • Leivada E., Westergaard M., Duñabeitia J. A., Rothman J. (2020). On the phantom-like appearance of bilingualism effects on neurocognition: (how) should we proceed? Biling. Lang. Cogn. 24, 197–210. doi: 10.1017/S1366728920000358
  • Li W. (2011). Moment analysis and translanguaging space: discursive construction of identities by multilingual Chinese youth in Britain. J. Pragmat. 43, 1222–1235. doi: 10.1016/j.pragma.2010.07.035
  • Li W. (2018). Translanguaging as a practical theory of language. Appl. Linguis. 39, 9–30. doi: 10.1093/applin/amx039
  • Li C. (2019). A positive psychology perspective on Chinese EFL students’ trait emotional intelligence, foreign language enjoyment and EFL learning achievement. J. Multiling. Multicult. Dev. 41, 246–263. doi: 10.1080/01434632.2019.1614187
  • Li C. (2021). A control-value theory approach to boredom in English classes among university students in China. Mod. Lang. J. 105, 317–334. doi: 10.1111/modl.12693
  • Li C., Dewaele J. M., Hu Y. (2021). Foreign language learning boredom: conceptualization and measurement. Appl. Ling. Rev. doi: 10.1515/applirev-2020-0124
  • Li C., Dewaele J. M., Jiang G. (2019). The complex relationship between classroom emotions and EFL achievement in China. Appl. Ling. Rev. 11, 485–510. doi: 10.1515/applirev-2018-0043
  • Li C., Jiang G., Dewaele J.-M. (2018). Understanding Chinese high school students’ foreign language enjoyment: validation of the Chinese version of the foreign language enjoyment scale. System 76, 183–196. doi: 10.1016/j.system.2018.06.004
  • Liao S., Lei L. (2017). What we talk about when we talk about corpus: a bibliometric analysis of corpus-related research in linguistics (2000–2015). Glottometrics 38, 1–20.
  • Liao H., Tang M., Li Z., Lev B. (2018). Bibliometric analysis for highly cited papers in operations research and management science from 2008 to 2017 based on essential science indicators. Omega 88, 223–236. doi: 10.1016/j.omega.2018.11.005
  • Luk G., Sa E. D., Bialystok E. (2011). Is there a relation between onset age of bilingualism and enhancement of cognitive control? Biling. Lang. Cogn. 14, 588–595. doi: 10.1017/S1366728911000010
  • Macaro E., Curle S., Pun J., Dearden J. (2018). A systematic review of English medium instruction in higher education. Lang. Teach. 51, 36–76. doi: 10.1017/S0261444817000350
  • Macintyre P., Gregersen T., Mercer S. (2019). Setting an agenda for positive psychology in SLA: theory, practice, and research. Mod. Lang. J. 103, 262–274. doi: 10.1111/modl.12544
  • Mancebo F. P., Sapena A. F., Herrera M. V., González L., Toca H., Benavent R. A. (2013). Scientific literature analysis of judo in Web of Science. Arch. Budo 9, 81–91. doi: 10.12659/AOB.883883
  • Marušić M., Božikov J., Katavić V., Hren D., Kljaković-Gašpić M., Marušić A. (2004). Authorship in a small medical journal: a study of contributorship statements by corresponding authors. Sci. Eng. Ethics 10, 493–502. doi: 10.1007/s11948-004-0007-7
  • May S. (2014). The multilingual turn: Implications for SLA, TESOL and bilingual education. New York: Routledge.
  • Mondada L. (2016). Challenges of multimodality: language and the body in social interaction. J. Socioling. 20, 336–366. doi: 10.1111/josl.1_12177
  • Mondada L. (2018). Multiple temporalities of language and body in interaction: challenges for transcribing multimodality. Res. Lang. Soc. Interact. 51, 85–106. doi: 10.1080/08351813.2018.1413878
  • Mondada L. (2019). Contemporary issues in conversation analysis: embodiment and materiality, multimodality and multisensoriality in social interaction. J. Pragmat. 145, 47–62. doi: 10.1016/j.pragma.2019.01.016
  • Naiman N. (1978). The good language learner. Clevedon, UK: Multilingual Matters.
  • Newman M. (2008). The first-mover advantage in scientific publication. EPL 86, 68001–68006. doi: 10.1209/0295-5075/86/68001
  • Newman M. (2014). Prediction of highly cited papers. EPL 105:28002. doi: 10.1209/0295-5075/105/28002
  • Norton B., Toohey K. (2011). Identity, language learning, and social change. Lang. Teach. 44, 412–446. doi: 10.1017/S0261444811000309
  • Ortega L. (2019). SLA and the study of equitable multilingualism. Mod. Lang. J. 103, 23–38. doi: 10.1111/modl.12525
  • Ping Z., Thijs B., Glänzel W. (2009). Is China also becoming a giant in social sciences? Scientometrics 79, 593–621. doi: 10.1007/s11192-007-2068-x
  • Pritchard A. (1969). Statistical bibliography or bibliometrics. J. Doc. 25, 348–349.
  • Ríos L. J. C., Tamao I. M., Olmos J. (2013). Bibliometric study (1922–2009) on rugby articles in research journals. South Afr. J. Res. Sport Phys. Educ. Rec. 17, 313–109. doi: 10.3176/tr.2013.3.06
  • Ruggeri G., Orsi L., Corsi S. (2019). A bibliometric analysis of the scientific literature on Fairtrade labelling. Int. J. Consum. Stud. 43, 134–152. doi: 10.1111/ijcs.12492
  • Sabiote C. R., Rodríguez J. A. (2015). Bibliometric study and methodological quality indicators of the journal Porta Linguarum during six year period 2008–2013. Porta Ling. 24, 135–150. doi: 10.30827/Digibug.53866
  • Schissel J. L., De Korne H., López-Gopar M. E. (2018). Grappling with translanguaging for teaching and assessment in culturally and linguistically diverse contexts: teacher perspectives from Oaxaca, Mexico. Int. J. Biling. Educ. Biling. 24, 340–356. doi: 10.1080/13670050.2018.1463965
  • Shen Q., Gao X. (2018). Multilingualism and policy making in greater China: ideological and implementational spaces. Lang. Policy 18, 1–16. doi: 10.1007/s10993-018-9473-7
  • Small H. (2004). Why authors think their papers are highly cited. Scientometrics 60, 305–316. doi: 10.1023/B:SCIE.0000034376.55800.18
  • Smith D. R. (2007). The New Zealand timber economy, 1840–1935. N. Z. Med. J. 120, U2871–U2313. doi: 10.1016/0305-7488(90)90044-C
  • van Doorslaer L., Gambier Y. (2015). Measuring relationships in translation studies. On affiliations and keyword frequencies in the translation studies bibliography. Perspectives 23, 305–319. doi: 10.1080/0907676X.2015.1026360
  • van Oorschot J. A. W. H., Hofman E., Halman J. (2018). A bibliometric review of the innovation adoption literature. Technol. Forecast. Soc. Chang. 134, 1–21. doi: 10.1016/j.techfore.2018.04.032
  • Xie Z., Willett P. (2013). The development of computer science research in the People’s Republic of China 2000–2009: a bibliometric study. Inf. Dev. 29, 251–264. doi: 10.1177/0266666912458515
  • Yan S., Zhang H., Wang J. (2022). Trends and hot topics in radiology, nuclear medicine and medical imaging from 2011–2021: a bibliometric analysis of highly cited papers. Jpn. J. Radiol. 40, 847–856. doi: 10.1007/s11604-022-01268-z
  • Zhu H., Lei L. (2021). A dependency-based machine learning approach to the identification of research topics: a case in COVID-19 studies. Lib. Hi Tech 40, 495–515. doi: 10.1108/LHT-01-2021-0051
  • Zhu H., Lei L. (2022). The research trends of text classification studies (2000–2020): a bibliometric analysis. SAGE Open 12, 215824402210899–215824402210816. doi: 10.1177/21582440221089963
