211 Research Topics in Linguistics To Get Top Grades

Many people find it hard to choose a linguistics research topic because of the subject’s perceived complexity. They also struggle to pick easy research paper topics on the English language, fearing that such topics are too simple for university or college-level work.

Linguistics and English span syntax, phonetics, morphology, phonology, semantics, grammar, vocabulary, and several other areas. To write a top-notch essay or conduct a research study, consider the lists of research topics below for your university or college work. You can fine-tune any of them to suit your interests.

Linguistics Research Paper Topics

If you want to study how language is used and why it matters in the world, consider these Linguistics topics for your research paper:

  • An analysis of romantic ideas and their expression amongst French people
  • An overview of the hate language in the course against religion
  • Identify the determinants of hate language and the means of propagation
  • Evaluate a literary work and examine how Linguistics is applied to the understanding of minor languages
  • Consider the impact of social media on the development of slang
  • An overview of political slang and its use amongst New York teenagers
  • Examine the relevance of Linguistics in a digitalized world
  • Analyze foul language and how it’s used to oppress minors
  • Identify the role of language in the national identity of a socially dynamic society
  • Explain how a language barrier could affect the social life of an individual in a new society
  • Discuss the means through which language can enrich cultural identities
  • Examine the concept of bilingualism and how it applies in the real world
  • Analyze the possible strategies for teaching a foreign language
  • Discuss the priority of teachers in the teaching of grammar to non-native speakers
  • Choose a school of your choice and observe the slang used by its students: analyze how it affects their social lives
  • Attempt a critical overview of racist languages
  • What does “endangered language” mean, and how does it apply in the real world?
  • A critical overview of your second language and why it is a second language
  • What are the motivators of speech and why are they relevant?
  • Analyze the differences between the types of communication and their significance to specially-abled persons
  • Give a critical overview of five works of literature on sign language
  • Evaluate the distinction between the means of language comprehension between an adult and a teenager
  • Consider a native American group and evaluate how cultural diversity has influenced their language
  • Analyze the complexities involved in code-switching and code-mixing
  • Give a critical overview of the importance of language to a teenager
  • Attempt a forensic overview of language accessibility and what it means
  • What do you believe are the means of communication, and what makes each unique?
  • Attempt a study of Islamic poetry and its role in language development
  • Attempt a study on the role of literature in language development
  • Evaluate the influence of metaphors and other literary devices on the depth of each sentence
  • Identify the role of literary devices in the development of proverbs in any African country
  • Cognitive Linguistics: analyze two pieces of literature that offer a critical view of perception
  • Identify and analyze the complexities in unspoken words
  • Expression is another kind of language: discuss
  • Identify the significance of symbols in the evolution of language
  • Discuss how learning more than a single language promotes cross-cultural development
  • Analyze how the loss of a mother tongue affects the language efficiency of a community
  • Critically examine how sign language works
  • Using literature from the medieval era, attempt a study of the evolution of language
  • Identify how wars have led to the reduction in the popularity of a language of your choice across any country of the world
  • Critically examine five works of literature on why accents change based on environment
  • What are the forces that compel the comprehension of language in a child?
  • Identify and explain the difference between the listening and speaking skills and their significance in the understanding of language
  • Give a critical overview of how natural language is processed
  • Examine the influence of language on culture and vice versa
  • It is possible to understand a language even without living in that society: discuss
  • Identify the arguments regarding speech defects
  • Discuss how familiarity with a language informs the creation of slang
  • Explain the significance of religious phrases and sacred languages
  • Explore the roots and evolution of incantations in Africa

Sociolinguistic Research Topics

You may also need interesting Linguistics topics with a sociolinguistic focus. Sociolinguistics studies how language is used in society, often by recording and analyzing natural, informal speech. You can consider the following sociolinguistic research topics for your research:

  • What makes language exceptional to a particular person?
  • How does language form a unique means of expression to writers?
  • Examine the kind of speech used in health and emergencies
  • Analyze the language theory explored by family members during dinner
  • Evaluate the possible variation of language based on class
  • Evaluate the language of racism, social tension, and sexism
  • Discuss how language promotes social and cultural familiarities
  • Give an overview of identity and language
  • Examine why some language speakers enjoy listening to foreigners who speak their native language
  • Give a forensic analysis of how the language of entertainment differs from the language used in professional settings
  • Give an understanding of how language changes
  • Examine the Sociolinguistics of the Caribbean
  • Consider an overview of metaphor in France
  • Explain why the direct translation of written words is incomprehensible in Linguistics
  • Discuss the use of language in marginalizing a community
  • Analyze the history of Arabic and the culture that enhanced it
  • Discuss the growth of French and the influences of other languages
  • Examine how the English language developed and its interdependence on other languages
  • Give an overview of cultural diversity and Linguistics in teaching
  • Challenge the association of speech defects with impaired listening and speaking abilities
  • Explore the uniqueness of language between siblings
  • Explore the means of making requests between a teenager and his parents
  • Observe and comment on how students relate with their teachers through language
  • Observe and comment on the communication strategies of parents and teachers
  • Examine the connection between first-language proficiency and academic excellence

Language Research Topics

Numerous languages exist in different societies. This is why you may seek to understand the motivations behind language through these Linguistics project ideas. You can consider the following interesting Linguistics topics and their application to language:

  • What does language shift mean?
  • Discuss the stages of English language development
  • Examine the position of ambiguity in a Romance language of your choice
  • Why are some languages called Romance languages?
  • Observe the strategies of persuasion through language
  • Discuss the connection between symbols and words
  • Identify the language of political speeches
  • Discuss the effectiveness of language in an indigenous cultural revolution
  • Trace the motivators for spoken language
  • What does language acquisition mean to you?
  • Examine three pieces of literature on language translation and its role in multilingual accessibility
  • Identify the science involved in language reception
  • Interrogate the context of language disorders
  • Examine how psychotherapy applies to victims of language disorders
  • Study the growth of Hindi despite colonialism
  • Critically appraise the term, language erasure
  • Examine how colonialism and war are responsible for the loss of language
  • Give an overview of the difference between sounds and letters and how they apply to the German language
  • Explain why the placement of verb and preposition is different in German and English languages
  • Choose two languages of your choice and examine their historical relationship
  • Discuss the strategies employed by people while learning new languages
  • Discuss the role of all the figures of speech in the advancement of language
  • Analyze the complexities of autism and its victims
  • Offer a linguistic approach to the differences in language use between a child with Down syndrome and an autistic child
  • Express dance as a language
  • Express music as a language
  • Express language as a form of language
  • Evaluate the role of cultural diversity in the decline of languages in South Africa
  • Discuss the development of the Greek language
  • Critically review two literary texts, one from the medieval era and another published a decade ago, and examine the language shifts

Linguistics Essay Topics

You may also need Linguistics research topics for your Linguistics essays. As a linguist in the making, these can help you consider controversies in Linguistics as a discipline and address them through your study. You can consider:

  • The role of sociolinguistics in understanding interest in multilingualism
  • Write on your belief of how language encourages sexism
  • What do you understand about the differences between British and American English?
  • Discuss how slang started and how it grew
  • Consider how age leads to loss of language
  • Review how language is used in formal and informal conversation
  • Discuss what you understand by polite language
  • Discuss what you understand by hate language
  • Evaluate how language has remained flexible throughout history
  • Mimicking a teacher is a form of exercising hate language: discuss
  • Body language and verbal speech are different things: discuss
  • Language can be exploitative: discuss
  • Do you think language is responsible for inciting aggression against the state?
  • Can you justify the structural representation of any symbol of your choice?
  • Religious symbols are not ordinary language: what is your perspective on day-to-day languages and sacred ones?
  • Consider the usage of language by an Englishman and someone of another culture
  • Discuss the essence of code-mixing and code-switching
  • Attempt a psychological assessment on the role of language in academic development
  • How does language pose a challenge to studying?
  • Choose a multicultural society of your choice and explain the problem they face
  • What forms does language use in expression?
  • Identify the reasons behind unspoken words and actions
  • Why do universal languages exist as a means of easy communication?
  • Examine the role of the English language in the world
  • Examine the role of Arabic in the world
  • Examine the role of Romance languages in the world
  • Evaluate the significance of each teaching resource in a language classroom
  • Consider an assessment of language analysis
  • Why do people comprehend beyond what is written or expressed?
  • What is the impact of hate speech on a woman?
  • Do you believe that grammatical errors determine how a person’s comprehension of language is judged?
  • Observe the influence of technology on language learning and development
  • Which parts of the body are responsible for understanding new languages?
  • How has language informed development?
  • Would you say language has improved human relations or worsened them, considering its use as a tool for violence?
  • Would you say language use in a predominantly Black state differs from language use in predominantly white states?
  • Give an overview of the English language in Nigeria
  • Give an overview of the English language in Uganda
  • Give an overview of the English language in India
  • Give an overview of Russian in Europe
  • Give a conceptual analysis of stress and how it works
  • Consider the means of vocabulary development and its role in cultural relationships
  • Examine the effects of Linguistics in language
  • Present your understanding of sign language
  • What do you understand about descriptive language and prescriptive language?

List of Research Topics in English Language

You may need English research topics for your next research project. These topics are crafted for you as a student of language in any institution. You can consider the following for in-depth analysis:

  • Examine the travail of women in any feminist text of your choice
  • Examine the movement of feminist literature in the Industrial period
  • Give an overview of five works of Gothic literature and what you understand from them
  • Examine rock music and how it emerged as a genre
  • Evaluate the cultural association with Nina Simone’s music
  • What is the relevance of Shakespeare in English literature?
  • How has literature promoted the English language?
  • Identify the effect of spelling errors on the academic performance of students in an institution of your choice
  • Critically survey a university and rationalize the literary texts it treats as significant
  • Examine the use of feminist literature in advancing the cause against patriarchy
  • Give an overview of the themes in William Shakespeare’s “Julius Caesar”
  • Express the significance of Ernest Hemingway’s diction in contemporary literature
  • Examine the predominant devices in the works of William Shakespeare
  • Explain the predominant devices in the works of Christopher Marlowe
  • Charles Dickens and his works: express the dominating themes in his literature
  • Why is literature described as the mirror of society?
  • Examine the issues of feminism in Sefi Atta’s “Everything Good Will Come” and Bernardine Evaristo’s “Girl, Woman, Other”
  • Give an overview of the stylistics employed in the writing of “Girl, Woman, Other” by Bernardine Evaristo
  • Describe the language of advertisement in social media and newspapers
  • Describe what poetic language means
  • Examine the use of code-switching and code-mixing among Mexican Americans
  • Examine the use of code-switching and code-mixing among Indian Americans
  • Discuss the influence of George Orwell’s “Animal Farm” on satirical literature
  • Examine the linguistic features of “Native Son” by Richard Wright
  • What is the role of indigenous literature in promoting cultural identities?
  • How has literature informed cultural consciousness?
  • Analyze five works of literature on semantics and their influence on the field
  • Assess the role of grammar in day-to-day communication
  • Observe the role of multidisciplinary approaches in understanding the English language
  • What does stylistics mean while analyzing medieval literary texts?
  • Analyze the views of philosophers on language, society, and culture

English Research Paper Topics for College Students

For your college work, you may need to undertake a study of any phenomenon in the world. It could be a Linguistics essay topic or a research study of an idea of your choice. You can choose your research ideas from any of the following:

  • The concept of fairness in a democratic government
  • The capacity of a leader isn’t in his or her academic degrees
  • The concept of discrimination in education
  • The theory of discrimination in Islamic states
  • The idea of school policing
  • A study on grade inflation and its consequences
  • A study of taxation and its importance to the economy from a citizen’s perspective
  • A study on how eloquence leads to discrimination amongst high school students
  • A study of the influence of the music industry on teens
  • An evaluation of pornography and its impacts on college students
  • A descriptive study of how the FBI works according to Hollywood
  • A critical consideration of the cons and pros of vaccination
  • The health effect of sleep disorders
  • An overview of three literary texts across three genres of literature and how they connect to you
  • A critical overview of “King Oedipus”: the role of the supernatural in day-to-day life
  • Examine the memoir “12 Years a Slave” as a reflection of servitude and brutality exerted by white slave owners
  • Rationalize the emergence of racist literature with concrete examples
  • A study of the limits of literature in accessing rural readers
  • Analyze the perspectives of modern authors on the influence of medieval literature on their craft
  • What do you understand by the mortality of a literary text?
  • A study of controversial literature and its role in shaping the discussion
  • A critical overview of three literary texts that dealt with domestic abuse and their role in changing the narratives about domestic violence
  • Choose three contemporary poets and analyze the themes of their works
  • Do you believe that contemporary American literature is the repetition of unnecessary themes already treated in the past?
  • A study of the evolution of literature and its styles
  • The use of sexual innuendos in literature
  • The use of sexist language in literature and its effect on the public
  • The disaster associated with media reports of fake news
  • Conduct a study on how language is used as a tool for manipulation
  • Attempt a criticism of a controversial literary text and why it shouldn’t be studied or sold in the first place

Finding Linguistics Hard To Write About?

With these topics, you can commence your research with ease. However, if you need professional writing help for any part of the research, you can scout here online for the best research paper writing service.

There are several expert writers on ENL hosted on our website that you can consider for a fast response on your research study at a cheap price.

As students, you may be unable to cover every part of your research on your own. This inability is the reason you should consider expert writers for custom research topics in Linguistics approved by your professor for high grades.

130+ Original Linguistics Research Topics: Ideas To Focus On

Linguistics is an exciting course to learn. Unfortunately, writing a research paper, essay, or thesis in linguistics is not as easy. Many students struggle to find a good research topic to write about. Finding a good research topic is crucial because it is the foundation of your paper. It will guide your research and dictate what you write.

  • Creative Language Research Topics
  • Argumentative Research Titles About Language
  • English Language Research Topics for STEM Students
  • Social Media Research Topics About Language
  • The Best Quantitative Research Topics About Language
  • More Creative Sociolinguistics Research Topics
  • Research Topics in English Language Education for Students
  • Top Thesis Topics in Language
  • Creative Language and Gender Research Topics
  • Language Education Research Topics on Social Issues
  • Research Title About Language Acquisition

Most students turn to the internet to find research paper topics. Sadly, most sources provide unoriginal and basic topics. For this reason, this article provides some creative sample research topics for English majors.

Linguistics is a fascinating subject with many research topic options. Check out the following creative research topics in language:

  • How you can use linguistic patterns to locate migration paths
  • Computers and their effect on language creation
  • The internet and its impacts on modern language
  • Have text messages helped create a new linguistic culture?
  • Language and change; how social changes influence language development
  • How language changes over time
  • How effective is non-verbal communication in communicating emotions?
  • Verbal communication and emotional displays: what is the link?
  • The negative power of language in internet interactions
  • How words change as society develops
  • Is the evolution of languages a scientific concept?
  • Role of technology in linguistics

Argumentative essay topics should state your view on a subject so you can create content to defend the view and convince others that it is logical and well-researched. Here are some excellent examples of language research titles:

  • Society alters words and their meanings over time
  • Children have a better grasp of new language and speech than adults
  • Childhood is the perfect time to develop speech
  • Individuals can communicate without a shared language
  • Learning more than one language as a child can benefit individuals in adulthood
  • Elementary schools should teach students a second language
  • Language acquisition changes at different growth stages
  • The impact of technology on linguistics
  • Language has significant power to capitalize on emotions
  • The proper use of language can have positive impacts on society

Research topics for STEM students do not differ much from those for college and high school students. However, they are slightly more targeted. Find an excellent research title about language for your paper below:

  • How does language promote gender differences?
  • Music and language evolution: the correlation
  • Slang: development and evolution in different cultures
  • Can language create bonds among cross-cultural societies?
  • Formal vs informal language: what are the differences?
  • Age and pronunciation: what is the correlation?
  • How languages vary across STEM subjects
  • Are STEM students less proficient in languages?
  • The use of language in the legal sector
  • The importance of non-verbal communication and body language
  • How politeness is perceived through language choices and use
  • The evolution of English through history

Did you know you can find excellent social media research topics if you do it right? Check out the following social media language research titles:

  • The role of the internet in promoting language acquisition
  • A look at changes in languages since social media gained traction
  • How social media brings new language
  • How effective are language apps in teaching foreign languages?
  • The popularity of language applications among learners
  • A study of the impact of the internet on the spreading of slang
  • Social media as a tool for promoting hate language
  • Free speech vs hate speech: what is the difference?
  • How social media platforms can combat hate language propagation
  • How can social media users express emotions through written language?
  • Political censorship and its impact on the linguistics applied in the media
  • The differences between social media and real-life languages

A language research title can be the foundation of your quantitative research. Find some of the best examples of research topics for English majors here:

  • Language barriers in the healthcare sector
  • What percentage of kids below five struggle with languages?
  • Understanding the increase in multilingual people
  • Language barriers and their impact on effective communication
  • Social media and language: are language barriers existent in social media?
  • Bilingualism affects people’s personalities and temperaments
  • Can non-native teachers effectively teach local students the English language?
  • Bilingualism and its impact on social perceptions
  • The new generative grammar concept: an in-depth analysis
  • Racist language: its history and impacts
  • A look into examples of endangered languages
  • Attitudes toward a language and how it can impact language acquisition

You can choose a research topic about language based on social issues, science concerns such as biochemistry topics, and much more. Sociolinguistics is the study of the correlation between language and society and the application of language in various social situations. Here are some excellent research topics in sociolinguistics:

  • An analysis of how sociolinguistics can help people understand multi-lingual language choices
  • An analysis of sociolinguistics through America’s color and race background
  • The role of sociolinguistics in child development
  • Comparing sociolinguistics and psycholinguistics
  • Sociolinguistics and gender empowerment: an analysis of their correlation
  • How media houses use sociolinguistics to create bias and gain a competitive advantage
  • The value of sociolinguistics education in the teaching of discipline
  • The role played by sociolinguistics in creating social change throughout history
  • Research methods used in sociolinguistics
  • Different sociolinguistics and their role in English evolution
  • Sociolinguistics: an in-depth analysis
  • What is sociolinguistics, and what is its role in language evolution?

A good research topic in English will serve as the guiding point for your research paper. Find a suitable research topic for English majors below:

  • Types of indigenous languages
  • Language as an essential element of human life
  • Language as the primary communication medium
  • The value of language in society
  • The negative side of coded language
  • School curriculums and how they influence languages
  • Linguistics: a forensic language
  • Elements that influence people’s ability to learn a new language
  • The development of the English language
  • How the English language borrows from other languages
  • Multilingualism: an insight
  • The correlation between metaphors and similes

Many students struggle to find good thesis topics in language and linguistics. As you read more on thesis statements about social media, make sure you also understand each thesis title about language in the following examples:

  • The classification of human languages
  • The application of different tools in language identification
  • The role of linguists in language identification
  • The contributions of Greek philosophers to language development
  • The origin of language: early speculations
  • The history of language through the scope of mythology
  • Theories that explain the origin and development of language
  • Is language the most effective form of communication?
  • The impact of brain injuries on language
  • The impact of language on sports
  • Linguistic interventions that won’t work in this century
  • Language as a system of symbols

Just like economic research paper topics, gender and language topics do not have to stick to the norms or the standards by which all students write. You can exercise some creativity when creating your topic. Discover a topic about language and gender from this list:

  • Language and gender: what is the correlation?
  • How different genders perceive language
  • Does a kid’s gender influence their grasp of languages?
  • Men vs women: a statistical overview of their multilingual prowess
  • The perception of language from the female standpoint
  • The difference between female and male language use
  • The use of language as a tool for connection between females and males
  • Does gender have an impact on efficient communication?
  • Does gender impact word choices in conversations?
  • Females have an easier time learning two or more languages
  • What makes female and male language choices differ?
  • Are females better at communicating using spoken language?

There are many social issues related to language education that you can cover in your research paper. Check out the following language topics related to social issues for your research:

  • Language translation: what makes it possible?
  • How does the mother tongue influence pronunciation?
  • Issues that encourage people to learn different languages
  • Sign language: origin and more
  • Role of language in solving conflicts
  • Language and mental health: a vivid analysis
  • The similarities between English and French languages
  • Language disorders: an overview
  • Common barriers to language acquisition
  • The impact of mother tongue on effective communication
  • Reasons you should learn two or more languages
  • The benefits of multilingualism in the corporate world
  • Language and identity: what is the correlation?

Language acquisition is the process by which people gain the ability to understand and produce language. Like anatomy research paper topics, language acquisition is a great area on which to focus your linguistics research. Here are some research questions that focus on the study of linguistics and language acquisition:

  • Language acquisition: an overview
  • What attitudes do people have about language acquisition?
  • How attitude can impact language acquisition
  • The evolution of language acquisition over time
  • Language and ethnicity: their correlation
  • Do native English speakers have an easier time acquiring new languages?
  • A case study on political language
  • Why is language acquisition a key factor in leadership?
  • Language acquisition and mother tongue pronunciation: the link
  • Ambiguity as a barrier to language acquisition
  • How words acquire their meanings

While a good topic can help capture the reader and create a good impression, it is insufficient to earn you excellent grades. You also need quality content for your paper to get perfect grades. However, creating a high-quality research paper takes time, effort, and skill, which most students do not have.

For these reasons, we offer quality research paper writing services for all students. We guarantee quality papers, timely deliveries, and originality. Reach out to our writers for top linguistics research papers today!


100+ Linguistic Topics for Excellent Research Papers

13 December, 2021

12 minutes read

Author:  Donna Moores

Linguistics is the systematic study of language. It seeks to reveal the form, meaning, and context of language. While many college students may perceive linguistics as a simple subject, it is quite complex. Tutors might assign linguistics topics in various subfields such as phonology or semantics, which leaves many learners struggling with their research papers.

When analyzing language, you should write a paper that clarifies the nature, classification, and proper identification tools. Therefore, your linguistics topics must be relevant and within the research purpose. It is essential to pick an appropriate topic to allow the audience to understand the fundamental research.

With numerous dialects across the globe, identifying a worthy topic should be a simple task. We have compiled lists of engaging topic ideas to help you craft an outstanding research paper and inspire your academic projects.

Linguistics Research Paper: Definition, Explanation, Examples

Any linguistics paper should comprise an in-depth analysis of language development and acquisition. The subject explores various aspects of different dialects and their meanings. It also covers style and form to develop comprehensive arguments under various contexts.

That is why English professors test students with various academic projects to measure their comprehension levels. Thus, learners should ensure they select good linguistics research paper topics. Here is an overview example of the paper structure.


  • Background information.
  • Hypothesis.
  • Literature review.


  • Data sources.
  • Data organization.
  • Analysis/Findings.
  • Paraphrase hypothesis.
  • Significance of the study.
  • Recommendations.

Therefore, ensure your paper meets the specified academic standards. You must read the requirements keenly to craft an outstanding paper that meets the tutor’s expectations. If you encounter challenges, you can research further online or seek clarification from your professor to know how you will approach the research question.

Choosing A Good Linguistic Topic Isn’t Hard – Here’s How To Do It

Struggling to pick a relevant topic for your research paper? Fret not. We will help you understand the steps to identify an appropriate topic. Most students often underestimate the significance of the pre-writing stages, which entails topic selection. It is a vital phase where you need to choose relevant linguistics topics for your research paper. Hence, ensure you read the research question carefully to understand its requirements.

Carry out an extensive brainstorming session to identify relatable themes within the subject area. Avoid selecting a broad theme, but if you do, break it into minor sub-topics. This will help you during the research phase to get adequate information. Use different websites to get verifiable academic sources and published papers from reputable scholars.

Don’t forget to make your linguistics research paper topics catchy and exciting to capture your readers’ attention. No one wants to read a dull paper.

Finally, follow all the academic requirements for research paper writing – proper grammar, style, correct citation, etc. College tutors often award well-written, original papers.

However, if you still find it challenging to move beyond topic selection, you can reach out to one of our subject-oriented experts for assistance.

We are here to offer the following:

  • Quality-approved papers.
  • 100% authentic papers.
  • One-on-one personalized learning.
  • Efficient support services.
  • Complete confidentiality and data privacy.

Therefore, do not endure the academic pressure alone. Talk to us, and we will help you select unique linguistic research topics.

Top 15 Brilliant Psycholinguistics Topics

Psycholinguistics deals with language development and acquisition. Below is a compilation of brilliant linguistics paper topics to inspire your essay compositions.

  • The significance of learning many languages as a young child.
  • The importance of music in language development.
  • An analysis of how language forms cross-cultural ties.
  • Why you should learn the art of body language.
  • What is hate speech? Is it self-taught?
  • The impact of speech on human character.
  • Linguistic patterns: A study of tracking migration routes.
  • The impact of technology on linguistics.
  • A comparative analysis of non-verbal communication.
  • Discuss how children get impressive language skills.
  • Compare and contrast verbal and non-verbal communication.
  • Discuss the different stages in dialect acquisition.
  • The influence of linguistic ethics in evoking mass emotions.
  • Effective language use improves an individual’s personality: Discuss.
  • An analysis of learning mechanisms in a foreign dialect.

15 Interesting Sociolinguistics Topic Ideas

Need help with your sociolinguistics research paper? Here are interesting topics in linguistics to jumpstart your writing.

  • An in-depth theoretical analysis of language development.
  • Explore dialect as a communication tool.
  • How brain injuries influence language and speech.
  • Language is a symbolic system: Discuss.
  • Examine the different linguistic disorders and challenges.
  • The impact of mother tongue on effective communication.
  • The importance of learning more than one dialect.
  • Evaluate mother tongue pronunciation and language fluency.
  • Compare and contrast the English and French languages.
  • Why do people communicate in different languages?
  • The role of Greek philosophers in language formation.
  • Language origination as an unfathomable issue.
  • Discuss language as a national identity in a multicultural nation.
  • Is there a difference between adult and child language acquisition?
  • Discuss the challenges in language development.

15 Good Applied Linguistics Topics

Applied linguistics is an essential discipline that allows learners to comprehend effective communication. Below are interesting linguistics topics to help you during writing.

  • What is applied linguistics?
  • Evaluate applied linguistics in a technological environment.
  • Discuss the intricacies of spoken and written language.
  • Explore bilingualism and multilingualism.
  • An analysis of communication barriers in delivering health services.
  • The influence of identity in a multicultural society.
  • Discuss dialect barriers in social media networks.
  • An in-depth analysis of hate speech.
  • The importance of applied linguistics development.
  • The adverse effects of social media on effective communication.
  • The impact of culture on multilingualism.
  • An in-depth evaluation of applied linguistics.
  • The influence of politics on linguistic media.
  • An analysis of practical research methods on linguistics.
  • How bilingualism enhances human personality.

15 Computational Linguistics Research Paper Topics

Computational linguistics involves technology in translation and other language-enhancing tools. Below are compelling linguistics thesis topics for your research compositions.

  • What is computational linguistics?
  • The impact of technology in speech recognition.
  • The evolution of the translation industry in enhancing communication.
  • Does translation cause communication barriers?
  • An analysis of audiovisual translation.
  • Discuss the effectiveness of supervised learning.
  • An analysis of effective programs for phonetic comparison of dialects.
  • Speech recognition: description of dialect performance.
  • An analysis of linguistic dimensions using technology.
  • Effective methods of text extraction.
  • Discuss the reasons for learning computational linguistics.
  • The influence of modern communication on computational linguistics.
  • Discuss the different approaches to effective learning.
  • An analysis of speech synthesis.
  • Discuss the benefits of machine translation.

15 Engaging Comparative Linguistics Research Paper Topics

Looking for winning research topics in linguistics? Search no more. Here are impressive comparative topic ideas for your research compositions:

  • Compare and contrast English and Latin.
  • A comparative study of speech physiology and anatomy.
  • An evaluation of the Ape language.
  • What is folk speech?
  • An analysis of historical linguistics.
  • An in-depth study of ethnographic semantics.
  • The connection between culture and linguistics.
  • A comparative analysis of phonetics in linguistics.
  • The influence of computers on dialect development.
  • Analyze communication in a paralinguistic dialect.
  • English popularity: A comparative study of the world.
  • Does accent fluency boost effective communication?
  • Neologism: An analysis of UK English.
  • Discuss the idioms of Australian English compared to American.
  • A comparative study of the Anglo-Saxon dialects.

15 Interesting Historical Linguistics Topic Ideas

Let us explore historical linguistics essay topics that will translate into remarkable papers with impressive literary arguments.

  • Discuss the significance of the Greek philosophers in language development.
  • An analysis of the preserved cuneiform writings.
  • Evaluate the origin of language theories.
  • Discuss the history of language in mythology.
  • An analysis of language translation.
  • A critical analysis of language development.
  • How speech impacts human interaction.
  • An analysis of modern communication evolution.
  • Discuss the history of written communication.
  • Analyze the different linguistics theories.
  • Why some dialects are challenging to learn.
  • What is structuralism in linguistics?
  • The effectiveness of mother tongue in linguistics.
  • The ancient relationship between French and English.
  • Is English considered indigenous?

15 Compelling Stylistics Linguistics Research Paper Topics

The following are interesting linguistics topics to help in crafting unique research papers. Peruse and pick one that suits your paper’s requirements.

  • Analyze the stylistic features of a business letter.
  • A comparative study of newspaper advertisement style.
  • An analysis of the style of public speeches.
  • The forms and function of legal documents.
  • Discuss the functions of different newspaper genres.
  • The influence of ethnicity on linguistics.
  • Explore the effectiveness of spoken vs. written communication.
  • How effective is language translation?
  • Persuasive linguistics: An analysis of different strategies in politics.
  • The pros and cons of colonialism and the effects on African languages.
  • Discuss practical strategies for language acquisition.
  • Evaluate the social factors impacting language variation.
  • Discuss the various attitudes in society to language.
  • The impact of language on cultural identity.
  • The role of linguistics in different communities.

Having Problems with Your Paper? Our Experts Are Available 24/7

Research paper writing requires dedication in terms of time and effort. Most learners get stuck because of a lack of time and complex topics to handle. But with the correct strategy, you can simplify the entire composition. Let us look at some of the tips and tricks to help you compose an exceptional paper.

Read the essay prompt carefully

Take adequate time to acquaint yourself with the research prompt. What does your tutor expect from you? Read the assignment carefully before moving ahead with the research writing.

Choose a topic

Identify an appropriate topic through an extensive brainstorming exercise. It is pretty simple once you have the required themes in place.

Conduct comprehensive research

Carry out intensive research on the topic you have selected, paying careful attention to which information is relevant. Use multiple trusted sources to gather adequate research content on the theme.

Develop a thesis

Organize your research and develop a powerful thesis statement. It gives your target audience an idea of the paper’s direction.

Design an outline

As per your paper requirements, design an appropriate outline that captures your entire research logically. Include an introduction, main body, and conclusion.

Writing process

Finally, start writing and make sure your arguments flow logically and clearly without any vague explanation in each paragraph.

Thorough editing and proofreading

Edit your work thoroughly and proofread for errors. Make sure it follows all the academic standard rules before turning in the paper to your tutor.

Need help with your research paper? Relax and let our qualified experts assist you in getting top-notch results. There is no need to struggle alone when our writers are available 24/7, ready to provide professional writing help. We have a team of skilled experts who are highly knowledgeable in diverse disciplines. Moreover, you will enjoy a personalized learning experience with our pro essay writers.

Whether you need help choosing linguistics anthropology research topics or composing the entire research paper, we have you covered in all aspects. No matter how complex the topic is, our experts will pull all-nighters to ensure you get your paper on time.

We are a reliable service that puts the interests of customers first. From having speedy client support to prompt deliveries, you can be sure of enjoying top-of-the-range services. We do not gamble with your academics, and that is why we promise our clients original research papers.

Therefore, contact us with detailed information about the writing service you need. Talk to us and improve your academic performance within no time.

A life lesson in Romeo and Juliet taught by death

Due to human nature, we draw conclusions only when life gives us a lesson since the experience of others is not so effective and powerful. Therefore, when analyzing and sorting out common problems we face, we may trace a parallel with well-known book characters or real historical figures. Moreover, we often compare our situations with […]

Ethical Research Paper Topics

Writing a research paper on ethics is not an easy task, especially if you do not possess excellent writing skills and do not like to contemplate controversial questions. But an ethics course is obligatory in all higher education institutions, and students have to look for a way out and be creative. When you find an […]

Art Research Paper Topics

Students obtaining degrees in fine art and art & design programs most commonly need to write a paper on art topics. However, this subject is becoming more popular in educational institutions for expanding students’ horizons. Thus, both groups of receivers of education: those who are into arts and those who only get acquainted with art […]

Language and linguistics articles from across Nature Portfolio

Latest research and reviews.

Does simplification hold true for machine translations? A corpus-based analysis of lexical diversity in text varieties across genres

Subtitling Saudi Arabic slang into English: the case of “The Book of the Sun” on Netflix

  • Sukayna Ali
  • Hanan Al-Jabri
  • Wan Rose Eliza Abdul Rahman

ChatGPT and the digitisation of writing

Translation, transmission and indigenization of Christianity in nineteenth-century China: south-to-north travel of the disguised San Zi Jing by Medhurst

A multi-dimensional analysis of interpreted and non-interpreted English discourses at Chinese and American government press conferences

  • Dandan Sheng

Geospace translation strategies and their cognitive construal


News and Comment

Time to revise the terminology we use to regulate water management practices.

  • Paul Jeffrey
  • Heather Smith
  • Francis Hassard

Is boredom a source of noise and/or a confound in behavioral science research?

Behavioral researchers tend to study behavior in highly controlled laboratory settings to minimize the effects of potential confounders. Yet, while doing so, the artificial setup itself might unintentionally introduce noise or confounders, such as boredom. In this perspective, we draw upon theoretical and empirical evidence to make the case that (a) some experimental setups are likely to induce boredom in participants, (b) the degree of boredom induced might differ between individuals as a function of differences in trait boredom, (c) boredom can impair participants’ attention, can make study participation more effortful, and can increase the urge to do something else (i.e., to disengage from the study). Most importantly, we argue that some participants might adjust their behavior because they are bored. Considering boredom’s potential for adding noise to data, or for being an unwanted confound, we discuss a set of recommendations on how to control for and deal with the occurrence and effects of boredom in behavioral science research.

  • Maria Meier
  • Corinna S. Martarelli
  • Wanja Wolff

Exploration of the social and philosophical underpinning of ‘the patient’—what this means for people with a long-term condition

Should healthcare professionals use the term ‘patient’? A patient is a social construct, in a biomedical model, in which each actor has their role to play. This model has been criticised as belonging to an era of medical hegemony and (mis)represents an individual seeking healthcare as one who is simply a passive participant and recipient of care. The ‘Language Matters’ campaign, for people living with diabetes, has sought to address the role of language in interactions between healthcare providers. A key point raised in the campaign is whether someone who feels well, but has ongoing healthcare input, should be referred to as a patient? In this article, we address the concept of a patient and how its use can belie a particular mindset (or ‘discourse’) in which power is established in a relationship and can lead to individuals being defined by their condition. However, for some linguistic communities (such as nurses and doctors), a patient may be considered less as one over whom they have dominion, but rather someone for whom they have specific responsibilities and duty of care. Drawing upon the philosophical theories of language—that the meaning and inference of a word is dependent on its use—we argue that the context in which use of the term patient occurs is crucial. Without more fundamental cultural disruption of the biomedical model, word substitution, in itself, will not change perception.

  • M. B. Whyte

Approaching the neuroscience of language

  • Marika Gobbo

Neural evidence of word prediction

  • Jane Aristia

The usefulness of ChatGPT for psychotherapists and patients

ChatGPT is a chatbot based on a large language model. Its application possibilities are extensive, and it is freely accessible to all people, including psychotherapists and individuals with mental illnesses. Some blog posts about the possible use of ChatGPT as a psychotherapist or as a supplement to psychotherapy already exist. Based on three detailed chats, the author analyzed the chatbot’s responses to psychotherapists seeking assistance, to patients looking for support between psychotherapy sessions, during their psychotherapists’ vacations, and to people suffering from mental illnesses who are not yet in psychotherapy. The results suggest that ChatGPT offers an interesting complement to psychotherapy and an easily accessible, good (and currently free) place to go for people with mental-health problems who have not yet sought professional help and have no psychotherapeutic experience. The information is, however, one-sided, and in any future regulation of AI it must also be made clear that the proposals are not only insufficient as a psychotherapy substitute, but also have a bias that favors certain methods while not even mentioning other approaches that may be more helpful for some people.

  • Paolo Raile

  • Front Psychol

Trends and hot topics in linguistics studies from 2011 to 2021: A bibliometric analysis of highly cited papers

Associated data.

The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/supplementary material.

High citations most often characterize quality research that reflects the foci of the discipline. This study aims to spotlight the most recent hot topics and the trends emerging from the highly cited papers (HCPs) in the Web of Science category of linguistics and language & linguistics with bibliometric analysis. The bibliometric information of the 143 HCPs based on Essential Science Indicators was retrieved and used to identify and analyze influential contributors at the levels of journals, authors, and countries. The most frequently explored topics were identified by corpus analysis and manual checking. The retrieved topics can be grouped into five general categories: multilingual-related, language teaching and learning-related, psycho/pathological/cognitive linguistics-related, methods and tools-related, and others. Topics such as bi/multilingual(ism), translanguaging, language/writing development, models, emotions, foreign language enjoyment (FLE), cognition, and anxiety are among the most frequently explored. Multilingual and positive trends are discerned from the investigated HCPs. The findings inform linguistic researchers of the publication characteristics of the HCPs in the linguistics field and help them pinpoint the research trends and directions on which to focus their efforts in future studies.

1. Introduction

Citations, as a rule, exhibit a skewed distributional pattern over academic publications: a few papers accumulate an overwhelmingly large number of citations while the majority are rarely, if ever, cited. Correspondingly, the highly cited papers (HCPs) receive the greatest amount of attention in academia, as citations are commonly regarded as a strong indicator of research excellence. For academic professionals, following HCPs is an efficient way to stay current with the developments in a field and to make better-informed decisions regarding potential research topics and directions. Academic institutions, government and private agencies, and science policy makers in general keep a close eye on this visible indicator, citations, and use it to make more informed decisions on research funding allocation and science policy formulation. Against the backdrop of ever-growing academic output, there is a noticeable shift of attention from publication quantity to publication quality. Many countries are developing research policies to identify “excellent” universities, research groups, and researchers (Danell, 2011). In a word, HCPs showcase high-quality research, encompass significant themes, and constitute a critical reference point in a research field as they are the “gold bullion of science” (Smith, 2007).

2. Literature review

Bibliometrics, a term coined by Pritchard (1969), refers to the application of mathematical methods to the analysis of academic publications. Essentially this is a quantitative method to depict publication patterns within a given field based on a body of literature. There are many bibliometric studies on natural and social sciences in general (Hsu and Ho, 2014; Zhu and Lei, 2022) and on various specific disciplines such as management sciences (Liao et al., 2018), biomass research (Chen and Ho, 2015), computer sciences (Xie and Willett, 2013), and sport sciences (Mancebo et al., 2013; Ríos et al., 2013), etc. In these studies, researchers tracked developments, weighed research impacts, and highlighted emerging scientific fronts with bibliometric methods. In the field of linguistics, bibliometric studies have all appeared in the past few years (van Doorslaer and Gambier, 2015; Lei and Liao, 2017; Gong et al., 2018; Lei and Liu, 2018, 2019). These bibliometric studies mostly examined a sub-area of linguistics, such as corpus linguistics (Liao and Lei, 2017), translation studies (van Doorslaer and Gambier, 2015), the teaching of Chinese as a second/foreign language (Gong et al., 2018), or academic journals like System (Lei and Liu, 2018) or Porta Linguarum (Sabiote and Rodríguez, 2015). Although Lei and Liu (2019) took the entire discipline of linguistics under investigation, their research is exclusively focused on applied linguistics and restricted to a limited number of journals (42 journals in total), leaving publications in other linguistics disciplines and qualified journals unexamined.

In recent years, a number of studies have been concerned with “excellent” papers or HCPs. For example, Small (2004) surveyed HCP authors’ opinions on why their papers are highly cited. The strong interest, the novelty, the utility, and the high importance of the work were among the most frequently mentioned reasons. Most authors also considered that their selected HCPs were indeed based on the most important work of their academic career. Aksnes (2003) investigated the characteristics of HCPs and found that they were generally authored by a large number of scientists, often involving international collaboration. Some researchers even attempted to predict HCPs by building mathematical models, implying “the first mover advantage in scientific publication” (Newman, 2008, 2014). In other words, papers published earlier in a field are generally more likely to accumulate more citations than those published later. Although many papers addressed HCPs from different perspectives, they held a common belief that HCPs are very different from less-cited or uncited papers and thus deserve utmost attention in academic research (Aksnes, 2003; Blessinger and Hrycaj, 2010; Yan et al., 2022).

Although an increased focus on research quality can be observed in different fields, opinions diverge on the range and the inclusion criteria of excellent papers. Are they ‘highly cited’, ‘top cited’, or ‘most frequently cited’ papers? Aksnes (2003) noted two different approaches to defining a highly cited article, involving absolute or relative thresholds, respectively. An absolute threshold stipulates a minimum number of citations for identifying excellent papers, while a relative threshold employs percentile rank classes, for example, the top 10% most highly cited papers in a discipline, in a publication year, or in a publication set. It is important to note that citations differ significantly across fields and disciplines. An HCP in the natural sciences generally accumulates more citations than its counterpart in the social sciences. Thus, it is necessary to investigate HCPs from different fields separately or to adopt different inclusion criteria to ensure a valid comparison.
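
To make the distinction concrete, the following is a minimal Python sketch of the two approaches that Aksnes (2003) describes. The function names, the sample citation counts, and the cut-off values are illustrative assumptions, not data or procedures taken from any of the cited studies.

```python
import math


def highly_cited_absolute(citations, min_citations):
    """Absolute threshold: keep papers with at least `min_citations` citations."""
    return [i for i, c in enumerate(citations) if c >= min_citations]


def highly_cited_relative(citations, top_fraction):
    """Relative threshold: keep papers in the top `top_fraction` by citation count.

    Papers tied with the last paper inside the top fraction are also kept,
    which is one possible tie-handling choice (an assumption of this sketch).
    """
    if not citations:
        return []
    # Number of papers that make up the top fraction (at least one).
    k = max(1, math.ceil(len(citations) * top_fraction))
    ranked = sorted(citations, reverse=True)
    cutoff = ranked[k - 1]
    return [i for i, c in enumerate(citations) if c >= cutoff]


# Hypothetical citation counts for ten papers in one field.
sample = [250, 3, 0, 41, 12, 7, 0, 98, 5, 1]

print(highly_cited_absolute(sample, 50))    # papers with 50 or more citations
print(highly_cited_relative(sample, 0.10))  # top 10% of the set
```

Note that the absolute rule depends on a number chosen in advance, whereas the relative rule adapts to the citation level of the set being examined, which is why it is easier to apply across fields with different citation habits.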

The present study has been motivated by two considerations. First, the sizable number of publications of varied qualities in a scientific field makes it difficult or even impossible to conduct any reliable and effective literature research. Focusing on the quality publications, the HCPs in particular, might lend more credibility to the findings on trends. Second, HCPs can serve as a great platform to discover potentially important information for the development of a discipline and understand the past, present, and future of the scientific structure. Therefore, the present study aims to investigate the hot topics and publication trends in the Web of Science category of linguistics or language & linguistics (shortened as linguistics in later references) with bibliometric methods. The study aims to answer the following three questions:

  • Who are the most productive and impactful contributors of the HCPs in WoS category of linguistics or language & linguistics in terms of publication venues, authors, and countries?
  • What are the most frequently explored topics in HCPs?
  • What are the general research trends revealed from the HCPs?

3. Materials and methods

Different from previous studies which used an arbitrary inclusion threshold (e.g., Blessinger and Hrycaj, 2010; Hsu and Ho, 2014), we rely on Essential Science Indicators (ESI) to identify the HCPs. Developed by Clarivate, a leading company in the areas of bibliometrics and scientometrics, ESI reveals emerging science trends as well as influential individuals, institutions, papers, journals, and countries in any scientific field of inquiry by drawing on the complete WoS databases. ESI has been chosen for the following three reasons. First, ESI adopts a stricter inclusion criterion for HCP identification. That is, a paper is selected as an HCP only when its citations exceed the top 1% citation threshold in each of the 22 ESI subject categories. Second, ESI is widely used and recognized for its reliability and authority in identifying the top-charting work, generating “excellent” metrics including hot and highly cited papers. Third, ESI automatically updates its database to generate the most recent HCPs, which makes it especially suitable for trend studies over a specified timeframe.

3.1. Data source

The data retrieval was completed at the portal of our university library on June 20, 2022. The methods to retrieve the data are described in Table 1. The bibliometric indicators regarding the important contributors at journal/author/country levels were obtained. Specifically, after the search was completed, we clicked the “Analyze Results” bar on the result page for the detailed descriptive analysis of the retrieved bibliometric data.

Table 1. Retrieval strategies.

Several points should be noted about the search strategies. First, we searched the bibliometric data from two sub-databases of the WoS core collection: Social Sciences Citation Index (SSCI) and Arts & Humanities Citation Index (A&HCI). There is no need to include the sub-database of Science Citation Index Expanded (SCI-EXPANDED) because publications in the linguistics field are almost exclusively indexed in SSCI and A&HCI journals. The WoS core collection was chosen as the data source because it is one of the most comprehensive and authoritative databases of bibliometric information in the world. Many previous studies utilized WoS to retrieve bibliometric data. van Oorschot et al. (2018) and Ruggeri et al. (2019) even indicated that WoS meets the highest standards in terms of impact factor and citation counts and hence guarantees the validity of any bibliometric analysis. Second, we do not restrict the document types, as HCP selection informed by ESI only considers articles and reviews. Third, we do not set the date range, as the dataset of ESI-HCPs is automatically updated regularly to include the most recent 10 years of publications.

The aforementioned query returned a total of 143 HCPs published in 48 journals and contributed by 352 authors from 226 institutions. We then downloaded the raw bibliometric parameters of the 143 HCPs for follow-up analysis, including publication years, authors, publication titles, countries, affiliations, abstracts, citation reports, etc. A complete list of the 143 HCPs can be found in the Supplementary Material. We collected the most recent impact factor (IF) of each journal from the 2022 Journal Citation Reports (JCR).

3.2. Data analysis

3.2.1. Citation analysis

A citation threshold is the minimum number of citations obtained by ranking papers in a research field in descending order by citation counts and then selecting the top fraction or percentage of papers. In ESI, the highly cited threshold is the minimum number of citations received by the top 1% of papers from each of the 10 database years. In other words, to enter the HCP list, a paper has to meet a minimum citation threshold that varies by research field and by year. Of the 22 research fields in ESI, Social Sciences, General is a broad field covering a number of WoS categories, including linguistics and language & linguistics. We checked the ESI official website to obtain the yearly highly cited thresholds in the research field of Social Sciences, General, as shown in Figure 1 (https://esi.clarivate.com/ThresholdsAction.action). As we can see, the longer ago a paper was published, the more citations it has to receive to meet the threshold. We then divided the raw citation count of each HCP by the highly cited threshold of the corresponding year to obtain the normalized citations for each HCP.

Highly cited thresholds in the research field of Social Sciences, General.
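
The normalization described in Section 3.2.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the threshold values and the two example papers below are hypothetical placeholders, not the actual ESI thresholds or papers from the dataset, and the names `Paper`, `THRESHOLDS`, and `normalized_citations` are introduced here for the example.

```python
from dataclasses import dataclass

# Hypothetical yearly thresholds (placeholders, NOT the real ESI values).
# Older publication years require more citations to reach the top 1%.
THRESHOLDS = {2012: 150, 2016: 90, 2020: 20}


@dataclass
class Paper:
    title: str
    year: int
    raw_citations: int


def normalized_citations(paper: Paper, thresholds: dict) -> float:
    """Raw citations divided by the highly cited threshold of the publication year."""
    return paper.raw_citations / thresholds[paper.year]


# Hypothetical papers: B has fewer raw citations but is much younger.
papers = [Paper("A", 2012, 600), Paper("B", 2020, 100)]

# Rank by normalized rather than raw citations.
ranked = sorted(papers, key=lambda p: normalized_citations(p, THRESHOLDS), reverse=True)
```

With these placeholder numbers, paper B outranks paper A after normalization even though A has more raw citations, which is the effect the normalization is meant to capture.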

3.2.2. Corpus analysis and manual checking

To determine the most frequently explored topics in these HCPs, we used both corpus-based analysis of word frequency and manual checking. The underlying assumption is that the more frequently a word or phrase occurs in a purpose-built corpus, the more likely it is to constitute a research topic. In this study, we built an Abstract corpus with all the abstracts of the 143 HCPs, totaling 24,800 tokens. The procedures to retrieve the research topics in the Abstract corpus were as follows. First, the 143 abstracts were saved as separate .txt files in one folder. Second, AntConc (Anthony, 2022), a corpus analysis tool for concordancing and text analysis, was employed to extract lists of n-grams (2–4) in decreasing order of frequency. We also generated a list of individual nouns because individual nouns can sometimes constitute research topics as well. Considering the small size of our corpus, we adopted both a frequency criterion (3) and a range criterion (3) for topic candidacy. That is, a candidate n-gram must occur at least 3 times and in at least 3 different abstract files. The frequency threshold guarantees the importance of the candidate topics, while the range threshold guarantees that the topics are not concentrated in a small number of publications. We tested the frequency and range thresholds over several rounds to make sure that all the potential topics were included. In total, we obtained 531 nouns, 1,330 2-grams, 331 3-grams, and 81 4-grams. Third, because most of the retrieved n-grams cannot function as meaningful research topics, we manually checked all the candidate items and discussed their roles as potential research topics extensively until full agreement was reached. Finally, we read all the abstracts of the 143 HCPs to further validate their roles as research topics. In the end, we obtained 118 topic items in total.
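
The frequency and range criteria can be illustrated with a short Python sketch. It is not a reimplementation of AntConc: the tokenizer is a simple regular expression chosen for this example, the function names are hypothetical, and the four toy "abstracts" are invented, so the counts will not match those produced by the actual tool on the actual corpus.

```python
import re
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, joined with spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def candidate_ngrams(abstracts, n_values=(2, 3, 4), min_freq=3, min_range=3):
    """Keep n-grams occurring >= min_freq times in >= min_range different abstracts."""
    freq = Counter()       # total occurrences across the corpus
    doc_range = Counter()  # number of abstracts containing the n-gram
    for text in abstracts:
        # Assumed tokenization: lowercase alphabetic words, hyphens allowed inside.
        tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
        seen = set()
        for n in n_values:
            for gram in ngrams(tokens, n):
                freq[gram] += 1
                seen.add(gram)
        doc_range.update(seen)  # each abstract counts once per n-gram
    kept = [(g, freq[g], doc_range[g]) for g in freq
            if freq[g] >= min_freq and doc_range[g] >= min_range]
    return sorted(kept, key=lambda item: (-item[1], item[0]))


# Invented toy corpus: one string per abstract file.
toy_abstracts = [
    "Working memory predicts fluency",
    "Working memory in bilinguals",
    "Bilinguals and working memory",
    "Language anxiety language anxiety language anxiety",
]
candidates = candidate_ngrams(toy_abstracts)
```

In the toy corpus, "working memory" passes both thresholds, whereas "language anxiety" reaches the frequency threshold but occurs in only one abstract and is therefore rejected by the range criterion.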

4.1. Main publication venues of HCPs

Of the 48 journals that published the 143 HCPs, 17 journals contributed at least 3 HCPs each (Table 2). Together they account for around 71.33% of the examined HCPs (102/143), indicating that HCPs tend to be highly concentrated in a limited number of journals. The three largest publication outlets of HCPs are Bilingualism Language and Cognition (16), International Journal of Bilingual Education and Bilingualism (11), and Modern Language Journal (10). Because journals vary greatly in the number of papers they publish per year, and the number of HCPs is associated with journal output, we divided the number of HCPs by the total number of papers (TP) in the examined years (2011–2021) to obtain the HCP percentage for each journal (HCPs/TP). The three journals with the highest HCPs/TP percentage are Annual Review of Applied Linguistics (2.26), Modern Language Journal (2.08), and Bilingualism Language and Cognition (1.74), indicating that papers published in these journals have a higher probability of entering the HCP list.

Top 17 publication venues of HCPs.

N: the number of HCPs in each journal; N%: the percentage of HCPs in each journal in the total of 143 HCPs; TP: the total number of papers in the examined timespan (2011–2021); N/TP %: the percentage of HCPs in the total journal publications in the examined time span; TC/HCP: average citations of each HCP; R: journal ranking for the designated indicator; IF: Impact Factor in the year of 2022.

In terms of the general impact of the HCPs from each journal, we divided the total citations (TC) of a journal's HCPs by the number of its HCPs to obtain the average citations per HCP (TC/HCP). The three journals with the highest TC/HCP are Journal of Memory and Language (837.86), Computational Linguistics (533.75), and Journal of Pragmatics (303.75). This indicates that even within the same WoS category, HCPs in different journals differ strikingly in their capacity to accumulate citations. For example, the TC/HCP of System is as low as 31.73, less than 4% of the highest TC/HCP, that of Journal of Memory and Language.
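
The two journal-level ratios used in this section (HCPs/TP and TC/HCP) can be written out as a minimal Python sketch. The helper names and the single example journal are hypothetical; only the figures 102/143, 31.73, and 837.86 are taken from the text above.

```python
def pct(part, whole):
    """Percentage of part in whole, rounded to two decimals."""
    return round(100 * part / whole, 2)


def journal_indicators(n_hcps, total_papers, total_citations):
    """HCP share of a journal's output (N/TP %) and average citations per HCP (TC/HCP)."""
    return {
        "N/TP %": pct(n_hcps, total_papers),
        "TC/HCP": round(total_citations / n_hcps, 2),
    }


# Hypothetical journal: 4 HCPs among 200 papers, 1,000 citations to those HCPs.
example = journal_indicators(n_hcps=4, total_papers=200, total_citations=1000)

# Figures quoted in the text:
top17_share = pct(102, 143)               # share of HCPs in the 17 listed journals
system_vs_jml = pct(31.73, 837.86)        # System's TC/HCP relative to the highest TC/HCP
```

Keeping the numerator and denominator explicit in the function signature makes the direction of each ratio unambiguous: HCPs over total papers, and total citations over HCPs.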

With regard to the latest journal impact factor (IF) in 2022, the four journals with the highest IF are Computational Linguistics (7.778), Modern Language Journal (7.5), Computer Assisted Language Learning (5.964), and Language Learning (5.24). According to the Journal Citation Reports (JCR) quartile rankings in the WoS category of linguistics, all the journals on the list belong to Q1 (the top 25%), suggesting that researchers are more likely to submit to and cite papers in these prestigious, high-impact journals.

4.2. Authors of HCPs

A total of 352 authors had their names listed in the 143 HCPs, of whom 33 appeared in at least 2 HCPs, as shown in Table 3. Table 3 also provides other indicators of the authors' productivity and impact, including the total number of citations (TC), the number of citations per HCP, and the number of first-author or corresponding-author HCPs (FA/CA). We include the FA/CA indicator because first authors and corresponding authors are usually considered to contribute the most and to deserve a greater proportion of the credit in academic publications (Marui et al., 2004; Dance, 2012).

Authors with at least 2 HCPs.

N: number of HCPs from each author; FA/CA: first author or corresponding author HCPs; TC: total citations of the HCPs from each author; C/HCP: average citations per HCP for each author.

In terms of the number of HCPs, Dewaele JM from Birkbeck Univ London tops the list with 7 HCPs (TC = 492), followed by Li C from Huazhong Univ Sci & Technol (#HCPs = 5; TC = 215) and Saito K from UCL (#HCPs = 5; TC = 576). It should be noted that both Li C and Saito K collaborate closely with Dewaele JM. For example, 3 of the 5 HCPs by Li C are co-authored with Dewaele JM. Their co-authored HCPs mostly address foreign language learning emotions and related issues, such as boredom, anxiety, enjoyment, measurement, and positive psychology.

With regard to TC, Li W from UCL stands out as the most influential scholar among the listed authors, with 956 total citations from 2 HCPs, followed by Norton B from Univ British Columbia (TC = 915) and Vasishth S from Univ Potsdam (TC = 694). Their average citations per HCP are also the highest among the listed authors (478, 305, and 347, respectively). It is important to note that Li W's 2 HCPs are his groundbreaking works on translanguaging, which have become near must-reads for anyone engaged in translanguaging research (Li, 2011, 2018). Besides, Li W is the sole author of both HCPs, which is extremely rare because HCPs usually result from the work of multiple researchers. Norton B's HCPs explore core issues in applied linguistics such as identity and investment, language learning, and social change, and are considered foundational work in the field (Norton and Toohey, 2011; Darvin and Norton, 2015).

From the perspective of FA/CA papers, Li C from Huazhong Univ Sci & Technol is prominent because she is the first author of all 5 of her HCPs. Her research on language learning emotions in the Chinese context is gaining widespread recognition (Li et al., 2018, 2019, 2021; Li, 2019, 2021). However, as she is a newly emerging researcher, most of her HCPs were published very recently and have hence accumulated relatively few citations (TC = 215). Mondada L from Univ Basel follows closely as the sole author of her 3 HCPs. Her work is mostly devoted to conversation analysis, multimodality, and social interaction (Mondada, 2016, 2018, 2019).

The following points should be noted regarding the productive authors of HCPs. First, when we calculated the number of HCPs for each author, only papers published in journals indexed in the investigated WoS categories (linguistics; language & linguistics) were taken into account, a compromise made to preserve the linguistics-oriented nature of the HCPs. For example, Brysbaert M from Ghent University had a total of 8 HCPs at the time of data retrieval, of which 6 were published in the WoS category of psychology; being more psychologically oriented, they were not included in our study. Besides, all the authors on an author list were treated equally when we calculated the number of HCPs, regardless of author order. This implies that some influential authors may not enter the list because they have comparatively fewer HCPs. Second, as some authors reported different affiliations at different career stages, we only provide their most recent affiliation for convenience. Third, it is highly competitive to have one's work selected as an HCP. The fact that the majority of HCP authors do not appear in our productive author list does not diminish their great contributions to this field. The rankings in Table 3 do not necessarily reflect the recognition authors have earned in academia at large.

4.3. Productive countries of HCPs

In total, the 143 HCPs originated from 33 countries. The most productive countries, those that contributed at least three HCPs, are listed in Table 4. The USA took an overwhelming lead with 59 HCPs, followed distantly by England with 31 HCPs. They also had the highest total citations (TC = 15,770 and TC = 9,840, respectively), reflecting their high productivity and strong influence as traditional powerhouses in linguistics research. With regard to the average citations per HCP, Germany, England, and the USA were the top three countries (TC/HCP = 281.67, 281.14, and 267.29, respectively). Although China held the third position with 19 HCPs, its TC/HCP is the third from the bottom (TC/HCP = 66.84). One important reason is that 13 of the 19 HCPs contributed by scholars in China were published in 2020 or 2021, and newly published HCPs may need more time to accumulate citations. Besides, 18 of the 19 HCPs from China have a first author and/or corresponding author based in China, indicating that scholars in China are becoming more independent and gaining more voice in English linguistics research.

Top 18 countries with at least 3 HCPs.

Two points should be noted here regarding the productive countries. First, we calculated the HCP contributions at the country level instead of the region level. In other words, HCP contributions from different regions of the same country were combined in the calculation. For example, HCPs from Scotland were added to the HCPs from England, and HCPs from Hong Kong, Macau, and Taiwan were counted together with the HCPs from Mainland China. In this way, a clear picture of the HCPs at the country level can be painted. Second, we manually checked the address information of the first author and the corresponding author of each HCP. In some cases, the first author or the corresponding author reported affiliations in more than one country. In such cases, every country in their address list was treated equally in the FA/CA calculation. In other words, an HCP may be attributed to more than one country because of the different country affiliations of the first and/or the corresponding author.
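
The two counting rules above (regions merged into countries, and every country in the FA/CA address list counted equally) can be sketched as follows. This is a minimal illustration: the mapping mirrors the examples named in the text, while the record layout, the `fa_ca_regions` field name, and the three sample records are hypothetical.

```python
from collections import Counter

# Region-to-country merging as described in the text.
REGION_TO_COUNTRY = {
    "Scotland": "England",
    "Hong Kong": "China",
    "Macau": "China",
    "Taiwan": "China",
    "Mainland China": "China",
}


def fa_ca_counts(papers):
    """Count FA/CA HCPs per country; a paper counts once for each distinct country."""
    counts = Counter()
    for paper in papers:
        # A set ensures that two regions of the same country are not double counted.
        countries = {REGION_TO_COUNTRY.get(region, region)
                     for region in paper["fa_ca_regions"]}
        counts.update(countries)
    return counts


# Hypothetical records: regions reported by the first/corresponding authors.
sample = [
    {"fa_ca_regions": ["Scotland", "USA"]},
    {"fa_ca_regions": ["Hong Kong", "Mainland China"]},
    {"fa_ca_regions": ["England"]},
]
counts = fa_ca_counts(sample)
```

Because one paper can be credited to several countries, the per-country counts produced this way can sum to more than the number of papers.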

4.4. Top 20 HCPs

The top 20 HCPs with the highest normed citations are listed in decreasing order in Table 5 . The top cited publications can guide us to better understand the development and research topics in recent years.

Top 20 HCPs.

To save space, full information about the HCPs is not given. Article titles have been abbreviated if they are too lengthy; for the authors, we report the first two authors and use “et al.” if there are three or more authors; RC: raw citations; NC: normalized citations.

By reading the titles and the abstracts of these top HCPs, we categorized the topics of the 20 HCPs into the following five groups: (i) statistical and analytical methods in (psycho)linguistics such as sentiment analysis, sentence simplification techniques, effect sizes, and linear mixed models (#1, 3, 4, 6, 9, 14), (ii) language learning/teaching emotions such as enjoyment, anxiety, boredom, and stress (#11, 15, 16, 18, 19), (iii) translanguaging or multilingualism (#5, 13, 17, 20), (iv) language perception (#2, 7, 10), and (v) medium of instruction (#8, 12). It is no surprise that 6 of the top 20 HCPs are about statistical methods in linguistics, because language researchers aspire to employ statistics to make their research more scientific. Besides, we noticed that the papers on language teaching/learning emotions on the list were all published in 2020 and 2021, indicating that these emerging topics may deserve more attention in future research. We also noticed that two Covid-19 related articles (#16, 19) explored the emotions teachers and students experienced during the pandemic, a timely response to the urgent needs of the language learning and teaching community.

It is of special interest that papers from journals indexed in multiple JCR categories seem to accumulate more citations. For example, Journal of Memory and Language, American Journal of Speech-Language Pathology, and Computational Linguistics are indexed in both SSCI and SCIE and contribute the top 4 HCPs, showing the advantage of these hybrid journals in amassing citations compared with conventional language journals. Besides, in contrast to the finding of Yan et al. (2022) that most of the top HCPs in the field of radiology are reviews, 19 of the top 20 HCPs here are research articles; the only review is Macaro et al. (2018).

4.5. Most frequently explored topics of HCPs

After obtaining the corpus-based topic items, we read all the titles and abstracts of the 143 HCPs to further validate their roles as research topics. Table 6 presents the top research topics with an observed frequency of 5 or above. We grouped these topics into five broad categories: bilingual-related, language learning/teaching-related, psycho/pathological/cognitive linguistics-related, methods and tools-related, and others. The observed frequency of each topic in the Abstract corpus is given in brackets. We found that about 34 of the 143 HCPs explore bilingual-related issues, the largest share among all the categorized topics, which testifies to the academic popularity of these issues in the examined timespan. Besides, 30 of the 143 HCPs investigate language learning/teaching-related issues, with topics ranging from learners (e.g., EFL learners, individual difference) to multiple learning variables (e.g., learning strategy, motivation, agency). The findings here will be validated by the analysis of the keywords.

Categorization of the most explored research topics.

N: the number of the HCPs in each topic category; ELF: English as a lingua franca; CLIL: content and language integrated learning; FLE: foreign language enjoyment; FLCA: foreign language classroom anxiety

Several points should be mentioned regarding topic candidacy. First, for similar topic expressions, we used a cover term and added up the frequency counts. For example, multilingualism is a cover term for bilinguals, bilingualism, plurilingualism, and multilingualism. Second, for nouns with singular and plural forms (e.g., emotion and emotions) or for items with different spellings (e.g., meta analysis and meta analyses), we combined the frequency counts. Third, we found that some longer items (3-grams and 4-grams) could be subsumed under shorter ones (2-grams or single words) without loss of essential meaning (e.g., working memory capacity under working memory). In such cases, the shorter ones were kept for their higher frequency. Fourth, some highly frequent terms were discarded because they were too general to be valuable topics in language research, for example, applied linguistics, language use, and second language.
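
The four merging rules can be made concrete with a small Python sketch. It is an illustration only: the term lists reuse the examples given in the text, the frequency counts are invented, and in the study itself these decisions were made manually rather than by a script.

```python
from collections import Counter

# Rules 1 and 2: variants whose counts are added to a cover term.
COVER_TERMS = {
    "bilinguals": "multilingualism",
    "bilingualism": "multilingualism",
    "plurilingualism": "multilingualism",
    "emotions": "emotion",
    "meta analyses": "meta analysis",
}
# Rule 3: longer n-grams dropped in favor of a shorter item. Their counts are
# NOT added, since each occurrence already contains the shorter item.
SUBSUMED = {"working memory capacity"}
# Rule 4: terms too general to be useful topics.
TOO_GENERAL = {"applied linguistics", "language use", "second language"}


def merge_topics(raw_counts):
    """Apply the cover-term, subsumption, and generality rules to raw counts."""
    merged = Counter()
    for term, count in raw_counts.items():
        if term in TOO_GENERAL or term in SUBSUMED:
            continue
        merged[COVER_TERMS.get(term, term)] += count
    return merged


# Invented counts, for illustration only.
raw = {
    "bilingualism": 10, "multilingualism": 5,
    "emotion": 3, "emotions": 4,
    "working memory": 6, "working memory capacity": 3,
    "applied linguistics": 20,
}
topics = merge_topics(raw)
```

Treating subsumed n-grams separately from cover terms avoids double counting: every occurrence of working memory capacity is already an occurrence of working memory.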

5. Discussion and implications

Based on 143 highly cited papers collected from the WoS categories of linguistics, the present study attempts to present a bird's eye view of the publication landscape and the most up-to-date research themes reflected in the HCPs of the linguistics field. Specifically, we investigated the important contributors of HCPs in terms of journals, authors, and countries. Besides, we spotlighted the research topics through corpus-based analysis of the abstracts and a detailed analysis of the top HCPs. The study has produced several findings that bear important implications.

The first finding is that the HCPs are highly concentrated in a limited number of journals and countries. With regard to journals, those in the spheres of bilingualism and applied linguistics (e.g., language teaching and learning) are likely to accumulate more citations and hence to produce more HCPs. Journals that focus on bilingualism from a linguistic, psycholinguistic, and neuroscientific perspective are the most frequent outlets of HCPs, as evidenced by the top two productive journals of HCPs, Bilingualism Language and Cognition and International Journal of Bilingual Education and Bilingualism. This can be explained by the multidisciplinary nature of bilingual-related research and the development of cognitive measurement techniques. The merits of analyzing the publication venues of HCPs are twofold. On the one hand, it points readers to the sources of high-quality publications in this field, as most of the significant and cutting-edge achievements are concentrated in these prestigious journals. On the other hand, it provides essential guidance for authors on where to submit their work for higher visibility.

In terms of country distribution, the traditional powerhouses in linguistics research such as the USA and England clearly lead in both the number and the citations of HCPs. However, developing countries such as China and Iran are also becoming increasingly prominent, which may be traced to the funding and support provided by national language policies and development policies, as reported in recent studies (Ping et al., 2009; Lei and Liu, 2019). Take China as an example. Along with its economic development, China has given more impetus to academic output through increased investment in scientific research (Lei and Liao, 2017). Researchers in China are therefore highly motivated to publish papers in high-quality journals to win recognition in international academia and to cope with the publish-or-perish pressure (Lee, 2014). These factors may explain the rise of China as an emerging research powerhouse in both natural and social sciences, including English linguistics research.

The second finding is the multilingual trend in linguistics research. The dominant clustering of topics around multilingualism can be understood as a timely response to the multilingual research fever (May, 2014). Of the 143 HCPs, 34 have such words as bilingualism, bilingual, multilingualism, and translanguaging in their titles, reflecting a strong multilingual tendency of the HCPs. Multilingual-related HCPs mainly involve three aspects: multilingualism from the perspectives of psycholinguistics and cognition (e.g., Luk et al., 2011; Leivada et al., 2020); multilingual teaching (e.g., Schissel et al., 2018; Ortega, 2019; Archila et al., 2021); and language policies related to multilingualism (e.g., Shen and Gao, 2018). Translanguaging, a pedagogical process initially used to describe bilingual classroom practice and a frequently explored topic in HCPs, has developed into an applied linguistics theory since Li's Translanguaging as a Practical Theory of Language (Li, 2018). The most common collocates of translanguaging in the Abstract corpus are pedagogy/pedagogies, practices, and space/spaces. There are two main reasons for this multilingual turn. First, the rapid development of globalization, immigration, and overseas study programs has greatly stimulated the use of and research on multiple languages in different linguistic contexts. Second, in many non-English-speaking countries, courses are delivered through languages (mostly English) other than the students' mother tongue (Clark, 2017). Students are required to use multiple languages as resources to learn and understand subjects and ideas. The burgeoning body of English Medium Instruction literature in higher education is in line with the rising interest in multilingualism. Given its inherently multidisciplinary nature, multilingualism, the topic du jour, is bound to attract more attention in the future.

The third finding is the application of Positive Psychology (PP) in second language acquisition (SLA), that is, the positive trend in linguistic research. In our analysis, 20 of the 143 HCPs have words or phrases such as emotions, enjoyment, boredom, anxiety, and positive psychology in their titles, which might signal a shift of interest toward the psychology of language learners and teachers in different linguistic environments. Our study shows that foreign language enjoyment (FLE) is the most frequently explored emotion, followed by foreign language classroom anxiety (FLCA); the two have been described as the learners' metaphorical left and right feet on their journey to acquiring the foreign language (Dewaele and MacIntyre, 2016). In fact, the topics of PP are not entirely new to SLA. For example, studies of language motivation, affect, and good language learners all provide roots for the emergence of PP in SLA (Naiman, 1978; Gardner, 2010). In recent years, both research on and teaching applications of PP in SLA have been growing rapidly, with a diversity of topics already being explored, such as positive education and PP interventions. It should be noted that SLA also feeds back into PP theories and concepts besides drawing inspiration from them, which makes it “an area rich for interdisciplinary cross-fertilization of ideas” (Macintyre et al., 2019).

It should be noted that subjectivity is involved in deciding on and categorizing the candidate topic items based on the Abstract corpus. However, the frequency and range criteria guarantee that these items are indeed explored in multiple HCPs and are thus valuable topics for further investigation. Some high-frequency n-grams were abandoned because they are too general or are not meaningful topics. For example, applied linguistics is too broad to be included, as most of the HCPs concern issues in this research line rather than in theoretical linguistics. By meaningful topics, we mean topics that can help journal editors and readers quickly locate their fields of interest (Lei and Liu, 2019), as author keywords such as bilingualism, emotions, and individual differences do. The examination of the few 3/4-grams and single words (mostly nouns) revealed that most of them were either not meaningful topics or could be subsumed under the 2-grams. Besides, there is inevitably some overlap in the topic categorization. For example, some topics in the language teaching and learning category are situated and discussed within the context of multilingualism. The merits of topic categorization are twofold: to better monitor the overlap between the Abstract corpus-based topic items and the keywords, and to roughly delineate the research strands in the HCPs for future research.

It should also be noted that all the results were based on the retrieved HCPs only. The study did not aim to paint a comprehensive picture of the whole landscape of linguistic research. Rather, it specifically focused on the most popular literature in a specified timeframe, thus generating snapshots of trends in linguistic research. One important merit of this methodology is that some newly emerging but highly cited researchers can be spotlighted and gain more academic attention, because only the metrics of HCPs are considered in the calculation. Conversely, the absence of some generally highly cited researchers such as Rod Ellis and Ken Hyland merely indicates that their highly cited publications fall outside our investigated timeframe; it cannot be interpreted as a sign of diminishing academic influence in the field. Besides, the study does not consider collaborators or collaborations in calculating the number of HCPs, for two reasons. First, although some researchers are regular collaborators, such as Li C and Dewaele JM, their individual contributions should not be discounted. Second, the study also provides additional information about the number of FA/CA HCPs from each listed author, which may help readers locate the research that interests them.

We acknowledge that our study has some limitations that should be addressed in future research. First, our study focuses on the HCPs extracted from WoS SSCI and A&HCI journals, arguably the most celebrated papers in this field. Future studies may consider including data from other databases such as Scopus to verify the findings of the present study. Second, our Abstract corpus-based method for topic extraction involved human judgement. Although the final list was the result of several rounds of discussion among the authors, it is difficult or even impossible to avoid subjectivity, and some worthy topics may have been unintentionally missed. Therefore, future research may consider employing automatic algorithms to extract topics. For example, a dependency-based machine learning approach can be used to identify research topics (Zhu and Lei, 2021).

Data availability statement

Author contributions

SY: conceptualization and methodology. SY and LZ: writing-review and editing and writing-original draft. All authors contributed to the article and approved the submitted version.

This work was supported by the Humanities and Social Sciences Youth Fund of China MOE under grants 20YJC740076 and 18YJC740141.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2022.1052586/full#supplementary-material

  • Aksnes D. W. (2003). Characteristics of highly cited papers. Res. Eval. 12, 159–170. doi: 10.3152/147154403781776645
  • Anthony L. (2022). AntConc (version 4.0.5). Tokyo, Japan: Waseda University. Available at: https://www.laurenceanthony.net/software (Accessed June 20, 2022).
  • Archila P. A., Molina J., Truscott de Mejía A.-M. (2021). Fostering bilingual scientific writing through a systematic and purposeful code-switching pedagogical strategy. Int. J. Biling. Educ. Biling. 24, 785–803. doi: 10.1080/13670050.2018.1516189
  • Blessinger K., Hrycaj P. (2010). Highly cited articles in library and information science: an analysis of content and authorship trends. Libr. Inf. Sci. Res. 32, 156–162. doi: 10.1016/j.lisr.2009.12.007
  • Chen H., Ho Y. S. (2015). Highly cited articles in biomass research: a bibliometric analysis. Renew. Sust. Energ. Rev. 49, 12–20. doi: 10.1016/j.rser.2015.04.060
  • Clark S. (2017). Translanguaging in higher education: beyond monolingual ideologies. Int. J. Biling. Educ. Biling. 22, 1048–1051. doi: 10.1080/13670050.2017.1322568
  • Dance A. (2012). Authorship: Who's on first? Nature 489, 591–593. doi: 10.1038/nj7417-591a
  • Danell R. (2011). Can the quality of scientific work be predicted using information on the author's track record? J. Am. Soc. Inf. Sci. Technol. 62, 50–60. doi: 10.1002/asi.21454
  • Darvin R., Norton B. (2015). Identity and a model of investment in applied linguistics. Annu. Rev. Appl. Linguist. 35, 36–56. doi: 10.1017/S0267190514000191
  • Dewaele J.-M., MacIntyre P. D. (2016). “Foreign language enjoyment and foreign language classroom anxiety: the right and left feet of the language learner” in Positive psychology in SLA. eds. MacIntyre P. D., Gregersen T., Mercer S. (Bristol, Blue Ridge Summit: Multilingual Matters), 215–236.
  • Gardner R. (2010). Motivation and second language acquisition: The socio-educational model. New York: Peter Lang.
  • Gong Y., Lyu B., Gao X. (2018). Research on teaching Chinese as a second or foreign language in and outside mainland China: a bibliometric analysis. Asia Pac. Educ. Res. 27, 277–289. doi: 10.1007/s40299-018-0385-2
  • Hsu Y., Ho Y. S. (2014). Highly cited articles in health care sciences and services field in Science Citation Index Expanded. Methods Inf. Med. 53, 446–458. doi: 10.3414/ME14-01-0022
  • Lee I. (2014). Publish or perish: the myth and reality of academic publishing. Lang. Teach. 47, 250–261. doi: 10.1017/S0261444811000504
  • Lei L., Liao S. (2017). Publications in linguistics journals from mainland China, Hong Kong, Taiwan, and Macau (2003–2012): a bibliometric analysis. J. Quant. Ling. 24, 54–64. doi: 10.1080/09296174.2016.1260274
  • Lei L., Liu D. (2018). The research trends and contributions of System's publications over the past four decades (1973–2017): a bibliometric analysis. System 80, 1–13. doi: 10.1016/j.system.2018.10.003
  • Lei L., Liu D. (2019). Research trends in applied linguistics from 2005 to 2016: a bibliometric analysis and its implications. Appl. Linguis. 40, 540–561. doi: 10.1093/applin/amy003
  • Leivada E., Westergaard M., Duñabeitia J. A., Rothman J. (2020). On the phantom-like appearance of bilingualism effects on neurocognition: (how) should we proceed? Biling. Lang. Cogn. 24, 197–210. doi: 10.1017/S1366728920000358
  • Li W. (2011). Moment analysis and translanguaging space: discursive construction of identities by multilingual Chinese youth in Britain. J. Pragmat. 43, 1222–1235. doi: 10.1016/j.pragma.2010.07.035
  • Li W. (2018). Translanguaging as a practical theory of language. Appl. Linguis. 39, 9–30. doi: 10.1093/applin/amx039
  • Li C. (2019). A positive psychology perspective on Chinese EFL students' trait emotional intelligence, foreign language enjoyment and EFL learning achievement. J. Multiling. Multicult. Dev. 41, 246–263. doi: 10.1080/01434632.2019.1614187
  • Li C. (2021). A control-value theory approach to boredom in English classes among university students in China. Mod. Lang. J. 105, 317–334. doi: 10.1111/modl.12693
  • Li C., Dewaele J. M., Hu Y. (2021). Foreign language learning boredom: conceptualization and measurement. Appl. Ling. Rev. doi: 10.1515/applirev-2020-0124
  • Li C., Dewaele J. M., Jiang G. (2019). The complex relationship between classroom emotions and EFL achievement in China. Appl. Ling. Rev. 11, 485–510. doi: 10.1515/applirev-2018-0043
  • Li C., Jiang G., Dewaele J.-M. (2018). Understanding Chinese high school students' foreign language enjoyment: validation of the Chinese version of the foreign language enjoyment scale. System 76, 183–196. doi: 10.1016/j.system.2018.06.004
  • Liao S., Lei L. (2017). What we talk about when we talk about corpus: a bibliometric analysis of corpus-related research in linguistics (2000–2015). Glottometrics 38, 1–20.
  • Liao H., Tang M., Li Z., Lev B. (2018). Bibliometric analysis for highly cited papers in operations research and management science from 2008 to 2017 based on essential science indicators . Omega 88 , 223–236. doi: 10.1016/j.omega.2018.11.005 [ CrossRef ] [ Google Scholar ]
  • Luk G., Sa E. D., Bialystok E. (2011). Is there a relation between onset age of bilingualism and enhancement of cognitive control? Biling. Lang. Cogn. 14 , 588–595. doi: 10.1017/S1366728911000010 [ CrossRef ] [ Google Scholar ]
  • Macaro E., Curle S., Pun J., Dearden J. (2018). A systematic review of English medium instruction in higher education . Lang. Teach. 51 , 36–76. doi: 10.1017/S0261444817000350 [ CrossRef ] [ Google Scholar ]
  • Macintyre P., Gregersen T., Mercer S. (2019). Setting an agenda for positive psychology in SLA: theory, practice, and research . Mod. Lang. J. 103 , 262–274. doi: 10.1111/modl.12544 [ CrossRef ] [ Google Scholar ]
  • Mancebo F. P., Sapena A. F., Herrera M. V., González L., Toca H., Benavent R. A. (2013). Scientific literature analysis of judo in web of science . Arch. Budo 9 , 81–91. doi: 10.12659/AOB.883883 [ CrossRef ] [ Google Scholar ]
  • Marui M., Bozikov J., Katavi V., Hren D., Kljakovi-Gapi M., Marui A. (2004). Authorship in a small medical journal: a study of contributorship statements by corresponding authors . Sci. Eng. Ethics 10 , 493–502. doi: 10.1007/s11948-004-0007-7, PMID: [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • May S. (2014). The multilingual turn: Implications for SLA, TESOL and bilingual education . New York: Routledge. [ Google Scholar ]
  • Mondada L. (2016). Challenges of multimodality: language and the body in social interaction . J. Socioling. 20 , 336–366. doi: 10.1111/josl.1_12177 [ CrossRef ] [ Google Scholar ]
  • Mondada L. (2018). Multiple temporalities of language and body in interaction: challenges for transcribing multimodality . Res. Lang. Soc. Interact. 51 , 85–106. doi: 10.1080/08351813.2018.1413878 [ CrossRef ] [ Google Scholar ]
  • Mondada L. (2019). Contemporary issues in conversation analysis: embodiment and materiality, multimodality and multisensoriality in social interaction . J. Pragmat. 145 , 47–62. doi: 10.1016/j.pragma.2019.01.016 [ CrossRef ] [ Google Scholar ]
  • Naiman N. (1978). The good language learner . Clevedon, UK: Multilingual Matters. [ Google Scholar ]
  • Newman M. (2008). The first-mover advantage in scientific publication . Eplasty 86 , 68001–68006. doi: 10.1209/0295-5075/86/68001 [ CrossRef ] [ Google Scholar ]
  • Newman M. (2014). Prediction of highly cited papers . Eplasty 105 :28002. doi: 10.1209/0295-5075/105/28002 [ CrossRef ] [ Google Scholar ]
  • Norton B., Toohey K. (2011). Identity, language learning, and social change . Lang. Teach. 44 , 412–446. doi: 10.1017/S0261444811000309 [ CrossRef ] [ Google Scholar ]
  • Ortega L. (2019). SLA and the study of equitable multilingualism . Mod. Lang. J. 103 , 23–38. doi: 10.1111/modl.12525 [ CrossRef ] [ Google Scholar ]
  • Ping Z., Thijs B., Glnzel W. (2009). Is China also becoming a giant in social sciences? Scientometrics 79 , 593–621. doi: 10.1007/s11192-007-2068-x [ CrossRef ] [ Google Scholar ]
  • Pritchard A. (1969). Statistical bibliography or bibliometrics . J. Doc. 25 , 348–349. [ Google Scholar ]
  • Ríos L. J. C., Tamao I. M., Olmos J. (2013). Bibliometric study (1922-2009) on rugby articles in research journals . South Afr. J. Res. Sport Phys. Educ. Rec. 17 , 313–109. doi: 10.3176/tr.2013.3.06 [ CrossRef ] [ Google Scholar ]
  • Ruggeri G., Orsi L., Corsi S. (2019). A bibliometric analysis of the scientific literature on Fairtrade labelling . Int. IJC 43 , 134–152. doi: 10.1111/ijcs.12492 [ CrossRef ] [ Google Scholar ]
  • Sabiote C. R., Rodríguez J. A. (2015). Bibliometric study and methodological quality indicators of the journal porta Linguarum during six year period 2008-2013 . Porta Ling. 24 , 135–150. doi: 10.30827/Digibug.53866 [ CrossRef ] [ Google Scholar ]
  • Schissel J. L., De Korne H., López-Gopar M. E. (2018). Grappling with translanguaging for teaching and assessment in culturally and linguistically diverse contexts: teacher perspectives from Oaxaca, Mexico . Int. J. Biling. Educ. Biling. 24 , 340–356. doi: 10.1080/13670050.2018.1463965 [ CrossRef ] [ Google Scholar ]
  • Shen Q., Gao X. (2018). Multilingualism and policy making in greater China: ideological and implementational spaces . Lang. Policy 18 , 1–16. doi: 10.1007/s10993-018-9473-7 [ CrossRef ] [ Google Scholar ]
  • Small H. (2004). Why authors think their papers are highly cited . Scientometrics 60 , 305–316. doi: 10.1023/B:SCIE.0000034376.55800.18 [ CrossRef ] [ Google Scholar ]
  • Smith D. R. (2007). The New Zealand timber economy, 1840–1935 . N. Z. Med. J. 120 , U2871–U2313. doi: 10.1016/0305-7488(90)90044-C, PMID: [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • van Doorslaer L., Gambier Y. (2015). Measuring relationships in translation studies. On affiliations and keyword frequencies in the translation studies bibliography . Perspectives 23 , 305–319. doi: 10.1080/0907676X.2015.1026360 [ CrossRef ] [ Google Scholar ]
  • van Oorschot J. A. W. H., Hofman E., Halman J. (2018). A bibliometric review of the innovation adoption literature . Technol. Forecast. Soc. Chang. 134 , 1–21. doi: 10.1016/j.techfore.2018.04.032 [ CrossRef ] [ Google Scholar ]
  • Xie Z., Willett P. (2013). The development of computer science research in the People’s republic of China 2000–2009: a bibliometric study . Inf. Dev. 29 , 251–264. doi: 10.1177/0266666912458515 [ CrossRef ] [ Google Scholar ]
  • Yan S., Zhang H., Wang J. (2022). Trends and hot topics in radiology, nuclear medicine and medical imaging from 2011–2021: a bibliometric analysis of highly cited papers . Jpn. J. Radiol. 40 , 847–856. doi: 10.1007/s11604-022-01268-z, PMID: [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Zhu H., Lei L. (2021). A dependency-based machine learning approach to the identification of research topics: a case in COVID-19 studies . Lib. Hi Tech 40 , 495–515. doi: 10.1108/LHT-01-2021-0051 [ CrossRef ] [ Google Scholar ]
  • Zhu H., Lei L. (2022). The research trends of text classification studies (2000–2020): a bibliometric analysis . SAGE Open 12 , 215824402210899–215824402210816. doi: 10.1177%2F21582440221089963 [ Google Scholar ]

Natural language processing: state of the art, current trends and challenges

  • Published: 14 July 2022
  • Volume 82 , pages 3713–3744, ( 2023 )

  • Diksha Khurana 1 ,
  • Aditya Koli 1 ,
  • Kiran Khatter   ORCID: orcid.org/0000-0002-1000-6102 2 &
  • Sukhdev Singh 3  

This article has been updated

Natural language processing (NLP) has recently gained much attention for representing and analyzing human language computationally. Its applications have spread to fields such as machine translation, email spam detection, information extraction, summarization, medicine, and question answering. In this paper, we first distinguish four phases by discussing different levels of NLP and components of Natural Language Generation, followed by the history and evolution of NLP. We then discuss in detail the state of the art, presenting the various applications of NLP, current trends, and challenges. Finally, we present a discussion on some available datasets, models, and evaluation metrics in NLP.

1 Introduction

A language can be defined as a set of rules or a set of symbols, where symbols are combined and used for conveying or broadcasting information. Since not all users are well-versed in machine-specific languages, Natural Language Processing (NLP) caters to those users who do not have enough time to learn new languages or to master them. NLP is a branch of Artificial Intelligence and Linguistics devoted to making computers understand statements or words written in human languages. It came into existence to ease the user's work and to satisfy the wish to communicate with the computer in natural language. It can be classified into two parts, Natural Language Understanding (or Linguistics) and Natural Language Generation, which involve the tasks of understanding and generating text. Linguistics is the science of language, which includes Phonology, which refers to sound; Morphology, word formation; Syntax, sentence structure; Semantics, meaning; and Pragmatics, which refers to understanding in context. Noam Chomsky, one of the first linguists of the twentieth century to develop syntactic theories, holds a unique position in the field of theoretical linguistics because he revolutionized the area of syntax (Chomsky, 1965) [23]. Further, Natural Language Generation (NLG) is the process of producing meaningful phrases, sentences and paragraphs from an internal representation. The first objective of this paper is to give insights into the various important terminologies of NLP and NLG.

In the existing literature, most of the work in NLP is conducted by computer scientists, although various other professionals such as linguists, psychologists, and philosophers have also shown interest. One of the most interesting aspects of NLP is that it adds to our knowledge of human language. The field of NLP draws on different theories and techniques that deal with the problem of communicating with computers in natural language. Some of the researched tasks of NLP are:

  • Automatic Summarization: produces an understandable summary of a set of text and provides summaries or detailed information of text of a known type.
  • Co-Reference Resolution: given a sentence or a larger span of text, determines which words refer to the same object.
  • Discourse Analysis: identifies the discourse structure of connected text, i.e. the study of text in relation to social context.
  • Machine Translation: automatic translation of text from one language to another.
  • Morphological Segmentation: breaks words into individual meaning-bearing morphemes.
  • Named Entity Recognition (NER): used in information extraction to recognize named entities and then classify them into different classes.
  • Optical Character Recognition (OCR): automatic text recognition that translates printed and handwritten text into a machine-readable format.
  • Part Of Speech Tagging: determines the part of speech for each word of a sentence.

Some of these tasks have direct real-world applications, such as machine translation, named entity recognition and optical character recognition. Although NLP tasks are closely interwoven, they are frequently treated separately for convenience. Some tasks, such as automatic summarization and co-reference analysis, act as subtasks that are used in solving larger tasks. NLP is much discussed nowadays because of its various applications and recent developments, although in the late 1940s the term was not even in existence. So, it is interesting to know about the history of NLP, the progress made so far, and some of the ongoing projects that make use of NLP. The second objective of this paper focuses on these aspects. The third objective of this paper concerns datasets, approaches, evaluation metrics and the challenges involved in NLP. The rest of this paper is organized as follows. Section 2 deals with the first objective, covering the various important terminologies of NLP and NLG. Section 3 deals with the history of NLP, applications of NLP and a walkthrough of the recent developments. Datasets used in NLP and various approaches are presented in Section 4, and Section 5 covers evaluation metrics and challenges involved in NLP. Finally, a conclusion is presented in Section 6.

2 Components of NLP

NLP can be classified into two parts, Natural Language Understanding and Natural Language Generation, which involve the tasks of understanding and generating text. Figure 1 presents the broad classification of NLP. The objective of this section is to discuss Natural Language Understanding (NLU), also referred to as Linguistics, and Natural Language Generation (NLG).

Fig. 1 Broad classification of NLP

NLU enables machines to understand natural language and analyze it by extracting concepts, entities, emotions, keywords, etc. It is used in customer care applications to understand the problems reported by customers either verbally or in writing. Linguistics is the science that involves the meaning of language, language context and the various forms of language. It is therefore important to understand the key terminologies of NLP and its different levels. We next discuss some of the commonly used terminologies at the different levels of NLP.

Phonology is the part of Linguistics that refers to the systematic arrangement of sound. The term phonology comes from Ancient Greek, in which phono means voice or sound and the suffix -logy refers to word or speech. In 1939 Nikolai Trubetzkoy stated that phonology is “the study of sound pertaining to the system of language”, whereas Lass (1998) [66] wrote that phonology deals broadly with the sounds of language, as the sub-discipline of linguistics concerned with the behavior and organization of sounds. Phonology includes the systematic use of sound to encode meaning in any human language.

The different parts of a word represent the smallest units of meaning, known as morphemes. Morphology, which concerns the nature of words, is built on morphemes. For example, the word precancellation can be morphologically analyzed into three separate morphemes: the prefix pre, the root cancella, and the suffix -tion. Because the interpretation of a morpheme stays the same across words, humans can break an unknown word into its morphemes to understand its meaning. For example, adding the suffix -ed to a verb conveys that the action of the verb took place in the past. Words that cannot be divided and have meaning by themselves are called lexical morphemes (e.g. table, chair). Affixes (e.g. -ed, -ing, -est, -ly, -ful) that are combined with a lexical morpheme are known as grammatical morphemes (e.g. worked, consulting, smallest, likely, useful). Grammatical morphemes that occur only in combination are called bound morphemes (e.g. -ed, -ing). Bound morphemes can be divided into inflectional morphemes and derivational morphemes. Adding an inflectional morpheme to a word changes grammatical categories such as tense, gender, person, mood, aspect, definiteness and animacy. For example, adding the inflectional morpheme -ed changes the root park to parked. A derivational morpheme changes the semantic meaning of the word it is combined with. For example, in the word normalize, the addition of the bound morpheme -ize to the root normal changes the word from an adjective (normal) to a verb (normalize).
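The morpheme analysis described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a real morphological analyzer: the `PREFIXES` and `SUFFIXES` inventories and the `segment` helper are hypothetical and were chosen just to cover the examples in the text.

```python
# Hypothetical affix inventories, chosen only to cover the examples in the text.
PREFIXES = ("pre", "un", "re")
SUFFIXES = ("tion", "ize", "ing", "ed", "est", "ly", "ful")


def segment(word):
    """Split a word into (prefix, root, suffix) by stripping at most one known affix per side."""
    # Take the first listed prefix that the word starts with, if any.
    prefix = next((p for p in PREFIXES if word.startswith(p)), "")
    rest = word[len(prefix):]
    # Take the first listed suffix that leaves a non-empty root, if any.
    suffix = next((s for s in SUFFIXES if rest.endswith(s) and len(rest) > len(s)), "")
    root = rest[: len(rest) - len(suffix)] if suffix else rest
    return prefix, root, suffix


print(segment("precancellation"))  # ('pre', 'cancella', 'tion')
print(segment("parked"))           # ('', 'park', 'ed')
print(segment("normalize"))        # ('', 'normal', 'ize')
```

Pure string matching of this kind over-segments words such as "read" (which merely begins with "re"), which is one reason practical systems rely on lexicons or learned models rather than affix lists alone.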

At the lexical level, humans, as well as NLP systems, interpret the meaning of individual words. Several types of processing contribute to word-level understanding, the first of these being the assignment of a part-of-speech (PoS) tag to each word. In this processing, text is divided into paragraphs, sentences, and words, and the structure of words is analyzed with respect to their lexical meaning and PoS. Words that can act as more than one part of speech are assigned the most probable PoS tag based on the context in which they occur. Assigning the correct PoS tag improves the understanding of the intended meaning of a sentence. Additionally, at the lexical level, words that have only one meaning can be replaced by a semantic representation of that meaning; the nature of the representation varies according to the semantic theory deployed in the NLP system. Lexical-level processing is also used for text cleaning and feature extraction, using techniques such as removal of stop words, stemming and lemmatization. Stop words such as ‘in’, ‘the’ and ‘and’ are removed because they do not contribute to any meaningful interpretation and their high frequency may increase computation time. Stemming reduces the words of a text to a root form by removing the suffix of each word. For example, consulting and consultant are both converted to consult after stemming, using is converted to us, and driver is reduced to driv. Lemmatization does not simply remove the suffix of a word; instead, it returns the base word with the help of a vocabulary. For example, for the token drived, stemming results in “driv”, whereas lemmatization attempts to return the correct basic form, either drive or drived, depending on the context in which it is used.
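The three lexical-level cleaning steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a production stemmer or lemmatizer: the stop-word set, the suffix list and the small `LEMMAS` vocabulary are all hypothetical and exist only to reproduce the examples in the text.

```python
# Hypothetical resources, chosen only to reproduce the examples in the text.
STOP_WORDS = {"in", "the", "and", "a", "is"}
SUFFIXES = ("ing", "ant", "er", "ed", "s")
LEMMAS = {"drove": "drive", "driving": "drive", "consulting": "consult"}


def remove_stop_words(tokens):
    """Drop high-frequency function words that carry little meaning."""
    return [t for t in tokens if t.lower() not in STOP_WORDS]


def stem(token):
    """Strip the first matching suffix, keeping at least two characters of the word."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 2:
            return token[: -len(suffix)]
    return token


def lemmatize(token):
    """Look the token up in a vocabulary; unknown tokens are returned unchanged."""
    return LEMMAS.get(token, token)


tokens = ["the", "driver", "is", "consulting", "in", "the", "office"]
print(remove_stop_words(tokens))                     # ['driver', 'consulting', 'office']
print([stem(t) for t in ("consulting", "consultant", "using", "driver")])
# ['consult', 'consult', 'us', 'driv']
print(lemmatize("drove"))                            # drive
```

The contrast is visible in the outputs: the stemmer produces truncated forms such as "us" and "driv" that are not words, while the lemmatizer can only return forms that exist in its vocabulary.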

After PoS tagging is done at the lexical level, words are grouped into phrases, phrases are grouped to form clauses, and clauses are then combined into sentences at the syntactic level. This level emphasizes the correct formation of a sentence by analyzing its grammatical structure. The output of this level is a representation of the sentence that reveals the structural dependencies between words. This process is also known as parsing, and it uncovers phrases that convey more meaning than the individual words do. The syntactic level examines word order, stop words, morphology and the PoS of words, which the lexical level does not consider. Changing word order changes the dependencies among words and may also affect the comprehension of sentences. For example, the sentences “ram beats shyam in a competition” and “shyam beats ram in a competition” differ only in syntax, yet they convey different meanings [139]. The syntactic level retains stop words because removing them changes the meaning of the sentence. It does not use lemmatization or stemming because converting words to their basic form changes the grammar of the sentence. It focuses on identifying the correct PoS of the words in a sentence. For example, in the phrase “frowns on his face”, “frowns” is a noun, whereas it is a verb in the sentence “he frowns”.
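The two syntactic effects in this paragraph, word order and context-dependent PoS, can be illustrated with a toy sketch. It is not a parser: `subject_verb_object` assumes a simple subject-verb-object sentence, and the pronoun rule in `tag_frowns` is a hypothetical heuristic written only for the “frowns” example.

```python
PRONOUNS = {"he", "she", "it", "they"}  # hypothetical, minimal pronoun list


def subject_verb_object(sentence):
    """Read the roles off the first three tokens of a simple S-V-O sentence."""
    subject, verb, obj = sentence.lower().split()[:3]
    return {"subject": subject, "verb": verb, "object": obj}


def tag_frowns(sentence):
    """Tag 'frowns' as VERB when it directly follows a pronoun, otherwise as NOUN."""
    tokens = sentence.lower().split()
    i = tokens.index("frowns")
    return "VERB" if i > 0 and tokens[i - 1] in PRONOUNS else "NOUN"


print(subject_verb_object("ram beats shyam in a competition"))
# {'subject': 'ram', 'verb': 'beats', 'object': 'shyam'}
print(subject_verb_object("shyam beats ram in a competition"))
# {'subject': 'shyam', 'verb': 'beats', 'object': 'ram'}
print(tag_frowns("frowns on his face"))  # NOUN
print(tag_frowns("he frowns"))           # VERB
```

The same bag of words yields opposite subject and object roles once order is taken into account, which is the information a lexical-level analysis discards.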

At the semantic level, the most important task is to determine the proper meaning of a sentence. To understand the meaning of a sentence, human beings rely on their knowledge of the language and of the concepts present in that sentence, but machines cannot count on these techniques. Semantic processing determines the possible meanings of a sentence by processing its logical structure, recognizing the most relevant words, and understanding the interactions among words or concepts in the sentence. For example, it can recognize that a sentence is about “movies” even if the sentence does not contain that word but contains related concepts such as “actor”, “actress”, “dialogue” or “script”. This level of processing also incorporates the semantic disambiguation of words with multiple senses (Elizabeth D. Liddy, 2001) [68]. For example, the word “bark” as a noun can mean either the sound that a dog makes or the outer covering of a tree. The semantic level interprets words either by their dictionary meaning or by a meaning derived from the context of the sentence. For example, the sentence “Krishna is good and noble.” is talking either about Lord Krishna or about a person named “Krishna”. That is why, to get the proper meaning of the sentence, the appropriate interpretation is chosen by looking at the rest of the sentence [44].
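One simple way to make the “bark” example concrete is a simplified Lesk-style overlap count: choose the sense whose dictionary gloss shares the most words with the sentence. The source does not name this technique; it is used here only as an illustration, and the two glosses in `SENSES` as well as the stop-word set are hypothetical.

```python
# Hypothetical sense inventory with short dictionary-style glosses.
SENSES = {
    "bark": {
        "dog_sound": "the sharp loud sound that a dog makes",
        "tree_covering": "the tough outer covering of the trunk and branches of a tree",
    }
}
STOP_WORDS = {"the", "a", "of", "that", "and", "at", "was", "is", "on"}


def content_words(text):
    """Lowercase the text, strip basic punctuation and drop stop words."""
    return {w.strip(".,!?").lower() for w in text.split()} - STOP_WORDS


def disambiguate(word, sentence):
    """Return the sense whose gloss overlaps most with the sentence's content words."""
    context = content_words(sentence)
    scores = {
        sense: len(context & content_words(gloss))
        for sense, gloss in SENSES[word].items()
    }
    # On a tie, max() keeps the sense listed first in SENSES.
    return max(scores, key=scores.get)


print(disambiguate("bark", "The dog began to bark at the stranger"))  # dog_sound
print(disambiguate("bark", "The bark of the old tree was rough"))     # tree_covering
```

The approach only works when the context happens to share vocabulary with a gloss, which is why it serves here as a sketch of context-based disambiguation rather than a robust method.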

While the syntactic and semantic levels deal with sentence-length units, the discourse level of NLP deals with more than one sentence. It analyzes logical structure by making connections among words and sentences that ensure the coherence of the text. It focuses on the properties of the text that convey meaning by interpreting the relations between sentences and uncovering linguistic structures from texts at several levels (Liddy, 2001) [68]. Two of the most common tasks at this level are anaphora resolution and coreference resolution. Anaphora resolution is achieved by recognizing the entity referenced by an anaphor, so that references used within the text with the same sense are resolved. For example: (i) Ram topped in the class. (ii) He was intelligent. Here (i) and (ii) together form a discourse. Human beings can quickly understand that the pronoun “he” in (ii) refers to “Ram” in (i). The interpretation of “He” depends on another word, “Ram”, presented earlier in the text. Without determining the relationship between these two structures, it would not be possible to decide why Ram topped the class and who was intelligent. Coreference resolution is achieved by finding all expressions that refer to the same entity in a text. It is an important step in various NLP applications that involve high-level tasks such as document summarization and information extraction. In fact, anaphora is encoded through one of the processes called co-reference.
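The Ram example can be sketched with a naive recency heuristic: link each pronoun to the most recently mentioned known entity. This is an assumption made for illustration, not how practical resolvers work; the `PRONOUNS` set and the `known_entities` argument are hypothetical, and gender, number and syntax are ignored.

```python
import re

PRONOUNS = {"he", "him", "his"}  # hypothetical, minimal pronoun list


def resolve_pronouns(sentences, known_entities):
    """Link each pronoun to the most recently mentioned known entity (recency heuristic)."""
    last_entity = None
    links = []
    for index, sentence in enumerate(sentences):
        for token in re.findall(r"[A-Za-z]+", sentence):
            if token in known_entities:
                last_entity = token
            elif token.lower() in PRONOUNS and last_entity is not None:
                # Record (sentence index, pronoun, antecedent).
                links.append((index, token, last_entity))
    return links


discourse = ["Ram topped in the class.", "He was intelligent."]
print(resolve_pronouns(discourse, {"Ram"}))  # [(1, 'He', 'Ram')]
```

The link crosses a sentence boundary, which is exactly what distinguishes discourse-level processing from the sentence-level analyses above.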

The pragmatic level focuses on knowledge that comes from outside the content of the document. It deals with what the speaker implies and what the listener infers. In fact, it analyzes meaning that is not directly stated. Real-world knowledge is used to understand what is being talked about in the text, and a meaningful representation of the text is derived by analyzing the context. Pragmatic ambiguity arises when a sentence is not specific and the context does not provide any specific information about that sentence (Walton, 1996) [143]. It occurs when different persons derive different interpretations of the text, depending on its context. The context of a text may include references to other sentences of the same document, which influence the understanding of the text, and the background knowledge of the reader or speaker, which gives meaning to the concepts expressed in the text. Semantic analysis focuses on the literal meaning of the words, whereas pragmatic analysis focuses on the inferred meaning that readers perceive based on their background knowledge. For example, the sentence “Do you know what time is it?” is interpreted as “asking for the current time” in semantic analysis, whereas in pragmatic analysis the same sentence may be interpreted as “expressing resentment to someone who missed the due time”. Thus, semantic analysis is the study of the relationship between linguistic utterances and their meanings, while pragmatic analysis is the study of the context that influences our understanding of linguistic expressions. Pragmatic analysis helps users uncover the intended meaning of the text by applying contextual background knowledge.

The goal of NLP is to accommodate one or more specialties of an algorithm or system. Assessing an algorithmic system with NLP metrics allows for the integration of language understanding and language generation. NLP is even used in multilingual event detection. Rospocher et al. [112] proposed a novel modular system for cross-lingual event extraction from English, Dutch, and Italian texts, using different pipelines for different languages. The system incorporates a modular set of leading multilingual NLP tools. The pipeline integrates modules for basic NLP processing as well as more advanced tasks such as cross-lingual named entity linking, semantic role labeling and time normalization. Thus, the cross-lingual framework allows for the interpretation of events, participants, locations, and time, as well as the relations between them. The output of these individual pipelines is intended to be used as input for a system that builds event-centric knowledge graphs. Every module takes standard input, performs some annotation, and produces standard output, which in turn becomes the input for the next module in the pipeline. The pipelines are built as a data-centric architecture so that modules can be adapted and replaced. Furthermore, the modular architecture allows for different configurations and for dynamic distribution.
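The idea that every module reads a standard input, adds annotations and emits a standard output can be sketched as below. This is not the system of Rospocher et al.; the dictionary-based document format, the two modules and the capitalization rule for entities are hypothetical stand-ins used only to show why such modules are easy to swap.

```python
def tokenize(doc):
    """Module 1: add a 'tokens' annotation layer to the shared document."""
    return {**doc, "tokens": doc["text"].split()}


def tag_entities(doc):
    """Module 2: add an 'entities' layer (hypothetical rule: capitalized tokens)."""
    entities = [t.strip(".,") for t in doc["tokens"] if t[0].isupper()]
    return {**doc, "entities": entities}


def run_pipeline(text, modules):
    """Feed one shared document through the modules in order."""
    doc = {"text": text}
    for module in modules:
        doc = module(doc)  # each module's output is the next module's input
    return doc


result = run_pipeline("Ram met Shyam in Delhi.", [tokenize, tag_entities])
print(result["tokens"])    # ['Ram', 'met', 'Shyam', 'in', 'Delhi.']
print(result["entities"])  # ['Ram', 'Shyam', 'Delhi']
```

Because each module depends only on the shared document format, a language-specific tokenizer or entity tagger could replace either function without touching the rest of the pipeline.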

Ambiguity is one of the major problems of natural language and occurs when one sentence can lead to different interpretations. It is usually encountered at the syntactic, semantic, and lexical levels. In syntactic ambiguity, one sentence can be parsed into multiple syntactic forms. Semantic ambiguity occurs when the meaning of words can be misinterpreted. Lexical ambiguity refers to a single word that can have multiple meanings. Each of these levels can produce ambiguities that can be resolved with knowledge of the complete sentence. Ambiguity can be handled by various methods such as Minimizing Ambiguity, Preserving Ambiguity, Interactive Disambiguation and Weighting Ambiguity [125]. One of the methods proposed by researchers is preserving ambiguity, e.g. (Shemtov 1997; Emele & Dorna 1998; Knight & Langkilde 2000; Tong Gao et al. 2015; Umber & Bajwa 2011) [39, 46, 65, 125, 139]. Their objectives are closely in line with removing or minimizing ambiguity. They cover a wide range of ambiguities, and there is a statistical element implicit in their approach.

Natural Language Generation (NLG) is the process of producing meaningful phrases, sentences and paragraphs from an internal representation. It is a part of Natural Language Processing and happens in four phases: identifying the goals, planning how the goals may be achieved, evaluating the situation and available communicative sources, and realizing the plans as text (Fig. 2). It is the opposite of Natural Language Understanding.

Speaker and Generator

Fig. 2 Components of NLG

To generate a text, we need to have a speaker or an application and a generator or a program that renders the application’s intentions into a fluent phrase relevant to the situation.

Components and Levels of Representation

The process of language generation involves the following interwoven tasks:

  • Content selection: Information should be selected and included in the set. Depending on how this information is parsed into representational units, parts of the units may have to be removed while some others may be added by default.
  • Textual organization: The information must be textually organized according to the grammar; it must be ordered both sequentially and in terms of linguistic relations like modifications.
  • Linguistic resources: To support the information’s realization, linguistic resources must be chosen. In the end these resources will come down to choices of particular words, idioms, syntactic constructs etc.
  • Realization: The selected and organized resources must be realized as an actual text or voice output.
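The four tasks above can be sketched as a tiny template-based generator. This is only an illustrative sketch under stated assumptions: the `FACTS` record, the `LEXICON` templates and all function names are hypothetical, and real NLG systems use far richer representations at every stage.

```python
# Hypothetical internal representation held by the application (the "speaker").
FACTS = {"city": "Delhi", "temperature_c": 31, "humidity": 40, "station_id": "X17"}

# Hypothetical linguistic resources: one phrase template per kind of fact.
LEXICON = {
    "city": "In {}",
    "temperature_c": "the temperature is {} degrees Celsius",
}


def select_content(facts, wanted):
    """Content selection: keep only the facts worth communicating."""
    return {key: facts[key] for key in wanted if key in facts}


def organize(selected, order):
    """Textual organization: put the selected facts in a sequential order."""
    return [(key, selected[key]) for key in order if key in selected]


def choose_resources(ordered):
    """Linguistic resources: pick the words and constructs for each fact."""
    return [LEXICON[key].format(value) for key, value in ordered]


def realize(phrases):
    """Realization: produce the final surface text."""
    return ", ".join(phrases) + "."


def generate(facts):
    wanted = ["city", "temperature_c"]
    selected = select_content(facts, wanted)
    return realize(choose_resources(organize(selected, wanted)))


print(generate(FACTS))  # In Delhi, the temperature is 31 degrees Celsius.
```

Keeping the stages as separate functions mirrors the description above: the humidity and station identifier are dropped during content selection and never reach the later stages.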

Application or Speaker

The application or speaker only maintains the model of the situation. The speaker initiates the process but does not take part in the language generation itself. It stores the history, structures the content that is potentially relevant, and deploys a representation of what it knows. All of these form the situation, from which a subset of the speaker's propositions is selected. The only requirement is that the speaker must make sense of the situation [91].

3 NLP: Then and now

In the late 1940s the term NLP was not yet in existence, but work on machine translation (MT) had started. Research in this period was not completely localized. Russian and English were the dominant languages for MT (Andreev, 1967) [4]. MT/NLP research almost died in 1966 after the ALPAC report concluded that MT was going nowhere. Later, however, some MT production systems were providing output to their customers (Hutchins, 1986) [60]. By this time, work on the use of computers for literary and linguistic studies had also started. As early as 1960, signature work influenced by AI began with the BASEBALL Q-A system (Green et al., 1961) [51]. LUNAR (Woods, 1978) [152] and Winograd's SHRDLU were natural successors of these systems, but they were seen as a step up in sophistication in terms of their linguistic and task processing capabilities. There was a widespread belief that progress could only be made on two fronts: one was the ARPA Speech Understanding Research (SUR) project (Lea, 1980), and the other was a set of major system development projects building database front ends. The front-end projects (Hendrix et al., 1978) [55] were intended to go beyond LUNAR in interfacing with large databases. In the early 1980s computational grammar theory became a very active area of research, linked with logics for meaning and knowledge that could deal with the user's beliefs and intentions and with functions like emphasis and themes.

By the end of the decade, powerful general-purpose sentence processors like SRI's Core Language Engine (Alshawi, 1992) [2] and Discourse Representation Theory (Kamp and Reyle, 1993) [62] offered a means of tackling more extended discourse within the grammatico-logical framework. This was a period of growing community. Practical resources, grammars, tools and parsers became available (for example, the Alvey Natural Language Tools) (Briscoe et al., 1987) [18]. The (D)ARPA speech recognition and message understanding (information extraction) conferences were notable not only for the tasks they addressed but also for their emphasis on heavy evaluation, starting a trend that became a major feature of the 1990s (Young and Chase, 1998; Sundheim and Chinchor, 1993) [131, 157]. Work on user modeling (Wahlster and Kobsa, 1989) [142] was one strand of research. Cohen et al. (2002) [28] put forward a first approximation of a compositional theory of tune interpretation, together with the phonological assumptions on which it is based and the evidence from which they drew their proposals. At the same time, McKeown (1985) [85] demonstrated that rhetorical schemas could be used for producing text that is both linguistically coherent and communicatively effective. Some research in NLP marked out important topics for the future, such as word sense disambiguation (Small et al., 1988) [126] and probabilistic networks; statistically colored NLP and the work on the lexicon also pointed in this direction. Statistical language processing was a major theme of the 1990s (Manning and Schuetze, 1999) [75], not least because it involves more than data analysis. Information extraction and automatic summarizing (Mani and Maybury, 1999) [74] were also a point of focus. Next, we present a walkthrough of the developments from the early 2000s.

3.1 A walkthrough of recent developments in NLP

The main objectives of NLP include interpretation, analysis, and manipulation of natural language data for the intended purpose with the use of various algorithms, tools, and methods. However, many challenges are involved, which may depend on the natural language data under consideration, and this makes it difficult to achieve all the objectives with a single approach. Therefore, the development of different tools and methods in NLP and related areas of study has received much attention from researchers in the recent past. The developments can be seen in Fig. 3:

Fig. 3 A walkthrough of recent developments in NLP

In the early 2000s, neural language modeling was introduced, in which the probability of the next word (token) is determined given the n previous words. Bengio et al. [ 12 ] proposed the concept of a feed-forward neural network with a lookup table that represents the n previous words in the sequence. Collobert et al. [ 29 ] proposed the application of multitask learning in NLP, where two convolutional models with max pooling were used to perform part-of-speech tagging and named entity recognition tagging. Mikolov et al. [ 87 ] proposed a word embedding process that produces a dense vector representation of text. They also reported the challenges faced by the traditional sparse bag-of-words representation. After the advancement of word embeddings, neural networks that take variable-length input for further processing were introduced in NLP. Sutskever et al. [ 132 ] proposed a general framework for sequence-to-sequence mapping in which encoder and decoder networks are used to map from sequence to vector and from vector to sequence, respectively. In fact, neural networks have played a very important role in NLP. One can observe from the existing literature that neural networks were not widely used in the early 2000s, but by 2013 enough discussion had taken place about their use in NLP to transform many things and pave the way for implementing various neural networks in NLP. Earlier, convolutional neural networks (CNNs) had contributed to image classification and the analysis of visual imagery. Later, CNNs were applied to NLP tasks such as Sentence Classification [ 127 ], Sentiment Analysis [ 135 ], Text Classification [ 118 ], Text Summarization [ 158 ], Machine Translation [ 70 ] and Answer Relations [ 150 ].
An article by Newatia (2019) [ 93 ] illustrates the general architecture behind any CNN model and how it can be used in the context of NLP. One can also refer to the work of Wang and Gang [ 145 ] for the applications of CNNs in NLP. Neural networks that are recurrent in nature, because they apply the same function to every element of the input, are known as Recurrent Neural Networks (RNNs). They have also been used in NLP and found ideal for sequential data such as text, time series, financial data, speech, audio, and video, among others; see the article by Thomas (2019) [ 137 ]. One modified version of the RNN is Long Short-Term Memory (LSTM), which is very useful in cases where only the important information needs to be retained for a much longer time while irrelevant information is discarded; see [ 52 , 58 ]. Further development of the LSTM has led to a slightly simpler variant, called the gated recurrent unit (GRU), which has shown better results than standard LSTMs in many tasks [ 22 , 26 ]. Attention mechanisms [ 7 ], which let a network learn what to pay attention to in accordance with the current hidden state and annotation, together with the use of transformers, have also brought significant developments in NLP; see [ 141 ]. It should be noted that Transformers have the potential to learn longer-term dependencies but are limited by a fixed-length context in the setting of language modeling. In this direction, Dai et al. [ 30 ] recently proposed a novel neural architecture, Transformer-XL (XL for extra-long), which enables learning dependencies beyond a fixed length of words. Further, the work of Rae et al. [ 104 ] on the Compressive Transformer, an attentive sequence model which compresses memories for long-range sequence learning, may be helpful for readers. One may also refer to the recent work by Otter et al. [ 98 ] on uses of deep learning for NLP, and the relevant references cited therein.
The BERT (Bidirectional Encoder Representations from Transformers) [ 33 ] model and its successors have also played an important role in NLP.
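The idea of attention described above, weighting each annotation by how well it matches the current hidden state, can be illustrated with a minimal sketch. The function names `softmax` and `attend` are hypothetical, and a plain dot product is assumed as the scoring function for brevity; this is an assumption of the sketch and not necessarily the scoring used in [ 7 ].

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(hidden_state, annotations):
    """Return (weights, context) for one decoding step.

    hidden_state: list of floats (the current hidden state)
    annotations:  list of equal-length float lists (one per input position)
    Scoring is a dot product, an assumption made for this sketch.
    """
    scores = [sum(h * a for h, a in zip(hidden_state, ann)) for ann in annotations]
    weights = softmax(scores)
    dim = len(annotations[0])
    # The context vector is the attention-weighted sum of the annotations.
    context = [sum(w * ann[i] for w, ann in zip(weights, annotations))
               for i in range(dim)]
    return weights, context

weights, context = attend([1.0, 0.0], [[5.0, 0.0], [0.0, 5.0]])
```

In this toy call the first annotation aligns with the hidden state, so it receives almost all of the weight and dominates the context vector.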

Many researchers have worked on NLP, building the tools and systems that make NLP what it is today. Tools such as sentiment analysers, Parts of Speech (POS) taggers, chunking, Named Entity Recognition (NER), emotion detection, and Semantic Role Labeling have made a huge contribution to NLP and are good topics for research. Sentiment analysis (Nasukawa et al., 2003) [ 156 ] works by extracting sentiments about a given topic, and it consists of topic-specific feature term extraction, sentiment extraction, and association by relationship analysis. It utilizes two linguistic resources for the analysis: the sentiment lexicon and the sentiment pattern database. It analyzes the documents for positive and negative words and tries to give ratings on a scale of −5 to +5. Most currently used tagsets are derived from English. The most widely used tagsets, adopted as standard guidelines, are designed for Indo-European languages; Asian and Middle Eastern languages are less researched. Various authors have built part-of-speech taggers for languages such as Arabic (Zeroual et al., 2017) [ 160 ], Sanskrit (Tapswi & Jain, 2012) [ 136 ], and Hindi (Ranjan & Basu, 2003) [ 105 ] to efficiently tag and classify words as nouns, adjectives, verbs, etc. The authors of [ 136 ] used a treebank technique to create a rule-based POS tagger for Sanskrit. Sanskrit sentences are parsed to assign the appropriate tag to each word using a suffix stripping algorithm, wherein the longest matching suffix is searched for in the suffix table and tags are assigned accordingly. Diab et al. (2004) [ 34 ] used a supervised machine learning approach and adopted Support Vector Machines (SVMs), trained on the Arabic Treebank, to automatically tokenize, part-of-speech tag, and annotate base phrases in Arabic text.
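The longest-suffix lookup used in rule-based tagging can be sketched as follows. This is only an illustration of the general technique: the suffix table below uses a handful of English suffixes chosen for readability, is entirely hypothetical, and is not the Sanskrit table of [ 136 ].

```python
# Hypothetical suffix table (English suffixes, for illustration only).
SUFFIX_TAGS = {
    "ing": "VERB",
    "ed": "VERB",
    "ly": "ADV",
    "ous": "ADJ",
    "ness": "NOUN",
    "s": "NOUN",
}

def tag_word(word, suffix_tags=SUFFIX_TAGS, default="UNK"):
    """Tag a word by the longest suffix found in the suffix table."""
    # Starting at index 1 keeps a non-empty stem; moving the start to the
    # right tries progressively shorter suffixes, so the first hit is the
    # longest matching suffix.
    for start in range(1, len(word)):
        suffix = word[start:]
        if suffix in suffix_tags:
            return suffix_tags[suffix]
    return default

def tag_sentence(sentence):
    """Tag every whitespace-separated token of a sentence."""
    return [(w, tag_word(w.lower())) for w in sentence.split()]
```

Because longer suffixes are tried first, "famous" is tagged through "ous" and not through the shorter "s", which is the point of searching for the longest suffix.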

Chunking is the process of separating phrases from unstructured text. Since single tokens may not represent the actual meaning of the text, it is advisable to treat a phrase such as “North Africa” as a single unit instead of as the separate words ‘North’ and ‘Africa’. Chunking, also known as “shallow parsing”, labels parts of sentences with syntactic categories such as Noun Phrase (NP) and Verb Phrase (VP). Chunking is often evaluated using the CoNLL 2000 shared task. Various researchers (Sha and Pereira, 2003; McDonald et al., 2005; Sun et al., 2008) [ 83 , 122 , 130 ] used CoNLL test data for chunking and used features composed of words, POS tags, and tags.
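As a rough illustration of chunking, the sketch below groups POS-tagged tokens into noun phrases with a single hand-written pattern (optional determiner or adjectives followed by one or more nouns). The tag names follow the Penn Treebank convention, and the rule itself is an assumption made for this example; the systems cited above learn their chunkers from data.

```python
NOUN_TAGS = {"NN", "NNS", "NNP", "NNPS"}
NP_TAGS = NOUN_TAGS | {"DT", "JJ"}  # tags allowed inside a noun phrase

def has_noun(tokens):
    """True if the candidate chunk already contains a noun."""
    return any(tag in NOUN_TAGS for _, tag in tokens)

def chunk_noun_phrases(tagged):
    """Return noun-phrase strings from a list of (word, POS tag) pairs."""
    chunks, current = [], []
    # A sentinel token closes any chunk still open at the end of the sentence.
    for word, tag in list(tagged) + [("", "END")]:
        # A chunk ends at a non-NP tag, or when a determiner/adjective
        # follows a noun (that token starts the next phrase).
        ends_chunk = tag not in NP_TAGS or (tag not in NOUN_TAGS and has_noun(current))
        if ends_chunk:
            if has_noun(current):
                chunks.append(" ".join(w for w, _ in current))
            current = []
        if tag in NP_TAGS:
            current.append((word, tag))
    return chunks

sentence = [("He", "PRP"), ("visited", "VBD"), ("North", "NNP"),
            ("Africa", "NNP"), ("last", "JJ"), ("year", "NN")]
phrases = chunk_noun_phrases(sentence)
```

Here "North Africa" comes out as one unit, matching the motivation given in the text.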

Particular words in a document refer to specific entities or real-world objects such as locations, people, and organizations. To find the words that have a unique context and are more informative, noun phrases in the text documents are considered. Named entity recognition (NER) is a technique to recognize and separate the named entities and group them under predefined classes. In the era of the Internet, however, people often use slang rather than traditional or standard English, and such text cannot be processed well by standard natural language processing tools. Ritter (2011) [ 111 ] proposed the classification of named entities in tweets because standard NLP tools did not perform well on tweets. They rebuilt the NLP pipeline, starting from PoS tagging and then chunking, for NER. This improved performance in comparison to standard NLP tools.

Emotion detection investigates and identifies the types of emotion from speech, facial expressions, gestures, and text. Sharma (2016) [ 124 ] analyzed conversations in Hinglish, a mix of the English and Hindi languages, and identified the usage patterns of PoS. Their work was based on language identification and POS tagging of mixed script. They tried to detect emotions in mixed script by combining machine learning and human knowledge. They categorized sentences into 6 groups based on emotions and used the TLBO technique to help users prioritize their messages based on the emotions attached to each message. Seal et al. (2020) [ 120 ] proposed an efficient emotion detection method that searches for emotional words in a pre-defined emotional keyword database and analyzes the emotion words, phrasal verbs, and negation words. Their proposed approach exhibited better performance than recent approaches.
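A keyword-database approach with negation handling can be sketched in a few lines. The keyword table, the negation list, and the one-token negation window are all hypothetical choices made for this example and are not taken from [ 120 ].

```python
# Hypothetical emotion keyword database and negation words.
EMOTION_KEYWORDS = {
    "happy": "joy", "glad": "joy",
    "sad": "sadness", "angry": "anger", "afraid": "fear",
}
NEGATIONS = {"not", "never", "no"}

def detect_emotions(text):
    """Count emotion keywords, skipping any keyword directly preceded by a negation."""
    tokens = text.lower().replace(".", " ").replace(",", " ").split()
    counts = {}
    for i, token in enumerate(tokens):
        emotion = EMOTION_KEYWORDS.get(token)
        if emotion is None:
            continue
        # Assumption: a negation word immediately before the keyword cancels it.
        if i > 0 and tokens[i - 1] in NEGATIONS:
            continue
        counts[emotion] = counts.get(emotion, 0) + 1
    return counts

result = detect_emotions("I am not happy, I am sad and angry.")
```

A real system would need a wider negation scope and handling of phrasal verbs, which this sketch leaves out.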

Semantic Role Labeling (SRL) works by assigning semantic roles to the constituents of a sentence. For example, in the PropBank (Palmer et al., 2005) [ 100 ] formalism, one assigns roles to words that are arguments of a verb in the sentence. The precise arguments depend on the verb frame, and if multiple verbs exist in a sentence, a word might have multiple tags. State-of-the-art SRL systems comprise several stages: creating a parse tree, identifying which parse tree nodes represent the arguments of a given verb, and finally classifying these nodes to compute the corresponding SRL tags.

Event discovery in social media feeds (Benson et al., 2011) [ 13 ] uses a graphical model to analyze a social media feed and determine whether it contains the name of a person, a venue, a place, a time, etc. The model operates on noisy feeds of data to extract records of events by aggregating information across multiple messages. Despite the irrelevant noisy messages and very irregular message language, this model was able to extract records with a broader array of features.

We first give insights on some of the mentioned tools and relevant work done before moving to the broad applications of NLP.

3.2 Applications of NLP

Natural Language Processing can be applied in various areas such as Machine Translation, Email Spam Detection, Information Extraction, Summarization, and Question Answering. Next, we discuss some of these areas along with the relevant work done in those directions.

Machine Translation

As most of the world is online, making data accessible and available to all is a challenge. A major challenge in making data accessible is the language barrier. There is a multitude of languages with different sentence structures and grammar. Machine translation generally means translating phrases from one language to another with the help of a statistical engine such as Google Translate. The challenge with machine translation technologies is not translating words directly but keeping the meaning of sentences intact along with grammar and tenses. Statistical machine translation systems gather as much data as they can find that appears to be parallel between two languages, and they crunch this data to find the likelihood that something in Language A corresponds to something in Language B. In September 2016, Google announced a new machine translation system based on artificial neural networks and deep learning. In recent years, various methods have been proposed to automatically evaluate machine translation quality by comparing hypothesis translations with reference translations. Examples of such methods are word error rate, position-independent word error rate (Tillmann et al., 1997) [ 138 ], generation string accuracy (Bangalore et al., 2000) [ 8 ], multi-reference word error rate (Nießen et al., 2000) [ 95 ], BLEU score (Papineni et al., 2002) [ 101 ], and NIST score (Doddington, 2002) [ 35 ]. All these criteria try to approximate human assessment and often achieve an astonishing degree of correlation to human subjective evaluation of fluency and adequacy (Papineni et al., 2001; Doddington, 2002) [ 35 , 101 ].
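Two of the simpler metrics named above can be sketched directly. Word error rate (WER) is computed here as the word-level edit distance divided by the reference length. For position-independent word error rate (PER), the sketch uses one common bag-of-words formulation, (max(|ref|, |hyp|) − matches) / |ref|; treating this as the exact formula of [ 138 ] would be an assumption. The function names are hypothetical, and a non-empty reference is assumed.

```python
from collections import Counter

def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]

def wer(reference, hypothesis):
    """Word error rate: edit distance normalised by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    return edit_distance(ref, hyp) / len(ref)

def per(reference, hypothesis):
    """Position-independent error rate: compares the sentences as bags of words."""
    ref, hyp = reference.split(), hypothesis.split()
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matches) / len(ref)

reference = "the cat sat on the mat"
reordered = "on the mat the cat sat"
```

For the reordered hypothesis, `per` is zero while `wer` is not, which shows how the position-independent variant ignores word order.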

Text Categorization

Categorization systems take as input a large flow of data such as official documents, military casualty reports, market data, and newswires, and assign the items to predefined categories or indices. For example, the Carnegie Group’s Construe system (Hayes, 1991) [ 54 ] takes Reuters articles as input and saves much time by doing work that would otherwise be done by staff or human indexers. Some companies use categorization systems to categorize trouble tickets or complaint requests and route them to the appropriate desks. Another application of text categorization is email spam filtering. Spam filters are becoming important as the first line of defence against unwanted emails. The problem of false negatives and false positives in spam filters is at the heart of NLP technology, as it comes down to the challenge of extracting meaning from strings of text. A filtering solution applied to an email system uses a set of protocols to determine which incoming messages are spam and which are not. There are several types of spam filters available. Content filters: review the content within the message to determine whether it is spam or not. Header filters: review the email header looking for fake information. General blacklist filters: stop all emails from blacklisted senders. Rules-based filters: use user-defined criteria, such as stopping mail from a specific person or mail that includes a specific word. Permission filters: require anyone sending a message to be pre-approved by the recipient. Challenge-response filters: require anyone sending a message to enter a code to gain permission to send email.

Spam Filtering

Spam filtering works using text categorization, and in recent times various machine learning techniques have been applied to text categorization or anti-spam filtering, such as Rule Learning (Cohen, 1996) [ 27 ], Naïve Bayes (Sahami et al., 1998; Androutsopoulos et al., 2000; Rennie, 2000) [ 5 , 109 , 115 ], Memory-based Learning (Sakkis et al., 2000b) [ 117 ], Support Vector Machines (Drucker et al., 1999) [ 36 ], Decision Trees (Carreras and Marquez, 2001) [ 19 ], the Maximum Entropy Model (Berger et al., 1996) [ 14 ], and Hash Forest with a rule encoding method (T. Xia, 2020) [ 153 ], sometimes combining different learners (Sakkis et al., 2001) [ 116 ]. These approaches are preferable because the classifier is learned from training data rather than built by hand. Naïve Bayes is preferred because of its performance despite its simplicity (Lewis, 1998) [ 67 ]. In text categorization, two types of models have been used (McCallum and Nigam, 1998) [ 77 ]. Both models assume that a fixed vocabulary is present. In the first model, a document is generated by first choosing a subset of the vocabulary and then using the selected words any number of times, at least once, irrespective of order. This is called the multi-variate Bernoulli model. It captures which words are used in a document, irrespective of word counts and order. In the second model, a document is generated by choosing a set of word occurrences and arranging them in any order. This is called the multinomial model; in addition to what the multi-variate Bernoulli model captures, it also captures how many times a word is used in a document. Most text categorization approaches to anti-spam email filtering have used the multi-variate Bernoulli model (Androutsopoulos et al., 2000) [ 5 ] [ 15 ].
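The difference between the two document models can be made concrete with a short sketch. The first two functions build the Bernoulli (presence/absence) and multinomial (count) representations of a document over a fixed vocabulary; the remaining functions train a small multinomial Naïve Bayes classifier with add-one smoothing. All function names and the toy training data are hypothetical, and add-one smoothing is an assumption of this example.

```python
import math
from collections import Counter

def bernoulli_features(tokens, vocabulary):
    """1 if the word occurs in the document, else 0 (counts and order ignored)."""
    present = set(tokens)
    return {w: int(w in present) for w in vocabulary}

def multinomial_features(tokens, vocabulary):
    """Number of occurrences of each vocabulary word (order ignored)."""
    counts = Counter(tokens)
    return {w: counts[w] for w in vocabulary}

def train_multinomial_nb(docs):
    """docs: list of (tokens, label). Returns {label: (log_prior, log_likelihoods)}."""
    vocab = sorted({w for tokens, _ in docs for w in tokens})
    label_counts = Counter(label for _, label in docs)
    word_counts = {label: Counter() for label in label_counts}
    for tokens, label in docs:
        word_counts[label].update(tokens)
    model = {}
    for label, n_docs in label_counts.items():
        total = sum(word_counts[label].values())
        # Add-one (Laplace) smoothing avoids zero probabilities for unseen words.
        log_likelihood = {w: math.log((word_counts[label][w] + 1) / (total + len(vocab)))
                          for w in vocab}
        model[label] = (math.log(n_docs / len(docs)), log_likelihood)
    return model

def classify(model, tokens):
    """Pick the label with the highest log posterior; out-of-vocabulary words are skipped."""
    def score(label):
        log_prior, log_likelihood = model[label]
        return log_prior + sum(log_likelihood[w] for w in tokens if w in log_likelihood)
    return max(model, key=score)

training = [("win money now".split(), "spam"),
            ("win prize money money".split(), "spam"),
            ("meeting schedule today".split(), "ham"),
            ("project meeting notes".split(), "ham")]
model = train_multinomial_nb(training)
```

With this toy data, `classify(model, "win money".split())` selects the spam label because both words are far more frequent in the spam documents.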

Information Extraction

Information extraction is concerned with identifying phrases of interest in textual data. For many applications, extracting entities such as names, places, events, dates, times, and prices is a powerful way of summarizing the information relevant to a user’s needs. In the case of a domain-specific search engine, the automatic identification of important information can increase the accuracy and efficiency of a directed search. Hidden Markov models (HMMs) have been used to extract the relevant fields of research papers. These extracted text segments are used to allow searches over specific fields, to provide effective presentation of search results, and to match references to papers. A familiar example is the pop-up ads on websites that show, with discounts, items you recently looked at in an online store. In information retrieval, two types of models have been used (McCallum and Nigam, 1998) [ 77 ]. Both models assume that a fixed vocabulary is present. In the first model, a document is generated by first choosing a subset of the vocabulary and then using the selected words any number of times, at least once, without regard to order. This is called the multi-variate Bernoulli model. It captures which words are used in a document, irrespective of word counts and order. In the second model, a document is generated by choosing a set of word occurrences and arranging them in any order. This is called the multinomial model; in addition to what the multi-variate Bernoulli model captures, it also captures how many times a word is used in a document.

Knowledge discovery has become an important area of research in recent years. Knowledge discovery research uses a variety of techniques to extract useful information from source documents, such as: Parts of Speech (POS) tagging; chunking or shallow parsing; stop-word removal (stop words are frequently used words that must be removed before processing documents); stemming (mapping words to some base form; it has two methods, dictionary-based stemming and Porter-style stemming (Porter, 1980) [ 103 ]; the former has higher accuracy but a higher cost of implementation, while the latter has a lower implementation cost and is usually insufficient for IR); compound or statistical phrases (which index multi-token units instead of single tokens); and word sense disambiguation (the task of understanding the correct sense of a word in context; when used for information retrieval, terms are replaced by their senses in the document vector).
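Two of the preprocessing steps listed above, stop-word removal and stemming, can be sketched together. The stop-word list and the suffix list are hypothetical, and the stemmer is a deliberately crude suffix stripper written for this example; it is not the Porter algorithm [ 103 ], which applies several ordered rule phases.

```python
# Hypothetical stop-word list and suffix list, for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "are", "to", "in"}
SUFFIXES = ("ing", "ed", "es", "s")  # tried in this order

def simple_stem(word):
    """Strip the first matching suffix, keeping a stem of at least 3 characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word

def preprocess(text):
    """Lowercase, drop stop words, and stem the remaining tokens."""
    tokens = text.lower().split()
    return [simple_stem(t) for t in tokens if t not in STOP_WORDS]

terms = preprocess("The parsers are tagging words in documents")
```

The output contains stems such as "tagg", which is typical of rule-based stemmers: the result only needs to be consistent across word forms, not a dictionary word.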

The extracted information can be applied for a variety of purposes, for example to prepare a summary, to build databases, to identify keywords, or to classify text items according to some pre-defined categories. For example, CONSTRUE, which was developed for Reuters, is used in classifying news stories (Hayes, 1992) [ 54 ]. It has been suggested that while many IE systems can successfully extract terms from documents, acquiring relations between the terms is still a difficulty. PROMETHEE is a system that extracts lexico-syntactic patterns relative to a specific conceptual relation (Morin, 1999) [ 89 ]. IE systems should work at many levels, from word recognition to discourse analysis at the level of the complete document. The Blank Slate Language Processor (BSLP) approach (Bondale et al., 1999) [ 16 ] has been applied to the analysis of a real-life natural language corpus that consists of responses to open-ended questionnaires in the field of advertising.

There is a system called MITA (MetLife’s Intelligent Text Analyzer) (Glasgow et al., 1998) [ 48 ] that extracts information from life insurance applications. Ahonen et al. (1998) [ 1 ] suggested a mainstream framework for text mining that uses pragmatic and discourse-level analyses of text.


Summarization

Information overload is a real problem in this digital age, and our reach and access to knowledge and information already exceed our capacity to understand it. This trend is not slowing down, so the ability to summarize data while keeping the meaning intact is in high demand. This is important not just for recognizing and understanding the important information in a large set of data; it is also used to understand deeper emotional meanings. For example, a company can determine the general sentiment on social media about its latest product offering. This application is a valuable marketing asset.

The types of text summarization depend on the number of documents, and the two important categories are single-document summarization and multi-document summarization (Zajic et al. 2008 [ 159 ]; Fattah and Ren 2009 [ 43 ]). Summaries can also be of two types: generic or query-focused (Gong and Liu 2001 [ 50 ]; Dunlavy et al. 2007 [ 37 ]; Wan 2008 [ 144 ]; Ouyang et al. 2011 [ 99 ]). The summarization task can be either supervised or unsupervised (Mani and Maybury 1999 [ 74 ]; Fattah and Ren 2009 [ 43 ]; Riedhammer et al. 2010 [ 110 ]). Training data is required in a supervised system for selecting relevant material from the documents. A large amount of annotated data is needed for learning techniques. A few techniques are as follows:

Bayesian Sentence-based Topic Model (BSTM) uses both term-sentence and term-document associations for summarizing multiple documents (Wang et al. 2009 [ 146 ]).

Factorization with Given Bases (FGB) is a language model where sentence bases are the given bases, and it utilizes document-term and sentence-term matrices. This approach groups and summarizes the documents simultaneously (Wang et al. 2011 [ 147 ]).

Topic Aspect-Oriented Summarization (TAOS) is based on topic factors. These topic factors are various features that describe topics; for example, capitalized words are used to represent entities. Various topics can have various aspects, and various preferences of features are used to represent various aspects (Fang et al. 2015 [ 42 ]).

Dialogue System

Dialogue systems are very prominent in real-world applications, ranging from providing support to performing a particular action. Support dialogue systems require context awareness, whereas systems that perform an action do not require much context awareness. Earlier dialogue systems were focused on small applications such as home theater systems. These dialogue systems utilize the phonemic and lexical levels of language. Habitable dialogue systems offer the potential for fully automated dialogue systems by utilizing all levels of a language (Liddy, 2001) [ 68 ]. This leads to systems that enable robots to interact with humans in natural languages, such as Google’s Assistant, Microsoft’s Cortana, Apple’s Siri, and Amazon’s Alexa.

NLP is applied in the field of medicine as well. The Linguistic String Project-Medical Language Processor (LSP-MLP) is one of the large-scale NLP projects in the field of medicine [ 21 , 53 , 57 , 71 , 114 ]. The LSP-MLP helps enable physicians to extract and summarize information on signs or symptoms, drug dosage, and response data, with the aim of identifying possible side effects of any medicine while highlighting or flagging data items [ 114 ]. The National Library of Medicine is developing the Specialist System [ 78 , 79 , 80 , 82 , 84 ]. It is expected to function as an information extraction tool for biomedical knowledge bases, particularly Medline abstracts. The lexicon was created using MeSH (Medical Subject Headings), Dorland’s Illustrated Medical Dictionary, and general English dictionaries. The Centre d’Informatique Hospitaliere of the Hopital Cantonal de Geneve is working on an electronic archiving environment with NLP features [ 81 , 119 ]. In the first phase, patient records were archived. At a later stage, the LSP-MLP was adapted for French [ 10 , 72 , 94 , 113 ], and finally a proper NLP system called RECIT [ 9 , 11 , 17 , 106 ] was developed using a method called Proximity Processing [ 88 ]. Its task was to implement a robust and multilingual system able to analyze and comprehend medical sentences, and to preserve the knowledge in free text in a language-independent knowledge representation [ 107 , 108 ]. Columbia University in New York has developed an NLP system called MEDLEE (MEDical Language Extraction and Encoding System) that identifies clinical information in narrative reports and transforms the textual information into a structured representation [ 45 ].

3.3 NLP in talk

We next discuss some of the recent NLP projects implemented by various companies:

ACE Powered GDPR Robot Launched by RAVN Systems [ 134 ]

RAVN Systems, a leading expert in Artificial Intelligence (AI), Search and Knowledge Management Solutions, announced the launch of a software robot powered by RAVN ACE (“Applied Cognitive Engine”) to help and facilitate GDPR (“General Data Protection Regulation”) compliance. The Robot uses AI techniques to automatically analyze documents and other types of data in any business system that is subject to GDPR rules. It allows users to quickly and easily search, retrieve, flag, classify, and report on data deemed to be highly sensitive under GDPR. Users can also identify personal data in documents, view feeds on the latest personal data that requires attention, and produce reports on the data suggested to be deleted or secured. RAVN’s GDPR Robot is also able to speed up requests for information (Data Subject Access Requests - “DSAR”) in a simple and efficient way, removing the need for a manual approach to these requests, which tends to be very labor-intensive. Peter Wallqvist, CSO at RAVN Systems commented, “GDPR compliance is of universal paramountcy as it will be exploited by any organization that controls and processes data concerning EU citizens.”

Link: http://markets.financialcontent.com/stocks/news/read/33888795/RAVN_Systems_Launch_the_ACE_Powered_GDPR_Robot

Eno A Natural Language Chatbot Launched by Capital One [ 56 ]

Capital One announced a chatbot for customers called Eno. Eno is a natural language chatbot that people interact with through texting. Capital One claims that Eno is the first natural language SMS chatbot from a U.S. bank that allows customers to ask questions using natural language. Customers can interact with Eno through a text interface, asking questions about their savings and other matters. Eno creates an environment in which it feels as though one is interacting with a human. This provides a different platform from other brands, which launch chatbots on platforms like Facebook Messenger and Skype. Capital One believed that Facebook has too much access to a person’s private information, which could get them into trouble with the privacy laws that U.S. financial institutions work under. For example, a Facebook Page admin can access full transcripts of the bot’s conversations. If that were the case, the admins could easily view the personal banking information of customers, which is not acceptable.

Link: https://www.macobserver.com/analysis/capital-one-natural-language-chatbot-eno/

Future of BI in Natural Language Processing [ 140 ]

Several companies in the BI space are trying to keep up with the trend and working hard to ensure that data becomes friendlier and more easily accessible, but there is still a long way to go. BI will also become easier to access, as a GUI is not needed: nowadays queries are made by text or voice command on smartphones. One of the most common examples is that Google might tell you today what tomorrow’s weather will be. But soon enough, we will be able to ask our personal data chatbot about customer sentiment today, and how customers will feel about the brand next week, all while walking down the street. Today, NLP tends to be based on turning natural language into machine language. But as the technology matures, especially the AI component, the computer will get better at “understanding” the query and start to deliver answers rather than search results. Initially, the data chatbot will probably be asked a question such as ‘how have revenues changed over the last three quarters?’ and then return pages of data for you to analyze. But once it learns the semantic relations and inferences of the question, it will be able to automatically perform the filtering and formulation necessary to provide an intelligible answer, rather than simply showing you data.

Link: http://www.smartdatacollective.com/eran-levy/489410/here-s-why-natural-language-processing-future-bi

Using Natural Language Processing and Network Analysis to Develop a Conceptual Framework for Medication Therapy Management Research [ 97 ]

The article “Using Natural Language Processing and Network Analysis to Develop a Conceptual Framework for Medication Therapy Management Research” describes a theory derivation process that is used to develop a conceptual framework for medication therapy management (MTM) research. The MTM service model and the chronic care model are selected as parent theories. Review article abstracts targeting medication therapy management in chronic disease care were retrieved from Ovid Medline (2000–2016). Unique concepts in each abstract are extracted using MetaMap, and their pair-wise co-occurrences are determined. This information is then used to construct a network graph of concept co-occurrence, which is further analyzed to identify content for the new conceptual model. In total, 142 abstracts are analyzed. Medication adherence is the most studied drug therapy problem and co-occurred with concepts related to patient-centered interventions targeting self-management. The enhanced model consists of 65 concepts clustered into 14 constructs. The framework requires additional refinement and evaluation to determine its relevance and applicability across a broad audience, including underserved settings.
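The pair-wise co-occurrence step can be sketched as follows, assuming the concepts of each abstract have already been extracted (the extraction itself, done with MetaMap in the study, is not shown). The function name and the sample concepts are hypothetical. Each counted pair corresponds to a weighted edge in the concept co-occurrence network.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(abstract_concepts):
    """Count, for every unordered concept pair, the number of abstracts containing both.

    abstract_concepts: list of concept lists, one list per abstract.
    Returns a Counter keyed by alphabetically ordered (concept_a, concept_b) tuples.
    """
    edges = Counter()
    for concepts in abstract_concepts:
        # Use unique concepts per abstract; sorting gives each pair one canonical key.
        for a, b in combinations(sorted(set(concepts)), 2):
            edges[(a, b)] += 1
    return edges

# Hypothetical extracted concepts for three abstracts.
abstracts = [
    ["adherence", "self-management", "adherence"],
    ["adherence", "self-management", "pharmacist"],
    ["pharmacist", "adherence"],
]
edges = cooccurrence_counts(abstracts)
```

The resulting counts can be loaded into any graph library as edge weights for further network analysis.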

Link: https://www.ncbi.nlm.nih.gov/pubmed/28269895?dopt=Abstract

Meet the Pilot, world’s first language translating earbuds [ 96 ]

The world’s first smart earpiece, Pilot, will soon be able to translate over 15 languages. According to Springwise, Waverly Labs’ Pilot can already translate five spoken languages (English, French, Italian, Portuguese, and Spanish) and seven additional written languages (German, Hindi, Russian, Japanese, Arabic, Korean, and Mandarin Chinese). The Pilot earpiece is connected via Bluetooth to the Pilot speech translation app, which uses speech recognition, machine translation, machine learning, and speech synthesis technology. The user simultaneously hears the translated version of the speech on the second earpiece. Moreover, the conversation need not take place between only two people; other users can join in and discuss as a group. As of now, the user may experience a lag of a few seconds between the speech and its translation, which Waverly Labs is working to reduce. The Pilot earpiece will be available from September but can be pre-ordered now for $249. The earpieces can also be used for streaming music, answering voice calls, and getting audio notifications.

Link: https://www.indiegogo.com/projects/meet-the-pilot-smart-earpiece-language-translator-headphones-travel#/

4 Datasets in NLP and state-of-the-art models

The objective of this section is to present the various datasets used in NLP and some state-of-the-art models in NLP.

4.1 Datasets in NLP

A corpus is a collection of linguistic data, either compiled from written texts or transcribed from recorded speech. Corpora are intended primarily for testing linguistic hypotheses, e.g., to determine how a certain sound, word, or syntactic construction is used across a culture or language. There are various types of corpus. In an annotated corpus, the implicit information in the plain text has been made explicit by specific annotations. An un-annotated corpus contains plain text in its raw state. Different languages can be compared using a reference corpus. Monitor corpora are non-finite collections of texts which are mostly used in lexicography. A multilingual corpus contains small collections of monolingual corpora based on the same sampling procedure and categories for different languages. A parallel corpus contains texts in one language and their translations into other languages, aligned sentence by sentence or phrase by phrase. A reference corpus contains text of spoken (formal and informal) and written (formal and informal) language which represents various social and situational contexts. A speech corpus contains recorded speech, transcriptions of the recordings, and the time at which each word occurred in the recorded speech. There are various datasets available for natural language processing; some of these are listed below for different use cases:

Sentiment Analysis: Sentiment analysis is a rapidly expanding area of natural language processing (NLP) used in a variety of fields such as politics and business. Widely used datasets for sentiment analysis are:

Stanford Sentiment Treebank (SST): Socher et al. introduced SST containing sentiment labels for 215,154 phrases in parse trees for 11,855 sentences from movie reviews posing novel sentiment compositional difficulties [ 127 ].

Sentiment140: It contains 1.6 million tweets annotated with negative, neutral and positive labels.

Paper Reviews: It provides reviews of computing and informatics conferences written in English and Spanish languages. It has 405 reviews which are evaluated on a 5-point scale ranging from very negative to very positive.

IMDB: For natural language processing, text analytics, and sentiment analysis, this dataset offers thousands of movie reviews split into training and test sets. It was introduced by Maas et al. in 2011 [ 73 ].

G. Rama Rohit Reddy of the Language Technologies Research Centre, KCIS, IIIT Hyderabad, generated the corpus “Sentiraama.” The corpus is divided into four datasets, each of which is annotated with a two-value scale that distinguishes between positive and negative sentiment at the document level. The corpus contains data from a variety of fields, including book reviews, product reviews, movie reviews, and song lyrics. The annotators meticulously followed the annotation technique for each of them. The folder “Song Lyrics” in the corpus contains 339 Telugu song lyrics written in Telugu script [ 121 ].

Language Modelling: Language models analyse text data to calculate word probabilities. They use an algorithm to interpret the data, which establishes rules for context in natural language. The model then uses these rules to predict or construct new sentences. In essence, the model learns the basic characteristics and features of language and then applies them to new phrases. Widely used datasets for language modelling are as follows:

Salesforce’s WikiText-103 dataset has 103 million tokens collected from 28,475 featured articles from Wikipedia.

WikiText-2 is a scaled-down version of WikiText-103. It contains 2 million tokens with a vocabulary size of 33,278.

The Penn Treebank portion of the Wall Street Journal corpus includes 929,000 tokens for training, 73,000 tokens for validation, and 82,000 tokens for testing. Its context is limited since it comprises sentences rather than paragraphs [ 76 ].

The Ministry of Electronics and Information Technology’s Technology Development Programme for Indian Languages (TDIL) launched its own data distribution portal ( www.tdil-dc.in ) which has cataloged datasets [ 24 ].
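
To make the idea of calculating word probabilities concrete, the following is a minimal sketch of a bigram language model in Python. It is not one of the models evaluated on the datasets above; the function names (train_bigram_model, predict_next), the whitespace tokenization, and the toy corpus are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Estimate P(next word | previous word) from raw sentences by counting."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # "<s>" and "</s>" are hypothetical sentence boundary markers.
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    model = {}
    for prev, followers in counts.items():
        total = sum(followers.values())
        model[prev] = {word: n / total for word, n in followers.items()}
    return model


def predict_next(model, word):
    """Return the most probable follower of `word`, or None if it was never seen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return max(followers, key=followers.get)


# Toy corpus, invented for this example only.
toy_corpus = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "the dog sat on the rug",
]
bigram_model = train_bigram_model(toy_corpus)
print(predict_next(bigram_model, "the"))  # "cat" follows "the" most often here
```

A model of this kind only looks one word back; the neural models discussed later in this section condition on much longer contexts.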

Machine Translation: The task of converting the text of one natural language into another language while keeping the sense of the input text is known as machine translation. Widely used datasets are as follows:

Tatoeba is a collection of multilingual sentence pairings. A tab-delimited pair of an English text sequence and the translated French text sequence appears on each line of the dataset. Each text sequence might be as simple as a single sentence or as complex as a paragraph of many sentences.

The Europarl parallel corpus is derived from the European Parliament’s proceedings. It is available in 21 European languages [ 40 ].

WMT14 provides machine translation pairs for English-German and English-French. These datasets comprise 4.5 million and 35 million sentence pairs, respectively. Byte-Pair Encoding with 32K operations is used to encode the phrases.

IWSLT 14 contains around 160,000 sentence pairs. The dataset covers the English-German (En-De) and German-English (De-En) language directions. The IWSLT 13 dataset contains around 200K training sentence pairs.

The IIT Bombay English-Hindi corpus comprises parallel corpora for English-Hindi as well as monolingual Hindi corpora gathered from several existing sources and corpora generated over time at IIT Bombay’s Centre for Indian Language Technology.

Question Answering System: Question answering systems provide real-time responses which are widely used in customer care services. The datasets used for dialogue system/question answering system are as follows:

Stanford Question Answering Dataset (SQuAD): It is a reading comprehension dataset made up of questions posed by crowd workers on a collection of Wikipedia articles.

Natural Questions: It is a large-scale corpus presented by Google used for training and assessing open-domain question answering systems. It includes 300,000 naturally occurring queries as well as human-annotated responses from Wikipedia pages for use in QA system training.

Question Answering in Context (QuAC): This dataset is used to describe, comprehend, and participate in information seeking conversation. In this dataset, instances are made up of an interactive discussion between two crowd workers: a student who asks a series of open-ended questions about an unknown Wikipedia text, and a teacher who responds by offering brief extracts from the text.

Neural learning models are overtaking traditional models for NLP [ 64 , 127 ]. In [ 64 ], the authors used a CNN (Convolutional Neural Network) model for sentiment analysis of movie reviews and achieved 81.5% accuracy. The results illustrate that the CNN was an appropriate replacement for state-of-the-art methods. The authors of [ 127 ] combined SST with a Recursive Neural Tensor Network for sentiment analysis of single sentences. This model improves sentence classification accuracy by 5.4% compared to traditional NLP models. The authors of [ 135 ] proposed a combined Recurrent Neural Network and Transformer model for sentiment analysis. This hybrid model was tested on three different datasets (Twitter US Airline Sentiment, IMDB, and Sentiment140) and achieved F1 scores of 91%, 93%, and 90%, respectively, outperforming the state-of-the-art methods.

Santoro et al. [ 118 ] introduced a relational recurrent neural network with the capacity to learn to classify information and perform complex reasoning based on the interactions between compartmentalized information. They used a relational memory core (RMC) to handle such interactions. The model was tested for language modeling on three different datasets (GigaWord, Project Gutenberg, and WikiText-103). Further, they compared the performance of their model with traditional approaches to relational reasoning on compartmentalized information. The results achieved with the RMC show improved performance.

Merity et al. [ 86 ] extended conventional word-level language models based on Quasi-Recurrent Neural Network and LSTM to handle the granularity at character and word level. They tuned the parameters for character-level modeling using the Penn Treebank dataset and for word-level modeling using WikiText-103. In both cases, their model outperformed the state-of-the-art methods.

Luong et al. [ 70 ] applied neural machine translation to the WMT14 dataset, translating English text into French. The model demonstrated a significant improvement of up to 2.8 BLEU (bilingual evaluation understudy) points compared to various neural machine translation systems. It also outperformed the commonly used MT system on the WMT14 dataset.

Fan et al. [ 41 ] introduced a gradient-based neural architecture search algorithm that automatically finds architectures with better performance than the Transformer and conventional NMT models. They tested their model on WMT14 (English-German translation), IWSLT14 (German-English translation), and WMT18 (Finnish-to-English translation) and achieved 30.1, 36.1, and 26.4 BLEU points, respectively, which shows better performance than the Transformer baselines.

Wiese et al. [ 150 ] introduced a deep learning approach based on domain adaptation techniques for handling biomedical question answering tasks. Their model achieved state-of-the-art performance on biomedical question answering, outperforming the previous state-of-the-art methods in this domain.

Seunghak et al. [ 158 ] designed a Memory-Augmented-Machine-Comprehension-Network (MAMCN) to handle dependencies faced in reading comprehension. The model achieved state-of-the-art performance at the document level using the TriviaQA and QUASAR-T datasets, and at the paragraph level using the SQuAD dataset.

Xie et al. [ 154 ] proposed a neural architecture where candidate answers and their representation learning are constituent centric, guided by a parse tree. Under this architecture, the search space of candidate answers is reduced while preserving the hierarchical, syntactic, and compositional structure among constituents. Using SQuAD, the model delivers state-of-the-art performance.

4.2 State-of-the-art models in NLP

The rationalist or symbolic approach assumes that a crucial part of the knowledge in the human mind is not derived from the senses but is fixed in advance, probably by genetic inheritance. Noam Chomsky was the strongest advocate of this approach. It was believed that machines can be made to function like the human brain by giving them some fundamental knowledge and a reasoning mechanism; linguistic knowledge is directly encoded in rules or other forms of representation. This helps the automatic processing of natural languages [ 92 ]. Statistical and machine learning approaches entail the development of algorithms that allow a program to infer patterns. In the learning phase, an iterative process optimizes the numerical parameters that characterize a given algorithm’s underlying model with respect to a numerical measure. Machine-learning models can be predominantly categorized as either generative or discriminative. Generative methods create rich models of probability distributions, because of which they can generate synthetic data. Discriminative methods are more functional: they directly estimate posterior probabilities based on observations. Srihari [ 129 ] illustrates the difference with the task of identifying an unknown speaker’s language: a generative approach would apply deep knowledge of numerous languages to perform the match, whereas a discriminative approach relies on a less knowledge-intensive method that uses the distinctions between languages. Generative models can become troublesome when many features are used, whereas discriminative models allow the use of more features [ 38 ]. Examples of discriminative methods are logistic regression and conditional random fields (CRFs); examples of generative methods are Naive Bayes classifiers and hidden Markov models (HMMs).

Naive Bayes Classifiers

Naive Bayes is a probabilistic algorithm based on probability theory and Bayes’ Theorem that predicts the tag of a text such as a news item or customer review. It calculates the probability of each tag for the given text and returns the tag with the highest probability. Bayes’ Theorem is used to predict the probability of a feature based on prior knowledge of conditions that might be related to that feature. Naive Bayes classifiers are applied in NLP to usual tasks such as segmentation and translation, but they have also been explored in unusual areas such as segmentation for infant learning and distinguishing documents that express opinions from those that state facts. Anggraeni et al. (2019) [ 61 ] used ML and AI to create a question-and-answer system for retrieving information about hearing loss. They developed I-Chat Bot, which understands the user input, provides an appropriate response, and produces a model that can be used to search for the required information about hearing impairments. The problem with Naive Bayes is that we may end up with zero probabilities when the test data for a certain class contain words that are not present in the training data.
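
The tagging procedure and the zero-probability problem described above can be sketched as follows. This is a minimal illustration, not the method of any cited work: the class name NaiveBayesTextClassifier, the whitespace tokenization, and the toy reviews are assumptions, and add-one (Laplace) smoothing is used here as one common remedy for zero probabilities that the text itself does not name.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes over whitespace tokens with additive smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # smoothing strength (assumption)
        self.class_doc_counts = Counter()        # number of documents per tag
        self.word_counts = defaultdict(Counter)  # word frequencies per tag
        self.vocabulary = set()

    def fit(self, documents, labels):
        for text, label in zip(documents, labels):
            self.class_doc_counts[label] += 1
            for word in text.lower().split():
                self.word_counts[label][word] += 1
                self.vocabulary.add(word)
        return self

    def log_scores(self, text):
        """Return log P(tag) + sum of log P(word | tag) for every tag."""
        total_docs = sum(self.class_doc_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for label, doc_count in self.class_doc_counts.items():
            score = math.log(doc_count / total_docs)
            class_total = sum(self.word_counts[label].values())
            for word in text.lower().split():
                if word not in self.vocabulary:
                    continue  # words never seen in training carry no evidence
                count = self.word_counts[label][word]
                # Without alpha, a word unseen for this tag would give log(0).
                score += math.log(
                    (count + self.alpha) / (class_total + self.alpha * vocab_size)
                )
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Toy training data, invented for this example only.
train_texts = [
    "great movie loved it",
    "wonderful acting great plot",
    "terrible movie hated it",
    "boring plot awful acting",
]
train_labels = ["positive", "positive", "negative", "negative"]
classifier = NaiveBayesTextClassifier().fit(train_texts, train_labels)
print(classifier.predict("great acting"))
```

Working in log space avoids numerical underflow when many small probabilities are multiplied; the tag with the highest log score is also the tag with the highest probability.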

Hidden Markov Model (HMM)

An HMM is a system that shifts between several states, generating a feasible output symbol with each switch. The sets of viable states and unique symbols may be large, but they are finite and known. We can observe the outputs, but the system’s internals are hidden. Some of the problems that can be solved are the following. Inference: given a certain sequence of output symbols, compute the probabilities of one or more candidate state sequences. Pattern matching: find the state-switch sequence most likely to have generated a particular output-symbol sequence. Training: given output-symbol chain data, estimate the state-switch/output probabilities that fit this data best.
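
As an illustration of finding the state sequence most likely to have generated a given output-symbol sequence, here is a minimal sketch of the Viterbi algorithm, the standard dynamic-programming method for that problem. The text above does not name the algorithm, and the two-state tagging model with its probabilities is a hypothetical toy, not taken from any cited work.

```python
def viterbi(observations, states, start_prob, trans_prob, emit_prob):
    """Return the most likely hidden state path and its probability."""
    # best[t][s]: probability of the best path that ends in state s at time t
    best = [{s: start_prob[s] * emit_prob[s][observations[0]] for s in states}]
    backpointer = [{}]
    for t in range(1, len(observations)):
        best.append({})
        backpointer.append({})
        for s in states:
            # Choose the predecessor state that maximizes the path probability.
            prev_state, prob = max(
                ((p, best[t - 1][p] * trans_prob[p][s]) for p in states),
                key=lambda item: item[1],
            )
            best[t][s] = prob * emit_prob[s][observations[t]]
            backpointer[t][s] = prev_state
    # Trace the best path backwards from the most probable final state.
    last_state = max(best[-1], key=best[-1].get)
    path = [last_state]
    for t in range(len(observations) - 1, 0, -1):
        path.append(backpointer[t][path[-1]])
    path.reverse()
    return path, best[-1][last_state]


# Hypothetical toy model: hidden part-of-speech states emitting words.
states = ("Noun", "Verb")
start_prob = {"Noun": 0.8, "Verb": 0.2}
trans_prob = {
    "Noun": {"Noun": 0.3, "Verb": 0.7},
    "Verb": {"Noun": 0.6, "Verb": 0.4},
}
emit_prob = {
    "Noun": {"dogs": 0.7, "bark": 0.3},
    "Verb": {"dogs": 0.1, "bark": 0.9},
}
best_path, best_prob = viterbi(["dogs", "bark"], states, start_prob, trans_prob, emit_prob)
print(best_path)  # ['Noun', 'Verb']
```

For long sequences, implementations usually sum log probabilities instead of multiplying raw probabilities, to avoid underflow.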

Hidden Markov Models are extensively used for speech recognition, where the output sequence is matched to the sequence of individual phonemes. HMMs are not restricted to this application; they have several others, such as bioinformatics problems, for example, multiple sequence alignment [ 128 ]. Sonnhammer mentioned that Pfam holds multiple alignments and hidden Markov model-based profiles (HMM-profiles) of entire protein domains. The determination of domain boundaries, family members and alignment is done semi-automatically, based on expert knowledge, sequence similarity, other protein family databases and the capability of HMM-profiles to correctly identify and align the members. HMMs may be used for a variety of NLP applications, including word prediction, sentence production, quality assurance, and intrusion detection systems [ 133 ].

Neural Network

Earlier, machine learning techniques such as Naïve Bayes and HMM were mainly used for NLP, but by the end of 2010 neural networks had transformed and enhanced NLP tasks by learning multilevel features. A major use of neural networks in NLP is word embedding, where words are represented in the form of vectors. These vectors can be used to recognize similar words by observing their closeness in the vector space. Other uses of neural networks are observed in information retrieval, text summarization, text classification, machine translation, sentiment analysis and speech recognition. Initially the focus was on feedforward [ 49 ] and CNN (convolutional neural network) architectures [ 69 ], but later researchers adopted recurrent neural networks to capture the context of a word with respect to the surrounding words of a sentence. LSTM (Long Short-Term Memory), a variant of RNN, is used in various tasks such as word prediction and sentence topic prediction [ 47 ]. To observe the word arrangement in both the forward and backward directions, researchers have explored bi-directional LSTM [ 59 ]. In machine translation, an encoder-decoder architecture is used where the dimensionality of the input and output vectors is not known. Neural networks can be used to anticipate a state that has not yet been seen, such as future states for which predictors exist, whereas an HMM predicts hidden states.
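
The notion of recognizing similar words by their closeness in a vector space can be sketched with cosine similarity, one common closeness measure (the text above does not specify which measure is meant). The three-dimensional vectors below are hand-made, hypothetical values for illustration; real embeddings are learned from data and have far more dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word, embeddings):
    """Return the other word whose vector is closest to that of `word`."""
    return max(
        (other for other in embeddings if other != word),
        key=lambda other: cosine_similarity(embeddings[word], embeddings[other]),
    )


# Hypothetical toy embeddings, not produced by any trained model.
toy_embeddings = {
    "king": [0.90, 0.80, 0.10],
    "queen": [0.85, 0.90, 0.15],
    "banana": [0.10, 0.05, 0.95],
}
print(most_similar("king", toy_embeddings))  # queen
```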

Bi-directional Encoder Representations from Transformers (BERT) is a model pre-trained on unlabeled text from BookCorpus and English Wikipedia. It can be fine-tuned to capture context for various NLP tasks such as question answering, sentiment analysis, text classification, sentence embedding, and interpreting ambiguity in text [ 25 , 33 , 90 , 148 ]. Earlier language models examine the text in only one direction, which suits sentence generation by predicting the next word, whereas the BERT model examines the text in both directions simultaneously for better language understanding. BERT provides a contextual embedding for each word present in the text, unlike context-free models (word2vec and GloVe). For example, in the sentences “he is going to the river bank for a walk” and “he is going to the bank to withdraw some money”, word2vec will have one vector representation for “bank” in both sentences, whereas BERT will have a different vector representation for “bank” in each. Muller et al. [ 90 ] used the BERT model to analyze tweets with COVID-19 content. The use of the BERT model in the legal domain was explored by Chalkidis et al. [ 20 ].

BERT considers at most 512 tokens, so a longer text sequence must be divided into multiple shorter sequences of up to 512 tokens each. This is a limitation of BERT: it cannot handle long text sequences directly.
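
A minimal sketch of such a division is shown below. The function name split_into_chunks and the optional stride parameter, which makes consecutive chunks overlap so that context at chunk boundaries is not lost, are assumptions for illustration and are not described in the text. In practice, special tokens added by the model also count toward the 512-token limit, so the usable chunk length is slightly smaller.

```python
def split_into_chunks(tokens, max_len=512, stride=0):
    """Split a token list into chunks of at most `max_len` tokens.

    `stride` is the number of tokens shared by consecutive chunks
    (0 means the chunks do not overlap).
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    if not 0 <= stride < max_len:
        raise ValueError("stride must satisfy 0 <= stride < max_len")
    step = max_len - stride
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break  # the last chunk already reaches the end of the text
    return chunks


# Integers stand in for token ids of a hypothetical 1,200-token document.
document_tokens = list(range(1200))
chunks = split_into_chunks(document_tokens)
print([len(chunk) for chunk in chunks])  # [512, 512, 176]
```

Each chunk is then processed separately, and the per-chunk outputs have to be combined by some task-specific rule, for example by averaging the predictions.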

5 Evaluation metrics and challenges

The objective of this section is to discuss evaluation metrics used to evaluate the model’s performance and involved challenges.

5.1 Evaluation metrics

Since the number of labels in most classification problems is fixed, it is easy to determine the score for each class and, as a result, the loss with respect to the ground truth. In image generation problems, the output resolution and ground truth are both fixed. As a result, we can calculate the loss at the pixel level using the ground truth. In NLP, however, although the output format is predetermined, the output dimensions cannot be specified, because a single statement can be expressed in multiple ways without changing its intent and meaning. Evaluation metrics are important for evaluating a model’s performance, particularly when we are trying to solve two problems with one model.

BLEU (BiLingual Evaluation Understudy) Score: Each word in the output sentence is scored 1 if it appears in any of the reference sentences and 0 if it does not. The number of words that appear in one of the reference translations is then divided by the total number of words in the output sentence, which normalizes the count so that it always lies between 0 and 1. For example, suppose the ground truth is “He is playing chess in the backyard” and the output sentences are S1: “He is playing tennis in the backyard”, S2: “He is playing badminton in the backyard”, S3: “He is playing movie in the backyard” and S4: “backyard backyard backyard backyard backyard backyard backyard”. The scores of S1, S2 and S3 would be 6/7, 6/7 and 6/7. All three sentences receive the same score even though the information in S1 and S3 is not the same. This is because BLEU assumes that all words in a sentence contribute equally to its meaning, which is not the case in real-world scenarios. Using a combination of uni-grams, bi-grams and higher-order n-grams, we can capture the word order of a sentence. We may also limit how many times each word is counted, based on how many times it appears in each reference sentence, which prevents excessive repetition (as in S4) from being rewarded.
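
The word-matching score and the count limit described above can be reproduced with a short sketch. It computes only the unigram precision component, with and without clipping; full BLEU additionally combines several n-gram orders and applies a brevity penalty. The function name unigram_precision, the lowercasing, and the whitespace tokenization are assumptions for illustration.

```python
from collections import Counter


def unigram_precision(candidate, references, clip=True):
    """Fraction of candidate words that also occur in a reference sentence.

    With clip=True, each word is credited at most as many times as it
    occurs in any single reference. Assumes at least one reference.
    """
    cand_tokens = candidate.lower().split()
    if not cand_tokens:
        return 0.0
    cand_counts = Counter(cand_tokens)
    ref_counts = [Counter(ref.lower().split()) for ref in references]
    matched = 0
    for word, count in cand_counts.items():
        max_in_refs = max(rc[word] for rc in ref_counts)
        if clip:
            matched += min(count, max_in_refs)
        elif max_in_refs > 0:
            matched += count
    return matched / len(cand_tokens)


reference = ["He is playing chess in the backyard"]
s1 = "He is playing tennis in the backyard"
s4 = "backyard backyard backyard backyard backyard backyard backyard"

print(unigram_precision(s1, reference))              # 6/7
print(unigram_precision(s4, reference, clip=False))  # 7/7: repetition is rewarded
print(unigram_precision(s4, reference, clip=True))   # 1/7: repetition is capped
```

Without clipping, S4 obtains a perfect score although it is meaningless; with clipping, only one occurrence of “backyard” is credited.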

GLUE (General Language Understanding Evaluation) score: Previously, NLP models were almost always built to perform well on a single task. Various models such as LSTM and Bi-LSTM were trained solely for one task and very rarely generalized to other tasks; a model built for named entity recognition, for instance, could rarely be reused for textual entailment. GLUE is a set of datasets for training, assessing, and comparing NLP models. It includes nine diverse task datasets designed to test a model’s language understanding. To obtain a comprehensive assessment of a model’s performance, GLUE tests the model on a variety of tasks rather than a single one. Single-sentence tasks, similarity and paraphrase tasks, and inference tasks are among them. For example, in sentiment analysis of customer reviews, we might be interested in analyzing ambiguous reviews and determining which product the client is referring to in the reviews. Thus, the model obtains a good “knowledge” of language in general after some generalized pre-training. When the time comes to test a model on a given task, this universal “knowledge” gives us an advantage. With GLUE, researchers can evaluate their model and score it on all nine tasks. The model’s final performance score is the average of those nine scores. It makes little difference how the model looks or works as long as it can analyze inputs and predict outcomes for all the tasks.

Keeping these metrics in mind helps to evaluate the performance of an NLP model for a particular task or a variety of tasks.

5.2 Challenges

The applications of NLP have been growing day by day, and with them new challenges keep arising despite a lot of work done in the recent past. Some of the common challenges are as follows. Contextual words and phrases: the same words and phrases can have different meanings in different sentences, which is easy for humans to understand but challenging for machines. A similar challenge arises with synonyms, because humans use many different words to express the same idea; moreover, words of different degrees such as large, huge, and big may be used by different people, which makes it challenging to process the language and to design algorithms that accommodate all these variations. Homonyms, words that are pronounced the same but have different definitions, are also problematic for question answering and speech-to-text applications because the input is not in written form. Sentences using sarcasm and irony may be understood by humans in the opposite of their literal sense, so designing models to deal with such sentences is a really challenging task in NLP. Furthermore, sentences that are ambiguous, in the sense that they can be interpreted in more than one way, are also an area to work upon where more accuracy can be achieved. Language containing informal phrases, expressions, idioms, and culture-specific lingo makes it difficult to design models intended for broad use. Having a lot of data for training and updating models on a regular basis may improve them, but it remains a really challenging task to deal with words that have different meanings in different geographic areas. In fact, such issues also occur across domains: the meaning of words or sentences in the education industry may differ from their meaning in health, law, defense, etc.
So, NLP models may work well for an individual domain or geographic area, but for broad use such challenges need to be tackled. Further, together with the above-mentioned challenges, misspelled or misused words can also create a problem. Although autocorrect and grammar correction applications have improved a lot due to continuous developments in this direction, predicting the intention of the writer, that too for a specific domain and geographic area, while considering sarcasm, expressions, informal phrases, etc., is really a big challenge. There is no doubt that NLP models for the most widely used languages have been doing very well and are improving day by day, but there is still a need for models that serve all people rather than only those with specific knowledge of a particular language and technology. One may further refer to the work of Sharifirad and Matwin (2019) [ 123 ] for classification of different online harassment categories and challenges, Baclic et al. (2020) [ 6 ] and Wong et al. (2018) [ 151 ] for challenges and opportunities in public health, Kang et al. (2020) [ 63 ] for a detailed literature survey and technological challenges relevant to management research and NLP, and a recent review by Alshemali and Kalita (2020) [ 3 ], and the references cited therein.

In the recent past, models dealing with Visual Commonsense Reasoning [ 31 ] and NLP have also been getting the attention of several researchers, and this seems a promising and challenging area to work upon. These models try to extract information from an image or video using a visual reasoning paradigm, in the way that humans can infer from a given image or video things beyond what is visually obvious, such as objects’ functions, people’s intents, and mental states. In this direction, Wen and Peng (2020) [ 149 ] recently suggested a model to capture knowledge from different perspectives and perceive common sense in advance; the results of their experiments on the visual commonsense reasoning dataset VCR seem very satisfactory and effective. The work of Peng and Chi (2019) [ 102 ], which proposes a Domain Adaptation with Scene Graph approach to transfer knowledge from the source domain with the objective of improving cross-media retrieval in the target domain, and that of Yen et al. (2019) [ 155 ] are also very useful for further exploring the use of NLP in its relevant domains.

6 Conclusion

This paper is written with three objectives. The first is to give insights into the important terminologies of NLP and NLG, which can be useful for readers interested in starting their career in NLP and in work relevant to its applications. The second objective focuses on the history, applications, and recent developments in the field of NLP. The third objective is to discuss datasets, approaches and evaluation metrics used in NLP. The relevant work in the existing literature, with its findings, and some of the important applications and projects in NLP are also discussed in the paper. The last two objectives may serve as a literature survey for readers already working in NLP and relevant fields, and can further provide motivation to explore the fields mentioned in this paper. It should be noted that even though a great amount of work on natural language processing is available in literature surveys (one may refer to [ 15 , 32 , 63 , 98 , 133 , 151 ], each focusing on one domain such as the usage of deep-learning techniques in NLP, techniques used for email spam filtering, medication safety, management research, intrusion detection, and the Gujarati language), there is still not much work on regional languages, which can be the focus of future research.

Change history

25 July 2022

Affiliation 3 has been added into the online PDF.

References

Ahonen H, Heinonen O, Klemettinen M, Verkamo AI (1998) Applying data mining techniques for descriptive phrase extraction in digital document collections. In research and technology advances in digital libraries, 1998. ADL 98. Proceedings. IEEE international forum on (pp. 2-11). IEEE

Alshawi H (1992) The core language engine. MIT press

Alshemali B, Kalita J (2020) Improving the reliability of deep neural networks in NLP: A review. Knowl-Based Syst 191:105210

Andreev ND (1967) The intermediary language as the focal point of machine translation. In: Booth AD (ed) Machine translation. North Holland Publishing Company, Amsterdam, pp 3–27

Androutsopoulos I, Paliouras G, Karkaletsis V, Sakkis G, Spyropoulos CD, Stamatopoulos P (2000) Learning to filter spam e-mail: A comparison of a naive bayesian and a memory-based approach. arXiv preprint cs/0009009

Baclic O, Tunis M, Young K, Doan C, Swerdfeger H, Schonfeld J (2020) Artificial intelligence in public health: challenges and opportunities for public health made possible by advances in natural language processing. Can Commun Dis Rep 46(6):161

Bahdanau D, Cho K, Bengio Y (2015) Neural machine translation by jointly learning to align and translate. In ICLR 2015

Bangalore S, Rambow O, Whittaker S (2000) Evaluation metrics for generation. In proceedings of the first international conference on natural language generation-volume 14 (pp. 1-8). Assoc Comput Linguist

Baud RH, Rassinoux AM, Scherrer JR (1991) Knowledge representation of discharge summaries. In AIME 91 (pp. 173–182). Springer, Berlin Heidelberg

Baud RH, Rassinoux AM, Scherrer JR (1992) Natural language processing and semantical representation of medical texts. Methods Inf Med 31(2):117–125

Baud RH, Alpay L, Lovis C (1994) Let’s meet the users with natural language understanding. Knowledge and Decisions in Health Telematics: The Next Decade 12:103

Bengio Y, Ducharme R, Vincent P (2001) A neural probabilistic language model. Proceedings of NIPS

Benson E, Haghighi A, Barzilay R (2011) Event discovery in social media feeds. In proceedings of the 49th annual meeting of the Association for Computational Linguistics: human language technologies-volume 1 (pp. 389-398). Assoc Comput Linguist

Berger AL, Della Pietra SA, Della Pietra VJ (1996) A maximum entropy approach to natural language processing. Computational Linguistics 22(1):39–71

Blanzieri E, Bryl A (2008) A survey of learning-based techniques of email spam filtering. Artif Intell Rev 29(1):63–92

Bondale N, Maloor P, Vaidyanathan A, Sengupta S, Rao PV (1999) Extraction of information from open-ended questionnaires using natural language processing techniques. Computer Science and Informatics 29(2):15–22

Borst F, Sager N, Nhàn NT, Su Y, Lyman M, Tick LJ, ..., Scherrer JR (1989) Analyse automatique de comptes rendus d'hospitalisation. In Degoulet P, Stephan JC, Venot A, Yvon PJ, rédacteurs. Informatique et Santé, Informatique et Gestion des Unités de Soins, Comptes Rendus du Colloque AIM-IF, Paris (pp. 246–56).

Briscoe EJ, Grover C, Boguraev B, Carroll J (1987) A formalism and environment for the development of a large grammar of English. IJCAI 87:703–708

Carreras X, Marquez L (2001) Boosting trees for anti-spam email filtering. arXiv preprint cs/0109015

Chalkidis I, Fergadiotis M, Malakasiotis P, Aletras N, Androutsopoulos I (2020) LEGAL-BERT: the muppets straight out of law school. arXiv preprint arXiv:2010.02559

Chi EC, Lyman MS, Sager N, Friedman C, Macleod C (1985) A database of computer-structured narrative: methods of computing complex relations. In proceedings of the annual symposium on computer application in medical care (p. 221). Am Med Inform Assoc

Cho K, Van Merriënboer B, Bahdanau D, Bengio Y (2014) On the properties of neural machine translation: encoder-decoder approaches. arXiv preprint arXiv:1409.1259

Chomsky N (1965) Aspects of the theory of syntax. MIT Press, Cambridge, Massachusetts

Choudhary N (2021) LDC-IL: the Indian repository of resources for language technology. Lang Resources & Evaluation 55:855–867. https://doi.org/10.1007/s10579-020-09523-3

Chouikhi H, Chniter H, Jarray F (2021) Arabic sentiment analysis using BERT model. In international conference on computational collective intelligence (pp. 621-632). Springer, Cham

Chung J, Gulcehre C, Cho K, Bengio Y (2014) Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555

Cohen WW (1996) Learning rules that classify e-mail. In AAAI spring symposium on machine learning in information access (Vol. 18, p. 25)

Cohen PR, Morgan J, Ramsay AM (2002) Intention in communication, Am J Psychol 104(4)

Collobert R, Weston J (2008) A unified architecture for natural language processing. In proceedings of the 25th international conference on machine learning (pp. 160–167)

Dai Z, Yang Z, Yang Y, Carbonell J, Le QV, Salakhutdinov R (2019) Transformer-xl: attentive language models beyond a fixed-length context. arXiv preprint arXiv:1901.02860

Davis E, Marcus G (2015) Commonsense reasoning and commonsense knowledge in artificial intelligence. Commun ACM 58(9):92–103

Desai NP, Dabhi VK (2022) Resources and components for Gujarati NLP systems: a survey. Artif Intell Rev:1–19

Devlin J, Chang MW, Lee K, Toutanova K (2018) Bert: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805

Diab M, Hacioglu K, Jurafsky D (2004) Automatic tagging of Arabic text: From raw text to base phrase chunks. In Proceedings of HLT-NAACL 2004: Short papers (pp. 149–152). Assoc Computat Linguist

Doddington G (2002) Automatic evaluation of machine translation quality using n-gram co-occurrence statistics. In proceedings of the second international conference on human language technology research (pp. 138-145). Morgan Kaufmann publishers Inc

Drucker H, Wu D, Vapnik VN (1999) Support vector machines for spam categorization. IEEE Trans Neural Netw 10(5):1048–1054

Dunlavy DM, O’Leary DP, Conroy JM, Schlesinger JD (2007) QCS: A system for querying, clustering and summarizing documents. Inf Process Manag 43(6):1588–1605

Elkan C (2008) Log-linear models and conditional random fields. http://cseweb.ucsd.edu/welkan/250B/cikmtutorial.pdf. Accessed 28 Jun 2017

Emele MC, Dorna M (1998) Ambiguity preserving machine translation using packed representations. In proceedings of the 36th annual meeting of the Association for Computational Linguistics and 17th international conference on computational linguistics-volume 1 (pp. 365-371). Association for Computational Linguistics

Europarl: A Parallel Corpus for Statistical Machine Translation (2005) Philipp Koehn , MT Summit 2005

Fan Y, Tian F, Xia Y, Qin T, Li XY, Liu TY (2020) Searching better architectures for neural machine translation. IEEE/ACM Transactions on Audio, Speech, and Language Processing 28:1574–1585

Fang H, Lu W, Wu F, Zhang Y, Shang X, Shao J, Zhuang Y (2015) Topic aspect-oriented summarization via group selection. Neurocomputing 149:1613–1619

Fattah MA, Ren F (2009) GA, MR, FFNN, PNN and GMM based models for automatic text summarization. Comput Speech Lang 23(1):126–144

Feldman S (1999) NLP meets the jabberwocky: natural language processing in information retrieval. Online-Weston Then Wilton 23:62–73

Friedman C, Cimino JJ, Johnson SB (1993) A conceptual model for clinical radiology reports. In proceedings of the annual symposium on computer application in medical care (p. 829). Am Med Inform Assoc

Gao T, Dontcheva M, Adar E, Liu Z, Karahalios K DataTone: managing ambiguity in natural language interfaces for data visualization, UIST ‘15: proceedings of the 28th annual ACM symposium on User Interface Software & Technology, November 2015, 489–500, https://doi.org/10.1145/2807442.2807478

Ghosh S, Vinyals O, Strope B, Roy S, Dean T, Heck L (2016) Contextual lstm (clstm) models for large scale nlp tasks. arXiv preprint arXiv:1602.06291

Glasgow B, Mandell A, Binney D, Ghemri L, Fisher D (1998) MITA: an information-extraction approach to the analysis of free-form text in life insurance applications. AI Mag 19(1):59

Goldberg Y (2017) Neural network methods for natural language processing. Synthesis lectures on human language technologies 10(1):1–309

Gong Y, Liu X (2001) Generic text summarization using relevance measure and latent semantic analysis. In proceedings of the 24th annual international ACM SIGIR conference on research and development in information retrieval (pp. 19-25). ACM

Green Jr, BF, Wolf AK, Chomsky C, Laughery K (1961) Baseball: an automatic question-answerer. In papers presented at the may 9-11, 1961, western joint IRE-AIEE-ACM computer conference (pp. 219-224). ACM

Greff K, Srivastava RK, Koutník J, Steunebrink BR, Schmidhuber J (2016) LSTM: A search space odyssey. IEEE transactions on neural networks and learning systems 28(10):2222–2232

Article   MathSciNet   Google Scholar  

Grishman R, Sager N, Raze C, Bookchin B (1973) The linguistic string parser. In proceedings of the June 4-8, 1973, national computer conference and exposition (pp. 427-434). ACM

Hayes PJ (1992) Intelligent high-volume text processing using shallow, domain-specific techniques. Text-based intelligent systems: current research and practice in information extraction and retrieval, 227-242.

Hendrix GG, Sacerdoti ED, Sagalowicz D, Slocum J (1978) Developing a natural language interface to complex data. ACM Transactions on Database Systems (TODS) 3(2):105–147

"Here’s Why Natural Language Processing is the Future of BI (2017) " SmartData Collective. N.p., n.d. Web. 19

Hirschman L, Grishman R, Sager N (1976) From text to structured information: automatic processing of medical reports. In proceedings of the June 7-10, 1976, national computer conference and exposition (pp. 267-275). ACM

Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780

Huang Z, Xu W, Yu K (2015) Bidirectional LSTM-CRF models for sequence tagging. arXiv preprint arXiv:1508.01991

Hutchins WJ (1986) Machine translation: past, present, future (p. 66). Ellis Horwood, Chichester

Jurafsky D, Martin J (2008) H. Speech and language processing. 2nd edn. Prentice-Hall, Englewood Cliffs, NJ

Kamp H, Reyle U (1993) Tense and aspect. In from discourse to logic (pp. 483-689). Springer Netherlands

Kang Y, Cai Z, Tan CW, Huang Q, Liu H (2020) Natural language processing (NLP) in management research: A literature review. Journal of Management Analytics 7(2):139–172

Kim Y. (2014) Convolutional neural networks for sentence classification. arXiv preprint arXiv:1408.5882

Knight K, Langkilde I (2000) Preserving ambiguities in generation via automata intersection. In AAAI/IAAI (pp. 697-702)

Lass R (1998) Phonology: An Introduction to Basic Concepts. Cambridge, UK; New York; Melbourne, Australia: Cambridge University Press. p. 1. ISBN 978–0–521-23728-4. Retrieved 8 January 2011Paperback ISBN 0–521–28183-0

Lewis DD (1998) Naive (Bayes) at forty: The independence assumption in information retrieval. In European conference on machine learning (pp. 4–15). Springer, Berlin Heidelberg

Liddy ED (2001). Natural language processing

Lopez MM, Kalita J (2017) Deep learning applied to NLP. arXiv preprint arXiv:1703.03091

Luong MT, Sutskever I, Le Q V, Vinyals O, Zaremba W (2014) Addressing the rare word problem in neural machine translation. arXiv preprint arXiv:1410.8206

Lyman M, Sager N, Friedman C, Chi E (1985) Computer-structured narrative in ambulatory care: its use in longitudinal review of clinical data. In proceedings of the annual symposium on computer application in medical care (p. 82). Am Med Inform Assoc

Lyman M, Sager N, Chi EC, Tick LJ, Nhan NT, Su Y, ..., Scherrer, J. (1989) Medical Language Processing for Knowledge Representation and Retrievals. In Proceedings. Symposium on Computer Applications in Medical Care (pp. 548–553). Am Med Inform Assoc

Maas A, Daly RE, Pham PT, Huang D, Ng AY, Potts C (2011) Learning word vectors for sentiment analysis. In proceedings of the 49th annual meeting of the association for computational linguistics: human language technologies (pp. 142-150)

Mani I, Maybury MT (eds) (1999) Advances in automatic text summarization, vol 293. MIT press, Cambridge, MA

Manning CD, Schütze H (1999) Foundations of statistical natural language processing, vol 999. MIT press, Cambridge

MATH   Google Scholar  

Marcus MP, Marcinkiewicz MA, Santorini B (1993) Building a large annotated corpus of english: the penn treebank. Comput Linguist 19(2):313–330

McCallum A, Nigam K (1998) A comparison of event models for naive bayes text classification. In AAAI-98 workshop on learning for text categorization (Vol. 752, pp. 41-48)

McCray AT (1991) Natural language processing for intelligent information retrieval. In Engineering in Medicine and Biology Society, 1991. Vol. 13: 1991., Proceedings of the Annual International Conference of the IEEE (pp. 1160–1161). IEEE

McCray AT (1991) Extending a natural language parser with UMLS knowledge. In proceedings of the annual symposium on computer application in medical care (p. 194). Am Med Inform Assoc

McCray AT, Nelson SJ (1995) The representation of meaning in the UMLS. Methods Inf Med 34(1–2):193–201

McCray AT, Razi A (1994) The UMLS knowledge source server. Medinfo MedInfo 8:144–147

McCray AT, Srinivasan S, Browne AC (1994) Lexical methods for managing variation in biomedical terminologies. In proceedings of the annual symposium on computer application in medical care (p. 235). Am Med Inform Assoc

McDonald R, Crammer K, Pereira F (2005) Flexible text segmentation with structured multilabel classification. In proceedings of the conference on human language technology and empirical methods in natural language processing (pp. 987-994). Assoc Comput Linguist

McGray AT, Sponsler JL, Brylawski B, Browne AC (1987) The role of lexical knowledge in biomedical text understanding. In proceedings of the annual symposium on computer application in medical care (p. 103). Am Med Inform Assoc

McKeown KR (1985) Text generation. Cambridge University Press, Cambridge

Book   Google Scholar  

Merity S, Keskar NS, Socher R (2018) An analysis of neural language modeling at multiple scales. arXiv preprint arXiv:1803.08240

Mikolov T, Chen K, Corrado G., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems

Morel-Guillemaz AM, Baud RH, Scherrer JR (1990) Proximity processing of medical text. In medical informatics Europe’90 (pp. 625–630). Springer, Berlin Heidelberg

Morin E (1999) Automatic acquisition of semantic relations between terms from technical corpora. In proc. of the fifth international congress on terminology and knowledge engineering-TKE’99

Müller M, Salathé M, Kummervold PE (2020) Covid-twitter-bert: A natural language processing model to analyse covid-19 content on twitter. arXiv preprint arXiv:2005.07503

"Natural Language Processing (2017) " Natural Language Processing RSS. N.p., n.d. Web. 25

"Natural Language Processing" (2017) Natural Language Processing RSS. N.p., n.d. Web. 23

Newatia R (2019) https://medium.com/saarthi-ai/sentence-classification-using-convolutional-neural-networks-ddad72c7048c . Accessed 15 Dec 2021

Nhàn NT, Sager N, Lyman M, Tick LJ, Borst F, Su Y (1989) A medical language processor for two indo-European languages. In proceedings. Symposium on computer applications in medical care (pp. 554-558). Am Med Inform Assoc

Nießen S, Och FJ, Leusch G, Ney H (2000) An evaluation tool for machine translation: fast evaluation for MT research. In LREC

Ochoa, A. (2016). Meet the Pilot: Smart Earpiece Language Translator. https://www.indiegogo.com/projects/meet-the-pilot-smart-earpiece-language-translator-headphones-travel . Accessed April 10, 2017

Ogallo, W., & Kanter, A. S. (2017). Using natural language processing and network analysis to develop a conceptual framework for medication therapy management research. https://www.ncbi.nlm.nih.gov/pubmed/28269895?dopt=Abstract . Accessed April 10, 2017

Otter DW, Medina JR, Kalita JK (2020) A survey of the usages of deep learning for natural language processing. IEEE Transactions on Neural Networks and Learning Systems 32(2):604–624

Ouyang Y, Li W, Li S, Lu Q (2011) Applying regression models to query-focused multi-document summarization. Inf Process Manag 47(2):227–237

Palmer M, Gildea D, Kingsbury P (2005) The proposition bank: an annotated corpus of semantic roles. Computational linguistics 31(1):71–106

Papineni K, Roukos S, Ward T, Zhu WJ (2002) BLEU: a method for automatic evaluation of machine translation. In proceedings of the 40th annual meeting on association for computational linguistics (pp. 311-318). Assoc Comput Linguist

Peng Y, Chi J (2019) Unsupervised cross-media retrieval using domain adaptation with scene graph. IEEE Transactions on Circuits and Systems for Video Technology 30(11):4368–4379

Porter MF (1980) An algorithm for suffix stripping. Program 14(3):130–137

Rae JW, Potapenko A, Jayakumar SM, Lillicrap TP, (2019) Compressive transformers for long-range sequence modelling. arXiv preprint arXiv:1911.05507

Ranjan P, Basu HVSSA (2003) Part of speech tagging and local word grouping techniques for natural language parsing in Hindi. In Proceedings of the 1st International Conference on Natural Language Processing (ICON 2003)

Rassinoux AM, Baud RH, Scherrer JR (1992) Conceptual graphs model extension for knowledge representation of medical texts. MEDINFO 92:1368–1374

Rassinoux AM, Michel PA, Juge C, Baud R, Scherrer JR (1994) Natural language processing of medical texts within the HELIOS environment. Comput Methods Prog Biomed 45:S79–S96

Rassinoux AM, Juge C, Michel PA, Baud RH, Lemaitre D, Jean FC, Scherrer JR (1995) Analysis of medical jargon: The RECIT system. In Conference on Artificial Intelligence in Medicine in Europe (pp. 42–52). Springer, Berlin Heidelberg

Rennie J (2000) ifile: An application of machine learning to e-mail filtering. In Proc. KDD 2000 Workshop on text mining, Boston, MA

Riedhammer K, Favre B, Hakkani-Tür D (2010) Long story short–global unsupervised models for keyphrase based meeting summarization. Speech Comm 52(10):801–815

Ritter A, Clark S, Etzioni O (2011) Named entity recognition in tweets: an experimental study. In proceedings of the conference on empirical methods in natural language processing (pp. 1524-1534). Assoc Comput Linguist

Rospocher M, van Erp M, Vossen P, Fokkens A, Aldabe I, Rigau G, Soroa A, Ploeger T, Bogaard T(2016) Building event-centric knowledge graphs from news. Web Semantics: Science, Services and Agents on the World Wide Web, In Press

Sager N, Lyman M, Tick LJ, Borst F, Nhan NT, Revillard C, … Scherrer JR (1989) Adapting a medical language processor from English to French. Medinfo 89:795–799

Sager N, Lyman M, Nhan NT, Tick LJ (1995) Medical language processing: applications to patient data representation and automatic encoding. Methods Inf Med 34(1–2):140–146

Sahami M, Dumais S, Heckerman D, Horvitz E (1998) A Bayesian approach to filtering junk e-mail. In learning for text categorization: papers from the 1998 workshop (Vol. 62, pp. 98-105)

Sakkis G, Androutsopoulos I, Paliouras G, Karkaletsis V, Spyropoulos CD, Stamatopoulos P (2001) Stacking classifiers for anti-spam filtering of e-mail. arXiv preprint cs/0106040

Sakkis G, Androutsopoulos I, Paliouras G et al (2003) A memory-based approach to anti-spam filtering for mailing lists. Inf Retr 6:49–73. https://doi.org/10.1023/A:1022948414856

Santoro A, Faulkner R, Raposo D, Rae J, Chrzanowski M, Weber T, ..., Lillicrap T (2018) Relational recurrent neural networks. Adv Neural Inf Proces Syst, 31

Scherrer JR, Revillard C, Borst F, Berthoud M, Lovis C (1994) Medical office automation integrated into the distributed architecture of a hospital information system. Methods Inf Med 33(2):174–179

Seal D, Roy UK, Basak R (2020) Sentence-level emotion detection from text based on semantic rules. In: Tuba M, Akashe S, Joshi A (eds) Information and communication Technology for Sustainable Development. Advances in intelligent Systems and computing, vol 933. Springer, Singapore. https://doi.org/10.1007/978-981-13-7166-0_42

Chapter   Google Scholar  

Sentiraama Corpus by Gangula Rama Rohit Reddy, Radhika Mamidi. Language Technologies Research Centre, KCIS, IIIT Hyderabad (n.d.) ltrc.iiit.ac.in/showfile.php?filename=downloads/sentiraama/

Sha F, Pereira F (2003) Shallow parsing with conditional random fields. In proceedings of the 2003 conference of the north American chapter of the Association for Computational Linguistics on human language technology-volume 1 (pp. 134-141). Assoc Comput Linguist

Sharifirad S, Matwin S, (2019) When a tweet is actually sexist. A more comprehensive classification of different online harassment categories and the challenges in NLP. arXiv preprint arXiv:1902.10584

Sharma S, Srinivas PYKL, Balabantaray RC (2016) Emotion Detection using Online Machine Learning Method and TLBO on Mixed Script. In Proceedings of Language Resources and Evaluation Conference 2016 (pp. 47–51)

Shemtov H (1997) Ambiguity management in natural language generation. Stanford University

Small SL, Cortell GW, Tanenhaus MK (1988) Lexical Ambiguity Resolutions. Morgan Kauffman, San Mateo, CA

Socher R, Perelygin A, Wu J, Chuang J, Manning CD, Ng AY, Potts C (2013) Recursive deep models for semantic compositionality over a sentiment treebank. In proceedings of the 2013 conference on empirical methods in natural language processing (pp. 1631-1642)

Sonnhammer EL, Eddy SR, Birney E, Bateman A, Durbin R (1998) Pfam: multiple sequence alignments and HMM-profiles of protein domains. Nucleic Acids Res 26(1):320–322

Srihari S (2010) Machine Learning: Generative and Discriminative Models. http://www.cedar.buffalo.edu/wsrihari/CSE574/Discriminative-Generative.pdf . accessed 31 May 2017.]

Sun X, Morency LP, Okanohara D, Tsujii JI (2008) Modeling latent-dynamic in shallow parsing: a latent conditional model with improved inference. In proceedings of the 22nd international conference on computational linguistics-volume 1 (pp. 841-848). Assoc Comput Linguist

Sundheim BM, Chinchor NA (1993) Survey of the message understanding conferences. In proceedings of the workshop on human language technology (pp. 56-60). Assoc Comput Linguist

Sutskever I, Vinyals O, Le QV (2014) Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems

Sworna ZT, Mousavi Z, Babar MA (2022) NLP methods in host-based intrusion detection Systems: A systematic review and future directions. arXiv preprint arXiv:2201.08066

Systems RAVN (2017) "RAVN Systems Launch the ACE Powered GDPR Robot - Artificial Intelligence to Expedite GDPR Compliance." Stock Market. PR Newswire, n.d. Web. 19

Tan KL, Lee CP, Anbananthen KSM, Lim KM (2022) RoBERTa-LSTM: A hybrid model for sentiment analysis with transformers and recurrent neural network. IEEE Access, RoBERTa-LSTM: A Hybrid Model for Sentiment Analysis With Transformer and Recurrent Neural Network

Tapaswi N, Jain S (2012) Treebank based deep grammar acquisition and part-of-speech tagging for Sanskrit sentences. In software engineering (CONSEG), 2012 CSI sixth international conference on (pp. 1-4). IEEE

Thomas C (2019)  https://towardsdatascience.com/recurrent-neural-networks-and-natural-language-processing-73af640c2aa1 . Accessed 15 Dec 2021

Tillmann C, Vogel S, Ney H, Zubiaga A, Sawaf H (1997) Accelerated DP based search for statistical translation. In Eurospeech

Umber A, Bajwa I (2011) “Minimizing ambiguity in natural language software requirements specification,” in Sixth Int Conf Digit Inf Manag, pp. 102–107

"Using Natural Language Processing and Network Analysis to Develop a Conceptual Framework for Medication Therapy Management Research (2017) " AMIA ... Annual Symposium proceedings. AMIA Symposium. U.S. National Library of Medicine, n.d. Web. 19

Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I, (2017) Attention is all you need. In advances in neural information processing systems (pp. 5998-6008)

Wahlster W, Kobsa A (1989) User models in dialog systems. In user models in dialog systems (pp. 4–34). Springer Berlin Heidelberg, User Models in Dialog Systems

Walton D (1996) A pragmatic synthesis. In: fallacies arising from ambiguity. Applied logic series, vol 1. Springer, Dordrecht)

Wan X (2008) Using only cross-document relationships for both generic and topic-focused multi-document summarizations. Inf Retr 11(1):25–49

Wang W, Gang J, 2018 Application of convolutional neural network in natural language processing. In 2018 international conference on information Systems and computer aided education (ICISCAE) (pp. 64-70). IEEE

Wang D, Zhu S, Li T, Gong Y (2009) Multi-document summarization using sentence-based topic models. In proceedings of the ACL-IJCNLP 2009 conference short papers (pp. 297-300). Assoc Comput Linguist

Wang D, Zhu S, Li T, Chi Y, Gong Y (2011) Integrating document clustering and multidocument summarization. ACM Transactions on Knowledge Discovery from Data (TKDD) 5(3):14–26

Wang Z, Ng P, Ma X, Nallapati R, Xiang B (2019) Multi-passage bert: A globally normalized bert model for open-domain question answering. arXiv preprint arXiv:1908.08167

Wen Z, Peng Y (2020) Multi-level knowledge injecting for visual commonsense reasoning. IEEE Transactions on Circuits and Systems for Video Technology 31(3):1042–1054

Wiese G, Weissenborn D, Neves M (2017) Neural domain adaptation for biomedical question answering. arXiv preprint arXiv:1706.03610

Wong A, Plasek JM, Montecalvo SP, Zhou L (2018) Natural language processing and its implications for the future of medication safety: a narrative review of recent advances and challenges. Pharmacotherapy: The Journal of Human Pharmacology and Drug Therapy 38(8):822–841

Woods WA (1978) Semantics and quantification in natural language question answering. Adv Comput 17:1–87

Xia T (2020) A constant time complexity spam detection algorithm for boosting throughput on rule-based filtering Systems. IEEE Access 8:82653–82661. https://doi.org/10.1109/ACCESS.2020.2991328

Xie P, Xing E (2017) A constituent-centric neural architecture for reading comprehension. In proceedings of the 55th annual meeting of the Association for Computational Linguistics (volume 1: long papers) (pp. 1405-1414)

Yan X, Ye Y, Mao Y, Yu H (2019) Shared-private information bottleneck method for cross-modal clustering. IEEE Access 7:36045–36056

Yi J, Nasukawa T, Bunescu R, Niblack W (2003) Sentiment analyzer: extracting sentiments about a given topic using natural language processing techniques. In data mining, 2003. ICDM 2003. Third IEEE international conference on (pp. 427-434). IEEE

Young SJ, Chase LL (1998) Speech recognition evaluation: a review of the US CSR and LVCSR programmes. Comput Speech Lang 12(4):263–279

Yu S, et al. (2018) "A multi-stage memory augmented neural network for machine reading comprehension." Proceedings of the workshop on machine reading for question answering

Zajic DM, Dorr BJ, Lin J (2008) Single-document and multi-document summarization techniques for email threads using sentence compression. Inf Process Manag 44(4):1600–1610

Zeroual I, Lakhouaja A, Belahbib R (2017) Towards a standard part of speech tagset for the Arabic language. J King Saud Univ Comput Inf Sci 29(2):171–178

Download references


The authors would like to express their gratitude to the Research Mentors from CL Educate: Accendere Knowledge Management Services Pvt. Ltd. for their comments on earlier versions of the manuscript, although any errors are our own and should not tarnish the reputations of these esteemed persons. We would also like to thank the Editor, Associate Editor, and anonymous referees for their constructive suggestions, which led to many improvements on an earlier version of this manuscript.

Author information

Authors and affiliations.

Department of Computer Science, Manav Rachna International Institute of Research and Studies, Faridabad, India

Diksha Khurana & Aditya Koli

Department of Computer Science, BML Munjal University, Gurgaon, India

Kiran Khatter

Department of Statistics, Amity University Punjab, Mohali, India

Sukhdev Singh

Corresponding author

Correspondence to Kiran Khatter.

Ethics declarations

Conflict of interest.

The first draft of this paper was written under the supervision of Dr. Kiran Khatter and Dr. Sukhdev Singh, associated with CL- Educate: Accendere Knowledge Management Services Pvt. Ltd. and deputed at the Manav Rachna International University. The draft is also available on arxiv.org at https://arxiv.org/abs/1708.05148

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Khurana, D., Koli, A., Khatter, K. et al. Natural language processing: state of the art, current trends and challenges. Multimed Tools Appl 82, 3713–3744 (2023). https://doi.org/10.1007/s11042-022-13428-4

Received: 03 February 2021

Revised: 23 March 2022

Accepted: 02 July 2022

Published: 14 July 2022

Issue Date: January 2023

DOI: https://doi.org/10.1007/s11042-022-13428-4

  • Natural language processing
  • Natural language understanding
  • Natural language generation
  • NLP applications
  • NLP evaluation metrics


100+ Compelling Linguistics Research Topics for University Students

Linguistics Research Topics

Struggling to pick an interesting linguistics research topic? Browse this article, choose one of the linguistics research paper topics below, and boost your grades.

Linguistics is the science of language: it establishes facts about language through rational, systematic study. Its theories have developed alongside teaching and learning methodology, and the history of the field reveals two perspectives on language: the prescriptive view and the descriptive view.

Linguistic analysis covers the following areas: phonetics, phonology, syntax, morphology, semantics, and pragmatics. Studying linguistics also introduces you to the many aspects of language and to the methods used to study them.

How To Choose the Right Linguistics Research Topics?

Working under stress tends to hurt academic performance and grades, and a linguistics research paper is no exception. The right strategy makes your effort more effective, and it starts with selecting an interesting topic in linguistics.

You can also take advantage of research paper help and discuss your concerns with professional writers. To choose the right linguistics research topic, keep the following points in mind:

Find your interest: Linguistics covers many aspects of language learning and lets you broaden your thinking. Explore the subject in depth and find the area that interests you; it will make your academic writing more engaging.

Brainstorm ideas: Picking an interesting linguistics topic draws on your knowledge and expertise, so brainstorm and collect a range of ideas before settling on one.

Do thorough research: To score high marks, you need sufficient knowledge. Research the field carefully to uncover well-grounded ideas for your research paper topic in linguistics.

Interesting Topics in Linguistics

Linguistics is the foundation of language knowledge, and linguistic theories are closely tied to learning the English language. If you want to boost your grades, your choice of research paper topic makes a big difference. Some interesting linguistics research topics are:

  • Explain the significance of music in the evolution of language.
  • Does age really impact English pronunciation?
  • What is the role of sociolinguistics education in creating discipline?
  • What is the significance of language in creating teaching methodology?
  • Analysis of verbal and written communication based on language usage.
  • Is it important to have expertise in several languages?
  • Explain the issues related to receptive language disorder and its impact on brain development.
  • How do you correlate sentence-making and word flow in linguistics?
  • Compare the English and French languages.
  • Factors responsible for different spoken languages.
  • The impact of slang on the development of languages.
  • Is text messaging creating a revolutionary subculture in the new linguistic scenario?
  • How are linguistic patterns helpful in locating migration routes?
  • What factors affect the ability to learn a language?
  • Explain the role of language in building a national identity for developing a multicultural society.
  • Digital revolution: the impact of computers on modern language.
  • A systematic review of vowel pronunciation in American schools.
  • Significance of language in creating cross-cultural communities: A comprehensive review
  • Elucidate the impact of language on one’s perception.
  • Textual and Linguistic analysis for housing studies.

Stimulating Research Paper Topics In Sociolinguistics

While looking for linguistics research topics for your assignments or research paper, you may find sociolinguistics interesting to explore. Sociolinguistics examines the impact of language on society. If you want to explore the effect of language on different aspects of society, including cultural values and expectations, an in-depth analysis of sociolinguistics is the place to start.

To build a good foundation in sociolinguistics, you can select from the following linguistics paper topics:

  • How would you define linguistic practices in specific communities?
  • A detailed approach to code-switching and code-mixing.
  • Explain the impact of dialect on gender.
  • A correlational study of the relationship between language, social class, and cognition.
  • In-depth study of interactional sociolinguistics in the 21st century.
  • A comprehensive analysis of the accountability and aptness of dialect.
  • Evaluate the education of language in the U.S.
  • The role of languages in controlling emotions.
  • Effectiveness of verbal communication in expressing one’s feelings: A competitive analysis.
  • A literature review on communication with a precise comparison of verbal and non-verbal communication
  • Difference between advanced placement (AP) English literature and language.
  • What is the relationship between language and one’s personality?
  • A critical analysis of the relationship between language and ethnicity.
  • Describe the attitudes to various languages among societies.
  • A comprehensive approach to dialect variation among speakers of American English.
  • Scrutinize linguistic variation on language loyalty.
  • Develop a good understanding of sociological variations in languages.
  • Impact of the generation gap on language usage.
  • Examine the impact of various factors (social tension, media, racism, and entertainment) on the utilization of languages.
  • Is there a difference in linguistic practices between men and women?

Interesting Research Topics in Applied Linguistics

Are you looking for linguistics research topics to advance your learning? In that case, consider applied linguistics, the branch of linguistics concerned with the practical applications of language studies, such as speech therapy and language teaching.

In other words, applied linguistics offers solutions to real-life, language-related problems. Important academic areas that draw on applied linguistics include psychology, education, sociology, communication research, and anthropology. Some applied linguistics research paper topics are:

  • Discuss the expansion of learning a second language through reading.
  • Discuss the critical period hypothesis for second language acquisition.
  • Impact of bilingualism on an individual’s personality.
  • A linguistic evaluation of the difference between written and spoken language.
  • Describe language cognition and perceptions in a learning process.
  • Impact of language barriers on healthcare delivery.
  • Detailed analysis of various methodologies to learn applied linguistics.
  • Discuss the relationship between empathy and language proficiency in adult language learners.
  • Detailed analysis of multilingualism and multiculturalism.
  • Impact of extended instructions on the use of the passive voice, modals, and relative clauses: A critical analysis.
  • Explain digitally-mediated collaborative writing for ESL students.
  • How do we evaluate self-efficacy in students with low-level English proficiency?
  • Elucidate the significance of phrasal verbs in creating technical documents.
  • Expectations of American students while taking Japanese language classes.
  • A detailed study on American deaf students in English as a Non-Native Language (ENNL) classes.
  • What do you understand by modeling music with grammars?
  • The cognitive development of expertise as an ESL teacher: An insightful analysis.
  • Sound Effects: Gender, Age, and Sound symbolism in American English.
  • Importance of applied linguistics in today’s digital world.

Interesting Research Topics in Semantics

Semantics, also called semiotics or semasiology, is the study of reference, meaning, and truth. A comprehensive analysis of semantics covers both compositional semantics and lexical semantics. Compositional semantics deals with how words combine and interact to form larger expressions such as sentences, whereas lexical semantics deals with the meaning of individual words.

Some academic disciplines in linguistic semantics are conceptual semantics, cognitive semantics, formal semantics, computational semantics, and more. Linguistic research paper topics on Semantics are as follows:

  • Examine how meaning works in language interpretation and analysis.
  • A critical evaluation of language acquisition and language use.
  • Challenges in the study of semantic and pragmatic theory.
  • Discuss semantics lessons and paragraph structure in written language.
  • How do you explain the semantic richness effects in the recognition of visual words?
  • How richness of semantics affects the processing of a language.
  • Semantic generation to action-related stimuli: A neuroanatomical evaluation of embodied cognition.
  • Examine how blind children use phonological and tactual coding when reading Braille.
  • Explain a semantic typology of gradable predicates.
  • A comparison of blind and sighted children’s memory performance: the reverse-generation effect.
  • Clinical research for designing medical decision support systems.
  • Discuss word recognition processes in blind and sighted children.
  • A corpus-based study on argumentative indicators.
  • The typology of modality in modern West Iranian languages.
  • A critical analysis of changes in naming and semantic abilities in different age groups.
  • Explain the multidimensional semantics of evaluative adverbs.
  • A comprehensive analysis of procedural meaning: problems and perspectives.
  • Cross-cultural and cross-linguistic perspectives on figurative language.
  • Elucidate semantic and pragmatic problems in discourse and dialogue.

Topics For Linguistics Essays

Curiosity about the various concepts in linguistics often leads to essay work. Putting your thoughts into a linguistics essay helps you understand the structure of human languages and how they change. If you are searching for the best topics in linguistics, go through the following list of essay topics:

  • Difference between human language and artificial language.
  • Classification of writing systems based on various stages of development.
  • The laws of language development
  • Culture and language: impact on reflections.
  • Methodology of reading and writing for children by Albert James.
  • Significance of phoneme and phonological matters
  • The complexity of human language: the specific cases of the apes
  • Explain the development of languages and derivational morphology.
  • Detailed analysis of language extinction.
  • Investigate the peculiarities of English-Chinese and Chinese-English translations.
  • A comprehensive overview of the acquisition of English as a second language by Mid-Eastern students.
  • Discuss semiology in language analysis.
  • Impact of blogging on learning languages.
  • Linguistics: grammar and language teaching.
  • English Language: Explain its standard and non-standard types.
  • Discuss speech community as linguistic anthropology.
  • A systematic review of linguistic diversity in modern culture.
  • Similarities and differences between language and logic.
  • What is the impact of language on digital communication?
  • Listening comprehension: a comparative analysis of the articles.

Computational Linguistics Research Topics

Computational linguistics applies the techniques of computer science to the analysis and synthesis of language and speech. This branch of linguistics studies the computational modeling of natural language and computational approaches to answering linguistic questions.

Computational linguistics draws on fields such as artificial intelligence, mathematics, computer science, cognitive science, neuroscience, and anthropology. More interesting computational linguistics research topics are:

  • Explain the factors measuring the performance of speech recognition.
  • Discuss word sense disambiguation.
  • Detailed analysis of dependency parsing based on graphs and transitions.
  • A multidimensional analysis of linguistic dimensions.
  • Analyze Medieval German poetry through supervised learning.
  • Extraction of Danish verbs.
  • Analysis of a schizophrenia text dataset.
  • An intra-lingual contrastive corpus analysis based on computational linguistics.
  • Discuss various methods to introduce, create, and conclude a text.

Still Confused? Select Compelling Linguistics Research Topics With Our Writers!

Are you still stressed about picking the right linguistics research paper topic? Without the right ideas in mind, it is hard to start your research work. But don’t worry anymore. Our professional Ph.D. writers will help you make the appropriate selection for your linguistics assignments. Get our online paper help and receive customized solutions for your research papers.

By Alex Brown

I'm an ambitious, seasoned, and versatile author. I am experienced in proposing, outlining, and writing engaging assignments. Developing contagious academic work is always my top priority. I have a keen eye for detail and diligence in producing exceptional academic writing work. I work hard daily to help students with their assignments and projects. Experimenting with creative writing styles while maintaining a solid and informative voice is what I enjoy the most.

Thesis Helpers

55 Top-Rated Research Topics in Linguistics For an A+

The field of linguistics is one of the most accessible yet challenging subjects for college and university students. Areas such as phonology, phonetics, syntax, morphology, and semantics can keep you up all night.

That is why we came up with these quality language research topics.

What are the Linguistics Research Topics?

To understand this better, we first have to define the term linguistics. It is the study of language, covering:

  • Language in context,
  • Language form, and
  • Language meaning.

The researcher will have to determine the interplay between sound and meaning when presented with this subject. A linguistics research paper will, therefore, deal with the following:

  • The nature of language
  • How human languages are classified
  • Tools used in language identification

Language entices researchers because it draws significant and sustained attention from readers. With so many languages in the world, you are sure to find an area or two to write about.

However, we endeavor to make this task quick and easy for you by listing 55 research topics in linguistics.

How To Write Linguistic Topics For Your Dissertation

Are you having trouble coming up with a research topic for your research paper? Here are the top expert recommendations:

  • Brainstorm ideas on your own and with your friends
  • Pick a broad topic and free-write specific sub-topics on it
  • Get inspiration from other available linguistics research paper topics

After coming up with a topic that interests you, check to ensure that it meets your assignment criteria.

So let’s get started!

History of Language Research Topics

  • The contribution of Greek philosophers to language
  • Significance of the over 30,000 preserved cuneiform writings to language
  • Early speculations about the origin of language
  • The long history of language as rooted in mythology
  • Why the origin of language is an unanswerable problem
  • A critical analysis of theories that explain the origin and development of language

Argumentative College Linguistic Research Topics

  • Is language the only way we can communicate?
  • Does a brain injury have an impact on language?
  • Should we refer to language as a mere system of symbols?
  • Do language disorders make it a difficult subject to study?
  • Does the mother tongue have an impact on efficient communication?
  • Should we learn two or more languages?

Linguistics Research Topics – Tough Questions

  • Why is there a similarity among many English and French words?
  • What makes people speak different languages?
  • Why does the mother tongue always interfere with one’s pronunciation?
  • What makes language translation possible?
  • Is sign language only a matter of making signs with the hands?
  • Why are some languages more difficult to learn than others?

Sociolinguistic Research Topics

  • Social factors that necessitate language variation and varieties
  • What are the attitudes to language among different societies?
  • The relationship between language and identity
  • A critical evaluation of language and ethnicity
  • Analyzing language attrition among most English speakers
  • Distinct functions of language among different communities

Interesting Topics in Linguistics

  • Salient factors that contribute to language shift and death
  • Why nobody can claim to know a certain language in its entirety
  • Why is written communication more precise than spoken communication?
  • Problems of ambiguity during language translation
  • Does language influence society, or is it the other way around?
  • The effectiveness of language support and subject teaching

Linguistics Paper Topics on Politics

  • Persuasive language strategies and techniques in political speeches
  • Why do politicians use culturally familiar languages when addressing indigenous communities?
  • The place of colonial rule in African politics
  • A case study of effective political communication
  • Understanding the changing landscape of political communication
  • The use of buzz words and tag lines in political speeches

Linguistics Research Paper Topics on Semantics

  • How does meaning work in language analysis and interpretation?
  • How can the meanings of words relate to each other?
  • Ways in which sentences are related to one another
  • What causes ambiguity to arise in language?
  • How do different speakers acquire a sense of meaning?
  • A critical analysis of language use and language acquisition

Linguistic Topics on Translation

  • The role of the latest technologies in the translation industry
  • Are translator training and pedagogy producing efficient translators?
  • Are translations the cause of misunderstanding between different languages?
  • What is the effectiveness of audiovisual translation?
  • Is literary translation causing more harm than good in communication?
  • What is the relationship between translation and popular culture?

Interesting Linguistics Topics on Language Disorders

  • Causes of receptive language disorders among children
  • Mental formation of language disorders during a child’s development
  • Symptoms of language disorder and how to deal with them
  • What is the effectiveness of psychotherapy in dealing with language disorders?
  • Why is autism spectrum disorder common among children?
  • What causes problems with sentence and word flow?
  • Why children of 1 and 2 years of age have trouble with p, b, m, h, and w sounds

For top grades, aim for a specific and original linguistic topic. If the task seems daunting and tedious to you, then professional thesis writing help is all you need. The service is available at cheap rates with guaranteed top quality.

Have a professional complete your linguistics research paper today!

Key Topics in Second Language Acquisition

  • Vivian Cook and David Singleton

  • Language: English
  • Publisher: Multilingual Matters
  • Copyright year: 2014
  • Main content: 168
  • Published: April 2, 2014
  • ISBN: 9781783091812

Top 50+ linguistics research topics for your paper.

Are you a student or a graduate of linguistics? If so, there is no doubt that research topics in linguistics are your bread and butter. You can’t escape them in school. Write within the confines of the topics in linguistics and get your grade without stress. However, linguistics is a wide field, and it can be hard to pick one of the many linguistic topics for your research. Sometimes the problem is not in picking a topic but in knowing how to form linguistics research topics despite the field’s wide scope.

We noticed these problems with students and decided to help. Our solution is to compile a list of 50 linguistic research topics for linguistics students. These topics could form the basis of your linguistics research paper. You don’t have to worry anymore about topics for a master’s thesis in linguistics. We have you covered for all English linguistics research topics. Let’s dive in!

Check Our 50 Linguistics Research Topics

There are linguistics research topics in abundance. If you search online, you would find more than a few examples. However, you need to know the aspect of linguistics you want to use for your linguistics paper topics. It would make no sense to have a list of thesis topics in applied linguistics and want to write on topics in cognitive linguistics. While they are all under the broad body of linguistics, they are quite different from each other.

So, the first step in finding the perfect linguistics essay topic is to choose the aspect of linguistics you want. After you have made a choice, you can look into linguistics topics in that aspect. We have made it easier to find interesting linguistics topics in any aspect you choose by grouping our 50 linguistics research topics. All you have to do is search under the aspect of your choice.

Interesting Linguistics Research Topics

If you want not only to write a research paper but also to find every minute of it intriguing, these interesting topics in linguistics are the ones for you.

  • What makes written communication more precise compared to spoken communication?
  • How to spot language disorders and deal with them
  • What contributes to the prevalent language shift and death in our society today?
  • The language of feminism: How formalized is it and how does it affect society?
  • Why is it impossible to claim to know a language entirely?
  • What salient factors cause ambiguity in language translation?
  • An in-depth analysis of feminism in Africa
  • Language vs Society: Which one influences the other? How does it affect the members?
  • How effective are subject teaching and language support?
  • What factors affect language choice in multilingual societies? (Study of selected communities)
  • The real functions of language

Linguistics Topics on Translation

If you want the latest research topics in applied linguistics, the topics under the following subheadings would help you. You just have to look for the aspect that you have an interest in and look at linguistics in that light.

  • How has technology affected translation in this day and age?
  • Is translation the cause of misunderstandings between speakers of different languages?
  • How effective is audiovisual translation in revolutionizing the translation industry?
  • Does literal translation do more harm than good?
  • How have translator training and pedagogy fared in producing efficient translators?
  • How does translation relate to popular culture?

Translation is essential in this century, with speakers of different languages communicating and coming together in a global economy. These topics look into the issues that translation encounters today.

Linguistics Topics on Politics

Politics is an ever-present phenomenon in any society. These dissertation topics in linguistics examine the issues surrounding language in the field of politics. We have listed sample Ph.D. thesis topics in linguistics in this field.

  • The reality of hate speech in selected communities
  • The use of persuasive language strategies and tools in political speech
  • How colonial rule affected African politics and language
  • Why do politicians use indigenous languages to address communities?
  • A critical analysis of the changing political communication landscape
  • Effective political communication: A case study of selected politicians
  • How tag lines and buzz words are used to enhance political speeches

Sociolinguistics Research Topics

This aspect of linguistics examines issues surrounding how language works in society. These research topics for English linguistics focus on how people in society use language and its effects on society.

  • What are the social factors that necessitate language varieties?
  • How does language affect identity?
  • An in-depth analysis of language attrition common to most English speakers
  • A critical evaluation of the difference in attitudes towards language in different societies
  • The differences in language functions in selected communities
  • How ethnicity affects language and vice versa

Argumentative Linguistics Research Topics

These topics in linguistics for research papers argue on issues surrounding language. You can use these topics if you want to show different sides of an argument in your research.

  • Is language the best way to communicate?
  • Can we say that language is merely a system of symbols?
  • Do language disorders cause difficulties in the study of language?
  • Does brain injury lead to issues in language capacities?
  • Do mother tongue inflection and accent impact efficient communication?
  • Is it advisable to learn more than one language?

Linguistics Research Topics on History

Language is not a concept that started a few years ago. People have been communicating for centuries. These topics look at the history of language, sometimes in relation to the present day.

  • How Greek philosophy contributes to language
  • What are the early speculations scientists had about the origin of language?
  • Analysis of the history of language as explained in mythology
  • How do the 3,000 preserved cuneiform writings affect language?
  • A critical evaluation of different theories on the origin and development of language
  • Why has the question of language origin remained unanswerable?

Linguistics Research Topics on Semantics

Language is nothing without meaning. These interesting linguistic topics show how meaning and language mix and relate. You can research any one of these topics to understand this field.

  • How does meaning affect language analysis and interpretation?
  • What is the major cause of language ambiguity?
  • How do sentences relate to one another?
  • How do speakers of different languages acquire a sense of meaning in conversation?
  • How can the meanings of words relate to one another?
  • An in-depth analysis of how language is used and acquired in different communities

Tough Linguistics Research Topics

Do all the topics above seem too easy for you? Do you want something more challenging? We have a few topics for you. These topics would give you that challenge you want. Ensure that you do enough research on topics before you embark on them.

  • Why do people speak different languages?
  • What makes language translation possible?
  • What makes some languages harder to learn than others?
  • Why are English and French words similar?
  • Why does the mother tongue always affect pronunciation?
  • Does sign language only involve the hands?

How to Choose A Perfect Linguistics Topic for You

There are different aspects of linguistics. If you check online, you would find linguistic anthropology research topics, computational linguistics research topics, and much more. However, not all these aspects of linguistics would be perfect for your dissertation or thesis.

In selecting or creating the perfect linguistic topic for you, here are some of the tips from our experts in paper writing you should take into consideration.

  • Pick an aspect that interests you. Linguistics applies to different walks of life. Therefore, there are varied topics for your linguistics research, which can make choosing a topic quite stressful. Find what interests you and look for topics in that aspect. Start with a broad aspect, then narrow it down to a part of the field. For instance, you can start with applied linguistics and move on to linguistics in politics.
  • Brainstorm with friends. After you have chosen the aspect you like, you can pick a list of topics in linguistics for research papers and bounce ideas off your friends. You can even write out your ideas from your brainstorming and ask your friends what they think about them. The topic that you and your friends keep going back to is possibly the best one for you. If you find a lot of things to say about it, you will probably find a lot of things to write about it.
  • Research the topics. Talk is cheap, though. If you want to write on a topic, ensure that there are enough materials to support your claims. After you and your friends decide on a topic, research the topic before you start writing. Once you find that there are enough materials, you can start.

Linguistics has different aspects. If you check online and on our list, you would find different topics in these aspects, including topics related to linguistic diversity. Follow our guide and list to find the best linguistic topic for you!

100 best linguistic research topics.

November 26, 2020

Some learners struggle to choose linguistic research topics to write about. That’s because linguistics is interesting to learn about yet challenging to write papers and essays about. Some students stay up at night learning about phonetics, phonology, morphology, syntax, and semantics. Unfortunately, they still struggle to write quality papers and essays on linguistic topics in these areas. If you are looking for ideas to form the basis of your paper or essay, here is a list of research topics in linguistics to consider.

Linguistic Research Topics in Discourse Studies

Discourse studies provide fascinating details about individuals, culture, technology, movements, and changes that take place over time. If you are looking for linguistics topics that relate to discourse studies, here are some of the best ideas to consider. You can also check out our communication research topics.

  • Childhood is the time when speech is made or broken
  • Cultivation of politicians’ buzzword through linguistic analysis
  • How linguistic patterns are used to locate migration paths
  • How computers affect modern language negatively
  • How text messaging has created a new linguistic subculture
  • How the brain works when it comes to learning a new language
  • How words change over time
  • How effective is non-verbal communication when it comes to displaying emotions?
  • How effective is verbal communication when it comes to displaying feelings?
  • How society alters words and their meanings
  • How the negative power of a word can be reduced by neuro-linguistic programming for trauma victims
  • Is verbal communication more effective than non-verbal communication?
  • How individuals communicate without a shared language
  • How beneficial is learning more than one language during childhood?
  • Why should elementary schools teach students a second language?
  • Explain the acquisition of a language at different growth stages
  • How global leaders use language ethics to change the emotional views of the masses
  • Explain the power of a language in capitalizing on emotions
  • How technology alters communication
  • How proper use of a language makes a person better in society

A learner should pick a linguistics topic in this category if it piques their interest. That’s because writing a great paper or essay requires a student to explore an idea that they are interested in. Essentially, a learner should research and write about something that they find enjoyable.

Interesting Linguistic Topics for Research

Some topics in linguistics are very interesting to research. These are ideas that most people in society will find enjoyable to read about. Here is a list of the most interesting linguistics topics that students can choose for their papers and essays.

  • Explain how sociolinguistics helps people understand multilingual language choices
  • A study of differences and similarities of Post-Tudor English
  • How language encourages gender differences
  • Understanding socio-linguistics via color and race background in America
  • Vowel pronunciation in the UK- A systematic review
  • The role of music in language evolution
  • Explain the development and evolution of slang
  • A study of the connection between perception and language
  • How language creates bonds among cross-cultural communities
  • Language review in informal and formal settings
  • How age affects English pronunciation
  • A phonological-treatment-based review of English-French loanwords
  • How sociolinguistics influences gender empowerment
  • How words can be used to master legal settings
  • How the media use sociolinguistics to gain a competitive edge and create bias
  • Exploratory analysis of the impact and importance of body language
  • Importance of sociolinguistics education in discipline development
  • How genders perceive politeness via language use
  • A study of social change through history via sociolinguistics
  • An evaluation of English evolution via a focus on different sociolinguistics

The vast majority of topics in this category touch on language and society. That’s why papers and essays about these linguistic research topics will most likely impress many readers.

Applied Linguistics Research Paper Topics

Applied linguistics focuses on finding meaningful language solutions to real-world issues. Some of the best linguistic paper topics to consider in this category include the following.

  • The idea of beauty and its verbal expression
  • A detailed evaluation of hate language
  • What are the key determinants of hate language propagation?
  • A literature-based review that explores eye-tracking technology and its implication for applied linguistics advancement
  • A detailed evaluation of research methods for applied linguistics
  • How relevant is the development of applied linguistics?
  • Discuss the impacts of the language used in social media on the current generation
  • An essay on the impact of using proper linguistic communication in social media
  • Is applied linguistics relevant in the current digitalized world?
  • How political oppression affects the language used in the media
  • How important is applied linguistics vocationally?
  • The major differences between spoken and written language via linguistics evaluation
  • Is multilingualism a possibility that follows bilingualism?
  • What is the contribution of a language to national identity within a multicultural society?
  • How effective is healthcare delivery when there are language barriers?
  • Is the language barrier relevant in social media?
  • How bilingualism enriches the personality of an individual
  • Discuss language cognition and perceptions during the learning process
  • Discuss the learning mechanisms when it comes to a foreign language
  • Explain how a non-native teacher can teach local students the English language

These can also be great dissertation topics in linguistics. That’s because they require extensive research and analysis of facts to write brilliant papers. So, if you are struggling to find an idea for your dissertation, consider one of these thesis topics in applied linguistics.

Great Linguistics Essay Topics

Perhaps you’re looking for a list of English linguistics research topics from which you can get ideas for your essay. In that case, consider these amazing research proposal topics in linguistics.

  • Discuss the new generative grammar concept
  • Analysis of pragmatics and semantics in two texts
  • Identity analysis in racist language
  • Do humans have a predisposition to learn a language?
  • English assessment as a second language
  • Endangered languages and language death causes
  • Attitudes towards a language and childhood language acquisition
  • Mixing modern language and code-switching
  • Linguistic turn and cognitive turn
  • What is computational linguistics?
  • Linguistic and cultural diversity as an educational issue
  • Differences between adult and childhood language learning
  • Factors that affect the ability to learn a language
  • A forensic assessment of linguistics
  • Lexical and grammatical changes
  • How important is a language?
  • What are the effects of language on human behavior?
  • English or indigenous languages?
  • Is language an essential element of human life?
  • Is language the primary communication medium?

These can be great topics for short essays. However, they can also be PhD thesis topics in linguistics where learners will have to conduct extensive and detailed research. The most important thing is to gather relevant and new information that will interest the readers.

Research Topics in Cognitive Linguistics

Students who want to explore questions in cognitive linguistics should consider topics in this category. Here are some of the most interesting topics in linguistics for research papers that also touch on cognition.

  • How grammatical phrasing affects compliance with prescriptions, prohibitions, or suggestions
  • Latest research findings into cognitive literacy in Indian English poetry
  • Conceptual metaphor: Does the activation of a single source domain activate multiple target concepts?
  • Multilingualism: Does L2 modulate L1/L2 organization in the brain?
  • Can task-based language teaching perception be measured?
  • Are there prominent cognitive-linguistic books for students?
  • What role does cognitive linguistics play in the acquisition of a second language?
  • Is word meaning a concept that is advocated for by some scholars?
  • Which linguistic experiments can be used to understand how the right and left hemispheres work?
  • Discuss the relationship between metaphors and similes

Computational Linguistics Research Topics

Computational linguistics is an interdisciplinary field that deals with rule-based or statistical modeling of natural language from a computational perspective. Here are some of the best topics for research in this field.

  • Using supervised learning to analyze Medieval German poetry
  • Which computer-assisted program is best for phonetic comparison of different dialects and why?
  • How and where can Danish verbs be extracted?
  • Can computational linguistics support an intra-lingual contrastive corpus analysis?
  • Where can the Schizophrenia text dataset be found?
  • Discuss the techniques used for meaning or semantic representation in natural language processing
  • Describe performance measures for speech recognition
  • How to extract the introduction, development, and conclusion of a text
  • Discuss the addition of matrices in a dictionary in Python
  • Explain the definition and characterization of linguistic dimensions in a multidimensional analysis
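
To make one of these topics concrete, here is a minimal sketch of a common performance measure for speech recognition, the word error rate (WER). It assumes whitespace tokenization and a single reference transcript; the function name `word_error_rate` is only illustrative and does not come from any particular toolkit.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors / 6 words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

A lower score is better, and the value can exceed 1.0 when the recognizer inserts many extra words. Real evaluations usually normalize case and punctuation first, which this sketch leaves out.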

Students who are struggling to choose what to write about can pick any topic in this list that they find interesting, research it, and write about it. Taking the time to research extensively and write quality papers or essays is what will earn learners their desired grades.


Language and Social Interaction Research Paper Topics

  • Accounting Research
  • Action-Implicative Discourse Analysis
  • Apologies and Remedial Episodes
  • Argumentative Discourse
  • Broadcast Talk
  • Business Discourse
  • Cognitive Approaches to Discourse
  • Communication Accommodation Theory
  • Communities of Practice
  • Conversation Analysis
  • Deception in Discourse
  • Design Theory
  • Discourse in the Law
  • Discourse Markers
  • Discursive Psychology
  • Doctor–Patient Talk
  • Emotion and Discourse
  • English-Only Movements
  • Erving Goffman
  • Ethnography of Communication
  • Ethnomethodology
  • Gaze in Interaction
  • Gender and Discourse
  • Gestures in Discourse
  • Identities and Discourse
  • Interactional Sociolinguistics
  • Intimate Talk with Family and Friends
  • Language and Social Psychology
  • Language Varieties
  • Linguistic Pragmatics
  • Meta-Discourse
  • Microethnography
  • Mikhail Bakhtin
  • Power and Discourse
  • Public Meetings
  • Questions and Questioning
  • Small Talk and Gossip
  • Speech Codes Theory
  • Storytelling and Narration
  • Support Talk
  • Technologically Mediated Discourse
  • Telephone Talk
  • Transcribing and Transcription
  • Voice, Prosody, and Laughter

Approaches to Language and Social Interaction

Approaches to LSI come in flavors. The three most prominent are: (1) Conversation Analysis (CA): Developed by sociologist Harvey Sacks (1992), CA is committed to building an observational science of social life. CA’s first step is to collect tapes of ordinary talk and create detailed transcripts that capture as many features of talk as possible. Then CA seeks to identify interaction patterns. (2) Ethnography of Communication (EOC): Extending the anthropological work of Dell Hymes, EOC shows how a community of people speak, interpret others’ actions, and, more broadly, understand what it means to be a person and have relationships. Gerry Philipsen’s (1975) study of ways to “speak like a man” in a working-class Chicago community was key in bringing the EOC tradition to communication.

(3) The third tradition, Discourse Analysis (DA), is actually better thought of as an umbrella for multiple LSI approaches that are neither conversation analysis nor EOC. DA includes Discursive Psychology, an approach initially developed by Jonathan Potter and colleagues that explores how everyday psychological terms are used and how psychological matters are managed rhetorically. Two DA approaches developed by communication scholars are “Action-implicative Discourse Analysis,” developed by Karen Tracy, which focuses on explicating the problems of a practice, the conversational strategies of key participants, and participants’ normative beliefs about good conduct; and “Design Theory,” developed by Mark Aakhus and Sally Jackson, which integrates linguistic pragmatic ideas and ideas from argument theory to consider how participation and conduct in a range of communicative practices ought to be designed.

In addition to the most prominent approaches above, LSI research is informed by ideas from communities of practice, a tradition developed by Jean Lave and Etienne Wenger that studies how people pursue a common activity; “speech act theory,” developed by philosopher John Austin to dispute other philosophers who treated the purpose of language as making representational statements about the world; “politeness theories,” which explain why communicators deviate from maximally efficient communication, of which Brown and Levinson’s version is the best known; and “Critical Discourse Analysis” (Tracy et al. 2011), which studies texts, usually written ones, with the goal of exposing how power gets naturalized and how discourse practices are marshaled to suggest objectivity while all the time systematically advantaging those who have power. Finally, the ideas of Mikhail Bakhtin, who wrote in the early years of the twentieth century, have been influential, especially his notion of the utterance as the basic unit of interaction. LSI work has primarily been interested in what happens in interaction among people, not what goes on in their minds, but this scholarly tendency has been changing in recent years.

Basic Concepts and Findings

LSI examines people talking with others in a range of social occasions to accomplish complementary and antagonistic purposes. LSI studies typically accomplish three things: (1) they identify distinctive features of language and/or interaction; (2) they describe the interpersonal, organizational, or political functions that talk is serving; and (3) they show how the particulars of (1) and (2) come together to create interactional sites that are communicatively distinctive.

(1) Among the features of language, dialect has received particular attention from researchers. When people speak a language, they always speak a particular dialect or variety of it, whether the language is Korean or English. Features of pronunciation and grammar, for instance, are used differently by communicators from different geographic regions. Another small bit of talk that is ubiquitous, and likely to appear at interactionally sensitive moments, is the metadiscursive comment. Metadiscourse labels communicative acts (e.g., ‘an argument’) and makes visible speakers’ beliefs about how communication is working. If one were to tally what aspects of talk have been studied particularly extensively, the most researched unit would undoubtedly be questions (Freed & Ehrlich 2010). Besides the rather obvious goal of gaining information, questions are how speakers claim status for self or give deference to the conversational partner. Another interesting, multipurpose unit of talk is the narrative. Narratives are extended talk units, and they are often used to do sensitive actions such as disagreeing, giving advice, or advancing an argument.

(2) By and large LSI researchers have been especially interested in understanding how people design and vary speech acts that are sensitive. Studied acts have included directives, apologies, and accounts. Although information giving is recognized as an important function of talk, it is the less visible functions that have been given the most attention. Less obvious functions include the way talk in workplace settings strengthens bonds of connection among people or gives support for managing difficult moments. ‘Supportive talk,’ for example, is designed to avoid offending, and LSI scholars have been interested in design features of this kind of talk. Another function of talk is identity-work (Tracy & Robles 2013). A speaker’s talk inevitably presents the kind of person the speaker is and altercasts the other as well.

Of all the kinds of identity that have been examined, none has received as much attention as what it means to talk like a woman or a man. Much recent work shows how the performance of gender is strongly shaped by a speaker’s social class, race, and national culture. LSI scholars also see it as important to problematize generalizations about power. Many of the talk features that could be attributed to power differences might also be accomplishing other functions. When we attend to discourse functions at societal and institutional levels, we notice a range of different aims served by talk, with one being the enactment of democracy; it is through talk that democracy touches down and becomes a concrete practice.

(3) Interaction has a distinctive shape and set of problems in each institutional setting. Healthcare settings have been studied extensively, as have courtroom discourse, broadcast talk, and political exchange. Ilie (2010), for instance, examined the discourse strategies of speakers in European parliaments. Language and social interaction is a distinctive area of communication study. Its trademark is the use of excerpts of talk to make claims about important units of interaction, the structure of social action, how identities and institutions are constructed, how culture is displayed discursively, and so on.


  • Fitch, K. & Sanders, R. E. (eds.) (2005). Handbook of language and social interaction. Mahwah, NJ: Lawrence Erlbaum.
  • Freed, A. F. & Ehrlich, S. (eds.) (2010). “Why do you ask?” The functions of questions in institutional discourse. Oxford: Oxford University Press.
  • Ilie, C. (2010). European parliaments under scrutiny: Discourse strategies and interaction practices. Amsterdam: John Benjamins.
  • Philipsen, G. (1975). Speaking “like a man” in Teamsterville: Cultural patterns of role enactment in an urban neighborhood. Quarterly Journal of Speech, 61, 13–22.
  • Sacks, H. (1992). Lectures on conversation. Oxford: Blackwell.
  • Tracy, K., Martinez-Guillem, S., Robles, J. S., & Casteline, K. E. (2011). Critical discourse analysis and (US) communication scholarship: Recovering old connections, envisioning new ones. In C. Salmon (ed.), Communication yearbook 35. Los Angeles, CA: Sage, pp. 239–286.
  • Tracy, K. & Robles, J. S. (2013). Everyday talk: Building and reflecting identities, 2nd edn. New York: Guilford.


Computer Science > Digital Libraries

Title: Topics, Authors, and Institutions in Large Language Model Research: Trends from 17K arXiv Papers

Abstract: Large language models (LLMs) are dramatically influencing AI research, spurring discussions on what has changed so far and how to shape the field's future. To clarify such questions, we analyze a new dataset of 16,979 LLM-related arXiv papers, focusing on recent trends in 2023 vs. 2018-2022. First, we study disciplinary shifts: LLM research increasingly considers societal impacts, evidenced by 20x growth in LLM submissions to the Computers and Society sub-arXiv. An influx of new authors -- half of all first authors in 2023 -- are entering from non-NLP fields of CS, driving disciplinary expansion. Second, we study industry and academic publishing trends. Surprisingly, industry accounts for a smaller publication share in 2023, largely due to reduced output from Google and other Big Tech companies; universities in Asia are publishing more. Third, we study institutional collaboration: while industry-academic collaborations are common, they tend to focus on the same topics that industry focuses on rather than bridging differences. The most prolific institutions are all US- or China-based, but there is very little cross-country collaboration. We discuss implications around (1) how to support the influx of new authors, (2) how industry trends may affect academics, and (3) possible effects of (the lack of) collaboration.


113 Great Research Paper Topics


One of the hardest parts of writing a research paper can be just finding a good topic to write about. Fortunately, we've done the hard work for you and have compiled a list of 113 interesting research paper topics. They've been organized into ten categories and cover a wide range of subjects so you can easily find the best topic for you.

In addition to the list of good research topics, we've included advice on what makes a good research paper topic and how you can use your topic to start writing a great paper.

What Makes a Good Research Paper Topic?

Not all research paper topics are created equal, and you want to make sure you choose a great topic before you start writing. Below are the three most important factors to consider to make sure you choose the best research paper topics.

#1: It's Something You're Interested In

A paper is always easier to write if you're interested in the topic, and you'll be more motivated to do in-depth research and write a paper that really covers the entire subject. Even if a certain research paper topic is getting a lot of buzz right now or other people seem interested in writing about it, don't feel tempted to make it your topic unless you genuinely have some sort of interest in it as well.

#2: There's Enough Information to Write a Paper

Even if you come up with the absolute best research paper topic and you're so excited to write about it, you won't be able to produce a good paper if there isn't enough research about the topic. This can happen for very specific or specialized topics, as well as topics that are too new to have enough research done on them at the moment. Easy research paper topics will always be topics with enough information to write a full-length paper.

Trying to write a research paper on a topic that doesn't have much research on it is incredibly hard, so before you decide on a topic, do a bit of preliminary searching and make sure you'll have all the information you need to write your paper.

#3: It Fits Your Teacher's Guidelines

Don't get so carried away looking at lists of research paper topics that you forget any requirements or restrictions your teacher may have put on research topic ideas. If you're writing a research paper on a health-related topic, deciding to write about the impact of rap on the music scene probably won't be allowed, but there may be some sort of leeway. For example, if you're really interested in current events but your teacher wants you to write a research paper on a history topic, you may be able to choose a topic that fits both categories, like exploring the relationship between the US and North Korea. No matter what, always get your research paper topic approved by your teacher first before you begin writing.

113 Good Research Paper Topics

Below are 113 good research topics to help you get started on your paper. We've organized them into ten categories to make it easier to find the type of research paper topics you're looking for.


  • Discuss the main differences in art from the Italian Renaissance and the Northern Renaissance.
  • Analyze the impact a famous artist had on the world.
  • How is sexism portrayed in different types of media (music, film, video games, etc.)? Has the amount/type of sexism changed over the years?
  • How has the music of slaves brought over from Africa shaped modern American music?
  • How has rap music evolved in the past decade?
  • How has the portrayal of minorities in the media changed?


Current Events

  • What have been the impacts of China's one child policy?
  • How have the goals of feminists changed over the decades?
  • How has the Trump presidency changed international relations?
  • Analyze the history of the relationship between the United States and North Korea.
  • What factors contributed to the current decline in the rate of unemployment?
  • What have been the impacts of states which have increased their minimum wage?
  • How do US immigration laws compare to immigration laws of other countries?
  • How have the US's immigration laws changed in the past few years/decades?
  • How has the Black Lives Matter movement affected discussions and views about racism in the US?
  • What impact has the Affordable Care Act had on healthcare in the US?
  • What factors contributed to the UK deciding to leave the EU (Brexit)?
  • What factors contributed to China becoming an economic power?
  • Discuss the history of Bitcoin or other cryptocurrencies (some of which tokenize the S&P 500 Index on the blockchain).
  • Do students in schools that eliminate grades do better in college and their careers?
  • Do students from wealthier backgrounds score higher on standardized tests?
  • Do students who receive free meals at school get higher grades compared to when they weren't receiving a free meal?
  • Do students who attend charter schools score higher on standardized tests than students in public schools?
  • Do students learn better in same-sex classrooms?
  • How does giving each student access to an iPad or laptop affect their studies?
  • What are the benefits and drawbacks of the Montessori Method?
  • Do children who attend preschool do better in school later on?
  • What was the impact of the No Child Left Behind act?
  • How does the US education system compare to education systems in other countries?
  • What impact do mandatory physical education classes have on students' health?
  • Which methods are most effective at reducing bullying in schools?
  • Do homeschoolers who attend college do as well as students who attended traditional schools?
  • Does offering tenure increase or decrease quality of teaching?
  • How does college debt affect future life choices of students?
  • Should graduate students be able to form unions?


  • What are different ways to lower gun-related deaths in the US?
  • How and why have divorce rates changed over time?
  • Is affirmative action still necessary in education and/or the workplace?
  • Should physician-assisted suicide be legal?
  • How has stem cell research impacted the medical field?
  • How can human trafficking be reduced in the United States/world?
  • Should people be able to donate organs in exchange for money?
  • Which types of juvenile punishment have proven most effective at preventing future crimes?
  • Has the increase in US airport security made passengers safer?
  • Analyze the immigration policies of certain countries and how they are similar and different from one another.
  • Several states have legalized recreational marijuana. What positive and negative impacts have they experienced as a result?
  • Do tariffs increase the number of domestic jobs?
  • Which prison reforms have proven most effective?
  • Should governments be able to censor certain information on the internet?
  • Which methods/programs have been most effective at reducing teen pregnancy?
  • What are the benefits and drawbacks of the Keto diet?
  • How effective are different exercise regimes for losing weight and maintaining weight loss?
  • How do the healthcare plans of various countries differ from each other?
  • What are the most effective ways to treat depression?
  • What are the pros and cons of genetically modified foods?
  • Which methods are most effective for improving memory?
  • What can be done to lower healthcare costs in the US?
  • What factors contributed to the current opioid crisis?
  • Analyze the history and impact of the HIV/AIDS epidemic.
  • Are low-carbohydrate or low-fat diets more effective for weight loss?
  • How much exercise should the average adult be getting each week?
  • Which methods are most effective to get parents to vaccinate their children?
  • What are the pros and cons of clean needle programs?
  • How does stress affect the body?
  • Discuss the history of the conflict between Israel and the Palestinians.
  • What were the causes and effects of the Salem Witch Trials?
  • Who was responsible for the Iran-Contra situation?
  • How have New Orleans and the government's response to natural disasters changed since Hurricane Katrina?
  • What events led to the fall of the Roman Empire?
  • What were the impacts of British rule in India?
  • Was the atomic bombing of Hiroshima and Nagasaki necessary?
  • What were the successes and failures of the women's suffrage movement in the United States?
  • What were the causes of the Civil War?
  • How did Abraham Lincoln's assassination impact the country and reconstruction after the Civil War?
  • Which factors contributed to the colonies winning the American Revolution?
  • What caused Hitler's rise to power?
  • Discuss how a specific invention impacted history.
  • What led to Cleopatra's fall as ruler of Egypt?
  • How has Japan changed and evolved over the centuries?
  • What were the causes of the Rwandan genocide?


  • Why did Martin Luther decide to split with the Catholic Church?
  • Analyze the history and impact of a well-known cult (Jonestown, Manson family, etc.)
  • How did the sexual abuse scandal impact how people view the Catholic Church?
  • How has the Catholic church's power changed over the past decades/centuries?
  • What are the causes behind the rise in atheism/agnosticism in the United States?
  • What influences in Siddhartha's life resulted in him becoming the Buddha?
  • How has media portrayal of Islam/Muslims changed since September 11th?


  • How has the earth's climate changed in the past few decades?
  • How has the use and elimination of DDT affected bird populations in the US?
  • Analyze how the number and severity of natural disasters have increased in the past few decades.
  • Analyze deforestation rates in a certain area or globally over a period of time.
  • How have past oil spills changed regulations and cleanup methods?
  • How has the Flint water crisis changed water regulation safety?
  • What are the pros and cons of fracking?
  • What impact has the Paris Climate Agreement had so far?
  • What have NASA's biggest successes and failures been?
  • How can we improve access to clean water around the world?
  • Does ecotourism actually have a positive impact on the environment?
  • Should the US rely on nuclear energy more?
  • What can be done to save amphibian species currently at risk of extinction?
  • What impact has climate change had on coral reefs?
  • How are black holes created?
  • Are teens who spend more time on social media more likely to suffer anxiety and/or depression?
  • How will the loss of net neutrality affect internet users?
  • Analyze the history and progress of self-driving vehicles.
  • How has the use of drones changed surveillance and warfare methods?
  • Has social media made people more or less connected?
  • What progress has currently been made with artificial intelligence?
  • Do smartphones increase or decrease workplace productivity?
  • What are the most effective ways to use technology in the classroom?
  • How is Google search affecting our intelligence?
  • When is the best age for a child to begin owning a smartphone?
  • Has frequent texting reduced teen literacy rates?


How to Write a Great Research Paper

Even great research paper topics won't give you a great research paper if you don't hone your topic before and during the writing process. Follow these three tips to turn good research paper topics into great papers.

#1: Figure Out Your Thesis Early

Before you start writing a single word of your paper, you first need to know what your thesis will be. Your thesis is a statement that explains what you intend to prove/show in your paper. Every sentence in your research paper will relate back to your thesis, so you don't want to start writing without it!

For example, if you're writing a research paper on whether students learn better in same-sex classrooms, your thesis might be "Research has shown that elementary-age students in same-sex classrooms score higher on standardized tests and report feeling more comfortable in the classroom."

If you're writing a paper on the causes of the Civil War, your thesis might be "While the dispute between the North and South over slavery is the most well-known cause of the Civil War, other key causes include differences in the economies of the North and South, states' rights, and territorial expansion."

#2: Back Every Statement Up With Research

Remember, this is a research paper you're writing, so you'll need to use lots of research to make your points. Every statement you give must be backed up with research, properly cited the way your teacher requested. You're allowed to include opinions of your own, but they must also be supported by the research you give.

#3: Do Your Research Before You Begin Writing

You don't want to start writing your research paper and then learn that there isn't enough research to back up the points you're making, or, even worse, that the research contradicts the points you're trying to make!

Get most of your research on your good research topics done before you begin writing. Then use the research you've collected to create a rough outline of what your paper will cover and the key points you're going to make. This will help keep your paper clear and organized, and it'll ensure you have enough research to produce a strong paper.


Christine graduated from Michigan State University with degrees in Environmental Biology and Geography and received her Master's from Duke University. In high school she scored in the 99th percentile on the SAT and was named a National Merit Finalist. She has taught English and biology in several countries.


Natural Language Processing: Recently Published Documents


Towards Developing Uniform Lexicon Based Sorting Algorithm for Three Prominent Indo-Aryan Languages

Three different Indic/Indo-Aryan languages (Bengali, Hindi and Nepali) have been explored here at the character level to find similarities and dissimilarities. Sharing the same root, Sanskrit, Indic languages bear common characteristics. That is why computer and language scientists can take the opportunity to develop common Natural Language Processing (NLP) techniques or algorithms. Bearing this concept in mind, we compare and analyze these three languages character by character. As an application of the hypothesis, we also developed a uniform sorting algorithm in two steps, first for the Bengali and Nepali languages only and then extended to Hindi in the second step. Our thorough investigation with more than 30,000 words from each language suggests that the algorithm maintains total accuracy, as set by the local language authorities of the respective languages, and good efficiency.
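The abstract above does not give the actual ordering rules, so the following is only a minimal sketch of the general idea behind lexicon-based sorting: words are compared character by character using a rank table rather than raw code points. The function name `make_collation_key` and the three-letter toy alphabet are assumptions made for illustration; they are not the authors' algorithm or a real Bengali, Hindi, or Nepali collation table.

```python
def make_collation_key(alphabet):
    """Build a sort key that ranks characters by their position in `alphabet`."""
    rank = {ch: i for i, ch in enumerate(alphabet)}
    fallback = len(alphabet)  # characters outside the alphabet sort last

    def key(word):
        # A word becomes a list of ranks; lists compare element by element,
        # so a shorter word that is a prefix of a longer one sorts first.
        return [rank.get(ch, fallback) for ch in word]

    return key


# Hypothetical toy ordering (b < a < c), NOT a real language's collation table.
TOY_ALPHABET = "bac"

words = ["cab", "abc", "bca"]
sorted_words = sorted(words, key=make_collation_key(TOY_ALPHABET))
print(sorted_words)  # ['bca', 'abc', 'cab']
```

A shared rank table of this kind is one plausible way a single routine could serve several related scripts, provided each script's characters are mapped into a common order.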

Efficient Channel Attention Based Encoder–Decoder Approach for Image Captioning in Hindi

Image captioning refers to the process of generating a textual description that describes objects and activities present in a given image. It connects two fields of artificial intelligence: computer vision and natural language processing, which deal with image understanding and language modeling, respectively. In the existing literature, most work on image captioning has been carried out for the English language. This article presents a novel method for image captioning in the Hindi language using an encoder–decoder based deep learning architecture with efficient channel attention. The key contribution of this work is the deployment of an efficient channel attention mechanism with Bahdanau attention and a gated recurrent unit for developing an image captioning model in the Hindi language. Color images usually consist of three channels, namely red, green, and blue. The channel attention mechanism focuses on an image’s important channels while performing the convolution; that is, it assigns higher importance to specific channels over others. The channel attention mechanism has been shown to have great potential for improving the efficiency of deep convolutional neural networks (CNNs). The proposed encoder–decoder architecture utilizes the recently introduced ECA-NET CNN to integrate the channel attention mechanism. Hindi is the fourth most spoken language globally, widely spoken in India and South Asia; it is India’s official language. A dataset for image captioning in Hindi is manually created by translating the well-known MSCOCO dataset from English to Hindi. The efficiency of the proposed method is compared with other baselines in terms of Bilingual Evaluation Understudy (BLEU) scores, and the results obtained illustrate that the proposed method outperforms other baselines.
The proposed method has attained improvements of 0.59%, 2.51%, 4.38%, and 3.30% in terms of BLEU-1, BLEU-2, BLEU-3, and BLEU-4 scores, respectively, with respect to the state-of-the-art. Qualities of the generated captions are further assessed manually in terms of adequacy and fluency to illustrate the proposed method’s efficacy.
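To make the channel attention idea in this abstract concrete, here is a minimal sketch in plain Python, written in the spirit of efficient channel attention: each channel is averaged to one number, a small 1D filter mixes neighbouring channel averages, a sigmoid turns the result into a weight between 0 and 1, and each channel is scaled by its weight. The function name `channel_attention` and the fixed `kernel` values are assumptions for illustration; in a trained network the kernel would be learned, and this is not the authors' model.

```python
import math


def channel_attention(feature_map, kernel):
    """feature_map: list of C channels, each an H x W list of floats.
    kernel: odd-length list of weights (hypothetical, normally learned)."""
    # 1. Global average pooling: one descriptor per channel.
    pooled = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_map
    ]

    # 2. 1D convolution across neighbouring channels (zero padding),
    #    followed by a sigmoid to get a weight in (0, 1) per channel.
    half = len(kernel) // 2
    n_channels = len(pooled)
    weights = []
    for c in range(n_channels):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = c + j - half
            if 0 <= idx < n_channels:
                acc += w * pooled[idx]
        weights.append(1.0 / (1.0 + math.exp(-acc)))

    # 3. Rescale every value in a channel by that channel's weight.
    scaled = [
        [[v * weights[c] for v in row] for row in ch]
        for c, ch in enumerate(feature_map)
    ]
    return weights, scaled


# Three 1x1 "channels"; only the middle one is active.
toy_map = [[[0.0]], [[2.0]], [[0.0]]]
weights, scaled = channel_attention(toy_map, kernel=[0.0, 1.0, 0.0])
print(weights)
```

With this toy input the middle channel receives the largest weight, which is the "higher importance to specific channels" behaviour the abstract describes.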

Model Transformation Development Using Automated Requirements Analysis, Metamodel Matching, and Transformation by Example

In this article, we address how the production of model transformations (MT) can be accelerated by automation of transformation synthesis from requirements, examples, and metamodels. We introduce a synthesis process based on metamodel matching, correspondence patterns between metamodels, and completeness and consistency analysis of matches. We describe how the limitations of metamodel matching can be addressed by combining matching with automated requirements analysis and model transformation by example (MTBE) techniques. We show that in practical examples a large percentage of required transformation functionality can usually be constructed automatically, thus potentially reducing development effort. We also evaluate the efficiency of synthesised transformations. Our novel contributions are:

  • The concept of correspondence patterns between metamodels of a transformation.
  • Requirements analysis of transformations using natural language processing (NLP) and machine learning (ML).
  • Symbolic MTBE using “predictive specification” to infer transformations from examples.
  • Transformation generation in multiple MT languages and in Java, from an abstract intermediate language.

A Computational Look at Oral History Archives

Computational technologies have revolutionized the archival sciences field, prompting new approaches to process the extensive data in these collections. Automatic speech recognition and natural language processing create unique possibilities for analysis of oral history (OH) interviews, where otherwise the transcription and analysis of the full recording would be too time consuming. However, many oral historians note the loss of aural information when converting the speech into text, pointing out the relevance of subjective cues for a full understanding of the interviewee's narrative. In this article, we explore various computational technologies for social signal processing and their potential application space in OH archives, as well as neighboring domains where qualitative study is a frequently used method. We also highlight the latest developments in key technologies for multimedia archiving practices such as natural language processing and automatic speech recognition. We discuss the analysis of both visual cues (body language and facial expressions) and non-visual cues (paralinguistics, breathing, and heart rate), stating the specific challenges introduced by the characteristics of OH collections. We argue that applying social signal processing to OH archives will have a wider influence than solely OH practices, bringing benefits for various fields from humanities to computer sciences, as well as to archival sciences. Looking at human emotions and somatic reactions on extensive interview collections would give scholars from multiple fields the opportunity to focus on feelings, mood, culture, and subjective experiences expressed in these interviews on a larger scale.

Which environmental features contribute to positive and negative perceptions of urban parks? A cross-cultural comparison using online reviews and Natural Language Processing methods

Natural language processing for smart construction: current status and future directions

Attention-based unsupervised keyphrase extraction and phrase graph for COVID-19 medical literature retrieval

Searching, reading, and finding information in massive medical text collections is challenging. It is not feasible for a typical biomedical search engine to navigate each article to find critical information or keyphrases. Moreover, few tools provide a visualization of the phrases relevant to the query. There is therefore a need to extract the keyphrases from each document for indexing and efficient search. Transformer-based neural networks such as BERT have been used for various natural language processing tasks. The built-in self-attention mechanism can capture the associations between words and phrases in a sentence. This research investigates whether self-attention can be utilized to extract keyphrases from a document in an unsupervised manner and to identify relevancy between phrases, in order to construct a query relevancy phrase graph that visualizes the search corpus phrases by their relevancy and importance. The comparison with six baseline methods shows that the self-attention-based unsupervised keyphrase extraction works well on a medical literature dataset. This unsupervised keyphrase extraction model can also be applied to other text data. The query relevancy graph model is applied to the COVID-19 literature dataset to demonstrate that the attention-based phrase graph can successfully identify the medical phrases relevant to the query terms.

Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing

Pretraining large neural language models, such as BERT, has led to impressive gains on many natural language processing (NLP) tasks. However, most pretraining efforts focus on general domain corpora, such as newswire and Web. A prevailing assumption is that even domain-specific pretraining can benefit by starting from general-domain language models. In this article, we challenge this assumption by showing that for domains with abundant unlabeled text, such as biomedicine, pretraining language models from scratch results in substantial gains over continual pretraining of general-domain language models. To facilitate this investigation, we compile a comprehensive biomedical NLP benchmark from publicly available datasets. Our experiments show that domain-specific pretraining serves as a solid foundation for a wide range of biomedical NLP tasks, leading to new state-of-the-art results across the board. Further, in conducting a thorough evaluation of modeling choices, both for pretraining and task-specific fine-tuning, we discover that some common practices are unnecessary with BERT models, such as using complex tagging schemes in named entity recognition. To help accelerate research in biomedical NLP, we have released our state-of-the-art pretrained and task-specific models for the community, and created a leaderboard featuring our BLURB benchmark (short for Biomedical Language Understanding & Reasoning Benchmark) at https://aka.ms/BLURB .

An ensemble approach for healthcare application and diagnosis using natural language processing

Machine learning and natural language processing enable a data-oriented experimental design approach for producing biochar and hydrochar from biomass


Political Typology Quiz

Where do you fit in the political typology?

Are you a Faith and Flag Conservative? Progressive Left? Or somewhere in between?

Take our quiz to find out which one of our nine political typology groups is your best match, compared with a nationally representative survey of more than 10,000 U.S. adults by Pew Research Center. You may find some of these questions are difficult to answer. That’s OK. In those cases, pick the answer that comes closest to your view, even if it isn’t exactly right.

About Pew Research Center Pew Research Center is a nonpartisan fact tank that informs the public about the issues, attitudes and trends shaping the world. It conducts public opinion polling, demographic research, media content analysis and other empirical social science research. Pew Research Center does not take policy positions. It is a subsidiary of The Pew Charitable Trusts .






  1. 55 Top-Rated Research Topics in Linguistics For an A+

    A critical evaluation of language and ethnicity. Analyzing language attrition among most English speakers. Distinct functions of language among different communities. Interesting Topics in ...

  2. 211 Interesting Research Topics in Linguistics For Your Thesis

    Linguistics Research Paper Topics. If you want to study how language is applied and its importance in the world, you can consider these Linguistics topics for your research paper. They are: An analysis of romantic ideas and their expression amongst French people. An overview of the hate language in the course against religion.

  3. 130+ Original Linguistics Research Topics: That Need To Know

    For these reasons, we offer quality research paper writing services for all students. We guarantee quality papers, timely deliveries, and originality. Reach out to our writers for top linguistics research papers today! Our original linguistics research topics focus on semantics, discourse, language acquisition, and sociolinguistics.

  4. 100+ Linguistic Topics

    100+ Linguistic Topics for Excellent Research Papers. Linguistics is an English language category that deals with logical dialectal analysis and interpretation. It seeks to reveal the form, meaning, and context of language. While most college students may perceive linguistics as a simple subject, it is pretty complex.

  5. Language and linguistics

    Drawing upon the philosophical theories of language—that the meaning and inference of a word is dependent on its use—we argue that the context in which use of the term patient occurs is ...

  6. A Guide to Writing a Senior Thesis in Linguistics

    have your results. If you have your topic, that means you're ready to get started. If you don't know where to go next with your research, reach out to your adviser to talk about it! They're here to help. Likewise, if you need help with formatting, outlining, organizing your writing, or any part of getting your ideas down on paper ...

  7. Second Language Research: Sage Journals

    Second Language Research is an international peer-reviewed, quarterly journal, publishing original theory-driven research concerned with second language acquisition and second language performance. This includes both experimental studies and contributions aimed at exploring conceptual issues. In addition to providing a forum for investigators in the field of non-native language learning...

  8. Trends and hot topics in linguistics studies from 2011 to 2021: A

    High citations most often characterize quality research that reflects the foci of the discipline. This study aims to spotlight the most recent hot topics and the trends looming from the highly cited papers (HCPs) in Web of Science category of linguistics and language & linguistics with bibliometric analysis. The bibliometric information of the 143 HCPs based on Essential Citation Indicators ...

  9. Key Topics in Applied Linguistics

    About Key Topics in Applied Linguistics. Books in this series provide critical accounts of the most important topics in applied linguistics, conceptualised as an interdisciplinary field of research and practice dealing with practical problems of language and communication. Some topics have been the subject of applied linguistics for many years ...

  10. Natural language processing: state of the art, current trends and

    Natural language processing (NLP) has recently gained much attention for representing and analyzing human language computationally. It has spread its applications in various fields such as machine translation, email spam detection, information extraction, summarization, medical, and question answering etc. In this paper, we first distinguish four phases by discussing different levels of NLP ...

  11. 100+ Compelling Linguistics Research Topics for University ...

    Linguistics is a science of language in which fact-finding is done through some rational and systematic study. While digging into the information about the history of linguistics, two perspectives on languages are unveiled: prescriptive and descriptive views. ... Stimulating Research Paper Topics In Sociolinguistics. While seeking linguistics ...

  12. 55 Best Research Topics in Linguistics For Top Students

    55 Top-Rated Research Topics in Linguistics For an A+. The field of linguistics is one of the easiest yet challenging subjects for college and university students. Areas such as phonology, phonetics, syntax, morphology, and semantics in linguistics can keep you up all night. That is why we came up with these quality language research topics.

  13. Key Topics in Second Language Acquisition

    He is the co-author of Key Topics in Second Language Acquisition and co-editor of the Multilingual Matters SLA book series. In 2015 he received the EUROSLA Distinguished Scholar Award. Vivian Cook is Emeritus Professor, Newcastle University, UK. He has been researching in the fields of second language acquisition and writing systems for over 45 ...

  14. Multilingual Large Language Model: A Survey of Resources, Taxonomy and

    The contributions of this paper can be summarized: (1) First survey: to our knowledge, we take the first step and present a thorough review in MLLMs research field according to multi-lingual alignment; (2) New taxonomy: we offer a new and unified perspective to summarize the current progress of MLLMs; (3) New frontiers: we highlight several ...



  16. 50+ Linguistics Research Topics For Papers And Projects

    Linguistics Topics on Politics. Politics is an ever-present phenomenon in any society. These dissertation topics in linguistics examine the issues surrounding language in the field of politics. We have explained samples of Ph.D. thesis topics in linguistics in this field. The reality of hate speech in selected communities.

  17. Top 100 Linguistic Research Topics for Students

    Here is a list of the most interesting linguistics topics that students can choose for their papers and essays. Explain how sociolinguistics help people understand multi-lingual language choices. A study of differences and similarities of Post-Tudor English. How language encourages gender differences.

  18. Research on learning and teaching of languages other than English in

    The very first article System published on teaching LOTEs was a discussion paper on teaching French phonetics at the University of Hawaii in 1973 (Niedzielski, 1973). Given the journal's original strong association with European readers and scholars, it is no surprise that European languages have been featured in the published studies on LOTEs throughout the journal's history.

  19. Vision, status, and research topics of Natural Language Processing

    Abstract. The field of Natural Language Processing (NLP) has evolved with, and also influenced, recent advances in Artificial Intelligence (AI) and computing technologies, opening up new applications and novel interactions with humans. Modern NLP involves machines' interaction with human languages for the study of patterns and obtaining ...

  20. Language and Social Interaction Research Paper Topics

    See our list of language and social interaction research paper topics. Language and social interaction (LSI) studies how language, gesture, voice, and other features of talk and written texts shape meaning-making. LSI defines itself by how it investigates questions about communication. It is the commitment to study of social life in its ...

  21. Topics, Authors, and Institutions in Large Language Model Research

    Large language models (LLMs) are dramatically influencing AI research, spurring discussions on what has changed so far and how to shape the field's future. To clarify such questions, we analyze a new dataset of 16,979 LLM-related arXiv papers, focusing on recent trends in 2023 vs. 2018-2022. First, we study disciplinary shifts: LLM research increasingly considers societal impacts, evidenced by ...

  22. 55 Research Paper Topics to Jump-Start Your Paper

    55 Research Paper Topics to Jump-Start Your Paper. Matt Ellis. Updated on October 9, 2023 Students. Coming up with research paper topics is the first step in writing most papers. While it may seem easy compared to the actual writing, choosing the right research paper topic is nonetheless one of the most important steps.

  23. 113 Great Research Paper Topics

    113 Great Research Paper Topics. One of the hardest parts of writing a research paper can be just finding a good topic to write about. Fortunately we've done the hard work for you and have compiled a list of 113 interesting research paper topics. They've been organized into ten categories and cover a wide range of subjects so you can easily ...
