
Hate in the Machine: Anti-Black and Anti-Muslim Social Media Posts as Predictors of Offline Racially and Religiously Aggravated Crime


Matthew L Williams, Pete Burnap, Amir Javed, Han Liu, Sefa Ozalp, 'Hate in the Machine: Anti-Black and Anti-Muslim Social Media Posts as Predictors of Offline Racially and Religiously Aggravated Crime', The British Journal of Criminology, Volume 60, Issue 1, January 2020, Pages 93–117, https://doi.org/10.1093/bjc/azz049


National governments now recognize online hate speech as a pernicious social problem. In the wake of political votes and terror attacks, hate incidents online and offline are known to peak in tandem. This article examines whether an association exists between both forms of hate, independent of ‘trigger’ events. Using Computational Criminology that draws on data science methods, we link police crime, census and Twitter data to establish a temporal and spatial association between online hate speech that targets race and religion, and offline racially and religiously aggravated crimes in London over an eight-month period. The findings renew our understanding of hate crime as a process, rather than as a discrete event, for the digital age.

Hate crimes have risen up the hierarchy of individual and social harms, following the revelation of record high police figures and policy responses from national and devolved governments. The highest number of hate crimes in history was recorded by the police in England and Wales in 2017/18. The 94,098 hate offences represented a 17 per cent increase on the previous year and a 123 per cent increase on 2012/13. Although the Crime Survey for England and Wales has recorded a consistent decrease in total hate crime victimization (combining race, religion, sexual orientation, disability and transgender), estimates for race and religion-based hate crimes in isolation show an increase from a 112,000 annual average (April 2013–March 2015) to a 117,000 annual average (April 2015–March 2017) (ONS, 2017). This increase does not take into account the likely rise in hate victimization in the aftermath of the 2017 terror attacks in London and Manchester. Despite improvements in hate crime reporting and recording, the consensus is that a significant ‘dark figure’ remains. There remains a policy and practice need to improve intelligence about hate crimes, and in particular to better understand the role community tensions and events play in patterns of perpetration. The HMICFRS (2018) inspection on police responses to hate crimes evidenced that forces remain largely ill-prepared to handle the dramatic increases in racially and religiously aggravated offences following events like the United Kingdom-European Union (UK-EU) referendum vote in 2016 and the terror attacks in 2017. Part of the issue is a significant reduction in Police Community Support Officers throughout England, and in particular London (Greig-Midlane (2014) indicates a reduction of around 50 per cent since 2010). Fewer officers in neighbourhoods gathering information and intelligence on community relations reduces the capacity of forces to pre-empt and mitigate spates of inter-group violence, harassment and criminal damage.

Technology has been heralded as part of the solution by transforming analogue police practices into a set of complementary digital processes that are scalable and deliverable in near real time (Williams et al., 2013; Chan and Bennett Moses, 2017; Williams et al., 2017a). In tandem with offline hate crime, online hate speech posted on social media has become a pernicious social problem (Williams et al., 2019). Thirty years on from the Home Office (1989) publication ‘The Response to Racial Attacks and Harassment’, which saw race hate on the streets become a priority for six central Whitehall departments, the police, Crown Prosecution Service (CPS) and courts (Bowling, 1993), the government is now making similar moves to tackle online hate speech. The Home Secretary in 2016 established the National Online Hate Crime Hub, a Home Affairs Select Committee in 2017 established an inquiry into hate crime, including online victimization, and a review by the Law Commission was launched by the prime minister to address the inadequacies in legislation relating to online hate. Social media giants, such as Facebook and Twitter, have been questioned by national governments and the European Union over policies that provided safe harbour to hate speech perpetrators. Previous research shows hate crimes offline and hate speech online are strongly correlated with events of significance, such as terror attacks, political votes and court cases (Hanes and Machin, 2014; Williams and Burnap, 2016). It is therefore reasonable to assume that online and offline hate in the immediate wake of such events are highly correlated. However, what is unclear is whether a more general pattern of correlation can be found independent of ‘trigger’ events. To test this hypothesis, we collected Twitter and police recorded hate crime data over an eight-month period in London and built a series of statistical models to identify whether a significant association exists. At the time of writing, no published work has shown such an association. Our models establish a general temporal and spatial association between online hate speech targeting race and religion and offline racially and religiously aggravated crimes independent of ‘trigger’ events. Our results have the potential to renew our understanding of hate crime as a process, rather than a discrete event (Bowling, 1993), for the digital age.

Since its inception, the Internet has facilitated the propagation of extreme narratives, often manifesting as hate speech targeting minority groups (Williams, 2006; Perry and Olsson, 2009; Burnap and Williams, 2015, 2016; Williams and Burnap, 2016; Williams et al., 2019). Home Office (2018) data show that 1,605 hate crimes were flagged as online offences in 2017/18, representing 2 per cent of all hate offences and a 40 per cent increase compared to the previous year. Online race hate crime makes up the majority of all online hate offences (52 per cent), followed by sexual orientation (20 per cent), disability (13 per cent), religion (12 per cent) and transgender online hate crime (4 per cent). Crown Prosecution Service data show that in 2017/18 there were 435 prosecutions related to online hate, a 13 per cent increase on the previous year (CPS, 2018). These figures are a significant underestimate. 1 HMICFRS (2018) found that despite the Home Office introducing a requirement for police forces to flag cyber-enabled hate crime offences, uptake of this practice has been patchy and inconsistent, resulting in unreliable data on prevalence.

Hawdon et al. (2017), using representative samples covering 15- to 30-year-olds in the United States, United Kingdom, Germany and Finland, found that on average 43 per cent of respondents had encountered hate material online (53 per cent for the United States and 39 per cent for the United Kingdom). Most hate material was encountered on social media, such as Twitter and Facebook. Ofcom (2018b), also using a representative UK sample, found that nearly half of UK Internet users reported seeing hateful content online in the past year, with 16- to 34-year-olds most likely to report seeing this content (59 per cent for 16–24s and 62 per cent for 25–34s). Ofcom also found that 45 per cent of 12- to 15-year-olds in 2017 reported encountering hateful content online, an increase on the 2016 figure of 34 per cent (Ofcom, 2018a; 2018c).

Administrative and survey data only capture a snapshot of the online hate phenomenon. Data science methods pioneered within Computational Criminology (see Williams and Burnap, 2016; Williams et al., 2017a) facilitate a real-time view of hate speech perpetration in action, arguably generating a more complete picture. 2 In 2016 and 2017, the Brexit vote and a string of terror attacks were followed by significant and unprecedented increases in online hate speech (see Figures 1 and 2). Although the production of hate speech increased dramatically in the wake of all these events, statistical models showed it was least likely to be retweeted in volume and to survive for long periods of time, supporting a ‘half-life’ hypothesis. Where hate speech was retweeted, it emanated from a core group of like-minded individuals who seek out each other’s messages (Williams and Burnap, 2016). Hate speech produced around the Brexit vote in particular was found to be largely driven by a small number of Twitter accounts. Around 50 per cent of anti-Muslim hate speech was produced by only 6 per cent of users, many of whom were classified as politically anti-Islam (Demos, 2017).

Fig. 1 UK anti-black and anti-Muslim hate speech on Twitter around the Brexit vote

Fig. 2 Global anti-Muslim hate speech on Twitter during 2017 (gaps relate to breaks in data collection)

The role of popular and politically organized racism in fostering terrestrial climates of intimidation and violence is well documented (Bowling, 1993). The far right, and some popular right-wing politicians, have been pivotal in shifting the ‘Overton window’ of online political discussion further to the extremes (Lehman, 2014), creating spaces where hate speech has become the norm. Early research shows the far right were quick to take to the Internet, largely unhindered by law enforcement due to constitutional protections around free speech in the United States. The outcome has been the establishment of extreme spaces that provide a collective virtual identity to previously fragmented hateful individuals. These spaces have helped embolden domestic hate groups in many countries, including the United States, United Kingdom, Germany, the Netherlands, Italy and Sweden (Perry and Olsson, 2009).

In late 2017, social media giants began introducing hate speech policies, bowing to pressure from the German government and the European Commission (Williams et al., 2019). Up to this point, Facebook, Instagram, YouTube and Twitter had been accused of ‘shielding’ far right pages because they generated advertising income due to their high number of followers. The ‘Tommy Robinson’ Facebook page, with 1 million followers, held the same protections as media and government pages, despite having nine violations of the platform’s policy on hate speech, whereas typically only five were tolerated by the content review process (Hern, 2018). The page was eventually removed in March 2019, a year after Twitter removed the account of Stephen Yaxley-Lennon (alias Tommy Robinson) from its platform.

Social media was implicated in the Christchurch, New Zealand extreme right-wing terror attack in March 2019. The terrorist was an avid user of social media, including Facebook and Twitter, but also more subversive platforms, such as 8chan. 8chan was the terrorist’s platform of choice when it came to publicizing his live Facebook video of the attack. His message opened by stating he was moving on from ‘shit-posting’ (using social media to spread hatred of minority groups) to taking the dialogue offline, into action. He labelled his message a ‘real life effort post’: the migration of online hate speech to offline hate crime/terrorism (Figure 3). The live Facebook video lasted for 17 minutes, with the first report to the platform being made after the 12th minute. The video was taken down within the hour, but it was too late to stop the widespread sharing. It was re-uploaded more than 2 million times on Facebook, YouTube, Instagram and Twitter and remained easily accessible over 24 hours after the attack. Facebook, Twitter and particularly 8chan were flooded with praise and support for the attack. Many of these posts were removed, but those on 8chan remain due to its lack of moderation.

Fig. 3 Christchurch extreme right terror attacker’s post on 8chan, broadcasting the live Facebook video

In the days following the terror attack, spikes in hate crimes were recorded across the United Kingdom. In Oxford, swastikas with the words “sub 2 PewDiePie” were graffitied on a school wall. In his video ahead of the massacre, the terrorist had asked viewers to ‘subscribe to PewDiePie’. The social media star, who earned $15.5 million in 2018 from his online activities, has become known for his anti-Semitic comments and endorsements of white supremacist conspiracies (Chokshi, 2019). In his uploaded 74-page manifesto, the terrorist also referenced Darren Osborne, the perpetrator of the Finsbury Park Mosque attack in 2017. Osborne is known to have been influenced by social media communications ahead of his attack. His phone and computers showed that he accessed the Twitter account of Stephen Yaxley-Lennon two days before the attack, having only started following the account two weeks prior. The tweet from Robinson read ‘Where was the day of rage after the terrorist attacks. All I saw was lighting candles’. A direct Twitter message was also sent to Osborne by Jayda Fransen of Britain First (Rawlinson, 2018). Other lone actor extreme right-wing terrorists, including Pavlo Lapshyn and Anders Breivik, are also known to have self-radicalized via the Internet (Peddell et al., 2016).

Far right and popular right-wing activity on social media, unhindered for decades due to free-speech protections, has shaped the perception of many users regarding what language is acceptable online. Further enabled by the disinhibiting and deindividuating effects of Internet communications, and the inability of the criminal justice system to keep pace with technological developments (Williams, 2006), social media abounds with online hate speech. Online controversies, such as Gamergate, the Bank of England Fry/Austen fiasco and the Mark Meechan scandal, among many others, demonstrate how easily users of social media take to antagonistic discourse (Williams et al., 2019). In recent times, these users have been given further licence by the divisive words of popular right-wing politicians wading into controversial debates in the hope of gaining support in elections and leadership contests. The offline consequences of this trend are yet to be fully understood, but it is worth reminding ourselves that those who routinely work with hate offenders agree that although not all people who are exposed to hate material go on to commit hate crimes on the streets, all hate crime offenders are likely to have been exposed to hate material at some stage (Peddell et al., 2016).

The study relates to conceptual work that examines the role of social media in political polarization (Sunstein, 2017) and the disruption of ‘hierarchies of credibility’ (Greer and McLaughlin, 2010). In the United States, online sources, including social media, now outpace traditional press outlets for news consumption (Pew Research Centre, 2018). The pattern in the United Kingdom is broadly similar, with only TV news (79 per cent) leading over the Internet (64 per cent) for all adults, and the Internet, in particular social media, taking first place for those aged 16–24 (Ofcom, 2018b). In the research on polarization, the general hypothesis tested is that disinformation is amplified in partisan networks of like-minded social media users, where it goes largely unchallenged because ranking algorithms filter out challenging posts. Sunstein (2017) argues that ‘echo chambers’ on social media reflecting increasingly extreme viewpoints are breeding grounds for ‘fake news’, far right and left conspiracy theories and hate speech. However, the evidence on the effect of social media on political polarization is mixed. Boxell et al. (2017) and Debois and Blank (2017), both using offline survey data, found that social media had a limited effect on polarization among respondents. Conversely, Brady et al. (2017) and Bail et al. (2018), using online and offline data, found strong support for the hypothesis that social media create political echo chambers. Bail et al. found that Republicans, and to a lesser extent Democrats, were likely to become more entrenched in their original views when exposed to opposing views on Twitter, highlighting the resilience of echo chambers to destabilization. Brady et al. found that emotionally charged (e.g. hate) messages about moral issues (e.g. gay marriage) increased diffusion within echo chambers, but not between them, indicating this as a factor in increasing polarization between liberals and conservatives.

A recently exposed factor that is a likely candidate for increasing polarization around events is the growing use of fake accounts and bots to spread divisive messages. Preliminary evidence shows that these automated Twitter accounts were active in the UK-EU referendum campaign, and most influential on the leave side (Howard and Kollanyi, 2016). Twitter accounts linked to the Russian Internet Research Agency (IRA) were also active in the Brexit debate following the vote. These accounts also spread fake news and promoted xenophobic messages in the aftermath of the 2017 UK terror attacks (Crest, 2017). Accounts at the extreme end of right-wing echo chambers were routinely targeted by the IRA to gain traction via retweets. Key political and far right figures have also been known to tap into these echo chambers to drum up support for their campaigns. On Twitter, Donald Trump has referred to Mexican immigrants as ‘criminals and rapists’, and retweeted far right activists after Charlottesville and Islamophobic tweets from the far right extremist group Britain First. The leaders of Britain First and the ex-leader of the English Defence League all used social media to spread their divisive narrative before they were banned from most platforms between December 2017 and March 2019. These extremist agitators and others like them have used the rhetoric of invasion, threat and otherness in an attempt to increase polarization online, in the hope that it spills into the offline, in the form of votes, financial support and participation in rallies. Research by Hope Not Hate (2019) shows that at the time of the publication of their report, 5 of the 10 far-right social media activists with the biggest online reach in the world were British. The newest recruits to these ideologies (e.g. Generation Identity) are highly technically capable and believe social media to be essential to building a larger following.

Whatever the effect of social media on polarization, and however this may vary by individual-level factors, the role of events, bots and far right agitators, there remains limited experimental research that pertains to the key aim of this article: its impact on the behaviour of the public offline. Preliminary unpublished work suggests a link between online polarizing activity and offline hate crime (Müller and Schwarz, 2018a, 2018b). But what remains under-theorized is why social media has salience in this context that overrides the effect of other sources (TV, newspapers, radio) espousing arguably more mainstream viewpoints. Greer and McLaughlin (2010) have written about the power of social media in the form of citizen journalism, demonstrating how the initially dominant police-driven media narrative of ‘protestor violence’ in the reporting of the G20 demonstration was rapidly disrupted by technology-driven alternative narratives of ‘police violence’. They conclude that ‘the citizen journalist provides a valuable additional source of real-time information that may challenge or confirm the institutional version of events’ (2010: 1059). Increasingly, far right activists like Stephen Yaxley-Lennon are adopting citizen journalism as a tactic to polarize opinion. Notably, Yaxley-Lennon live-streamed himself on social media outside Leeds Crown Court during the Huddersfield grooming trials to hundreds of thousands of online viewers. His version of events was imbued with anti-Islam rhetoric, and the stunt almost derailed the trial. Such tactics take advantage of immediacy, manipulation, partisanship and a lack of accountability rarely found in mainstream media. Such affordances can provide a veil of authenticity and realism to stories, having the power to reframe their original casting by the ‘official’ establishment narrative, further enabled by dramatic delivery of ‘evidence’ of events as they occur. The ‘hacking’ of the information-communications marketplace enabled by social media disrupts the primacy of conventional media, allowing those who produce subversive ‘fake news’ anti-establishment narratives to rise up the ‘hierarchy of credibility’. The impact of this phenomenon is likely considerable, given that over two-thirds of UK adults, and eight in ten 16- to 24-year-olds, now use the Internet as their main source of news (Ofcom, 2018b).

The hypotheses test whether online hate speech on Twitter, an indicator of right-wing polarization, can improve estimations of offline hate crimes over those using conventional predictors alone.

H1 : Conventional census regressors associated with hate crime in previous research will emerge as statistically significant.

‘Realistic’ threats are often associated with hate crimes (Stephan and Stephan, 2000; Roberts et al., 2013). These relate to resource threats, such as competition over jobs and welfare benefits. Espiritu (2004) shows how US census measures relating to economic context are statistically associated with hate crimes at the state level. In the United Kingdom, Ray et al. (2004) found that a sense of economic threat resulted in unacknowledged shame, which was experienced as rage directed toward the minority group perceived to be responsible for economic hardship. Demographic ecological factors, such as the proportion of the population who are black or minority ethnic and age structure, have also been associated with hate crime (Green, 1998; Nandi et al., 2017; Williams and Tregidga, 2014; Ray et al., 2004). In addition, educational attainment has been shown to relate to tolerance, even among those explicitly opposed to minority groups (Bobo and Licari, 1989).

H2 : Online hate speech targeting race and religion will be positively associated with police recorded racially and religiously aggravated crimes in London.

Preliminary unpublished work focusing on the United States and Germany has shown that posts from right-wing politicians that target minority groups, deemed as evidence of extreme polarization, are statistically associated with variation in offline hate crimes recorded by the police. Müller and Schwarz (2018a) found an association between Trump’s tweets about Islam-related topics and anti-Muslim hate in US counties. The same authors also found anti-refugee posts on the far-right Alternative für Deutschland’s Facebook page predicted offline violent crime against immigrants in Germany (Müller and Schwarz, 2018b). This hypothesis tests for the first time whether these associations are replicated in the United Kingdom’s largest metropolitan area.

H3 : Estimation models including the online hate speech regressor will increase the amount of offline hate crime variance explained in panel-models compared to models that include census variables alone.

Williams et al. (2017a) found that tweets mentioning terms related to the concept of ‘broken windows’ were statistically associated with police recorded crime (hate crime was not included) in London boroughs and improved upon the variance explained compared to census regressors alone. This hypothesis tests whether these results hold for the estimation of hate crimes.

The study adopted methods from Computational Criminology (see Williams et al., 2017a for an overview). Data were linked from administrative, survey and social media sources to build our statistical models. Police recorded racially and religiously aggravated offences data were obtained from the Metropolitan Police Service for an eight-month period between August 2013 and August 2014. UK census variables from 2011 were derived from the Nomis web portal. London-based tweets were collected over the same eight-month period using the Twitter streaming Application Programming Interface via the COSMOS software (Burnap et al., 2014). All sources were linked by month and Lower Layer Super Output Area (LSOA) in preparation for a longitudinal ecological analysis.
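As a minimal sketch of this linkage step (not the authors' pipeline), the three sources can be joined on LSOA and month as below; all file and column names are hypothetical.

```python
# Illustrative sketch of the data-linkage step; file and column names
# (lsoa, month, hate_tweets, etc.) are assumptions, not the study's own.
import pandas as pd

crimes = pd.read_csv("met_rr_offences.csv")       # one row per offence: lsoa, month, category
census = pd.read_csv("nomis_census_2011.csv")     # time-invariant: lsoa, no_quals, aged_16_24,
                                                  #   long_term_unemp, bame_prop
tweets = pd.read_csv("cosmos_london_tweets.csv")  # lsoa, month, hate_tweets, total_tweets

# Aggregate offences to LSOA-month counts
panel = (crimes.groupby(["lsoa", "month"])
               .size()
               .rename("rr_offences")
               .reset_index())

# Join time-variant Twitter measures on LSOA and month,
# then time-invariant census measures on LSOA alone
panel = (panel.merge(tweets, on=["lsoa", "month"], how="left")
              .merge(census, on="lsoa", how="left"))
```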

Dependent measures

Police recorded crime.

Police crime data were filtered to ensure that only race hate crimes related to anti-black/west/south Asian offences, and religious hate crimes related to anti-Islam/Muslim offences, were included in the measures. In addition to total police recorded racially and religiously aggravated offences (N = 6,572), data were broken down into three categories: racially and religiously aggravated violence against the person, criminal damage and harassment, reflecting Part II of the Crime and Disorder Act 1998.

Independent measures

Social media regressors.

Twitter data were used to derive two measures. Count of geo-coded Twitter posts: 21.7 million posts were located within the 4,720 London LSOAs over the study window as raw counts (Overall: mean 575; s.d. 1,566; min 0; max 75,788; Between: s.d. 1,451; min 0; max 53,345; Within: s.d. 589; min –23,108; max 28,178). Racial and religious online hate speech: the London geo-coded Twitter corpus was classified as ‘hateful’ or not (Overall: mean 8; s.d. 15.84; min 0; max 522; Between: s.d. 12.57; min 0; max 297; Within: s.d. 9.63; min –120; max 440). Working with computer scientists, a supervised machine learning classifier was built using the Weka tool to distinguish between ‘hateful’ Twitter posts with a focus on race (in this case anti-black/middle-eastern) and religion (in this case anti-Islam/Muslim), and more general non-‘hateful’ posts. A gold standard dataset of human-coded annotations was generated to train the machine classifier based on a sample of 2,000 tweets. For each tweet, human coders were tasked with selecting from a ternary set of classes (‘yes’, ‘no’ and ‘undecided’) in response to the following question: ‘is this text offensive or antagonistic in terms of race, ethnicity or religion?’ Tweets that achieved 75 per cent agreement and above from four human coders were transposed into a machine learning training dataset (undecided tweets were dropped). A Support Vector Machine with bag-of-words feature extraction emerged as the most accurate machine learning model, with a precision of 0.89, a recall of 0.69 and an overall F-measure of 0.771, above the established threshold of 0.70 in the field of information retrieval (van Rijsbergen, 1979). The final hate dataset consisted of 294,361 tweets, representing 1.4 per cent of total geo-coded tweets in the study window (consistent with previous research; see Williams and Burnap, 2016; 2018). Our measure of online hate speech is not designed to correspond directly to online hate acts deemed criminal in UK law. The threshold for criminal hate speech is high, and the legislation is complex (see CPS guidance and Williams et al., 2019). Ours is a measure of online inter-group racial and/or religious tension, akin to the offline community tensions that are routinely picked up by neighbourhood policing teams. Not all manifestations of such tension are necessarily criminal, but they may be indicative of pending activity that may be criminal. Examples of hate speech tweets in our sample include: ‘Told you immigration was a mistake. Send the #Muzzies home!’; ‘Integrate or fuck off. No Sharia law. #BurntheQuran’; and ‘Someone fucking knifed on my street! #niggersgohome’. 3
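The study built its classifier in Weka; as a rough, hedged illustration of the same approach (bag-of-words features feeding a Support Vector Machine, scored by precision, recall and F-measure), a scikit-learn sketch with a hypothetical gold-standard file might look as follows.

```python
# Sketch of the classification approach with scikit-learn (the study itself
# used Weka); the gold-standard file and its columns are hypothetical.
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer  # bag of words
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_validate

# Human-coded tweets; items with <75% coder agreement already dropped
gold = pd.read_csv("gold_standard_tweets.csv")  # columns: text, hateful (0/1)

X = CountVectorizer(lowercase=True, min_df=2).fit_transform(gold["text"])
y = gold["hateful"]

scores = cross_validate(LinearSVC(), X, y, cv=10,
                        scoring=["precision", "recall", "f1"])
# The paper reports precision 0.89, recall 0.69 and F-measure 0.771, where
# F = 2PR / (P + R), above the 0.70 information-retrieval threshold.
print({k: v.mean() for k, v in scores.items() if k.startswith("test_")})
```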

Census regressors.

Four measures were derived from 2011 census data based on the literature that estimated hate crime using ecological factors (e.g. Green, 1998; Espiritu, 2004). These include proportion of population: (1) with no qualifications, (2) aged 16–24, (3) long-term unemployed, and (4) black and minority ethnic (BAME). 4

Methods of estimation

The estimation process began with a single-level model that collapsed the eight months’ worth of police hate crime and Twitter data into one time period. Because of the skewed distribution of the data and the presence of over-dispersion, a negative binomial regression model was selected. These non-panel models provide a baseline against which to compare the second phase of modelling. To incorporate the temporal variability of police recorded crime and Twitter data, the second phase of modelling adopted a random- and fixed-effects regression framework. The first step was to test whether this framework was an improvement upon the non-panel model that did not take into account time variability. The Breusch–Pagan Lagrange multiplier test revealed random-effects regression was favourable over single-level regression. Random-effects modelling allows for the inclusion of time-variant (police and Twitter data) and time-invariant variables (census measures). Both types of variable were grouped into the 4,720 LSOA areas that make up London. Using LSOA as the unit of analysis in the models allowed for an ‘ecological’ appraisal of the explanatory power of race and religious hate tweets for estimating police recorded racially and religiously aggravated offences (Sampson, 2012). When the error term of an LSOA is correlated with the variables in the model, selection bias results from time-invariant unobservables, rendering random effects inconsistent. The alternative fixed-effects model, which is based on within-borough variation, removes such sources of bias by controlling for observed and unobserved ecological factors. Therefore, both random- and fixed-effects estimates are produced for all models. 5 A Poisson model was chosen over negative binomial, as the literature suggests the latter does not produce genuine fixed-effects (FE) estimations. 6 In addition, Poisson random-/fixed-effects (RE/FE) estimation with robust standard errors is recognized as the most reliable option in the presence of over-dispersion (Wooldridge, 1999). There were no issues with multicollinearity in the final models.
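A compressed sketch of this two-phase strategy in statsmodels is given below, continuing the assumed variable names from the linkage sketch. The LSOA-dummy formulation is one standard way of obtaining Poisson fixed effects, not necessarily the authors' exact routine.

```python
# Sketch of the estimation strategy (assumed names, not the authors' code).
import statsmodels.api as sm
import statsmodels.formula.api as smf

census_terms = "no_quals + aged_16_24 + long_term_unemp + bame_prop"

# Collapse the panel over the eight months for the baseline model
collapsed = (panel.groupby("lsoa", as_index=False)
                  .agg({"rr_offences": "sum", "hate_tweets": "sum",
                        "total_tweets": "sum", "no_quals": "first",
                        "aged_16_24": "first", "long_term_unemp": "first",
                        "bame_prop": "first"}))

# Phase 1 baseline: negative binomial, chosen for skew and over-dispersion;
# robust (HC1) standard errors.
nb = smf.glm(f"rr_offences ~ hate_tweets + {census_terms}",
             data=collapsed,
             family=sm.families.NegativeBinomial()).fit(cov_type="HC1")

# Phase 2 panel: Poisson fixed effects. For Poisson counts, entity dummies
# reproduce the FE estimator (unlike negative binomial), and robust errors
# guard against over-dispersion (Wooldridge, 1999). With 4,720 LSOAs the
# dummy formulation is slow but conceptually transparent.
fe = smf.glm("rr_offences ~ hate_tweets + total_tweets + C(lsoa)",
             data=panel,
             family=sm.families.Poisson()).fit(cov_type="HC1")
print(fe.params["hate_tweets"])
```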

Figures 4–7 show scatterplots with a fitted line (95% confidence interval in grey) of the three types of racially and religiously aggravated offences (plus combined) by race and religious hate speech on Twitter over the whole eight-month period. Scatterplots indicated a positive relationship between the variables. Two LSOAs emerged as clear outliers (LSOAs E01004736 and E01004763; see Figures 8 and 9) and required further inspection (not included in scatterplots). A jackknife resampling method was used to confirm whether these LSOAs (and others) were influential points. This method fits a negative binomial model in 4,720 iterations while suppressing one observation at a time, allowing the effect of each suppression on the model to be identified; in plain terms, it allows us to see how much each LSOA influences the estimations. Inspection of a scatterplot of dfbeta values (the amount that a particular parameter changes when an observation is suppressed) confirmed the above LSOAs as influential points, and in addition E01002444 (Hillingdon, in particular Heathrow Airport) and E01004733 (Westminster). The decision was made to build all models with and without outliers to identify any significant differences. The inclusion of all four outliers did change the magnitude of effects, standard errors, significance levels for some variables and model fit, so they were removed in the final models.
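In outline, the jackknife check can be written as a loop that refits the baseline model once per LSOA and records the resulting shift (the dfbeta) in the coefficient of interest. The sketch below assumes the `collapsed` data and formula from the previous sketch.

```python
# Jackknife influence check (sketch; assumes `collapsed` from above).
# One refit per LSOA (4,720 iterations).
import statsmodels.api as sm
import statsmodels.formula.api as smf

formula = ("rr_offences ~ hate_tweets + no_quals + aged_16_24 + "
           "long_term_unemp + bame_prop")
fam = sm.families.NegativeBinomial()
full_beta = smf.glm(formula, data=collapsed,
                    family=fam).fit().params["hate_tweets"]

dfbetas = {}
for lsoa in collapsed["lsoa"]:
    refit = smf.glm(formula, data=collapsed[collapsed["lsoa"] != lsoa],
                    family=fam).fit()
    # dfbeta: how far the coefficient moves when this LSOA is suppressed
    dfbetas[lsoa] = full_beta - refit.params["hate_tweets"]

# The largest movers flag influential points (E01004736, E01004763,
# E01002444 and E01004733 in the study).
print(sorted(dfbetas, key=lambda k: abs(dfbetas[k]), reverse=True)[:5])
```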

Fig. 4 Hate tweets by R & R aggravated violence against the person

Fig. 5 Hate tweets by R & R aggravated harassment

Fig. 6 Hate tweets by R & R aggravated criminal damage

Fig. 7 Hate tweets by R & R aggravated offences combined

Fig. 8 Outlier LSOA E01004736

Fig. 9 Outlier LSOA E01004763

Table 1 presents results from the negative binomial models for each type of racially and religiously aggravated crime category. These models do not take into account variation over time, so estimates should be considered as representing statistical associations covering the whole eight-month period of data collection, and a baseline against which to compare the panel models presented later. The majority of the census regressors emerge as significantly predictive of all racially and religiously aggravated crimes, broadly confirming previous hate crime research examining similar factors and partly supporting Hypothesis 1. Partly supporting Green (1998) and Nandi et al. (2017), the proportion of the population that is BAME emerged as positively associated with all race and religious hate crimes, with the greatest effect emerging for racially or religiously aggravated violence against the person. Partly confirming work by Bobo and Licari (1989), the models show a positive relationship between the proportion of the population with no qualifications and racially and religiously aggravated violence, criminal damage and total hate crime, but the association only emerged as significant for criminal damage. Proportion of the population aged 16–24 only emerged as significant for criminal damage and total hate crimes, and the relationship was negative, partly contradicting previous work (Ray et al., 2004; Williams and Tregidga, 2014). Like Espiritu (2004) and Ray et al. (2004), the models show that rates of long-term unemployment were positively associated with all race and religious hate crimes. Although this variable had the greatest effect in the models, we found an inverted U-shaped curvilinear relationship (indicated by the significant quadratic term). Figure 10 graphs the relationship, showing that as the proportion of the long-term unemployed population increases, victimization increases up to a turning point of 3.56 per cent, after which victimization begins to decrease.

Table 1 Negative binomial models (full 8-month period, N = 4,270)

Notes: Because of the presence of heteroskedasticity, robust standard errors are presented. *p < 0.05; **p < 0.01; ***p < 0.001. All models significant at the 0.0000 level.

Fig. 10 Plot of curvilinear relationship between long-term unemployment and racially and religiously aggravated crime
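The turning point follows directly from the quadratic specification. As a sketch (the underlying coefficients are not reproduced here, only the turning point reported in the text), writing u for the long-term unemployment proportion:

```latex
% Quadratic unemployment term in the count model; \beta > 0, \gamma < 0.
\log \mathbb{E}[\,\text{offences} \mid u, \mathbf{x}\,]
  = \alpha + \beta u + \gamma u^{2} + \mathbf{x}^{\top}\boldsymbol{\delta}
% The inverted U peaks where the derivative in u vanishes:
\beta + 2\gamma u^{*} = 0
  \quad\Longrightarrow\quad
  u^{*} = -\frac{\beta}{2\gamma} \approx 3.56 \text{ per cent}
```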

This finding at first seems counter-intuitive, but a closer inspection of the relationship between the proportion of the population that is long-term unemployed and the proportion that is BAME reveals a possible explanation. LSOAs with very high long-term unemployment and BAME populations overlap. Where this overlap is significant, we find relatively low rates of hate crime. For example, LSOA E01001838 in Hackney, in particular the Frampton Park Estate area, has 6.1 per cent long-term unemployment, a 68 per cent BAME population and only 2 hate crimes, and LSOA E01003732 in Redbridge has 5.6 per cent long-term unemployment, a 76 per cent BAME population and only 2 hate crimes. These counts of hate crime are either below or only slightly above the mean for London (mean = 1.39, maximum = 390). We know from robust longitudinal analysis by Nandi et al. (2017) that minority groups living in very high majority white areas are significantly more likely to report experiencing racial harassment. This risk decreases in highly multicultural areas where there is low support for far right groups, such as London. A simple regression (not shown here) in which the BAME population proportion was included as the only regressor does show an inverted U-shaped relationship with all hate crimes, with the risk of victimization decreasing when the proportion far outweighs the white population. However, this curve was smoothed out when other regressors were included in the models. This analysis therefore suggests that LSOAs with high rates of long-term unemployment but lower rates of hate crime are likely to be those with high proportions of BAME residents, some of whom will be long-term unemployed themselves but unlikely to be perpetrating hate crimes against the ingroup.

Supporting Hypothesis 2, all negative binomial models show online hate speech targeting race and religion is positively associated with all offline racially and religiously aggravated offences, including total hate crimes, in London over the eight-month period. The magnitude of the effect is relatively even across offence categories. When considering the effect of the Twitter regressors against the census regressors, it must be borne in mind that the unit of change needed with each regressor to affect the outcome differs. For example, a percentage change in the BAME population proportion in an LSOA is quite different from a change in the count of hate tweets in the same area. The latter is far more likely to vary to a much greater extent and far more rapidly (see later in this section). The associations identified in these non-panel models indicate a strong link between hateful Twitter posts and offline racially and religiously aggravated crimes in London. Yet, it is not possible with these initial models to state the direction of association: we cannot say whether online hate speech precedes rather than follows offline hate crime.

Table 2 presents results from RE/FE Poisson models that incorporate variation over space and time. RE/FE models have been used to indicate causal pathways in previous criminological research; however, we suggest such claims in this article would stretch the data beyond their limits. As we adopt an ecological framework, using LSOAs as our unit of analysis, and not individuals, we cannot state with confidence that area-level factors cause the outcome. There are likely sub-LSOA factors that account for causal pathways, but we were unable to observe these in this study design. Nevertheless, the RE/FE models represent a significant improvement over the negative binomial estimations presented earlier and are suitable for subjecting those earlier findings to a more robust test. Indeed, FE models are the most robust test given they are based solely on within-LSOA variation, allowing for the elimination of potential sources of bias by controlling for observed and unobserved ecological characteristics (Allison, 2009). In contrast, RE models only take into account the factors included as regressors. These models therefore allow us to determine whether online hate speech precedes rather than follows offline hate crime.

Table 2 Random- and fixed-effects Poisson regression models

Notes: Table shows results of separate random- and fixed-effects models. To determine whether RE or FE is preferred, the Hausman test can be used. However, this test has been shown to be inefficient, and we prefer not to rely on it for interpreting our models (see Troeger, 2008). Therefore, both RE and FE results should be considered together. Because of the presence of heteroskedasticity, robust standard errors are presented. Adjusted R-square for random-effects models only. *p < 0.05; **p < 0.01; ***p < 0.001. All models significant at the 0.0000 level.

The RE/FE modelling was conducted in three stages (Models A to C) to address Hypothesis 3: to assess the magnitude of the change in the variance explained in the outcomes when online hate speech is added as a regressor. Model A includes only the census regressors for the RE estimations, and for all hate crime categories broadly similar patterns of association emerge compared to the non-panel models. The variance explained by the set of census regressors ranges between 2 per cent and 6 per cent. Such low adjusted R-square values are not unusual for time-invariant regressors in panel models (Allison, 2009).

Models B and C were estimated with RE and FE and introduce the Twitter variables of online hate speech and total count of geo-coded tweets. Model B introduces online hate speech alone, and both RE and FE results show positive significant associations with all hate crime categories. The largest effect in the RE models emerges for harassment (IRR 1.004). For every unit increase in online hate speech, a corresponding 0.004 per cent increase is observed in the dependent. Put in other terms, an increase of 100 hate tweets would correspond to a 0.4 per cent increase, and an increase of 1,000 tweets to a 4 per cent increase, in racially or religiously aggravated harassment in a given month within a given LSOA. Given we know hate speech online increases dramatically in the aftermath of trigger events (Williams and Burnap, 2015), the first example of an increase of 100 hate tweets in an LSOA is not fanciful. The magnitude of the effect for harassment, compared to the other hate offences, is also expected, given that hate-related public order offences, which include causing public fear, alarm and distress, also increased most dramatically in the aftermath of the ‘trigger’ events alluded to above (accounting for 56 per cent of all hate crimes recorded by police in 2017/18; Home Office, 2018). The adjusted R-square statistic for Model B shows large increases in the variance explained in the dependents by the inclusion of online hate speech as a regressor, ranging between 13 per cent and 30 per cent. Interpretation of these large increases should be tempered, given that time-variant regressors can exert a significant effect in panel models (Allison, 2009). Nonetheless, the significant associations in both RE and FE models and the improvement in the variance explained provide strong support for Hypotheses 2 and 3.

Model C RE and FE estimations control for total counts of geo-coded tweets, thereby removing any variance explained by the hate speech regressor acting as a proxy for population density (Malleson and Andresen, 2015). In all models, the direction and significance of the relationship between online hate speech and hate crimes do not change, but the magnitude of the effect does decrease, indicating the regressor was likely also acting, albeit to a small extent, as a proxy for population density. The FE models also include an interaction variable between the time-invariant regressor proportion of the population that is BAME and the time-variant regressor online hate speech. The interaction term was significant for all hate crime categories, with the strongest effect emerging for racially and religiously aggravated violence against the person. Figure 11 presents a predicted probability plot combining both variables for the outcome of violent hate crime. In an LSOA with a 70 per cent BAME population and 300 hate tweets posted a month, the incidence rate of racially and religiously aggravated violence is predicted to be between 1.75 and 2. However, the skewed distribution of the sample must be borne in mind when interpreting these predictions. Just over 70 per cent of LSOAs have a BAME population of 50 per cent or less and 150 or fewer hate tweets per month; the predicted incidence for offences in these areas is therefore between 1 and 1.25 (lower-left dark blue region of the plot). This plot provides predictions based on the model estimates, meaning that if in the future populations and hate tweets were to increase toward the upper end of the spectrum, these are the incidence rates of racially and religiously aggravated violence we would expect to observe in London.

Fig. 11 Predicted probability of R & R agg. violence by BAME population proportion and hate tweet count
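A surface such as Figure 11 can be generated by predicting from the fitted model over a grid of the two interacted variables while holding the remaining regressors at their means. The sketch below assumes `model_c` is a fitted formula-based GLM containing the interaction term (the random-effects-style variant without LSOA dummies) and that `panel` is the linked dataset from earlier; all names are illustrative.

```python
# Sketch: predicted incidence surface over BAME proportion x hate tweets,
# other covariates held at their means. `model_c` and `panel` are assumed
# from earlier sketches; a formula-fitted GLM's .predict() accepts a frame
# containing every variable named in its formula.
import numpy as np
import pandas as pd

bame = np.linspace(0, 90, 10)    # per cent BAME population
hate = np.linspace(0, 500, 11)   # hate tweets per LSOA-month
grid = pd.DataFrame([(b, h) for b in bame for h in hate],
                    columns=["bame_prop", "hate_tweets"])

for col in ["no_quals", "aged_16_24", "long_term_unemp", "total_tweets"]:
    grid[col] = panel[col].mean()  # hold remaining regressors constant

grid["predicted_rate"] = model_c.predict(grid)
# The paper reads off a predicted rate of roughly 1.75-2 violent offences
# at 70 per cent BAME and 300 hate tweets per month, and 1-1.25 in the
# lower-left region covering most LSOAs.
```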

Our results indicate a consistent positive association between Twitter hate speech targeting race and religion and offline racially and religiously aggravated offences in London. Previous published work indicated an association around events that acted as ‘triggers’ for online and offline hate acts. This study confirms the association is consistent in both the presence and absence of such events. The models allowed us to provide predictions of the incidence rate of offline offences by proportion of the population that is BAME and the count of online hate tweets. The incidence rate for nearly three-quarters of LSOAs within London, when taking into account these and other factors in the models, remains below 1.25. Were the number of hate tweets sent per month to increase dramatically in an area with a high BAME population, our predictions suggest much higher incidence rates. This is noteworthy, given what we know about the impact of ‘trigger’ events on hate speech, and indicates that the role of social media in the process of hate victimization is non-trivial.

Although we were not able to directly test the role of online polarization and far right influence on the prevalence of offline hate crimes, we are confident that our focus on online hate speech acted as a ‘signature’ measure of these two phenomena. Through the various mechanisms outlined in the theoretical work presented in this article, it is plausible to conclude that hate speech posted on social media, an indicator of extreme polarization, influences the frequency of offline hate crimes. However, it is unlikely that online hate speech is directly causal of offline hate crime in isolation. It is more likely the case that social media is only part of the formula, and that local-level factors, such as the demographic make-up of neighbourhoods (e.g. black and minority ethnic population proportion, unemployment) and other ecological-level factors play key roles, as they always have in estimating hate crime (Green, 1998; Espiritu, 2004; Ray et al., 2004). What this study contributes is a data- and theory-driven understanding of the relative importance of online hate speech in this formula. If we are to explain hate crime as a process and not a discrete act, with victimization ranging from hate speech through to violent victimization, social media must form part of that understanding (Bowling, 1993; Williams and Tregidga, 2014).

Our results provide an opportunity to renew Bowling’s (1993) call to see racism as a continuity of violence, threat and intimidation. We concur that hate crimes must be conceptualized as a process set in geographical, social, historical and political context. We would add that ‘technological’ context is now a key part of this conceptualization. The enduring quality of hate victimization, characterized by repeated or continuous insult, threat or violence, now extends into the online arena and can be linked to its offline manifestation. We argue that hate speech on social media extends the ‘climates of unsafety’ experienced by minority groups that transcend individual instances of victimization (Stanko, 1990). Online hate for many minorities is part and parcel of everyday life; as Pearson et al. (1989: 135) state, ‘A black person need never have been the actual victim of a racist attack, but will remain acutely aware that she or he belongs to a group that is threatened in this manner’. This is no less true in the digital age. Social media, through various mechanisms such as unfettered use by the far right, polarization, events, and psychological processes such as deindividuation, has been widely infected with a casual low-level intolerance of the racial Other.

Our study informs the ongoing debate on ‘predictive policing’, which uses big data and algorithms to find patterns at a scale and speed hitherto unrealizable in law enforcement (Kaufmann et al., 2019). Much of the criminological literature is critical. The process of pattern identification further embeds existing power dynamics and biases, sharpens the focus on the symptoms rather than the causes of criminality, and supports pre-emptive governance by new technological sovereigns (Chan and Bennett Moses, 2017). These valid concerns pertain mainly to predictive policing efforts that apply statistical models to data on crime patterns, offender histories, administrative records and demographic area profiles. Such models and data formats tend to produce outcomes that reflect existing patterns and biases because of their historical nature. Our work mitigates some of the existing pitfalls in prediction efforts in three ways: (1) the data used in estimating patterns are not produced by the police, meaning they are immune from the inherent biases normally present in the official data generation process; (2) social media data are collected in real time, reducing the error introduced by ‘old’ data that are no longer reflective of context; and (3) viewing minority groups as likely victims and not offenders, while not addressing the existing purported bias in ongoing predictive policing efforts, demonstrates how new forms of data and technology can be tailored to achieve alternative outcomes. However, the models reported in this article are not without their flaws, and ahead of their inclusion in real-life applications, we would warn that predictions alone do not necessarily lead to good policing on the streets. As in all statistics, there are degrees of error, and models are only a crude approximation of what might be unfolding on the ground. In particular, algorithmic classification of hate speech is not perfect, and precision, accuracy and recall decay as language shifts over time and space. Therefore, any practical implementation would require a resource-intensive process to ensure algorithms are updated and tested frequently to avoid unacceptable levels of false positives and negatives.

Finally, we consider the methodological implications of this study to be as significant as those outlined by Bowling (1993). Examining the contemporary hate victimization dynamic requires methods that are able to capture time and space variations in both online and offline data. Increasing the sources of data on hate is also important due to continued low rates of reporting. We demonstrated how administrative (police records), survey (census) and new forms of data (Twitter) can be linked to study hate in the digital age. Surveys, interviews and ethnographies should be complemented by these new technological methods of enquiry to enable a more complete examination of the social processes that give rise to contemporary hate crimes. In the digital age, computational criminology, drawing on dynamic data science methods, can be used to study the patterning of online hate speech victimization and associated offline victimization. However, before criminologists and practitioners incorporate social media into their ‘data diets’, awareness of potential forms of bias in these new forms of data is essential. Williams et al. (2017a) identified several sources of bias, including variations in the use of social media (e.g. Twitter being much more popular with younger people). This is particularly pertinent given the recent abandonment of Twitter by many far right users following a clamp-down on hate speech in Europe. A reduction in this type of user may see a corresponding decrease in hate tweets as they flock to more underground platforms, such as 8chan, 4chan, Gab and Voat, which are currently more difficult to incorporate into research and practical applications. The data used in this study were collected before the social media giants introduced strict hate speech policies. Nonetheless, we would expect hate speech to be displaced, and in time data science solutions will allow us to follow the hate wherever it goes.

The government publication of ‘The Response to Racial Attacks and Harassment’ in 1989 saw a sea-change in the way criminal justice agencies, and eventually the public, viewed hate crime in the United Kingdom (Home Office, 1989). In 2019, the government published its Online Harms White Paper, which attempts to achieve the same for online hate (Cabinet Office, 2019). Over the past decade, victims of online hate have struggled to convince others that they are undeserving targets of a harm sufficiently serious to warrant collective concern, owing to insufficient empirical credibility and calls for recognition that have consequently gone unheard. This research shows that online hate victimization is part of a wider process of harm that can begin on social media and then migrate to the physical world. Qualitative work shows direct individual-level links between online and offline hate victimization (Awan and Zempi, 2017). Our study extends this to the ecological level at the scale of the UK’s largest metropolitan area. Despite this significant advancement, we were unable to examine sub-LSOA factors, meaning the individual-level mechanisms responsible for the link between online and offline hate incidents remain to be established by more forensic, and possibly qualitative, work. The combination of the data science-driven results of this study and future qualitative work has the potential to address the reduced capacity of the police to gain intelligence on terrestrial community tensions that lead to hate crimes. Such a technological solution may even assist in redressing the bias reportedly present in ‘predictive policing’ efforts, by refocussing the algorithmic lens away from those historically targeted by police, onto those who perpetrate harms against minorities.

This work was supported by the Economic and Social Research Council grant ‘Centre for Cyberhate Research and Policy: Real-Time Scalable Methods & Infrastructure for Modelling the Spread of Cyberhate on Social Media’ (grant number ES/P010695/1) and the US Department of Justice National Institute of Justice grant ‘Understanding Online Hate Speech as a Motivator for Hate Crime’ (grant number 2016-MU-MU-0009).

Allison, P. D. (2009), Fixed Effects Regression Models. Sage.


Awan, I. and Zempi, I. (2017), ‘“I Will Blow Your Face Off”: Virtual and Physical World Anti-Muslim Hate Crime’, British Journal of Criminology, 57: 362–80.

Bail, C. A., Argyle, L. P., Brown, T. W., Bumpus, J. P., Chen, H., Hunzaker, M. B. F., Lee, J., Mann, M., Merhout, F. and Volfovsky, A. (2018), ‘Exposure to Opposing Views on Social Media Can Increase Political Polarization’, PNAS, 115: 9216–21.

Bobo, L. and Licari, F. C. (1989), ‘Education and Political Tolerance: Testing the Effects of Cognitive Sophistication and Target Group Affect’, Public Opinion Quarterly, 53: 285–308.

Bowling, B. (1993), ‘Racial Harassment and the Process of Victimisation: Conceptual and Methodological Implications for the Local Crime Survey’, British Journal of Criminology, 33: 231–50.

Boxell, L., Gentzkow, M. and Shapiro, J. M. (2017), ‘Greater Internet Use Is Not Associated with Faster Growth in Political Polarization among US Demographic Groups’, PNAS, 114: 10612–17.

Brady, W. J., Wills, J. A., Jost, J. T., Tucker, J. A. and Van Bavel, J. J. (2017), ‘Emotion Shapes the Diffusion of Moralized Content in Social Networks’, PNAS, 114: 7313–18.

Burnap, P., Rana, O., Williams, M., Housley, W., Edwards, A., Morgan, J., Sloan, L. and Conejero, J. (2014), ‘COSMOS: Towards an Integrated and Scalable Service for Analyzing Social Media on Demand’, IJPSDS, 30: 80–100.

Burnap, P. and Williams, M. L. (2015), ‘Cyber Hate Speech on Twitter: An Application of Machine Classification and Statistical Modeling for Policy and Decision Making’, Policy & Internet, 7: 223–42.

———. (2016), ‘Us and Them: Identifying Cyber Hate on Twitter across Multiple Protected Characteristics’, EPJ Data Science, 5: 1–15.

Cabinet Office. (2019), Internet Safety White Paper. Cabinet Office.

Chan, J. and Bennett Moses, L. (2017), ‘Making Sense of Big Data for Security’, British Journal of Criminology, 57: 299–319.

Chokshi, N. (2019), ‘PewDiePie in Spotlight After New Zealand Shooting’. New York Times.

CPS. (2018), Hate Crime Report 2017–18. Crown Prosecution Service.

Crest. (2017), Russian Influence and Interference Measures Following the 2017 UK Terrorist Attacks. Centre for Research and Evidence on Security Threats.

Debois, E. and Blank, G. (2017), ‘The Echo Chamber is Over-Stated: The Moderating Effect of Political Interest and Diverse Media’, Information, Communication & Society, 21: 729–45.

Demos. (2017), Anti-Islamic Content on Twitter. Demos.

Espiritu, A. (2004), ‘Racial Diversity and Hate Crime Incidents’, The Social Science Journal, 41: 197–208.

Green, D. P., Strolovitch, D. Z. and Wong, J. S. (1998), ‘Defended Neighbourhoods, Integration and Racially Motivated Crime’, American Journal of Sociology, 104: 372–403.

Greer, C. and McLaughlin, E. (2010), ‘We Predict a Riot? Public Order Policing, New Media Environments and the Rise of the Citizen Journalist’, British Journal of Criminology, 50: 1041–59.

Greig-Midlane, J. (2014), Changing the Beat? The Impact of Austerity on the Neighbourhood Policing Workforce. Cardiff University.

Hanes, E. and Machin, S. (2014), ‘Hate Crime in the Wake of Terror Attacks: Evidence from 7/7 and 9/11’, Journal of Contemporary Criminal Justice, 30: 247–67.

Hawdon , J. , Oksanen , A. and Räsänen , P . ( 2017 ), ‘Exposure To Online Hate In Four Nations: A Cross-National Consideration’ , Deviant Behavior , 38 : 254 – 66 .

Hern , A . ( 2018 ), Facebook Protects Far-Right Activists Even After Rule Breaches . The Guardian .

HMICFRS. ( 2018 ), Understanding the Difference: The Initial Police Response to Hate Crime . Her Majesty’s Inspectorate of Constabulary and Fire and Rescue Service .

Home Office. ( 1989 ), The Response to Racial Attacks and Harassment: Guidance for the Statutory Agencies, Report of the Inter-Departmental Racial Attacks Group . Home Office .

———. ( 2018 ), Hate Crime, England and Wales 2017/18 . Home Office .

Hope Not Hate. ( 2019 ), State of Hate 2019 . Hope Not Hate .

Howard , P. N. and Kollanyi , B . ( 2016 ), Bots, #StringerIn, and #Brexit: Computational Propeganda during the UK-EU Referendum . Unpublished Research Note. Oxford University Press.

Kaufmann , M. , Egbert , S. and Leese , M . ( 2019 ), ‘Predictive Policing and the Politics of Patterns’ , British Journal of Criminology , 59 : 674 – 92 .

Lehman , J . ( 2014 ), A Brief Explanation of the Overton Window . Mackinac Center for Public Policy .

Malleson , N. and Andresen , M. A . ( 2015 ), ‘Spatio-temporal Crime Hotspots and The Ambient Population’ , Crime Science , 4 : 1 – 8 .

Müller , K. and Schwarz , C. ( 2018a ), Making America Hate Again? Twitter and Hate Crime Under Trump . Unpublished working paper. University of Warwick.

———. ( 2018b ), Fanning the Flames of Hate: Social Media and Hate Crime . Unpublished working paper. University of Warwick.

Nandi , A. , Luthra , R. , Saggar , S. and Benzeval , M . ( 2017 ), The Prevalence and Persistence of Ethnic and Racial Harassment and Its Impact on Health: A Longitudinal Analysis . University of Essex .

Ofcom. ( 2018a ), Children and Parents: Media Use and Attitudes . Ofcom

———. ( 2018b ), News Consumption in the UK: 2018 . Ofcom .

———. ( 2018c ), Adults’ Media Use and Attitudes Report . Ofcom

ONS. ( 2017 ), CSEW Estimates of Number of Race and Religion Related Hate Crime in England and Wales, 12 Months Averages, Year Ending March 2014 to Year Ending March 2017 . Office for National Statistics .

Pearson , G. , Sampson , A. , Blagg , H. , Stubbs , P. and Smith , D. J . ( 1989 ), ‘Policing Racism’, in R. Morgan and D. J. Smith , eds., Coming to Terms with Policing: Perspectives on Policy . Routledge .

Peddell , D. , Eyre , M. , McManus , M. and Bonworth , J . ( 2016 ), ‘Influences and Vulnerabilities in Radicalised Lone Actor Terrorists: UK Practitioner Perspectives’ , International Journal of Police Science and Management , 18 : 63 – 76 .

Perry , B. and Olsson , P . ( 2009 ), ‘Cyberhate: The Globalisation of Hate’ , Information & Communications Technology Law , 18 : 185 – 99 .

Pew Research Centre. ( 2018 ), Americans Still Prefer Watching to Reading the News . Pew Research Centre .

Rawlinson , K . ( 2018 ), Finsbury Park-accused Trawled for Far-right Groups Online, Court Told . The Guardian .

Ray , L. , Smith , D. and Wastell , L . ( 2004 ), ‘Shame, Rage and Racist Violence’ , British Journal of Criminology , 44 : 350 – 68 .

Roberts, C., Innes, M., Williams, M. L., Tregidga, J. and Gadd, D. (2013), Understanding Who Commits Hate Crimes and Why They Do It [Project Report]. Welsh Government.

van Rijsbergen , C. J . ( 1979 ), Information Retrieval (2nd ed.), Butterworth .

Sampson , R. J . ( 2012 ), Great American City: Chicago and the Enduring Neighborhood Effect . University of Chicago Press .

Stanko . ( 1990 ), Everyday Violence . Pandora .

Stephan , W. G. and Stephan , C. W . ( 2000 ), An Integrated Threat Theory of Prejudice . Lawrence Erlbaum Associates .

Sunstein , C. R . ( 2017 ), #Republic: Divided Democracy in the Age of Social Media . Princeton University Press .

Troeger, V. E. (2008), ‘Problematic Choices: Testing for Correlated Unit Specific Effects in Panel Data’, Presented at 25th Annual Summer Conference of the Society for Political Methodology, 9–12 July 2008.

Williams, M. L. (2006), Virtually Criminal: Crime, Deviance and Regulation Online . Routledge.

Williams , M. and Burnap , P . ( 2016 ), ‘Cyberhate on Social Media in the Aftermath of Woolwich: A Case Study in Computational Criminology and Big Data’ , British Journal of Criminology , 56 : 211 – 38 .

———. ( 2018 ), Antisemitic Content on Twitter . Community Security Trust .

Williams , M. and Tregidga , J . ( 2014 ), ‘Hate Crime Victimisation in Wales: Psychological and Physical Impacts Across Seven Hate Crime Victim-types’ , British Journal of Criminology , 54 : 946 – 67 .

Williams , M. L. , Burnap , P. and Sloan , L. ( 2017a ), ‘Crime Sensing With Big Data: The Affordances and Limitations of Using Open-source Communications to Estimate Crime Patterns’ , The British Journal of Criminology , 57 : 320 – 40.

———. ( 2017b ), ‘Towards an Ethical Framework for Publishing Twitter Data in Social Research: Taking into Account Users’ Views, Online Context and Algorithmic Estimation’ , Sociology , 51 : 1149 – 68 .

Williams, M. L., Eccles-Williams, H. and Piasecka, I. (2019), Hatred Behind the Screens: A Report on the Rise of Online Hate Speech . Mishcon de Reya.

Williams, M. L., Edwards, A. E., Housley, W., Burnap, P., Rana, O. F., Avis, N. J., Morgan, J. and Sloan, L. (2013), ‘Policing Cyber-Neighbourhoods: Tension Monitoring and Social Media Networks’, Policing and Society , 23: 461–81.

Wooldridge , J. M . ( 1999 ), ‘Distribution-Free Estimation of Some Nonlinear Panel Data Models’ , Journal of Econometrics , 90 : 77 – 97 .

For current CPS guidance on what constitutes an online hate offence see: https://www.cps.gov.uk/legal-guidance/social-media-guidelines-prosecuting-cases-involving-communications-sent-social-media .

Not all hate speech identified reaches the threshold for a criminal offence in England and Wales.

These are not actual tweets from the dataset but constructed illustrations that maintain the original meaning of authentic posts while preserving the anonymity of tweeters (see Williams et al., 2017b for a fuller discussion of the ethics of social media research).

Other census measures were excluded due to multicollinearity, including religion.

To determine whether RE or FE is preferred, the Hausman test can be used. However, this test has been shown to be inefficient, and we prefer not to rely on it when interpreting our models (see Troeger, 2008 ). Therefore, both RE and FE results should be considered together; a minimal code sketch of this comparison follows these notes.

See https://www.statalist.org/forums/forum/general-stata-discussion/general/1323497-choosing-between-xtnbreg-fe-bootstrap-and-xtpoisson-fe-cluster-robust .
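As a concrete illustration of reporting both sets of estimates, the sketch below fits fixed- and random-effects panel models in Python with the linearmodels package. It is a simplified linear stand-in for the count models referenced above (xtnbreg/xtpoisson in Stata), and the file and variable names (lsoa_month_panel.csv, agg_crimes, hate_tweets) are hypothetical.

```python
# Hedged sketch: comparing fixed- and random-effects panel estimates in
# Python with the linearmodels package. A LINEAR illustration of the
# FE/RE comparison discussed in the notes above, not the count models
# actually estimated; all file and variable names are hypothetical.
import pandas as pd
from linearmodels.panel import PanelOLS, RandomEffects

# Hypothetical LSOA-by-month panel; 'month' assumed numeric (1..n).
df = pd.read_csv("lsoa_month_panel.csv").set_index(["lsoa", "month"])

# Fixed effects: within-LSOA estimator with entity-clustered errors.
fe = PanelOLS.from_formula(
    "agg_crimes ~ hate_tweets + EntityEffects", data=df
).fit(cov_type="clustered", cluster_entity=True)

# Random effects: requires an explicit constant in the formula.
re = RandomEffects.from_formula("agg_crimes ~ 1 + hate_tweets", data=df).fit()

# If the two coefficients diverge sharply, the RE assumption (unit
# effects uncorrelated with regressors) is suspect; reporting both,
# as the note above suggests, avoids leaning on the Hausman test.
print(fe.params["hate_tweets"], re.params["hate_tweets"])
```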


Hate speech, toxicity detection in online social media: a recent survey of state of the art and opportunities

  • Regular Contribution
  • Published: 25 September 2023
  • Volume 23, pages 577–608 (2024)


  • Anjum
  • Rahul Katarya (ORCID: orcid.org/0000-0001-7763-291X)


Information and communication technology has evolved dramatically, and the majority of people now use the internet and share their opinions more openly, which has led to the creation, collection and circulation of hate speech across multiple platforms. The anonymity and mobility afforded by these social media platforms allow people to hide behind a screen and spread hate effortlessly. Online hate speech (OHS) recognition can play a vital role in stopping such activities and can thus restore the position of public platforms as an open marketplace of ideas. To study hate speech detection in social media, we surveyed the related datasets available on the web. We further analyzed approximately 200 research papers indexed in different journals from 2010 to 2022. The papers were divided into various sections according to the approaches used in OHS detection, i.e., feature selection, traditional machine learning (ML) and deep learning (DL). Based on the 111 selected papers, we found that 44 articles used traditional ML and 35 used DL-based approaches. We concluded that most authors used SVM, Naive Bayes and Decision Tree in ML, and CNN and LSTM in the DL approach. This survey contributes by providing a systematic approach to help researchers identify new research directions in online hate speech.
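To make the surveyed ‘traditional ML’ approach concrete, here is a minimal sketch, assuming scikit-learn and invented toy data (not drawn from any surveyed corpus), of the TF-IDF plus linear SVM setup that many of the 44 ML papers follow:

```python
# Minimal sketch of the classic TF-IDF + linear SVM pipeline for hate
# speech detection. Toy data for illustration only; real work would use
# one of the surveyed corpora.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

texts = [
    "I hate group X, they should disappear",     # hateful (toy example)
    "what a lovely day at the park",             # not hateful
    "group X ruins everything around here",      # hateful (toy example)
    "enjoying coffee with friends this morning", # not hateful
]
labels = [1, 0, 1, 0]  # 1 = hateful, 0 = not hateful

clf = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2), min_df=1)),
    ("svm", LinearSVC()),
])
clf.fit(texts, labels)
# Expected to flag as hateful given the toy training data above.
print(clf.predict(["group X should disappear"]))
```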


Data availability statement

Data generated or analyzed during this study are included in this published article.

Dataset and shared-task resources referenced in the survey:

  • https://hatespeechdata.com/
  • https://semeval.github.io/SemEval2021/tasks.html
  • https://hasocfire.github.io/hasoc/2020/index.html
  • https://swisstext-and-konvens-2020.org/shared-tasks/
  • https://sites.google.com/view/trac2/live?authuser=0
  • https://ai.Facebook.com/blog/hateful-memes-challenge-and-data-set/



Author information

Authors and affiliations

Big Data Analytics and Web Intelligence Laboratory, Department of Computer Science and Engineering, Delhi Technological University, New Delhi, India

Anjum & Rahul Katarya


Corresponding author

Correspondence to Rahul Katarya.

Ethics declarations

Conflict of interest

The authors of this paper declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.


About this article

Anjum, Katarya, R. Hate speech, toxicity detection in online social media: a recent survey of state of the art and opportunities. Int. J. Inf. Secur. 23 , 577–608 (2024). https://doi.org/10.1007/s10207-023-00755-2


Accepted: 02 September 2023

Published: 25 September 2023

Issue Date: February 2024

DOI: https://doi.org/10.1007/s10207-023-00755-2


  • Deep learning
  • Natural language processing (NLP)
  • Machine learning
  • Online hate speech (OHS)
  • Social media
  • Toxicity detection

This article was published more than 5 years ago.

How online hate turns into real-life violence

Social media sites have become hubs for the proliferation of white-supremacist propaganda.



White-supremacist groups use social media as a tool to distribute their message, incubating their hate online and allowing it to spread. But when their rhetoric reaches certain people, the online messages can turn into real-life violence.

Several incidents in recent years have shown that when online hate goes offline, it can be deadly. White supremacist Wade Michael Page posted in online forums tied to hate before he went on to murder six people at a Sikh temple in Wisconsin in 2012. Prosecutors said Dylann Roof “self-radicalized” online before he murdered nine people at a black church in South Carolina in 2015. Robert Bowers, accused of murdering 11 elderly worshipers at a Pennsylvania synagogue in October, had been active on Gab, a Twitter-like site used by white supremacists.

And just a few weeks ago, a 30-year-old D.C. man who described himself as a white nationalist was arrested on a gun charge after concerned relatives alerted police to his violent outbursts, including saying that the victims at the synagogue “deserved it.” Police say the man was online friends with Bowers.

“I think that the white-supremacist movement has used technology in a way that has been unbelievably effective at radicalizing people,” said Adam Neufeld, vice president of innovation and strategy for the Anti-Defamation League.

“We should not kid ourselves that online hate will stay online,” Neufeld added. “Even if a small percentage of those folks active online go on to commit a hate crime, it’s something well beyond what we’ve seen for America.”


In 2017, white supremacists committed the majority of domestic extremist-related killings in the United States, according to a report from the Anti-Defamation League. They were responsible for 18 of the 34 murders by domestic extremists documented that year.

The influence of the Internet in fostering white-supremacist ideas shouldn’t be underestimated, said Shannon Martinez, who helps people leave extremist groups as program director of the Free Radicals Project. The digital world gives white supremacists a safe space to explore extreme ideologies and intensify their hate without consequence, she said. Their rage can grow under the radar until the moment it explodes in the real world.

“There’s a lot of romanticization of violence among the far-right online, and there aren’t consequences to that,” said Martinez, who was a white-power skinhead for about five years. “In the physical world, if you’re standing in front of someone and you say something abhorrent, there’s a chance they’ll punch you. Online, you don’t have that, and you escalate into further physical violence without a threat to yourself.”

How hate spreads

Internet culture often categorizes hate speech as “trolling,” but the severity and viciousness of these comments have evolved into something much more sinister in recent years, said Whitney Phillips, an assistant professor of communications at Syracuse University. Frequently, the targets of these comments are people of color, women and religious minorities, who have spoken out about online harassment and hateful attacks for as long as the social media platforms have existed, calling for tech companies to take action to curb them.

“The more you hide behind ‘trolling,’ the more you can launder white supremacy into the mainstream,” said Phillips, who released a report this year, “ The Oxygen of Amplification ,” that analyzed how hate groups have spread their messages online.

Phillips described how white-supremacist groups first infiltrated niche online communities such as 4chan, where trolling is a tradition. But their posts on 4chan took a more vicious tone after Gamergate, the Internet controversy that began in 2013 with a debate over increasing diversity in video games and that snowballed into a full-on culture war. Leaders of the Daily Stormer, a white-supremacist site, became a regular presence on 4chan as the rhetoric got increasingly nasty, Phillips said, and stoked already-present hateful sentiments on the site.

Phillips said it’s unclear how many people were radicalized through 4chan, but the hateful content spread like a virus to more mainstream sites such as Facebook, Twitter and Instagram through shared memes and retweets, where they reach much larger audiences.


Unlike hate movements of the past, extremist groups are able to quickly normalize their messages by delivering a never-ending stream of hateful propaganda to the masses.

“One of the big things that changes online is that it allows people to see others use hateful words, slurs and ideas, and those things become normal,” Neufeld said. “Norms are powerful because they influence people’s behaviors. If you see a stream of slurs, that makes you feel like things are more acceptable.”

While Facebook and Twitter have official policies prohibiting hate speech, some users say that their complaints often go unheard.

“You have policies that seem straightforward, but when you flag [hate speech], it doesn't violate the platform’s policies,” said Adriana Matamoros Fernández, a lecturer at the Queensland University of Technology in Australia who studies the spread of racism on social media platforms.

Facebook considers hate speech to be a “direct attack” on users based on “protected characteristics,” including race, ethnicity, national origin, sexual orientation and gender identity, Facebook representative Ruchika Budhraja said, adding that the company is developing technology that better filters comments reported as hate speech.

Twitter’s official policy also states that it is committed to combating online abuse.

In an email, Twitter spokesman Raki Wane said, “We have a global team that works around the clock to review reports and help enforce our rules consistently.”

Both platforms have taken action to enforce these rules. Writer Milo Yiannopoulos was banned on Twitter in 2016 after he led a racist campaign against “Ghostbusters” actor Leslie Jones. In August, Facebook banned Alex Jones from its platform for violating its hate speech policy. The following month, Twitter also banned him.

But bad actors have slipped through the cracks. Before Cesar Sayoc allegedly sent 13 homemade explosives to prominent Democrats and media figures in October, political analyst Rochelle Ritchie says he targeted her on Twitter. She said she reported Sayoc to the social media site after he sent her a threatening message, telling her to “hug your loved ones real close every time you leave home.” At the time, Twitter told her that the comment did not violate its policy, but after Sayoc was arrested, the social media site said that it was “deeply sorry” and that the original tweet “clearly violated our rules.”

The rules themselves, even when followed, can fall short. Users who are banned for policy violations can easily open a new account, Matamoros Fernández said. And while technologies exist to moderate text-based hate speech, monitoring image-based posts, such as those on Instagram, is trickier. On Facebook, where some groups are private, it’s even more difficult for those who track hate groups to see what is happening.

Tech companies “have been too slow to realize how influential their platforms are in radicalizing people, and they are playing a lot of catch-up,” Neufeld said. “Even if they were willing to do everything possible, it’s an uphill battle. But it’s an uphill battle that we have to win.”

Learning from the past

While hate speech today proliferates online, the methods used by these hate groups are nothing new. The path to radicalization is similar to that used by the Nazis in the early 20th century, said Steven Luckert, a curator at the United States Holocaust Memorial Museum who focuses on Nazi propaganda.

“Skillful propagandists know how to play on people’s emotions,” Luckert said. “You play upon people’s fears that their way of life is going to disappear, and you use this propaganda to disseminate fear. And often, that can be very successful.”


The Nazis did not start their rise to power with the blatantly violent and murderous rhetoric now associated with Nazi Germany. It began with frequent, quieter digs at Jewish people that played on fears of “the other” and ethnic stereotypes. They used radio — what Luckert calls “the Internet of its time” — to spread their dehumanizing messages.

“They created this climate of indifference to the plight of the Jews, and that was a factor of the Holocaust,” Luckert said. “Someone didn’t have to hate Jews, but if they were indifferent, that’s all that was often needed.”

The antidote, Luckert says, is for people to not become immune to hate speech.

“It’s important to not be indifferent or a passive observer,” Luckert said. “People need to stand up against hate and not sit back and do nothing.”

Martinez, of Free Radicals, said that to combat the spread of hate, white Americans need to be more proactive in learning about the history of such ideologies.

She said she recently took her 11-year-old son to see the new lynching memorial in Alabama, which memorializes more than 4,000 victims of racial terror lynchings.

She said her son was overwhelmed by what he saw. Security guards who saw the boy attempting to process the display suggested that he ask his mother to get ice cream, a treat to ease the emotional weight of the museum. Martinez refused.

“He’s a white man in America. I’m not going to let him ‘ice cream’ his way out of it,” Martinez said. “We have to shift this idea that we are somehow protecting our children by not talking about racism and violence. We can’t ice cream it away. We have to be forthcoming about our legacy of violence.”


New UN policy paper launched to counter and address online hate

Governments and Internet companies are failing to meet the challenges of online hate.


The UN Office on Genocide Prevention and the Responsibility to Protect launched a new policy paper on Wednesday aimed at countering and addressing hate speech online.  

The policy paper, Countering and Addressing Online Hate Speech: A Guide for Policy Makers and Practitioners, was developed jointly by the UN Office with the Economic and Social Research Council (ESRC) Human Rights, Big Data and Technology Project at the UK’s University of Essex.

‘Unprecedented speed’ 

“We have seen across the world, and time, how social media has become a major vehicle in spreading hate speech at an unprecedented speed, threatening freedom of expression and a thriving public debate,” said Alice Wairimu Nderitu, Special Adviser to the UN Secretary-General on the Prevention of Genocide, who is the global focal point on the issue.

“We saw how the perpetrators in the incidents of identity-based violence used online hate to target, dehumanize and attack others, many of whom are already the most marginalized in society, including ethnic, religious, national or racial minorities, refugees and migrants, women and people with diverse sexual orientation, gender identity, gender expression, and sex characteristics,” said Ms. Nderitu. 

Key recommendations include: 

  • Ensuring respect for human rights and the rule of law when countering online hate speech, and applying these standards to content moderation, content curation and regulation. 
  • Enhancing transparency of content moderation, content curation and regulation. 
  • Promoting positive narratives to counter online hate speech, and fostering user engagement and empowerment. 
  • Ensuring accountability, strengthening judicial mechanisms and enhancing independent oversight mechanisms. 
  • Strengthening multilateral and multi-stakeholder cooperation. 
  • Advancing community-based voices and formulating context-sensitive, knowledge-based policymaking and good practice to protect and empower vulnerable groups and populations to counter online hate speech. 

The policy paper builds upon earlier initiatives, including  The UN Strategy and Plan of Action on Hate Speech , which seeks to enhance the UN’s response to the global spread and impact of hate speech. 

The Strategy makes a firm commitment to step up coordinated action to tackle hate speech, both at global and national levels, including the use of new technologies and engaging with social media to address online hate speech and promote positive narratives. 

Role for tech, social media 

“Digital technologies and social media play a crucial role in tackling hate speech, through outreach, awareness-raising, providing access to information, and education,” noted the Special Adviser. 

“With the transformation of our lives into a hybrid format, and the share of our life spent online ever increasing, ensuring that we all enjoy the same rights online as we do offline has become ever more important,” noted Dr. Ahmed Shaheed, Deputy Director, Essex Human Rights, Big Data and Technology Project and former UN Special Rapporteur on Freedom of Religion or Belief. 

‘Mass atrocities’ 

He warned of “the acts of violence that follow from online incitement to violence, including mass atrocities”, beyond the digital divides created by online hate. 

“Unfortunately, our investment in countering online hate has not yet matched the reality of its dissemination and impact online. And it remains our responsibility – all relevant stakeholders – to step up our efforts to preserve the hard-won gains achieved to-date in advancing non-discrimination and equality,” concluded Special Adviser Nderitu. 


Researchers leverage AI to fight online hate speech


Any frequent denizen of cyberspace can confirm that online hate speech is a widespread issue. With our daily lives becoming increasingly virtual, and would-be perpetrators emboldened by the anonymity of the digital space, online hate and harassment have risen to unprecedented heights.

A recent survey on online hate and harassment by the Anti-Defamation League shows that over half of American adults report being harassed online at some point in their lives; over a quarter have experienced online harassment just within the last year. “Overall, reports of each type of hate and harassment increased by nearly every measure and within almost every demographic group,” the survey finds.

Efforts to address online hate speech have faced significant hurdles, however. Online content moderation is often the subject of controversy, straddling a fine line between protecting free speech and safeguarding internet users from harm. Emerging innovations are tasked with fostering digital environments free from toxicity and discrimination while also avoiding censorship of inaccurately flagged language.

Rising to this challenge is an innovative solution recently developed by researchers at the University of Michigan and Microsoft, which combines cutting-edge deep learning models with traditional rule-based approaches to better identify hateful speech online. This approach is reported in their paper titled Rule By Example: Harnessing Logical Rules for Explainable Hate Speech Detection, presented at the Annual Meeting of the Association for Computational Linguistics (ACL).

“More and more tech companies and online platforms are developing automated tools to detect and moderate harmful content,” said Christopher Clarke, doctoral student in computer science and engineering at U-M and lead author of the study. “But the methods we’ve seen so far have a lot of room for improvement.” 

Traditional methods for hate speech detection are rule-based, meaning they operate based on set guidelines concerning what constitutes harmful speech, often taking the form of block lists or keyword flagging. Although attractive due to their transparency and customizability, these methods have proven largely insufficient; they are difficult to scale and the rules they rely on to dictate what is flagged do not adequately capture the context and nuances of online content. 

“Detecting hate speech and toxicity is a subjective task, and language is very ambiguous,” said Clarke. “It is relatively easy for users to switch around tokens and bypass a rule-based system, so these methods are quite fragile.”
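A deliberately naive sketch of such a rule-based filter makes this fragility concrete; the block list and test strings below are invented for illustration:

```python
# A deliberately naive keyword/block-list filter, illustrating why
# rule-based moderation is fragile. Keywords are placeholders.
BLOCKLIST = {"hate", "vermin"}  # hypothetical blocked tokens

def is_flagged(text: str) -> bool:
    tokens = text.lower().split()
    return any(tok.strip(".,!?") in BLOCKLIST for tok in tokens)

print(is_flagged("I hate group X"))      # True  - caught by the rule
print(is_flagged("I h@te group X"))      # False - trivial obfuscation evades it
print(is_flagged("We must fight hate"))  # True  - benign counter-speech is flagged
```

A single character swap evades the rule, while benign counter-speech is over-flagged, which is exactly the generalization problem the figure below depicts.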

Graphic from the paper showing the generalization problem of rule-based content moderation approaches. The graphic shows that the model appropriately flags certain statements (e.g., "I hate women") as hateful while inappropriately flagging others (e.g., "I loathe people who hate women").

Data-driven deep learning methods have emerged as a promising alternative for online content moderation. Such models are trained on large amounts of data and leverage deep neural networks to learn richer, more accurate representations and generalize these representations to new data. Despite their improvements in accuracy compared to one-size-fits-all rule-based approaches, the application of deep learning techniques is not without challenges.

“With out-of-the-box, pre-trained models, the user inputs some text and the model generates a prediction and a probability score as to whether the content is hateful or not,” said Clarke. “The main issue with these models is a lack of transparency—users aren’t able to see or understand what the model is learning and how it could be improved.”
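
As a hedged illustration of that opaque interface (not the specific models evaluated in the paper), an off-the-shelf classifier from the Hugging Face transformers library returns exactly such a label-and-score pair; the checkpoint name below is an assumption chosen for illustration:

```python
from transformers import pipeline

# Any off-the-shelf toxicity classifier exposes the same interface:
# text in, label and probability out. The checkpoint here is only an
# example; the paper does not prescribe a specific model.
classifier = pipeline("text-classification", model="unitary/toxic-bert")

result = classifier("I loathe people who hate women")[0]
print(result["label"], round(result["score"], 2))
# Illustrative output: a label and a confidence score, but no indication
# of *why* the model decided -- the transparency gap RBE targets.
```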

The failure of deep learning models to give users any explanation or guidance as to the reasoning behind their choices has hindered their widespread adoption and has added to growing distrust among consumers.

Seeking to harness the enhanced performance of deep learning models while preserving the transparency and customizability of rule-based methods, Clarke and his coauthors developed Rule By Example (RBE), an exemplar-based approach that uses deep learning to compare text inputs with examples of hateful content. 

“RBE is a contrastive learning framework that pairs rules with what we call exemplars, examples of text that defines a given rule,” said Clarke. “The framework pairs logical rules that are very explainable with these exemplars and then encodes and learns them.”

To accomplish this, RBE relies on two neural networks: a rule encoder as well as a text encoder. Using these tools, the model learns robust, accurate embeddings of hateful content and the rules behind them, enabling it to accurately predict and classify online hate speech.
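
A simplified reconstruction of this exemplar-matching logic at inference time might look as follows. This is our sketch, not the authors' code: a single off-the-shelf sentence encoder from the sentence-transformers library stands in for RBE's two contrastively trained encoders, and an input is flagged when its embedding falls close enough to any rule's exemplar.

```python
from sentence_transformers import SentenceTransformer, util

# Stand-in for RBE's trained rule and text encoders. RBE trains its two
# encoders contrastively; this sketch only mimics the inference logic.
encoder = SentenceTransformer("all-MiniLM-L6-v2")

# Each logical rule is paired with exemplars: texts that define the rule.
rules = {
    "misogynistic hate": ["I hate women", "women are inferior"],
    "religious hate": ["I despise everyone of that faith"],
}
exemplar_texts, exemplar_rules = [], []
for rule, exemplars in rules.items():
    exemplar_texts.extend(exemplars)
    exemplar_rules.extend([rule] * len(exemplars))
exemplar_emb = encoder.encode(exemplar_texts, convert_to_tensor=True)

def classify(text: str, threshold: float = 0.7):
    """Flag `text` if it lies near any exemplar; return the grounding rule."""
    sims = util.cos_sim(encoder.encode(text, convert_to_tensor=True), exemplar_emb)[0]
    best = int(sims.argmax())
    if float(sims[best]) >= threshold:
        # Rule grounding: the prediction traces back to a rule and exemplar.
        return True, exemplar_rules[best], exemplar_texts[best]
    return False, None, None

print(classify("I really hate women"))
# Illustrative: (True, 'misogynistic hate', 'I hate women')
```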

Graphic showing the structure of the RBE framework.

This groundbreaking two-part framework also solves the persistent transparency problem that plagues other deep learning models. It gives users a clear picture of how its predictions are formed, as well as the option to revise the rules being used.

“RBE displays what we call rule grounding, allowing users to trace back model predictions directly to the rules that govern them as well as the examples tied to those rules,” said Clarke. 

Instead of only receiving a prediction and probability score, as with other deep learning methods, RBE allows users to see the factors that influence the model’s predictions, building in transparency without sacrificing performance.

RBE also boasts exceptional customizability. Its unique grounding feature means that customers can edit the rules and examples influencing predictions in real-time without having to retrain the model.
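
Continuing the sketch above, editing the rule set amounts to a data update rather than a training run; the snippet below is our hedged illustration of this customizability, not the authors' API:

```python
# Adding a rule at serving time only appends exemplars and re-encodes them;
# the encoders are untouched, so no retraining is required.
rules["ableist hate"] = ["disabled people are a burden"]
exemplar_texts.append("disabled people are a burden")
exemplar_rules.append("ableist hate")
exemplar_emb = encoder.encode(exemplar_texts, convert_to_tensor=True)

print(classify("people with disabilities are a burden on society"))
# Illustrative: now grounded in the newly added rule and its exemplar
```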

RBE’s unique two-step approach combines the best of both worlds, preserving the advantages of deep learning models and rule-based techniques, ultimately yielding a hate speech detection system that is robust, accurate, and transparent.

“Our approach not only ensures greater transparency, but also enhances performance,” said Clarke. “RBE outperforms several benchmarks and actually shows better performance than existing hate speech classifiers.”

In fact, compared to the closest competing classifier, RBE showed a 2% increase in accuracy, a substantial difference considering the massive amounts of online data these models process. Together with its built-in transparency mechanism, this performance boost demonstrates RBE’s potential to significantly improve online content moderation and make digital spaces safer for everyone.

The goal is to integrate RBE into Microsoft Cloud, and plans are underway to patent this new technology. In the future, Clarke and his collaborators are also hoping to push RBE’s capabilities even further, testing its ability to accommodate more complex rules and extending it to other classification tasks, such as predicting the intent behind a given piece of text. 

In all, the hope is that RBE and its future iterations will benefit end users by broadening protections against online hate speech and harassment while preserving openness and transparency.

Clarke’s coauthors on the above paper are Prof. Jason Mars of U-M as well as Matthew Hall, Gaurav Mittal, Ye Yu, Sandra Sajeev, and Mei Chen of Microsoft.


What is hate speech?


Understanding hate speech

In common language, “hate speech” refers to offensive discourse targeting a group or an individual based on inherent characteristics (such as race, religion or gender) and that may threaten social peace.

To provide a unified framework for the United Nations to address the issue globally, the UN Strategy and Plan of Action on Hate Speech defines hate speech as “any kind of communication in speech, writing or behaviour, that attacks or uses pejorative or discriminatory language with reference to a person or a group on the basis of who they are, in other words, based on their religion, ethnicity, nationality, race, colour, descent, gender or other identity factor.”

However, to date there is no universal definition of hate speech under international human rights law. The concept is still under discussion, especially in relation to freedom of opinion and expression, non-discrimination and equality.

While the above is not a legal definition and is broader than “incitement to discrimination, hostility or violence” (which is prohibited under international human rights law), it has three important attributes:

  • Hate speech can be conveyed through any form of expression, including images, cartoons, memes, objects, gestures and symbols, and it can be disseminated offline or online.
  • Hate speech is “discriminatory” (biased, bigoted or intolerant) or “pejorative” (prejudiced, contemptuous or demeaning) of an individual or group.
  • Hate speech calls out real or perceived “identity factors” of an individual or a group.

It’s important to note that hate speech can only be directed at individuals or groups of individuals. It does not include communication about States and their offices, symbols or public officials, nor about religious leaders or tenets of faith.

Challenges raised by online hate speech


“We must confront hatred wherever and whenever it rears its ugly head. This includes working to tackle hate speech that spreads like wildfire across the internet.”

— United Nations Secretary-General António Guterres, 2023


The growth of hateful content online has been coupled with the rise of easily shareable disinformation enabled by digital tools. This raises unprecedented challenges for our societies, as governments struggle to enforce national laws at the scale and speed of the virtual world.

Unlike in traditional media, online hate speech can be produced and shared easily, at low cost and anonymously. It has the potential to reach a global and diverse audience in real time. The relative permanence of hateful online content is also problematic, as it can resurface and (re)gain popularity over time.

Understanding and monitoring hate speech across diverse online communities and platforms is key to shaping new responses. But efforts are often stunted by the sheer scale of the phenomenon, the technological limitations of automated monitoring systems and the lack of transparency of online companies.

Meanwhile, the growing weaponization of social media to spread hateful and divisive narratives has been aided by online corporations’ algorithms. This has intensified the stigma vulnerable communities face and exposed the fragility of our democracies worldwide. It has raised scrutiny on Internet players and sparked questions about their role and responsibility in inflicting real world harm. As a result, some States have started holding Internet companies accountable for moderating and removing content considered to be against the law, raising concerns about limitations on freedom of speech and censorship.

Despite these challenges, the United Nations and many other actors are exploring ways of countering hate speech. These include initiatives to promote greater media and information literacy among online users while ensuring the right to freedom of expression.


Scottish Hate Crime Law Takes Effect as Critics Warn It Will Stifle Speech

The legislation expands protections and creates a new charge of “stirring up hatred.” Critics, including J.K. Rowling, said the law was “wide open to abuse.”


By Sopan Deb

A sweeping law targeting hate speech went into effect in Scotland on Monday, promising protection against threats and abuse but drawing criticism that it could have a chilling effect on free speech.

The law, which was passed by the Scottish Parliament in 2021, expands protections for marginalized groups and creates a new charge of “stirring up hatred,” which makes it a criminal offense to communicate or behave in a way that “a reasonable person would consider to be threatening, abusive or insulting.”

A conviction could lead to a fine and a prison sentence of up to seven years.

The protected classes as defined in the law include age, disability, religion, sexual orientation and transgender identity. Racial hatred was omitted because it is already covered by a law from 1986. The new law also does not include women among the protected groups; a government task force has recommended that misogyny be addressed in separate legislation.

J.K. Rowling, the “Harry Potter” author who has been criticized as transphobic for her comments on gender identity, said the law was “wide open to abuse by activists,” and took issue with its omission of women.

Ms. Rowling, who lives in Edinburgh, said in a lengthy social media post on Monday that Scotland’s Parliament had placed “higher value on the feelings of men performing their idea of femaleness, however misogynistically or opportunistically, than on the rights and freedoms of actual women and girls.”

“I’m currently out of the country, but if what I’ve written here qualifies as an offense under the terms of the new act,” she added, “I look forward to being arrested when I return to the birthplace of the Scottish Enlightenment.”

On Tuesday, the police in Scotland said that while Ms. Rowling’s post had generated complaints, the author would not be facing criminal charges.

Rishi Sunak, the Conservative prime minister of the United Kingdom, expressed support for Ms. Rowling, telling the British newspaper The Telegraph that “people should not be criminalized for stating simple facts on biology. We believe in free speech in this country, and Conservatives will always protect it.”

Although Scotland is part of Britain, it enjoys political and fiscal autonomy on many matters, including the economy, education, health and justice.

The new law has long had the support of Scotland’s first minister, Humza Yousaf, but it has raised concerns about the effect it might have on free speech. Mr. Yousaf, who was Scotland’s justice secretary when the bill was passed, was asked directly on Monday about the criticism from Ms. Rowling and others who oppose the law.

“It is not Twitter police. It is not activists, it is not the media. It is not, thank goodness, even politicians who decide ultimately whether or not crime has been committed,” Mr. Yousaf told Sky News. He said that it would be up to “the police to investigate and the crown, and the threshold for criminality is incredibly high.”

The law was introduced after a 2018 study by a retired judge recommended consolidating the country’s hate crime laws and updating the Public Order Act of 1986, which covers Britain and Northern Ireland. Scotland’s Parliament approved the new law 82-32 in March 2021.

Supporters of the legislation have spent years rallying behind it, saying it is crucial to combating harassment.

“We know that the impact on those on the receiving end of physical, verbal or online attacks can be traumatic and life-changing,” Siobhian Brown, Scotland’s minister for victims and community safety, said in a statement celebrating the law. “This legislation is an essential element of our wider approach to tackling that harm.”

But there has been fierce pushback against the law, including from Ms. Rowling, and the Scottish Conservative Party, whose leader, Douglas Ross, told Mr. Yousaf during first minister’s questions on March 14 that “the controversial new law is ripe for abuse.” In a separate questions exchange on March 21, Mr. Ross said that the law was “dangerous and unworkable” and that he expected it to “quickly descend into chaos.”

“People like J.K. Rowling could have police at their door every day for making perfectly reasonable statements,” he said.

Mr. Yousaf, who is of Pakistani descent, has cited the 1986 law as proper precedent for the new bill.

“If I have the protection against somebody stirring up hatred because of my race — and that has been the case since 1986 — why on earth should these protections not exist for someone because of their sexuality, or disability or their religion?” he told Parliament on March 21.

The issue of how the Scottish government should handle misogyny has been examined by a government-commissioned task force, which recommended in 2022 that protections for women be added in a separate bill with elements similar to the hate crimes bill that was passed the previous year.

The first minister at the time, Nicola Sturgeon, welcomed the report, promising that her government would give it full consideration. Mr. Yousaf, her successor, has also indicated his support, but there has been no serious movement in Parliament yet.

Claire Moses contributed reporting from London.

Sopan Deb is a Times reporter covering breaking news and culture.

Scotland's controversial new hate crime laws come into force

The Hate Crime and Public Order (Scotland) Act aims to tackle the harm caused by hatred and prejudice but has come under fire from opponents who claim the new laws could stifle free speech and be weaponised to "settle scores".


Monday 1 April 2024 14:35, UK


Scotland's controversial new hate crime laws have come into force – with a Holyrood minister saying people "could be investigated" for misgendering someone online.

The new measures aim to tackle the harm caused by hatred and prejudice but have come under fire from opponents who claim they could stifle free speech and be weaponised to "settle scores".

The Hate Crime and Public Order (Scotland) Act came into force on Monday 1 April and aims to provide greater protection for victims and communities.

It consolidates existing legislation and introduces new offences for threatening or abusive behaviour which is intended to stir up hatred based on prejudice towards characteristics such as age, disability, religion, sexual orientation and transgender identity.

The new provisions add to the laws on the statute book for race, which have been in place UK-wide since 1986.

Sex has been omitted from the act as a standalone bill designed to tackle misogyny is expected to be laid before the Scottish parliament at a later date.

But when asked whether misgendering someone on the internet was a crime under the new law, Siobhian Brown MSP, minister for victims and community safety, said on Monday morning: "It would be a police matter for them to assess what happens.


"It could be reported and it could be investigated - whether or not the police would think it was criminal is up to Police Scotland."

During the interview with BBC Radio 4's Today Programme, she added: "There is a very high threshold which is in the act which would be up to Police Scotland, and what would have to be said online or in person would be threatening and abusive."


'Hatred has been far too pervasive in our society'

Speaking to Sky News about the new legislation, First Minister Humza Yousaf said: "In terms of acts of hatred, I think anybody would recognise in the last few years... hatred has been far too pervasive in our society.

"We have to take strong action against it. We have to have a zero-tolerance approach to it.

"I've got every confidence in police investigating matters of hatred appropriately, and of course making sure that we protect freedom of expression so vital to our democracy."


The new laws were developed following Lord Bracadale's independent review of hate crime legislation which concluded that new specific offences relating to stirring up hatred were needed.

The legislation was passed by a majority of MSPs in the Scottish parliament in 2021.

JK Rowling and Elon Musk have publicly criticised the act, suggesting it erodes free speech.

Those who support the new laws insist they will make Scotland more tolerant.

In a letter to Holyrood's criminal justice committee published last week, the Association of Scottish Police Superintendents (ASPS) warned the law could be "weaponised" by an "activist fringe" across the political spectrum.


'They are seeing this as an opportunity to settle scores'

Speaking to Sky News, the director of campaign group For Women Scotland branded the act a "mess" and said "there will be a lot of malicious reports".

Susan Smith said: "Much of this is very vague as stirring up offences seems to be based on someone's perception that someone is being hateful towards them, and they can make a complaint and the police are saying they will investigate everything.

"We know that there are people out there who have lists of people they are looking to target. They are seeing this as an opportunity to settle scores and make political points."


Police Scotland has committed to investigating every single hate complaint it receives.

At First Minister's Questions on Thursday, Mr Yousaf said he had "absolute faith" in the force's ability to weed out vexatious complaints.

Mr Yousaf has repeatedly said there is "disinformation" being spread about the bill and what it entails, claiming there is a "triple lock" of protection for speech.

The three safeguarding measures in the "lock" are an explicit clause on free speech, a defence for the accused's behaviour being "reasonable" and the fact that the act is compatible with the European Convention on Human Rights.


'It's April Fools' Day but it really is no joke'

The Scottish Conservatives have called for the act to be scrapped and the resources diverted towards frontline policing instead.

Russell Findlay MSP, shadow justice secretary for the Scottish Tories, said: "Humza Yousaf's hate crime act comes into force on April Fools' Day but it is really no joke for the people of Scotland."

Russell Findlay MSP. Pic: Scottish Parliament TV

Mr Findlay said it was "farcical that many officers have not yet been trained" and claimed the Scottish parliament's criminal justice committee has not been given sight of the force's training material despite requesting it.

He added: "Officers would rather tackle real crimes and keep communities safe, rather than having to investigate malicious and spurious complaints."


'Nobody in our society should live in fear'

Siobhian Brown, minister for victims and community safety, said: "Nobody in our society should live in fear and we are committed to building safer communities that live free from hatred and prejudice.

"We know that the impact on those on the receiving end of physical, verbal or online attacks can be traumatic and life-changing. This legislation is an essential element of our wider approach to tackling that harm.

"Protections for freedom of expression are built into the legislation passed by parliament and these new offences have a higher threshold for criminality than the long-standing offence of stirring up racial hatred, which has been in place since 1986."


JK Rowling will not be arrested under new Scottish hate law, say police

‘No further action’ over posts by author and gender-critical activist despite complaints

Comments by JK Rowling challenging police to arrest her for online misgendering do not amount to a crime, Police Scotland said.

As the Scottish government’s contentious hate crime law came into force on Monday, the author and gender-critical activist posted a thread on X saying the legislation was “wide open to abuse” after listing sex offenders who had described themselves as transgender alongside well-known trans women activists, describing them as “men, every last one of them”.

She stated that “freedom of speech and belief are at an end in Scotland if the accurate description of biological sex is deemed criminal”.

On Tuesday afternoon, Police Scotland confirmed they had received complaints about the social media post but added: “The comments are not assessed to be criminal and no further action will be taken.”

The act brings together existing laws. Under the Hate Crime and Public Order (Scotland) Act 2021, it is a crime to make derogatory comments based on age, disability, religion, sexual orientation, transgender identity or being intersex.


As concerns continue about officers being overwhelmed, reports suggest Police Scotland has received at least 3,000 complaints under the new act in the two days since it came into force.

Responding to the decision, Rowling said: “I hope every woman in Scotland who wishes to speak up for the reality and importance of biological sex will be reassured by this announcement, and I trust that all women – irrespective of profile or financial means – will be treated equally under the law.”

Earlier on Tuesday, the force also confirmed that racist graffiti found on Monday near Humza Yousaf’s family home in Broughty Ferry had been recorded under the new act.

The first minister said the graffiti, which contained a racial slur against him, was a reminder of why Scotland must take a “zero-tolerance” approach to hatred. On X, he said: “I do my best to shield my children from the racism and Islamophobia I face on a regular basis. That becomes increasingly difficult when racist graffiti targeting me appears near our family home.”

The Scottish National party leader robustly defended the legislation, which has prompted a barrage of criticism about how it will be policed and how it could affect freedom of speech, as well as fears that it could be used maliciously against certain groups for expressing their opinions, in particular gender-critical feminists.

Yousaf said it “absolutely protects people in their freedom of expression” while guarding “people from a rising tide of hatred that we’ve seen far too often in our society”.

The prime minister, Rishi Sunak, asked about Rowling’s comment on Tuesday morning, said that while he would not comment on a police matter, “nobody should be criminalised for saying commonsense things about biological sex”.

Robbie de Santos, the director of campaigns and human rights at Stonewall, said: “The prime minister and high-profile commentators are simply incorrect when they suggest that misgendering or ‘stating facts on biology’ would be criminalised.

“This is no more true than stating that the existing law has criminalised the criticism of religion. This kind of misrepresentation about the act and its purpose only serves to trivialise the very real violence committed against us in the name of hate.”

He called on political leaders to address the trend of “rising hate and escalating violence” facing LGBTQ+ people. “We already have longstanding laws preventing the incitement of hatred on the basis of race and religion, and the new Hate Crime Act creates parity in the law in Scotland by expanding these protections to cover sexual orientation, transgender identity, age and disability,” he said.


