The Fake Review Challenge


Can you tell a real book review from a fake book review? Below are five tips that I have learned from my research. It is likely to be fake if:

1) It reads like a press release

2) It reads like the blurb on the back of the book

3) It is either extremely positive or extremely negative. Authentic reviews are more nuanced and tend to mention possible negatives even in 5-star reviews.

4) Fake reviewers are more likely to talk about themselves

5) Fake reviewers are more likely to address you the reader

The more of these signs you find in a review, the more likely it is to be fake. Now take the Fake Review Challenge and see how you get on!
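For the curious, signs 3–5 could even be turned into a very rough automatic score. The sketch below is purely illustrative – the word lists and thresholds are my own invention for demonstration, not derived from the research:

```python
# Illustrative heuristic only: counts signs 3-5 from the checklist above.
# Word lists and thresholds are invented for demonstration purposes.
import re

SUPERLATIVES = {"amazing", "incredible", "perfect", "terrible", "awful", "brilliant"}

def fake_review_signals(text: str) -> int:
    words = re.findall(r"[a-z']+", text.lower())
    signals = 0
    # Sign 3: extreme sentiment, approximated by superlative density
    if sum(w in SUPERLATIVES for w in words) >= 2:
        signals += 1
    # Sign 4: reviewer talks about themselves ('I'-pronouns)
    if sum(w in {"i", "me", "my", "myself"} for w in words) >= 4:
        signals += 1
    # Sign 5: reviewer addresses the reader ('you'-pronouns)
    if sum(w in {"you", "your", "yourself"} for w in words) >= 2:
        signals += 1
    return signals  # 0-3; signs 1 and 2 still need stylistic judgement

print(fake_review_signals("You will love it! You must buy this amazing, incredible book!"))
```

Signs 1 and 2 (press-release and blurb style) resist simple word counts, which is why the human challenge is still the better test.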

 

Did Hillary Clinton’s PR team solicit fake Amazon book reviews for ‘What Happened’? Part #2

In the first post of this two-part linguistic investigation, we set up an unsupervised analytical approach: factor analysis to identify latent dimensions of linguistic variation in the ‘What Happened’ reviews, followed by cluster analysis on those dimensions to identify a small number of distinct text types. We know that the reviewing patterns for ‘What Happened’ displayed ‘burstiness’, i.e. a high frequency of reviews within a short period of time (see Figure 1 below). As Figure 2 below hypothesises, if there is a text type cluster that displays similar ‘burstiness’, we can infer that there was probably some level of coordination of reviewing behaviour and identify linguistic features associated with less-than-authentic reviews.
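For readers who want to try something similar at home, the two-stage pipeline can be sketched in Python roughly as follows. This is a minimal illustration, not the study’s actual code: the feature matrix, parameter choices and library calls (scikit-learn’s FactorAnalysis, SciPy’s hierarchical clustering) are stand-ins.

```python
# Sketch of the two-stage approach, assuming a reviews-by-features matrix X
# (e.g. per-review frequencies of pronouns, tense markers, word length).
# All numbers here are synthetic stand-ins, not the study's data.
import numpy as np
from sklearn.decomposition import FactorAnalysis
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 20))  # stand-in: 600 reviews x 20 linguistic features

# Stage 1: factor analysis -> latent dimensions of linguistic variation
fa = FactorAnalysis(n_components=4, random_state=0)
scores = fa.fit_transform(X)    # per-review scores on the 4 dimensions

# Stage 2: hierarchical clustering of factor scores -> review text types
Z = linkage(scores, method="ward")
clusters = fcluster(Z, t=4, criterion="maxclust")

print(scores.shape, len(set(clusters)))
```

In the real analysis the columns of X would be counted linguistic features and the dimensions would then be interpreted by inspecting which features load on each factor.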


Figure 1: Quantity, frequency and rating of ‘What Happened’ book reviews in the first month after launch.


Figure 2: Hypothesis for fake review detection using cluster analysis with time series. 

The factor analysis found four dimensions of language variation in the ‘What Happened’ reviews: Engagement; Reflection; Emotiveness; Commentary.

Dimension 1: Engagement

One linguistic dimension of these reviews describes levels of Engagement. In engaging reviews, writers directly address (using ‘you’ pronouns) either the reader or Hillary Clinton. The style is conversational and persuasive with exclamations, questions and hypotheticals used to interact with the reader.

THANK YOU for telling your story Secretary Clinton! You have accomplished so much and are a genuine inspiration. If they weren’t so afraid of you, they wouldn’t work so hard to shut you up. Keep fighting and I will too!

It’s her side of the story. That’s what it claims to be, and that’s what it is. For those who don’t like it because you disagree with her, you’re missing the point. After reading it, did you get a better feel for who the candidate was, what she was thinking, and even what her biases were and are? If so, then the book does what it claims to do.

Non-engaging reviews are more linguistically dense, using longer words and giving complex descriptions of the content.

The second chapter describes the days after the election, when she first isolated herself from the deluge of texts and emails from well-wishers. Eventually, however, she threw herself back into the fray, writing letters of thanks to supporters, attending galas, and spending time with her family.

Dimension 2: Reflection

A second linguistic dimension sees reviewers reflect on their personal experience of reading the book. This may include autobiographical elements, narratives about the book purchase and reading occasions, as well as feelings experienced while reading. The key linguistic features here are ‘I’-pronouns and the past tense:

Like many other people, I wondered if this book would really be worth reading. I voted for Clinton but I wondered how much value there could be in her account of the 2016 Presidential election campaign. Luckily, this book is so much more. It hit my Kindle on Tuesday and as it happens I had three airplane flights (including two very long ones) on Wednesday and Thursday, so I made it my project for those flights. I didn’t have to force myself to keep going; once I started, her narrative and the force of her ideas and anecdotes kept me reading.

Dimension 3: Emotiveness

Reviews with a high Emotiveness score were extremely positive in their praise of the book and, especially, Hillary Clinton. This was signalled by use of long strings of positive adjectives that might reasonably be considered excessive:

A funny, dark, and honest book by one of the truest public servants of her generation. Her writing on her marriage was deeply heartfelt and true. The sad little haters will never keep this woman down, and history will remember her as a trailblazer and a figure of remarkable courage.

The People’s President, Hillary delivers her heartfelt, ugly cry inducing account about What Happened when she lost the Electoral College to the worst presidential candidate in modern history. Politics aside, America lost when they elevated Russia’s Agent Orange to the presidency. Think what you will, but America missed the chance to have a level headed, intelligent and resilient leader, and yes the first female president.

Hillary’s a smart, insightful, resilient, inspiring, kind, caring, pragmatic human being. This book is a journey through her heart and soul.

Dimension 4: Commentary

Reviews with high Commentary focused on Hillary Clinton and the other actors in the election story (high use of third person pronouns). The reviews analyse and evaluate Clinton’s perspective and explanation of what happened in 2016, in a conversational manner much like a TV commentator or pundit.

I disagree with the reviewers who says Hillary doesn’t take responsibility for her mistakes. She analyzes all the reasons she thinks she lost the election–yes, she talks about Russian interference, malpractice by the FBI, and false equivalence by the mainstream press IN ADDITION TO missteps she thinks she made. My own take is that she doesn’t pay enough attention to the reasons why Bernie Sanders was able to command so strong a following with so few resources; but that is part and parcel of who she is.

Historical memoir from the first female candidate for a major political party…a unique perspective and platform to write from. She does recount her successes as well as her failures…she was mostly shut down during the campaigns by repetitious questions and by over-coverage of Trump by the media. She is intelligent and well-informed and states her case without self-pity.

Having identified these four linguistic functions in the ‘What Happened’ reviews, the next step is to see how they combine to form clusters of review text types – and whether any of these clusters is more strongly correlated with the early, high-frequency reviews.

As Figure 4 shows, hierarchical cluster analysis identified four review text types: ‘Tribute’ reviews, the largest cluster, have high Emotiveness; ‘Pundit’ reviews have high levels of Commentary and Engagement; ‘Content descriptive’ or ‘spoiler’ reviews talk about what’s in the book in an objective manner, i.e. without Reflection or Engagement; ‘Experiential’ reviews narrate the writer’s personal Reflection on the experience of reading the book.


Figure 4: 4-Cluster solution with mean factor loadings, interpretations and percentage of total reviews.   
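The interpretation step – labelling each cluster by its dominant dimension(s) – can be sketched like this. The mean scores below are invented for illustration and are not the actual loadings behind Figure 4:

```python
# Toy sketch of how clusters can be labelled from their mean factor scores.
# The numbers are hypothetical, not the study's actual mean loadings.
import numpy as np

dimensions = ["Engagement", "Reflection", "Emotiveness", "Commentary"]
# rows: clusters; columns: hypothetical mean factor score per dimension
mean_scores = np.array([
    [0.1, -0.2, 1.4, 0.0],    # -> 'Tribute' (high Emotiveness)
    [1.1, -0.1, 0.2, 0.9],    # -> 'Pundit' (Engagement + Commentary)
    [-0.8, -0.7, -0.3, 0.1],  # -> 'Content descriptive' (low on everything)
    [0.0, 1.2, 0.1, -0.2],    # -> 'Experiential' (high Reflection)
])

for i, row in enumerate(mean_scores):
    # a cluster with no clearly elevated dimension is 'content descriptive'
    label = dimensions[int(row.argmax())] if row.max() > 0.5 else "none (content descriptive)"
    print(f"Cluster {i + 1}: dominant dimension = {label}")
```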

So, we have these four review text types… do any of them correlate with the bursty reviewing patterns identified? Figure 5 below shows that the actual linguistic pattern of ‘What Happened’ reviews appears to correlate with the burstiness pattern: a large proportion of the first-day reviews are Tribute reviews, and most of this review type occurs within the first week before tailing off during the rest of the month. The fact that no other review type is particularly time-sensitive suggests that, at the very least, Tribute reviews are correlated with early reviewing and are potentially evidence of coordinated recruitment of Hillary Clinton’s ‘fans’ as book reviewers.


Figure 5: Distribution of ‘What Happened’ review text types during first month following book launch, compared to hypothetical deceptive and non-deceptive distributions. 
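A simple version of this check – comparing the share of Tribute reviews in the first week against the rest of the month – might look like this in Python, on synthetic data shaped only to mimic the Figure 5 pattern:

```python
# Synthetic demonstration of the burstiness check: do 'Tribute' reviews
# concentrate near the launch date? Data are invented to mimic Figure 5.
import numpy as np

rng = np.random.default_rng(1)
# review day (0-29) and text type for 600 hypothetical reviews
days = np.concatenate([rng.integers(0, 7, 300),    # early burst of Tributes
                       rng.integers(0, 30, 300)])  # other types, spread out
types = np.array(["Tribute"] * 300 + ["Other"] * 300)

tribute_share_week1 = np.mean(types[days < 7] == "Tribute")
tribute_share_rest = np.mean(types[days >= 7] == "Tribute")
print(f"Tribute share, week 1: {tribute_share_week1:.2f}; after: {tribute_share_rest:.2f}")
```

A large gap between the two shares is the time-sensitivity signal; for the other text types the shares would be roughly equal.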

If Hillary Clinton’s PR team did solicit positive reviews in the early days of the book launch, perhaps it is not surprising; they would have been responding to an extensive negative campaign against her book, which included manipulating review helpfulness metrics (i.e. massive upvoting of low-rated reviews) as well as writing fake negative reviews.

From an investigative linguistic perspective, this analysis shows that: a) suspicious activity can be detected using linguistic data as well as network or platform metadata; b) unqualified praise and intense positive emotions are deception indicators in the online review genre; and c) cluster analysis is an effective way of recognising linguistic deception features in an unsupervised learning setting.

Did Hillary Clinton’s PR team solicit fake Amazon book reviews for ‘What Happened’? Part #1.

September 12, 2017, was the launch day for Hillary Clinton’s autobiographical account of the 2016 election she lost to Donald Trump, definitively entitled ‘What Happened’. By midday 1669 reviews had been written on Amazon.com. By 3pm over half of the reviews, all with 1-star ratings, had been deleted by Amazon and a new review page for the book had been set up. After Day 1, ‘What Happened’ had over 600 reviews and an almost perfect 5 rating. What happened?!


Figure 1: Genuine support or fake reviews? Hillary Clinton’s ‘What Happened’ Amazon rating 1 day after launch (and after all the negative reviews were deleted)

There were good reasons to view the flood of negative reviews as suspicious. Only 20% of the reviews had a verified purchase, and the ratio of 5-star to 1-star reviews – 44% to 51% – was highly irregular; the vast majority of products reviewed on Amazon.com display an asymmetric bimodal (J-shaped) ratings distribution (see Hu, Pavlou and Zhang, 2009), in which there is a concentration of 4- or 5-star reviews, a number of 1-star reviews and very few 2- or 3-star reviews. The charts in Figure 2 below, originally featured in this QZ article, show the extent to which ‘What Happened’ was initially a ratings and purchase pattern outlier.
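As a rough illustration of what a J-shape test might look like – with invented counts and thresholds, not a validated detector:

```python
# Rough sketch of the J-shape test described above: most products show many
# 5-star reviews, some 1-star, and few 2-4 star ratings. Counts and
# thresholds here are invented for illustration.
def is_j_shaped(counts):
    """counts: dict mapping star rating (1..5) to number of reviews."""
    total = sum(counts.values())
    mid = counts[2] + counts[3] + counts[4]
    # J-shape: 5-star is the clear mode, 1-star a smaller second mode,
    # and the middle ratings are comparatively rare.
    return (counts[5] == max(counts.values())
            and counts[1] > mid / 3
            and mid / total < 0.25)

typical = {1: 120, 2: 30, 3: 40, 4: 60, 5: 750}     # J-shaped distribution
polarized = {1: 850, 2: 20, 3: 30, 4: 33, 5: 736}   # ~51% vs ~44% pattern
print(is_j_shaped(typical), is_j_shaped(polarized))
```

The second histogram, loosely modelled on the launch-day split, fails the test because the 1-star bar, not the 5-star bar, is the mode.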


Figure 2: Two charts indicating the unusual reviewing behaviour for ‘What Happened’. Source: Ha, 2017 

Faced with accusations of pro-Clinton bias as a result of deleting only negative reviews, an Amazon spokesperson confirmed that the company, in taking action against “content manipulation”, looks at indicators such as the ‘burstiness’ of reviews (a high rate of reviews in a short time period) and the relevance of the content – but doesn’t delete reviews simply based on their rating or their verified status (Hijazi, 2017).

It would appear that Amazon have taken on board the academic literature suggesting that burstiness is a feature of review spammers and deceptive reviews (e.g. this excellent paper by Geli Fei, Arjun Mukherjee, Bing Liu et al.) and that it is right to interpret a rush of consecutive negative reviews close to a book launch as suspicious.

But what about the subsequent burst of 600+ positive reviews? One might expect the Clinton PR machine to mobilize its own ‘positive review brigade’ in anticipation of, or in response to, a negative ‘astroturfing’ campaign against her book. One could even argue that it would be foolish not to manage perceptions of such a controversial and polarising book launch. If positive review spam is identified, should it also be deleted?

I tracked the number of Amazon reviews of ‘What Happened’ for a month after its launch on the new ‘clean’ book listing (the listings have since been merged but you can see my starting point here). Figure 3 below shows clear signs of ‘burstiness’: the rate of reviewing decreased exponentially over the first month even while the proportion of 5-star reviews remained consistently high.


Figure 3: Number and frequency of ‘What Happened’ reviews in the first 30 days following its launch and deletion of negative reviews. 
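One simple way to quantify this kind of decay is a log-linear fit to the daily review counts; a clearly negative slope indicates the bursty launch pattern. The counts below are synthetic:

```python
# Sketch of quantifying the 'burstiness' pattern in Figure 3: daily review
# counts that decay roughly exponentially after launch. Counts are synthetic.
import numpy as np

days = np.arange(30)
counts = np.round(200 * np.exp(-0.3 * days)).astype(int) + 1  # invented counts

# Fit a log-linear model: log(count) ~ a + b*day; b < 0 indicates decay
b, a = np.polyfit(days, np.log(counts), 1)
print(f"estimated daily decay rate: {b:.2f}")
```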

So, it is perfectly legitimate to ask whether the ‘What Happened’ reviews were manipulated through the ‘planting’ of ‘fake’ 5-star reviews, written for financial gain or otherwise incentivised, e.g. in exchange for a free copy of the book, which would circumvent Amazon’s Verified Purchase requirement. With my investigative linguist hat on, I’m wondering if there are any linguistic patterns associated with this irregular – and potentially deceptive – behaviour? (If there are, these could be used to aid deception detection in the absence of – or in tandem with – non-linguistic ‘metadata’.)

A line of fake review detection research has confirmed linguistic differences between authentic and deceptive reviews, although the linguistic deception cues are not consistent and vary depending on the domain and the audience (see my brief overview in this paper). Since we don’t know the deception features in advance and no ground truth has been established (i.e. we don’t know for sure whether there was any deception), I’m going to use two unsupervised learning approaches appropriate for unlabeled data: factor analysis, to find the underlying dimensions of linguistic variation across all the reviews, followed by cluster analysis to segment the reviews into text types based on those dimensions, in the hope of finding specific deception clusters.

If there is a text cluster that correlates with ‘burstiness’ – i.e. occurs more frequently in the reviews closest to the book launch date and/or occurs repeatedly within a short time frame – then that would suggest there are specific linguistic styles and/or strategies correlated with this deceptive reviewing behaviour. The existence of such a distinct deception cluster would strongly suggest that Clinton’s PR team gamed the Amazon review system (understandably, in order to counter the negative campaign against the book).  Alternatively, different reviewing strategies might be distributed randomly across the review corpus and unrelated to its proximity to the book launch date. This would weaken the argument that linguistic variation in the reviews is a potential deception cue. The two scenarios are illustrated in Figure 4 below:


Figure 4: Hypothetical illustration of how review text types (clusters) might be distributed over a 30 day period in the case of astroturfed fake reviews (top) or genuine positive reviews (bottom). 

My prediction? Surely, Hillary Clinton’s PR team would not be so brazen as to solicit fake positive reviews in bulk and in an organised fashion. Yes, there were a disproportionate number of reviews written in the first few days, but I believe this was a spontaneous groundswell of genuine support. I do expect there to be a few different types of linguistic review style, reflecting the different ways in which books can be reviewed (e.g. focus on book content; retell the personal reading experience; address the reader – these are some of the review styles I presented at the ICAME39 (2018) conference in Tampere). However, if the support is spontaneous, I would expect these review styles not to be correlated with burstiness or other deceptive phenomena but to occur randomly throughout the month.

Check back here in a few days for Part #2:  Results and Discussion!