Amazon.com reviews are rigged

Amazon.com user reviews are a critical factor in driving sales for the giant retailer. I have lost count of the number of times I have heard my wife say that she selected a particular product because of great user reviews. These reviews can have significant credibility issues, though…In their great scholarly paper, “Six degrees of reputation: The use and abuse of online review and recommendation systems”, Shay David and Trevor Pinch point to some of the follies of user-generated reviews at Amazon.com:

Evidently, in many areas of cultural production user reviews are mushrooming as an alternative to traditional expert reviews. If there was any doubt, it has long ago been established that reviews and recommender systems play a determining role in consumer purchasing (an early review is available in Resnick and Varian, 1997) and recent qualitative research adds weight to the claim that these review systems have causal and positive effects on sales; to nobody’s surprise, books with more and better reviews are shown to sell better (Chevalier and Mayzlin, 2004). With people in the culture industries increasingly realizing this truism, many of the reviews are thus positively biased and it becomes very hard to distinguish the ‘objective’ quality of the reviews. In addition, due to the large variance in the quality of the reviews, and the varied agendas of the reviewers, user input too often becomes untrustworthy leaving the consumers with little ability to gauge an item’s actual quality. Do we live in a cultural Lake Wobegon where “all the books are above average?” (to paraphrase Keillor, 1985) Is there a way to review the reviewers, to guard the guards? As will be discussed in details below, emerging systems like the one employed on sites like Amazon.com (2005) suggest that there are ways to try to solve this bias problem by offering a tiered reputation management system which offers a set of checks and balances. But these new options also bring with them new problems as the participants adjust to what is at stake in this new economy of reputation.

They offer examples with links to Amazon.com reviews:

This instance concerned one of Pinch’s own books Analog Days: The Invention and Impact of the Moog Synthesizer (Pinch and Trocco, 2002). This book that chronicles the invention and early days of the electronic music synthesizer was well received by reviewers both offline and online, and the Amazon.com editors quote a review from the Library Journal that reads as follows:

… In this well–researched, entertaining, and immensely readable book, Pinch (science & technology, Cornell Univ.) and Trocco (Lesley Univ., U.K. [sic]) chronicle the synthesizer’s early, heady years, from the mid–1960s through the mid–1970s … . Throughout, their prose is engagingly anecdotal and accessible, and readers are never asked to wade through dense, technological jargon. Yet there are enough details to enlighten those trying to understand this multidisciplinary field of music, acoustics, physics, and electronics. Highly recommended. [link]

A similar (but distinctly different) book that had appeared earlier — Electronic Music Pioneers by Ben Kettlewell (Vallejo, Calif.: ProMusic Press, 2002) — received the following user review on Amazon.com on 15 April 2003:

This book is a must. Highly recommended., April 15, 2003 / Alex Tremain (Hollywood, CA USA)

… In this well–researched, entertaining, and immensely readable book, Kettlewell chronicles the synthesizer’s early, years, from the turn of the 20th century — through the mid–1990s … . Throughout, his prose is engagingly anecdotal and accessible, and readers are never asked to wade through dense, technological jargon. Yet there are enough details to enlighten those trying to understand this multidisciplinary field of music, acoustics, physics, and electronics. Highly recommended. [link]

The ‘similarity’, of course, is striking. The second review is simply a verbatim copy of the first one, replacing only the name of the authors and the period the book covers.

In another case a user reviewed several Tom Hanks/Meg Ryan movies. The user posted the same review for the movies Sleepless in Seattle and You’ve Got Mail. He found that each of those films was “a film about human relations, hope and second chances, but most importantly about trust, love, and inner strength.” [link link]

As we know, especially with the demands for producing one blockbuster after another, Hollywood movies are sometimes strikingly similar, and yet posting the same review for two different films suggests that the reviewer is interested less in accurate representation of the movie’s content or qualities and more in the sort of reputation and identity that he or she can build as someone who posts numerous reviews.

The authors point to perverse incentives for different actors to game the review mechanisms:

  • Self-plagiarize in order to write reviews quickly and to build up reviewer reputation (see examples with links above)
  • Write unusually positive reviews to butter up the publisher and the author in the hope of landing a job as a full-time reviewer
  • People with a vested financial interest in the success of a product take advantage of the cloak of anonymity and try to game the system by having family members etc. create positive reviews for their product and negative reviews for competing products
  • Write reviews to see your name associated with a popular product. This can work as an ego boost for adolescents or even some adults
  • Write reviews to promote other web sites or substitute products (example)

Based on their very limited analysis (due to the limitations of the Amazon.com APIs), the authors find that the problems mentioned above affect about 1% of the reviews. I suspect, though, that a thorough analysis would reveal a significantly higher share of problem reviews.
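To get a feel for how a more thorough analysis might flag copied reviews like the ones quoted above, here is a minimal sketch in Python. It is not the method David and Pinch used. It simply compares reviews word by word with the standard library's `difflib.SequenceMatcher`. The function names, the 0.8 threshold, and the "unrelated" sample review are my own assumptions for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(text_a: str, text_b: str) -> float:
    """Return a 0..1 similarity ratio, comparing the two texts word by word."""
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()
    return SequenceMatcher(None, words_a, words_b).ratio()


def find_near_duplicates(reviews: dict, threshold: float = 0.8) -> list:
    """Return (id_a, id_b, score) for each pair of reviews at or above threshold."""
    pairs = []
    for (id_a, text_a), (id_b, text_b) in combinations(reviews.items(), 2):
        score = similarity(text_a, text_b)
        if score >= threshold:  # threshold is an assumed cut-off, tune as needed
            pairs.append((id_a, id_b, round(score, 2)))
    return pairs


# Shortened excerpts of the two reviews quoted above, plus a made-up unrelated one.
reviews = {
    "analog-days": (
        "Throughout, their prose is engagingly anecdotal and accessible, and readers "
        "are never asked to wade through dense, technological jargon."
    ),
    "electronic-music-pioneers": (
        "Throughout, his prose is engagingly anecdotal and accessible, and readers "
        "are never asked to wade through dense, technological jargon."
    ),
    "unrelated": "The battery died after two weeks and the seller never answered my emails.",
}

for id_a, id_b, score in find_near_duplicates(reviews):
    print(id_a, id_b, score)
```

Comparing every pair of reviews gets expensive quickly, so a real analysis over millions of reviews would need something smarter than this brute-force loop. Still, it shows that the "replace the author's name" trick is easy to catch once you look for it.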

So what can be done to deal with these issues? I believe that any system – both online and offline – has limitations that can be exploited by bad actors for personal benefit. Anybody remember the Armstrong Williams fiasco as an example of problems with offline systems? Still, as long as these systems are transparent and user expectations of how the systems work are properly managed, such systems can be valuable. Some of the specific things Amazon.com can do are:

  • Provide more statistics about the various issues with its user-generated product reviews so as to properly set user expectations. Amazon should point out all the various kinds of problems so that users take the reviews on the site with a pinch of salt. This, of course, will be hard because doing so might reduce its revenue and potentially its influence, but in the long term it should pay off in higher customer satisfaction.
  • Be more receptive to user complaints and create a mechanism to penalize the people who try to game the system. At present Amazon seems to be taking a completely hands-off approach to policing user reviews, and as a result customers end up paying a price.
  • Build up a more sophisticated notion of reputation, one based on users’ reputation in other communities. Such a notion should include more elements than just the number of reviews a person has entered.
  • Build up a more sophisticated meta-moderation system like the one built by Slashdot.

This is not a simple problem, but it is one Amazon.com should tackle to maintain a long-term trust relationship with its users.

Trust Vs Accountability

Interesting commentary in the Washington Post today titled ‘the decline of trust’ by op-ed columnist Sebastian Mallaby:

In 1995 Francis Fukuyama came out with a book called “Trust,” in which he argued that a society’s capacity for cooperation underpins its prosperity. The same year, Robert Putnam’s famous article, “Bowling Alone,” lamented that the United States was depleting its stock of precious social capital. The question of trust — in government and also in communities — preoccupied politicians too. “It Takes a Village,” Hillary Rodham Clinton urged in the title of her 1996 book, which became a best seller.

You don’t hear much about trust these days. Instead, we want accountability.

There are powerful reasons trust tends to decline and accountability advances. Mobile societies tend to have weak bonds; the Internet makes it easier to hold people accountable and encourages acerbic negativity. And the absence of trust can feed on itself. Leaders function under stifling oversight; this causes them to perform sluggishly, so trust continues to stagnate. But occasionally there is a chance to escape this trap: A shock causes trust to rise, leaders have a chance to lead and there’s an opportunity to boost trust still further.

Interesting dynamics, these, and it sounds about right. Trust creates an opportunity to establish more trust, and lack of trust begets more reasons not to trust…and shock events cause the pendulum to swing from one side to the other. What do you think?

What is reputation?

There has been a lot of discussion about how to define and operationalize reputation for on-line communities. Bob Blakley in his beautifully written post – On the Absurdity of Owning One’s Identity – defines reputation as follows:

Your reputation is my story about you. You can’t own this by definition; as soon as you own it, it’s no longer my story about you; it instantly becomes an autobiography instead of a reputation.

James Kobielus has a different take on what reputation is:

Reputation isn’t an attribute of our identity, and it isn’t a story, really. It’s simply an assurance, confidence, or comfort level in which others regard our identity. It’s a vague, qualitative, holistic, often semi-conscious impression, calculated somewhere in the reptilian mind that has descended to us down through the ages. Quoting myself again:

“Relying parties—-the ultimate policy decision and enforcement points in any interaction—-need many levels of assurance if they’re going to do business with us. They gather assertions and data from many IdM “authorities” (authentication authorities, attribute authorities, etc.) before rendering their evaluations and opening their kimonos. They—-the relying parties—-make reputation evaluations based on information fed in from trusted authorities, from their own experiences with us, from whatever reputation-relevant data they can google across the vast field of received opinion and public record.”
Reputation is a computed halo—positive or negative–around our socially contextualized identities.

Reputation is a score computed by relying parties in order to determine whether or not to authorize the reputed party to access resources such as jobs, communities, romantic encounters, time of day, etc.
Reputation is an assurance that someone is worth our while.

This is an interesting take, although it almost seems like James is defining the process of generating and evaluating trust based on reputation rather than reputation itself. Phil Windley et al., in the paper “A Framework for Building Reputation Systems”, have a multi-faceted definition of reputation:

Reputation is one of the factors upon which trust is based. There is much confusion between trust and reputation. We consider reputation a building block for trust. We are not concerned in this paper with what other factors go into trust, how trust is built, or how trust is exchanged.

Reputation is someone else’s story about me. I can’t control what you say about me although I may be able to affect the factors you based your story on. Every person should be able to have their own story about me.

Reputation is based on identity. Reputation, as someone else’s story, isn’t part of your identity, but is based on an identity or set of identities.

Reputation exists in the context of community. Any given context will have specific factors for what is important in determining reputation. This is different than saying “communities have a reputation about someone.” Communities do not have beliefs, only people have beliefs including beliefs about what others believe.

Reputation is a currency. While you can’t change reputation directly, reputation can be used as a resource. For example, Paul Resnick et. al. has shown the value of a positive eBay reputation [Res00a].

Reputation is narrative. Put another way, reputation varies with time. Reputation is dynamic because the factors that affect it are always changing. Reputation may require weaving together plot lines from different contexts.

Reputation is based on claims (verified or not), transactions, ratings, and endorsements. How these factors are used in determining reputation is up to each individual. Individuals may use various evidence in making claims or proposing a certain rating or endorsement. The penalty for making false claims or giving false endorsements varies from context to context.

Reputation is multi-level. A reputation isn’t just based on facts, but is also based on other’s beliefs about the target of the reputation. These beliefs are signaled to others in various ways depending on the context.

Multiple people holding the same opinion increases the weight of that opinion. Reputation systems should have some way of weighting scores. As a related issue, repeat behavior is another way of weighting reputation.

Most of these points make a lot of sense…Reputation is really a key ingredient for establishing trust…and trust really is the grease on the wheels of commerce and social interaction. Reputation is more than the sum total of a person’s transactions on sites like eBay; it is really the sum of a person’s interactions across various communities.

One other characteristic of reputation, which is not captured in any of these definitions, is that reputation can be transferred. For example, if a person with a high reputation recommends someone, the recommended person’s reputation improves. It’s almost as if the recommended person can bask in the reflected glory of the high-reputation recommender. We should know, as we are in the process of approaching a number of people for help with our startup. In this process, one of the key things we think about is the best way to approach and get introduced to the target so that we maximize our chances of success. It will be interesting to see whether on-line reputation systems account for this critical characteristic of reputation.
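As a thought experiment, here is one way an on-line system could account for transferred reputation. This is only a sketch under my own assumptions. The `Person` class, the 0..1 rating scale, and the 25% `transfer` factor are all hypothetical and are not taken from any of the definitions quoted above. A person's score is the average of their own ratings plus a fraction of the reputation of each person who vouches for them.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    name: str
    ratings: list = field(default_factory=list)      # direct ratings, each 0.0..1.0
    endorsed_by: list = field(default_factory=list)  # Person objects vouching for this one


def base_reputation(person: Person) -> float:
    """Average of the ratings a person earned directly (0.0 if none yet)."""
    if not person.ratings:
        return 0.0
    return sum(person.ratings) / len(person.ratings)


def reputation(person: Person, transfer: float = 0.25) -> float:
    """Base reputation plus a fraction of each endorser's base reputation, capped at 1.0."""
    # Only the endorser's *base* reputation is transferred, so chains of
    # endorsements (or two people endorsing each other) cannot inflate scores.
    boost = sum(transfer * base_reputation(e) for e in person.endorsed_by)
    return min(1.0, base_reputation(person) + boost)


veteran = Person("veteran", ratings=[0.9, 1.0, 0.8])
newcomer = Person("newcomer", ratings=[0.5])

print(round(reputation(newcomer), 3))   # the newcomer's own record only
newcomer.endorsed_by.append(veteran)
print(round(reputation(newcomer), 3))   # higher once the veteran vouches for them
```

The design choice worth debating is how much reputation should transfer. Too little and introductions are worthless; too much and a single well-known friend is enough to game the system.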

On-line Vs Real world communities

Yesterday, I went to the Silicon Valley Indian Professionals Association (SIPA) annual conference in Santa Clara.

SIPA ANNUAL EVENT 2006

It was a great event…well organized…great speakers; Mr. Azim Premji (CEO of Wipro) in particular gave a refreshingly thoughtful presentation…a large number of quality attendees (they were sold out)…overall making for a very enjoyable event. My kudos to the SIPA team…I am going to keep my annual membership with them.

At the event, I was struck by the number of motivated volunteers hustling about and making sure that everything was working as planned. Also, most of the participants seemed to be engaged and interacting with other participants. This was in stark contrast to on-line communities, where participation levels are rather dismal. Some of the reasons for this participation inequality have to do with the higher threshold for participating in a real-world event. In the case of the SIPA event, you had to pay about $50 to register, then wake up early (I got up at 7:00 AM, which is kinda early for a Saturday), dress up and drive over to the event. These thresholds ensured that only highly motivated participants were at the event.

Another factor driving the higher level of participation was that most people were looking to meet interesting new people and exchange cards. I personally handed out close to 25 business cards and collected about the same number. In on-line virtual communities, there is no way to tell who is an interesting person, as people typically don’t have a name tag with their professional credentials…Even if you still manage to find somebody interesting, there is no way to exchange business cards with them, as neither party is sure of the other’s credentials.

I believe this significantly limits the level of participation in on-line communities.

What do you think?

Karma in the Business World

The Times of India has an interesting article on the influence of the Bhagavad Gita on corporate America.

Suddenly, says Businessweek magazine in its latest issue, phrases from ancient Hindu texts such as the Bhagavad Gita are popping up in management tomes and on Web sites of consultants. Top business schools have introduced “self-mastery” classes that use Indian methods to help managers boost their leadership skills and find inner peace in lives dominated by work.

BW calls its “Karma Capitalism” — a gentler, more empathetic ethos that resonates in the post-tech-bubble, post-Enron zeitgeist. And where it used to be hip in management circles to quote from the sixth century B.C. Chinese classic The Art of War, it says, the trendy ancient Eastern text today is the more introspective Bhagavad Gita .

Maybe our blog byline had something to do with it too :-)…I have received a few emails about the meaning of the byline…I’ll explain it in a later post.

New Audience Metric

Robert Scoble has an interesting post on the need to measure the engagement level of a web site’s users. He talks about the difference in user engagement between the Register and Digg:

Well, I’ve compared notes with several bloggers and journalists and when the Register links to us we get almost no traffic. But they claim to have millions of readers. So, if millions of people are hanging out there but no one is willing to click a link, that means their audience has low engagement. The Register is among the lowest that I can see.

Compare that to Digg. How many people hang out there every day? Maybe a million, but probably less. Yet if you get linked to from Digg you’ll see 30,000 to 60,000 people show up. And these people don’t just read. They get involved. I can tell when Digg links to me cause the comments for that post go up too.

One of the factors in determining the engagement level is whether the community is a read-only community (web 1.0 site) like the Register or USA Today, or a participatory community (web 2.0 site) like Digg. I would expect participatory communities like Digg to have a whole lot more engagement than read-only communities.

The existing web metrics of unique users and page views should be able to capture this difference, though. In the Scoble example, all you need to look at is the number of page views per user over a week’s time, and you will get a good idea of the engagement level of each user. The other existing metric that can be useful here is the distribution of the top pages. On Digg, my guess is that the top-pages graph is a lot steeper than the diffuse graph at the Register, indicating the common interests of the people coming to Digg (mostly cutting-edge web 2.0 tech folks) compared to the Register (more IT folks).
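To make the two metrics concrete, here is a small Python sketch. The log format (a list of user and page pairs for one week), the function names, and the sample data are all made up for illustration. I have no access to real traffic numbers from either site.

```python
from collections import Counter


def pageviews_per_user(log: list) -> float:
    """Average number of page views per unique user in the log."""
    if not log:
        return 0.0
    views_by_user = Counter(user for user, _page in log)
    return len(log) / len(views_by_user)


def top_page_share(log: list, n: int = 1) -> float:
    """Fraction of all page views that go to the n most viewed pages."""
    if not log:
        return 0.0
    views_by_page = Counter(page for _user, page in log)
    top_views = sum(count for _page, count in views_by_page.most_common(n))
    return top_views / len(log)


# Hypothetical one-week logs of (user_id, page) views.
participatory_log = [
    ("u1", "/story-a"), ("u1", "/story-a"), ("u1", "/story-a"),
    ("u2", "/story-a"), ("u2", "/story-b"), ("u3", "/story-a"),
]
read_only_log = [("r1", "/a"), ("r2", "/b"), ("r3", "/c"), ("r4", "/d")]

for name, log in [("participatory", participatory_log), ("read-only", read_only_log)]:
    print(name, pageviews_per_user(log), round(top_page_share(log), 2))
```

A high top-page share corresponds to the "steep" graph I am guessing at for Digg, while a low share corresponds to the diffuse graph I would expect at the Register.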

The situation becomes more complex when you start looking deeper at whether page views themselves are a good metric. In the article “Pageviews are obsolete”, Evan Williams points out the problems with pageviews:

But it’s this pageviews part that I think needs to be more seriously questioned. (This is not an argument that Blogger is as popular as MySpace—it’s not.) Pageview counts are as suseptible as hit counts to site design decisions that have nothing to do with actual usage. As Mike Davidson brilliantly analyzed in April, part of the reason MySpace drives such an amazing number of pageviews is because their site design is so terrible.

As Mike writes: “Here’s a sobering thought: If the operators of MySpace cleaned up the site and followed modern interface and web application principles tomorrow, here’s what the graph would look like:”

Mike assumes a certain amount of Ajax would be involved in this more-modern MySpace interface, which is part of the reason for the pageview drop. And, as the Kiko guys wrote in their eBay posting, their pageview numbers were misleading because the site was built with Ajax. (Note: It’s really easy to track Ajax actions in Google Analytics for your own edification.)

But Ajax is only part of the reason pageviews are obsolete. Another one is RSS. About half the readers of this blog do so via RSS. I can know how many subscribers I have to my feed, thanks to Feedburner. And I can know how many times my feed is downloaded, if I wanted to dig into my server logs. But I don’t get to count pageviews for every view in Google Reader or Bloglines or LiveJournal or anywhere else I’m syndicated.

Another reason: Widgets. The web is becoming increasingly widgetized—little bits of functionality from one site are displayed on many others. The purveyors of a widget can track how many times their javascript of flash file is loaded elsewhere—but what does that mean? If you get a widget loaded in a sidebar of a blog without anyone paying attention to it, that’s not worth anything. But if you’re YouTube, and someone’s watching a whole video and perhaps even an ad you’re getting paid for, that’s something else entirely. But is it a pageview?

One answer to the issue of measuring the customer is provided by the folks at AttentionTrust. See my previous post on the subject. Still, the issue of how you know who is coming to your site and what they are doing remains a critical unanswered question. What is needed is a global, user-controlled way of sharing a user’s public identity. Using the data for individual users, we will have a better chance of handling this issue.

AttentionTrust

Came across this interesting non-profit organization called AttentionTrust:

When you pay attention to something (and when you ignore something), data is created. This “attention data” is a valuable resource that reflects your interests, your activities and your values, and it serves as a proxy for your attention.

AttentionTrust and our members support the following Principles regarding users’ control of attention data, and we invite you to join us in supporting these Principles by applying for AttentionTrust membership:

  1. Property

    You own your attention and can store it wherever you wish.

  2. Mobility

    You can securely move your attention wherever you want whenever you want to.

  3. Economy

    You can pay attention to whomever you wish and receive value in return.

  4. Transparency

    You can see exactly how your attention is being used.

To capture the attention data they have a browser plug-in that creates and stores click-stream data of your web activity. Users then have the option to either store this data on a local drive or put it in an on-line “vault”. The “vault” service is provided by 4 different organizations (users can choose) that are approved by AttentionTrust.org. The idea behind these vault services is that they aggregate data and provide a platform for other for-profit companies to come up with interesting personalized services for users. The user always controls the data and can release it to any service provider they find interesting. It’s a really cool idea, but I am having a difficult time imagining the kinds of useful services that can be provided to an individual by accessing their attention data. Still, this is a neat idea for enabling much-needed research into users’ browsing behavior, with user consent of course.

One of the other issues with the overall idea is how users can prevent companies from accessing potentially important or embarrassing information from such logs. The attention recorder browser plug-in has a button to disable recording click-stream data, but in my experience the button was hard to use and to remember (not that I was visiting any naughty sites :-)). I also looked at the data that the attention recorder collected by looking into the XML file and did not find any data related to mouse movement…I don’t even know if that is feasible, but one of the things I do when I am reading (not scanning) a web page is follow my eye focus with my mouse. So mouse movements in the browser might be interesting data to gather. The point here is that web browsing data is so private (as evidenced by the AOL search terms release fiasco) that there are a number of potential landmines here.

Another issue is how AttentionTrust can guarantee that none of these service providers will misuse the information. Some other interesting links to follow up for more information:

Attention Wiki

Attention Architecture

All-in-all an interesting idea that will develop with time. Thoughts?

Update: Upon further reflection, some of the services that could be made available will be similar to time-share deals in Las Vegas. The idea is that if you sit through a demo or an hour or so of targeted advertising, you are rewarded for that attention. AttentionTrust provides a verification mechanism for validating the time spent.

Six Apart and Enterprise Blogging

Six Apart came out with a new version of Movable Type with features for enterprise blogging. TechCrunch and Read/Write Web had good reviews of the release. I had a good discussion with Anil Dash (one of the old-timers at Six Apart) on the TechCrunch forum:

Jitendra

I am not sure I understand what Six Apart is trying to do here? I buy that enterprise blogging is a big deal and that the blogs are going more and more mainstream (See my old post on the growth of blogosphere) but the driving force behind the blogosphere is really that blogs are more personal and are not typically encumbered with extensive enterprise controls. Enabling enterprise level controls for blogging is gonna make them sound like enterprise press releases which is just not going to be popular…

Anil’s response

Anil
Jitendra, you raise an issue that commonly comes up, but that I don’t necessarily think is a valid concern. You see, offering better management tools for administrators to do things like create blogs more efficiently, or assign permissions and roles more easily, doesn’t inherently compromise the human voice or expressiveness of blogs. We think that giving IT managers better blogs tools makes it easier for them to get out of the way and let users express themselves in a human voice.

My response

The issue that worries me with the Six Apart direction is the central control of the blogging infrastructure at the enterprise level. To me blogging is essentially an individualistic endeavor where you have an opportunity to connect with your audience at a more personal level. From re-reading Six Apart’s positioning for enterprise blogging, it seems like you guys are trying to make blogging more of a collaboration tool. This might be beneficial to certain companies, but it’s not clear how it’s different from wikis in such situations.
Thanks,
Jitendra

I guess there is a possibility of getting some traction in collaboration-related usage scenarios, but at what price? What about those crazy and expensive enterprise sales cycles? And what about all those requirements for integration with enterprise infrastructure and identity systems? CMS Wire has a great review.

I still don’t get the rationale behind this move…Do you?

Privacy is to be left alone

Interesting series from MSNBC…The first article in the series “Privacy under attack, but does anybody care?” does a good job of capturing the difficulty with the concept of Privacy. The article points to a survey of 6500 users where they try to define Privacy:

Most Americans struggle when asked to define privacy. More than 6,500 MSNBC readers tried to do it in our survey. The nearest thing to consensus was this sentiment, appropriately offered by an anonymous reader: “Privacy is to be left alone.”

The article looks at the issues with putting a value on Privacy and finds the price of privacy to be unassessable.

Perhaps a more important question, Acquisti says, is how do consumers measure the consequences of their privacy choices?

In a standard business transaction, consumers trade money for goods or services. The costs and the benefits are clear. But add privacy to the transaction, and there is really no way to perform a cost-benefit analysis.

If a company offers $1 off a gallon of milk in exchange for a name, address, and phone number, how is the privacy equation calculated? The benefit of surrendering the data is clear, but what is the cost? It might be nothing. It might be an increase in junk mail. It might be identity theft if a hacker steals the data. Or it might end up being the turning point in a divorce case. Did you buy milk for your lactose-intolerant child? Perhaps you’re an unfit mother or father.

“People can’t make intelligent (privacy) choices,” Acquisti said. “People realize there could be future costs, but they decide not to focus on those costs.”

The issue with privacy is that human beings are essentially social beings. We are taught to value social interactions and to build relationships. In such an environment, it’s hard for an ordinary person to value privacy too highly. What do you think?

Engagement Marketing

The New York Times had an article a few days back on the problems with traditional brand campaigns and the emerging field of “engagement” marketing. Check it out here (might be restricted content after today).

Marketers of all sorts are now being urged to give up the steering wheel to a new breed of consumers who want more control over the ways products are peddled to them. Exhortations to bring consumers into the tent dominated the agenda of the 96th annual conference of the Association of National Advertisers, which took place here Thursday through yesterday. The nearly 1,000 people who attended the conference — a record for the trade group — heard one speaker after another describe a need to replace decades worth of top-down marketing tactics with bottom-up, grass-roots approaches.

“Consumers are beginning in a very real sense to own our brands and participate in their creation,” he said. “We need to learn to begin to let go” and embrace trends like commercials created by consumers and online communities built around favorite products.

For example, Yahoo Music asked fans of the singer Shakira to contribute video clips of them performing her song “Hips Don’t Lie,” and the submissions were culled to produce a fans’ version of her music video.

It’s a good idea to engage your customers in building your brand. The issue, though, is “how”. Companies are still trying to figure out how to get engagement marketing to work for them. Just having customers send in pictures of products (which is what Acura is doing in its latest ad in Newsweek) is not enough. Companies need to get to the underlying story of who the users are and how the product is an important part of their lives, to have a better chance of relating to other users.

As noted in the previous post on the 90-9-1% rule for Internet community participation, spending a whole lot of money to engage the small minority of users who are likely to work with a company is not likely to provide a good return on investment. Instead, what is needed is to generate brand messages, with community participation, that can be effective with a majority of the population. For this, companies need to get users to tell their story, with good placement for the product, such that the story is interesting and potent as a brand message. But with the substrate of anonymity on the Internet, this is really hard to do without expensive and explicit customer engagement.
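To put rough numbers on the 90-9-1 rule, here is a back-of-the-envelope sketch in Python. The audience size and campaign budget are invented figures and the function names are my own. The only thing taken from the rule itself is the split: about 90% of users only read, about 9% contribute occasionally, and about 1% contribute heavily.

```python
def participation_split(audience: int) -> dict:
    """Split an audience into heavy contributors, occasional contributors and lurkers."""
    heavy = audience // 100            # roughly 1% contribute heavily
    occasional = audience * 9 // 100   # roughly 9% contribute now and then
    lurkers = audience - heavy - occasional
    return {"heavy": heavy, "occasional": occasional, "lurkers": lurkers}


def cost_per_engaged_user(budget: float, audience: int) -> float:
    """Campaign budget divided by the users who are likely to participate at all."""
    split = participation_split(audience)
    engaged = split["heavy"] + split["occasional"]
    if engaged == 0:
        raise ValueError("audience too small to have any engaged users")
    return budget / engaged


audience = 1_000_000   # hypothetical audience size
budget = 500_000.0     # hypothetical campaign budget in dollars

print(participation_split(audience))
print(cost_per_engaged_user(budget, audience))   # cost per participating user
print(budget / audience)                         # cost per user reached at all
```

The gap between the last two numbers is the point. Any message that only works on the participating minority costs about ten times more per person than one that also lands with the people who just read.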

YouTube was a huge success in engagement marketing, but it had an advantage in this area because its product was the platform for telling such stories; for other brand owners it’s a difficult challenge.