Online community numbers

The Center for the Digital Future at the USC Annenberg School recently came out with a new report titled, “Surveying the Digital Future”. This year’s report had an interesting module looking at on-line communities and social networking. The full report is available for sale but the summary has some interesting pieces of data.

Online communities and offline action — The Digital Future Project found that involvement in online communities leads to offline actions. More than one-fifth of online community members (20.3 percent) take actions offline at least once a year that are related to their online community. (An “online community” is defined as a group that shares thoughts or ideas, or works on common projects, through electronic communication only.)

Social activism – Participation in online communities leads to social activism. Almost two-thirds of online community members who participate in social causes through the Internet (64.9 percent) say they are involved in causes that were new to them when they began participating on the Internet. And more than 40 percent (43.7 percent) of online community members participate more in social activism since they started participating in online communities.

Online communities: daily use — A significant majority of members of online communities (56.6 percent) log into their community at least once a day.

Member interaction — Online communities are online havens for interaction among members; 70.4 percent of online community members say they sometimes or always interact with other members of their community while logged in.

This is a fairly interesting set of data, especially the last bit related to member interactions. The 70.4% number mentioned above seems to contradict the 90-9-1 rule proposed by Jakob Nielsen. I followed up with the authors of the Annenberg study to find an explanation for the significant difference. I got a prompt response back from Michael Suman of UCLA (Thanks guys, you rock!!).

Seems that the 90-9-1 rule was developed back in the 90s, which is an
eternity ago in terms of the net. Maybe that has something to do with the
discrepancy.

Michael

Not an entirely satisfactory explanation…This is still a pretty big discrepancy. I followed up with Jakob Nielsen to get his take on this and got a prompt and detailed response (Thanks Jakob).

Based on their press release, the Annenberg project was only a survey, meaning that it’s based on what people say, not what they do. This means that responses are highly biased in favor of socially desirable statements: people are likely to self-report much more activity and engagement than they actually exhibit.

Also remember that the same person can be a lurker in many contexts and active in a few. If a simplistic survey question asks whether they participate, then the answer could be a truthful “yes”, even if it were “no” for 90% of the contexts.

Going beyond the inherent weaknesses of any survey, this survey has been contacting the same people every year for six years. While this provides the benefit of longitudinal data, it also risks increasing the bias, as people may only stay with the survey if they are interested in the kinds of things that the survey asks about.

Finally, what do they mean by “members of online communities”? Quite likely this only includes the 10% of users who are active, and not the 90% who are lurkers, for any given topic. Certainly, the statement that “43.7 percent of online community members participate more in social activism since they started participating in online communities” seems to indicate a very narrow slice of society, since most people don’t participate in social activism.

You highlight the finding that “70.4 percent of online community members say they sometimes or always interact with other members of their community.” First, we don’t know whether they in fact do this, because it’s only self-reported responses to a survey, as opposed to empirical observation. Second, we don’t know how *much* people do this, even when they do. Let’s say that “community members” are the 10% non-lurkers. Then it could easily be the case that 9 of these 10 percent very rarely interact, making the last 1 percent of hyper-active users responsible for most of the interaction.

The fact that participation inequality was named in the 1990s is not a reason to disbelieve it. It’s a reason to believe that it’s a fundamental characteristic of human behavior since it has also been found in systems in the 1980s and 2000s, including the current website examples I mentioned in my article. The older an insight is, the more likely it is to be true, because if it were wrong, there would have been time for lots of studies to accumulate contrary evidence. It’s much more likely that the “latest, hot” press release has some methodological flaw if it contradicts a long-established finding.

Jakob Nielsen, Ph.D.
Nielsen Norman Group
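Jakob’s hypothetical is easy to put into numbers. The sketch below is only back-of-the-envelope arithmetic: the 70.4% figure comes from the report summary, while the assumption that “community members” are just the 10% of non-lurkers is his guess, not something the Annenberg report states. The function name is made up for illustration.

```python
# Illustrative arithmetic only. The 10% "member" share is Nielsen's
# hypothetical from the email above, not a figure from the Annenberg report.

def share_of_all_users(member_share: float, interacting_share: float) -> float:
    """Fraction of ALL users who say they interact, if only
    `member_share` of users count as community members in the survey."""
    return member_share * interacting_share

survey_interacting = 0.704    # 70.4% of members, from the report summary
hypothetical_members = 0.10   # assumption: "members" are the 10% non-lurkers

overall = share_of_all_users(hypothetical_members, survey_interacting)
print(f"{overall:.2%} of all users would report interacting")
```

Under that assumption, roughly 7% of all users would report interacting, which sits comfortably inside the 9% + 1% of contributors in the 90-9-1 rule. Whether the assumption actually holds is exactly what the survey summary does not tell us.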

This is getting interesting…What do you guys think?

Don’t trust, just verify

Francois Gossieaux, president of Corante, Inc., has an interesting post about the lack of community spirit and trust in corporate cultures. He points to some interesting statistics from a study by the American Management Association.

…76% of companies monitor employee web site connections and 55% retain and review email messages. The number of companies tracking telephone calls, including amount of time spent on the phone and phone numbers called has grown to 51%, up from 9% in 2001. And this does not include companies who require periodic medical checks and random drug usage tests.

This is a disturbing trend that runs quite contrary to the idea of karma capitalism (check out the BusinessWeek article that coined the phrase).

On the one hand, the open source model of software development, which relies on trust in the community, is gaining prominence. Google is a media and community darling based on its “don’t be evil” mantra. Wikipedia is becoming one of the most important and useful sources of information on the web. Cooperation is becoming more and more important in the concept of coopetition. For example, check out Matt Mullenweg’s take on competition in a recent interview.

On the other hand, enterprises are still looking for ways to extend their command and control influence. Remember the Walmart and Edelman PR fiasco or the lack of trust mentioned in the report above.

Best example of trust

This is going to be an interesting tussle…Already, public opinion is changing to reflect a growing dissatisfaction with business-as-usual. I hope that over time this is going to force enterprises to face up to the limitations of command and control, and lead them to appreciate the power of trusting their employees and customer communities.

10 Minute Mail

10 Minute Mail is a new service for creating temporary email addresses. These addresses can be used for registering on sites that require users to provide an email address. The goal is to rid users of a lot of unsolicited spam emails. Chris Null from Yahoo! has a review of the service:

Well here’s a brain-dead simple solution to the problem: 10 Minute Mail (Note: Web traffic from this story may be causing the 10 Minute Mail site to crash. If it doesn’t load, try it again later.), which provides, for free, exactly what is promised in the name: An email address that vanishes after 10 minutes. There’s no registration, no verification. Just click over to the site and hit “Get my 10 Minute Mail e-mail address.” You’ll instantly be given an address that ceases to exist after 10 minutes. You can then use this address in filling out web forms or whatnot, and a very simple web-based interface gives you full access to any mail the account receives. You can reply to any messages, but you can’t send mail to an account that hasn’t already emailed you. If you can’t get the job done in 10 minutes, you can reset the timer to 10 minutes at any time. There’s no need to login, no password to remember.

For safe surfing and spam avoidance, I haven’t found a simpler, more elegant solution than 10 Minute Mail. It works flawlessly and couldn’t be easier to use. It’s earned a place in my Favorites folder. Give it a spin and see what you think!

I can see this being useful when you want to register for some event or something but you don’t want to receive any follow-on emails…Typically, though, most users (including me) have an email address just for the purpose of registering for services that could send spam emails.

Now, what happens if a site requires users to give a valid email address as part of its terms of service (TOS)? Isn’t using 10minutemail-generated addresses a violation of such terms? Also, all the addresses that this service generates are from the domain 10minutemail.com…Couldn’t the sites that ask for a user’s email address just reject addresses with the 10minutemail.com domain as part of email validation?
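To show how little effort such a rejection would take, here is a minimal sketch of a domain check a site could run during sign-up. The blocklist and the function name are hypothetical; I have no idea whether any site actually does this, and a real validator would do much more than compare domains.

```python
# Hypothetical blocklist for illustration; a real site would have to
# maintain its own list of throwaway-address domains.
BLOCKED_DOMAINS = {"10minutemail.com"}

def is_disposable(address: str) -> bool:
    """Return True if the address's domain is on the blocklist."""
    # Split on the last "@" so only the domain part is compared.
    _local, separator, domain = address.strip().rpartition("@")
    if not separator:
        # No "@" at all: not something this check can classify.
        return False
    return domain.lower() in BLOCKED_DOMAINS

print(is_disposable("abc123@10minutemail.com"))
print(is_disposable("me@example.org"))
```

Of course, a check like this only works while the service keeps handing out addresses from a known domain, so it could easily turn into a cat-and-mouse game.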

Why mask your identity to access a service

Overall, it just seems like the wrong solution to the problem. The real solution is to punish businesses or service providers that spam their users by signing out of their services or boycotting them. Trying to fake one’s identity to avoid potential spam mail just does not seem like the right way to address this issue.

On-line communities

There are more than 50 million active blogs, and community sites like Digg and Slashdot are more popular than ever. But how do on-line communities compare to real-world communities? Is it even fair to compare real-world communities to on-line communities?

In the real world, people in a community typically interact based on geographical proximity. In the blogosphere it’s easy for users to join new groups without geographical limitations, as the cost of travelling to or joining a community is typically zero. This lets people with common interests band together much more easily. But because the cost of joining a community is zero, it reduces the community spirit, as people can participate without investing much of their time, their money or their reputation. Let’s look at a couple of examples to further explore these differences.

Participation inequality

As Jakob Nielsen pointed out in a recent Alertbox article, user participation in on-line communities often follows a 90-9-1 rule. This means that typically 90% of on-line users are lurkers, 9% of users contribute from time to time, and the remaining 1% of users participate a lot and account for most contributions.

Now some of the participation inequality is driven by the lower costs associated with joining an on-line community, which enable not-so-motivated users to join…Some of it can be explained by inherent human nature. For example, even physical communities display somewhat lop-sided participation characteristics. The Internet, though, I think exacerbates this problem. Following are some of the reasons:

  • No rewards for participation. Contributing to a community does not come naturally to most people unless there is a reward associated with contributing. The mechanisms for providing reward for participation are largely missing from the blogosphere at this point in time.
  • The default substrate for interactions on the Internet is anonymity. It takes extra effort to get users to drop the cloak of anonymity and express their opinions. This happens only when they feel really strongly about the topic of discussion.
  • Bad UI design that discourages user participation. One of the most annoying issues here is requiring users to register before they can leave a comment.

Spreading the word-of-mouth

In real-life communities, a good band or a good chef gets a fan following. Now as long as the band keeps producing good music, or the chef good dishes, those fans will spread the word about the band or the restaurant to their communities, friends and family. Over time this will result in driving traffic to the band or to the restaurant.
Word of mouth mechanisms are vital in a community

In on-line communities, the situation is quite different. If a reader likes a particular new blog, he/she leaves a comment. This comment on the new blog is not very useful in driving traffic, as it does not create any additional incoming links to the blog. Another option that a reader has is to put the new blog on their blogroll. This will create a new incoming link to the new blog. Now, as all bloggers know, getting added to the blogroll of a popular blog is a big deal…It only happens if the new blogger is well known, is writing on the same subject as the reader, or has established enough credibility in the space. This makes for a fairly high threshold for a new and unknown blogger to get on the blogroll. This lack of a good word-of-mouth propagation mechanism in the blogosphere makes it hard for a new blogger to build a successful blog despite creating great content.

Conclusion

What we really need is an effective word-of-mouth propagation system based on users reading and interacting with a blog. MyBlogLog (rumored to be in talks to be acquired by Yahoo!) did a great job of creating a community of readers. With MyBlogLog, readers could add other blogs that they visit to their profile and thereby create a bit of the word-of-mouth effect. Also, the explicit identification of users with the MyBlogLog visitor widget humanized the users. This helped in creating trust and more interactions in the community. We need more ideas like MyBlogLog that help communities of readers connect, generate trust and share better with each other.

Overall I think on-line groups are indeed communities joining together based on a shared interest. Still we need to address some of the issues – propagating word of mouth, better networking for readers and incentivizing participation – to make these communities a lot more vibrant and participatory.

CopyBot

Great article by Jennifer Granick on the issues facing Linden Lab, the creator of Second Life, after CopyBot burst onto the scene. For those not familiar with the havoc that a new program called CopyBot is causing in the Second Life economy, below is the summary of the issue from the article:

Businesses in Second Life are in an uproar over a rogue software program that duplicates “in world” items. They should be. But the havoc sown by Copybot promises to transform the virtual world into a bold experiment in protecting creative work without the blunt instrument of copyright law.

Second Life, operated by Linden Labs, has developed differently from other virtual worlds because it allows custom content and encourages in-world enterprise. It’s a hospitable place for creators to sell virtual goods like clothing, furniture and hairstyles.

As in any economy, the value of those goods depends on their scarcity: people will pay more for a fantastic hairdo that no one else has. If Copybot can indiscriminately duplicate these items, no one has to pay the creator for them. Copying is a value killer.

As a result, Second Life merchants are understandably up in arms over the software, reportedly closing their stores until the problem is resolved.

In the short term, Linden Lab is looking to allow copyright holders to sue folks using CopyBot for infringement in court. This is unlikely to solve the problem, as most people don’t have the time or the money to pursue such a case. The other, longer-term approach Linden Lab is looking at is building a system of norms, kinda like a reputation system, I guess, to incentivize copyright compliance without getting legal about it.

The idea that innovation can flourish in the absence of copyright enforcement is not as heretical as it might seem.

Take the fashion industry. As law professors Chris Sprigman and Kal Raustiala write in their paper on the subject, neither copyright nor patent law prohibit copying fashion designs. There is some protection for the brand associated with the apparel, but no law prohibits a knock-off Chanel suit, peasant skirt or narrow lapel. And yet fashion is highly innovative, with new styles several times a year, despite low IP protection.

Similarly, professors Emmanuelle Fauchart and Eric von Hippel write that haute French cuisine (.pdf) is another area with low IP protection, yet high levels of innovation and creativity. No law prevents copying recipes. Instead, French chefs have developed social norms, much like those Linden Labs seeks to empower, against exact copying, dissemination of tricks of the trade and adopting significant innovations without crediting the chef responsible.

Failure to follow these norms results in reputation harm, including ostracism.

Such a norms-based (rather than law-based) system might work in Second Life. Norms-based systems are context-sensitive and highly responsive to the concerns of the relevant community. They are also cheaper and quicker than litigation. But norms-based systems can only work if the people in the community value the rewards the community can bestow or withhold.

The issue is how Linden Lab can expose enough information on individuals, without compromising their privacy, for the community to judge an individual’s compliance with established norms. This will require Linden Lab to strike an interesting balance between privacy and transparency.

Another issue is the one pointed out in the article. How will Linden Lab make sure that the norms in the community discourage copyright violations? In some real-world communities, such as some places in China and India, it is acceptable, and even considered wise, to buy a fake Gucci purse instead of the real thing despite the difference in quality. In the virtual world, the copied products would be identical in quality to the originals, so the downside of buying copies would be much smaller. Would such communities develop in the virtual world of Second Life?

How would CopyBot affect the pricing power of the creator of the original goods? What is to prevent other vendors from ripping off the original goods and selling them at a lower price? This kind of arrangement will also provide “plausible deniability” to end users and make norm-based enforcement almost impossible.

I have no idea how these things will evolve. One thing is for sure…This is going to be interesting to watch.

Some other takes on this issue:

CopyBot, Community and Controversy

Second Guessing Property Value in Second Life

Anti Click Fraud

I did a post on click fraud some time back but I want to get back to the topic. Last week, VentureBeat profiled ClickFacts, a company focused on providing more accurate information to brand-owners about who clicks on their on-line ads.

ClickFacts offers a Web-hosted product. It doesn’t manually track the IP addresses of the people who are making the clicks, like many other click-fraud companies do; that doesn’t suffice anymore, because fraudsters are more sophisticated, using programs that cover up their IP addresses. Instead, ClickFacts has developed an analytics program that looks at other variables.

–It looks at customer’s keywords and analyzes which ones have a higher propensity for click-fraud.
–It compares the sites of competing companies, and shows which IP addresses are hitting each of those sites, and tries to assess patterns, such as whether the timing of clicks appears programmed, for example hitting every 3.4 seconds. But it’s also measuring traffic anomalies beyond IP address patterns, to test for “proxy” and other fraud sources.
–It tests for group activity, in case several publishing sites or other people have colluded to click on each other’s sites or other sites.
–It shows which Web sites are consistently seeing lots of clicks on an advertisement, but where those clicks aren’t leading to continued activity within an advertiser’s site.
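To make the “programmed timing” idea concrete, here is a rough sketch of how one could flag a click stream whose gaps are suspiciously regular. This is my own illustration, not ClickFacts’ method; the function name, the tolerance and the minimum click count are all made-up assumptions.

```python
from statistics import pstdev

def looks_programmed(timestamps, tolerance=0.05, min_clicks=5):
    """Flag a click stream whose gaps between clicks are nearly identical.

    `timestamps` are click times in seconds. `tolerance` and `min_clicks`
    are arbitrary illustrative thresholds, not values from any vendor.
    """
    if len(timestamps) < min_clicks:
        return False  # too few clicks to say anything
    ordered = sorted(timestamps)
    # Gaps between consecutive clicks.
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    # A near-zero spread means the clicks arrive like clockwork.
    return pstdev(gaps) <= tolerance

bot_clicks = [i * 3.4 for i in range(10)]            # one click every 3.4 seconds
human_clicks = [0.0, 2.1, 9.8, 11.0, 30.5, 31.2]     # irregular gaps

print(looks_programmed(bot_clicks))
print(looks_programmed(human_clicks))
```

A fraudster who adds random jitter to the timing would slip past a check this simple, which is presumably why the article talks about looking at several variables together.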

This is somewhat different from what some of their competitors are doing…

An incumbent player in this area is Optimal iQ, which is owned by ClickForensics, but its approach relies more on reading logs.

It’s still not clear, though, what brand-owners can do with the data generated by these companies. Can they go back to Google and ask for a refund (apparently Google and Yahoo! only refund 1% of your advertising $$ spend)? As expected, Google has come out critically against ClickFacts and its VentureBeat profile (posted as updates to the story).

Update: Google has responded to this here and here, sending some reports arguing that ClickFacts has counted as fraudulent clicks some clicks that appear benign. It has to do with when users use the back arrow after clicking through to an advertiser’s page: When you click on a Kodak ad, for example, and land on the Kodak page, and then click on a specific camera there, and then click back to go look at another camera, ClickFacts is wrongly counting that as a bad click.

Update II: ClickFacts has responded, in turn, saying it is surprised Google is bringing this up. ClickFacts fixed this “back arrow” problem in June, after Google first brought it to ClickFacts’ attention, Caruso tells VentureBeat. In fact, he showed us a demo that strongly suggests to us ClickFacts is no longer counting those clicks as bad. He said Google’s reports above refer to an analysis of ClickFacts in February, and that Google knows ClickFacts has solved it, and so this should be a moot point. He said he wants to work with Google on other click-fraud matters, too.

The real issue here is that we need a good and universally accepted way to establish the identity of a user. It’s a hard problem. Another manifestation of this problem is Digg, with its SpikeTheVote-related issues…There might not be any quick fixes, as this might require a very expensive infrastructure solution. Still, the importance of solving this problem cannot be overstated. The future of on-line commercial and community activity may depend on it.

Anonymity is not privacy

I am quickly becoming a fan of Dave Kearns and his Identity Management Newsletter in Network World. Dave discusses complex identity-related issues but manages to write in a very simple and easy-to-read style. In his latest installment, Dave talks about the difference between privacy and anonymity.

I’d like to begin a discussion on anonymity as it relates to identity and technology. As noted last month, anonymity and privacy are frequently confused. One difference though is that privacy is almost always absolute (either something is private or it is not) while anonymity can be relative. If you look up “anonymity” at answers.com, you’ll find some variations in definition:

* “The quality or state of being unknown or unacknowledged.” (The American Heritage Dictionary of the English Language, Fourth Edition)

* “The quality or state of being obscure.” (Roget’s II: The New Thesaurus, Third Edition)

Anonymity is a characteristic of interactions in a specific context…Like getting a coffee from a coffee shop or leaving a comment with a made-up name on a forum.

If I join a chatroom where I’m only known as “SillyGrrl” I may think I’m anonymous because I think no one knows my true identity. But the chatroom has the IP address I use to converse and my ISP knows who was using that IP address at that time. Even if I go to a library terminal or an Internet café, there are records of who used which machine and IP address at any given time. Privacy considerations may lead to those records being destroyed periodically – monthly, weekly, daily – even hourly. But anyone with the wherewithal to be watching while I connect (just as the police were watching outside the coffee shop) can shatter the façade of anonymity and connect the activity to me.

In the course of our lives, and throughout our day, we move in and out of various contexts in various states of anonymity. We might assume that our status in a particular context is anonymous, depending on whether we share uniquely identifiable information in that context. But as Dave points out, and as outlined in this excellent Google Tech Talk video, “You Are What You Say: Privacy Risks of Public Mentions” (thanks Nitin for pointing this out), the risk to your identity from somebody taking the time to collapse and search across such contexts is severe. Anybody remember the AOL search data release fiasco? I guess, with this background, we can define privacy as a guarantee that your data will be kept siloed and not shared or merged with other contexts.

The upshot: we should be careful about what we say in public forums because, even with a rudimentary search across contexts, people may be able to find out a lot about us. Even scarier is somebody forming a company just to search across various public contexts on behalf of clients…In fact, I am pretty sure such companies already exist. So be careful.
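To get a feel for how rudimentary such a cross-context search can be, here is a toy sketch that links a profile from one context to a pseudonym in another purely by overlap in the things each has mentioned. All the names and data are made up, and the scoring (simple Jaccard overlap) is my own choice for illustration, not the method from the tech talk.

```python
def jaccard(a: set, b: set) -> float:
    """Overlap between two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def best_match(target: set, candidates: dict) -> str:
    """Return the pseudonym whose mentions overlap most with `target`."""
    return max(candidates, key=lambda name: jaccard(target, candidates[name]))

# Made-up data: items mentioned under pseudonyms on a public forum.
forum_mentions = {
    "SillyGrrl": {"Brazil", "Solaris", "Stalker"},
    "user42": {"Titanic", "Shrek"},
}
# Made-up data: items tied to a known person in a different context.
known_profile = {"Brazil", "Solaris", "Stalker", "Alien"}

print(best_match(known_profile, forum_mentions))
```

Even with data this small, a handful of uncommon shared mentions is enough to single out one pseudonym, which is the worrying part.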

Who owns your reputation?

Who owns your reputation? Apparently you, if ReputationDefender has its way…to the potential chagrin of the great Bob Blakley. From this review of the company in Wired magazine:

The mistakes you make on the internet can live forever — unless you hire somebody to clean up after you.

A new startup, ReputationDefender, will act on your behalf by contacting data hosting services and requesting the removal of any materials that threaten your good social standing. Any web citizen willing to pay ReputationDefender’s modest service fees can ask the company to seek and destroy embarrassing office party photos, blog posts detailing casual drug use or saucy comments on social networking profiles.

The company produces monthly reports on its clients’ online identities for a cost of $10 to $16 per month, depending on the length of the contract. The client can request the removal of any material on the report for a charge of $30 per instance.

The troubling thing about ReputationDefender is the way the company might go about fulfilling a customer request for eliminating embarrassing information.

Fertik declined to offer an exact description of his company’s means of removing content. “I can say we have codified a series of procedures that we are continually refining,” he says, “and that are specific to the source, location and nature of the content we are asked to destroy.”

If you’re a website owner and ReputationDefender knocks on your door, you are not legally bound to remove anything until a judge orders you to — a scenario that most website owners are keen to avoid.

“Most people will take materials down just to avoid the hassle of dealing with possible litigation,” says Susan Crawford, an associate professor at Cardozo Law School who specializes in cyberlaw and telecommunications law.

“If the letter is sufficiently threatening,” says Crawford, “the threaten-ee could bring his or her own lawsuit seeking a declaration that what they posted wasn’t unlawful. But, again, most people will just buckle rather than fight back.”

I am fine with the company deploying its “series of procedures” on behalf of a teenager, but I am afraid that these services would be misused to strong-arm websites into removing newsworthy or to-be-newsworthy information from the web. An extreme example of a similar situation was presented in the movie Eternal Sunshine of the Spotless Mind (that was a great flick), where Clementine Kruczynski (Kate Winslet) had all memories of her ex-boyfriend Joel Barish (Jim Carrey) erased. In this case the memory being erased is our collective record, the web, and, as in the movie, there is a huge potential for foul play here.

I tend to agree with Bob that one controls only one’s own actions. After the actions have been taken, one shouldn’t really be able to influence other people’s stories about those actions…This is not good news for the web and social media.

Privacy is the ability to lie about yourself without getting caught

Check out this old article by Dave Kearns about a presentation by Bob Blakley on the subject of Privacy.

Blakley spoke on the topic “What is Privacy, Really?” a subject near and dear to him as well as to many others in the identity realm. Privacy was, in fact, one of the driving forces behind the so-called “user-centric identity” movement.

But privacy is a widely misunderstood concept. It’s frequently confused with anonymity, often confounded with security and colloquially termed the “right” to be “left alone.” As Blakley puts it, “I don’t want to be alone, but I still want privacy.”

After about 20 minutes of telling us what privacy wasn’t, Blakley came around to stating what it was: “The ability to lie about yourself and get away with it.”

He was quick to point out that it wasn’t positing a right to lie (that’s an ethical, or legal question), just the ability to lie. What that means is that when someone asks you a question and you reply with an answer, the questioner cannot judge the veracity of your information. As Blakley more elegantly stated it: “If you could tell a listener the truth or tell him a lie … And if he would accept either story … then he has given you the benefit of the doubt…”

I think a lot of us take advantage of the ability to lie by providing false information on intrusive web forms. Another element of privacy that this definition does not quite capture is that the information submitted by a user is contained at the site and not shared with any other sites…No wonder people have a difficult time defining privacy and just want to be left alone.

Attack of the Bots

Check out the great article in Wired magazine regarding the power and menace of bots and their controllers:

AT FIRST, IT LOOKED LIKE typical network congestion. So the system administrators weren’t too concerned when TypePad blogs and LiveJournal social networks flickered like a light bulb in a faulty socket. But 15 minutes later, at 4 pm on May 2, 2006, the sites went dark, and so did the mood at Six Apart, the company that owns them. In the blink of an eye, 10 million blogs and online communities disappeared. “It looked like the servers had freaked out,” CEO Barak Berkowitz recalls. Flash floods of data thundered into one network port, stopped inexplicably, then reappeared to overwhelm another. The engineers pored over logs, desperately looking for a cause. After an agonizing hunt, they found it: a distributed denial-of-service attack, or DDoS. Six Apart’s servers had been inundated with so many requests that the machines couldn’t possibly process them all. It was the digital equivalent of filling a fish tank with a fire hose.

“After learning about bots, you might think, ‘I feel hopelessly outgunned and outmatched,'” says Peter Tippett, CTO of security consultancy Cybertrust. “You are.”

It’s a fascinating look into how paid, organized attacks are used to extract money or even shut down companies…It is still the wild, wild west in some areas of the Internet, and without the limitations of geography, it’s hard to see how we will be able to get a handle on these issues. This is going to be a big challenge.