Mining Wikipedia paper at ICWSM 2012

Yay! My paper entitled “What Britney Spears and Kobe Bryant Have in Common: Mining Wikipedia for Characteristics of Notable Individuals” was accepted at ICWSM 2012.

The pdf can be downloaded here:
Mining Wikipedia For Characteristics of Notable Individuals.pdf

So what do Britney and Kobe have in common? They’re both successful, and my research shows that having a rare name increases the chance of success.

Wait, you say, Britney is a common name! Not so. When Britney was born in 1981, her name was far down the list of popular names: #758, as a matter of fact. So in her age group, her name was quite rare, and that really distinguished her from other musicians. I remember listening to those albums back then: people always said “Christina Aguilera” in full, but when you said “Britney”, everyone knew you were talking about the one-and-only Ms. Spears. Only later, when she gained immense popularity, did her name become common, as parents started naming their daughters “Britney” (the name rose to rank #137 in 2000).

When Britney was becoming a star, her uncommon name helped her. This is not surprising for entertainers, but according to my research, this observation holds for athletes and successful people in general.
And if you don’t have an uncommon name, then my research shows that using a nickname also helps. Think ‘Steve’ Jobs.

I also looked at birth locations. If you’re born in California or New York, you’re 2x more likely to become an entertainer. Not too surprising, because of Hollywood & Broadway. If you’re born in the South, there is an increased chance of becoming an athlete.

This isn’t to say that if you have a common name or you weren’t born in these states, there’s no chance you will become famous. It just shows there is an enrichment for these characteristics.

So if you have a common name, try using a nickname!


  • People with rare names are more than 2x as likely to appear in Wikipedia (2.43x for women; 2.30x for men).
  • People with nicknames are also more likely to be in Wikipedia. Males with nicknames are 2.39x more likely to appear in Wikipedia, while for females it’s a 1.32x increase.
  • Individuals born in New York and California are ~2x more likely to become entertainers, and those born in the South are ~1.5x more likely to become athletes.
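
The “Nx more likely” figures above read as enrichment ratios. Here is a rough sketch of that arithmetic, assuming the ratio is the share of a trait among people with Wikipedia articles divided by its share in a baseline population such as census data. This is an illustration of the idea, not necessarily the exact method used in the paper; the function name and all counts are made up.

```python
def enrichment(notable_with_trait, notable_total,
               population_with_trait, population_total):
    """Share of a trait among notable people divided by its share in a baseline population.

    Hypothetical helper for illustration; not code from the paper.
    """
    if notable_total <= 0 or population_total <= 0 or population_with_trait <= 0:
        raise ValueError("totals and the baseline trait count must be positive")
    notable_share = notable_with_trait / notable_total
    population_share = population_with_trait / population_total
    return notable_share / population_share


# Made-up counts, chosen only to show the calculation (not data from the paper):
# 30% of the notable group has the trait versus 15% of the baseline population.
ratio = enrichment(notable_with_trait=300, notable_total=1000,
                   population_with_trait=15_000, population_total=100_000)
print(f"{ratio:.2f}x")
```

A ratio above 1 means the trait is over-represented among notable people relative to the baseline; a ratio of 1 means no difference.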

There’s a lot of data in Wikipedia, and it can be mined for much, much more. This paper describes a couple of features; more associations can be gleaned in the future.



1 ping

  1. Wikimedia Research Newsletter, July 2012 — Wikimedia blog says:

    [...] Mining Wikipedia for common traits of notable individuals: Researcher Pauline C. Ng presented a paper at ICWSM ’12 showcasing the potential of using Wikipedia as a corpus of data to study the common characteristics of “notable individuals”.[9] Names and birth locations of a list of 40,250 people born in the United States from 1940–1989 and with a Wikipedia article were compared against census data. The analysis reveals interesting patterns such as the fact that “people with rare names [are] more than 2x likely to appear in Wikipedia” or that “people with nicknames are more likely to be in Wikipedia”, but with a significantly more pronounced effect for male than female individuals. The author suggests that mining Wikipedia biographies may help “discover novel characteristics associated with positive life outcomes”. The main findings of the paper are summarized in this blog post. [...]
