TagScore: Approximate similarity using tag synopses
Collaborative tagging is the aggregate effort of a community of online users to annotate web content with metadata labels called tags. It is a simple activity that enriches our knowledge about digital content, and it has gained popularity through services such as Del.icio.us. The Del.icio.us repository is large and evolves daily, which presents interesting new problems for information retrieval (IR). We present TagScore, a scoring function that rates how well each Del.icio.us tag describes its associated web page. The scores yield a succinct synopsis of a page, which we can use to find similar pages efficiently. Using real Del.icio.us data, we show that our approach correlates well with cosine similarity while being several hundred times faster and requiring minimal storage overhead.
Keywords: Web, tagging, social bookmarking, del.icio.us
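
To make the comparison in the abstract concrete, the following minimal Python sketch contrasts cosine similarity over full tag-count vectors with a similarity computed on small per-page tag synopses. It is an illustration under stated assumptions, not the TagScore function itself: the abstract does not give the scoring formula, so the hypothetical synopsis() below simply keeps the k most frequent tags as a stand-in, and synopsis_similarity() uses Jaccard overlap as an assumed comparison. The page data is invented for the example.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse tag-count vectors."""
    dot = sum(count * b[tag] for tag, count in a.items() if tag in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def synopsis(tags: Counter, k: int = 3) -> frozenset:
    """Hypothetical synopsis: the k most frequent tags.

    ASSUMPTION: raw frequency stands in for the TagScore ranking,
    whose actual formula is not stated in the abstract.
    """
    return frozenset(tag for tag, _ in tags.most_common(k))


def synopsis_similarity(s1: frozenset, s2: frozenset) -> float:
    """Jaccard overlap of two synopses (an assumed comparison measure)."""
    union = s1 | s2
    return len(s1 & s2) / len(union) if union else 0.0


# Invented tag counts for three pages (illustrative only).
page_a = Counter({"python": 40, "programming": 25, "tutorial": 10, "web": 2})
page_b = Counter({"python": 30, "tutorial": 20, "programming": 5, "django": 3})
page_c = Counter({"recipes": 50, "cooking": 35, "food": 20, "web": 1})

full_ab = cosine_similarity(page_a, page_b)
full_ac = cosine_similarity(page_a, page_c)
syn_ab = synopsis_similarity(synopsis(page_a), synopsis(page_b))
syn_ac = synopsis_similarity(synopsis(page_a), synopsis(page_c))
```

In this toy setting both measures rank page_b as closer to page_a than page_c is, but the synopsis comparison touches at most k tags per page rather than the full tag vector, which is the kind of speed and storage saving the abstract describes.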