Web Spam Detection
- Marc Najork
in Encyclopedia of Database Systems
Published by Springer Verlag, 2009
Web spam refers to a host of techniques for subverting the ranking algorithms of web search engines, causing pages to rank higher in search results than they otherwise would. Examples of such techniques include content spam (populating web pages with popular and often highly monetizable search terms), link spam (creating links to a page in order to inflate its link-based score), and cloaking (serving a different version of a page to search engine crawlers than to human users). Web spam is annoying to search engine users and disruptive to search engines; therefore, most commercial search engines try to combat it. Combating web spam consists of identifying spam content with high probability and, depending on policy, downgrading it during ranking, eliminating it from the index, no longer crawling it, and tainting affiliated content.
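To make the idea of content-spam identification concrete, the sketch below shows a toy heuristic, not any search engine's actual method: it flags a page when a large fraction of its words come from a small hypothetical list of monetizable terms, or when a single term is repeated excessively. The keyword list, thresholds, and feature choices are all illustrative assumptions; production systems learn such signals from labeled data.

```python
# Toy content-spam heuristic (illustrative only; real engines use
# machine-learned classifiers over many features).
from collections import Counter

# Hypothetical list of highly monetizable search terms (an assumption
# for this sketch, not drawn from any real system).
MONETIZABLE_TERMS = {"casino", "mortgage", "viagra", "insurance", "loans"}

def spam_score(text: str) -> float:
    """Return a crude spam score in [0, 1] from two content features."""
    words = text.lower().split()
    if not words:
        return 0.0
    counts = Counter(words)
    # Feature 1: fraction of the page's words that are monetizable terms.
    monetizable = sum(c for w, c in counts.items()
                      if w in MONETIZABLE_TERMS) / len(words)
    # Feature 2: repetition -- share of the page taken up by its
    # single most frequent word (keyword stuffing leaves this high).
    repetition = counts.most_common(1)[0][1] / len(words)
    return max(monetizable, repetition)

def is_spam(text: str, threshold: float = 0.5) -> bool:
    """Flag a page whose score exceeds an (assumed) policy threshold."""
    return spam_score(text) >= threshold
```

A page stuffed with terms like "casino mortgage loans" scores near 1.0, while ordinary prose scores low on both features; the threshold is where the policy decisions described above (downgrade, de-index, stop crawling) would attach.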
All copyrights reserved by Springer 2009. This entry has been published in the Encyclopedia of Database Systems, a multi-volume, comprehensive, and authoritative reference on databases, data management, and database systems, edited by Ling Liu and M. Tamer Özsu. The Encyclopedia is available in print and online via SpringerLink. Visit http://www.springer.com/computer/database+management++information+retrieval/book/978-0-387-49616-0 for more information.