Wikivoyage uses a spam filter to block external links to certain websites, based on keywords. The filter consists of two lists of regular expressions against which the text of each edit is compared. If the edited text matches any of the regular expressions, the user is warned and the page is not saved.
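As a rough illustration of the check described above, here is a minimal Python sketch. It is not the actual filter code: the patterns, the `.example` hostnames, and the function names `find_banned_match` and `try_save` are all hypothetical, chosen only to show the match-then-reject flow.

```python
import re

# Hypothetical patterns for illustration only; these are not taken
# from the real banned-content list.
BANNED_PATTERNS = [
    r"https?://[^\s/]*cheap-pills\.example",
    r"https?://[^\s/]*casino-bonus\.example",
]


def find_banned_match(text, patterns=BANNED_PATTERNS):
    """Return the first pattern that matches the text, or None."""
    for pattern in patterns:
        if re.search(pattern, text, flags=re.IGNORECASE):
            return pattern
    return None


def try_save(text):
    """Return (saved, message): reject the edit if any pattern matches."""
    hit = find_banned_match(text)
    if hit is not None:
        return False, f"Edit rejected: text matches banned pattern {hit!r}"
    return True, "Edit saved"
```

Calling `try_save("See http://www.cheap-pills.example/buy")` would reject the edit under these assumed patterns, while ordinary text with no matching link would be saved.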
The main list of regular expressions was available at http://wikivoyage.org/bannedcontent.txt. It was a copy of the BannedContent list on CommunityWiki, which originated the idea of a distributed regular-expression list. This list was not user-editable.
A secondary, local list of banned hostnames was maintained at Project:Local spam blacklist. This list was editable by registered users; sites that spammed Wikivoyage were added to it.
After the move to Wikimedia Foundation servers, the spam blacklist and whitelist are now:
Wikimedia also maintains shared lists on Meta, which affect multiple projects. These pages are editable only by administrators; to request a change, post on the corresponding talk page.
Since this spam filtering system was put into place, we've had a very steep decline in the number of spam postings on Wikivoyage. This has freed up people to work on other things, like making a great travel guide.
There are a few problems with the banned content list, as it stands:
- It's kind of unwiki -- a technological solution to a social problem.
- It's overreaching -- the list we use has a lot of general regular expressions for words like "porn", "pharmacy", etc., which occasionally match parts of foreign words or the names of important attractions we need to link to.
- It's relatively unresponsive. It takes a while for new URLs to show up on the list.
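
The overreach problem in the list above can be sketched in a few lines of Python. Both patterns and both URLs are made up for illustration (the `.example` hostnames do not exist); the point is only that a bare keyword also matches inside legitimate names such as Pornichet, a seaside town in France, while a pattern tied to a specific hostname does not.

```python
import re

# Hypothetical overbroad keyword pattern of the kind described above.
BROAD = re.compile(r"porn", re.IGNORECASE)

# Hypothetical narrower pattern that targets one specific spam host.
NARROW = re.compile(
    r"https?://(?:[\w-]+\.)*spam-videos\.example\b", re.IGNORECASE
)

# Made-up URLs: a legitimate tourism site and a spam site.
legit_url = "http://www.pornichet-tourism.example/"
spam_url = "http://www.spam-videos.example/porn"


def blocked_by(pattern, url):
    """Return True if the compiled pattern matches anywhere in the URL."""
    return pattern.search(url) is not None
```

Under these assumptions, `blocked_by(BROAD, legit_url)` is true (a false positive), whereas `NARROW` blocks only the spam URL.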
Sometimes spam sites slip through the filter, and sometimes valid sites are incorrectly blocked. Please report both kinds of problems on the spam filter talk page so they can be fixed. Alternatively, for URLs that were being incorrectly blocked, Wikivoyage maintained an editable Project:Local spam whitelist. A URL that was blocked but should not have been was listed on that page, with an explanation on the whitelist talk page of why it needed to be whitelisted.
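
One way a whitelist can override a blacklist is sketched below in Python. This is an assumption about the general approach, not a description of the software Wikivoyage actually runs: URLs matching a whitelist entry are skipped before the blacklist is consulted. The list contents, the `.example` hostnames, and the name `blocked_urls` are hypothetical.

```python
import re

# Hypothetical list contents for illustration only.
BLACKLIST = [r"pharmacy"]
WHITELIST = [r"https?://www\.island-pharmacy-museum\.example\b"]

# Simplified URL matcher: a scheme followed by non-whitespace characters.
URL_RE = re.compile(r"https?://\S+")


def blocked_urls(text, blacklist=BLACKLIST, whitelist=WHITELIST):
    """Return the URLs in text that match the blacklist and not the whitelist."""
    blocked = []
    for url in URL_RE.findall(text):
        # Whitelisted URLs are exempt from the blacklist check.
        if any(re.search(w, url, re.IGNORECASE) for w in whitelist):
            continue
        if any(re.search(b, url, re.IGNORECASE) for b in blacklist):
            blocked.append(url)
    return blocked
```

With these made-up lists, text linking to both the whitelisted museum site and `http://cheap-pharmacy.example/` would have only the second URL reported as blocked.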