VSTF Wiki:Wildtech

Wildtech is an IRC bot that reports spam edits and page creations in. It uses machine learning techniques (classification algorithms) to determine whether an edit is spam or not.

Wildtech reports two different classifications of spam:

: Spam page title  Spam? http://spam-wiki.wikia.com/index.php?diff=60 Spammer * (+4096)

When the data is too short to make a reliable prediction but the prediction is still positive, the bot writes:

: Short, possible spam page title  Possible spam? (not enough data to make reliable prediction) http://maybe-a-spam-wiki.wikia.com/index.php?diff=60 Another spammer * (+230)
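The choice between the two report formats above can be sketched roughly as follows. This is only an illustration: the function name `format_report`, its parameters, and the `MIN_RELIABLE_LENGTH` cut-off are assumptions, since the actual threshold and internals of Wildtech are not documented here.

```python
# Hypothetical cut-off; the length Wildtech really considers "sufficient" is not documented.
MIN_RELIABLE_LENGTH = 500


def format_report(title, url, user, size_delta, predicted_spam, text_length):
    """Return a report line in the style shown above, or None if the edit is not flagged."""
    if not predicted_spam:
        return None  # negative prediction: nothing is reported
    if text_length >= MIN_RELIABLE_LENGTH:
        verdict = "Spam?"
    else:
        # Positive prediction, but too little text to trust it fully.
        verdict = "Possible spam? (not enough data to make reliable prediction)"
    # ":+d" prints the size change with an explicit sign, e.g. (+4096).
    return f"{title} {verdict} {url} {user} * ({size_delta:+d})"
```

For example, a flagged edit with only 230 characters of text would fall into the "Possible spam?" branch under this assumed threshold.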

To classify edits, Wildtech uses a data set of labeled spam and non-spam examples. To add a new spam or non-spam sample, a  command may be used:

Executing this command stores the page or revision data and labels it as spam or non-spam, respectively. Wildtech later uses this data to improve its predictions. Dataset changes are not applied instantly: Wildtech re-learns the data set within the next 3 hours and then uses the new data in its predictions. Only voiced users and channel operators can execute this command.
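The workflow in the paragraph above (labeled samples are queued, then picked up at the next re-learning pass) might look something like the sketch below. The class `SpamModel` and its methods are hypothetical, and the naive Bayes classifier is a stand-in chosen for brevity; this page does not say which classification algorithm Wildtech actually uses. The permission check relies on the usual IRC nick prefixes, `+` for voiced users and `@` for channel operators.

```python
import math
from collections import Counter


class SpamModel:
    """Tiny naive Bayes text classifier; a stand-in, not Wildtech's actual algorithm."""

    def __init__(self):
        self.pending = []   # labeled samples waiting for the next retrain
        self.samples = []   # samples the current model was trained on
        self.word_counts = {True: Counter(), False: Counter()}
        self.class_counts = Counter()

    def add_sample(self, nick_prefix, text, is_spam):
        # Only voiced users ('+') and channel operators ('@') may label data.
        if nick_prefix not in ("+", "@"):
            return False
        self.pending.append((text, is_spam))
        return True

    def retrain(self):
        # Meant to be called periodically; new samples have no effect until then.
        self.samples.extend(self.pending)
        self.pending.clear()
        self.word_counts = {True: Counter(), False: Counter()}
        self.class_counts = Counter()
        for text, is_spam in self.samples:
            self.class_counts[is_spam] += 1
            self.word_counts[is_spam].update(text.lower().split())

    def is_spam(self, text):
        if not self.class_counts[True] or not self.class_counts[False]:
            return False  # no examples of one class yet, so no prediction
        vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        total = sum(self.class_counts.values())
        scores = {}
        for label in (True, False):
            n_words = sum(self.word_counts[label].values())
            score = math.log(self.class_counts[label] / total)
            for word in text.lower().split():
                # Laplace smoothing keeps unseen words from zeroing the score.
                count = self.word_counts[label][word] + 1
                score += math.log(count / (n_words + len(vocab)))
            scores[label] = score
        return scores[True] > scores[False]
```

In this sketch a sample added through `add_sample` sits in `pending` and only influences `is_spam` after the next `retrain()` call, which mirrors the delayed update described above.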