On Jan 15, 2015, wikiHow will enter our second decade as we celebrate our 10th birthday. Old timers will remember how I used to be fond of saying that our mission will take decades to accomplish. That turns out to be true.
As we enter our second decade, our shared fervor to give every person on the planet the highest quality, most helpful how-to instructions is burning brighter than ever. During much of our first decade we tolerated and even welcomed low quality articles on wikiHow, as we recognized that wikiHow was a work in progress that needed time to improve. As we enter our second decade, I think we can now begin pushing the minimum quality standard we accept upwards. For years many great community members have asked, begged and pleaded with me to come down harder on some of the low quality articles that compromise wikiHow’s reputation as a trustworthy and authoritative source. I have always wanted to wait a bit longer to allow wikiHow more time to develop. After 10 years of patience, I think now is a great time to change how we deal with lower quality articles.
As a first step, I’ve asked Chris and Gershon to build an algorithm and bot that identifies articles to put stub tags on. Stubbing more articles will communicate to our readers that we don’t think those articles meet wikiHow’s quality standards. A stub article is also less likely to get read or shared. This move will most likely hurt wikiHow’s readership numbers in the short term. I know that is painful, since we all love seeing how many people read wikiHow. In the long term, if we show readers only articles that we are truly proud of, wikiHow’s reputation for excellence will rise and our readership will rise accordingly.
This automated process will likely only be 80-95% accurate, so it will mislabel some non-stubs as stubs. To correct for this problem, we’re going to do a few things. First, we will publish a list of every article the algorithm has labeled a stub, so that anyone can look it over and correct the errors. Second, the bot stubbing will be clearly marked in page history so we can all see what happened.
Third, I’d encourage anyone who finds a stubbed article that really shouldn’t be stubbed to simply remove the tag. I expect the algorithm-based stubbing to start at some point in the next few days. I’m expecting the computer will stub around 5,000 articles. The list of stubs will be published on the forums once we push this live. Like all things with wikiHow, this is an experiment. If this turns out to be a tremendous mistake (and it easily could be), we can easily take the stub tags off the articles automatically. If this goes well, we may want to keep doing more. Here is an example of an article that the bot would stub: http://www.wikihow.com/Buy-Call-of-Duty-Mods
Not all articles deserve a stub tag as badly as this one, though! There will be plenty of borderline situations. So what do you think? Are you in favor of raising our minimum quality bar as we approach our 10th birthday? Do you think this is a good way to start?
@JackHerrick
I’m glad I’ll be around to celebrate wikiHow’s tenth birthday!
I very much agree with you on the article quality. It gets tiresome seeing duplicate, stub, and just plain low-quality articles around wikiHow. I tried to take some action by doing more merging, but it’s just more than a few people can handle, and I’m not a full-time staff member, so I don’t have the time to run a full-time project (although I wish I could). Ever since I noticed all the low-quality articles, I’ve hoped that the wikiHaus would take action at some point, since, of course, they have the most influence. I hope the project goes well; tell me if there’s any way I can help out.
Will someone explain how the bot will differentiate between short articles? This is a topic that challenges even human editors.
The bot uses a variety of factors including reader behavior on the article, accuracy scores, page views, page length and other metrics. So it isn’t as simple as looking at page length. As I mentioned in the first post, it’s not perfect. I’d say it gets about 80-95% correct. Adding human review to its work will be very helpful.
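To make the idea of combining several signals concrete, here is a minimal sketch in Python. It is not the actual bot: the field names, weights, caps and threshold are all assumptions chosen for illustration, and the real algorithm is not described in this thread beyond the list of factors above.

```python
from dataclasses import dataclass


@dataclass
class ArticleSignals:
    """Hypothetical per-article metrics; the field names are illustrative only."""
    helpfulness: float   # share of readers who rated the article helpful, 0..1
    accuracy: float      # accuracy score, 0..1
    monthly_views: int   # page views over the last month
    word_count: int      # page length in words


def stub_score(a: ArticleSignals) -> float:
    """Return a score between 0 and 1; higher means more stub-like.

    The weights and caps below are made up for this sketch.
    """
    # Length and views are capped so very long or very popular pages
    # cannot push the score below zero.
    length_signal = 1.0 - min(a.word_count, 500) / 500
    views_signal = 1.0 - min(a.monthly_views, 1000) / 1000
    return (0.35 * (1.0 - a.helpfulness)
            + 0.35 * (1.0 - a.accuracy)
            + 0.20 * length_signal
            + 0.10 * views_signal)


def should_stub(a: ArticleSignals, threshold: float = 0.6) -> bool:
    """Flag the article when its combined score reaches the assumed threshold."""
    return stub_score(a) >= threshold


# A short but well-rated article versus a short, poorly rated one.
short_but_good = ArticleSignals(helpfulness=0.9, accuracy=0.9,
                                monthly_views=50, word_count=120)
short_and_poor = ArticleSignals(helpfulness=0.2, accuracy=0.3,
                                monthly_views=20, word_count=80)
```

With these assumed weights, page length and page views together carry less than a third of the score, so a short or new article with good reader feedback stays below the threshold.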
I’ll be there for the 10th birthday as well. I’m not exactly a long-time user, but I know the site pretty well by now. As for the bot, I’m in support of raising the minimum quality standard, but I can’t say I 100% support the use of a bot. I’m on the fence about it, really. While it seems useful enough, the fact that it may be as low as 80% accurate concerns me a bit. I see the potential for the bot to mass-stub articles that shouldn’t be stubbed and leave a mess to clean up. I know that bots can’t always be accurate, but it’s enough of a concern for me to mention.
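For a rough sense of scale, the figures quoted in this thread (around 5,000 stubbed articles, 80-95% accuracy) can be turned into an estimate of the cleanup workload. This sketch assumes "accuracy" means the share of bot-applied stub tags that are correct, which the thread does not state explicitly; the function name is hypothetical.

```python
def expected_mislabels(stubbed: int, accuracy_percent: int) -> int:
    """Estimate how many stubbed articles were tagged wrongly.

    Assumes accuracy_percent is the share of stub tags that are correct.
    Integer arithmetic avoids floating-point rounding noise.
    """
    return stubbed * (100 - accuracy_percent) // 100


best_case = expected_mislabels(5000, 95)   # 250 articles to un-stub
worst_case = expected_mislabels(5000, 80)  # 1000 articles to un-stub
```

Under that assumption, human reviewers would need to remove somewhere between 250 and 1,000 tags.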
@JackHerrick
Will the bot only target articles that have been around for a while, or all articles in general? I’m interested to know this since you mentioned it takes into account page views. A new article, however good it may be, might not be popular enough to get a lot of views in the first, say, month. How does the bot deal with those kinds of issues?
Here is the list! https://docs.google.com/spreadsheets/d/1zb7DZKBU_ps6oehrJT2S7RGshOtnPGtwHRi_jUVFo_E/edit#gid=0
@Danielbauwens
It’s all articles, but I don’t think page views were weighted heavily enough that a good article would get stubbed. If you find any examples of that, please let me know so I can send the feedback for fine-tuning!
This is a nice new system! I am definitely in favor of raising our quality bar. Re: wikiHow’s 10th birthday/anniversary - I’m glad that I will be around to celebrate this momentous occasion! Looking forward to it! Please keep us updated if there is a meetup, too. We would absolutely love to attend.
Hailey
10
I also am in favour of raising the quality bar. This seems like a pretty good idea. I am going to be here on wikiHow’s 10th anniversary too.
I am very pleased to see quality issues being addressed. Your technical team clearly have talent! The stub program is a great innovation. I will do my best to look for stubs I can add to as frequently as possible. I wonder if the techniques applied to the stub aspect of quality control could be used in any way to stop users adding one-word tips?
If there were a way to parse article histories, software might identify articles that used to be much larger and then had vandalism patrolled in. If proper edit summaries have been made, articles that had previously been stubbed and then had the stub tag removed might also be identified by software. Articles that fit either of the above are likely not stubs.
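The two history checks suggested above could be sketched roughly as follows. This is a hypothetical illustration in Python, not wikiHow code: the `Revision` record, the 50% shrink ratio, and the edit-summary keyword match are all assumptions, and real edit summaries would need a more careful match.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Revision:
    """Hypothetical record of one edit in an article's history."""
    size_bytes: int  # article size after this edit
    summary: str     # edit summary left by the editor


def likely_not_stub(history: List[Revision], shrink_ratio: float = 0.5) -> bool:
    """Return True if the history suggests the article should not be stubbed.

    history is ordered oldest first. Two heuristics are applied:
    1. the article is now much smaller than its largest past revision,
       which hints that good content was lost (for example to vandalism);
    2. an edit summary mentions removing a stub tag, which hints that a
       human already reviewed the article and judged it not a stub.
    """
    if not history:
        return False
    current_size = history[-1].size_bytes
    peak_size = max(r.size_bytes for r in history)
    shrank = current_size < peak_size * shrink_ratio
    destubbed = any(
        "stub" in r.summary.lower() and "remov" in r.summary.lower()
        for r in history
    )
    return shrank or destubbed


# Three small example histories.
gutted = [Revision(4000, "new article"), Revision(900, "")]
reviewed = [Revision(1000, "created"), Revision(1000, "Removed stub tag")]
ordinary = [Revision(800, "created"), Revision(850, "typo fix")]
```

Articles for which `likely_not_stub` returns True could be skipped by the bot or routed to human review instead of being tagged automatically.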
I have an article incorrectly identified. See its discussion page for my comments about the Stub Bot. Guess wikicode doesn’t work here. It’s at http://www.wikihow.com/Discussion:Bookmark-or-Favorite-Various-Web-Browsers
Have it exclude inuse-tagged articles and it would help fix a major problem.
Thanks for the heads up on this problem, @Robbieleeactor. I’ve asked the engineers to fix it if we run this again.