User:Staeckerbot
This is the userpage of a trial bot operated by User:Staecker.
This bot's main function will be to place speedy-delete tags on duplicate image uploads.
See my request for approval.
Details
The bot runs every 15 minutes and scans Special:Newimages. On the first pass, it checks only for uploads with identical file sizes. Any files with identical sizes then have their thumbnails downloaded and compared directly (with a simple "diff").
- If the thumbnails are identical, then:
  - If the file descriptions differ, each description is copied into the duplicate page using User:Staeckerbot/Duplicate-file-info as a template.
  - If exactly one of the versions is orphaned, then that version will be nominated for deletion using Template:Db-redundantimage.
  - If both versions are orphaned, then the one which was uploaded first will be nominated for deletion using Template:Db-redundantimage.
  - If neither version is orphaned, then neither image will be nominated for deletion, but both will be tagged with Template:Duplicate.
- If the thumbnails differ, then the images are logged to User:Staeckerbot/Suspicious images, to be handled by a human editor. I have found that most such cases do represent duplicate images, perhaps with different metadata or other differences invisible to the eye.
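The decision rules above can be sketched roughly as follows. This is only an illustration and not the bot's actual source: the `Upload` record, the function names, and the action strings are all hypothetical, no pywikipedia calls are used, and the step that copies file descriptions is left out.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Upload:
    """Hypothetical record for one new upload (not a pywikipedia type)."""
    name: str
    size: int            # file size in bytes
    thumbnail: bytes     # raw bytes of the downloaded thumbnail
    uploaded: datetime   # upload time
    orphaned: bool       # True if no page uses the image


def candidate_pairs(uploads):
    """First pass: yield every pair of uploads with identical file sizes."""
    by_size = defaultdict(list)
    for up in uploads:
        by_size[up.size].append(up)
    for group in by_size.values():
        for i, first in enumerate(group):
            for second in group[i + 1:]:
                yield first, second


def decide(a, b):
    """Return (action, target_name) for a pair with matching file sizes."""
    if a.thumbnail != b.thumbnail:
        # Same size but different bytes: leave it to a human editor.
        return ("log_suspicious", None)
    if a.orphaned and b.orphaned:
        # Both orphaned: nominate the one that was uploaded first.
        earlier = min(a, b, key=lambda up: up.uploaded)
        return ("db-redundantimage", earlier.name)
    if a.orphaned or b.orphaned:
        # Exactly one orphaned: nominate that one.
        target = a if a.orphaned else b
        return ("db-redundantimage", target.name)
    # Neither orphaned: tag both, nominate neither.
    return ("tag_duplicate", None)
```

Grouping by size first keeps the expensive step (downloading and comparing thumbnails) limited to the few uploads that could plausibly be duplicates.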
The bot is built on the pywikipedia framework and runs on Ubuntu Linux.
I welcome any comments or suggestions you have on improving the bot. - Staecker
Logs
- User:Staeckerbot/Suspicious images: Images which were uploaded close together and have the same file size, but differ as binary files. Many of these may be duplicate images, but they need to be checked by hand.
- User:Staeckerbot/Trial log: A log from my trial period, recording each edit (the edits themselves get deleted from Special:Contributions/Staeckerbot).
- User:Staeckerbot/Preapproval log: A log from before my approval to edit, with about 1000 recorded dupes that were not marked for deletion by the bot.
Statistics
The bot has been running since March 17, 2007. As of Wed Mar 28 23:30:08 2007:
Days in operation | 11 |
Images nominated for deletion | 847 |
Megabytes nominated for deletion | 153 |
Average nominations per day | 77 |
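As a quick check, the derived figures in the table can be recomputed from the dates and the nomination count given above. The variable names here are only illustrative.

```python
from datetime import date

start = date(2007, 3, 17)   # first day of operation
as_of = date(2007, 3, 28)   # date of the statistics snapshot
nominated = 847             # images nominated for deletion

days = (as_of - start).days      # whole days in operation
per_day = nominated / days       # average nominations per day

print(days, per_day)
```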