Twokinds ARCHIVE Forums

This forum is for the preservation of old threads from before the forum pruning.

All times are UTC - 5 hours




Post new topic Reply to topic  [ 142 posts ]  Go to page 1, 2, 3, 4, 5 ... 10  Next
Author Message
 Post subject: Prototype Bot-Hunter List
PostPosted: Sat Sep 20, 2008 9:18 pm 
Offline
Citizen

Joined: Tue Sep 16, 2008 10:27 pm
Posts: 99
This thread is linked from a reply in the main topic so that those interested can follow it here without spamming that thread.

TrueBots List
(List has been regenerated on Page 2)


MaybeBots List
(List has been regenerated on Page 2)


 Post subject:
PostPosted: Sat Sep 20, 2008 9:49 pm 
Offline
Templar Inner Circle
User avatar

Joined: Wed Mar 09, 2005 1:55 am
Posts: 2885
Location: Somewhere in my pants.
I know for a fact that Joan-Michele is not a bot.

She's a forum veteran.


 Post subject:
PostPosted: Sat Sep 20, 2008 9:54 pm 
Offline
Grand Templar
User avatar

Joined: Wed Aug 29, 2007 3:24 pm
Posts: 1545
Location: Carmina Gadelica
It seems the program needs a little tweaking, with the first name on the list being not-a-bot...
o.o

But still, it is impressive...what kinda parameters were used? Post count? Keywords?

Now, if only there was a program that would delete all of the bots...I hate to think of all the hours it would take to manually delete all of those accounts...
o.x


 Post subject:
PostPosted: Sat Sep 20, 2008 9:56 pm 
Offline
Citizen

Joined: Tue Sep 16, 2008 10:27 pm
Posts: 99
Edit*
It retrieves information from member profiles such as email, MSN Messenger, ICQ, posts, location, occupation, and interests. I found a pattern in the occupation and interests fields of many of the bots, starting from around 2005. There are a few other patterns that I have noticed, but I haven't had the time to code in the logic needed to catch them yet. In LadyWarrior's case, she posted a website in her Location field, and the Prototype currently assumes that is bot behavior. It'll take time to refine it as I go along.
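As a rough illustration of the Location-field rule described above, here is a minimal Python sketch. It is not the Prototype's actual code: the regular expression, the `profile` dictionary keys, and the idea of downgrading accounts that have posts to "maybe" are all assumptions made for the example.

```python
import re

# Hypothetical pattern: anything in the field that looks like a web address.
URL_PATTERN = re.compile(r"(https?://|www\.)\S+", re.IGNORECASE)


def location_looks_like_link(location):
    """Return True when the Location field contains something URL-like."""
    return bool(URL_PATTERN.search(location or ""))


def classify(profile):
    """Sort a profile into 'true', 'maybe', or 'clean' (illustrative only).

    A link in Location is treated as a hint, not proof: an account that
    also has posts is only a 'maybe', which is an assumption of this sketch.
    """
    if location_looks_like_link(profile.get("location", "")):
        return "maybe" if profile.get("posts", 0) > 0 else "true"
    return "clean"
```

Qualifying the rule with post count is one way to keep a veteran with a website in her Location field off the TrueBots list.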




These are just the results from the prototype. It isn't going to be perfect right away because it is still being worked on, but feedback will help reduce the chance of errors. There are still more filter methods that can be added to catch more actual bot accounts and to reduce the chance of mistaking a real member's profile for a bot profile.
Any ideas for making it more accurate are appreciated.


 Post subject:
PostPosted: Sat Sep 20, 2008 10:00 pm 
Offline
Grand Templar
User avatar

Joined: Wed Aug 29, 2007 3:24 pm
Posts: 1545
Location: Carmina Gadelica
Any way to check post count against join date? 'Cause anyone who has less than 5 posts who've been here more than a month should probably be booted...and that criteria would include the vast majority of bots as well...
:3
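A minimal sketch of that post-count-versus-join-date check, in Python. The function name and parameters are made up for illustration; the thresholds (fewer than 5 posts, more than about a month) come from the suggestion above.

```python
from datetime import datetime, timedelta


def is_stale_low_poster(joined, posts, now, min_posts=5,
                        grace=timedelta(days=30)):
    """True when an account is older than `grace` but has fewer than `min_posts` posts."""
    return posts < min_posts and (now - joined) > grace
```

On its own this would also catch quiet lurkers, so it probably works best combined with other criteria.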


 Post subject:
PostPosted: Sat Sep 20, 2008 10:10 pm 
Offline
Citizen

Joined: Tue Sep 16, 2008 10:27 pm
Posts: 99
That's one of the ideas that I've had but I wasn't sure exactly if a time-based method would be appropriate or not since someone said before that the inactive members aren't the problem. But what about members with no posts or any other information besides perhaps an email? Should they be included after so many months?


 Post subject:
PostPosted: Sat Sep 20, 2008 10:17 pm 
Offline
Grand Templar
User avatar

Joined: Wed Aug 29, 2007 3:24 pm
Posts: 1545
Location: Carmina Gadelica
Some of our most active members provide little if any info in their profiles, so that would be casting your net a bit too widely, so to speak...

If your program has the ability to cross-reference, then that would be a good element to include, though, as long as it is qualified by something else...
:3


 Post subject:
PostPosted: Sat Sep 20, 2008 10:20 pm 
Offline
Citizen

Joined: Tue Sep 16, 2008 10:27 pm
Posts: 99
Maybe I can find a way to cross-reference the creation date of the given email address with the creation date of the forum account. Any ideas for other methods, anyone?


 Post subject:
PostPosted: Sun Sep 21, 2008 12:14 am 
Offline
Templar Inner Circle

Joined: Tue Jul 15, 2008 1:37 am
Posts: 3264
Location: Washington
O.O

I knew the list was big, but jeez!
Have you gotten any sort of OK from any of the mods? Just curious.


 Post subject:
PostPosted: Sun Sep 21, 2008 1:15 am 
Offline
Citizen
User avatar

Joined: Wed Jul 30, 2008 5:13 am
Posts: 96
Considering that (as far as I know) you're limited to information available on profiles, the fact that you've generated this list is pretty amazing.

I would check for the presence of an avatar. At least it's shown on the user profile page.

I say "limited" because the database itself stores such nice stuff as the last time the user visited, the time the user registered, etc. (and in epoch time, so you can easily do a delete where $last_visit-$registered < 3600).
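To show what that database-side check could look like, here is a small Python sketch using an in-memory SQLite table. The table and column names (`users`, `registered`, `last_visit`) are placeholders, not the forum's real schema, and the query only selects candidates rather than deleting anything.

```python
import sqlite3


def one_visit_accounts(conn, window_seconds=3600):
    """List usernames whose last visit fell within `window_seconds` of registering."""
    rows = conn.execute(
        "SELECT username FROM users "
        "WHERE last_visit - registered < ? ORDER BY username",
        (window_seconds,),
    )
    return [name for (name,) in rows]


def demo_connection():
    """Build a throwaway in-memory table with two made-up accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (username TEXT, registered INTEGER, last_visit INTEGER)"
    )
    conn.executemany(
        "INSERT INTO users VALUES (?, ?, ?)",
        [
            ("drive_by", 1221950000, 1221950120),  # visited once, 2 minutes after signup
            ("regular", 1110000000, 1221950000),   # still visiting years later
        ],
    )
    return conn
```

Reviewing a SELECT first and deleting only after someone has checked the results seems safer than a direct delete.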


 Post subject:
PostPosted: Sun Sep 21, 2008 2:44 am 
Offline
Templar Master
User avatar

Joined: Thu Jul 17, 2008 10:09 am
Posts: 443
Location: My own little fortress...
I have next to no knowledge about coding, but wouldn't something like that place the forum newbies under removal threat?

Anyway, impressive list you've managed to gather - even if we don't yet know how many of those really are bots.


 Post subject:
PostPosted: Sun Sep 21, 2008 5:21 am 
Offline
Citizen
User avatar

Joined: Wed Jul 30, 2008 5:13 am
Posts: 96
Not if you combine it with other criteria...

Personally, I think a user should be considered a candidate for deletion if they haven't been seen for 1 year+ (this is the tricky one that requires direct db access) or haven't been seen since registration+1 hour, don't have any posts, registered at least a week ago (or longer), and match whatever criteria KitWiz has managed to find.

In pseudo-code for anyone who wants it,
Code:
//Check if the user hasn't visited for the last year OR hasn't visited since the first hour after the account was registered, AND the account is more than a week old.
if(((time_now - lastvisit > 31536000)||(lastvisit - registration < 3600))&&(time_now - registration > 604800)){
   //And check that the account has no posts
   if(posts==0){
      //Delete the bot
   }
   else {
      //Probably real, there's an actual post from this account
   }
}
else {
   //Probably real: either they've visited in the past year and after their first hour, or the account is less than a week old (to allow for the newbs).
}
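For anyone who wants something runnable, here is a sketch of the same check in Python, following what the comments describe: inactive for a year or never seen after the first hour, the account more than a week old, and zero posts. The function and argument names are made up, and all times are assumed to be epoch seconds.

```python
YEAR = 365 * 24 * 3600  # 31536000 seconds
HOUR = 3600
WEEK = 7 * 24 * 3600    # 604800 seconds


def is_deletion_candidate(now, registered, last_visit, posts):
    """Apply the three criteria; all timestamps are epoch seconds."""
    # Not seen for a year, or never seen again after the first hour.
    inactive = (now - last_visit > YEAR) or (last_visit - registered < HOUR)
    # Give new accounts a week of grace before judging them.
    old_enough = now - registered > WEEK
    return inactive and old_enough and posts == 0
```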


With a quick look, I can only see a random occupation and 2 interests so far, so I'm not going to attempt to find out his criteria, mainly because I'm supposed to be studying right now for upcoming exams. I would love to see the code used though. (And then it becomes valid studying, because I take Com Sci as a subject. :grin:)
Pattern matching would definitely be necessary, though how is it implemented? Parsing out the interests (for example) by finding the line in the HTML source, reading until <br>, then counting the number of commas? (Based on the 2 interests theme that I see so far...)
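One way the parse-then-count idea could look in Python, as a sketch only. The "Interests:" label and the surrounding markup are guesses at what the profile page's HTML contains, so the slicing would need adjusting to the real source.

```python
import re


def count_interests(html, label="Interests:"):
    """Count comma-separated entries between `label` and the next <br> tag."""
    start = html.find(label)
    if start == -1:
        return 0  # field not present on this profile
    start += len(label)
    end = html.find("<br", start)
    if end == -1:
        end = len(html)
    # Drop any remaining tags, then split the plain text on commas.
    field = re.sub(r"<[^>]+>", "", html[start:end])
    return len([part for part in field.split(",") if part.strip()])
```

Counting the non-empty pieces rather than the commas themselves avoids an off-by-one when the field has a trailing comma or is empty.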

I'm also for adding recaptcha to the registration page in an effort to stop the bots from getting in the first place.


 Post subject:
PostPosted: Sun Sep 21, 2008 6:45 pm 
Offline
Citizen

Joined: Tue Sep 16, 2008 10:27 pm
Posts: 99
Back from work. I've also noticed that I can access the last post time (if applicable) as well as review the content of the posts made by that member. Oh, I've also cleaned up and somewhat refined the logic for detecting bots and possible annoyances.

I'll be adding the new results in a bit once they get finished. ^_^


 Post subject:
PostPosted: Tue Sep 23, 2008 10:53 pm 
Offline
Citizen

Joined: Tue Sep 16, 2008 10:27 pm
Posts: 99
Well, I've added a few things and removed a few, so it's now mistaking people for bots far less than it did, but there is still room for even more improvement. Post here to inform me of any false positives that you find so that I can improve the person-exclusion parameters.

I've also added a snippet of code so that the lists show which bot parameter selected each included name, to make it easier to debug false positives. The information is given as the 3-digit number after each name. Also, any additional ideas for new parameters are gladly welcomed.
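To illustrate the idea of tagging each name with the parameter that caught it, here is a minimal Python sketch. The codes "101" and "102" and both rules are invented for the example and are not the Prototype's real parameters.

```python
# Hypothetical rule table: (3-digit code, test applied to a profile dict).
RULES = [
    ("101", lambda p: "http" in p.get("location", "").lower()),
    ("102", lambda p: p.get("posts", 0) == 0 and bool(p.get("interests"))),
]


def first_matching_rule(profile):
    """Return the code of the first rule that flags the profile, else None."""
    for code, predicate in RULES:
        if predicate(profile):
            return code
    return None


def format_list(profiles):
    """Build 'name code' lines for every flagged profile."""
    lines = []
    for profile in profiles:
        code = first_matching_rule(profile)
        if code is not None:
            lines.append(f"{profile['name']} {code}")
    return lines
```

With the code printed next to each name, a reported false positive points straight at the rule that needs loosening.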

*Lists have been regenerated


 Post subject:
PostPosted: Tue Sep 23, 2008 10:58 pm 
Offline
Templar Inner Circle

Joined: Tue Jul 15, 2008 1:37 am
Posts: 3264
Location: Washington
... OK, I saw my name somewhere on that list >_____>
RL, not screen. Eerie.

