J.TrIDr3ESpPJEs ID: b7fb97 Aug. 11, 2018, 5:20 p.m. No.2561043   >>1161

>>2412127

Thread parsing automation: you'll need to install Beautiful Soup and Mechanize. On Linux this is easy (do an apt-cache search for Beautiful Soup, then for Mechanize).

 

Here's HunterKiller bot. It parses threads, but the code is complete enough that you can probably re-engineer it to parse posts within threads. It was censored by the mods when they deleted the Q-branch thread:

 

https://pastebin.com/LmPFhtXm

 

Uses Python 2.7 to my knowledge. Can't help with Windows, I ditched that shit OS years ago.

J.TrIDr3ESpPJEs ID: b7fb97 Aug. 11, 2018, 5:29 p.m. No.2561191   >>6977

>>2449345

 

Those are 'professional level' software bots, which have been in development for quite some time now (since at least 2010). The end goal of such software is to get 'natural speaking' bots that can engage targets (and 'talk' with other bots to make it seem like a legitimate conversation is ongoing).

 

I spent years lurking and interacting on dubious forums where such malicious activities were being tested. The bots aren't perfected (they have the same flaws as normal chatbots, presently), but there's an ongoing effort to make them 'more advanced'.

 

HunterKiller bot is homebrew, but it's based on several iterations of code which was based on observations of the so-called 'professional' bots. Such software is sold to both military and political activism groups (the bad kind: think Media Matters).

 

HunterKiller was my proposal to counter the bots: basically, a bot advanced enough to hunt other bots. What I've given you is a barebones example that should contain enough information for you to build your own variants.

 

It's my personal opinion that passion trumps corporate software development any day of the week. That code took me about 7 days to write (I have limited free time), but the potential to tack on other, more advanced Python libraries is there.

 

PS: Shills often use scripts (folders and pieces of paper in more primitive operations). More advanced shill operations use specially designed software that lets them copy/paste generic garbage responses (usually across several accounts or IPs, depending on sitch). Very advanced shills have bots that automatically select what garbage to copy/paste, with the shill acting as bot handler.

 

Check out the Clown College thread where I explain more on bots in my earlier posts.

 

From a strategic standpoint, you have the home-field advantage, because shills/bot posters rely on spam and generic replies, or give off obvious tells. With a sufficiently well-programmed HunterKiller bot, you can mass-identify these bots and shills for some beatdown with administration tools.
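A minimal sketch of the "spam and generic replies" idea, assuming posts have already been scraped into (post number, body text) pairs. This is not taken from the HunterKiller paste; the function names and the normalization rules are made up for illustration, and it is written for Python 3 rather than 2.7.

```python
import re
from collections import defaultdict


def normalize(text):
    """Lowercase, drop punctuation, and collapse runs of whitespace."""
    text = re.sub(r"[^a-z0-9\s]", "", text.lower())
    return " ".join(text.split())


def find_repeated_posts(posts, min_copies=2):
    """Group post numbers by normalized body text.

    posts is an iterable of (post_no, body) pairs (hypothetical shape).
    Returns {normalized_text: [post_no, ...]} for every text that
    appears at least min_copies times.
    """
    groups = defaultdict(list)
    for post_no, body in posts:
        key = normalize(body)
        if key:  # skip posts that are empty after normalization
            groups[key].append(post_no)
    return {key: nos for key, nos in groups.items() if len(nos) >= min_copies}
```

This only catches copy/pastes that are identical after trivial cleanup; anything reworded slips through.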

 

Eventually you will run into shills whose tools can 'thesaurus' the words around so a post seems 'different', so don't rely on verbatim matches; consider something like Markov chain analysis as well.
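As a rough illustration of going beyond verbatim matches, here is a sketch that scores word-level similarity with the standard library's difflib. To be clear, this is a simpler stand-in and NOT Markov chain analysis; the 0.8 threshold and the function names are arbitrary assumptions.

```python
from difflib import SequenceMatcher


def word_similarity(a, b):
    """Similarity in [0, 1] between two posts, compared word by word."""
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()


def near_duplicates(posts, threshold=0.8):
    """Return (i, j, score) for every pair of posts scoring >= threshold.

    posts is a list of body strings. The threshold is an arbitrary
    starting point and would need tuning on real data.
    """
    pairs = []
    for i in range(len(posts)):
        for j in range(i + 1, len(posts)):
            score = word_similarity(posts[i], posts[j])
            if score >= threshold:
                pairs.append((i, j, round(score, 2)))
    return pairs
```

Comparing every pair is quadratic in the number of posts, so this is only practical on one thread at a time.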

 

Hope this helps.

J.TrIDr3ESpPJEs ID: b7fb97 Aug. 11, 2018, 5:32 p.m. No.2561225

>>2561161

The point of the tool is actually to filter out the breads, because it's a HunterKiller (you're not hunting/killing the breads: you're looking for the garbage threads). But you could modify it to investigate breads for shill posts or copy/pastes. It's up to you.

 

If you skim over the many anchored threads, you might notice it almost appears as if the admins are using such a tool (which greatly speeds up identification of trash threads). It's a pity they censored it, as it was intended to help, not hinder (it cannot post, and I won't build it so it can, as that would only aid the shills).

J.TrIDr3ESpPJEs ID: b7fb97 Aug. 11, 2018, 5:45 p.m. No.2561418

HunterKiller's friends (variations) include:

 

ArchiveBot: mass collect all posts from all active threads in the catalogue (allowing you to do a raw text save of the data). Alternate version: bulk send archive requests of the thread URLs to archive.is/archive.org.

 

[Properly combined, you can keep a simultaneous offline/online version. Word of warning: when archiving to a website, be sure to only archive 'finished' threads on archive.is and to space out the requests over several minutes so it doesn't appear you are flooding/a bot. Automated and slow is better than manual and even slower.]
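The "space out the requests" advice could look something like the sketch below. It does no networking itself: submit is a hypothetical callable you supply (whatever actually sends one archive request), and the 60 to 180 second gap is an assumed range, not a documented limit of any archive site.

```python
import random
import time


def submit_spaced(urls, submit, min_gap=60.0, max_gap=180.0, sleep=time.sleep):
    """Call submit(url) for each URL with a random pause between calls.

    submit is a caller-supplied function (hypothetical here).
    sleep is injectable so the pacing can be tested without waiting.
    """
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            # Randomized gap so requests do not arrive on a fixed beat.
            sleep(random.uniform(min_gap, max_gap))
        results.append(submit(url))
    return results
```

Passing a fake sleep (for example a list's append method) lets you check the schedule without sitting through the delays.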

 

MonitorBot: keep track of which threads have 'moved position' and thus have 'new replies' (this is a technique used by shills to direct their limited resources to whichever thread is presently active; likewise, you can do the same).
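One way to sketch the 'moved position' check, assuming you already have two catalogue snapshots as lists of thread IDs in display order (top of the catalogue first). The snapshot format and the function name are assumptions for illustration.

```python
def moved_up(previous, current):
    """Return thread IDs that are new or sit higher than in the last snapshot.

    previous and current are lists of thread IDs in catalogue order,
    index 0 being the top of the catalogue (assumed layout).
    """
    old_position = {thread_id: i for i, thread_id in enumerate(previous)}
    bumped = []
    for i, thread_id in enumerate(current):
        if thread_id not in old_position or i < old_position[thread_id]:
            bumped.append(thread_id)
    return bumped
```

Threads that merely slide down because something else was bumped above them are not reported.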

 

KeywordBot: a bot that looks for specific keywords, image filenames or other triggers and flags them when it spots them (shills also use this technique to know when you're talking about a subject that they need to 'shill on').
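A bare-bones take on the keyword trigger idea, assuming posts are (post number, body text) pairs and that whole-word, case-insensitive matching is good enough. The names here are illustrative only.

```python
import re


def build_matcher(keywords):
    """Compile one case-insensitive, whole-word regex for all keywords."""
    pattern = "|".join(re.escape(k) for k in keywords)
    return re.compile(r"\b(?:%s)\b" % pattern, re.IGNORECASE)


def flag_posts(posts, keywords):
    """Return [(post_no, [matched keywords])] for posts with any hit.

    posts is an iterable of (post_no, body) pairs (hypothetical shape).
    """
    if not keywords:
        return []
    matcher = build_matcher(keywords)
    flagged = []
    for post_no, body in posts:
        hits = sorted({m.group(0).lower() for m in matcher.finditer(body)})
        if hits:
            flagged.append((post_no, hits))
    return flagged
```

Because keywords are escaped before compiling, filenames containing dots are matched literally rather than as regex wildcards.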

 

Newsfeed/TrackerBot: use it to pull the latest news from websites (it's strongly recommended you use a news site's RSS feed to do this, as it keeps it nice and simple). Will require substantially more work, and beware shitty Unicode strings in the returned data.
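A small sketch of the RSS route, assuming a plain RSS 2.0 feed whose raw bytes you have already downloaded. Feeding the parser bytes rather than an already-decoded string lets it honor the encoding declared in the XML header, which sidesteps a good share of the Unicode trouble. Fetching, Atom feeds and namespaced elements are left out.

```python
import xml.etree.ElementTree as ET


def parse_rss(xml_bytes):
    """Return [{'title': ..., 'link': ...}] for each <item> in an RSS 2.0 feed.

    xml_bytes should be the raw response body so the XML parser can
    apply the feed's own declared encoding.
    """
    root = ET.fromstring(xml_bytes)
    items = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        items.append({"title": title, "link": link})
    return items
```

Missing title or link elements come back as empty strings instead of raising.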

 

Literally, whatever the hell else you could imagine. You're also not restricted to this board. If you change the URL to another board, the code should largely still work (albeit you might have to modify what data gets accepted as some boards don't have poster names).

 

If you want to get super anal, in theory you don't even need Python to build an auto-parse tool. If you're batshit insane, you could even use wget coupled with a bash script.

 

Beautiful Soup and Mechanize are extremely powerful. Beautiful Soup does HTML parsing, and Mechanize is like a full-blown browser under the hood. You can post to a forum/board with it, but I've noticed the Media Matters shills appear to be doing everything manually (or whatever they have is absolutely shit), so I leave that as an exercise for talented chans to develop (I highly advise you do NOT publish any posting capabilities, as it will only arm the less well-developed shills).
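To give a feel for the parsing half only (read-only, no posting), here is a dependency-free sketch that uses Python 3's built-in html.parser as a stand-in for Beautiful Soup. The markup it expects, a div with class "body" around each post's text, is a made-up assumption; real board HTML will differ and the class name would need changing.

```python
from html.parser import HTMLParser


class PostBodyParser(HTMLParser):
    """Collect the text inside every <div class="body"> (hypothetical markup)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.bodies = []
        self._depth = 0    # > 0 while inside a body div (tracks nested divs)
        self._chunks = []  # text fragments of the body currently being read

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self._depth:
            self._depth += 1
        elif "body" in (dict(attrs).get("class") or "").split():
            self._depth = 1
            self._chunks = []

    def handle_endtag(self, tag):
        if tag == "div" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                # Join fragments, then collapse whitespace.
                self.bodies.append(" ".join("".join(self._chunks).split()))

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


def extract_bodies(html):
    """Return the text of each post body found in an HTML string."""
    parser = PostBodyParser()
    parser.feed(html)
    parser.close()
    return parser.bodies
```

The resulting list of strings is the kind of input the duplicate and keyword checks would work on.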

 

And believe me, this is just scraping the surface.