Looking good
I hear you anon.
The key is the content. We have the ability to archive threads/qposts. Posts that Q references. Tweets. Known tripcodes/twitter accounts.
What is the source of all the evidence? The dedicated research threads? Notables? In order for it to be automagic, there needs to be a reliable single source here on 8ch. None of the codefag work I've seen reaches a level of what could be called AI - or the ability to discern which anon has posted a certifiable answer/evidence.
Non-automated means anon-operated, but that causes its own set of issues.
I agree a wikipedia-style thing would be good because it's familiar, but populating it with data may be an issue. Some of it is going to have to be entered manually.
If all you are looking for is a location for an anon wiki, I think that's pretty easy.
Ya that's fine. I'm going to update that today to cover the latest.
I've been working on a new local viewer that uses the twitter smashed data. It shows the delta + alt text of the tweet + a link to the tweet. I've noticed that a lot of the image links I have are currently broken. I was thinking I'd just update those to point to one of the other QCodeFag branch archives rather than try to archive all the images as well.
Expect an update on GitHub later
Here's what it looks like. Just trying to finish off a sort idea and clean data.
When you get that worked out make sure to let us know. I've been wondering about that myself. The early halfchan numbers are pretty big. I've found some bugs in my code around there being multiple references per Q post. It does happen on occasion and my scraper isn't catching them all.
I've just uploaded a bunch of json data to the github at https:// github.com/QCodeFagNet/SFW.ChanScraper/tree/master/JSON. The json folder is what's generated when you run the ChanScraper; the smash folder is what's generated when you run the TwitterSmash. Each of those folders has a Viewer.html file that can be used with just the _allQPosts.json or _allSmashPosts.json.
Like I said I need to clean up some dead image links for everything to be working right.
Ya think it's bad form to go lazy and link em to one of the qcodefag archives?
So you have all the breads searchable as well?
HOLEY FUCK YES.
This crosses all breads? If so then this is exactly what we need. I can help you with the SQL if you need it.
SELECT * FROM tbl ORDER BY id LIMIT 5,10; -- retrieves rows 6-15; specify an ORDER BY so the offset is deterministic
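A quick way to sanity-check that paging behavior is SQLite, which accepts the same MySQL-style LIMIT offset,count syntax. The table and data here are throwaway, not anyone's real schema:

```python
import sqlite3

# Throwaway check of LIMIT offset,count paging; table/column names are
# illustrative only.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, body TEXT)")
con.executemany("INSERT INTO tbl (body) VALUES (?)",
                [("post %d" % i,) for i in range(1, 21)])

# LIMIT 5,10 = skip 5 rows, take 10 -> rows 6-15, once ORDER BY pins the order
rows = con.execute("SELECT id FROM tbl ORDER BY id LIMIT 5,10").fetchall()
ids = [r[0] for r in rows]
```

Without the ORDER BY the offset is applied to whatever order the engine happens to return, so the "rows 6-15" guarantee only holds with it.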
How are you getting the breads? Maybe I can work out a way to get you those. Combine up somehow
I've been thinking about this. Preliminary research shows that elasticsearch and lucene would probably be the best match for what we've got. There are a lot of tools that pile into elasticsearch. Any hostfags here with the ability to set up an elasticsearch node?
The data is big. Tons of images. A proper archive takes space. I'm holding ~546 complete breads and with no images it's 250MB+. That's for like a month. By the end of the year the bread collection alone is going to be over 1.5GB.
The images I've got so far are around 100MB, but that's just from the Q posts - and even then I know I'm missing some.
Econ Godaddy hosting is like $45 a year. I'm thinking about just putting the chanscraper/twittersmash online, then write some simple apis. Get thread#, filteredThread, qpost# that kind of thing. Useful or no?
Hmm… When I say bread I mean a full Q Research thread. Like this
https:// github.com/QCodeFagNet/SFW.ChanScraper/blob/master/JSON/json/8ch/archive/651280_archive.json
That's the straight bread/thread from 8ch. It includes all the responses whether the BV posted it or not.
I'm finding those by getting the full catalog from
https:// 8ch.net/qresearch/catalog.json, finding the breads/threads that have q research, q general etc in them, and then getting the json for that thread only from https:// 8ch.net/qresearch/res/651280.json
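That catalog walk could be sketched roughly like this in Python. The sub/no/threads field names follow the 4chan-style JSON API the chans expose, but take them as assumptions:

```python
# Sketch of the bread-finding step described above: pull catalog.json,
# keep threads whose subject looks like a Q Research bread, then fetch
# each thread's own JSON. Field names ("threads", "sub", "no") are
# assumed from the 4chan-style API.
KEYWORDS = ("q research", "q general")

def find_breads(catalog):
    """Return thread numbers whose subject matches a bread keyword."""
    breads = []
    for page in catalog:
        for thread in page.get("threads", []):
            subject = (thread.get("sub") or "").lower()
            if any(k in subject for k in KEYWORDS):
                breads.append(thread["no"])
    return breads

def thread_json_url(board, thread_no):
    # Each matching thread is then fetched individually as JSON.
    return "https://8ch.net/%s/res/%d.json" % (board, thread_no)
```

Feeding it the live catalog is just a GET on catalog.json followed by one GET per matched thread number.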
I think I see what you are doing - going thru and trying to mark the relevant posts?
I agree on shill proofing.
I've been playing around with a webAPI. I've got it working nice with all the q posts, looking for a specific post# like #929, and posts on a day. Returns json or xml. This is the Crumb Archive.
My plan is to expand that so that the archived breads can be accessed as well - each as a single json file. This is the Bread Archive.
I'm going to set it up where it's an autonomous machine. It will scrape and archive automagically moving forward from the current baseline. No delete. No put. No fuckery.
I'm pretty sure it would with the QCodeFag scraper repos.
The bread archive is pretty big. I'm sure there's no way I can archive images for all the breads. An image archive isn't what I've been focused on. The focus of this is only making the json/xml available from the chanscraper.
Once I can get the breads all up and being served automagically my plan is to set up an elasticsearch node and suck all the breads in.
I figure a year of godaddy hosting is currently $12 with unmetered bandwidth. I'll throw in.
What about 100k transactional batches?
Yeah man hit it. I've got a github here you can browse around.
https:// github.com/QCodeFagNet/SFW.ChanScraper/tree/master/JSON
json/8ch has the filtered/unfiltered bread and archives in it. smash has the twittersmashed posts. I've been getting my twitter data from http:// www.trumptwitterarchive.com/data/realdonaldtrump/2017.json, 2018.json
I set up a test for the webAPI twittersmashed posts here https:// qcodefagnet.github.io/SmashViewer/index.html
I'm getting close on having the webAPI thing finished up. Just running some more tests and then I should be ready to go.
Yeah you could mebbe use the smashed json from me. I've already done the unix timestamp on the trump tweets. All 8ch posts and Twitter posts derive from the same Post base object with the unix timestamp built in.
I think that's because you can't really get them. There is an 8ch beta archive here, but all the Q Research threads disappeared shortly after we started archiving them. Even then, those archives are straight HTML. It's of no use to me. AFAIK, once it slides off the main catalog, it's pretty much gone. Some trial and error got me a few breads, but not many.
Interesting concept you have anon. You want to be able to search across ALL 8ch? Not just Q Research? By platforms are you talking 4ch/8ch? or 4ch/8ch/twitter/reddit/facebook…?
I would think the time is relative to the archive home timezone. That is, unless archive.x has done some wizardry to change the time zone it's pulling at to be the time zone of the user requesting the original archive. That would be more problematic - but you could still deal. It should be marked what time zone and then you convert into the unix timestamp.
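The convert-then-unix step might look like this, with the archive's UTC offset supplied by whoever knows the home zone. The offsets below are only examples, not a claim about what archive.x actually uses:

```python
from datetime import datetime, timezone, timedelta

# Normalize an archive timestamp (marked with its home time zone) into a
# unix timestamp, per the reasoning above. The offset is an input you have
# to establish from the archive itself.
def to_unix(stamp, utc_offset_hours):
    """stamp like '2018-03-21 20:03:00' in the archive's local zone."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    dt = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S").replace(tzinfo=tz)
    return int(dt.timestamp())
```

Once everything is unix time, deltas between posts and tweets no longer care what zone anything was displayed in.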
The 4ch breads or the 4ch Q posts?
What are the chances it's hanging on a specific record? I see that all the time doing inserts. Bad data kills it off.
You could look into raising the timeout. Mebbe it's just such a long job that it's taking too long and timing out? https:// support.rackspace.com/how-to/how-to-change-the-mysql-timeout-on-a-server/
Hmm. Yeah just doing some easy math I can see how you would have more than 1mm records. We're at bread 815+ something here and with 751 posts each that's over 600k here on 8ch alone.
You may be onto something with that. Is there a limit? https:// stackoverflow.com/questions/2716232/maximum-number-of-records-in-a-mysql-database-table
Looks like the number of rows may be determined by the size of your rows.
>http:// www.trumptwitterarchive.com/data/realdonaldtrump/2017.json, 2018.json
AllQPosts smashed with DJTwitterposts by day
https:// github.com/QCodeFagNet/SFW.ChanScraper/tree/master/JSON/smash
OK brother codefags. I've stood up a simple API. It serves json and XML for your consumption pleasure.
It's currently set up to:
1) Scrape the chan automagically and keep an archive of QResearch breads and GreatAwakening.
2) Filter each bread to search for Q posts and include anything in GreatAwakening into a single QPosts list
3) Serve up access to posts/bread by list, by id, and by date.
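For anyone wanting to hit it, URLs of the kind posted in this thread (api/posts/943/?xml=true and so on) can be built up like this. Only the routes actually posted here (posts, bread, random) are assumed to exist:

```python
# Hypothetical URL helper for the qanon.news API; route shapes are taken
# from links posted in this thread, everything else is an assumption.
BASE = "http://qanon.news/api"

def api_url(resource, rid=None, xml=False):
    """Build a request URL; xml=True appends the ?xml=true switch."""
    url = "%s/%s/" % (BASE, resource)
    if rid is not None:
        url += "%s/" % rid
    return url + ("?xml=true" if xml else "")
```

Dropping the xml flag gets you the JSON default, matching the add/remove-the-querystring behavior described below.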
I'm going to incorporate the TwitterSmash delta output next. I figure I can do a simple search across all Q posts easily. Searching across the breads is harder.
You can check it out here: http:// qanon.news/
McAfee says secure https:// www.mcafeesecure.com/verify?host=qanon.news
There's a sample single page app that shows how to use it. http:// qanon.news/posts.html
I still gotta set up my email account so if you spam me now, it's likely to get bounced. I'll check back in later.
My reason for doing this is twofold: I figured we could use it, and I'm looking at the job market in my area and thinking about changing it up. This is partially a learning project to open opportunities by using different tech. I'm claiming ignorance. My plan is to try out an elasticsearch node once I get this working as designed.
Let me know if you can think of a query/filter that you think would be useful. It hasn't proven too difficult to work new things in, other than the ugly local path issue I came across working on it this morning.
Try it out anons.
> I've been getting my twitter data from http:// www.trumptwitterarchive.com/data/realdonaldtrump/2017.json, 2018.json
>www.trumptwitterarchive.com/data/realdonaldtrump/2018.json
There was a 9 day gap at the beginning of the year. Otherwise it's been updated. Unfortunately I think there were 2 markers in that time. Delta anon knows about it.
Refresh yer cache? I'm seeing Jan 9 - March 21 2018
Feckin dates. I got it all sorted out. Discovered a bug caused by my dev server and the API webserver being in different time zones.
I've been sorting out small bugs and about to wire in the TwitterSmash. The automation part seems to be working good now that I sorted the date bug. I've got it set up to do hourly scrapes. Last run at 8:03pm 3-21 est. The scrapes themselves only take about 45 seconds - including the twittersmashing. There's a test smashpost page here to see the deltas in action. Not totally live Q post data online yet.
http:// qanon.news/smashposts.html
This is another test page using live data
http:// qanon.news/posts.html
I did this to test some code out. Get a random Q post.
http:// qanon.news/api/posts/random/?xml=true
I set up an elasticsearch node today to experiment. We'll see how that goes. Could be a huge pain in the ass to set up at a host. We'll see.
Update your tripcodes codefags.
public readonly string[] ConfirmedTrips = new string[] { "!ITPb.qbhqo", "!UW.yye1fxo", "!xowAT4Z3VQ" }; // confirmed Q tripcodes, oldest first
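A Python twin of that array, for anyone filtering scraped JSON directly. The trip field name follows the 4chan-style API and is an assumption here:

```python
# Keep only posts carrying a confirmed Q tripcode. The "trip" field name
# is assumed from the 4chan-style thread JSON.
CONFIRMED_TRIPS = {"!ITPb.qbhqo", "!UW.yye1fxo", "!xowAT4Z3VQ"}

def filter_q_posts(posts):
    """Return the subset of posts whose tripcode is confirmed."""
    return [p for p in posts if p.get("trip") in CONFIRMED_TRIPS]
```

Posts with no trip at all (the vast majority of a bread) fall out naturally since .get() returns None for them.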
http:// qanon.news/api/posts/943/?xml=true
yeah that sounds like a good one.
I've done some more work on the http:// qanon.news api. I managed to work out a coupla small bugs and get the TwitterSmashed posts integrated. Everything seems to be working as designed.
Here's the smashposts.html demo page. Shows deltas to Q posts within the hour.
http:// qanon.news/smashposts.html
I'm going to add another result to the smashposts where everything is grouped by days. I'll probably put it in the posts API as well.
It's starting to look like this may be close to going on autopilot. Any interest in changes/additions before I move onto something else?
I'd love to work out a local copy of the Jan 1 2018 - Jan 9 2018 @realDonaldTrump tweets. Those are missing from the trumptwitterarchive site. Anybody got access to that?
Hmm. Yeah I'll look into it. I can see that archive getting really big really fast. This thing's only been running for a month and it's over 400MB in JSON alone. I'll have to make sure what kind of space I've got avail.
I never figured that another image archive was what we needed. Each of the QCodeFag installs has its own local archive. My concern was in preserving the JSON data from QResearch before it slid off the main catalog.
I'm going to put up a more simple list to show what's been archived. I'm showing 716 total breads, but again that only starts at 2-7-2018. Q Research General #358 is my earliest full archive - it's up to #982 now.
That's 624 breads in 47 days. 13.2 breads per day. Est. 4846 breads in one year at ~800KB/bread ≈ 4GB/year in JSON bread alone. Mebbe different if I moved to a DB.
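The back-of-the-envelope math above, spelled out with the same figures (the 800KB/bread average is the estimate from the post, nothing new):

```python
# Storage estimate check: breads/day observed so far, projected to a year.
breads, days = 624, 47
per_day = breads / days                 # observed rate, ~13.3/day
per_year = per_day * 365                # projected breads per year
gb_per_year = per_year * 800e3 / 1e9    # at ~800KB JSON per bread
```

So the ~4GB/year figure holds up, before counting any images.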
I may have enough storage, but it's so hard to say. Any image archive estimates anons?
Try this
http:// qanon.news/api/posts/962/
or this
http:// qanon.news/api/bread/452/?xml=true
add/remove the xml from the query string to get XML
Do you have to have block data storage? Any other options?
Glad it was useful. The posts API numbering is a bit squirrelly till you get used to it. The post ID is the post count starting from 1 on Nov 28 2017.
So finding out it was post #692 I had to view all posts (on posts.html or any of the QCodeFag installs) to get the post#. The bread# is in the post as threadId
Fuck off nigger. I'm just trying to come up with other ideas. I've been in IT for over two decades. I know exactly what's going on.
My point was, hosting can be found on the cheap if you look around. Not sure you NEED SSD. What you need is storage space. I was thinking drop the SSD for cheaper storage.
Whatever, it's your problem. You seem to be capable of figuring it out.
Hurts me to my core!
No I write the software. Whatever. Deal with your own problem - it doesn't concern me.
I think I finally managed to squash the date bug in the QPosts/DJTweets.
I took the 60min delta restriction off - and it's applying each day's tweets on each Q post to allow you to see all the deltas.
http:// qanon.news/smashposts.html
I've been thinking about a timeline for the past few days. I looked into different solutions and found timelineJS that works pretty good.
I managed to wrangle the API data into a timeline. I'm planning on adding in the DJTwitter data and ideally news/notable events.
Once I can get the twitter data in I'll cut it loose. I was hoping to figure out an easy way to get other data into the timeline. News/notables. Any ideas? QTMergefag? You got good news/events?
Here's what it looks like:
I think the timelineJS handles that for you if you add it as media/tweet to each slide.
Agree. I've been thinking about trying to work out a way of collab. I'm sure I could come up with a way to prove we're who we each say we are. Unless the clowns are here building community Q research tools…
Check it out. I got the twitter working.
What I can say about this timeline is that there's a lot of events on it. There's Q posts batched down to days across 98 days. Add in the tweets and there's a lot going on. Each day/tweet == a slide. It's definitely more than it was probably designed to handle. It takes a minute to make sense of the somewhat sizable JSON data and then render the display.
FOK delete this please
{
  "scale": "human",
  "events": [
    {
      "start_date": { "year": "2017", "month": "10", "day": "28", "hour": "0", "minute": "0", "second": "0", "millisecond": "0", "display_date": "2017-10-28 00:00:00Z" },
      "end_date": { "year": "2017", "month": "10", "day": "28", "hour": "0", "minute": "0", "second": "0", "millisecond": "0", "display_date": "2017-10-28 00:00:00Z" },
      "text": { "headline": "HRC extradition...", "text": "The body text...<hr/>" },
      "media": null,
      "group": "QAnon Posts",
      "display_date": "Saturday, October 28, 2017",
      "background": null,
      "autolink": true,
      "unique_id": "1dba35d4-46ac-4c5f-94d7-1e6b0f53ad4d"
    },
    {
      "start_date": { "year": "2017", "month": "10", "day": "28", "hour": "21", "minute": "9", "second": "0", "millisecond": "0", "display_date": "2017-10-28 21:09:00Z" },
      "end_date": { "year": "2017", "month": "10", "day": "28", "hour": "21", "minute": "9", "second": "0", "millisecond": "0", "display_date": "2017-10-28 21:09:00Z" },
      "text": { "headline": "Δ 25", "text": "2017-10-28 21:09:00Z<br/>@realDonaldTrump<br/>After strict consultation with General Kelly..." },
      "media": { "url": "https:// twitter.com/realDonaldTrump/status/924382514613030912", "caption": null, "credit": null, "thumbnail": null, "alt": null, "title": null, "link": null, "link_target": "_new" },
      "group": "realDonaldTrump",
      "display_date": null,
      "background": null,
      "autolink": true,
      "unique_id": null
    }
  ]
}
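A slide like that sample could be generated from a post record with something like this. Only a subset of the fields is produced, and the shapes are copied from the sample itself rather than taken as the full timelineJS spec:

```python
from datetime import datetime, timezone

# Build a timelineJS-style slide from a headline/text/unix-timestamp/group.
# Field subset only; shapes mirror the sample JSON in this thread.
def to_slide(headline, text, unix_ts, group):
    dt = datetime.fromtimestamp(unix_ts, tz=timezone.utc)
    date = {"year": str(dt.year), "month": str(dt.month), "day": str(dt.day),
            "hour": str(dt.hour), "minute": str(dt.minute),
            "second": str(dt.second)}
    return {"start_date": date, "end_date": dict(date),
            "text": {"headline": headline, "text": text}, "group": group}
```

Batching a day's posts into one slide is then just picking the day's first timestamp and concatenating the texts.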
Interdasting. I'd have to see a list.
http:// qanon.news/timeline.html
http:// qanon.news/Help/Api/GET-api-timeline
>q-questions.info/research-tool.php
Yeah it looks like there are some missed posts in there for sure. You may have done some good work on that one.
>RrydKbi3
Agree. That's the only post with that ID. Nothing ties it back to Q.
Same for Anonymous ID:9o5YWnk7 2017-10-29 19:35:45 Thread.147146601 Post.147171101
NP
Qanon.news bumped from the bread anons.
Somebody said that the site was serving malware and it was taken out of the bread. I posted in the meta thread to have BV check it out and he gave it the OK. I spent an hr or so trying to get it back in. No luck.
I'm not interested in begging - but I do want people to use what I've been working on. I'll see what happens after dinner I guess.
Meh. I've been thinking about it. After reading all about codefags problems, bandwidth issues, SSL certs, all the other qcodeClones… It may be better to just stay quiet and let people use it when needed. I'm a little disappointed that it was so easy to get something removed from the bread.
What I've been working on is really more backend style anyways. I have been thinking about a few different things though.
I saw one anon post something about there needing to be an RSS feed for QPosts. I think that should be pretty easy to provide. If I get some time I may whoop something out.
I've been playing around with the timelineJS. I worked it up where you can select a specific timeline. Qposts. DJTweets. Etc. Q has mentioned timelines a few times and I've been looking around trying to find threads that were timeline based. No real luck so far. Anyways, I was thinking about working on some different timelines.
I've been starting to wonder if moving to a database solution rather than file based json is going to be worthwhile. Better speed probably? Built in caching? Do I want that for an api? What does everybody else think?
>966124
Even in here.
I built a new API to get a specific post from a specific bread. Maybe I'll get it uploaded today.
Looks like ~/api/bread/981411/981444/
to get >>981444
Researching an RSS/ATOM feed. That looks to be low hanging fruit.
I was contacted by a guy that says he's from this site http:// we-go-all.com
Looks to have a Qcodefag repo installed on a page. He wanted to know if he could help at all and I asked him if he had posted anything in here.
He doesn't know anything of the codefags thread. He's interested in access to the api. I don't wanna dox the guy, but this name matches a guy that works for Representative Jared Polis (D-CO 2nd)
5th-term Democrat from Colorado.
http:// www.congress.org/congressorg/mlm/congressorg/bio/staff/?id=61715
Probably nothing. The QCodeFag stuff is open, 8ch is open. Nothing to worry about anons?
All updated
New Qanon ATOM feed:
I managed to throw together an ATOM feed here:
http:// qanon.news/feed
or
http:// qanon.news/feed?rss=true
It returns the last 50 of q posts. It's a work in progress. I can include referred posts, images etc.
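A bare-bones version of that feed can be put together with nothing but the standard library. Titles and IDs here are placeholders, not the real feed's metadata:

```python
import xml.etree.ElementTree as ET

# Minimal ATOM feed sketch: last N posts as entries. Feed title and
# entry fields are placeholders for illustration.
def build_feed(posts, limit=50):
    ns = "http://www.w3.org/2005/Atom"
    ET.register_namespace("", ns)
    feed = ET.Element("{%s}feed" % ns)
    ET.SubElement(feed, "{%s}title" % ns).text = "Q Posts"
    for p in posts[:limit]:
        entry = ET.SubElement(feed, "{%s}entry" % ns)
        ET.SubElement(entry, "{%s}id" % ns).text = str(p["id"])
        ET.SubElement(entry, "{%s}title" % ns).text = p["title"]
    return ET.tostring(feed, encoding="unicode")
```

Referred posts and images would just become extra elements per entry; the limit parameter gives the last-50 behavior.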
New Timeline api: Timeline api that shows Qposts and DJTweets. I also set up an Obama timeline that another anon pointed out. I'm planning on adding more to it and some other timelines I'm thinking about. You can see a few at http:// qanon.news/timeline.html
Anything is possible.
U is the username? Any other identifying info? Do you know of a post you could point us towards?
I've discovered the machine broke for a few hours on March 27-28 and I'm missing some json. Am I the only one saving off json or does some other codefag have some to send my way?
PageScraper to json?
Nevermind. The JSON I needed had slid off the catalog but was still avail. Thanks CM!
Interesting that you should post that anon, I've been thinking the same thing. We need a crawler. Sounds like a great idea. A better way of visualizing the context thread would be great. Ya know I've been reading about Google. PageRank. How that was designed in the beginning. Links you come across that have a lot of responses can be either good or bad on 8ch.
With the new breadID/postID feature I rolled out you could find anything you were missing for sure.
So you think your initial targets are just the baker posts and the other posts that are deemed notable?
I've been wondering if we could use a hashtag internally for our own benefit. #notable. That kind of thing.
It sounds like an interesting project. If I can help at all let me know.
Nice.
I bought hosting from Godaddy. Unlimited bandwidth and 100GB storage. Economy plan on sale was $12/year. I think I even got another domain with that deal for $1/year that I'm not even using.
Yet.
I hear ya on time. My shit got bumped from the bread because 1 anon got confused about a malware notification. I've got 2 pretty solid months of time in on what I've been doing and got taken out by a single post.
As we reach more and more of the masses, the information is going to appear on more sites that show ads/donations. It's a way of paying for the infrastructure needed to provide the service. I see nothing wrong with it.
Wow anon. It's coming together. It will be great to see it once finished.
Interesting what you are doing with the links. I think some of my pages are linking like the qcodefag sites. I hooked the RSS up to go back into the API. Think I should change that?
Nice.
Let me know if you want to hit the smash data. I'll set you up.
I rejiggered the links on some of my pages. It was set up like the qcodefag sites where each post contained a link back here. I changed that to a self referencing link instead. I decided to not be the cause of any more traffic back here.
Statistics show that people coming to my site are primarily interested in the presentation pages - not the API. I think what I've decided to do is remove all references to the API - but still provide it. Default to the posts page or something. I got a few ideas.
Look at the SmashPosts
http:// qanon.news/Help/
Tell me what ya want and I'll see what I can work out.
The Smash API will give you more data you want.
You probably don't want the timeline stuff just yet. Unless you want to just stick with the default Q/DJT timeline. Just do a GET on the timeline API. The timeline API filters out all the tweets to just show the 5, 10, 15… deltas.
Yeah, gotta add the full path to the URL. If you're hitting it programmatically I've gotta give you access. What domain would you be calling it from?
Well you can get all those from the trumptwitterarchive. What I did is group them into days that Q posted, and then only calculated the ones that DJT tweeted after Q posted.
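The grouping described there, reduced to a sketch. Post shapes (unix timestamps under a time key) are assumptions carried over from the smashed-JSON discussion earlier:

```python
# TwitterSmash-style delta sketch: for each Q post, keep only tweets from
# the same UTC day that came after it, recording the gap in minutes.
# Record shapes ("id", "time" as unix seconds) are assumptions.
def smash(qposts, tweets):
    out = []
    for q in qposts:
        for t in tweets:
            same_day = t["time"] // 86400 == q["time"] // 86400
            if same_day and t["time"] >= q["time"]:
                out.append({"qpost": q["id"], "tweet": t["id"],
                            "delta_min": (t["time"] - q["time"]) // 60})
    return out
```

Filtering the result to delta_min in {5, 10, 15, ...} gives the marker-style deltas the timeline API shows.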
If you check the API you can see the data, or look at http:// qanon.news/smashposts.html to see it more visually.
You are on it!
Pain having to get the 2017 and then the 2018 from TrumpTwitterArchive but… it's the only way.
I guess I could suck all that in and then offer it as an API… just raw twitter data.
The only thing I found with the twitter data is that there's a 9-day gap in January at the beginning of 2018. I've been fighting off a compulsion to archive those (manually) to make it complete.
css : You can just use the twitter magic.
https:// dev.twitter.com/web/overview
On the smash page I just make links and decorate with the bird and tweet. The timeline does it automagically.
Here's a question for you.
How hard would it be for you to remove all the inline style you have on q-questions.info/research-tool.php ?
Do you know about jqueryui themeroller?
Conjigger your jqueryUI website and then download the custom css like magic.
Kinda wondering about that myself.
IMO, he was talking specifically about the NP/NK video. Many have archived that offline.
On one hand, I'm archiving online - but that makes it easier for others to archive.
On the other hand - I'm archiving at home too.
The online stuff I'm doing has no bearing on my archives. I put it online so others could use it.
Huh. Anon never showed up to drop his image link on us?