I keep reading posts about archiving everything off-line. I'm trying to help!
I'm writing code to pull posts from the boards and place them into a real database with a web application.
I have some questions for Codemonkey that I need answered.
question 1
I'm pulling the board information with catalog.json, but this brings in only the most recent 25 pages of threads. How do I get the earlier threads?
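For anyone following along, here is a minimal sketch of the catalog step. It assumes catalog.json parses to a list of pages, each with a "page" number and a "threads" array whose entries carry a thread number in "no". That layout is an assumption based on common imageboard software, and the function name is made up for illustration. It only flattens what the catalog returns, so it does not solve the problem of reaching earlier threads.

```python
import json

def thread_numbers(catalog):
    """Flatten a parsed catalog into (page, thread number) pairs.

    Assumes each page is a dict with "page" and "threads" keys and each
    thread has a "no" field; these names are assumptions, not confirmed.
    """
    found = []
    for page in catalog:
        for thread in page.get("threads", []):
            found.append((page.get("page"), thread["no"]))
    return found

# Tiny hand-made sample standing in for a real catalog.json download.
sample = json.loads(
    '[{"page": 0, "threads": [{"no": 101}, {"no": 102}]},'
    ' {"page": 1, "threads": [{"no": 99}]}]'
)
print(thread_numbers(sample))
```

Each pair can then become one row in a threads table before the individual thread files are fetched.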
question 2
Are image files duplicated between the posts where they are used, or are they shared and merely referenced again? I'm trying to understand whether I need to call the server and download each image as a separate instance. The image identifiers are 65 bytes long, but I don't know if they're copies of copies between posts.
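If it turns out that the same identifier always means the same file, the downloads could be cut down roughly like this. This is only a sketch under that assumption, which is exactly what I'm asking about. The "images" field and the function name are hypothetical stand-ins, not the board's real field names.

```python
def unique_images(posts):
    """Return each distinct image identifier once, in first-seen order.

    Assumes every post is a dict with an optional "images" list of
    identifier strings (a hypothetical layout for illustration).
    """
    seen = set()
    ordered = []
    for post in posts:
        for image_id in post.get("images", []):
            if image_id not in seen:
                seen.add(image_id)
                ordered.append(image_id)
    return ordered

# Hand-made sample: the second post reuses an identifier from the first.
posts = [
    {"no": 1, "images": ["aaa.png", "bbb.jpg"]},
    {"no": 2, "images": ["bbb.jpg"]},
    {"no": 3},
]
print(unique_images(posts))
```

If identifiers are not shared between posts, this saves nothing and every image would still have to be fetched separately.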
The big issue is question 1 as we're now into the 600s for the threads just on this board!!!
I've imported the threads that I can get with the current catalog.json, and it looks beautiful for less than an hour's work.
NOTE that the present selection table shows only a subset of the thread data. All of it is captured in the database.
CM or BO can contact me at either:
n4hpg@comcast.net
n4hpg@protonmail.com
I'll work on importing actual non-Q posts tomorrow. Long day here.
You can try it out:
www.pavuk.com
username qanon
password qanon
No sense ME being anonymous…