dChan - Q Origins Project Archive

I keep reading posts about archiving everything off-line. I'm trying to help!

I'm writing code to pull posts from the boards and place them into a real database with a web application.

I have some questions for Codemonkey that I need answered.

Question 1

I'm pulling the board information with catalog.json, but this brings in only the most recent 25 pages of threads. How do I get the earlier threads?
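For anyone who wants to try the same thing, here is a rough sketch of the kind of import I mean. It assumes catalog.json is a list of pages, each with a `threads` list whose entries carry `no`, `sub` and `com` fields (the usual 4chan-style layout; I have not confirmed the exact fields for this board). The table name, column names and `import_catalog` function are made up for illustration, and the sample data stands in for the real download.

```python
import sqlite3

def import_catalog(catalog, conn):
    """Flatten a catalog (a list of pages) into a 'threads' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS threads ("
        "no INTEGER PRIMARY KEY, page INTEGER, subject TEXT, comment TEXT)"
    )
    rows = [
        (t["no"], page.get("page"), t.get("sub"), t.get("com"))
        for page in catalog
        for t in page.get("threads", [])
    ]
    # INSERT OR REPLACE so re-running the import updates rows
    # instead of failing on (or duplicating) threads already stored.
    conn.executemany("INSERT OR REPLACE INTO threads VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)

# Stand-in for json.load() on the downloaded catalog.json (assumed layout).
sample = [
    {"page": 0, "threads": [{"no": 101, "sub": "General", "com": "first"}]},
    {"page": 1, "threads": [{"no": 99, "com": "older"}, {"no": 98}]},
]
conn = sqlite3.connect(":memory:")
count = import_catalog(sample, conn)
```

Keying the table on the thread number means the import can be repeated every time the catalog changes. It still only ever sees whatever the catalog exposes, which is the whole problem behind this question.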

Question 2

Are image files duplicated for each post that uses them, or is one copy stored and merely referenced again? I'm trying to understand whether I need to call the server and download each image as a separate instance. The image identifiers are 65 bytes long, but I don't know whether the same image gets a new identifier (a copy of a copy) each time it is posted.
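If the answer is that an identifier always points at the same stored file, the downloads could be deduplicated along these lines. This is only a sketch under that assumption. The field names `tim`, `ext` and `extra_files` are my guess at the post JSON layout (vichan-style), and `unique_images` is a hypothetical helper, not anything the site provides.

```python
def unique_images(posts):
    """Map each image identifier to the post numbers that reference it."""
    seen = {}
    for post in posts:
        # Assumed layout: the first file sits on the post itself,
        # any additional files sit in an 'extra_files' list.
        for f in [post] + post.get("extra_files", []):
            if "tim" in f:
                key = f["tim"] + f.get("ext", "")
                seen.setdefault(key, []).append(post["no"])
    return seen

# Made-up posts: two of them reference the same identifier.
posts = [
    {"no": 1, "tim": "aaa111", "ext": ".png"},
    {"no": 2, "tim": "bbb222", "ext": ".jpg",
     "extra_files": [{"tim": "aaa111", "ext": ".png"}]},
    {"no": 3},  # text-only post, no image
]
images = unique_images(posts)
to_download = sorted(images)
```

Each key in `images` would then be fetched once, and the list of post numbers kept in the database as references. If identifiers are regenerated per post, this saves nothing and every image has to be pulled separately.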

The big issue is question 1, as we're now into the 600s for the threads on this board alone!

I've imported the threads that I can get with the current catalog.json, and it looks beautiful for less than an hour's work.

NOTE that the present selection table shows only a subset of the thread data. All of it is captured in the database.

CM or BO can contact me at either:

n4hpg@comcast.net

n4hpg@protonmail.com

I'll work on importing actual non-Q posts tomorrow. Long day here.

You can try it out:

www.pavuk.com

username qanon

password qanon

No sense ME being anonymous…