I have 3 Python services running 24/7, pulling data from the 8chan/8kun JSON APIs. One service stores new thread ids in MongoDB, another downloads all posts from new and updated threads and loads them into MongoDB, and the third downloads all attachments from posts and tracks them in Mongo.
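Roughly, the "new or updated thread" check and the attachment tracking could look like the sketch below. This is not the actual code, just an illustration under assumptions: the field names `no`, `last_modified`, `tim` and `ext` assume a 4chan-style JSON layout, the function names are made up, and plain dicts and sets stand in for the Mongo collections.

```python
from typing import Dict, Iterable, List, Set


def threads_to_refresh(listing: Iterable[dict], seen: Dict[int, int]) -> List[int]:
    """Return ids of threads that are new or updated since the last pass.

    `listing` is assumed to be a list of pages shaped like
    [{"page": 0, "threads": [{"no": 123, "last_modified": 1600000000}]}].
    `seen` maps thread id -> last_modified recorded on the previous pass
    (a stand-in for a Mongo collection).
    """
    stale = []
    for page in listing:
        for thread in page.get("threads", []):
            tid = thread["no"]
            modified = thread.get("last_modified", 0)
            # Unknown threads default to -1, so they always count as stale.
            if seen.get(tid, -1) < modified:
                stale.append(tid)
                seen[tid] = modified
    return stale


def pending_attachments(posts: Iterable[dict], downloaded: Set[str]) -> List[str]:
    """Return attachment keys found in posts that have not been fetched yet."""
    pending = []
    for post in posts:
        # Posts without a file are assumed to carry no "tim"/"ext" fields.
        if "tim" in post and "ext" in post:
            key = f'{post["tim"]}{post["ext"]}'
            if key not in downloaded and key not in pending:
                pending.append(key)
    return pending
```

Each service would then just loop: fetch the listing, call the check, write the results to Mongo, sleep, repeat.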
Been running this since 2018 and have over 1TB of content. Runs on Linux and Mac, not tested on Windows. Can be dockerized to run anywhere but I never needed to.
I wanted to parse the notables and auto-post them to a simple website for normies. Started it but never finished it.
I wish I could run this in the cloud and dump the data into Elasticsearch, but I would probably have a lot of issues with the powers that be.