AutoArchiveAnon !!!OGY0YjcwNDFlOTMz ID: b20624 Aug. 8, 2020, 12:54 a.m. No.22364   🗄️.is 🔗kun   >>2365

>>22361

> Various issues arise–bread shiting, bad formatting in topmatter dough post itself, bad fresh bread numbering, bakers using different conventions in their in-bread notables

None of these are really a problem.

I can simply take all the links in the first few posts of a thread that were posted by the same user ID as the first post, follow these links wherever they point, and mark the linked posts as notables. This way numbering doesn't matter. Of course this way we would get all notables from all threads, so there should be a way to single out the General, which can be done with a simple text search.
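
A rough sketch of that idea in Python. It assumes the thread is available as vichan-style JSON, i.e. a `posts` list whose entries carry `no`, `id` and an HTML `com` body. Those field names, the `first_n` cutoff and the function name are assumptions for illustration only.

```python
import html
import re

# Matches quote links such as ">>10201968" once HTML entities are unescaped.
QUOTE_LINK = re.compile(r">>(\d+)")


def collect_notables(thread, first_n=5):
    """Return the post numbers linked from the baker's opening posts.

    `thread` is assumed to look like
    {"posts": [{"no": ..., "id": ..., "com": "..."}, ...]}.
    Only the first `first_n` posts are scanned, and only those that share
    the user ID of the very first post.
    """
    posts = thread.get("posts", [])
    if not posts:
        return []
    baker_id = posts[0].get("id")
    seen = set()
    notables = []
    for post in posts[:first_n]:
        if post.get("id") != baker_id:
            continue  # not written by the thread creator
        body = html.unescape(post.get("com", ""))
        for match in QUOTE_LINK.finditer(body):
            number = int(match.group(1))
            if number not in seen:  # keep first-seen order, drop duplicates
                seen.add(number)
                notables.append(number)
    return notables
```

Because only the link targets are used, the baker's own numbering and formatting never enter into it.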

 

The only problem would be of course someone who creates a fake bread to mark all sorts of posts as notables, but oh well…

 

I can do all that, but as I said I can't host my own site. I could for example upload the data to an FTP server for further processing.

AutoArchiveAnon !!!OGY0YjcwNDFlOTMz ID: b20624 Aug. 8, 2020, 12:59 a.m. No.22365   🗄️.is 🔗kun   >>2366

>>22364

What I tested yesterday: one can link via img src to media.8kun.top as well as archive.org (when you have the image link) from any HTML file. Nothing like that is blocked.

 

This means that the website that does all this does not have to host images or videos; these can all be taken from the 8kun server and later from archive.org.

 

One could take at least the images, that's what wearethene.ws does.

AutoArchiveAnon !!!OGY0YjcwNDFlOTMz ID: b20624 Aug. 8, 2020, 1:07 a.m. No.22366   🗄️.is 🔗kun   >>2372

>>22365

What would also be great about this:

We would then get all notables from all threads!!!

 

Right now wearethene.ws only covers the General, not all the country-specific breads.

 

The only problem I see is if notables aren't taken properly. If for example the baker links only one post, but the original poster made 3 posts, then only that one post would show up. Of course that would be a baker issue.

AutoArchiveAnon !!!OGY0YjcwNDFlOTMz ID: b20624 Aug. 8, 2020, 6:03 a.m. No.22379   🗄️.is 🔗kun   >>2380 >>2384

>>22377

I've been archiving all threads to the wayback machine for about a month. That way graphics are archived and preserved too. Raw JSON data is also archived.

 

I can understand that he doesn't want to rely on 3rd party sites, and wayback machine can also be modified/censored, but the point of it is being a backup. 8kun archives threads too, but all the graphics + videos are gone at that point.

 

qresear.ch also has a notable section, but this one is just a listing of the notables without any links or graphics, which makes it quite unusable. It's just the baker notable posts.

 

I would prefer something like wearethene.ws with all actual notable posts including videos + graphics, plus a search for that, plus links to the original threads, plus links to wayback machine archived threads. I think that would be really useful.

AutoArchiveAnon !!!OGY0YjcwNDFlOTMz ID: b20624 Aug. 8, 2020, 6:10 a.m. No.22383   🗄️.is 🔗kun

>>22377

What you could do is check if a thread got archived on wayback machine and then offer a link to it, and check archive.is too.

 

The wayback machine is kinda buggy, but I figured out that this API here:

https://archive.org/wayback/available?url=https://media.8kun.top/file_store/7690504a4047614878dcd2377a5ac53c9f22772c48f46da051d7049ee7cf0a32.mp4

works fine 99% of the time.

 

You will get JSON data that points to the archive. You can do that with videos, graphics and of course threads themselves.
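
A minimal sketch of that lookup in Python, standard library only. The reply shape used here (`archived_snapshots` -> `closest` with `available`, `url` and `timestamp`) is what I understand the endpoint to return; treat it as an assumption and check a live reply. The function names are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def closest_snapshot(payload):
    """Return (snapshot_url, timestamp) from a decoded reply, or None.

    `payload` is the JSON reply already parsed into a dict. The key names
    are assumed from the replies observed for this endpoint.
    """
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["url"], closest["timestamp"]


def lookup(url, timeout=30):
    """Ask the availability API about `url` (thread, image or video)."""
    query = urllib.parse.urlencode({"url": url})
    request_url = f"{AVAILABILITY_ENDPOINT}?{query}"
    with urllib.request.urlopen(request_url, timeout=timeout) as response:
        payload = json.load(response)
    return closest_snapshot(payload)
```

An empty `archived_snapshots` object is treated as "not archived", so the caller only has to check for `None`.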

 

I'm typically archiving every few hours. Threads are either archived when they are in the lower 80% of the catalog, or when they are at 751 posts.
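
The thread rule above could be written roughly like this. It is only a sketch: "lower 80% of the catalog" is read here as "catalog position past the first 20%", counted from zero at the top, and "at 751 posts" as "751 or more". Both readings, and all the names, are assumptions.

```python
def should_archive_thread(catalog_index, catalog_size, post_count,
                          lower_fraction=0.8, post_limit=751):
    """Decide whether a thread is due for archiving.

    catalog_index: 0-based position in the catalog (0 = top).
    catalog_size:  number of threads in the catalog.
    post_count:    number of posts currently in the thread.
    """
    if post_count >= post_limit:
        return True  # thread is full
    if catalog_size <= 0:
        return False  # nothing sensible to compare against
    # Threads at or below this position count as the "lower" part.
    first_lower_index = catalog_size * (1.0 - lower_fraction)
    return catalog_index >= first_lower_index
```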

 

I only save graphics that exceed 1777x1777 resolution atm, I was playing with the idea of archiving everything above 1000x1000, but I can't do that at once.

All Q posts attachments are saved, all attachments of posts in reply to Q posts are archived too.

All videos are archived as well (mp4, webm).
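
Put together, the attachment rules look roughly like the sketch below. It assumes "exceed 1777x1777" means both width and height are larger than 1777, which may not be exactly how the check is done. The function and parameter names are hypothetical.

```python
import os

VIDEO_EXTENSIONS = {".mp4", ".webm"}


def should_archive_attachment(filename, width, height,
                              is_q_post=False, replies_to_q=False,
                              min_side=1777):
    """Decide whether a single attachment gets archived."""
    extension = os.path.splitext(filename)[1].lower()
    if extension in VIDEO_EXTENSIONS:
        return True  # all videos are kept
    if is_q_post or replies_to_q:
        return True  # Q posts and replies to them are always kept
    # Assumption: both dimensions must exceed the threshold.
    return width > min_side and height > min_side
```

Lowering `min_side` to 1000 would give the "everything above 1000x1000" variant.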

 

And I saw that sometimes some threads were already archived before, but this wasn't done all the time. You can check the timestamp returned by wayback machine as well, so that you know if it's up-to-date. I'm doing exactly that before archiving threads.

AutoArchiveAnon !!!OGY0YjcwNDFlOTMz ID: b20624 Aug. 8, 2020, 7:28 a.m. No.22387   🗄️.is 🔗kun

>>22386

Yes, alternatives are always good.

If one goes down, the others may still be there.

wearethene.ws is great, could use a search though.

 

>>22384

You could link to the archived videos on wayback machine. That way we don't lose the videos.

AutoArchiveAnon !!!OGY0YjcwNDFlOTMz ID: b20624 Aug. 8, 2020, 2:39 p.m. No.22399   🗄️.is 🔗kun   >>2400 >>2521

>>22396

You need to figure out the archived links.

 

Most of the time simply going to:

https://web.archive.org/web/https://media.8kun.top/file_store/4c797bc1ec665621f30b59237bcbc5b24fb74949b0345f435fa157ad1518ab17.mp4

should work and will look up the most recent archive, but I had situations where this reported that the URL had not been archived even though it was.

 

What works for me all the time is this API:

https://archive.org/wayback/available?url=https://media.8kun.top/file_store/7690504a4047614878dcd2377a5ac53c9f22772c48f46da051d7049ee7cf0a32.mp4

 

This will return JSON data with the most recent archived URL.

 

Videos will play when calling such URLs.

If you want the raw data URL, you have to look at how wayback machine calls the images/videos themselves.

 

For example most recent archive for the previous URL is:

https://web.archive.org/web/20200715221046/https://media.8kun.top/file_store/4c797bc1ec665621f30b59237bcbc5b24fb74949b0345f435fa157ad1518ab17.mp4

This will load their mp4-player though and add wayback machine information on top.

 

Actual video is here:

https://web.archive.org/web/20200715221046if_/https://media.8kun.top/file_store/4c797bc1ec665621f30b59237bcbc5b24fb74949b0345f435fa157ad1518ab17.mp4
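
The difference between the two URLs is just the `if_` suffix on the timestamp, so the rewrite can be sketched like this. It assumes the usual 14-digit timestamp in the snapshot URL; the function name is made up.

```python
import re

# prefix, 14-digit timestamp, then "/" plus the original URL
SNAPSHOT = re.compile(r"^(https?://web\.archive\.org/web/)(\d{14})(/.+)$")


def raw_snapshot_url(snapshot_url):
    """Turn a player/toolbar snapshot URL into the raw-content URL."""
    match = SNAPSHOT.match(snapshot_url)
    if match is None:
        raise ValueError(f"not a plain wayback snapshot URL: {snapshot_url}")
    prefix, timestamp, rest = match.groups()
    return f"{prefix}{timestamp}if_{rest}"
```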

 

Archive.IS has a similar way of going to the archived data.

 

You can also implement the wayback machine URL lookup code in JavaScript, so that the client side checks for the archived data.

 

Or you could do it on demand: when a user clicks on a video for the first time, you look up the URL. If you get a URL, you save it in your local database.
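
A small sketch of the on-demand variant, using SQLite as the local database. The table layout and names are made up, and `lookup` stands for whatever callable asks the archive and returns a snapshot URL or `None`.

```python
import sqlite3


def open_cache(path=":memory:"):
    """Open (or create) the local cache of original -> archived URLs."""
    connection = sqlite3.connect(path)
    connection.execute(
        "CREATE TABLE IF NOT EXISTS archived ("
        "original TEXT PRIMARY KEY, snapshot TEXT NOT NULL)"
    )
    return connection


def archived_url(connection, original, lookup):
    """Return the archived URL for `original`, asking the archive only once."""
    row = connection.execute(
        "SELECT snapshot FROM archived WHERE original = ?", (original,)
    ).fetchone()
    if row is not None:
        return row[0]  # served from the local database
    snapshot = lookup(original)  # first click: ask the archive
    if snapshot is not None:
        connection.execute(
            "INSERT OR REPLACE INTO archived (original, snapshot) VALUES (?, ?)",
            (original, snapshot),
        )
        connection.commit()
    return snapshot
```

Misses are not stored, so a file that gets archived later will be found on the next click.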

 

If you do that, you need to support multiple archives for threads + JSON data and check that they are really up-to-date (compare the time stamp of the 8kun thread with the time stamp of the archive on wayback machine; if I remember correctly both are UTC).
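
The time stamp comparison could look like the sketch below. It assumes the wayback time stamp is the 14-digit `YYYYMMDDhhmmss` form in UTC and that the thread's last post time is available as a Unix epoch value (as in the thread JSON, if I remember the format correctly). The names are made up.

```python
from datetime import datetime, timezone


def parse_wayback_timestamp(stamp):
    """Parse a 14-digit wayback time stamp as a UTC datetime."""
    return datetime.strptime(stamp, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def archive_is_current(wayback_stamp, last_post_epoch):
    """True if the snapshot was taken at or after the thread's last post."""
    archived_at = parse_wayback_timestamp(wayback_stamp)
    last_post_at = datetime.fromtimestamp(last_post_epoch, tz=timezone.utc)
    return archived_at >= last_post_at
```

If this returns False, the thread has new posts since the snapshot and needs to be archived again.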

 

Images + Video URLs never change, and 8kun is also not using different URLs for the same video. When the same video is uploaded several times, the same URL will serve the content.

AutoArchiveAnon !!!OGY0YjcwNDFlOTMz ID: b20624 Aug. 8, 2020, 2:52 p.m. No.22400   🗄️.is 🔗kun

>>22399

To explain further:

 

You know the URL of each thread.

For example

https://8kun.top/qresearch/res/10201968.html

 

To figure out the wayback machine archive you call

https://archive.org/wayback/available?url=https://8kun.top/qresearch/res/10182731.html

 

From there you get

http://web.archive.org/web/20200805010421/https://8kun.top/qresearch/res/10182731.html

 

I just noticed that over the last 1-2 days I received an archived URL from wayback machine, but when I try to get the URL right now I don't get any, plus the previously returned archived URL doesn't work either.

 

It started with this thread:

http://web.archive.org/web/20200805040335/https://8kun.top/qresearch/res/10183484.html

the one before that works fine

http://web.archive.org/web/20200805010421/https://8kun.top/qresearch/res/10182731.html

 

But the archive.IS link works fine.

http://archive.is/hGdzm

 

Fuckery afoot.