dChan - Q Origins Project Archive

IMPORTANT NOTES ABOUT THE ARCHIVES

Yesterday, I spent much of my day determining which boards the posts on the map came from so that I could get them properly indexed in my database. Many of these posts were damaged by hackers on their original threads. Fortunately, we’ve been archiving threads as we go along, so we have all of the thumbnails.

THE BAD NEWS

When the threads were archived, the archiving site did NOT save the full-size images that go with the graphics. These, apparently, must be saved individually. To get to a full-size image, when one is available, enter the original URL for the image into the archive site’s search box. In a few cases, I was able to retrieve the full-size images that way, but most of them had not been saved.
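As a rough sketch of that lookup, assuming the archive in question is the Wayback Machine (the note above does not name the site), the capture listing for an image can be reached by appending the original URL to the `https://web.archive.org/web/*/` prefix. The function name and the example image URL are hypothetical, made up for illustration.

```python
from urllib.parse import quote

# Prefix for the Wayback Machine page that lists all captures of a URL.
# Assumption: the archive being searched is the Wayback Machine.
WAYBACK_PREFIX = "https://web.archive.org/web/*/"


def wayback_lookup_url(original_url: str) -> str:
    """Return the address of the capture listing for original_url."""
    # Leave ordinary URL punctuation alone; escape spaces and other unsafe characters.
    return WAYBACK_PREFIX + quote(original_url.strip(), safe=":/?&=%")


# Hypothetical image URL, used only to show the shape of the result.
example = "https://media.example.org/file/abc123.png"
print(wayback_lookup_url(example))
# https://web.archive.org/web/*/https://media.example.org/file/abc123.png
```

If the listing shows no captures, the full-size image was never saved and has to be reuploaded by whoever still has it.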

IMPORTANT PROCEDURE WHEN UPLOADING IMAGES

This is particularly important for those infographics that can’t be read in their thumbnail form:

After you upload that beautifully crafted, highly informative infographic, click on the file link above your graphic on the thread and archive the page that comes up, which is dedicated to your graphic. This will ensure that your graphic is saved. Archiving the thread itself is simply not sufficient to preserve your work.

WHICH ARCHIVE TO USE: archive.is v. archive.org (Wayback Machine)

There’s an important difference between these two archives. Archive.org saves pages in a format that is about as close to the original as it can be. These pages retain the original HTML tagging and attributes. Original file names are preserved as well. If I wanted to scrape a chan thread saved on archive.org, I could probably do it successfully. If I saved the archived thread to my own computer, I could copy files from the _files directory for the archive’s version of the saved thread directly into the _files directory for a thread saved from a chan page, and that would recover them properly.
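The copy step described above could be sketched roughly as follows. This is only an illustration, assuming both threads were saved locally with a browser so that each has its own _files directory; the function name and the directory names in the usage note are hypothetical. It relies on the point made above that archive.org keeps the original file names, so a file missing from the chan copy can be matched by name.

```python
import shutil
from pathlib import Path


def recover_missing_files(archive_files: Path, chan_files: Path) -> list[str]:
    """Copy files found in archive_files but absent from chan_files.

    Returns the names of the files that were copied.
    """
    recovered = []
    chan_files.mkdir(parents=True, exist_ok=True)
    for src in sorted(archive_files.iterdir()):
        dest = chan_files / src.name
        # Only fill gaps; never overwrite a file the chan copy already has.
        if src.is_file() and not dest.exists():
            shutil.copy2(src, dest)
            recovered.append(src.name)
    return recovered
```

A call such as `recover_missing_files(Path("archived_thread_files"), Path("chan_thread_files"))` would return the list of thumbnails that were restored. The same approach would not work against archive.is, since that site renames the image files.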

Archive.is does not work the same way. That site converts class attributes into style attributes. From my perspective, this means that the thread cannot be scraped to preserve posts individually, since I depend on class attributes to tell me what part of the post I am parsing. Also, archive.is renames the image files, making it a bit more difficult to use that site for retrieving individual thumbnails to fix broken posts.
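To illustrate why the class attributes matter, here is a minimal sketch of a class-based parser built on Python's standard `html.parser` module. It is not the actual scraper used for this project; the class name `post_body` and both HTML snippets are hypothetical stand-ins for real chan markup. The first snippet keeps its class attributes, as an archive.org copy would, and the second has them replaced by style attributes, as on archive.is.

```python
from html.parser import HTMLParser


class PostBodyExtractor(HTMLParser):
    """Collect the text of every <div> whose class list contains target_class."""

    def __init__(self, target_class: str = "post_body"):  # hypothetical class name
        super().__init__()
        self.target_class = target_class
        self.depth = 0   # greater than 0 while inside a matching <div>
        self.posts = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self.depth:
            self.depth += 1  # nested <div> inside a post body
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.target_class in classes:
            self.depth = 1
            self.posts.append("")

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.posts[-1] += data


def extract_posts(html: str) -> list[str]:
    parser = PostBodyExtractor()
    parser.feed(html)
    parser.close()
    return [post.strip() for post in parser.posts]


# Hypothetical markup: classes intact versus classes converted to styles.
with_classes = '<div class="post"><div class="post_body">first post</div></div>'
with_styles = '<div style="margin:4px"><div style="display:block">first post</div></div>'

print(extract_posts(with_classes))  # ['first post']
print(extract_posts(with_styles))   # []
```

With the classes gone, the parser has nothing to tell a post body apart from any other block on the page, so it finds no posts at all.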

Anonymous ID: 3dd669 Jan. 24, 2018, 10:16 a.m. No.148055 >>8080 >>8110

>>147948

The only change I might make in this strategy would be this:

Don't use the word "patriot" when communicating with libs. This is a dirty word in their vernacular, as recent history has shown us. But the general message about cleaning out Pres. Trump's FBI remains the same.

>>148046

Can you create a version that does not include the dickhead McKek?

>>148016

Speaking of illegals (or refugees) over Americans, I commented the other day about the large increase in homelessness in the last couple of years. Someone asked about the races of the homeless. I'd said they were all races. I'd like to make one correction to that. I don't recall ever seeing a homeless Muslim. It's mostly blacks, whites, and Hispanics that I've seen. Even homeless Hispanics aren't seen as much. They seem to be more resourceful. (Something's wrong with our education system.)

>>147856

For those concerned about the issue, just because your stuff is being deleted does not mean that the administration is not paying attention. I noticed on Q post 127421 this: "Thought when we 404’d the link that gave confirmation." So it could be that they're not ignoring you, but maybe they don't want to let on to the opposition that they're seriously looking at it.

>>148163

Excellent!

Anonymous ID: 3dd669 Jan. 24, 2018, 10:45 a.m. No.148302 >>8403

>>148134

Seldom. But I've seen a few homeless Asians. But generally, Asians tend to be quite resourceful, too.

>>148141

Yes, those, I've definitely noticed. L.A. has a large piece of property that was gifted for the purpose of housing disabled veterans. Unfortunately, that is not how the land is being used.

https:// www.cnn.com/2011/10/22/health/homeless-veterans/index.html

Anonymous ID: 3dd669 Jan. 24, 2018, 10:55 a.m. No.148390 >>8433

>>147911

Even some of the stuff posted here on the chans has been of the type that "can't be unseen". For instance, there's that pic of the bin of dead babies at some processing plant. I've decided not to include that type of post on the site I'm building, at least not directly. At most, I will post links to stories about such things with a content warning.

>>147945

With regard to this, perhaps we should set up a separate thread for collecting reuploads of important infographics. I'm mostly in data mode again. As I work through the breads, I may occasionally ask for important full-size files that are missing. Now, I'm going to go take a nap. (I've been more nocturnal lately.) Then I'm going to try to finish tidying up the posts that are featured on the maps so that I can get a new release of my blog posted. Hopefully that happens tonight, but the complexity of the project continues to surprise me.