>>12350
(Back from the store.)
There will be a lot of dups in the scraped meme collection if you scraped them from the general. QResearch allowed duplicate images in the same bread, too, so even the meme breads sometimes contained dups. And I admit to shitposting many, many, many meme dups over the 19 months when trying to change the shilly tone into a productive tone over there.
You could probably shrink the library by 25% if you had a good programmatic way to ID dups. Hashing the file contents would probably work well for that, at least for byte-identical copies.
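A minimal sketch of the hash idea, assuming a Linux box with GNU coreutils (`sha256sum`, and `uniq` with `-w` and `--all-repeated`). The function name `find_dups` is made up for illustration. It only catches byte-identical files; a meme that was re-saved, resized, or recompressed hashes differently and will not be flagged.

```shell
#!/bin/bash
# Sketch: list byte-identical duplicate files under a directory by SHA-256.
# find_dups is a hypothetical helper name, not part of any existing tool.
# Assumes GNU coreutils (sha256sum, uniq -w / --all-repeated).
find_dups() {
  local dir="${1:-.}"
  # Hash every regular file, sort so equal hashes sit next to each other,
  # then print only the lines whose first 64 characters (the hex digest)
  # repeat. Groups of duplicates are separated by a blank line.
  find "$dir" -type f -exec sha256sum {} + \
    | sort \
    | uniq -w64 --all-repeated=separate
}
```

Something like `find_dups ~/memes` would print each group of identical files together, leaving the decision of which copy to delete to a human. Filenames containing newlines or backslashes are printed in sha256sum's escaped form.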
For the past few months I have been shrinking the meme files as I harvest them.
My algo:
1. Examine memes to be archived and isolate any that HAVE to stay PNGs because they have a transparent background.
2. Examine memes to be converted to .jpg and manually shrink any that are over about 1800 pixels in any dimension. We just don't need that much resolution, except for text-heavy infographics, which get to remain large.
3. Run png2jpg.sh on the ones that survived step 2:
#!/bin/bash
# Convert all .png to .jpg
# 29Mar2019 Added filename conversions from .jpeg to .jpg
# 21Apr2019 Changed quality from 93 to 95
shopt -s nullglob   # unmatched globs expand to nothing instead of themselves
for file in *.png *.PNG; do
  # Remove the source only if the conversion succeeded
  mogrify -format jpg -quality 95 "$file" && rm -- "$file"
done
for file in *.jpeg *.JPEG; do
  mv -- "$file" "${file%.*}.jpg"   # swap only the trailing extension
done
Note: mogrify is provided by the imagemagick package. It's a rather dangerous command that changes files in place without making a backup. So use it with caution! If changing the filename to .jpg overwrites a previous file, well, that previous file will be lost.
4. Drag a batch of .jpg files to the Trimage image compressor. Trimage sometimes shrinks files by 2%, 5%, or 10% by removing unnecessary optional color profiles and such. Now this batch of files is ready for archiving.
5. Separately handle the isolated .png files that are not due for conversion to .jpg. Drag them to Trimage. Trimage will cogitate a LONG time on .png files and use a ton of CPU. It may produce some work files that are sometimes not cleaned up from the working directory.
6. The unconverted .png files are good for archiving now.
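Two of the manual checks in the steps above could be partly scripted. This is only a sketch: the helper names `over_limit` and `jpg_collisions` are made up, 1800 is the rough limit mentioned in the resize step, and the usage line assumes ImageMagick's `identify` is installed. The first helper flags images that exceed the pixel limit; the second lists files whose converted .jpg name is already taken, which is the overwrite hazard from the mogrify note.

```shell
#!/bin/bash
# Sketch only: over_limit and jpg_collisions are hypothetical helper names.

# Reads lines of "WIDTH HEIGHT FILENAME" on stdin and prints the filenames
# whose width or height exceeds the limit (default 1800 pixels).
over_limit() {
  local limit="${1:-1800}"
  awk -v limit="$limit" '$1 > limit || $2 > limit {
    # Drop the two numeric fields; what remains is the filename,
    # spaces included.
    sub(/^[0-9]+ +[0-9]+ /, "")
    print
  }'
}

# Prints every .png/.jpeg file in a directory whose target .jpg name
# already exists, i.e. the files that a conversion would overwrite.
# The body runs in a subshell so cd and shopt do not leak to the caller.
jpg_collisions() (
  cd "${1:-.}" || exit 1
  shopt -s nullglob
  for f in *.png *.PNG *.jpeg *.JPEG; do
    if [ -e "${f%.*}.jpg" ]; then
      printf '%s\n' "$f"
    fi
  done
)

# Usage with ImageMagick (%w, %h, %f are width, height, and filename):
#   identify -format '%w %h %f\n' *.jpg *.png | over_limit 1800
#   jpg_collisions .
```

Neither helper changes any file; they only print names, so the text-heavy infographics that are allowed to stay large can still be sorted out by eye.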
That's it. Lot of steps, but we try to restrain the meme library's filesize growth. We're just under 15 GB on the Mega library and I will have to create a new one soon since we're using their free service plan.
Shadilay, codefag.