Anonymous ID: a55360 May 11, 2020, 3:08 a.m. No.9120686

Checking in. I've put the Star Wars: Commander dig on the back burner since I'm not convinced that's the right rabbit hole to be jumping down. I heard someone mention that the redactions can be pulled off the transcripts in Ghidra, but I'm not sure. The PDFs don't have any executable code in them, so Ghidra would really only show the raw bytes of the documents.

 

Having said that, I loaded up two PDFs anyway just to take a cursory glance. I used Andrew Brown's transcript as an experiment, looking first at Schiff's release and then the DNI release.

 

Schiff's release looks about like what I would expect a PDF to look like in Ghidra. Pretty mundane. I looked through the ASCII translation of the bytes and saw some XML formatting code, and I was able to tell where paragraphs start, but all in all there's nothing to see there.

 

The DNI's release is a bit more interesting though. I haven't found anything just yet, but it looks different from Schiff's. The ASCII readout is about the same with some differences (at first glance), but what stood out to me was that the code analyzer actually returned results. I'm not sure what they mean, but (as we knew) there's an obvious difference between the files that Schiff released and the ones that the DNI released.

 

Observations:

The DNI's version seems to be scanned images, whereas Schiff's could be close to the original PDF documents with redaction bars added. The DNI, it seems, did the smart thing by scanning these documents back in after redacting them, so the text under the bars can't be recovered by picking the file apart. I think it would be worth opening Schiff's documents up in Adobe Pro and seeing whether you can simply delete the bars.
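
A rough way to test the "images vs. real text" hunch without Ghidra is to count a couple of telltale markers in the raw bytes. This is only a sketch under assumptions: the function name rough_pdf_profile is made up for illustration, and the counts can be misleading when a PDF keeps its object dictionaries inside compressed object streams, where a plain byte search won't see them.

```python
import re


def rough_pdf_profile(data: bytes) -> dict:
    """Count raw-byte markers hinting at scanned pages vs. real text.

    Hypothetical helper: a heuristic, not a PDF parser.
    """
    return {
        # Image XObjects are declared with "/Subtype /Image".
        "image_xobjects": len(re.findall(rb"/Subtype\s*/Image\b", data)),
        # "/Font" appears in font objects and in page resources;
        # the \b keeps "/FontDescriptor" and similar keys out of the count.
        "font_refs": len(re.findall(rb"/Font\b", data)),
    }


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as fh:
        print(rough_pdf_profile(fh.read()))
```

A file with many image objects and no font references is probably a scan; plenty of font references suggests there is still real text in the file somewhere.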

 

I will have to give this a shot. As always, I'm open to anyone else's opinions or direction on this. I'm very new to this but I'm dedicated. Also, if these redactions CAN be stripped, it'd be smart to download all of them before they get taken down.

Anonymous ID: a55360 May 11, 2020, 10:36 a.m. No.9125009   >>9955

Idea

 

Okay, so PDFs are essentially heaps of code that Adobe Reader/Acrobat translates into readable pages. There are some nuances to it, but you can read the overview here to get a good idea of how a PDF is built:

 

https://blog.idrsolutions.com/2013/01/understanding-the-pdf-file-format-overview/#helloworld

 

Here's where I'm at. PDF files declare the objects that will be present when the file is opened in Adobe. Those objects can be a number of things (text, images, signatures, etc.), but the problem is that the actual contents of those objects are encoded (compressed). Luckily for us, we know what it uses to encode:

 

FlateDecode
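
To see which objects in a given file actually carry that filter, something like the following Python sketch should work on the raw bytes. The function name list_stream_filters and both regexes are my own assumptions, not anything from the linked article, and this is a crude scan rather than a real PDF parser, so it can miss objects tucked away inside compressed object streams.

```python
import re

# "N G obj ... endobj" is how a PDF wraps each indirect object.
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)\bendobj", re.DOTALL)
# /Filter is followed by one name (/FlateDecode) or an array of names.
FILTER_RE = re.compile(rb"/Filter\s*(\[[^\]]*\]|/[A-Za-z0-9]+)")


def list_stream_filters(data: bytes) -> list:
    """Return (object number, [filter names]) for every object with a stream."""
    results = []
    for match in OBJ_RE.finditer(data):
        body = match.group(3)
        # Everything before the "stream" keyword is the object's dictionary.
        head, keyword, _ = body.partition(b"stream")
        if not keyword:
            continue  # no stream in this object
        found = FILTER_RE.search(head)
        names = re.findall(rb"/([A-Za-z0-9]+)", found.group(1)) if found else []
        results.append((int(match.group(1)), [n.decode("ascii") for n in names]))
    return results


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as fh:
        for number, filters in list_stream_filters(fh.read()):
            print(number, filters)
```

Streams that list DCTDecode or similar are image data (JPEG); the ones that list only FlateDecode are the candidates for page content.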

 

So I'm still learning a bit more about that, but conceptually one would be able to grab the bytes from the objects in Ghidra and run them through a decoder (FlateDecode is the PDF name for zlib/deflate compression, so any zlib "inflate" routine should do). What this would do is essentially display the decoded object as plaintext. If the object is a picture, the output would still look like jumbled bytes, but if it were a text box it might have some code describing the box, and then possibly the string inside.
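
Here's a minimal sketch of that idea in Python, skipping Ghidra and reading the file directly. Assumptions: the name inflate_streams is made up, the regex is a shortcut for finding stream data rather than a proper parse, and it simply tries zlib on every stream and skips the ones that fail (images stored as JPEG, streams with extra filters stacked on top, and so on).

```python
import re
import zlib

# Stream data sits between the "stream" and "endstream" keywords.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\nendstream", re.DOTALL)


def inflate_streams(data: bytes) -> list:
    """Return the decompressed bytes of every stream that zlib can inflate."""
    decoded = []
    for match in STREAM_RE.finditer(data):
        try:
            decoded.append(zlib.decompress(match.group(1)))
        except zlib.error:
            continue  # not plain Flate data, so leave it alone
    return decoded


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as fh:
        for i, chunk in enumerate(inflate_streams(fh.read())):
            # Page content shows text operators such as BT ... Tj ... ET.
            print(f"--- stream {i} ({len(chunk)} bytes) ---")
            print(chunk[:300])
```

In a decoded page stream, strings drawn on the page show up in parentheses or angle brackets in front of Tj/TJ operators, though with embedded fonts the bytes inside may be glyph codes rather than readable letters.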

 

I haven't tried it yet. I'm having to learn about decoding first. I'm trying to figure out if there's a way for me to decode straight from binary or hex through the algorithm into plaintext, or if I'm just barking up the wrong tree again.
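
For the "straight from binary or hex" part, Python's standard library looks like it covers it, assuming the bytes copied out of Ghidra are one complete Flate stream (everything between stream and endstream, nothing more). The helper names below are hypothetical; bytes.fromhex and zlib.decompress are the real standard-library calls doing the work.

```python
import zlib


def inflate_hex(hex_dump: str) -> bytes:
    """Turn a hex dump like '78 9c 4b ...' into bytes, then inflate it."""
    raw = bytes.fromhex(hex_dump)  # whitespace between byte pairs is ignored
    return zlib.decompress(raw)


def bits_to_bytes(bits: str) -> bytes:
    """Turn a string of 0s and 1s (length a multiple of 8) into bytes."""
    bits = "".join(bits.split())  # drop spaces and newlines
    return int(bits, 2).to_bytes(len(bits) // 8, "big")


if __name__ == "__main__":
    # Round trip on made-up data, just to show the shape of the calls.
    dump = zlib.compress(b"(redacted?) Tj").hex(" ")
    print(dump)
    print(inflate_hex(dump))
```

If zlib.decompress raises an error, the copied bytes probably include something extra (part of the dictionary or the endstream keyword) or the stream isn't Flate data at all.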