Anonymous ID: 7aae8d June 2, 2021, 7:03 p.m. No.13817585

>>13814830

>Here is python source code to extract all the words from fauci's email drop

 

That Python code is using the Natural Language Toolkit (nltk).

'punkt' is nltk's sentence tokenizer.
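
For anyone unsure what a tokenizer does: it cuts text that is already there into pieces. Rough sketch below using only the standard library's re module as a stand-in. This is not nltk and not the code from the quoted post; split_sentences and split_words are made-up names, and the sample text is invented.

```python
import re

def split_sentences(text):
    """Naive sentence split: break after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    # Drop empty strings (e.g. when the input itself is empty)
    return [p for p in parts if p]

def split_words(sentence):
    """Naive word split: runs of letters, digits, and apostrophes."""
    return re.findall(r"[A-Za-z0-9']+", sentence)

# Invented sample text, only for illustration
sample = "The email was short. It had two lines."
sentences = split_sentences(sample)
words = [split_words(s) for s in sentences]

print(sentences)
print(words)
```

A real tokenizer like punkt is smarter about abbreviations and the like, but the job is the same kind of thing: splitting, not inventing.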

 

Basically the code is using probability to choose the most likely words that would fit around the non-redacted words in the PDF.
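
To show what "using probability to choose the most likely word" looks like in general, here is a toy sketch. It is not the code from the quoted post and it does not use nltk; it just counts which word most often follows another in a tiny made-up corpus and guesses from that. train_bigrams and guess_blank are hypothetical names.

```python
from collections import Counter, defaultdict

def train_bigrams(words):
    """Count, for each word, which words follow it and how often."""
    follows = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        follows[prev][nxt] += 1
    return follows

def guess_blank(follows, prev_word):
    """Guess the word after prev_word: the most frequent follower, or None."""
    candidates = follows.get(prev_word)
    if not candidates:
        return None  # never saw this word, so there is nothing to guess from
    return candidates.most_common(1)[0][0]

# Made-up corpus, only for illustration
corpus = "the email was sent and the email was read and the reply was sent".split()
model = train_bigrams(corpus)

guess = guess_blank(model, "email")
print(guess)
```

The guess is only ever as good as the counts behind it, and it says nothing about what was actually under a redaction.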

 

It is not extracting words, it is using probability to fill in the blanks.

Like throwing hotdogs down a hallway while riding a merry-go-round.

But the hotdogs would be more fun and accurate.