I received several machine-generated e-mails which are all mostly the same: a notification. They are HTML emails with no plaintext MIME part. Yikes! And to complicate matters further, the messages traversed my anonaddy forwarding account which PGP encrypts every message to me before forwarding it to my normal email account.
The gov wants me to give them an “unaltered copy” of these e-mails. This gov office actually blocks my mail server so I am generally unwilling to send them email. This means I will be giving them the emails on paper hardcopy.
So wtf, this is tricky. They want an “unaltered copy”. If I were to print the MBOX files, it would be useless to them because it’s a base64 blob that only I can decrypt. My mail client is mutt so the HTML is detected and piped through w3m to give me a text version that is readable enough.
But in general, how do you give unaltered copies of an HTML email on paper form? This is not necessarily for a court but it could go down that path. Would a court want to see raw HTML tags? Or do courts prefer the HTML to be rendered for readability?
Normally I copy the w3m-rendered text of email into LaTeX and typeset it to look pretty and copy-paste the useful headers into a well-styled header in a monospaced font. And I omit the useless headers. But I get the impression my way of working would not pass for “unaltered”.
I could perhaps try to feed the HTML into wkhtmltopdf. In the end, HTML rendering always varies depending on the rendering tool. Normies use MS Outlook, and I have to figure that the gov is normally dealing with normies. So maybe I should install Evolution or Thunderbird. Any suggestions for a tool that is particularly good at making HTML email presentable on paper without looking too custom?
#askFedi
In my admittedly limited experience, courts don’t want to look at raw HTML unless something in the headers or something is relevant to the case. Then, they want the important bits to be highlighted by experts.
I actually wrote some Python scripts about 3 months ago to parse MBOX files so that specific emails could be entered into evidence in a lawsuit. I don’t know if my scripts would help you, but I’d be happy to send them to you.
My python knowledge is quite rough but if not much hacking is needed it could be useful. I’ve seen others asking for a similar tool. I thought about creating one over the years but keep passing on it thinking I won’t need it often enough and every situation can bring different requirements as well. Which is why I settled on pasting into a LaTeX template. I do things like use a tiny font on signature blocks that are so big they would spill over to another page.
Does python have a standard library for HTML rendering? Or do you call a browser of some kind?
I’m on my phone right now. When I get home I’ll dig them up.
I might be able to get by without the script. I just found that I can render the body in Firefox well enough (that often fails but it works with the particular emails I’m dealing with), fiddle with the paper format and scale to exactly fit a page, and then import it into LaTeX, rescale, and attach a header. If you’ve already got the script ready then I would be happy to take it anyway and compare the script output to what I’m manually rigging up. But if you’ve not started then no worries. Thanks!
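For anyone trying the same browser route, here is a rough Python sketch of getting the HTML body out of a message so it can be opened in Firefox and printed. It assumes the message has already been decrypted and saved as a single `.eml` file. `save_html_body` and both file names are made up for illustration.

```python
import email
from email import policy
from pathlib import Path


def save_html_body(eml_path, html_path):
    """Write the text/html part of a saved message to html_path."""
    raw = Path(eml_path).read_bytes()
    msg = email.message_from_bytes(raw, policy=policy.default)
    part = msg.get_body(preferencelist=("html",))
    if part is None:
        raise ValueError("message has no text/html part")
    # decode=True only undoes base64 / quoted-printable transfer encoding;
    # the HTML bytes themselves are written out as the sender produced them.
    Path(html_path).write_bytes(part.get_payload(decode=True))
    return Path(html_path)


if __name__ == "__main__":
    # Hypothetical file names; adjust to wherever the decrypted message lives.
    print(save_html_body("notification.eml", "notification.html"))
```

The resulting file can then be opened in the browser and printed to PDF by hand. Remote images and stylesheets referenced by the HTML are not fetched by this sketch.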
(edit)
fwiw to anyone with the same need, I found this project: https://github.com/nickrussler/email-to-pdf-converter It looks a bit messy to install on my distro and I’m not sure of EML / Mbox differences, so I’m not planning to use it myself.
I’d kind of forgotten how I’d done it.
This script searches an MBOX file for emails from or to lawyer1 and lawyer2 that contain the names or email addresses in target_names. It exports each email to a txt file, and saves any attachments in their original format.
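As a rough illustration of that kind of search, here is a minimal sketch using only Python's standard `mailbox` and `email` modules. It is not the original script. The function names, the output file naming, and the choice of which headers to check are all assumptions.

```python
import email
import mailbox
from email import policy
from pathlib import Path


def body_text(msg):
    """Prefer the text/plain part; fall back to the raw HTML source."""
    part = msg.get_body(preferencelist=("plain", "html"))
    if part is None:
        return ""
    try:
        return part.get_content()
    except LookupError:
        # Unknown charset label: decode leniently rather than abort the run.
        return (part.get_payload(decode=True) or b"").decode("utf-8", "replace")


def export_matches(mbox_path, out_dir, lawyers, target_names):
    """Export matching messages as .txt files (plus attachments); return the .txt paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    box = mailbox.mbox(mbox_path)
    written = []
    for n, key in enumerate(box.iterkeys()):
        msg = email.message_from_bytes(box.get_bytes(key), policy=policy.default)
        parties = " ".join(str(msg.get(h, "")) for h in ("From", "To", "Cc")).lower()
        # Keep only mail sent from or to one of the lawyer addresses.
        if not any(addr.lower() in parties for addr in lawyers):
            continue
        body = body_text(msg)
        haystack = f"{parties} {msg.get('Subject', '')} {body}".lower()
        # ...that also mentions at least one target name or address.
        if not any(name.lower() in haystack for name in target_names):
            continue
        headers = "".join(
            f"{h}: {msg.get(h, '')}\n" for h in ("Date", "From", "To", "Cc", "Subject")
        )
        txt = out / f"msg_{n:05d}.txt"
        txt.write_text(headers + "\n" + body, encoding="utf-8")
        written.append(txt)
        # Save attachments as the decoded bytes, under their original file names.
        for att in msg.iter_attachments():
            name = Path(att.get_filename() or "attachment.bin").name
            data = att.get_payload(decode=True) or b""
            (out / f"msg_{n:05d}_{name}").write_bytes(data)
    box.close()
    return written
```

Something like `export_matches("export.mbox", "out", ["lawyer1@example.com"], ["Alice Smith"])` would be the call, with every argument here a placeholder. Matching is a plain case-insensitive substring test, which is crude but easy to explain to a lawyer.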
Then I used this bash script to export the txt files to PDF, using pandoc (https://pandoc.org/)
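A conversion step along those lines can be sketched in Python with `subprocess`. This is not the original bash script. It assumes `pandoc` and a LaTeX PDF engine are installed, and `pandoc_command` and `convert_all` are names made up for the example.

```python
import shutil
import subprocess
from pathlib import Path


def pandoc_command(txt_path):
    """Build the pandoc call that writes a PDF next to one .txt file."""
    txt = Path(txt_path)
    return ["pandoc", str(txt), "-f", "markdown", "-o", str(txt.with_suffix(".pdf"))]


def convert_all(txt_dir):
    """Run pandoc on every .txt file in txt_dir and return the PDF paths."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc not found on PATH")
    pdfs = []
    for txt in sorted(Path(txt_dir).glob("*.txt")):
        cmd = pandoc_command(txt)
        subprocess.run(cmd, check=True)  # raises if pandoc exits non-zero
        pdfs.append(Path(cmd[-1]))
    return pdfs
```

Note that reading the text as markdown means pandoc reflows single line breaks and interprets characters like `*` and `#`, so the PDF may not match the email's original line layout.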
In my case, we needed to export like 7,000 emails from over a 6 year period from like a 45 GB GMail MBOX export. The lawyers seemed happy with the result, but it was a lot of data.
Thanks! I grabbed it in case it comes in handy. I wonder if the first script which searches for messages might have been simplified by using grepmail. Grepmail is slow but powerful.
This is slow too.
go full boomer and take a picture of your monitor displaying the emails?
Love the suggestion. That’s actually a great nuclear option. They would have to be understanding in most contexts. Although in the case at hand I will have to reveal that I use Tor which would probably cause a bit of confusion. And considering what a mess HTML looks like in my MUA (mutt, a text client)… well, could be a disaster.