
Feedback provided regarding Google Takeout archive issues:

Originally shared by Filip H.F. “FiXato” Slagter

Because I am still running into some issues with cropped filenames, even when exporting to a zip64 archive, I've filed two tickets through Google Feedback (https://www.google.com/tools/feedback/reports?hl=en or rather the Send Feedback form on takeout.google.com).

I'm sharing them here to make them public record, so others are aware these issues can occur, and perhaps can also provide feedback to Google in case they're also affected.

Regarding Contacts with periods in their name:
When a Contact's (first?) name contains (ends with?) a period, the exported VCF tends to lack the .vcf file extension.

For instance, I had a contact named:
First name: 'wolf.'
Last name: 'nanaki'
The exported VCF, however, was named "Takeout/Contacts/Chat4All/wolf.nanaki", with no .vcf file extension at all.

This makes extracting only specific file types from an archive a bit trickier.
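As a possible workaround, contact files that lost their extension can be found by looking at their content rather than their name. The following is only a rough sketch: the function name `vcards_missing_extension` is made up for illustration, and it assumes that every exported vCard starts with a `BEGIN:VCARD` line (optionally preceded by a byte order mark or whitespace).

```python
import zipfile


def vcards_missing_extension(archive_path):
    """List archive members that look like vCards but lack a .vcf extension.

    Assumption: a vCard can be recognised by a leading BEGIN:VCARD line.
    """
    missing = []
    with zipfile.ZipFile(archive_path) as archive:
        for info in archive.infolist():
            # Skip directories and files that already have the right extension.
            if info.is_dir() or info.filename.lower().endswith(".vcf"):
                continue
            # Only the first few bytes are needed to recognise a vCard.
            with archive.open(info) as member:
                head = member.read(32)
            # Ignore a UTF-8 byte order mark and leading whitespace, if any.
            head = head.lstrip(b"\xef\xbb\xbf \r\n\t")
            if head.upper().startswith(b"BEGIN:VCARD"):
                missing.append(info.filename)
    return missing
```

Calling it on a Takeout zip returns the member names, such as the `wolf.nanaki` example above, that could then be extracted and renamed by hand.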

Regarding cropped file extensions:

In certain cases the archives will contain filenames with cropped file extensions. Especially image files from the Google+ Stream Photos type tend to be affected.

Here's a list of unique file extensions extracted from one of my last Takeout archives (and the command used to extract them):
`7z l -an -ai'!takeout-20181111T153533Z-GooglePlus-00*.zip' | ggrep -Ev '^Path = |Listing archive: ' | ggrep -E -o '\.([^. ]+)$' |sort -u`
.3gp
.CR2
.JPG
.PNG
.csv
.gif
.html
.ics
.j
.jp
.jpeg
.jpg
.json
.m4v
.mp4
.mpg
.nanaki
.net
.p
.pn
.png
.tiff
.vcf

As you can see, several png and jpg files got their file extensions cropped to `.pn` (or even `.p`), `.jp` (or even `.j`).
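For those without `7z` and GNU grep at hand, a similar extension list can probably be produced with Python's standard `zipfile` module. This is just a sketch and not the command I used: the function name `unique_extensions` is my own invention, and unlike the grep pattern above it does not skip extensions that contain spaces.

```python
import zipfile
from pathlib import PurePosixPath


def unique_extensions(archive_paths):
    """Return the sorted, de-duplicated file extensions found in the archives."""
    extensions = set()
    for archive_path in archive_paths:
        with zipfile.ZipFile(archive_path) as archive:
            for info in archive.infolist():
                if info.is_dir():
                    continue
                # Zip member names always use forward slashes.
                suffix = PurePosixPath(info.filename).suffix
                if suffix:
                    extensions.add(suffix)
    # Plain string sorting puts uppercase before lowercase, as in the list above.
    return sorted(extensions)
```

It could be called with something like `unique_extensions(glob.glob("takeout-*.zip"))`, where the glob pattern is only a placeholder for your own archive names.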

Example files:
`Takeout/Google+ Stream/Posts/FiXato with Pancake Helmet - By Jessica aka Maki.j`
`Takeout/Google+ Stream/Posts/GoogleSearch-AutocompleteHorror-IAccidentallyAte(5).p`

This again makes extracting just specific filetypes from the archive needlessly difficult, and makes it harder for operating systems to recognise file types without looking at the file header.
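Until this is fixed, looking at the file header is one way to guess what a cropped extension used to be. Below is a minimal sketch that only knows the PNG, JPEG and GIF signatures. The names `sniff_extension` and `repaired_name` are hypothetical, and since the header cannot tell whether the original extension was `.jpg` or `.jpeg`, it simply assumes `.jpg`.

```python
from pathlib import PurePosixPath

# Well-known leading bytes ("magic numbers") of a few image formats.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", ".png"),
    (b"\xff\xd8\xff", ".jpg"),
    (b"GIF87a", ".gif"),
    (b"GIF89a", ".gif"),
]


def sniff_extension(header):
    """Guess an extension from the first bytes of a file, or return None."""
    for magic, extension in SIGNATURES:
        if header.startswith(magic):
            return extension
    return None


def repaired_name(filename, header):
    """Return the filename with a cropped extension completed, if recognisable.

    The name is only changed when the current extension is a shortened
    form of the detected one, e.g. '.j' or '.jp' for a JPEG file.
    """
    detected = sniff_extension(header)
    if detected is None:
        return filename
    path = PurePosixPath(filename)
    suffix = path.suffix.lower()
    if suffix and suffix != detected and detected.startswith(suffix):
        return str(path.with_suffix(detected))
    return filename
```

Here `header` would be the first dozen or so bytes read from the extracted file. Files whose extension is already complete, including uppercase ones such as `.JPG`, are left alone.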

If filenames need to be cropped, I would suggest making sure the cropping doesn't affect the file extension, and takes characters from the end of the file name instead.
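To illustrate the suggestion, here is a small sketch of extension-preserving cropping. I don't know how Takeout actually crops names or what its limit is, so `crop_keeping_extension` and its `max_length` parameter are purely hypothetical, and the length is counted in characters rather than bytes.

```python
import os


def crop_keeping_extension(filename, max_length):
    """Shorten a filename to max_length characters without touching its extension."""
    if len(filename) <= max_length:
        return filename
    stem, extension = os.path.splitext(filename)
    # Number of characters of the name itself that still fit.
    keep = max_length - len(extension)
    if keep < 1:
        raise ValueError("max_length is too small to keep the extension")
    return stem[:keep] + extension
```

With a limit of 20 characters, `GoogleSearch-AutocompleteHorror-IAccidentallyAte(5).png` would become `GoogleSearch-Aut.png` rather than ending in a cropped `.p`.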

Also, I would very much appreciate it if archives included an index file that lists all the filenames that had to be cropped, along with a mapping between each cropped filename and its original filename, so I can programmatically restore the original filenames after expanding the archive.
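If such an index existed, restoring the names could be as simple as the sketch below. To be clear, Takeout does not provide this file as far as I know. The JSON format (an object mapping each cropped relative path to its original relative path) and the function name `restore_original_names` are assumptions made up for this example.

```python
import json
from pathlib import Path


def restore_original_names(extract_root, index_file):
    """Rename extracted files back to their original names.

    Assumes a hypothetical JSON index of the form
    {"cropped/relative/path": "original/relative/path", ...}.
    Returns the list of (cropped, original) pairs that were renamed.
    """
    root = Path(extract_root)
    mapping = json.loads(Path(index_file).read_text(encoding="utf-8"))
    renamed = []
    for cropped, original in mapping.items():
        source = root / cropped
        target = root / original
        # Skip entries that are missing or would overwrite an existing file.
        if not source.exists() or target.exists():
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        source.rename(target)
        renamed.append((cropped, original))
    return renamed
```

Entries whose cropped file is missing, or whose original name is already taken, are skipped rather than overwritten, so the function can be run again safely.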

(The extension list above also shows the `.nanaki` extension from my earlier Feedback regarding Contacts with a period at the end of their name. The `.net` extension apparently is also the result of the same flaw, namely a contact named "Esper.net")

UPDATE: Note, the Esper.net issue is actually not caused by the name of a contact, but by the name of a Group Label. The composite VCF file for all contacts in that group will lack the `.vcf` file extension.


#GoogleFeedback #Feedback #Bugreport #GoogleTakeout #Plexodus #GooglePlusExodus #PlexodusTools
