
Google Takeout isn't that bad... when compared to others


Flickr, for instance, gives you no control over which data you want to export. I recently requested an export of all my #Flickr data because of Flickr's policy change, which will take the free 1TB of storage away from the long-time members who made Flickr into what it was... (Yes, I'm salty)

First of all, it took a while before the export was ready. I requested it on Nov 7th, 2018 at 09:44, but it wasn't until Sat, 10 Nov, 14:51 that I got an e-mail saying my export was ready.

Secondly, all the download links are on your Flickr profile, but there's no new authentication challenge before you can download them. As long as you're still logged in, you can download the data. This means that if you left your account logged in somewhere, anyone with access to that session can download your entire data archive without being asked for a password again. A bit of a flaw imho.

The 'Account data' seems to include not only your profile information, but also the metadata for each photo. The two files are only about 17MB each for me, so I'm guessing they are split not to limit the archive file size, but to limit the number of files per archive.

Finally, all data is split up into small archives of less than 1GB each, by the looks of it. There's no choice of how large an individual archive can get before it is split into a new one. None of the links indicate how big the archive actually is either: you have to start downloading it to know how much space it'll take up.
This is how it's all presented on your Flickr Settings page:

Your Flickr Data is ready! It contains all of the information Flickr has for your account. The links will remain available until Nov 16th, 2018 at 9:44am.

Account data
Download zip file 1
Download zip file 2

Photos and videos
Download zip file 1
Download zip file 2
Download zip file 3
Download zip file 4
Download zip file 5
Download zip file 6
Download zip file 7
Download zip file 8
Download zip file 9
Download zip file 10
Download zip file 11
Download zip file 12
Download zip file 13
Download zip file 14
Download zip file 15
Download zip file 16
Download zip file 17
Download zip file 18
Download zip file 19
Download zip file 20
Download zip file 21
Download zip file 22
Download zip file 23
Download zip file 24
Download zip file 25
Download zip file 26
Download zip file 27
Download zip file 28
Download zip file 29
Download zip file 30
Download zip file 31
Download zip file 32
Download zip file 33
Download zip file 34
This data was generated on Nov 7th, 2018 at 9:44am

Having to download 34-36 archives separately sure is more cumbersome than downloading just 1-2 big ones...
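A rough sketch of how the separate downloads could be scripted, assuming (as one of the commenters below reports) that the archive links differ only in a trailing number and need no login. The URL template is hypothetical, so paste one of your own links and replace its trailing number with `{n}`. It uses Python's standard `urllib` rather than wget, and resumes a partial file through an HTTP Range request, which only works if the server honours it.

```python
import os
import shutil
import urllib.error
import urllib.request


def numbered_urls(template, count):
    """Expand a URL template with an {n} placeholder into URLs for parts 1..count."""
    return [template.format(n=n) for n in range(1, count + 1)]


def range_header(already_have):
    """Build the Range header that asks only for the bytes not yet on disk."""
    if already_have > 0:
        return {"Range": f"bytes={already_have}-"}
    return {}


def download_with_resume(url, dest):
    """Download url to dest, continuing a partial file when the server allows it."""
    have = os.path.getsize(dest) if os.path.exists(dest) else 0
    request = urllib.request.Request(url, headers=range_header(have))
    try:
        with urllib.request.urlopen(request) as response:
            # 206 means the server honoured the Range request, so append.
            # Any other success status means it sent the whole file, so overwrite.
            mode = "ab" if response.status == 206 else "wb"
            with open(dest, mode) as out:
                shutil.copyfileobj(response, out)
    except urllib.error.HTTPError as error:
        # 416 (Range Not Satisfiable) usually means the file on disk is already complete.
        if error.code != 416:
            raise


def main():
    # Hypothetical template: not a real Flickr URL, replace it with your own link.
    template = "https://example.com/flickr-export/photos_{n}.zip"
    for index, url in enumerate(numbered_urls(template, 34), start=1):
        download_with_resume(url, f"photos_{index}.zip")
```

Calling `main()` again after an interruption picks up each partial file where it left off, instead of starting the whole archive over.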

Photo filenames at least don't seem to be as heavily truncated as on Google, but everything is dumped into a single folder rather than separated by album. Filenames seem to consist of title, id, and an _o suffix (probably to indicate 'original').

So, while Google Takeout's data is far from perfect, you at least get more choice when it comes to data formats and what exactly you want to export. Google also makes it easier to sync those archives to other cloud services, and a bit safer to download them.

Something both Flickr and Google Takeout really should do is add checksum values for the archives, so you can verify that the archive(s) got downloaded without errors, without having to verify the contents of the archive.
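To illustrate what that check could look like if a service did publish checksums: a minimal sketch, assuming the published value is a SHA-256 hex digest (neither Flickr nor Google Takeout provides one, so the algorithm and the function names here are my own assumptions).

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, reading it in chunks to keep memory use low."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected_hex):
    """Compare a downloaded archive against a published hex digest (hypothetical value)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A mismatch would tell you to re-download that one archive, without having to unpack it first.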

#Feedback #GoogleFeedback #GoogleTakeout #Flickr #DataLiberation #DataLiberationFront #DataPortability #DataExport

Comments

  1. A few points: 1) There is no authentication necessary AT ALL to download these files. Anyone who knows their URLs can download them as many times as they wish.

    2) The archives are split so that each of them contains 500 photos. So the bigger your photo files are, the bigger the archives will be. From looking at your screenshot, I guess you have around 17000 photos on Flickr, right?

    3) The URLs of the ZIP archives only differ in the last number (which goes sequentially from 1 upwards). Because of this, and also because of point 1), it's trivial to write a script that downloads all your archives (53 in my case) using WGET, for example (with automatic resume when interrupted).

  2. František Fuka oh, I hadn't actually looked that closely at the URL structure yet. Thanks for the pointer. While not requiring authentication at all is a flaw imho, it will indeed make downloading all of them a lot easier.

    And yeah, that number of photos indeed looks about right.

  3. My criticisms notwithstanding, Google's data takeout / data liberation is particularly good among online services. It could be further improved, especially in documentation and management tools, but there's an opportunity to set industry standards here which Google can seize.

    The one superior option is revision control tools such as Git (used on platforms such as GitHub and GitLab), where the full content archive is simultaneously available locally and remotely, and can be synchronised at any time.

    There are reasons I find this model attractive.

  4. I'm looking at having to retrieve my Flickr photos as well. Thanks for the heads up.

    Don't want to derail the thread, but any recommendations for alternatives to Flickr?

  5. Tevel Drinkwater depends on your use case. Hobbyists might like DeviantArt.com, professionals might like 500px.

  6. Edward Morbius Another thing that would be extremely useful is incremental/delta archives; i.e. being able to download the changes since a specific date / your last archive. Indeed, scm is nice for that, though for binary data it's still not ideal.

  7. Filip H.F. Slagter Yes, especially if people are doing an early + late-sync move.

  8. Edward Morbius I hate having to download another 50GB every time I want to get a copy of my latest posts as I feel I can't trust Google not to kill posts or block people for arbitrary reasons from the platform.

    Even being able to exclude (or only include for that matter) media files would make a huge difference too.

  9. Filip H.F. Slagter The fact that account and post deletions and blocking mean new archives can't even be reliably used to replace older ones is another pain point.

  10. Ugh, Google Takeout's lack of a way to resume failed downloads is getting really annoying...
    I have tried downloading this 36GB archive 5 times now, but every time I get a 'network failure' two-thirds of the way through...

  11. Filip H.F. Slagter I need to move my list of questions/requests of Google to the #PlexodusWiki....

