
Guidance on G+ data migration from Filip H.F. Slagter



We're still working on this, but consider it a starting point for instructions based on current experience. We'll be refining it; testers and feedback are welcome.

Originally shared by Filip H.F. “FiXato” Slagter

DATA EXPORT WITH GOOGLE TAKEOUT
You can already export most of your data through Google Takeout at https://takeout.google.com, a cross-product service provided by the Data Liberation Front within Google, with the aim of making your data within Google portable.
You can export data from a lot of Google's products/services with it, though the number of services listed might be a tad overwhelming.

The data sets relevant to the Google Plus platform are:
★ Google+ Circles:
Exportable as:
· vCard (default, often importable into contact organisers and email clients),
· CSV (comma-separated values, very dense format, easily importable into spreadsheet apps such as Excel),
· HTML (The format websites are written in. Not easily importable, but might be handy for offline browsing)
I'd recommend getting a separate export in all 3 formats, as they each serve different purposes. Though if you had to pick only one, I'd say pick CSV as it's probably the most versatile format, and could fairly easily also be converted into vCard. It might not be as human-readable, but if you wish to import it into something else, it'd arguably be the most useful.
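To give an idea of how small such a CSV-to-vCard conversion can be, here is a minimal sketch in Python. The column names ("First Name", "Last Name", "Profile URL") are assumptions made for illustration; check the header row of your own Circles export and adjust them. The sample row is made up as well.

```python
import csv
import io

# Hypothetical column names; check the header row of your own export.
FIRST, LAST, URL = "First Name", "Last Name", "Profile URL"


def escape(value):
    """Escape characters that have a special meaning in vCard text values."""
    return (value.replace("\\", "\\\\")
                 .replace(";", "\\;")
                 .replace(",", "\\,")
                 .replace("\n", "\\n"))


def row_to_vcard(row):
    """Turn one CSV row (a dict) into a vCard 3.0 entry."""
    first = (row.get(FIRST) or "").strip()
    last = (row.get(LAST) or "").strip()
    full = " ".join(part for part in (first, last) if part) or "Unknown"
    lines = [
        "BEGIN:VCARD",
        "VERSION:3.0",
        f"N:{escape(last)};{escape(first)};;;",
        f"FN:{escape(full)}",
    ]
    url = (row.get(URL) or "").strip()
    if url:
        lines.append(f"URL:{url}")
    lines.append("END:VCARD")
    # vCard lines are terminated with CRLF.
    return "\r\n".join(lines) + "\r\n"


def csv_to_vcards(csv_text):
    """Convert CSV text with a header row into concatenated vCard entries."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return "".join(row_to_vcard(row) for row in reader)


# Made-up sample data, not taken from a real export.
sample = (
    "First Name,Last Name,Profile URL\n"
    "Jane,Doe,https://plus.google.com/+JaneDoe\n"
)
vcards = csv_to_vcards(sample)
print(vcards)
```

For a real file you would read the text with open(path, newline="", encoding="utf-8") and write the result to a .vcf file.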

★ Google+ Communities:
Exportable as:
· HTML (default, but only suitable as human-readable format, for offline viewing by yourself)
· JSON (machine-readable format, most suitable for importing it into other services should they provide support for it, or to convert into other formats)
This data by itself isn't the most useful, as it only contains data about the communities themselves, and only those you created. Expect things like your community's name, its ID and profile links of your community's members. If you want more detailed info about your members, you'd have to use those profile URLs to extract more info through, for instance, the People API.
NEITHER OF THESE FORMATS CONTAINS THE ACTUAL POSTS FROM YOUR COMMUNITIES
Probably because the copyright of those posts belongs to the actual post submitters, and thus they technically aren't your data. Your own posts inside communities are included in the Google+ Stream data.
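As a starting point for pulling the member profile links out of the JSON export, here is a minimal Python sketch. I'm not assuming any particular field names here: it simply walks whatever structure the file has and collects every string that starts with the plus.google.com address. The sample structure and its keys ("name", "members", "profile") are made up for illustration; a real export will look different.

```python
import json

PROFILE_PREFIX = "https://plus.google.com/"


def collect_profile_urls(node, found=None):
    """Recursively collect strings that look like Google+ links."""
    if found is None:
        found = set()
    if isinstance(node, dict):
        for value in node.values():
            collect_profile_urls(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_profile_urls(item, found)
    elif isinstance(node, str) and node.startswith(PROFILE_PREFIX):
        found.add(node)
    return found


# Made-up structure for illustration only; not the real export layout.
sample = json.loads("""
{
  "name": "Example Community",
  "members": [
    {"profile": "https://plus.google.com/+MemberOne"},
    {"profile": "https://plus.google.com/+MemberTwo"}
  ]
}
""")

for url in sorted(collect_profile_urls(sample)):
    print(url)
```

Because it matches on the address prefix only, it will also pick up any other plus.google.com links present in the file, so you may want to filter the result afterwards.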

★ Google+ Stream:
All the posts, photos, events, collections and other activities you have posted onto Google Plus.

You can select subsets of this data (for instance if you only want all your photos from before Photos was ripped out of G+ and turned into the standalone service Google Photos), or all of it.

Personally I'd select all except the ActivityLog, as that currently seems to be consistently generating errors, especially with regard to Comments, PlusOnes and Reshares.
You can then more easily see if there were export errors with the rest of your Stream data, and separately export the ActivityLog afterwards.

Exportable as:
· HTML (default, again mostly suitable for offline viewing, but not really for conversion and/or import into other services)
· JSON (personally recommended, even if you don't currently have plans for converting or importing your data elsewhere. It is the machine-readable format and carries more metadata than the HTML export.)
You need to individually set the formats for each of the data subsets.

★ +1's
This dataset, at the top of the Takeout list, isn't really related to Google+. It is rather a list of bookmarks, served as an HTML file, of the websites on which you've clicked the +1 button through one of Google's widgets.

While I wrote that JSON is the most portable machine-readable format, it does still rely on import tools that have yet to be written / provided to import the data into other platforms. These tools will likely not be developed by Google, but rather by the new platforms, or by other former Google Plus users who develop (and share) their own tools for such tasks.
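To illustrate what such a tool would start from, here is a minimal Python sketch that reads a single exported post and summarises it. The field names (content, resharedPost, url, author, displayName, profilePageUrl) are taken from the sample reshare shown in the comments below; the summary layout and the function name are my own invention, and real posts may carry more or different fields.

```python
import json


def summarise_post(post):
    """Return a small dict describing a post and, if present, what it reshares."""
    summary = {"content": post.get("content", ""), "reshare_of": None}
    reshared = post.get("resharedPost")
    if reshared:
        author = reshared.get("author", {})
        summary["reshare_of"] = {
            "url": reshared.get("url"),
            "author": author.get("displayName"),
            "author_profile": author.get("profilePageUrl"),
        }
    return summary


# Placeholder values, modelled on the sample in the comments.
sample_post = json.loads("""
{
  "content": "My own text on the reshare",
  "resharedPost": {
    "url": "https://plus.google.com/+OriginalPoster/posts/IdOfOriginalPost",
    "author": {
      "displayName": "OriginalPoster",
      "profilePageUrl": "https://plus.google.com/+OriginalPoster"
    },
    "content": "Text of the original post"
  }
}
""")

summary = summarise_post(sample_post)
print(summary)
```

A post that isn't a reshare simply comes back with reshare_of set to None, so the same function can be run over every post file in the export.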

When doing an export, I recommend hitting 'Select None' first, and then selecting just the Google+ related products. This data alone will likely be quite a lot already, and will probably take quite some time to generate and download.

After selecting which data you want to export, you will need to make a choice regarding the archive format (Zip or TGZ) and its maximum filesize.
Archives can be split over multiple volumes, with a maximum size per volume of 1, 2, 4, 10 or 50 GB.
Archive formats are:

★ Zip (default, recommended). Probably the most well-known format, especially for Windows users. Recent versions of Windows should have no issue opening these archives, and versions since Vista should also have no issue with the Zip64 sub-format that will be used when archives are bigger than 2 GB. macOS users might need third-party apps such as Pacifist or The Unarchiver, or command-line tools such as p7zip or ditto, to extract Zip64 archives. From personal experience this format also supports non-US-ASCII filenames (e.g. Japanese characters, accented or otherwise modified letters such as å, æ, ø, or Hebrew or Greek script).
★ TGZ. A gzipped tarball, also known as .tar.gz. Commonly used in the *NIX world. Depending on the compression algorithm it can have a slightly smaller file size, though I haven't actually checked if Google Takeout uses any compression at all.
It shouldn't have issues with archives larger than 2 GB. However, in my experience this format does not play well with non-US-ASCII filenames on extraction: they get replaced with either question marks or incorrect byte sequences, which can also cause cropped filename extensions. YMMV.
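If you want to know which files in an archive are worth double-checking after extraction, a minimal Python sketch like the one below lists the entries whose names contain non-US-ASCII characters. It uses only the standard zipfile and tarfile modules; the function names and the demo file names are made up, and the demo builds its own tiny Zip file rather than using a real Takeout archive. Note that it only shows the names as Python reads them from the archive, which isn't necessarily what your extraction tool will write to disk.

```python
import os
import tarfile
import tempfile
import zipfile


def archive_member_names(path):
    """Return the names stored in a .zip or .tgz/.tar.gz archive."""
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as archive:
            return archive.namelist()
    # "r:*" lets tarfile detect the compression (gzip or none) itself.
    with tarfile.open(path, "r:*") as archive:
        return archive.getnames()


def non_ascii_names(names):
    """Pick out the names containing characters outside US-ASCII."""
    return [name for name in names if not name.isascii()]


# Demo with a throwaway archive and made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.zip")
    with zipfile.ZipFile(demo_path, "w") as archive:
        archive.writestr("Takeout/plain.json", "{}")
        archive.writestr("Takeout/blåbær.json", "{}")
    flagged = non_ascii_names(archive_member_names(demo_path))

print(flagged)
```

For a real export you would point archive_member_names at the downloaded file and compare the flagged names with what ended up on disk after extraction.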

I hope this tutorial is helpful to everyone. This really is information Google should've already been providing to us themselves though.

Comments

  1. I might make a short video recording of the process as well at some point.

  2. For the Google+ Stream one, do reshared posts count?

  3. Alvaldong I think it includes the reshares you've made, but not the ones others made of your original post. I'd have to double check this though.

  4. Alvaldong okay, just looked at the JSON file of one of my reshares of someone else's post:
    {
      "content": "This contains the HTML-formatted text I put in my reshare",
      "resharedPost": {
        "url": "https://plus.google.com/+OriginalPoster/posts/IdOfOriginalPost",
        "author": {
          "displayName": "OriginalPoster",
          "profilePageUrl": "https://plus.google.com/+OriginalPoster",
          "avatarImageUrl": "url-to-original-poster's-profile-photo",
          "resourceName": "users/numeric-id-of-original-poster"
        },
        "content": "HTML-formatted text of the original post",
        "resourceName": "users/numeric-id-of-original-poster/posts/IdOfOriginalPost"
      }
    }


    It only contains the comments that my reshare received.

  5. Can an app be developed to access our G+ accounts and save the data in a manageable, accessible compressed file format? Or why can't Google make a patch to save directly to my Google Drive? It's like the least they can do.

  6. Google Takeout can store the exported archive directly on your Google Drive. It will count towards your storage quota though.

  7. Re tar.gz: someone pointed out recently that it does not gracefully handle certain characters in filenames. So this compression format should only be chosen if you know what you're doing and have a specific need for it. TL;DR: If you don't already know all about that, you probably don't have any reason to choose tgz instead of zip.

