
A reminder: REQUEST JSON FORMAT for Google Data Takeout



Google HTML is absolutely worthless.

The instructions don't make this clear, but you're going to see this advice repeated repeatedly and repetitiously. With great frequency. Repeatedly.

REQUEST JSON FORMAT for Google Data Takeout

Why? Because the JSON format can be used by tools for extraction and import. Tim Berners-Lee is building one for Solid, and there will be others.
https://github.com/solid/solid-takeout-import (not yet usable, but in process).

REQUEST JSON FORMAT for Google Data Takeout

Why? Because the JSON format contains additional, useful, and critical fields for extraction and import to other sites and tools.

REQUEST JSON SSSSSSSSCCCCCCccccrrrrraaaaaattttcccch......



Tap.
Tap.

Is this thing on?

Hey folks, we interrupt this beat because I hear that you there, no, no, next to you, there, looking at your phone. Yeah, you, well, uh, I hear that you're thinking:

"I don't know what JSON is."
"I don't know what to do with JSON".
"Heck, I don't plan to take any data out of Google. I'm gonna just #DoTheDataWalkaway."

Hey, cat, like, that's totally cool. I get it.

I had a girlfriend who was all into that minimalism stuff. Wicked cool design ethic. But every time I wanted to fry an egg, I had to buy a new frying pan, 'coz she'd, like, donated the old one to charity or something? But she's moved on, and I like totally respect her and stuff.... Where was I? Oh yeah.

But, y'see, we're not all like that. And some of us do know what to do with a JSON data extract. And we know that what it does is to make the information we want to use more accessible wherever else we want to use it. And there's a lot of folks who don't know this yet or are confused because, frankly, Google have messed this stuff up in how they do and talk about it. And those people need to know. And so when you're like "all, whatever man" out here on the dance floor, well, it kinda actually fucks shit up and stuff. So like, if you don't mind, please don't. Not here, anyhow?

And even if we don't know what JSON is, or how to use it, what we know is that there will be tools created. Maybe by Google (we hope). Maybe by third party sites (and they're doin' just that). Maybe by a close personal friend. Maybe by a particularly talented housecat. But it's gonna happen, and JSON is the way.

So. Go back to the phone. Do your important phone stuff. We'll leave you alone. We're cool with that. And let this be the message to the people doing the #GoogleDataTakeaway.

Ya still wanna tawk about it? Start a new thread.

Hip?

Let's get this BEAT BACK ON!!!
Let's get this BEAT BACK ON!!!
Let's get this BEAT BACK ON!!!


REQUEST JSON FORMAT for Google Data Takeout

Why? Because Google's generated HTML is an ugly bastard stepchild of HTML that's not actually useful even as HTML.

REQUEST JSON FORMAT for Google Data Takeout

Why? Because you'll give yourself far more options and far fewer headaches down the road.

REQUEST JSON FORMAT for Google Data Takeout

Why? Because even if you can't make heads or tails of the output, the tools likely to be developed for intake to where you want the data to go will.

REQUEST JSON FORMAT for Google Data Takeout

Why? Because your friendly neighborhood hackers (and Space Alien Cats) can hack something together using 'jq' and 'awk' (or Python, Ruby, Perl, Go, ...) if all else fails.

REQUEST JSON FORMAT for Google Data Takeout

Why? Because it's what you actually want.

REQUEST JSON FORMAT for Google Data Takeout

REQUEST JSON FORMAT for Google Data Takeout

REQUEST JSON FORMAT for Google Data Takeout

(We need to have memes for this.)


Where?

Here: https://takeout.google.com/settings/takeout

Comments

  1. Is there also an option to delete your account? Also where are these options in the app?

  2. sanjuro ogawa Please start a new thread with your questions, thanks.

    The Google Data Take-Out URL is: https://takeout.google.com/settings/takeout

  3. Btw Edward Morbius I'm currently still verifying it, but it's likely the recommendation should also include using Zip rather than GZipped Tarball, as it looks like tar doesn't support non-ascii characters in filenames. Zip should support it, but I haven't been able to test it yet. Still downloading a new archive to do so.

  4. Andy Cowling JSON format already exists for G+ takeout. You just have to select it, because the default is HTML.

  5. Filip H.F. Slagter Thanks.

    I'm starting to address the Data Takeout page on #PlexodusWiki. First thing I've done is protect it (any registered user can edit) as a defence against malicious edits (prime phishing target), something I'd already been aware of.

  6. I'm also going to be patrolling this particular post heavily. Sidelining will be (and has been) deleted.

    Helpful suggestions (Filip H.F. Slagter is on point) are appreciated. Others ... really aren't. This is my One Daily Hardass post.

  7. Sallie Alys Montuori Ah yes. I see it now. Thanks. I wonder if this will work better than the HTML export that reported errors but refused to tell me precisely what they were.

  8. Andy Cowling depends on the kind of errors you get. These ones I got were rather unhelpful. Especially since the stated "Click any of the names to be taken directly to the item in question, and you can download it from there." doesn't seem to apply to these errors I got.
    https://lh3.googleusercontent.com/ibn6LbmJ0wm1Uzbm3AmHvM-kKncrLYz-6q69Mw7huXmjT6GBMdorEF726HB_rKHdbPWKPbSI_DvXbpZ7D51RCc74BTsFAXuDGBlp=s0
    https://lh3.googleusercontent.com/xidMmbzf7Klxmy10B0cAqe1KG_xtLVmzDbuUzNl5ETWl1tLnlYrUGSqSzvAplnohcZEp3dfg8RzM0rfSNT7W28dD_GZ4sP2MmpNn=s0
    https://lh3.googleusercontent.com/XQ6LPZSa7ezWnAxz50fu4sp1e0daKswYoF_aXsj2YV_-w0z6WDtZJJTP4D2h2mr_G3_hERbA_JiCr6zOyWCNF_4b36bS1zKd8B9P=s0
    https://lh3.googleusercontent.com/wrazOZDZf_BoMD80PzEt_CR7JudBNLFfkKMfEUd9XNslNjV7d6k2A4w2YMvS1UMP1E-5to8kYyrSqcZyR_Salr_k2t94b47aVbR-=s0
    https://lh3.googleusercontent.com/G2-I2HhcG-vVffoa-iKXyMUj8P2SkrLw28F4VDV1wcg8jXtiV2vA5giMNPGJh61Lce3WgreVKcaW5Oz8NlK-Ws169sS8UuH-wBBP=s0
    https://lh3.googleusercontent.com/OAhNd-xTpe0_ANK5fe2vOI1_cSPnciIDCCTbKDX56PYBfND9SW9UQwBI9m2nzJhMLlhlolf24KWVv8JNWNL03gZ06uzw-wGq3a31=s0
    https://lh3.googleusercontent.com/YPmKWK9nPkErgpbls3loWWyHc4EXZuq2ocCiQN8PM7YRoi2rckbpHRdhHzqNm1d8MZnoOCIyYPcBaYmwu7_emDVfUmBBylZyDXFN=s0
    https://lh3.googleusercontent.com/B2_pMdzZAIyFAT-ELtX0flXQBDYVfaxXZkYAN8UBk18fp9b5wdjOrDFPLbo0L4dGJJRhlqPzd-m42orbZfj8Mam-oSDdCEDcC-Xf=s0
    https://lh3.googleusercontent.com/OcM0kX5NLAaboHx8x_u9Aag_VCmd-OG8GacGX9KWdcEBX37BQRY1CWBQh0SMD63aVtHzmMQtFzU0SdaurNVzLYlWxk3--olvh1dw=s0
    https://lh3.googleusercontent.com/zGYNlnXi141fesSdy_nPUGlpYrf0KMTWv-F3ZxC0A0WpadrshkdGIP1BKDX9mOGhUBXib3UltbpwkXlXcD99xeMjk_tzaY9qSD5W=s0
    https://lh3.googleusercontent.com/nHQrZ2VqNuZpiy8QNkIJVdwNDTtGyHGih1Ha6_XnvZsPsbHNzUJtNpWCKbWaCtO43046wX2zskuMd9gOHRXMiqbX5bYRpU6mjkSK=s0
    https://lh3.googleusercontent.com/TJz4mWSLcRhogxIluocs9jVOUvOZQY3v9SD4_LVWJfKX0lYjnFWG4NIYThiZ3yl956dw4he-QmF0YzzHPjcXQQDfUAkKer6rq6gU=s0
    https://lh3.googleusercontent.com/iITZmPqtymBik62nJczgEH7iB9r-kAIatQi6XDpuY_itnOMSYGKXU7dblGjJg0UBXSFnQUEGEyUmyLKS5dAfzsur1axz5RvEK_8a=s0
    https://lh3.googleusercontent.com/rSDkgL_nNkncNucdSAmVDu5BmPRAjbl5SPqA1LKTqwZl6Ge4S0yuSFihGDUb-f3JbDZMHt1RyVcQJSnr8-5jUxJ3v5D8Vy7PDaVV=s0
    https://lh3.googleusercontent.com/KE2andbesCmz0Be6SlB0bSk-wNAsIDQ9oXInvwJnrAtnDjDjrMzkmJXUZThq_6MdpD51OJP8miUhNaosyDeyKmBsvtvO01wGMPlf=s0
    https://lh3.googleusercontent.com/kFjVBPCrcLDAk6hUCQTo6HrEpUR4P8r0Gb4rEwMiWVewcePcG-oqAXsxCbdDzz_0IZa5aKPYcwJn0uXjiKJeBx1PoYAJgGqTSi8k=s0
    https://lh3.googleusercontent.com/Jko6VdZlNEF5BwCf835QEy32EMV0KafC53kKaFHtujzbY0nAzfFA80s_hhgqJzCL30BCtlqizKreeAY5Nv9yaZg6zT0jAj-GZZxy=s0
    https://lh3.googleusercontent.com/UYe1qLrSnatBN2rcEPgE8vnMWLP9IPUdlC7iHa5rvowKnlaxpI-xWvfkxQ3_GxlAFpss6Kb0ZyutBHbygFmlkJaTqH2-BGC6RuBl=s0
    https://lh3.googleusercontent.com/SY-LCu-p2a8ycTQr3tqMCAbVYHGj3zXxHu547dObg3SC3UFPvXjzAO4u5TGn06XrU9a2CIgFAo9L0taWzEzioA4KDlvjPwrAPs3b=s0
    https://lh3.googleusercontent.com/6EM7y7f6IkppWlN1PA0aENQX4VGHZWmvx48LSNTOI3XmyeBBwVo54wKMp55KhwtLnIQGy0NfYtQXY5FReZcVUZX_BaXkSKPZjnbK=s0
    https://lh3.googleusercontent.com/BgMaQUIkYluGsH6n0ZQK6xd8dT03xE5F_mN6xnZOKay93MSbiCoXT-wTYA7dXj_eNhhSs5nMfet6fru698ZjUaqWBVsnybkayJnJ=s0

  29. Andy Cowling Suggestions on improving error reporting would be appreciated (mention here and file as feedback to Google).

    Creating useful and yet understandable error reports for a userbase of 3 billion+, in a range of languages, and levels of computer and language literacy, is difficult.

  30. Still, the question remains: what do I do with that cool format? Can I search and filter the data? Is there some kind of browser?

  31. Olivier Malinur Ideally, there will be tools for that provided.

    Otherwise there will be tools for that made.

    Regardless, the JSON data are structured to assist in precisely what you're requesting.

    If you are commandline-savvy, the 'jq' (json query) utility allows search and filter, as well as output. I think you might be up for that, and the Linux / programming / Mac folk here should generally be able to manage. Nontechnical users will be over their heads.
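
For anyone who'd rather not learn jq, roughly the same search idea can be sketched in Python. This is only an illustration, not a finished tool: `load_posts`, `contains_text`, and `search` are made-up helper names, and it assumes the extracted Takeout archive is a directory tree of UTF-8 `.json` files.

```python
import json
from pathlib import Path


def load_posts(root):
    """Yield (path, parsed JSON) for every readable .json file under root."""
    for path in sorted(Path(root).rglob("*.json")):
        try:
            with open(path, encoding="utf-8") as handle:
                yield path, json.load(handle)
        except (OSError, ValueError):
            # Skip unreadable or malformed files rather than aborting the search.
            continue


def contains_text(value, needle):
    """Recursively check whether any string inside value contains needle."""
    if isinstance(value, str):
        return needle.lower() in value.lower()
    if isinstance(value, dict):
        return any(contains_text(item, needle) for item in value.values())
    if isinstance(value, list):
        return any(contains_text(item, needle) for item in value)
    return False


def search(root, needle):
    """Return the paths of all JSON files that mention needle anywhere."""
    return [path for path, data in load_posts(root) if contains_text(data, needle)]
```

Something like `search("Takeout", "frying pan")` would then list the files worth opening. The match is case-insensitive and looks at every string in the file, keys excluded.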

  32. Edward Morbius '3 billion+'? Are you serious? The population of Planet Earth is c. 7.5 billion. Are you seriously suggesting that 50% of all humans use/used G+? If so, stop taking the drugs and wake up.

  33. Edward Morbius Facebook, which is widely acknowledged as slightly more popular than G+ is estimated to have 2.2 billion active users in Q2 2018.

  34. One thing to note about using Zip as archive format is: "Zip files larger than 2 GB will be compressed in zip64. Older operating systems may not be able to open this file format. There are external applications that can be used to uncompress zip64 files."
    One notable example of an Operating System with shoddy zip64 support is macOS. Not sure if the latest version of macOS has support for it yet, but it's definitely something you'd want to look into before downloading a 50GB archive ;)
    If you do want to handle a zip64 archive on macOS, I would suggest using p7zip from Homebrew:

    Install Homebrew: brew.sh - Homebrew
    Install p7zip through homebrew: brew install p7zip
    List all files from your command line (Terminal.app or iTerm2.app for instance): 7za l /path/to/archive/file
    Extract file Takeout/index.html from Takeout.zip: 7za e Takeout.zip Takeout/index.html

  35. Andy Cowling 3.3 billion is based on profile counts from Google+ Sitemap files as verified by both myself and ... a now-offline website that I've referenced in the past.

    You might (or might not) be interested to know that I'm the same Edward Morbius who'd estimated active G+ users in January of 2015 (Stone Temple Consulting expanded the methods, sample size, and results a few months later). I found ~4-6 million active accounts, based on public posting activity, and yes, far fewer than Facebook's actives. But at the time, 2.2 billion with a B profiles. That expanded to about 3.3 billion the last time the data were publicly available.

    https://ello.co/dredmorbius/post/naya9wqdemiovuvwvoyquq

    STC's confirmation:
    https://www.stonetemple.com/real-numbers-for-the-activity-on-google-plus/

    I think +Greg Miernicki's site is offline now, but if you Wayback that, you should see the 3.3 billion number as of ~2016, possibly early 2017:
    http://plus.miernicki.com/

    Here we are, May, 2017, 3.358 billion profiles:
    https://web.archive.org/web/20170514033738/http://plus.miernicki.com:80/
    3,358,581,784 Google+ accounts
    5,418,727 Google+ communities

    Or you can grab Google's sitemap files and count profiles yourself. It was 25 GB last time I did that.

    But yes, I'm serious.

    The profiles are based on all Google registrations which didn't specifically exclude Google+, at a time when very nearly all did. The bulk are probably Gmail + Android registrations, and there are absolutely dupes in there. But the raw count is indeed in the billions.

    Check who you're talking to next time. And please keep the conversation on-track, respectful, and productive.
  36. Okay Edward Morbius, it indeed looks like it is wise to recommend against using tgz as archive format due to lacking support for non-ascii filenames. Zip seems to handle those a lot better. I've been able to extract json files, which retained their utf-8 filenames. On both macOS El Capitan and W10 at least.

    Example from my G+ Photos archives:
    .tgz: Kafe__ Belgie__ - beertasting - 06.jpg.metadata.json
    .zip: Kafé België - beertasting - 06.jpg.metadata.json

    Commandline command (after installing p7zip from Homebrew) used to extract just the json files into the current directory:
    `7z x /path/to/takeout-2018archive.zip '*.json' -r`
    (The quotes around '*.json' keep the shell from expanding the wildcard before 7z sees it.)
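
If p7zip isn't an option, the same "only the JSON files" extraction can be sketched with Python's standard `zipfile` module, which reads zip64 archives. The function name `extract_json_members` and the paths are hypothetical, and whether non-ASCII filenames come out intact still depends on how the archive records them.

```python
import zipfile
from pathlib import Path


def extract_json_members(archive_path, destination):
    """Extract every *.json member of a Zip archive, keeping its directory layout."""
    destination = Path(destination)
    extracted = []
    with zipfile.ZipFile(archive_path) as archive:
        for info in archive.infolist():
            if info.is_dir() or not info.filename.lower().endswith(".json"):
                continue
            # extract() returns the path it wrote; collect it for the caller.
            extracted.append(Path(archive.extract(info, destination)))
    return extracted
```

Calling `extract_json_members("/path/to/takeout-2018archive.zip", ".")` would mirror the 7z command above, recreating the archive's folders under the current directory.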

  37. Olivier Malinur If all you want to do is view your historical data, then HTML is perfectly fine, and will load in any browser. But if you have any hopes of porting it elsewhere on the internet, then JSON is the way to go.

    I'm hoping to be able to port mine either to my Dreamwidth account or a WordPress site. I don't think my diaspora* account will be a suitable place to dump 7 years worth of old posts.

  38. Edward Morbius Wow! If this was included and presented to the major news media, the negative publicity might make Google rethink their G+ August close date.


    (BTW, do you have a daughter that looked like Anne Francis in that scifi movie?) 😁

  39. Edward Morbius Compelling, scientific, factual based evidence. I guess you must be slightly disappointed then that the GMM community has only managed to amass 2,800 members out of that 3 billion.

  40. Andy Cowling But there's so much potential out there!

  41. Mike Waters Altaira bears an uncanny resemblance to Ms. Francis.

  42. Keep coming back once in a while until August. It can't be reiterated enough: there WILL be several tools for various types of take-out, from just circle contacts to posts, etc.

  43. Also useful to note:
    You might not think you need anything other than HTML right now, but your plans might change in the future. Better to get the more detailed backup while you still can.

  44. then I extract back to HTML in my own desired structure, categorised using metadata and tags from json['content'], onto my own website, optionally with a static version of the comments and plus-one data. While doing that, I can also decide to convert the data to RSS and/or import (some of it based on keywords) into Blogger or other platforms.

    It's not that it's impossible with the HTML format; I could probably do it through scraping with Nokogiri. It's more that I've seen the HTML soup that is commonly created by Google. Garbled soup with obfuscated classes and IDs I'd rather not feed to a parser, nor would like to try to figure out.
    While XML might be a structured data format, albeit more bulky than needed, HTML imho is first and foremost a markup language aimed at turning the data into something human readable rather than machine readable.
    So, when I want multiple systems to be able to handle the information, I'd rather pick the format that's actually designed for the task.
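
To make the JSON-to-HTML direction concrete, here is a rough Python sketch of rendering one post. It is an illustration only: `render_post` is a made-up name, the `content` key comes from the comment above, and `author.displayName`, `creationTime`, and `comments` are taken from the jq filters elsewhere in this thread. Treating `content` as plain text (and escaping it) is an assumption; it may well hold markup you'd rather keep.

```python
import html


def render_post(post):
    """Render one post dict as a small HTML fragment (all text is escaped)."""
    author = (post.get("author") or {}).get("displayName", "Unknown author")
    parts = [
        "<article>",
        "  <h2>" + html.escape(author) + "</h2>",
        "  <time>" + html.escape(post.get("creationTime") or "") + "</time>",
        # Assumption: 'content' is handled as plain text, so markup in it is escaped.
        '  <div class="content">' + html.escape(post.get("content") or "") + "</div>",
    ]
    comments = post.get("comments") or []
    if comments:
        parts.append('  <ol class="comments">')
        for comment in comments:
            name = (comment.get("author") or {}).get("displayName", "Unknown author")
            text = comment.get("content") or ""
            parts.append("    <li><b>" + html.escape(name) + "</b>: " + html.escape(text) + "</li>")
        parts.append("  </ol>")
    parts.append("</article>")
    return "\n".join(parts)
```

The point is less the markup than the direction: every field is addressed by name, so no scraping of obfuscated classes and IDs is involved.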

  45. Filip H.F. Slagter Those of us who can code might well do the same. And clearly, having the JSON copy gives you options later. So even if you can't do anything with it now, you've got the copy for later when you or other people can do something with it. But for the vast majority of people JSON is useless to them without somebody else's code. And that's what I was really trying to get at.

    Have a look at the HTML output. The index.html is a horrible mess of javascript. But the individual HTML files aren't too bad. It wouldn't be completely horrible to drop the section into the section of an Atom file. Or as a guide to recreate some html.

  46. Julian Bond People who can't code can be given tools by those who can.

    And it remains easier to create usable HTML from JSON than it is to do so from (most) Google HTML.

  47. Filip H.F. Slagter And how can we do that?

  48. Either learn to code, or wait until others like me, who have started tackling the same issues, have finished and shared their code / instructions / tools.

  49. I feel the need for a list of code that needs writing.

  50. You could get a long way by using the command-line tool `jq` to extract just the data you want from the JSON files, and then using something like JsRender (https://www.htmlgoodies.com/beyond/javascript/display-json-data-using-the-jsrender-template-engine.html) to display it as HTML.

  51. I'm actually working on a jq library first now, to make it easier to aggregate all posts, and filter them by various criteria.
    What I've got working so far, are the following filters:

    def withComments:
    [.[]|.comments as $comments| select($comments != null and ($comments|any))];

    def withImage:
    [.[]|select(.media != null and .media.contentType != null and (.media.contentType|startswith("image/")))];

    def withVideo:
    [.[]|select(.media != null and .media.contentType != null and (.media.contentType|startswith("video/")))];

    def withAudio:
    [.[]|select(.media != null and .media.contentType != null and (.media.contentType|startswith("audio/")))];

    def withMedia:
    [.[]|select(.media != null)];

    def withoutMedia:
    [.[]|select(.media == null)];

    def isPublic:
    [.[]|select(.postAcl.visibleToStandardAcl.circles[0]["type"] == "CIRCLE_TYPE_PUBLIC")];


    with this snippet saved in ~/.jq/library/googletakeout.jq, I can now do:

    $ jq -s 'map(.)' takeout_archive_2018/**.json | jq -L$HOME/.jq/library 'include "googletakeout";.|isPublic|withComments|withVideo' > all_public_activities_with_comments_and_video.json

    to store all my public posts that have comments and videos in a combined json file at all_public_activities_with_comments_and_video.json
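
For readers more at home in Python than jq, the three filters used in that pipeline might look roughly like this. It's a loose translation rather than the library itself: the function names are invented for the sketch, and the key names (`comments`, `media.contentType`, `postAcl.visibleToStandardAcl.circles`) are copied from the jq definitions above.

```python
def with_comments(posts):
    """Keep posts that have at least one comment."""
    return [post for post in posts if post.get("comments")]


def with_media_type(posts, prefix):
    """Keep posts whose media.contentType starts with prefix, e.g. 'video/'."""
    kept = []
    for post in posts:
        content_type = (post.get("media") or {}).get("contentType") or ""
        if content_type.startswith(prefix):
            kept.append(post)
    return kept


def is_public(posts):
    """Keep posts whose first standard-ACL circle is the public circle."""
    kept = []
    for post in posts:
        acl = (post.get("postAcl") or {}).get("visibleToStandardAcl") or {}
        circles = acl.get("circles") or []
        if circles and circles[0].get("type") == "CIRCLE_TYPE_PUBLIC":
            kept.append(post)
    return kept
```

Chained the same way as the jq pipeline, `with_media_type(with_comments(is_public(posts)), "video/")` would give the public posts that have both comments and video.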

  52. ** should be a single asterisk, but Google's formatting escaping sucks...

  53. Filip H.F. Slagter Thanks. I've never coded in js.

  54. Mike Waters this actually isn't javascript, but jq's own language. :)
    stedolan.github.io - jq

  55. New additions:

    def withInteractionWith(displayNames):
    [.[]|select(..|.displayName?| IN(([displayNames]|flatten)[]))];

    def withCommentBy(displayNames):
    [.[]|select(.comments|..|.author?.displayName?| IN(([displayNames]|flatten)[]))];

    |withInteractionWith(["Person A", "Person B"])
    would filter the results to only contain posts that have some kind of interaction with a user with the displayName "Person A" or "Person B"

    |withCommentBy(["Person A", "Person B"])
    is the same, but then limited to comments by any of those users.

    Once I've got a nice set of library functions, I'll put the code on Github along with more detailed documentation and examples.
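
The recursive `..` walk in those two filters is the interesting part, so here is how the same idea could be sketched in Python. The names `iter_values`, `with_interaction_with`, and `with_comment_by` are invented for illustration, and `with_comment_by` is simplified to look only at each comment's direct `author.displayName` rather than recursing.

```python
def iter_values(node, key):
    """Yield every value stored under key anywhere inside nested dicts and lists."""
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                yield value
            yield from iter_values(value, key)
    elif isinstance(node, list):
        for item in node:
            yield from iter_values(item, key)


def with_interaction_with(posts, display_names):
    """Keep posts mentioning any of display_names under a 'displayName' key."""
    wanted = set(display_names)
    return [
        post for post in posts
        if any(isinstance(v, str) and v in wanted for v in iter_values(post, "displayName"))
    ]


def with_comment_by(posts, display_names):
    """Keep posts that have a comment written by any of display_names."""
    wanted = set(display_names)
    return [
        post for post in posts
        if any(
            (comment.get("author") or {}).get("displayName") in wanted
            for comment in post.get("comments") or []
        )
    ]
```

As with the jq version, matching is by exact display name, so two people sharing a name can't be told apart this way.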

  56. Updated my jq library a bit more, making it work with the ActivityLog (which now also properly exports at Google Takeout!) as well:

    def not_empty:
    select(length > 0);

    def withComments:
    [.[]|.comments as $comments| select($comments != null and ($comments|any))];

    def withImage:
    [.[]|select(.media != null and .media.contentType != null and (.media.contentType|startswith("image/")))];

    def withVideo:
    [.[]|select(.media != null and .media.contentType != null and (.media.contentType|startswith("video/")))];

    def withAudio:
    [.[]|select(.media != null and .media.contentType != null and (.media.contentType|startswith("audio/")))];

    def withMedia:
    [.[]|select(.media != null)];

    def withoutMedia:
    [.[]|select(.media == null)];

    def isPublic:
    [.[]|select(.postAcl.visibleToStandardAcl.circles[0]["type"] == "CIRCLE_TYPE_PUBLIC")];

    def withInteractionWith(displayNames):
    [.[]?|select(..|.displayName?, .authorDisplayName?| IN(([displayNames]|flatten)[]))]|not_empty|unique;

    def withCommentBy(displayNames):
    [.[]?|select(.comments?|..|.author?.displayName?| IN(([displayNames]|flatten)[]))]|not_empty;

    def urlFromDomain(domains):
    [.[]|select(..|.url?|sub("^https?://(?<domain>[^/]+).*"; .domain)?| IN(([domains]|flatten)[]))];

    def sort_by_creation_time:
    sort_by(.creationTime);

    def sort_by_update_time:
    sort_by(.updateTime);

    def sort_by_last_modified:
    sort_by_update_time;

    def sort_by_url:
    sort_by(.url);

    def sort_activity_log_by_ts:
    sort_by(.timestampMs);


    Still have to write documentation and put it in a public Git repo though.
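
As a rough Python counterpart to `urlFromDomain` and `sort_by_creation_time`, one could lean on `urllib.parse` instead of a regex. This is a sketch under assumptions: the helper names are invented, hostnames are compared in lowercase without port numbers (slightly different from the regex above), and `creationTime` is assumed to be a string that sorts correctly as text.

```python
from urllib.parse import urlsplit


def host_of(url):
    """Return the lowercase hostname of url, or '' if it has none."""
    try:
        return urlsplit(url).hostname or ""
    except ValueError:
        return ""


def iter_urls(node):
    """Yield every string stored under a 'url' key anywhere in the structure."""
    if isinstance(node, dict):
        for name, value in node.items():
            if name == "url" and isinstance(value, str):
                yield value
            yield from iter_urls(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_urls(item)


def url_from_domain(posts, domains):
    """Keep posts containing at least one URL hosted on one of domains."""
    wanted = {domain.lower() for domain in domains}
    return [post for post in posts if any(host_of(u) in wanted for u in iter_urls(post))]


def sort_by_creation_time(posts):
    """Return posts ordered by their creationTime string, oldest first."""
    return sorted(posts, key=lambda post: post.get("creationTime") or "")
```

So `sort_by_creation_time(url_from_domain(posts, ["example.com"]))` would give the posts linking to that site in chronological order, provided the timestamps all share one format.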

  57. https://github.com/FiXato/Plexodus-Tools will be the location for further updates to my jq library and other Google Takeout related tools.
  58. Update notice here as well: Added a bunch of filters and documentation over the past couple of days. Most notably support for filtering by access control lists and getting unique data lists.
    Also added/updated data structure documentation at https://github.com/FiXato/Plexodus-Tools/blob/master/activity_data_structure.md
  59. Filip H.F. Slagter Care to submit that as a post if you haven't already?

    (Edited to say what I meant.)

  60. Edward Morbius did you mean post of its own, rather than a comment? I'm waiting with that till I've added the Ruby portion that parses the JSON files into something useful, to the project.

  61. Filip H.F. Slagter yes, that's what I meant (not what I said).

  62. Edward Morbius I hope to have working JSON processing code in by the end of the week. :)
