Code that needs writing

- Takeout.G+Streams.Posts -> Blogger
- Takeout.G+Streams.Posts -> Atom
- Takeout.G+Streams.Posts -> WordPress
- Takeout.G+Streams.Posts -> Reddit
- Takeout.G+Streams.Posts -> Other platforms that have an import or post API
- Takeout.G+Streams.Posts -> Static HTML, as a better alternative to that provided by Google
- Takeout.G+Streams.Posts.html -> Extract sections to individual files
- Takeout.G+Streams.Circles -> Enhanced vCard/CSV with additional data via the G+ API people.get
- Takeout - fix the filenames to deal with UTF-8 characters
- Find My G+ Contacts on platform XXX, like https://bridge.joinmastodon.org/

Maybe Google will provide some of this, like Atom output and one-click migration to Blogger. More likely, the community will cobble something together from the takeout files and APIs.
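
As a taste of what that cobbling together might look like, here is a minimal sketch of the Posts -> Atom item in Python, using only the standard library. The archive path and the JSON field names ("url", "title", "creationTime", "content") are guesses from eyeballing takeout files, not a documented schema:

```python
# Sketch: turn a directory of Takeout post JSON files into one Atom feed.
# Field names and the archive path are guesses, not a documented schema.
import glob
import json
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
ET.register_namespace("", ATOM)

def el(parent, tag, text=None, **attrs):
    """Append a namespaced Atom element, optionally with text content."""
    node = ET.SubElement(parent, "{%s}%s" % (ATOM, tag), attrs)
    if text is not None:
        node.text = text
    return node

feed = ET.Element("{%s}feed" % ATOM)
el(feed, "title", "Google+ takeout archive")
el(feed, "id", "urn:example:gplus-takeout")    # placeholder feed id
el(feed, "updated", "2019-01-01T00:00:00Z")    # placeholder date

for path in glob.glob("Takeout/Google+ Stream/Posts/*.json"):
    with open(path, encoding="utf-8") as f:
        post = json.load(f)
    entry = el(feed, "entry")
    el(entry, "id", post.get("url", path))
    el(entry, "title", post.get("title", "(untitled)"))
    el(entry, "updated", post.get("creationTime", ""))  # must be RFC 3339
    el(entry, "content", post.get("content", ""), type="html")

ET.ElementTree(feed).write("gplus.atom", encoding="utf-8", xml_declaration=True)
```

Atom is fussier than this in practice (the feed-level id and every updated value must be real RFC 3339 timestamps, and comments and media are ignored entirely), but it shows the rough shape of the job.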

Comments

  1. Wouldn't JSON-LD be the thing?
    Points 1-5 could probably all import it; see e.g. code.sgo.to - Feeds JSON-LD

  2. Sebastian Lasse I'm quite in favour of the idea of a feed format and schema to rival RSS and Atom, but in JSON. But people have been trying to do this for quite a while now and nothing's really got traction. RSS, and especially Atom, have a huge advantage: there's loads of it out there and loads of tools and libraries for using it. So I'd suggest that G+ migration is not the time to be building new formats.

  3. But see the link above. It is not a new format; it is valid JSON-LD with encapsulation. JSON-LD is not a new format, it has been around for a long time and is recommended by the W3C. I'm just trying to be realistic, because officially evil Google probably won't give a shit about our data ;) We could simply use that JSON to convert to the suggested formats more easily. BTW: saw your page. Maybe this is FYI:
    indieweb.org - Events - IndieWeb
    just put in https://indieweb.org/2019/Vlissingen

  4. I know the code that can pull the data in Python (for example crummy.com - Beautiful Soup Documentation — Beautiful Soup 4.4.0 documentation), but I don't know the code that can create the new data structures once that's done.

    If you find something in Python that does this, you can use Beautiful Soup to grab the content out of the HTML, and possibly out of the JSON too, though there are likely native methods for that in Python.
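
    A minimal sketch of the Beautiful Soup half of that, assuming a Takeout post saved as post.html. The file name and the class name are placeholders, so inspect the real Takeout markup and substitute whatever Google actually emits:

    ```python
    # Sketch: pull the body of one Takeout post out of its HTML file.
    # "post.html" and "main-content" are placeholders, not real Takeout names.
    from bs4 import BeautifulSoup

    with open("post.html", encoding="utf-8") as f:
        soup = BeautifulSoup(f, "html.parser")

    body = soup.find("div", class_="main-content")  # hypothetical class name
    if body is not None:
        print(body.get_text(" ", strip=True))  # plain text version
        print(body.decode_contents())          # inner HTML, for re-posting
    ```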

  5. How does Atom deal with media assets? My assumption is that any media reference has to resolve to a globally accessible URL, which would mean that all the assets included in the export archive would first need to be hosted somewhere.

  6. Bernhard Suter There are a lot of references in the JSON and HTML to media assets on xxx.googleusercontent.com. This is the same location that Google Photos uses, so I think it's highly likely that content from your profile that counts towards your storage allocation and is stored on that domain will stick around.

  7. Julian Bond - googleusercontent.com is probably just part of the static resources CDN. I have done a combination of uploading and sharing from the Google Photos app & website, and I don't know which of those resources are still linked there. I would guess that images that are not accessible from Photos will disappear in accordance with user data erasure rules.

  8. Bernhard Suter Doesn't Google host photos in two or three places? I'm always confused by that. There was some service; they used it in Google+ for a little bit, then it was gone. I'm never sure if I have a picture only here on Google+ or if it's in some "Google image service". Sigh.

  9. John Lewis Yes. And there's actually a takeout service for the photos you stored in G+ Photos.

  10. The generic problem here is the large number of references in the takeout files to assets and URLs within the G+ system. If G+ goes away completely and those assets and URLs disappear, the takeout files will have major holes in them. Even something as comparatively minor as profile avatars.
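
    One defensive move while the assets are still up is to scan the whole takeout tree for googleusercontent.com links and mirror them locally. A sketch in Python, assuming nothing about the archive beyond "JSON and HTML files containing URLs":

    ```python
    # Sketch: mirror every googleusercontent.com asset referenced in the
    # takeout files before the originals can disappear. The URL pattern and
    # the file types scanned are assumptions, not a documented layout.
    import hashlib
    import pathlib
    import re
    import urllib.request

    URL_RE = re.compile(r"""https://[\w.-]*googleusercontent\.com/[^\s"'<>)]+""")
    out = pathlib.Path("mirrored_assets")
    out.mkdir(exist_ok=True)

    seen = set()
    for path in pathlib.Path("Takeout").rglob("*"):
        if path.suffix.lower() not in (".json", ".html"):
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for url in set(URL_RE.findall(text)) - seen:
            seen.add(url)
            dest = out / hashlib.sha1(url.encode()).hexdigest()
            try:
                urllib.request.urlretrieve(url, dest)
            except OSError as exc:
                print("failed:", url, exc)
    ```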

  11. Ok, I'm a little confused here: what code are you looking for? I just joined the community, and my main goal is first the extraction of an entire community, including comments by people, the date each post was made, and everything associated with each post.

    I'm not sure Google Takeout actually allows that. Mine are not the only posts; my wife has them as well. It's a family community. I just need a way to extract the data, and I can write the other code (even extraction code).

  12. Communities are a real problem at the moment. Only community owners get anything, and all they get is a list of people and a list of post URLs. That's not enough to get the posts and comments via the API, because the activities.get call requires an activityId, which is different from the URL shortcode, and there's no way of going from one to the other.

    Good luck!
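
    To make the gap concrete, this is roughly the call you would make if you had the id. The endpoint is the real plus/v1 activities.get, but the key and the id here are placeholders:

    ```python
    # Sketch: fetch one post via the Google+ REST API (plus/v1 activities.get).
    # It wants the opaque activityId; the community takeout only gives you a
    # post URL shortcode, and there is no documented mapping between the two.
    import json
    import urllib.parse
    import urllib.request

    API_KEY = "YOUR_API_KEY"  # placeholder, created in the Google API console
    ACTIVITY_ID = "z13..."    # placeholder id, NOT the URL shortcode

    url = ("https://www.googleapis.com/plus/v1/activities/"
           + urllib.parse.quote(ACTIVITY_ID, safe="")
           + "?key=" + API_KEY)
    with urllib.request.urlopen(url) as resp:
        activity = json.load(resp)
    print(activity["title"], activity["published"])
    ```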

  13. Garry van Diggele - if your community is very small and all the members trust you, you could ask them to individually download their own post archives (in JSON!) and share them with you, e.g. through Google Drive. From there, you can filter and merge all the posts that are part of the community and reconstruct the entity relations within that community (a sketch of that merge is below). BTW, what is going to be the target format? A static archive (HTML, PDF) or some new interactive platform?
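
    A sketch of that filter-and-merge step; the "postAcl"/"communityAcl" field names and the folder layout are guesses from inspecting archives, not a documented schema:

    ```python
    # Sketch: merge several members' post archives, keeping only the posts
    # shared to one community. Field names here are guesses, not a schema.
    import json
    import pathlib

    COMMUNITY = "Our Family"  # hypothetical community display name
    merged = {}               # keyed on post URL to de-duplicate shared posts

    # "archives" is assumed to hold one folder of Takeout JSON per member.
    for path in pathlib.Path("archives").rglob("*.json"):
        post = json.loads(path.read_text(encoding="utf-8"))
        acl = post.get("postAcl", {}).get("communityAcl", {})
        if acl.get("community", {}).get("displayName") != COMMUNITY:
            continue
        merged.setdefault(post.get("url", str(path)), post)

    posts = sorted(merged.values(), key=lambda p: p.get("creationTime", ""))
    pathlib.Path("community.json").write_text(
        json.dumps(posts, indent=2), encoding="utf-8")
    ```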

  14. It's small enough that I could get most of the members' data. The community was an interactive diary for my kids since they were born, with about 10 posts per week and family member comments, including some from members who have passed away.

    The target state would be my own social network server, in either GCP or AWS, hosting the content to continue the trend. I haven't decided which open source platform I'm moving to. I was considering Mastodon... I think. The goal is to give it to my kids when they are older.

  15. The post archive, does it include comments from other people and the state, i.e. who liked it, etc.?

  16. Julian Bond surely there has to be a relationship you can extract... Else hopefully they enhance it before shutting down.

  17. Garry van Diggele - Yes, the post archive includes the full comment content and an identity reference for the commenter and the "plus-oners". At least for now... See here for an attempt to reverse-engineer the schema of the post data: social.antefriguserat.de - Data Migration Process and Considerations - PlexodusWiki

  18. That's a seriously good start. It's late here, so I haven't read it all, but with that information I may be able to build something...

    I think I'd store the contents in a MongoDB or something... Then I can worry about the migration later, including the complex image handling component... Maybe serialise... Who knows. Anyhow, a good start for when I'm awake.

  19. Garry van Diggele - I am planning to move my publicly visible posts to diaspora. blog.kugelfish.com - Google+ Migration - Part I: Takeout

  20. It may be obvious from the above, but beware: Takeout of your own posts (including those into communities) is pretty complete and includes comments. Takeout of posts from a community is VERY incomplete and consists only of post URLs.

  21. Yep. How about a takeout of someone else's? I can actually get access to most of the accounts, as they are mostly family members.
