After (re)creating 5 wordpress blogs and one blogger blog, I finally think I have something I can work with. I'm still not sure exactly how I got there, but I think I have a good guess.
The first attempt was blogger. "Google+ Exporter" (G+E) (does it still work?) didn't have direct support for wordpress (wp) in its earlier versions, so I did blogger. That worked pretty well, but left all the media (mainly photos) hosted on Google's content distribution network.
G+E quickly produced updates that 1) gave a means to save most of the media and 2) exported directly in wp format. I ran it again and generated the export files.
Then I created a wordpress.com free blog and imported my g+ stream. This is at "mosqueeto.wordpress.com". It's not clear where all the content is being held, but it appears that most of it is still on google's servers. The key may be in the import plugin that loads the export file -- you need to check the box that indicates that media and attachments will be downloaded, otherwise the media is just represented by links. However, that import is done, and I don't want to redo it.
Plus, I would rather have a self-hosted blog anyway. Next I tried bringing up a self-hosted blog on my home network -- not visible to the outside world, but I figured I would learn from the experience. (I've installed wp for other people many times in the past, but I always just left it to them when it was done.) This time I did check the correct box on import, so the media library on the site was full of media, and the posts all referred to the local media, not Google's copies of it. And even better, somehow by magic, the files in the "wp-content/uploads" directories were saved under their original names instead of the hashed-up google CDN names: something like "IMG-00532.JPG", instead of "lh3.googleusercontent.com/-pPecSpqcHwg/W-sPqwZrPfI/AAAAAAAA_mA/2lJqFgAk4PUF9cjsJiLpIkJ91YxYn2pIgCJoC/s640/p4200784a.jpg". I don't know how this magic happened, but I'm grateful for it.
After fiddling with this private blog for a while it was time to build something somewhere reachable. There are many hosting companies that provide wp environments, but once again I went with Digital Ocean (do), which has a "one-click wordpress droplet" thingie. It was pretty painless to build. I did an export from my private blog and then did an import into the do instance. Everything went very well, so it seemed, until I tried to access my new blog from outside my home network. It seems that all the media imports failed because the urls pointed to my private blog, which the new server couldn't reach, so the importer just left the urls as they were. And they worked when I was viewing things from home! So I was blithely worrying about aesthetics and themes because everything seemed to be working, while in fact the media wasn't working at all.
And, to add frosting to the cake, it turned out that I had negligently selected a relatively massive host size, with a cost of $40 a month, instead of the skimpy host I had in mind at $5/month.
So I created a new host (droplet) in a more affordable configuration (still completely adequate, mind you), and rebuilt the blog once again.
Except for a minor error that required me to build yet another host and import the blog again... The problem of dealing with a hidden blog was solved by tarring up all the media, uploading it into space reachable by the new server, and then editing the import file to alter all the urls to point to self.
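The tarring step described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name and the paths are hypothetical, and nothing here is taken from the commands actually used in the migration. It packs every file under an uploads directory into a gzipped tar, keeping the year/month layout relative to that directory.

```python
import tarfile
from pathlib import Path


def archive_uploads(uploads_dir: str, archive_path: str) -> int:
    """Pack every file under uploads_dir into a gzipped tar.

    Returns the number of files added. archive_path should live
    outside uploads_dir so the archive does not try to include itself.
    """
    root = Path(uploads_dir)
    count = 0
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                # Store names relative to the uploads root,
                # e.g. "2019/01/IMG-00444.JPG".
                tar.add(str(path), arcname=path.relative_to(root).as_posix())
                count += 1
    return count


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    added = archive_uploads("wp-content/uploads", "media.tar.gz")
    print(f"archived {added} files")
```

Unpacking the archive on the new server into a web-served directory then gives the importer somewhere reachable to fetch the media from.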
This all sounds horrendously miserable, but in fact all the real time was spent fiddling with aesthetics, which is actually fun. My last three rebuilds took about three hours. Overall, it was worth it because I learned a great deal about the internals of wp and how it is structured.
Thanks for this info, Kent Crispin! Very useful 😊
Marysia Kurowski I should point out that the process is far from perfect, and a great deal of hand editing is required :-(
I am guessing you actually have fast Internet. I am dreading how long it will take G+E to download all of my G+ posts over the 2Mbps (on a good day) DSL connection I am stuck behind.
With the original August shutdown date, it was possible I would have our fiber optic to the premises network built, but we won’t even have started final construction by April.
Thanks for this write up.
What kind of things required hand editing?
Brian Holt Hawthorne F+MG+E has no internal rate limiting and is now being hit with Google's "prove you aren't a robot" CAPTCHA. As of 1.7.4, the current version, it doesn't display the CAPTCHA while downloading posts (only on login), which makes it almost completely useless.
Your slow DSL connection may make it work sufficiently reliably for you. I'd suggest starting sooner rather than later though!
I opened a WordPress blog (stevedisque.wordpress.com), and, while I'm getting used to operating it, I must say that the interface is not really intuitive, no matter what the technicians at WordPress may think!
Since I'm not really interested in migrating any of my "content" from G+ anywhere -- most of my posts represented what I call "professional ephemera" -- I see no need to use the Google-owned Blogger, especially as Google is cavalier about just closing down sites that it decides violate Terms of Service.
I also now have an Instagram (to promote the blog) and a Goodreads. But it won't be the same as G+.
Kent Crispin
Thanks Kent.
This pretty much mirrors my experience. If nothing else F+MG+E allows me to get my content saved elsewhere in a readable format until I can make a decision of what I really want to do going forward.
Can someone please tell me what "F+MG+E" stands for? I've searched in the community, which brings back a lot of results, but I can't find an explanation. I'm also seeing "G+MM" and "F+MGE" in my search results and am now totally confused. I'm guessing that these are all different, but perhaps related, apps (or perhaps all the same app) for downloading/uploading one's Google+ content? If there's an explanation of all these various collections of initials in the community, could someone please point me at the link?
I do know about the Google+ Exporter, though I haven't used it yet.
Marysia Kurowski
They are all terms used for gplus-exporter.friendsplus.me - Google+ Exporter
And it works great!
Thank you, Andrew Hatchett. In that case I'm puzzled by statements in other posts such as, "G+MM are officially recommending the Friends+Me Google+ Exporter". Got it! Google+ Mass Migration, duh!
Update: the 1.7.5 release of F+M G+E today uses Tor. It definitely goes slower as a result (by an easy order of magnitude, though I haven't timed before and after), but it made progress again.
The "Google+ Exporter" ('exporter'), depending on options selected, produces output files in different formats. The ones of interest to me are 1) a wp dump file (an xml file with no included media -- only text, structure info, and links to media); 2) a blogger dump file with, I believe, the same general characteristics; 3) an archive file with most of your media encased within (if you unzip the archive you find all the pictures it could get as individual files); and finally 4) a csv file with information about each image.
The links in the xml file are generally urls for objects in google's network, which may or may not exist after G+ shuts down. Likely as not they will be gone forever, so even if the file names the exporter produces are meaningless to you, at least you will have copies.
Since the xml files contain no media they actually download surprisingly quickly.
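One rough way to see how many links in an export still point into google's network is to tally the host names of every url in the file. The following is a minimal sketch in Python, assuming the export is plain text that fits in memory; the function name and the regular expression are illustrative and are not part of the exporter or of wp.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Deliberately loose pattern: a url runs until whitespace, a quote,
# an angle bracket, or a square bracket (so CDATA endings are excluded).
URL_RE = re.compile(r"https?://[^\s\"'<>\[\]]+")


def count_url_hosts(xml_text: str) -> Counter:
    """Count how many urls in the export text point at each host."""
    hosts = Counter()
    for url in URL_RE.findall(xml_text):
        try:
            host = urlparse(url).hostname
        except ValueError:
            # Skip anything the parser rejects as malformed.
            continue
        if host:
            hosts[host] += 1
    return hosts


if __name__ == "__main__":
    # Hypothetical file name, for illustration only.
    with open("export.xml", encoding="utf-8") as f:
        for host, n in count_url_hosts(f.read()).most_common(10):
            print(f"{n:6d}  {host}")
```

A large count for a host such as "lh3.googleusercontent.com" would suggest the media was never copied and is still only linked.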
eric peacock The hand editing I referred to was of the xml files. They contain very long lines, and your text editor needs to deal with those gracefully. I needed to change all occurrences of things like "http:///blog/wp-content/uploads/2019/01/IMG-00444.JPG" to "http:///tmp-media-directory/2019/01/IMG-00444.JPG" -- I had put all the files in the old wp uploads directory into an archive, uploaded the archive to a directory "tmp-media-directory" on the new server, and made sure the new web server could serve them up alongside my new empty wp installation. Thus, when I did the import, wp pulled the media files from the local web service.
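For anyone whose editor chokes on the long lines, the same substitution can be scripted instead of done by hand. Here is a minimal sketch in Python, assuming a plain prefix replacement is enough. The host names in OLD_PREFIX and NEW_PREFIX are hypothetical placeholders (the real ones are not given above), as are the function names and file names.

```python
from typing import Tuple

# Hypothetical prefixes: substitute the real old and new hosts.
OLD_PREFIX = "http://old-blog.example/blog/wp-content/uploads/"
NEW_PREFIX = "http://new-blog.example/tmp-media-directory/"


def rewrite_media_urls(xml_text: str, old_prefix: str, new_prefix: str) -> Tuple[str, int]:
    """Return the rewritten text and the number of urls that were changed."""
    changed = xml_text.count(old_prefix)
    return xml_text.replace(old_prefix, new_prefix), changed


def rewrite_export_file(src_path: str, dst_path: str,
                        old_prefix: str = OLD_PREFIX,
                        new_prefix: str = NEW_PREFIX) -> int:
    """Write an edited copy of the export; the original file is left alone."""
    # newline="" keeps the file's own line endings untouched.
    with open(src_path, encoding="utf-8", newline="") as src:
        text = src.read()
    new_text, changed = rewrite_media_urls(text, old_prefix, new_prefix)
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(new_text)
    return changed


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    n = rewrite_export_file("export.xml", "export-edited.xml")
    print(f"rewrote {n} urls")
```

Writing to a separate output file keeps the original export around in case the prefixes turn out to be wrong and the edit has to be redone.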
I believe that if you check the "download media and attachments" box while doing the wp import, the actual content can be pulled from the google distribution network, including somehow regenerating the original names under which the files were uploaded. Google must preserve those -- otherwise I don't know how they could have been regenerated. I was very glad of that, because in most cases I still have the original files, and it would have been work to correlate that info.