
Single parameter left blank by Google caused Aussie superannuation fund's cloud subscription to expire and be automatically deleted

AI-generated image of a facepalm (Google Gemini)

Google has revealed the issue behind the Unisuper cause célèbre earlier in May, in which the Australian superannuation fund's cloud presence was deleted without warning, leading to a lengthy service outage.

In a post-mortem of the incident, an unnamed blog author blamed the deletion on an internal tool used in 2023 to set up Unisuper's Google Cloud VMware Engine (GCVE) private cloud.

While Google says its operators followed internal control protocols, a single parameter for the internal tool was left blank when the private cloud was provisioned. The system interpreted this as setting Unisuper's cloud to expire after one year.

"Once the expiry date was reached, that was it: Unisuper's cloud was set to be automatically deleted. Furthermore, there was no warning to the customer because of the internal Google tool being used for the deployment of the Private Cloud."

Making matters worse, while Unisuper said it had duplicated the cloud instance across two geographies as protection against outages and loss, the expiring subscription meant this too was deleted. An instance in this context is IT industry jargon for an occurrence of a virtual computer system running in one or more cloud data centres.

Unisuper had migrated to Google Cloud from Microsoft Azure and two of its own data centres last year.

The accidental deletion of Unisuper's cloud instance made the news all over the world, and Google Cloud chief executive Thomas Kurian had to front up for the misconfiguration and apologise.

Recovery from the cloud instance deletion was difficult, taking days of round-the-clock effort from engineers. Thanks to diligent backups by Unisuper, and a robust and resilient architecture designed by the company, data loss was minimal.

To avoid a repeat of the disastrous incident, Google said it has deprecated the internal tool that caused "this sequence of events" and fully automated capacity management, removing the need for human intervention.

Google Cloud was at pains to say the auto-deletion of Unisuper's cloud is not a systemic issue and that no other GCVE deployments are at risk. The system behaviour that set Unisuper's cloud to be automatically deleted has also been corrected, Google Cloud said.


21 Comments

"Trust us" they said, "Your data will be safe in the cloud". But when you understand what the "cloud" actually is and how it is managed, then you will realise the safest storage is your own hard drives, backed up and firewalled of course!


Yes indeed Murray.  There is no such thing as "The Cloud" other than a couple of words to deceive you into thinking there is.

What happens is your stuff, your work, is on someone else's computer.  No more, no less.


Google is doing some garbage work at the moment.

Look at their cack-handed rollout of AI search results in the States. There are numerous documented instances of Google giving flat-out wrong information on everything from who scored the most home runs in a given year's baseball season, through to dangerous health answers (the sort of thing that would be classed as "misinformation" if the average pleb claimed the same).

For example, I've seen Google

  • Telling people that drinking several liters of your own piss a day is a way of curing kidney stones (in fact there was an entire thread I saw on Reddit providing examples of various conditions that Google's AI thinks can be cured through urine consumption)
  • Suggesting "jumping off a bridge" as a treatment for depression
  • Claiming that it's healthy for women to smoke up to a few cigarettes a day during pregnancy 

Although it reaches into so many aspects of our lives, I think Google's just another example of a business that was great once but has lost its way, is trading off past reputation and will never reclaim its former glory.


It’s an absolute disaster. They are giving their prize-winning search engine the Google+ treatment in the name of “progress”.

Absolute madness given this is their bread and butter; SGE will be the final nail.

They can’t launch anything successfully these days…

Stadia anyone?


Agreed. I don't think they have really made anything good/innovative in quite some time. 

Somebody just linked me another example of a Google SGE/AI result where it claims leaving your dog in a hot car is safe, because the Beatles wrote a song about leaving your dog in a hot car.

Two problems - firstly, it's never ok to leave a dog in a hot car, and secondly, the Beatles never wrote/performed any song of that nature ... it's a lesser-known internet joke.

However, there are way too many people out there who blindly trust whatever Google tells them/provides them, and "nek minute" you've sous-vided your family hound on the basis of a Google AI recommendation. 


https://www.bbc.com/future/article/20240524-how-googles-new-algorithm-w…

However, experts and more than half a dozen media executives and websites owners told the BBC the broad trends in the data are all too real. In place of these sites, there's one platform you'll be seeing much, much more of: Reddit. According to SEMrush, Reddit saw a surge that amounted to a 126% growth in traffic from Google Search. The company is already feeling the benefit. Reddit just announced its first quarterly earnings since becoming a publicly traded company in March 2024. Its revenue totals $243m (£191m), up an eye-watering 48% from the year prior.


Good piece that.


I've noticed the increased proliferation of Reddit content (as well as Quora and a general 'resurrection' of old forum/message board threads in Google Search results, often from many years ago). Without wanting to risk "doxxing" myself, a not insignificant part of my job is somewhat dependent on keeping up-to-date with this sort of thing - I hadn't seen your shared article before though, so thank you for a good read.

I'm torn on the good vs bad of Google prioritizing sites like Reddit in search results (well, for now, until they too can just be replaced with AI responses). On one hand, it reflects the way that I've personally found myself searching more and more - specifically looking for Reddit results as they can be a lot more succinct and useful than generic content on blogs/websites that is designed to generate ad revenue. On the other hand, there can be so many outdated responses, and a lot of the content is now gated behind a login requirement (Reddit claims it's to shield you from NSFW/adult content, but really it's to force you to create an account and inflate user metrics).

It has primarily been info and review-type searches that have been impacted, e.g. "air new zealand skycouch review", or "best places to see in Rome" or "how to stop your dog from barking" ... content that is easily creatable by even basic AI tools, so Google has kind of been forced to take a nuclear approach to combat the overwhelming weight of this AI spam, and has chosen to go down the path of prioritising anything that is clearly user generated content on forums/discussion boards whether small or Reddit-sized. 

The problem is that in doing so the algorithm update has killed off a lot of good, independent publishers producing vastly superior content to what Reddit can provide ... they've just committed the egregious sin of wanting to profit off that. There's also now the problem of marketers, business owners etc registering for Reddit, and then assaulting any top-ranking/viewed thread with responses that link back to their own products and services (check any old Reddit post that shows in Google and sort by 'new' and you'll see what I mean).

Then you get to the BIG ISSUE, which is: what exactly is Google's AI being trained on? E.g. with the example I gave above of Google's AI suggesting a depressed person 'jump off a bridge', the actual quote in the AI answer was 'according to a Reddit user, you should try jumping off a bridge'.

On a side note, Reddit's ad network is astoundingly garbage for most purposes, and both first-hand and from other sources I've got a strong suspicion that a huge amount of the activity on it is bot-driven/fraudulent traffic that unwitting advertisers wind up paying for.


"content that is easily creatable by even basic AI tools"

I imagine the underlying LLM is simply using data collected directly from sites such as Reddit. There's no magic review writing happening other than a very fancy predictive text service. So it can't review AirNZ on its own, but it can quickly summarise 1000 reviews for you from many different sources.

The one thing I'm curious about, and you've covered it a little in the comment above: with Google search results there are a number of hoops a website must jump through in order to "score" higher, and therefore be treated as a more trustworthy source of information. With this content likely vectorised, what happens with an LLM is that a result is scored on how "close" it matches the question - but then how can two pieces of content, abstracted from their sources and reduced to vectors, be weighed against each other for trustworthiness and reliability? The fact that Reddit scores so highly, with such varied trust across the platform, suggests that currently whichever source provides the most content wins the race, rather than whichever content is correct.
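For what it's worth, the "closeness" scoring described above is typically plain cosine similarity between embedding vectors. A minimal sketch in Python, assuming the query and documents have already been turned into vectors by some embedding model (not shown here):

import numpy as np

def cosine_similarity(a, b):
    """Score how close two embedding vectors point (1.0 = same direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def rank_by_closeness(query_vec, doc_vecs):
    """Rank documents purely by vector closeness to the query.

    Nothing here knows whether a document is trustworthy or correct --
    only whether it is semantically similar to the question.
    """
    scores = [(name, cosine_similarity(query_vec, vec)) for name, vec in doc_vecs.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

Nothing in that scoring separates a correct answer from a popular wrong one, which is exactly the trust problem raised above.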


Long story short, Google's algorithm has historically determined who ranks at the top for any query (this may all be subject to change with SGE / AI) on the basis of two key factors:

  • Content relevance to search
  • Trustworthiness/authority of the domain/website

The first part is easier to understand. For example, let's say I search for "Air New Zealand premium economy review". Google searches its index of websites/content to find results that it feels best match the nature of the query. Basically, which pages/websites that we (Google) know exist talk about "Air New Zealand" and "Premium Economy" and "Review" - or similar words - on the same page. Way back in the day this process was more simplistic (i.e. Google depended more heavily on very clear use of whatever keyword/phrase you had searched). These days Google has a better general understanding of context, e.g. Google might return a result that doesn't specifically mention 'Air New Zealand premium economy review' every second sentence but instead uses more natural language that communicates the same meaning.

People who do "SEO" will manipulate the specific wording of content and other factors Google looks at (e.g. page metadata) to try and make the page/site as clearly about a specific topic as possible as Google is still 'lazy' and wants to find an easily-understandable result.

The second part is more complex. If you only ever looked at content relevance, from a trustworthiness-of-results perspective you'd have the issue that anybody can just create content that looks plausible and is well-optimised for Google to understand. In other words if I'm better at making content for Google than you are - even if you are a subject matter expert - who is Google to believe? This, in fact, is the big issue that spurred on Google's algorithm change that prioritizes Reddit etc ... because with AI I could literally make in 5 minutes a 1000 page website that portrays the perspective of a seasoned travel/airline reviewer (all perfectly 'optimised' to increase Google's chance of understanding and ranking the content) even if I had never set foot on a plane before.

How to solve this problem? For a long time, Google has done this primarily by looking at links pointing to websites. Basically, if other websites are linking to your site/content, then those links act as 'votes of confidence' that the linked site knows what it is talking about. There are additional factors e.g. in certain verticals (such as health) displaying clear credentials, but linking is the big one. That's why sometimes you Google something and you get a result from a site like TradeMe even when TradeMe doesn't fulfill your request ... because it has enough content relevance for Google to consider it, and then TradeMe has such a strong, authoritative domain that it can rank easily. 

What Google has done with Reddit, forums etc is basically prioritise signals that would indicate user-generated content (e.g. the website is clearly a 'forum' or 'discussion board', or even at a domain level, e.g. with Reddit), and then that combines with those two key points above.
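As a very rough sketch of those two factors combined (not Google's actual algorithm; the weights, boost value and function names are all invented), the idea looks something like this in Python:

import math

def relevance(query_terms, page_terms):
    """Crude content relevance: fraction of query terms the page covers."""
    return len(query_terms & page_terms) / len(query_terms) if query_terms else 0.0

def authority(inbound_links):
    """Crude authority: inbound links act as 'votes of confidence'. A PageRank-style
    system would also weight each link by the authority of the site casting the vote."""
    return math.log1p(inbound_links)

def score(query_terms, page_terms, inbound_links, is_user_generated=False):
    """Combine relevance and authority, with an optional boost for forum/UGC pages."""
    ugc_boost = 1.5 if is_user_generated else 1.0  # the Reddit/forum preference described above
    return relevance(query_terms, page_terms) * authority(inbound_links) * ugc_boost

Loosely speaking, the recent algorithm change reads like that user-generated-content boost being turned up for anything that looks like a forum.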

 


Yep, it's a 'large language model', not actually AI. It cannot tell the difference between fact and fiction; if you think of the crap people post on Reddit and then find out they have been 'training' the AI with data from Reddit, it's no wonder. It's really designed to mimic human language based on millions of websites. This is going to get exponentially worse as it trains on its own output; you cannot go in and change the data.

80% correct is what I was told at a conference, and it has moments of hallucination - basically it makes stuff up. All the language around AI seems to anthropomorphize computer code.
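The "fancy predictive text" point can be made concrete with a toy sketch: pick the next word from a probability distribution over a vocabulary. The words and scores below are made up for illustration; a real LLM derives its scores from learned transformer weights:

import numpy as np

def softmax(logits):
    """Turn raw scores into a probability distribution."""
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

# Invented plausibility scores for the word that follows "your data is".
vocab = ["safe", "deleted", "backed-up", "gone", "fine"]
logits = np.array([2.0, 1.2, 0.8, 0.5, 1.5])

probs = softmax(logits)
next_word = np.random.choice(vocab, p=probs)
print("your data is", next_word)  # a plausible continuation, not a verified fact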


Yep - no magic, just very clever. And it regularly hallucinates. Intentionally too - 100% accuracy in machine learning models is considered bad (usually a sign of overfitting).


Pretty good outcome, really. Obviously Unisuper's diligent backup regime turned what could have been an extinction-level event into a bit of a nothingburger. They probably now have a stack of sweet credits on their account so they can pay bugger all for their backup storage while they shop for a new provider. PR nightmare for Google, of course.


We had 100 GB of storage, through having our emails through Gmail, for a few dollars a month. Now we have 60 GB for about 30 bucks a month, and we constantly get threats of disconnection if we go over 60 GB. I am in the process of moving any personal stuff to a personal Gmail drive, where you get 15 GB free. But it's a real worry that they have the power to delete your data.


That's quite a lot of money to pay for some emails and a thumbdrive's worth of storage.


Yes, if they decide you are a business,  they sting you hard.


Considered moving to someone like Sitehost? I have a domain and 5 email addresses (with multiple aliases) through them for $5.75 per month. I don't know much at all about web stuff but they were more than happy to help me get set up.


Yes, our website rego is with Freeparking; I can move the emails there for a similar price to Sitehost. It's more the drive data I am worried about.


If you like tinkering around, there are plenty of YouTube videos on how to put together your own NAS (Network Attached Storage).


I am still surprised that people still use Google; I mean, they trail the competition these days in almost everything.

Do not bother using them for search or emails...there are several much better options.


FC, what do you rate as the best couple of search options?
