
As someone who keeps an eye on information flows for a living, I found the release from Auckland's Data Insight about its seven-module continuing professional development workshop, The Data Academy, quite interesting.
“Data is everywhere. It should be put to good use, influencing decisions at every level of an organisation,” is what Data Insight general manager Ben Winterbourne says, and that sounds about right.
Data Insight works with big-name Kiwi brands on providing what its name implies. I asked Winterbourne for more details on The Data Academy, and how much it costs.
"We piloted the programme with an organisation of 600 or so seats. We can’t say more other than the feedback is excellent and it gave us the confidence to launch the programme wider," Winterbourne said.
Data Insight does an assessment to determine the level of educational need for a company, and how to break up the modules for best results, Winterbourne added.
That in turn decides the pricing.
"Once that’s completed and scheduling is locked in, we figure out things like venue, AV [audio visual gear], and other logistics. This all goes into the price. We can comfortably deliver The Data Academy between $800 to $1500 a head depending on the number of modules needed and the requirements of the organisation," Winterbourne said.
Back to data itself: there's definitely a whole lot of it these days, in digital format, which means information is in theory readily accessible. Data provides great insights, as we have huge and powerful computer systems everywhere to churn through vast amounts of information, and now there is artificial intelligence assistance to make sense of it all on your computer screen.
There are also fun moments to be had, as you're exposed to terrible cliches like "data is the new oil" and politicians like Boris Johnson talking about smart cities with sensors everywhere that are "pullulating" with data, without necessarily knowing anything much about the actual topic.
And that's fair enough, as the field, like most other specialised ones, is rife with dense jargon. Recently, I had to look up what a "data lakehouse" is; autocorrupt wants to call it "data bakehouse". Either way, large US analytics company Databricks defines it as:
"A data lakehouse is a new, open data management architecture that combines the flexibility, cost-efficiency, and scale of data lakes with the data management and ACID transactions of data warehouses, enabling business intelligence (BI) and machine learning (ML) on all data."
"ACID transactions" are not drug related, but described as:
"A set of four essential properties—atomicity, consistency, isolation, and durability—that guarantee reliable and predictable database operations."
I know a few people to whom that doesn't seem like an arcane magic incantation, but they huffed a lot of Excel when young and are now permanently damaged by the experience.
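For the rest of us, the "atomicity" part is easier to see than to read about. What follows is only a minimal sketch, using Python's built-in sqlite3 module and a made-up accounts table; the names (accounts, alice, bob, transfer, demo) are my own illustrations and have nothing to do with Databricks or any lakehouse product. A transfer either happens in full or not at all: if the second step fails, the first one is undone.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one all-or-nothing transaction."""
    try:
        # `with conn` commits if the block succeeds and rolls back if it raises.
        with conn:
            # Credit first, so a failed debit leaves something to undo.
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # The CHECK constraint rejects a debit that would go below zero.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def demo():
    """Run one valid and one invalid transfer on a hypothetical table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    with conn:
        conn.executemany(
            "INSERT INTO accounts VALUES (?, ?)",
            [("alice", 100), ("bob", 50)],
        )
    ok = transfer(conn, "alice", "bob", 30)       # goes through
    failed = transfer(conn, "alice", "bob", 500)  # rolled back in full
    balances = dict(conn.execute("SELECT name, balance FROM accounts"))
    conn.close()
    return ok, failed, balances


if __name__ == "__main__":
    print(demo())
```

In the failed transfer, bob's account is briefly credited inside the transaction, but the rollback removes that credit again, so nobody outside the transaction ever sees half a transfer.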
Data is a fascinating topic though, and worth thinking about as we wade through increasing and perhaps overwhelming volumes of information and captured knowledge.
By themselves, large volumes of data don't necessarily lead to anything very useful. Just like the word "content" can describe the inspiring text you find printed on the pages of a book, it can also refer to what toddlers joyfully fill their nappies with. In the same way, "data" can be an empty word, hiding that not all information is equal, and that its value depends very much on how it's used.
For example, online forums provider Reddit looked at what its users produce, which runs the gamut of almost everything under the sun. Some of that large data pool is great, whereas other stuff is just plain weird, but it doesn't matter: data-hungry AI providers are keen on fresh user posts on Reddit to train their models.
Last year, Reddit signed a US$60 million deal with Google to license that user-generated data for model training. A similar deal was signed with Microsoft-backed OpenAI. The revenue from the AI companies, while better than nothing and basically free money, isn't much to write home about, though: around 10% of total revenue, in fact. It's advertising that brings home the bacon for Reddit instead. In 2024, ad revenue was up 50% to US$1.2 billion.
Now, Reddit is thinking about using that user-generated data itself, to extract more value out of it. That is something the online forum most likely needs to do, as its net loss for the full 2024 financial year jumped to US$484.3 million from US$90.8 million the year before.
At least Reddit has full access to potentially revenue-raising and product-improving data to process. Not having that is, it would appear, a major problem for Elon Musk's Tesla. The electric vehicle (EV) maker is trying to up its sales in China by offering a free trial of its full self-driving (FSD) system in the country.
A quick look at Tesla's Chinese site suggests buying FSD when you order a Model 3 would set you back ¥64,000 or just under NZ$15,200.
However, Tesla FSD in China is hampered by data sovereignty laws in both that country and the United States, as Yahoo Finance reported. Musk is complaining that Tesla can't ship video out of China for training. Likewise, the US doesn't allow Tesla to train the FSD in China to improve the system.
That means Tesla's cars, which aren't equipped with light detection and ranging (LiDAR) or radar, are at a disadvantage to competitors, because the FSD needs data from the vehicle fleet to learn and to improve. Tesla is now working on cooperating with China's Baidu to train the FSD locally.
It's also likely that, apart from Tesla's FSD being perceived as not as good as its competition, having an expensive system offered only as a trial is just not that attractive to potential customers. That, and Tesla leader Musk's highly publicised antics, which are reported to have put people off buying the company's vehicles in other parts of the world, could indeed have had an impact on sales.
Elsewhere, the right to repair movement, which is currently hoping for supporting law changes in New Zealand, has run into similar issues with data access. It may be that a repairer needs access to the data in a device or machine, but if there are "technological protection measures" (TPMs) in the way, beware: as copyright law stands, attempting to bypass TPMs such as digital rights management (DRM) electronics is highly illegal, and punishable by chunky fines and long prison sentences.
Not that anyone's been convicted for hacking through a TPM in New Zealand, but it's something to think about for our parliamentarians pondering right to repair legislation, and it's not going to be an easy nut to crack as it involves intellectual property rights.
Conversely, some less than civic-minded individuals have no qualms about facilitating data access for themselves through illicit means to earn a quick buck. It's very common, as you can see by heading over to Google News and searching for "data breach". A never-ending stream of reports will appear on the topic.
There are many subtle nuances to data, both in its gathering and its processing. One less obvious nuance is found in an earlier piece on interest.co.nz/technology, which looks at how Netflix is able to puzzle together detailed profiles of viewers through collecting data on them. Based on that profile, Netflix then supplies viewing recommendations, which people tend to go with, by and large. The question that Mark C-Scott asks at the end of the story, whether Netflix is catering to our preferences or shaping them, is quite pertinent, particularly if the intention wasn't to modify what users want.
1 Comment
Great article, yes data. Agree, that sounds about right!
“Data is everywhere. It should be put to good use, influencing decisions at every level of an organisation,”
Heard an interesting point made by Fukuyama in an interview today, he said, we are an "economy of influencers". How true - every little morsel, from every little snippet, keystroke and transaction being put to use by every individual - not just every organisation.