Can big data gain the trust of the public?


Overcoming preconceptions: How big data can get and hold a social licence

28th Jan 23, 4:21pm by Zhongchen Song


In Insight 103, we examined how the Integrated Data Infrastructure (IDI), administered by Statistics New Zealand, is used in New Zealand for research and how this is helping us to make better policy decisions. With 735 projects approved since its development, the IDI has improved our lives in tangible ways. Despite the many benefits the IDI has brought to New Zealand, gaining the public’s trust is likely to be difficult in the face of negative preconceptions around big data. In this Insight, we investigate how the IDI can gain the trust of the public.

What is a social licence?

There is no universal definition of a social licence. In a nutshell, it is the idea that society can bestow or withhold approval for a business to operate, depending on society’s confidence that it will behave legitimately. The term first became prominent in the 1990s within the mining industry, which regarded a community’s resistance as being as detrimental as a missing legal licence (Clark-Hall 2018).

Since the 1990s, the idea of a social licence has spread across the private and public sectors, permeating global discussions. In New Zealand, various industries seek to measure and cultivate their social licences as a part of their long-term strategies to safeguard their operations (Edwards and Trafford 2016).

As industries have sought to measure their social licence, research has determined that trust is the most important factor for maintaining public approval. High trust indicates that a business has been approved by society, and as a result, it can be assured that it will face little resistance to its operations. An inconclusive trust level indicates that society is either wary of or unfamiliar with a business. In such a case, a business’ public perception is especially vulnerable. Low trust levels mean that a business has no social licence to operate, in which case its operations will be a continuous uphill battle against public resistance (Boutilier and Thomson 2011).

Trust is the currency of a social licence. Positive media coverage, constructive interactions with stakeholders, and respecting social norms build trust and move a business up the social licence spectrum, while negative media coverage, poorly managed customer interactions, and venturing outside social norms drain trust.

Does Stats NZ have a social licence?

As the custodian of New Zealand’s data, Statistics New Zealand (Stats NZ) would not be able to function effectively without a social licence to operate. For instance, if Stats NZ faced public resistance, then it’s likely that the response rate to the New Zealand Census would fall dramatically. This would seriously inhibit Stats NZ’s functions and hinder the ability of all organisations that rely on Stats NZ data to function effectively. This includes both public sector and private sector organisations, whose work ranges from service design and policy planning to academic research.

In 2018, Stats NZ measured its trust levels for the first time. The results showed that 34% of us had a positive or very positive attitude towards Stats NZ, while only 6% had a negative or very negative attitude. While more of us trust Stats NZ than not, an overwhelming 61% of New Zealanders have no opinion towards the agency (Stats NZ 2018).

As a result, Stats NZ may claim to have a social licence – but it is a tenuous one that is overshadowed by the undecided.

With such a majority hanging in the balance, Stats NZ should be cautious towards any potential disruptor that could push this group towards negative perceptions.

Could the IDI be a social licence disruptor?

One potential disruptor for Stats NZ’s tenuous social licence is the IDI.

A 2018 study discovered that few in the community have heard of the IDI (Gulliver et al. 2018). As it is not widely known to the public, the IDI currently has minimal impact on Stats NZ’s social licence status. However, as Stats NZ seeks to encourage the use of the IDI among policymakers, academics, and other social organisations, public awareness of the IDI is likely to grow.

In 2019, 74% of us reported knowing little or nothing about Stats NZ (Stats NZ 2020d). As this is a high percentage, it is likely that those who know little about Stats NZ are also those who reported in 2018 having neither trust nor mistrust in the agency.

With little to no pre-existing knowledge of Stats NZ, this group is particularly sensitive to any concerns which may circulate around the IDI. This is potentially dangerous for Stats NZ, as initial impressions of the IDI can be fearful. Concerns around privacy, security, and potential harm towards vulnerable groups were raised by a focus group that was introduced to the IDI (Gulliver et al. 2018). These concerns may negatively influence the 61% of New Zealanders who neither trust nor mistrust Stats NZ.

Will New Zealanders trust the IDI?

The IDI holds sensitive information on almost every New Zealander, from our health records and our use of social services to our interactions with the police. While the security of this information is a major concern for New Zealanders, our last Insight showed that big data could be a force for good – but conversely, misuse of big data could cause significant harm in our communities. To gain New Zealanders’ trust, Stats NZ must prove that the IDI is safe and trustworthy.

Stats NZ is serious about protecting the IDI from misuse. While the benefits of the IDI are potentially limitless, its access is tightly regulated by the ‘Five Safes’ framework – Safe People, Safe Projects, Safe Settings, Safe Data, and Safe Output.

Safe people

Stats NZ won’t let just anyone access the IDI. Only approved researchers are allowed into the system and must first pass referee checks and undergo training to protect confidentiality. Researchers who are judged as too risky or with unethical intentions will not gain access in the first place, and if researchers don’t follow the rules or breach the security of the data, they can be banned and blocked from the database. Those who are allowed in are also liable for prosecution if they abuse the IDI in any way.

Safe projects

The next step is the application. This means providing Stats NZ with the objective of the research project and its proposed methodology. An application must also show evidence that the project is for a statistical purpose, is in the public interest, will be conducted by competent individuals, and can make use of the database. Stats NZ also requires IDI researchers to demonstrate that they are aware of their obligations under the Treaty of Waitangi. Researchers need to demonstrate that they have a plan to engage with the community of interest, and identify any need to mitigate the risk of harm to the communities (Stats NZ 2022a).

Once an application is submitted, it can take several weeks to review. Then, if all goes well, the researchers must sign a statutory declaration of secrecy. Lastly, the application is sent to the Government Statistician (GS) or another person authorised by the GS for final approval (SWA 2017).

Safe settings

As well as adhering to Stats NZ protocols, researchers can only access the database through accredited Data Labs – secure rooms with computers containing the database. These computers are not connected to the internet, and there is no option to download or print data (SWA 2017). Stats NZ houses such labs in its Auckland, Wellington, and Christchurch offices, which are open only during business hours.

Safe data

Access to data is provided on a need-to-know basis, and only the relevant parts of the IDI are made available to researchers. Furthermore, all data revealed to researchers is purged of any identifying information – like name, date of birth, and address. Data which can’t be purged and could still be identifying – like IRD and NHI numbers – is replaced with another encrypted number.
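The de-identification step described above can be sketched in code. This is only an illustrative sketch, not Stats NZ's actual process: the field names (`name`, `ird_number`, and so on) and the `deidentify` helper are hypothetical, and a keyed hash (HMAC-SHA256) stands in for whatever encryption Stats NZ really applies to IRD and NHI numbers.

```python
import hashlib
import hmac

# Hypothetical field names; the article does not describe the real IDI schema.
DIRECT_IDENTIFIERS = {"name", "date_of_birth", "address"}
LINKAGE_IDENTIFIERS = {"ird_number", "nhi_number"}


def deidentify(record: dict, secret_key: bytes) -> dict:
    """Drop direct identifiers and replace linkage numbers with keyed hashes.

    Assumption: a keyed hash is used as a stand-in for the "encrypted number"
    mentioned in the article.
    """
    cleaned = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIERS:
            continue  # purged entirely, never shown to researchers
        if field in LINKAGE_IDENTIFIERS:
            digest = hmac.new(secret_key, str(value).encode("utf-8"), hashlib.sha256)
            cleaned[field] = digest.hexdigest()  # same input and key give the same token
        else:
            cleaned[field] = value
    return cleaned
```

Because the same number always maps to the same token under a given key, records about one person can still be linked across datasets without the original number being visible to the researcher.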

Safe output

After completing a project, researchers also need to obtain an exit clearance for their results. This is done through an output check. Stats NZ staff review all findings to ensure that they do not identify any individuals or have the potential to cause any harm to the community.

How do we ensure safety while increasing access?

For researchers with the know-how and funding, the IDI is relatively accessible.

However, as a tool, the IDI requires a certain level of capability as the agencies who contribute data can take unique approaches to format; the database is not a uniform environment. All the data from the IDI is managed and held in a data management system, which can be accessed using Structured Query Language (SQL), a standard language for storing, manipulating and retrieving data in databases. To extract any meaning, researchers require proficiency in coding, especially in SQL (SWA 2017).

Not all organisations can find staff with coding capabilities, representing a significant barrier to those who would otherwise benefit from the IDI as a research tool. Of course, researchers can hire a professional IDI researcher to assist them on the technical side. However, hiring an experienced IDI researcher can be unaffordable for some.

Stats NZ has identified this roadblock and designed ways to make the data more accessible. Those without the technical ability to conduct IDI research themselves can join the new Group Project Pathway scheme. Through it, Stats NZ provides extra support during the application while allowing similar projects to apply for a shared IDI project and pool their resources to hire an IDI researcher. Ultimately it is up to the researcher to find others willing to join them in the scheme, but Stats NZ can, in some cases, help link projects together for those with a limited outreach network (Stats NZ 2022b).

The database can also only be accessed through accredited Data Labs. Although there are data labs in many places, such as universities, and projects can apply to set up a remote data lab in their own workplace if they meet security conditions, the physical requirement still places a strain on researchers who need to travel to Auckland, Christchurch, or Wellington to conduct their research.

Issuing after-hours access to IDI researchers would help improve accessibility, but the benefits of this would have to be weighed against the risks of a potential security breach, as supervision during after-hours access will likely be limited.

However, organisations can increase their access to the IDI by applying to set up a private data lab. Organisations with fewer resources may find it impractical to open their own data lab due to strict physical and IT security requirements. Some, like government agencies and universities, have found this possible and have Data Labs that their approved researchers can access (University of Auckland 2022).

As part of the New Zealand Government’s response to the COVID-19 pandemic, some researchers doing critical work were granted home access when data labs across the country were closed during lockdown (Stats NZ 2020a). Examples of such projects include the multilayer networks for modelling complex contagion of COVID-19 run by the University of Auckland (Stats NZ 2020b).

Although the lockdown showed that the IDI could be set up for wider accessibility, it was only considered in a few cases. The demand for home access still exists, and it would greatly increase accessibility for both the well and less resourced. Stats NZ is currently exploring options for remote access. Whether this is a feasible avenue that can maintain the system’s security is an unanswered question (Stats NZ 2020a).

This restriction on access is required to keep New Zealanders safe. This is a balancing act for Stats NZ. On the one hand, they could open the database completely but lose any ability to stop its misuse. On the other hand, Stats NZ could grant access only to a small minority but stunt the database’s usefulness to society. In the middle of these two extremes lies the sweet spot for Stats NZ: a system that opens the IDI as much as possible to benefit the community while guarding against the irresponsible.

What happens if New Zealanders don’t trust the IDI?

Looking overseas, we can take lessons from an ill-fated attempt at creating an IDI equivalent that failed to gain the trust of the public, healthcare professionals, or the media.

In December 2013, England created the National Health and Social Care Information Centre (HSCIC), whose flagship initiative was Care.data. The purpose of Care.data was to link individuals’ information from different areas of the national health service (including general practitioners and hospital records) with the data held by the social services (Solon 2014). The intention was to create a database for the NHS to study its effectiveness in preventing, treating and managing illnesses.

The HSCIC would have also allowed approved researchers to access the de-identified information while creating an opt-out pathway for people who did not wish for their data to be used (NHS 2014).

One of the first priorities of NHS England was to gain public trust for Care.data. To do this, they moved forward, assuming they would have to raise public awareness about the new database, its uses, and the safety measures established to protect individuals’ privacy (Carter, T Laurie, and Dixon-Woods 2015). The awareness campaign was allocated nearly two million pounds, half of which was used to issue a leaflet to all 26.5 million English households in January 2014 (Evenstad 2014).

Almost immediately, Care.data ran into serious trouble. One commentator dubbed the failure the “junk mail fiasco”, as many reported accidentally throwing out the leaflet with unwanted advertisements while others reported not receiving one at all (including the UK’s Information Commissioner).

Those who did receive the leaflet were left confused. Explanations were criticised for being vague and impersonal and failing even to identify the database by its name. The leaflets were also sent before safety protocols for Care.data had been finalised. Two months after the start of the awareness campaign, a review reported that the HSCIC was unlikely to have the capacity to deidentify data extracted from local providers. This further promoted distrust in the safety of the system (Pollock and Roderick 2014).

After a poor start, NHS England struggled to bring on board the public, health care providers, or the media.

A survey by the Medical Protection Society found that less than 15% of 1,400 public respondents reported understanding Care.data.

While NHS England referred people to their general practitioner if they did not understand the database, another Medical Protection Society survey discovered that out of nearly 600 general practitioners, only 20% said they understood Care.data (Bradshaw 2014).

Many opposed the database, believing it was a model to raise funds by selling public data to private companies rather than a research tool to improve healthcare results. The media likely contributed to such fears by fuelling ill-informed debates (Hays and Daker-White 2015) and running such headlines as “NHS Patient Data to be Made Available for Sale to Drug and Insurance Firms” (Bell 2014).

Ultimately Care.data never obtained a social licence from the English public. It struggled on for two years and was placed on hold three times by the government. In July 2016, it was permanently closed (Department of Health and Social Care 2016).

A lack of clear information and inadequate safety protocols likely contributed to the failure of Care.data. However, an analysis by the University of Leicester and the University of Edinburgh argued that the public’s preconceptions of big data were more important in the initiative's failure than a poorly executed awareness campaign (Carter, T Laurie, and Dixon-Woods 2015).

Awareness of a system is simply not a good measure of trust. Few have the time to adequately research a government initiative before making an informed decision on whether it is trustworthy. Instead, people often rely on previous experiences or beliefs to make a judgement (Carter, T Laurie, and Dixon-Woods 2015).

This was reflected in many tweets, which showed a preconception that the British government, NHS England, and an initiative involving big data were incompetent, untrustworthy, and dangerous.

#NHSPatientdata scheme handling a ‘masterclass in incompetence’

#Caredata #NHS Don’t trust UK governments to protect YOUR personal data (& why would you?) –send them a message: Opt-out of #caredata

@[journalist] Businessmen are now in charge at the top. That’s the nub of the problem. They don’t even grasp our concern #Caredata

Source: Hays and Daker-White (2015)

Care.data should have focused on overcoming these negative preconceptions. This could have been done by incorporating so-called trust cues.

Trust cues are powerful methods for gaining public trust because they consider human psychology.

Humans are far more likely to trust an initiative that has a public face, is explained to them on an individual level, and directly addresses their preconceptions.

To create such trust cues, NHS England should have focused on engagement with communities. Before any awareness campaign, NHS England should have initiated consultations with members of the public. Communities that have been historically marginalised could have also been a focus since they are more likely to have negative preconceptions. Such consultations aim to identify society’s concerns, priorities, and world views in relation to Care.data.

Framing the initiative in terms of these concerns, priorities, and world views would then be a priority.

This would have communicated to the public that Care.data was designed with their values at the forefront.

Such a strategy would have had a greater chance of gaining trust and later, a social licence.

Growing trust for the IDI: Ngā Tikanga Paihere

As the failure of Care.data shows, preconceptions are a powerful determinant of trust. This type of trust, known as “experiential trust”, translates into an imperative for organisations like Stats NZ to be extra cautious. Failing to gain trust will affect future endeavours and their ability to gain a social licence.

For Stats NZ, identifying and engaging with negative preconceptions in our communities will be central to gaining trust for the IDI.

A focus of such efforts should be on Māori communities. As inequality continues to disadvantage and marginalise Māori, negative preconceptions around the use of big data are more likely to be present.

Historically, our national data ecosystems, including Stats NZ and the IDI, have not been designed to include a partnership with iwi and Māori. As a result, mistrust among Māori around big data is high, while the rest of the system has missed out on the unique opportunities for Māori insights (Stats NZ 2021).

To overcome this, in 2021, Stats NZ partnered with the Data Iwi Leaders Group to co-design a Māori data governance model. Such a model would respect crown obligations under the Treaty of Waitangi and create a space for Māori governance over Māori data. The model is still being refined and assessed by Te Ao Māori and Kāwanatanga groups (Stats NZ 2021).

In the case of the IDI, progress has been made to acknowledge Māori data sovereignty and utilise Māori insights.

In 2018, Stats NZ and University of Waikato Associate Professor Māui Hudson developed the framework of Ngā Tikanga Paihere to ensure that the IDI is used safely, responsibly, and in a culturally appropriate way (Te Pokai Tara Universities 2021). The framework is built upon 10 Tikanga (principles) to guide the use of the IDI.

By requiring researchers who access Māori data to adhere to the 10 Tikanga, Stats NZ currently obliges them to show that their use of the IDI promotes Māori values, priorities, and world views. This includes demonstrating that their research projects engage with the communities of interest, and that they have considered how the findings may harm those communities. By doing so, Stats NZ is growing trust and confidence among Māori communities.

The ten Tikanga have been developed from a Māori point of view. However, this in no way limits the potential use of Ngā Tikanga Paihere to only Māori. Indeed, as the Tikanga are explained below, they could be applied to all community groups who are stakeholders in IDI research.

By recognising the utility of Māori worldviews and allowing such insights to feed back into the whole of the IDI, Stats NZ can develop trust and confidence across all of our communities.

Pūkenga and Whakapapa

The first tikanga to incorporate into the IDI is Pūkenga. This relates to the skills and expertise of researchers who use the database. To protect all cultures within Aotearoa New Zealand, researchers who utilise data generated by our communities must first be culturally aware of them. Without adequate Pūkenga (expertise), the likelihood of researchers causing harm to our communities grows. To avoid harm by ignorance, researchers should be required to demonstrate awareness and respect for the cultural values of those they are studying. This means using Whakapapa – establishing relationships with communities before research is initiated, and gaining approval and advice from hapū, iwi, and other groups related to the project (Stats NZ 2020c).

Pono and Tika

All Tikanga should be built upon Pono (doing the right thing) and Tika (being true and authentic). Those impacted by projects utilising the IDI should understand the purpose of the research and its intentions. Regardless of whether communities support or oppose a project, Stats NZ should always endeavour to engage with them. When the community is unaware of a project that will affect them, researchers should take steps to reach out, raise awareness, and seek consultations (Stats NZ 2020c).

Kaitiaki and Wānanga

Stats NZ is the Kaitiaki (guardian) of the IDI, and all researchers who use data from our communities become a Kaitiaki of that community. Being a Kaitiaki comes with responsibilities. They must ensure that they use the IDI in a way that respects and maintains Māori and other groups’ priorities, values, and world views. Organisations that play a role in collecting such data and delivering results also have duties under Wānanga (instructors/teachers) and must perform these duties transparently and ethically (Stats NZ 2020c).

Wairua and Mauri

The Tikanga of Wairua (spiritual dimension) and Mauri (life force) relate to a holistic view of our communities and their data. Emotional and spiritual harm is as real as financial or physical harm. The IDI must be guided by considerations of how the use of data can harm or benefit Wairua. This necessitates an understanding of Mauri, or how data can be altered by the researcher after its original collection. Researchers accessing the IDI must ask how their projects change the original intention which collected the data. By doing so, we can understand how data evolves through New Zealand’s pathways (Stats NZ 2020c).

Tapu and Noa

Tapu is the concept of sacredness or being set apart. Information about ourselves we consider sensitive is Tapu. This includes data on our social justice, economic, and health services, and it remains Tapu even when de-identified. Those who use such data must do so appropriately and with respect. In contrast, Noa is that which is free or unrestricted, untouched by that which is Tapu. While Stats NZ should treat data that is considered Tapu with the utmost respect, data that is Noa should be examined to see whether there are benefits to disseminating it among all (Stats NZ 2020c).

What next?

In Insight 103 (Song 2022), we showed that big data like the IDI had improved our lives in tangible ways. Benefits range from better data, research, and monitoring of results to addressing inequality and making funding decisions. The powerful structure of the dataset makes it an ideal tool for research in New Zealand, and the research output has been growing significantly in the past decade.

Despite the various benefits of the IDI, the existence of the database is still, in many ways, unknown to the public. For an organisation that requires a social licence to operate, gaining trust in groups most likely to have negative preconceptions around government agencies must be of paramount concern.

As we learnt from Care.data, poorly handled public awareness can be disastrous for big data.

While the Five Safes are important and should be communicated to the public plainly and transparently, developing trust cues in our communities is essential if Stats NZ hopes to gain public confidence in the IDI.

In addition, as government administrative data makes up the majority of the data in the IDI, it is important that individual government organisations put in the effort to develop trust cues and communicate with communities at the data collection stage.

Having a social licence to operate is just as important for individual government organisations as it is for Stats NZ. Government agencies should develop trust in collecting data by being transparent about what they are collecting, why they are collecting it, and how it will be used. They should also ensure that the data is collected fairly and lawfully and that it is protected from unauthorised access or misuse.

Ngā Tikanga Paihere is a powerful framework for growing trust. By recognising the general utility of Māori worldviews, Stats NZ can apply the framework across our communities. This will develop a people-first focus with our communities and place their priorities, values, and world views at the forefront of IDI research.

This content is supplied by NZIER. It is here with permission.


5 Comments

by andreas_od | 28th Jan 23, 5:38pm

I was randomly chosen by stats NZ to respond their household survey. Had a feeling like I was questioned by the police. First thing the surveyor told me that if I am not cooperative, I’ll be fined $500. Then I had to spend 2h and answer very detailed financial questions, thanks for not asking what the size of underwear I am wearing. Privacy is not respected at all , they just milked all personal data , after that I had to collect all receipts for 1week so that they know what I spend my money on.

Great way to gain trust. Keep on !

by David Chaston | 29th Jan 23, 2:13pm

Isn't that the point of using big data? So you don't have to survey in intrusive ways? People hate those types of surveys; I know I did when it was my turn. But using big data can come with other credibility issues, which is what this NZIER article is addressing.

by Spiceeh | 30th Jan 23, 4:50am

Have you noticed the NZIER don't publish any detailed data about their specific funding sources for projects just 'members'(in reality they are clients, but then they would have to classify themselves a real business and not a NFP)...nice to be able to obscure that behind NFP designation don't you think?

There are far too many NFP Think Tanks in existence these days and far too many BS Jobs associated with them. We need to get these people into the real world physically solving real life problems, and not just sitting behind desks being paid exorbitantly to producing 'thinks'(see propaganda) that their funders want us reading, hearing and approving of.

by Spiceeh | 30th Jan 23, 3:36am

WTF how is that legal? I can tell you now I will never be answering the phone or mail from Statistics NZ. They have lost their social licence with me and mine if that's the type of behaviour they undertake.

by Spiceeh | 30th Jan 23, 4:36am

This article itself is contradictory, telling us there are only three secure public labs, then further going on to say there are also more private labs, then also going on to say there are home access methods. How is it not connected to the internet, yet somehow private labs have access to the data as well as people working from their homes? Are they saying that large tracts of highly personal data are being frequently physically transported around the country? I hope that's being done in Chubb vans with armed security guards and all the most secure measures akin to a presidential visit?

The only way you will gain trust is if the projects working on the data are fully transparent and listed on a public site for anyone in the public to browse through. From identifying the individual researchers, the project data specifics, the intended outcome and the ultimate beneficiaries eg identifying corporations and organisations who have any minute connection to the funding of the research.

No one wants their data anywhere near social manipulation(advertising or propaganda) or economic exclusion(insurance and debanking) projects. We should all be told each and every time our data is touched and by which projects(I don't care if I am pinged a hundred times a day if that is what it takes) so we can decide ourselves if our data can be used.

No matter how much BS de-identification is promised, we should all know by now as has been scientifically proven, de-identifcation is a falsehood, as data matching can be reverse engineered if you carefully select for specific pieces of information ahead of time, or using multiple research projects that then are combined and correlated the right ways.

Lastly, no one should trust 'public good' BS. 'Independent' Think Tanks are all funded by 'someone', big government, big business and big data know all the psychological methods to falsify reasoning. Unless a project can prove to the public at large, and not just a small group of approved approvers, that it is looking to fix a real world existing problem, I cannot see why our data should be allocated to research anything else. There is far too much 'research' taking place in the dark...and 'democracy dies in darkness' don't you know.