The ICO exists to empower you through information.

Introduction

As the use of the internet has expanded, search engines have become a core feature of everyday life for most internet users. Indeed, for a long time, the traditional search engine was the main interface through which users accessed the internet and found new information online. 

However, this may be changing as novel methods of search gain traction. These methods often move search not just away from the traditional search engine interface, but off our devices altogether.

Instead, search results could increasingly be shaped by the way in which users interact with their surroundings. Search tools could generate recommendations from information drawn from the digital and physical world around us. The further embedding of emerging technologies, generative AI in particular, into search interfaces is similarly likely to transform how we find information online. These changes may have privacy and data protection implications. Given the centrality of search to our experience of the internet, any such impact has the potential to be significant.

About search

Traditional search engines grew in popularity as web 1.0 and web 2.0 gained momentum during the late 1990s and early 2000s. They responded to text-based queries, for example about sport scores, the latest news or local restaurants. They did so by providing ranked lists of websites which might provide an answer to those queries. 

The later use of algorithms helped these engines to provide more relevant and useful results. Search engines traditionally monetise their products by embedding advertising into search results and ranking them accordingly. The data processing56 involved in embedding targeted advertising into search has led some to raise privacy concerns.57 In recent years, both the ways users search and the methods of providing these results have evolved significantly. These new developments aim to provide a more personalised experience to users, better tailored to their individual wants and needs, but are likely to involve the collection of even more personal information in the process.

State of development

A number of existing and emerging technologies may radically change how we use search. A combination of different methods of search, lots of new data points, sensors, ambient tech, and AI can make search a more personalised experience and allow these solutions to interact with the world around us.

One such development is the increasing use of voice-based search, particularly through voice-based assistants on smartphones or dedicated IoT devices. A quarter of UK citizens now own a smart speaker.58 Their use as search tools may further increase as the ability of these systems to converse with users in a natural and fluent way improves. These improvements could include an increased ability of the voice assistant to remember conversations and the ability to analyse a user’s emotional state and adjust its responses accordingly.59 As deepfake technology becomes more widely used, users may be able to change the voice used by a voice assistant to that of a loved one.60

Increased use of multi-modal search will similarly allow search to move away from purely text-based search. Current iterations include the ability to input an image of a particular type of food or service. The search engine then provides relevant local information to the user, such as local restaurant or service recommendations. Other multi-modal options allow the user to use their phone to capture images which will then generate information visible to the user on an augmented reality (AR) overlay.61 For example, users can already use this feature as an image recognition tool to identify the pair of shoes on a passerby or recognise a songbird in a tree. This method of search may become increasingly accessible if technologies such as smart glasses see wider adoption.62 As more users take advantage of immersive environments, searching via virtual reality (VR) technology may become increasingly common.   

Queryless search is another innovative search method which could see development in the next few years. Queryless or “ambient” search refers to information presented in varying forms to the user without the need for a specific user input, such as a question or image. Information can therefore be presented to the user based on information gathered about them at other times or places. For example, restaurant options could be presented on a smart home device screen during mealtimes. This could be based on a user’s search history, combined with third-party and other personal information. The results could be personalised by information about a user’s dietary requirements. These solutions may develop alongside the growing capabilities of ambient computing. This is when computers are embedded into our immediate environment, for example in the form of sensors.63 This allows more information about our current, direct surroundings to feed into search algorithms, to further adapt what users are presented with.

The use of generative AI, both as a search engine in its own right and embedded within established search engines, is perhaps the most visible recent development. Instead of using a search engine to navigate to a website in the hope of accessing a piece of information, users can receive the information directly from generative AI via a chat-based interface. As generative AI systems become more personalised, they may be able to refine their outputs based on information about the user.

The new methods of search described above are not expected to develop in isolation but will intersect with each other. The search ecosystem could become more complicated and we may see new market entrants and users each using these tools in different ways. Therefore, there is an increased risk to transparency and people’s ability to understand what information has been used and why. 

Fictional future scenario

Miguel uses voice-based search. This means the search engine can use emotional analysis to assess his voice, provide insight into his stress levels and so prioritise restaurants likely to improve his mood. The search engine experience is more personalised and, in some ways, is better able to respond to his preferences than a desktop-based search experience in 2023.

However, the vast amounts of data collected about Miguel mean that any data breach is likely to be riskier and could reveal a detailed picture of his life. Voice-based searches are likely to return fewer results, which could mean reduced user choice. The highly personal nature of the search means that the potential for serendipitous discovery is reduced.

Miguel knows that his watch collects lots of information about him but enjoys the highly personalised results the watch provides and the efficiency with which this allows him to choose a restaurant. But after several meals, he does sometimes wonder what other options he might be missing out on. 

Data protection and privacy implications

These innovations are likely to have an impact on data protection and user privacy and will present novel issues for digital regulators.

  • Collecting vast quantities of personal information: One central concern is about the quantity of information these new modes of search collect. This is the case, for example, with voice-based search methods, such as those that use smart speakers. The information collected helps this search method provide more personalised results to the user. This can compensate for the more limited number of results that voice-based search can offer in comparison with traditional search engines, which can provide at least a dozen per page.

    It is possible that large amounts of personal information will also be required for the level of personalisation potentially offered by queryless search. The large amounts of information collected from a user and from various other sources have the potential to make any data breach more harmful to the user. Organisations that develop these innovative methods of search need to be aware of their obligation to conform with the principle of data minimisation. This means that they need to ensure that the information they process is adequate, relevant and limited to what is necessary, and that sensors, speakers and other means of collecting information only collect what they need to function.
  • Transparency: Numerous data points, from a variety of sources, may be necessary to feed into the personalised results offered by ambient search solutions that interact with our environment. Users may therefore find it more difficult to exercise key data rights such as the right to be informed and the right of access. A related concern is the reported lack of transparency about sharing information gathered by voice-based search methods with third parties.64 Organisations that provide new methods of search need to process personal information in a transparent manner and ensure users are able to exercise their information rights.  
  • Hallucination: As noted above, there is a trend towards the increased use of large language models. These are used to search for information directly, and this type of AI is also being embedded into traditional search engines. The privacy implications of personalised generative AI are set out elsewhere in this report and more generally in our previous publication on generative AI.

    However, there are concerns about the accuracy of the results provided. It has been noted that such systems can “hallucinate”65 and provide inaccurate information to users. While efforts are being made to make errors less likely, some have questioned whether such efforts can be entirely successful.66 This is further complicated as a growing number of websites prohibit generative AI developers from using those websites to develop models. This potentially reduces the diversity and balance of the relevant training information (diversity and balance can help to reduce the risk of hallucinations).67

    If such hallucinations contain personal information, this tendency could complicate an organisation’s accuracy obligation to “take all reasonable steps” to ensure the personal information they process is “not incorrect or misleading as to any matter of fact”. The future of search and generative AI are linked in other ways too. Recent reports have highlighted a tendency for generative AI content to feature prominently in search results gathered by traditional search engines68, raising concerns about false information spreading across the web.69
  • Intersections with immersive technologies: Multi-modal methods of search, noted above, offer users another way to search and gather information. However, these features may be subject to privacy issues similar to those that affect the use of AR, which were noted in the immersive tech chapter of our first Tech horizons report, especially if used by next generation smart glasses and headsets. Immersive technologies may be used, inadvertently or otherwise, to collect information about people in close proximity to the device. This is likely to become more of an issue as relevant devices become more discreet. Other data protection issues include collecting considerable amounts of special category information and targeted profiling.
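
The data minimisation principle noted in the first bullet above can be illustrated with a short sketch. It is a minimal illustration only, not a statement of what compliance requires: the purposes, field names and the `minimise` helper are all hypothetical assumptions made for this example, and a real system would need to document its own purposes and justify each field it retains.

```python
from typing import Any

# Hypothetical mapping of each documented purpose to the fields it needs.
# The purposes and field names are illustrative assumptions only.
FIELDS_NEEDED: dict[str, frozenset[str]] = {
    "restaurant_search": frozenset(
        {"query_text", "coarse_location", "dietary_requirements"}
    ),
    "wake_word_detection": frozenset({"audio_snippet"}),
}


def minimise(payload: dict[str, Any], purpose: str) -> dict[str, Any]:
    """Return only the fields that the stated purpose needs."""
    try:
        allowed = FIELDS_NEEDED[purpose]
    except KeyError:
        # Refuse to process anything for a purpose that is not documented.
        raise ValueError(f"No documented purpose: {purpose!r}") from None
    return {key: value for key, value in payload.items() if key in allowed}


# Example of everything a hypothetical voice assistant could capture.
captured = {
    "query_text": "vegan restaurants near me",
    "coarse_location": "Manchester",
    "dietary_requirements": ["vegan"],
    "raw_audio": b"...",
    "inferred_stress_level": 0.8,
    "nearby_device_ids": ["a1", "b2"],
}

# Only the fields needed for this purpose are kept; the rest are dropped.
retained = minimise(captured, "restaurant_search")
print(sorted(retained))
```

In this sketch the raw audio, the inferred stress level and the identifiers of nearby devices are discarded before any further processing, because the stated purpose does not need them. Filtering against an explicit allowlist, rather than removing known-sensitive fields, means that any newly captured data point is excluded by default until a purpose for it has been documented.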

Current and future methods of search have the potential to offer users a wider variety of ways to seek information. They could offer the user more personalised results and an improved user experience. They will also have data protection and privacy implications that organisations should consider from the outset.

Recommendations and next steps

  • We are planning a foresight report on the future of search and discovery which we aim to publish in 2024. 
  • We will continue to monitor developments in the search space and seek to understand how the various technologies and elements that may drive change interconnect and intersect. We will continue to look out for new privacy implications which may arise.
  • We will also aim to work with other regulators, notably the DRCF, to bring more definition and regulatory clarity to the search space.

 


56 Forbes article entitled How Much Does Google Really Know About You? A lot.
57 Reuters article entitled Google faces $5 billion lawsuit in U.S. for tracking 'private' internet use
58 YouGov study regarding Smart Speaker usage
59 Daily Upside article about Google smart speakers
60 NPR article about Alexa capabilities
61 Homepage for Google Lens
62 Meta page about new Ray Ban smart glasses
63 ZDNet article about Ambient Computing
64 Mozilla Foundation about Amazon Echo Dot
65 IBM article about AI hallucinations
66 Fortune article about AI hallucinations
67 IBM article about AI hallucinations
68 ITPro article about AI's impact on Google
69 Article from The Guardian entitled Does Australia exist? Well, that depends on which search engine you ask