The BigBrotherAward in the Technology category goes to
Google,
represented in the EU by Google Ireland Limited, for the mandatory AI feature “Gemini” on its Android mobile phones.
Grandmother – no – Google, why do you have such big ears? That could be the question that users of Android-based mobile phones have started asking themselves. With a software update, Google quietly and stealthily replaced the previous Google Assistant with the AI-based application “Gemini”.
Until now you could communicate with the device via voice commands, start apps, have the weather forecast read out to you, and so on. Even this “old” Google Assistant – a combination of refined technology and latent surveillance – was too clever to feel entirely at ease with. The more data it was fed, the better its answers became.
Now the users of millions of Android phones are in for a great innovation: Google Gemini is sneaking onto phones and tablets via software updates. Most devices will probably have automatic app updates activated, so that practically overnight, without notification, the Assistant is replaced. Google celebrates Gemini as a magnificent new feature: Artificial Intelligence that can completely relieve me, the user, of the need to think.
Unless Gemini is deactivated entirely and its access blocked, Google’s AI app effectively becomes the command centre of the device. Gemini hooks into chat communication, reads messages and automatically drafts replies. Unlike the old Assistant, Gemini is a chatbot: I can converse with it, it has an answer for everything, and it is supposed to support me in all situations.
The severe disadvantage: many private and intimate details leave the device, because – and this is the difference from the old Google Assistant – most of Gemini’s intelligence lives in the cloud. And here, “cloud” means: Google. The data is processed on Google’s servers, with the risk that it will be used for more purposes than most users would approve of. Only simple tasks are carried out by a local AI on the device. That at least reduces the risk of data leaking unwanted into the cloud – but there is no guarantee of it.
Those who have followed the development of AI over the last two years will have noticed that fierce competition has broken out among the leading providers: who has concluded contracts with whom over the use of training data, in order to develop ever better AI models?
However, a lot of training data has also been used without the copyright holders’ permission, which has already provoked considerable legal disputes. Google wants to avoid this – and so it secures a permanent supply from its Gemini users. Gigantic amounts of text and voice messages are needed, as well as material from newspaper archives and social media sites. Why?
A Large Language Model (LLM for short – the correct term for an AI system that processes text and can produce answers to questions) would not be nearly as “smart” if it were not trained on gigantic quantities of data.
So it suits Google rather well that Android users deliver the text passages required for training free of charge. According to the terms of use, Google collects our text and voice messages, as well as other content made accessible to Gemini, in order to feed its AI model.
Even if I were to agree to the use of my texts for training purposes (after all, it is anonymous – nothing can happen!), the following passage might still fill me with unease. The terms of use for Gemini state:
“In order to improve the quality of the results and of the products themselves (for example, the generative machine-learning models on which the Gemini apps are based), your conversations with Gemini apps are read by our reviewers (including Google’s service providers), annotated and processed.”
(These “annotations” are manual additions to the recorded text that are used during training to help the AI model classify the text fragment correctly.)
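Roughly sketched – purely hypothetically, since the field names here are our invention and not Google’s actual format – such a human-annotated training record could look like this:

```python
# Purely hypothetical sketch of a human-annotated training record.
# All field names and values are invented for illustration; this is
# not Google's actual data schema.
annotated_record = {
    "conversation_id": "example-0001",
    "user_text": "Can you book me a table for two tonight?",
    "model_reply": "Sure - at which restaurant?",
    # the "annotations" added by human reviewers:
    "reviewer_annotations": {
        "intent": "restaurant_booking",
        "contains_personal_data": False,
        "reply_rating": 4,  # e.g. usefulness on a scale of 1-5
    },
}
```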
An article in The Guardian reveals what this manual post-processing means in practice. It reports on thousands of underpaid and overworked employees of Google’s contractors, whose job is to classify sensitive and sometimes disturbing fragments for AI training – including violent and sexually explicit content. This burdensome work is outsourced to companies like GlobalLogic, which also impose enormous time pressure.
In order to be able to use the training data for future AI models, Google allows itself to store these records for up to three years. Amazon’s Alexa won a BigBrotherAward for a similar practice in 2018.
It should not come as a surprise that our communication content is being used as training data: it serves as a kind of payment in kind against the soaring operating costs of AI. Developing and operating large AI models swallows enormous amounts of electricity – so much that Meta (i.e. Facebook and Instagram) would rather build its own nuclear power plants. Conservative estimates claim that OpenAI needed 50 gigawatt-hours of electricity to train GPT‑4, with training costs estimated at 100 million US dollars. And a single AI query, a so-called “inference”, can easily consume energy in the kilowatt-hour range, especially if, for example, a high-resolution video is to be generated.
Using Gemini on Android is free of charge. That alone should make you sit up and take notice – as with every other service offered “free of charge”.
So what is the problem if my – possibly very private – conversations are used as training data? To answer that, we must take a look at how large language models work. They learn by being trained on large quantities of text, in which they recognise language patterns. The training data is transformed into numbers (vectors) so that the model, using neural networks, can calculate probabilities for the next word. It is improbable, but cannot be ruled out, that a later answer will contain fragments of my conversation – by chance, or because an attacker managed to extract those fragments from the AI model with targeted prompts. And this has already happened.
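To make the risk concrete, here is a deliberately tiny sketch in Python: a word-level bigram model – a drastic simplification of a real LLM, but it exhibits the same failure mode. A rare “private” sentence in the training data (invented here, of course) can be reproduced verbatim by prompting with the right prefix:

```python
from collections import Counter, defaultdict

# Toy word-level bigram model: for each word, count which word follows it.
training_texts = [
    "the weather today looks sunny",
    "the weather tomorrow looks rainy",
    # an invented "private" message that ended up in the training data:
    "my credit card number is 1234 5678",
]

followers = defaultdict(Counter)
for text in training_texts:
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        followers[prev][nxt] += 1  # count how often `nxt` follows `prev`

def complete(prompt: str, max_words: int = 8) -> str:
    """Greedily extend the prompt with the most probable next word."""
    words = prompt.split()
    for _ in range(max_words):
        options = followers.get(words[-1])
        if not options:
            break
        words.append(options.most_common(1)[0][0])
    return " ".join(words)

# A targeted prompt extracts the memorised fragment verbatim:
print(complete("my credit card"))  # -> "my credit card number is 1234 5678"
```

Real language models are vastly larger and sample probabilistically rather than greedily, but published extraction attacks work on exactly this principle: rare training sequences leave recoverable traces in the model.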
I do not wish to find my text messages in other people’s AI-generated answers.
And this is not about harmless phrases, but about personal data such as passwords or credit card numbers that were “learned” from my communication. Google claims to detect such cases and exclude them from training. But Meredith Whittaker, president of the foundation behind the Signal messenger, issues a clear warning. She points out that AI agents such as Gemini, once fully integrated into the operating system, have complete access to our most personal data: calendars, e-mail accounts, messaging apps. She sees device manufacturers and AI developers as obliged to ensure effective protection and a right to object to the involuntary, usually unnoticed outflow of data.
Until then: how can I prevent Gemini from transmitting my message content for use as training data? You have to work your way through various submenus on the smartphone to object to activity tracking and to the use of your data for training. As a precaution, you should also deactivate all of Gemini’s connections to other apps. Recently, Gemini has also offered a private mode, which supposedly at least rules out the use of data for AI training. Even then, conversations remain stored for 72 hours.
Google itself offers a tip in its support pages on how to use the Gemini app without worry:
“Do not enter any confidential information or any data that reviewers should not see or that should not be used to improve the products, services and machine-learning technologies.”
So we should not converse about private, confidential topics when Gemini is on board.
We have a better tip: don’t use it at all. Deactivate Gemini, curtail its access rights – and hope that no data is collected anyway.
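For the technically inclined, deactivating can also be done from a computer via Android’s developer tools. A minimal sketch in Python, assuming `adb` is installed, USB debugging is enabled on the phone, and the Gemini app still uses the package name `com.google.android.apps.bard` inherited from its Bard days – an assumption you should verify on your own device:

```python
import subprocess

# Assumed package name of the Gemini app (inherited from Bard);
# verify on your device, e.g. with: adb shell pm list packages
GEMINI_PACKAGE = "com.google.android.apps.bard"

def adb(*args: str) -> str:
    """Run an adb command against the connected device and return its output."""
    result = subprocess.run(
        ["adb", *args], capture_output=True, text=True, check=True
    )
    return result.stdout

packages = adb("shell", "pm", "list", "packages")
if GEMINI_PACKAGE in packages:
    # Disable the app for the current user; no root required,
    # and it can be re-enabled later from the system settings.
    print(adb("shell", "pm", "disable-user", "--user", "0", GEMINI_PACKAGE))
else:
    print("Gemini package not found under the assumed name.")
```

Whether the package name still matches, and whether Google reinstates the app with the next update, is anyone’s guess – hence the “hope” in our tip.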
Congratulations, Google, on your BigBrotherAward in the Technology category.
