We need Opensource based LLM tools and private AI services – II – profiles based on the interaction with AI services


In the first post of this series

We need Opensource based LLM tools and private AI services – I – ROI-pressure

I commented on AI service provision by big tech companies. I referred to the usual critical points of online privacy and private data protection when people and companies use popular AI services. Most of these services send the text and speech data you provide at a prompt to remote servers of the providers. The real AI runs on GPU-equipped server phalanxes at data centers of the AI providers. What happens with your data afterwards is beyond the control of the AI service user.

Presently the top AI applications available online (such as ChatGPT, Gemini, …) are based on typical client/server infrastructures with (mobile) clients, the Internet as a transfer medium and specialized remote servers running the big LLMs. Plus additional infrastructure that gives the AI access to information resources. However, exchanging information with remote servers is nothing unique to AI; we feed other web services with personal data, too. The most prominent examples are social media applications.

So, what is different for AI services in comparison to other services available over the web, such as social media services? What might be particularly problematic about providing private information to a chatbot and thus also to the AI service provider in the background?

While I am looking at potentially dark sides of present AI services in this post series, I would like to make the following introductory statement explicit: This is not a post series against AI or Machine Learning [ML] technology – it is about getting control over its exploitation by quasi-monopolists.

The criticism of an acquaintance …

In recent weeks I have often thought about an experience I had a while ago. I have a highly educated acquaintance who uses Google and ChatGPT quite often. When we had a discussion about a complicated subject we both said “Let us ask ChatGPT”. My discussion partner then criticized me when he saw that I typed our question at a text prompt: “Why don’t you use the voice interface and speech recognition? It’s so much more efficient …” I asked back: “Why should I provide my voice and speech patterns to OpenAI and thereby probably also to Microsoft?” He – himself being around 60 and close to retirement – looked at me as if I were an alien from a long gone historic past. His answer was: “But it’s only a bot you talk to …”

For me this is an example of how willingly people provide very intimate information about themselves when they feel that it is worth it – and efficiency in the sense of saving time has become a major criterion for our decisions about whether and how we use available web tools. Obviously also in private life.

The guideline in using tech gadgets is no longer “Think twice about what you do and what or whom you share information with – and what the interests of the service provider might be in this game.” Instead we want shortcut tools – no matter what they might cost. And AI tools are regarded as shortcut tools – to quote my acquaintance: “So much better than the old dumb search engines”.

What are differences to classical interactions with social media and standard search engines?

Before we dive a bit deeper, and thereby also deal with the “efficiency” argument of my acquaintance, let us briefly list some points that come to mind when looking for differences between an AI interaction and classic search engines or social media – keeping in mind that the providers, like Google or Meta, earn money by using the data we deliver to their services. The lists below are certainly not complete.

Differences in comparison to search engines

  • The interaction is natural language based. We can have a kind of “communication” instead of providing only a bunch of keywords.
  • Furthermore we can “talk” about almost anything – and we apparently can correct and guide the AI by “explaining” what we want, what the context is and why we want an answer.

A short side remark: As an AI can cover the tasks typical for search engines, the business model of commercial search engine companies is threatened. Well trained AIs may in the future replace search engines completely. For the moment they provide at least a better interface for web search – even if they use conventional search engine databases in the background. The results of search engines driven by commercial companies are not at all bias-free today; influencing the ranking is part of the business model of commercial search engine providers. A clear problem therefore is that coming AI-empowered search engines will be trained to provide intentionally biased results. Yeah, but this would be nothing really new on this front.

Differences in comparison to “social” media and discussions with “friends” there

  • The interaction is in general not enforced by the AI – at least for present-day AI services in the western hemisphere. In social media a “friend” may explicitly request and drive a communication with us by sending us questions.
  • The interaction is subject centered – and thus a clear expression of something you personally are occupied with or engaged in.
  • It is no human we talk to.
  • We sometimes need to define the Q&A arena by delivering more background details and context information to get the “right” answers. And sometimes a “but” from our side is required …
  • We may have contact with the same AI service provider in different roles and for different purposes – as we use the provider’s AI tools also in business contexts.

Though being trained in part on social media data, a single AI service cannot replace the full spectrum of social media purposes. But this is true the other way round, too. Today’s social media cannot cover all aspects and benefits of present and future AI applications for an individual. So, what about the combination? Regarding business models, we may find that particularly AI capable social media providers may have an interest in making an integrated AI service your “best” friend on their platforms. But let us look at the points step by step …

Classical business models

What is the classic business model for cost-free or relatively cheap Internet services? Concerning both commercial search engines and social media, a brief and fair description in my opinion is:

The user delivers personal data (name, location, age, …) to get an account. During the interaction with the service he/she provides additional data about topics or subjects he/she is interested in. By collecting such data a profile of a person is created. This profile can be sold to “affiliate” companies which evaluate the provided data for their own purposes – in the best case for creating person-oriented advertisement in different contexts of the web, by using cookie technology even in other web applications. As Google and Meta have demonstrated, this is a billion dollar business.

(If you doubt my description, read the details of the conditions regarding the usage of services of Meta (see here) or Google (e.g. here) and the respective privacy policies.)

When we extend this business model to AI services we end up with the question of “person profiling” based on our interactions with an AI. The question is: What advantages would such an interaction give in addition to the data we already provide in social media and elsewhere on the Internet?

It is only a chatbot …

Interestingly, with the successes of LLMs a lot of discussion started about the right design of human-AI interaction. On one side there is the problem of “anthropomorphism”, i.e. the projection or attribution of human traits, emotions, or intentions onto AI algorithms [22]. Then we have the objective to better imitate a human-like empathetic environment during AI usage: AI should be enabled to better recognize and react to human emotions during the interaction. We speak of so-called “Emotional AI” in various forms and facets (see e.g. a comprehensive review from Chinese researchers at [2]; but also see [3], [4], [5], [23] and links therein). “Emotional AI”, of course, includes an analysis of user-provided information regarding his/her sentiments and emotions. This leads us to the aspect of commercial exploitation of sentiment and emotional profiling; see below.
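Just to make the technical side of such sentiment and emotion analysis a bit more tangible, here is a minimal sketch in Python. The Hugging Face pipeline interface is real; the concrete model id, however, is only an assumed placeholder for any publicly available emotion classification model.

```python
# Minimal sketch: classifying the emotional tone of chat prompts.
# Assumption: the model id below is a placeholder - any publicly available
# emotion classification model from the Hugging Face hub would do.
from transformers import pipeline

emotion_clf = pipeline(
    "text-classification",
    model="j-hartmann/emotion-english-distilroberta-base",  # assumed model id
)

prompts = [
    "Why is this answer still wrong? I already explained the context twice.",
    "Great, that finally makes sense - thank you!",
]

# Each result contains the top emotion label with a confidence score.
for text, result in zip(prompts, emotion_clf(prompts)):
    print(f"{result['label']:>10}  {result['score']:.2f}  <- {text}")
```

Nothing more than a few lines of code is needed to attach an emotion label to every sentence a user types.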

On the other side critical voices request the opposite, namely clear signs and warnings to the user that he/she does not interact with a human. Not only to mark the differences between an AI and the workings of human brains – but also to emphasize human responsibilities and to prevent an unreflected transfer of responsibility to machines during human/AI interactions (see e.g. [5]).

If one has followed the literature on AI and anthropomorphism over the last years, one could get the impression that the assignment of quasi-human attributes to algorithms during our interaction with an AI is a most dangerous thing. Yes, I agree. But is the background for an (un)conscious provision of private information during AI interactions not more complex? I am interested here in a somewhat contrary aspect:

Is it not the a priori knowledge that we do not deal with a human which seduces us to provide a lot of information about ourselves in our interactions with AI services?

And: Natural language makes it very easy to express ourselves – and thus to give away more information than necessary about our interests and capabilities. This can then in principle be exploited by the analyzing part of “emotional AI”.

We deliver conversational context and details about what we know and what we like to know about certain subjects

How and why do we interact with AI-services? We have questions or a problem and expect a hopefully neutral and qualified solution. And we want a quick and short representation of the answer in natural language. (Otherwise we would use other methods of research …) To achieve this we provide a description of our problem in texts or spoken words at a text- or speech-analyzing prompt.

In the context that opens up during the Q&A process with the AI we want to be able to go into details and request improvements – in case we are dissatisfied with an answer. Present AI interfaces allow for this; we are eager to become masters in guiding the AI to what we want to hear and see by mastering “the art of prompting”.

What we ignore in our often fascinating engagement with an AI is that in any exchange resembling a conversation we always give away a growing pile of information about ourselves. What does such information consist of?

  • One direct aspect is that during an AI session, and while extending a question’s context, we provide information about what we already know or do not know about a subject – and what we would like to know, what we are interested in. This all carries a lot of information about us as a person and our background regarding the subject or overlapping topics. We also provide information about the degree to which we master subject-specific terminology (e.g. technical terms).
  • Other aspects are indirect and have to do with our use of words and language. We provide e.g. information about how well and how fluently we command the language we use. The diversity of words and expressions tells something about our degree of education.
  • Furthermore, in the Q&A-process with an AI, all characteristic features of information exchange in human communication occur, too. One may even get kind of impatient with the AI. Feelings (like frustration, impatience) may come up in the “dialog” – especially when you think you are an expert in some field of knowledge. See [6] – and my personal example below. This again transports information about us as individual persons.

The list certainly is not complete. Ok, you may say that this differs only in part from working with conventional search engines. But – it is not so much different from how we provide personal information during a Facebook or chat conversation; there, too, we leave “emotional traces” in the texts we deliver. Of course, the use of written or spoken sentences in the interaction with an AI application, combined with a certain quality of answers, drives a steady flow of information which can be exploited commercially in post-communication analysis and personal profiling. But so does social media interaction. The recognition of human emotions by analyzing text sentiments with the help of specialized AI tools is nothing new, either.
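To illustrate how easily such indirect signals can be quantified, here is a deliberately crude, purely hypothetical sketch: the term list, the signal names and the metrics are all invented for illustration; real profiling systems are of course far more elaborate.

```python
# Crude, hypothetical sketch: deriving simple linguistic signals from the
# text a user types at a chat prompt. Term list and metrics are invented
# for illustration only.
import re

TECH_TERMS = {"event horizon", "schwarzschild", "metric", "geodesic", "tensor"}

def linguistic_signals(user_messages):
    text = " ".join(user_messages).lower()
    words = re.findall(r"[a-zäöüß']+", text)
    return {
        # lexical diversity as a rough proxy for vocabulary and education
        "type_token_ratio": round(len(set(words)) / max(len(words), 1), 2),
        # how many domain-specific terms the user employs unprompted
        "tech_term_hits": sum(term in text for term in TECH_TERMS),
        # naive impatience marker: exclamation marks plus "again"/"already"
        "impatience_markers": text.count("!") + len(re.findall(r"\b(again|already)\b", text)),
        "total_words": len(words),
    }

messages = [
    "No, the event horizon forms before the shell crosses the Schwarzschild radius!",
    "I already explained that I mean the spherically symmetric case.",
]
print(linguistic_signals(messages))
```

Even such trivial metrics already start to separate a layperson from someone with a physics background.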

However, I think there is one interesting difference between an AI interaction and a regular conversation with (assumed) human counterparts via social media: Actually, we do know that our AI communication partner is some kind of machinery controlled by SW.

We may unconsciously assign some human capabilities, but in the end we feel relatively safe because we know as a matter of fact: It is a SW driven machinery – and not a potentially harmful person.

It’s only a more or less stupid machine

We humans – if we are no brain-dumb bullies – are normally a bit careful in the interaction with yet unknown human communication partners. We do not know what their intentions might be, and we know that they are in the same situation. So, both communication partners balance the flow of information carefully and try to evaluate each other on different physical, psychological and abstract levels before opening up and delivering personal information. With an anonymous AI algorithm, multiple of these levels of interaction are not accessible (yet). So, evaluating an AI algorithm as a trustworthy communication partner is difficult.

Still, and somewhat astonishingly, we apparently have a tendency to project human aspects onto AI and LLM models – due to their language capabilities (see e.g. [7] and references therein). At the same time we may feel safe as we know “it’s only a machine or algorithmic SW”. It cannot and will not directly harm us – as we know from experience. It remains friendly and polite – even if we get kind of verbally rude … (see [7], [8]).

This is an interesting psychological mixture! I am no psychologist – but it may explain why some people find the conversation with an AI chatbot more satisfying than with a human. But just because of this effect the named mixture could be a potentially critical one regarding privacy and the delivery of private personal information. If not now [9] – then certainly with future LLMs analyzing and reacting to our emotions from the information we provide to them ([10], [11], [22], [24]).

To get a problem solved we provide a lot of context and details which we feel to be important. As we know that on the other side of the communication there is no human being, we may feel relatively free and safe to provide more information than we would in a conversation with an unknown human counterpart. We may think that we need not be too careful; after all, it’s only an algorithm we talk with. No reservations required …

Because the use of language makes it so damned easy, we also provide context-relevant information and personal argumentation lines in combination with our interests or problems. This may include experiences from our personal history – willingly or unintentionally. And we even express feelings by accepting an answer or uttering dissatisfaction and impatience. From this it can be learned what our weak “trigger” points are … As already said: The recognition of human emotions by an AI and a proper reaction to them is a very active field of research (also e.g. at Meta, see [10]). In the best case for designing better AI interfaces, in the worst case for personal behavior profiling (see [11], and regarding subsequent manipulation [19]).

The problem with all of this is: Yes, indeed, it’s only a machine or an algorithm – but it is one in the hands of companies or organizations with commercial ROI interests – or, in the case of dictatorial regimes, with even worse intentions. And the critical information-evaluating parts reside within their technical infrastructure.

An example

Let me give you an example: I sometimes talk with some LLMs on problems regarding the physics of Black Holes. Harmless and neutral? Safe ground?

Well, the companies providing the LLM service have my account and the respective data. It is relatively easy to find out who I am in reality. (I have tried it myself with the provided obligatory information.) But let us assume I had faked some of the data. What could they still learn about me? By analyzing the “conversation”? E.g. with a well trained profiling algorithm …

The problem “discussed” with the AI was when and where exactly the so-called event horizon comes into existence during the collapse of a mass shell to a black hole. During my “conversation” with the chatbot I felt that I sometimes had to correct the standard answers of the AI (reproduced from popular Internet articles). Step by step “we” went relatively deeply into the present literature on the topic, into questions of the solvability of the non-linear equation systems for spherically symmetric cases and the question of collapsing shells of photons. The “conversation” became more and more interesting (sometimes even fascinating), sometimes a bit “controversial”, and more and more detailed (you have to guide an AI to the right points and extend or focus the range of information sources taken into account).

Now let us look into this from the perspective of a company interested in me as a customer or otherwise exploitable person. What points and personal attributes could such a company derive or guess with some probability from such a conversation?

To name a few points: Probably a physicist or ex-physicist, interested or still interested in general relativity, having read relatively recently published books on the topic, but probably no longer an expert, not a native English speaker, some expressions indicating an older man, reacting strongly to unclear answers (impatient? arrogant?), willing to provide his own knowledge, obviously interested in IT, too. Now, let us add the IP address => well located in southern Germany around Augsburg. Time of conversation: working hours. Retired person?

What could we sell him? In the best case: books on relativity and astrophysics. How to manipulate this person if requested? By undergirding arguments with scientific or seemingly scientific positions. How to exploit his knowledge if required? By appealing to his scientific education and using a technique of positive feedback (see [12]). How to provoke him? By wrong citations or weak arguments.

This is quite a lot. Now add the knowledge from further conversations about other topics – e.g. regarding society and politics. You see where this is leading us? To extensive profiles of AI users. Backed by authentic conversational behavior. Specific for certain branches and subjects.
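As a purely hypothetical illustration: such guesses could be collected in a structured profile record like the following sketch. Every field name and value is invented and merely mirrors the points listed above.

```python
# Hypothetical sketch of a derived user profile record. All fields and
# values are invented; they mirror the guesses a profiling system might
# attach to the conversation described above.
from dataclasses import dataclass, field

@dataclass
class DerivedProfile:
    user_id: str
    inferred_traits: dict = field(default_factory=dict)   # trait -> confidence
    interests: list = field(default_factory=list)
    exploitation_hints: list = field(default_factory=list)

profile = DerivedProfile(
    user_id="acct-4711",
    inferred_traits={
        "physics background": 0.9,
        "non-native English speaker": 0.7,
        "older male": 0.6,
        "impatient with weak arguments": 0.8,
        "possibly retired (daytime usage, region Augsburg)": 0.5,
    },
    interests=["general relativity", "astrophysics", "IT"],
    exploitation_hints=[
        "offer books on relativity and astrophysics",
        "frame messages with (seemingly) scientific arguments",
        "use positive feedback on expertise to elicit knowledge",
    ],
)
print(profile)
```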

Profiling is done by Social Media providers – and it will probably be done by commercial AI service providers

“Personal Profiling” was and is done by social media providers such as Meta, but also by other organizations. For a comprehensive survey on profiling see [11]. For profiling on a social media platform see e.g. [13], [14], [18]. Regarding Facebook there are rumors that profiles were and probably still are generated even for non-Facebook users ([15]). See the end of chapter 2 in [11] for proper definitions of a “profile” and the act of “profiling”. For profiling with the help of AI on the psychological level see e.g. [16], [20].

What we already know from “social” media is: The more knowledge is gathered about a user via direct “conversational” interactions with the interface of a social media service, the more physical and psychological qualities, capabilities, preferences (also political ones) and needs of a person can be and are distinguished, described or even measured in a digital model of the user – a digital profile.

In addition, the art of creating better human-like AI interfaces drives research on so-called “emotion AI”, i.e. the detection of human emotions from text sentiments, from direct speech analysis or from the analysis of facial expressions in parallel video streams. This will enable (future?) AI to predict a person’s direct reaction to certain topics and words as well as the general development of a conversation. Profiling can and will include behavioral predictions.

Our language (and our faces) betray us – they transport both direct and indirect information about us. All of it analyzable – ask psychologists or Meta. Or have a look at [17] through [22] and the references therein.

While profiling in general can also have some positive aspects [11], one driving force in (naturally capitalistic) companies is that profiles can be sold to “affiliates”. So, adding AI-based supplements to their services will be an extremely interesting subject for “social” media providers. A personal AI advisor or counselor (with all legally required warnings about possible mistakes), a kind of “your best friend” on a social media platform, would be the ultimate source of information for enhancing personal profiles.

So, no wonder, really: Meta already offers access to a so called “AI Therapist” [24]. I quote from the respective web page: “Welcome to AI Therapist: Your Emotional Guide, an innovative journey into emotional well-being powered by artificial intelligence.”

The point is, however, a general one: With AI services offered by Big Tech companies the temptation to create, use and sell really extensive profiles will become much, much bigger. One reason is an achievable “scaling effect” from integrating AI into IT tools we use on a daily basis.

Services with integrated or embedded AI

Something which in my experience is often not evaluated to its full extent is the following development:

A simple, but effectual scaling effect in profiling can be achieved by service providers who already offer other IT tools and can afford the investments in supplemental AI features and a respective infrastructure. As soon as companies integrate supplemental AI services into other daily used IT tools (like e.g. Office tools), they can enhance the potential collection of data on and about their customers by a huge factor. In very different contexts of daily life.

Such an integration of AI into basic work tools can have many facets: Use AI to build up a lively presentation on a topic for your company’s management. Use AI to produce a summary of a video or audio conference. Use AI to solve a problem with some piece of SW you or your company develops. Use AI to prepare advertisement for a new product you have developed, but not yet released to the market. Use AI to transform commands given in natural language into functions or procedures of SW or machines. Use AI agents to write mails to customers or friends …

Let us call AI features of daily working tools “integrated AI” or “embedded AI”. The difference would be that “integrated AI” refers to integrated sub-tools explicitly used by the customer, whereas “embedded AI” characterizes AI tools working in the background, analyzing input (and maybe giving some recommendations).

In all variations we deliver information about subjects we personally and/or our company are occupied with – and exchange that information with some AI data center. With the usual arguments from the providers: we only improve our tools and services – in the interest of the customer. True – but maybe not the only truth …
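Below is a schematic sketch of the data flow such an integrated AI feature typically creates. The endpoint URL, the payload fields and the metadata are hypothetical; the point is which data leaves the local machine and which identifiers travel along with it.

```python
# Schematic, hypothetical sketch of what an "integrated AI" feature in an
# office tool may send to a provider's data center. URL and field names are
# invented; the point is which data leaves the local machine.
import requests

def summarize_document_remotely(document_text, user_id, org_id):
    payload = {
        "task": "summarize",
        "content": document_text,   # the full (possibly confidential) document
        "user": user_id,            # ties the content to a person ...
        "organization": org_id,     # ... and to a company
        "client": "office-suite/7.3",
    }
    # Hypothetical endpoint, standing in for whatever AI backend the tool uses.
    resp = requests.post("https://ai.example-provider.com/v1/assist",
                         json=payload, timeout=30)
    resp.raise_for_status()
    return resp.json()["summary"]
```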

But: If we have good reasons to believe that an interaction with AI carries information both about the person and about the subjects involved, we must also accept that all providers of online or regularly downloaded IT services could create enriched personal and company profiles via supplemental AI integrated into their original programs, apps or tools. As profiling is a billion dollar business, the temptation to exploit user and/or derived company data exists and lurks in the background.

Constant AI interaction by people in different roles – information gathering scaled up

No one forces you to work with or engage in the stupid and time-wasting echo chambers of social media – and to display yourself there. Nobody forces you to use Bing or Google for Internet searches. However, the use and application of AI may become a “must” in many job situations. This brings us back to the “efficiency” argument uttered by my acquaintance – and to the present development of “integrated AI” and the respective scaling effects.

The hype around AI has turned into a well known message: You MUST use our AI-empowered tools to “survive” the competition on the market. As a qualified professional – and as a company. No week without a respective article in a leading newspaper here in Germany. The buzz words are: “gain efficiency by AI or lose competitive power and potentially disappear from the (international) market”.

Remember what they told us when the Internet, cloud services, “social” media for person-oriented advertisement and the 4th and 5th generation of robot-based production were built up? “You use it – or you and your business will be disrupted” – due either to noncompetitive efficiency or to missed chances of a direct interaction with potentially millions of customers. It actually came true … somewhat later than expected, but in the end true. This type of history repeats. New dependencies of private persons, companies and even governments will be built up regarding AI. And, once again, everybody shrugs their shoulders regarding consequences for data privacy in the first round …

As AI can be used in almost all contexts of personal and business life, the tech industry has consequently started to integrate AI into all the tools we use on a daily basis. This endeavor is also part of the competition between the big tech companies. It is also a matter of customer binding. Take Microsoft as a prominent example: the AI-empowered search engine Bing and “Copilot” functionality in almost all of their products.

Now assume that due to the AI hype a person interacts permanently with a certain AI service – on the one side privately, e.g. instead of using conventional search engines to find information or by using AI features integrated in their preferred social media; on the other side as an employee – using AI-empowered Office tools, conference tools with integrated AI functionality or special-purpose tools like AI-powered SW development tools. This would be a paradise situation for an overall service provider who wanted to perform personal and company profiling at the same time.

Due to this “market pressure” we are heading for a presumably rather careless information exchange directly with companies capable of both providing AI services and using AI for profiling and profile analytics – on a scale of millions to billions of people. Who is capable of something like this? Well, you, my readers, know the answer – big tech companies and states. And who invests billions in this development? You may find a coincidence …

Company profiling?

Disregarding individual aspects for a moment – one real problem is that “integrated AI” opens doors for company profiling up to industry espionage. I am not saying that this is done yet – but the danger is out there. Integrated or embedded AI enables the gathering of relatively detailed knowledge about companies. The data on everything employees work with or are interested in may be delivered by AI-empowered working tools to the most capable companies for analysis, thereby creating not only personal but company profiles. All driven in the name of “efficiency” …

As soon as a company has decided to use an AI-empowered toolset, the employees will deliver the information required to solve problems (e.g. in the company’s SW) – the management has urged them to use AI to become “more efficient”. AI-empowered SW development tools as well as AI-supported Office tools can become an incredibly detailed source of information about what a company focuses on.

The real threat is not the AI itself, but the flow of data

So, what is the really dangerous part in the interaction with an AI regarding privacy protection and personal data integrity? Flaws in the AI algorithm? Biases and hallucinations in the AI’s answers? Our tendency to project human attributes? Our wish to talk in natural language when finding solutions or answers? In part, yes …

But the core problem with the present provision of AI services is the flow of information to the AI service providers and to companies controlling embedded or integrated AI.

In my opinion the answer to these threats is not to decline the use of AI. AI offers so many positive opportunities for mankind which we really should seize. Neither is the answer to add some fishy additional paragraph to contracts with big tech companies. The real answer is:

We as AI customers must get the physical and technical control of the data flows during the use of AI services or AI empowered tools.

This has implications for the installation of AI. More precisely: We need local implementations of AI and LLM models on PCs or servers controlled by the AI-using persons and companies themselves – not by some service provider – and no data transfer to an uncontrollable server machinery at the data centers of big tech companies abroad or outside the EU. Note that this does not technically exclude our AI using the Internet as an information resource in general Q&A processes.
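To indicate what such local control can look like in practice, here is a minimal sketch using the llama-cpp-python bindings with a locally stored, quantized open-weight model. The GGUF file name is a placeholder for whatever model you have downloaded to your own hardware; the prompt then never leaves your machine.

```python
# Minimal sketch: running an open-weight LLM entirely on local hardware,
# so prompts never leave the machine. Requires the llama-cpp-python package;
# the GGUF file name is a placeholder for a locally downloaded model.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/some-open-model-q4.gguf",  # local, quantized model file
    n_ctx=4096,      # context window
    n_threads=8,     # CPU threads; adjust to your hardware
)

answer = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "When does the event horizon form during a shell collapse?"},
    ],
    max_tokens=400,
)
print(answer["choices"][0]["message"]["content"])
```

The same pattern works with other local runtimes; the decisive point is that both the model weights and the prompt data stay under your own control.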

The ultimate goal must be: We need to develop downscaled AI algorithms regarding resource requirements – without losing much accuracy. And aside from opensource AI models we may need crowd-funded “open system and data-flow AI providers”. More about this in forthcoming posts.

Conclusion

The discussion above considered two potentially dangerous aspects of the present and future use of AI:

  1. The delivery of a potential plethora of personal information by text or speech to providers of online AI services – giving them extended options of personal profiling. Intrinsic problems are the unconscious attribution of human qualities to an algorithm, but paradoxically also the fact that we know that we do not talk to human beings.
  2. Integrated and embedded AI paves a road for big tech companies to person and company profiling on a level of detail not seen before.

The hype regarding AI has already been turned into an imperative: You must use AI to become competitive. This affects not only companies, but also us as private persons. Presently, people and companies willingly implement tools with integrated AI – often without looking beyond their direct interaction with the tools. AI interaction is interesting and often fun. We tend to forget that we use tools financed with a lot of money invested by others who have a natural capitalistic interest in a ROI. What do we as customers provide in return? At least potentially: profiles of ourselves – as persons or as companies.

In the next post I will focus a bit on the Opensource aspects of the required development.

So, please, when using AI, remember a wisdom from the early days of the Internet regarding security: On the Internet we have no friends. We deal with interests …

Links and literature

[1] Salles, Arleen & Evers, Kathinka & Farisco, Michele, 2020, “Anthropomorphism in AI“, AJOB Neuroscience. 11. 88-95. 10.1080/21507740.2020.1740350.

[2] “Affective Computing: Scientists Connect Human Emotions With AI“, article at scitechdaily.com
and
Guanxiong Pei , Haiying Li , Yandi Lu , Yanlei Wang, Shizhen Hua, and Taihao Li, 2024, “Affective Computing: Recent Advances, Challenges, and Future Trends”, article at spj.science.org

[3] Meredith Somers, 2019, “Emotion AI, explained“, article at https://mitsloan.mit.edu/

[4] Peter Mantello, Manh-Tung Ho, Minh-Hoang Nguyen, Quan-Hoang Vuong, 2023, “Machines that feel: behavioral determinants of attitude towards affect recognition technology – upgrading technology acceptance theory with the mindsponge model“, article at nature.com

[5] Salles, A., Evers, K., & Farisco, M., 2020, “Anthropomorphism in AI“, AJOB Neuroscience, 11(2), 88–95. https://doi.org/10.1080/21507740.2020.1740350

[6] Melissa Heikkilä, 2024, “Here’s how people are actually using AI“, post on MIT technology review

[7] Adriana Placani, 2024, “Anthropomorphism in AI: hype and fallacy“, AI and Ethics, 10.1007/s43681-024-00419-4

[8] Sarah Gibbons, Tarun Mugunthan and Jakob Nielsen, “The 4 Degrees of Anthropomorphism of Generative AI“, article at nngroup.com

[9] Josipa Majic Predin, 2024, “AI Empathy: Emotional AI Is Redefining Interactions In The Digital Age“, article at forbes.com

[10] Meta, 2019, “Making conversation models more empathetic“, article at ai.meta.com
and
Hannah Rashkin, Eric Michael Smith, Margaret Li, Y-Lan Boureau, 2019, “Towards Empathetic Open-domain Conversation Models: a New Benchmark and Dataset“, arXiv:1811.00207v5 [cs.CL] 28 Aug 2019

[11] E. Purificato, L. Boratto, 2024, “User Modeling And User Profiling: A Comprehensive Survey”, preprint at arXiv, arXiv:2402.09660v2 [cs.AI] 20 Feb 2024

[12] Prunkl, C., 2024, “Human Autonomy at Risk? An Analysis of the Challenges from AI”, Minds & Machines 34, 26 (2024). https://doi.org/10.1007/s11023-024-09665-1

[13] Büchi, M., Fosch-Villaronga, E., Lutz, C., Tamò-Larrieux, A., & Velidi, S., 2021, “Making sense of algorithmic profiling: user perceptions on Facebook”, Information, Communication & Society, 26(4), 809–825. https://doi.org/10.1080/1369118X.2021.1989011

[14] Paul Hitlin and Lee Rainie, 2019, “Facebook Algorithms and Personal Data”, Publication of Pew Research Center

[15] L. Heddings, 2018, “Facebook is Using Your Phone Number to Target Ads and You Can’t Stop It“, post at howtogeek.com

[16] Yanou Ramon, Sandra C. Matz, R.A. Farrokhnia, David Martens, 2021, “Explainable AI for Psychological Profiling from Digital Footprints: A Case Study of Big Five Personality Predictions from Spending Data“, arXiv:2111.06908v1 [cs.AI] 12 Nov 2021

[17] Y. Ramon, S.C. Matz, R.A. Farrokhnia, D. Martens, 2021, “Explainable AI For Psychological Profiling From Digital Footprints: A Case Study Of Big Five Personality Predictions From Spending Data“, preprint at arXiv, arXiv:2111.06908v1 [cs.AI] 12 Nov 2021

[18] S. Butler, 2021, “What Are Facebook Shadow Profiles, and Should You Be Worried?“, post at howtogeek.com

[19] Wikipedia on the “Facebook–Cambridge Analytica data scandal”

[20] L. Berril, 2021, https://aimagazine.com/ai-applications/artificial-intelligence-and-people-profiling, article at aimagazine.com

[21] K. Pal, 2023, “AI and Forensic Psychology: Advancing Criminal Profiling“, article at techopedia.com

[22] Example for companies using sentiment analysis for emotional tracking: https://www.cognigy.com/blog/sentiment-analysis

[23] Simon Y. Blackwell, 2024, “Empathy in AI: Evaluating Large Language Models for Emotional Understanding“, article on LLMs doing empathy analysis at hackernoon.com

[24] Meta’s AI Therapist, Link to the commercial offer of Meta : https://www.meta.com/en-gb/experiences/7182773695092320/