We need Opensource based LLM tools and private AI services – I – ROI-pressure

About a month ago I wrote a post with some criticism regarding a German TV talk about AI. In particular, I criticized the statements of a professor regarding the present working mode of AI-tools like ChatGPT and others – both with respect to the online service for standard customers and with respect to the access to up-to-date resources.

While I think this criticism was justified, one comment of the professor actually indicated the direction Machine Learning and LLM-based AI should move in: The technology – and in particular LLM-based agent technology – should become compatible with private small-scale setups and technically limited PC or smartphone environments. The Opensource community should consequently engage and realize this objective in all possible ways. In addition, education regarding the installation and handling as well as the risks of LLMs should become an obligatory subject in schools.
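To illustrate how feasible this already is: the following minimal sketch runs a small open-weight chat model entirely on a private PC, without a single byte of prompt or answer data leaving the machine. (Assumptions on my side: a Python environment with Hugging Face’s transformers and torch packages installed; TinyLlama is just one example of a model small enough for a CPU-only laptop.)

```python
# Minimal sketch of a fully local, private LLM chat setup.
# Assumptions: "transformers" and "torch" are installed;
# "TinyLlama/TinyLlama-1.1B-Chat-v1.0" is one example of a small
# open-weight model that runs on an ordinary PC without a GPU.
from transformers import pipeline

# The model weights are downloaded once; afterwards everything
# runs on the local machine - no prompt or response leaves the PC.
chat = pipeline("text-generation", model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")

messages = [
    {"role": "user", "content": "Explain in two sentences what an LLM is."},
]
result = chat(messages, max_new_tokens=120)

# The pipeline returns the whole conversation; the last message
# is the model's freshly generated answer.
print(result[0]["generated_text"][-1]["content"])
```

Quantized variants of such models, run via tools like llama.cpp or Ollama, push the hardware requirements down even further – which is exactly the direction the professor pointed to.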

The arguments are mainly, but not only, based on privacy considerations. In particular, AI in business or eGovernment contexts may pose new dangers with respect to espionage. But due to competition, the risk of mistakes and general quality deficits will also rise. And the problematic cycle “AI training on Internet content” -> “more published AI-generated content” -> “new AI training on Internet content” should be broken.

Big investments drive a development to enforce the utilization by masses

Some days ago we could read that OpenAI produced a deficit of 5 billion dollars, but has an estimated shareholder value of around 85 billion. Rumors say that the daily operation of the AI chatbots alone comes with costs of around 700,000 dollars, barely compensated by customer fees. As with Tesla, investors obviously expect that returns on investment will come – e.g. by a kind of company-controlled mass distribution to customers and/or with future killer products used and bought by companies or private users.

On the other side of the present hype we see warnings of Goldman Sachs, which pointed out that the present and near-future ROIs might fall behind expectations, namely only around 25% of the total investments and costs for investors and consumers, respectively. Yesterday's decline of tech values at international stock markets might be a first reflection of this. ROI with respect to AI is not coming any time soon. In addition, we see a competitive race between the usual big tech companies regarding AI development and marketing. All in all, we are in a typical phase of an investment cycle where giant investments drive a prophecy which must fulfill itself – better sooner than later.

As with the Internet and smart mobile devices, the success of a new type of data tool comes with respective services of leading tech companies which make it easy and affordable for normal citizens worldwide to use the technology for “private” purposes. This in turn drives the competition between tech giants who can afford the investments in the required large-scale infrastructure.

Without doubt, AI in its present form of chat-interfaces to some seemingly “intelligent” machinery has the potential to reach the masses. This applies both to general-purpose AI-tools (such as chat- and Q&A-applications) and to specialized AI-agents supporting daily routines. As with other services, e.g. search engines and social media, profit appears to be a low-hanging fruit which depends only on vertical and horizontal up-scaling to reach broader masses of the Internet and tech-gadget users out there. And the hope is that along the way new doors to even more ROI may be opened by new applications … which is well possible, e.g. in combination with robotics.

What about profit generation? Here, we see an old pattern again: Either the tech companies earn money by fees and licenses which the customers have to pay to use highly specialized or high-quality AI services. Or, in particular for general-purpose tools, another standard approach is used: One generates money with advertisement (e.g. with a chatbot which “by the way” directs you to have a look at certain products to cover your needs) or by indirectly gaining a better understanding of who you are and what your needs are in certain contexts. A third path is to cleverly implement AI assistance as an easily accessible support element within other tool chains (office applications, SW-development, …) – and thus bind you, the customer, to this tool chain. You can bet that this was a motive behind Microsoft's ongoing investments in OpenAI.

Whatever the way, you need to reach the masses out there. That, at least, is what the giant investments in AI demand from the big tech companies. This is just how capitalism works – whatever you may think of it politically. Probably no tech leader in the US would contradict this.

Aside from mundane profit expectations we should, however, not forget anti-democratic players and actors on the political scene with interests in large-scale data collection about users for the purposes of manipulation and – in the worst case – for a better control of citizens.

The problem with mass-directed AI is that presently the up-scaling of the required infrastructure is so capital- and energy-intensive that only some commercial monopolists or states (like China) can finance it. So, in this case we have a recipe for capital, knowledge and power concentration.

Privacy and data protection in a world of AI applications provided by big tech companies?

Some time ago, I had an interesting conversation with an IT consultant I had once worked with. He asked me about privacy and data protection in the present world of generative AI. My counter-question was: Is there any, if you do not enforce it yourself by a careful design of your own AI/ML environment and your own AI/ML applications?

As with the use of any technology that depends on the user’s interaction with a server system of a big tech company, you in general have no control over what happens with the data you transmit. When you interact with Google, Meta, Amazon, Microsoft or even Chinese companies, there is no verifiable data privacy. And there is no reason to trust that such service providers give the privacy of customer data any priority when ROI or, in the worst case of anti-democratic players, a politically motivated analysis and control of the end users is requested.

Most often, when services of big tech players are “cost free”, you pay for them by delivering personal information: information about your interaction with the service, time and geo-location data, data about your smartphone or laptop, and so on. All these data can end up in a profile about you as a person, your abilities, your CV, your interests and your preferences in a lot of contexts. This certainly holds for search-engine services as well as for almost all social media services.

This information about you as a person and potential customer can be and is sold to so-called affiliates of the tech giants (just read the terms and conditions of big tech companies such as Google and Meta in detail). And typically, you as an end-user have almost no control over what in reality happens afterward with your data and where (i.e. in whose databases) they end up. Even the EU regulations leave big loopholes for circumventing the rules.

All of this is nothing new – and we have no indication that this should be different with AI services provided by the big tech players. So, what might be different this time?

Read more in my next post: We need Opensource based LLM tools and private AI services – II – profiles based on the interaction with AI services