What are AI Systems?

Currently needing some deeper insight into LLMs, what they do and especially how they do it, I decided to write down a few lines which I hope might be helpful. The post mainly focuses on, well, the dry basics and what implications these have or could have. In addition, it contains a short legal perspective, with the question: What is an AI System!?

So, this is the second try at writing this post. I’m honestly having a hard time keeping the post technical and not drifting off into beliefs, philosophy and static definitions without explanations.

Getting Started

AI in 2025, generally speaking, refers to the many current LLMs (Large Language Models) and the many products encapsulating them. To name a few:

  • Google’s Gemini
  • OpenAI’s ChatGPT
  • Anthropic’s Claude

While advertised similarly, each one seems to have different strengths and weaknesses. And why wouldn’t they? Their “skills” are directly dependent on how they were trained, which in turn is different for every single model out there. Alongside the many commercial and freemium models, there is a large number of open-source models.

Some of these are trained for general purposes and others have very specific skills, like generating images or audio, including speech.

Neural Network (Rough) Basics

If you want to be sure, choose a link from the bottom of the page and go for a better, nicer introduction.
Our current LLM approach is based on Neural Networks, which are often compared to the human brain, even though they aren’t quite as impressive, yet. What they have in common is the approach to learning:
Digital neurons, which we know, and biological neurons, which we presume, grow based on repetitive triggers. Take, for example, feeding a neural network “AB” and “AB”, again and again and again. The neural network will take “A” and “B” as symbols and convert them into “tokens”, a numerical representation, which, well, the neural network works on. By feeding it the same sequence again and again, the network learns that a “B” usually follows an “A”: it initially creates a connection between the “A” and the “B” and strengthens that connection over time. Working with real-world data, which is never perfect, there will also be a few “AC” and “AD” and “AA” and so on. Coming from here, each sequence has an individual probability. Having these sequences established, it would complete the sequence start “A” with a “B” and precede a “B” with an “A”.
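To make the “AB” idea a bit more tangible, here is a tiny, purely illustrative sketch. It only counts successors instead of training weighted neurons, but it shows the same “most probable next symbol” principle:

```python
from collections import Counter, defaultdict

# Toy illustration of the "AB" example: count which token follows which.
# Real LLMs learn weighted connections instead of explicit counts,
# but the underlying idea of "probable successors" is the same.
training_data = ["AB", "AB", "AB", "AB", "AC", "AD", "AA"]

successors = defaultdict(Counter)
for sequence in training_data:
    for current, following in zip(sequence, sequence[1:]):
        successors[current][following] += 1

def complete(token: str) -> str:
    """Return the most probable next token observed after `token`."""
    most_common = successors[token].most_common(1)
    return most_common[0][0] if most_common else ""

print(successors["A"])   # Counter({'B': 4, 'C': 1, 'D': 1, 'A': 1})
print(complete("A"))     # 'B' - the most probable continuation
```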
Now the fascinating thing about neural networks is that they’re a very flexible tool, a concept, a highly flexible set of algorithms. That is also why we only talk about tokens: (next to) everything can be split up into symbolic tokens. This way the same mechanism can be applied to everything! Sounds, images, text, chemical formulas, DNA…
The main challenges occur during the learning phase, when the tokens are generated for the first time. The main issue here probably isn’t the LLM or the algorithms, but us, the human factor. Our own perspective on the input might be biased, incorrect or simply not optimized for the neural network. As such, the neural network might simply need a differently prepared data set, or it will never establish an effective or smart structure and thus not work properly.

Does the Neural Network Understand the Data it Processes?

Philosophy and beliefs :-(
I always want to say no. And this is possibly for the completely wrong reason. I was taught in school that mankind was superior. Every time I said that an animal “thought something”, I was corrected that only humans “think”, and animals are “triebgesteuert”, driven by instinct or natural urges. In return, I don’t (often) touch a hot stove, because I know that it will hurt like hell. How do I know it? I was both told and have gathered the experience myself over the past few years. Especially the fact that anything radiating red or even white light will probably be hot. So, what happened? The Neural Network in my brain, well, my brain, learned it’s a stupid idea and thus created neurons and neural connections. The vital factor here is being able to match not only a stove to hot, but detailed parameters like the radiating light, which in turn also stops me from touching heated and super-heated metal objects.
For the philosophical part, I now have to ask the question whether I myself “understand anything” or I’m just controlled by the neurons in my brain. In return, if we define exactly this as “understanding”, and the process of getting from an input, the radiating light, to an output, don’t touch it, as thinking, then yes, Neural Networks also understand and think. They might not have the same capacity as the human brain and might not have training as diverse as an adult human’s, but they use the same mechanisms.
Which is also the reason why they’re called neural networks :)

How Does an LLM (Technically) Respond?

An LLM is a Neural Network which was trained with stacks of human language. Some LLM providers state that they fed their model the whole public internet as of a certain date. As such, its neurons and their connections were established with text. Just as with our “AB” example, it contains a weighted map of human language, or rather a tokenized version of it. This means, if you ask “What colour is the sun?” and it responds with “Yellow”, it does >not< take the sentence apart and:

  • Derive it being a question from the question mark
  • Extract the topic “sun”
  • And the requested parameter “colour”

Instead, it takes the question, converts it into tokens, treats these as a sequence and then completes it.
Funnily enough, this takes me back to my comment on “understanding”: I feel I need to correct myself and say Neural Networks do not “understand”, but hey, this is one of those questions of faith. However,…

Just as with the input, the output is generated token by token. The system will process the input and then

  1. Return the most probable next token for the input
  2. Return the most probable next token for the input + output token 1
  3. Return the most probable next token for the input + output token 1 + output token 2
  4. Return the most probable next token for the input + output token 1 + output token 2 + output token 3

So, we could call it a recursive function, fetching data from a weighted graph, a super complex weighted graph.
The only “issue” right now: it always returns the same response for the same initial request.
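As a sketch, the loop above could look roughly like this; `most_probable_next_token` and `end_of_sequence` are hypothetical stand-ins for one full pass through the model and its stop marker:

```python
def generate(model, input_tokens: list[int], max_new_tokens: int = 50) -> list[int]:
    """Token-by-token generation: every step feeds the input plus
    everything generated so far back into the model."""
    output_tokens: list[int] = []
    for _ in range(max_new_tokens):
        # One pass over input + output token 1 + output token 2 + ...
        next_token = model.most_probable_next_token(input_tokens + output_tokens)
        if next_token == model.end_of_sequence:
            break
        output_tokens.append(next_token)
    return output_tokens
```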

Why do LLM Responses Vary?

Building a response based on probabilities, on a static system, should always result in the same response for the same question. So why isn’t this what happens? The marketing response would probably contain some references to artificial intelligence, creativity and thinking systems. The technical response is randomization. Most LLMs have a temperature setting / variable, which may either be static or passed with each request, and which allows the model, instead of always responding with the most probable token, to select one of the other possible options. The actual selection mechanism may be more or less random, more or less systematic, or based on some magic table.
So, they just vary, because they’re made to vary :)
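A minimal sketch of what such a temperature setting does, assuming the model returns a score per candidate token: with a low temperature the most probable token almost always wins, with a higher temperature the other options get a realistic chance.

```python
import math
import random

def pick_next_token(scores: dict[str, float], temperature: float = 0.8) -> str:
    """Turn per-token scores into probabilities and sample one token.
    A temperature near 0 almost always picks the most probable token;
    higher values give the alternatives a realistic chance."""
    scaled = {token: score / max(temperature, 1e-6) for token, score in scores.items()}
    max_scaled = max(scaled.values())  # subtract the maximum for numerical stability
    weights = {token: math.exp(value - max_scaled) for token, value in scaled.items()}
    total = sum(weights.values())
    tokens = list(weights)
    probabilities = [weights[token] / total for token in tokens]
    return random.choices(tokens, weights=probabilities, k=1)[0]

# "Yellow" is the most probable answer, but every now and then another
# candidate is returned - the responses vary by design.
print(pick_next_token({"Yellow": 2.0, "White": 1.2, "Green": 0.1}))
```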

Additionally, the systems surrounding LLMs implement parallel processing of multiple inputs by multiple users. An implication of this is that multiple users may actually affect each other’s interactions. Let’s see if this may result in creative attacks in the future!

Learning!

For most current use cases of LLMs, there are dedicated learning and usage phases. The reason for this is easy: too many people enjoy having fun! Looking at the various publications on the internet, the first publicly released self-learning chat bots all went slightly bonkers; some turned suicidal, others went Nazi. The only way these systems had of perceiving the world were the chats with people, so every chat slightly shifted the weights of their neurons and connections. So if enough people tell it “the sun is green”, the weight of the “the sun is yellow” connection will become less probable than the new, green, alternative.
That said, these systems lacked the concepts of hard beliefs, political standpoints or individual principles. For example, in the railway world, the protection of human life is the highest good. No matter how awesome a system is, if there is a notable chance of harming somebody, it will be limited in use. In return, this principle is nothing more than a principle: it was a decision that turned into policy and is now no longer questioned.
Yes, one could add these hard principles to AI systems and enforce them, but this would in return create a significant bias. From a dry insurance perspective, for example, a death might not be the worst-case scenario, as it could result in a lower payout than a major long-term injury. If you don’t want to believe this, you’ve just proven the strength of the principles I tried to explain! ;)
Here it is important to distinguish between learning and context!

Context

As described above, the LLM processes input as tokens and goes “recursive” when generating the output. Each token fed into the LLM adds to the context for the next response. So, when chatting with an LLM for 5 minutes, the last reply will be based on the whole conversation that happened previously. Technically, you will probably have a specific context which grows per session.
The same concept can be used to establish a larger context, for example by feeding the contents of a wiki, documents or a different data source into the session. The LLM does not learn the new information, but uses its skills to process it and then allows it to be used for the rest of the session.
This also allows adjusting a context in contradiction to what the LLM has actually learned. By defining the sun as being green within a single session, the LLM will treat the sun as green during the ongoing dialogue, without affecting any other session / user.
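A small sketch of how such a session context is typically held, with a placeholder `chat` function standing in for any real LLM call: the “green sun” only exists inside this growing message list, never in the model’s weights.

```python
def chat(messages: list[dict]) -> str:
    # Placeholder for a real LLM call (e.g. a locally hosted model);
    # the important part is that the *whole* history is passed every time.
    return f"(reply based on {len(messages)} messages of context)"

# The session context is nothing more than a growing list of messages.
session: list[dict] = [
    {"role": "system", "content": "For this conversation, the sun is green."},
]

def ask(question: str) -> str:
    session.append({"role": "user", "content": question})
    answer = chat(session)                      # the model sees everything so far
    session.append({"role": "assistant", "content": answer})
    return answer

ask("What colour is the sun?")   # answered within the session's "green" context
# Other sessions / users are untouched and still get the learned answer: yellow.
```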

Marketing

Combining the complex structure of Neural Networks with the broad training that turns them into an LLM, plus a little bit of randomization, makes a pretty convincing product! Systems based on LLMs can perform some impressive tricks and execute tasks many might have thought were impossible to automate.
Here it is simply vital to understand that everything the system does is based on a “static” decision tree plus a bit of randomness. Which does not make it any less impressive or saleable!

Static Decision Tree

One of the largest challenges working with this kind of system is that it’s so complex, we as humans currently have no way of measuring the underlying Neural Network to understand why it makes which decision. Here I can just offer a small example:
A few years back I was invited to an internal summit at Intel in Portland, where one of the talks covered Machine Learning based fault detection. As an example for our lack of understanding, a Neural Network for image classification was presented. Its main skill: identifying pictures of elephants. To understand how the system actually detected an elephant, parts of the exemplary image were covered with black squares, until only the parts necessary for the detection were left.
The necessity for this approach hopefully summarizes the issue sufficiently.

Still, from my current understanding: drop the temperature / creativity / randomization, drop self-learning and parallel use, and you have the same behaviour as a static decision tree. It might be so massive that it’s impossible to draw, but well, it is what it is.

Along Comes: The EU AI Act

The EU AI Act is the European Union’s solution to ensure a safe and secure usage of AI, or at least the attempt to do so!

The EU AI Act defines an AI system in Article 3:

‘AI system’ means a machine-based system that is designed to operate with varying levels of autonomy and that may exhibit adaptiveness after deployment, and that, for explicit or implicit objectives, infers, from the input it receives, how to generate outputs such as predictions, content, recommendations, or decisions that can influence physical or virtual environments; 

As I had some trouble properly understanding the definition (so much for me “understanding” things), I decided to split it into its parts and comment on each one.

As a reference “AI system”, I’ll be rating a simple command-line chatbot based on Ollama with an arbitrary LLM / model behind it. It just offers the chat via the terminal; it is not connected to any other systems. In addition, the model does not learn on my box.
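Such a chatbot fits into a handful of lines of Python against Ollama’s local REST API; a sketch, assuming Ollama is running on its default port 11434 and the model named below has been pulled:

```python
import requests

OLLAMA_URL = "http://localhost:11434/api/chat"   # Ollama's default local endpoint
MODEL = "llama3"                                  # example; any locally pulled model

history: list[dict] = []

while True:
    user_input = input("you> ")
    history.append({"role": "user", "content": user_input})
    reply = requests.post(
        OLLAMA_URL,
        json={"model": MODEL, "messages": history, "stream": False},
        timeout=300,
    ).json()["message"]["content"]
    history.append({"role": "assistant", "content": reply})
    print("bot>", reply)
```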

For each part of the definition, I list the raw text, the explanation given for it in Recital 12, my interpretation, and how my self-hosted, Ollama-based chatbot rates against it.

‘AI system’
  • Recital 12: The notion of ‘AI system’ in this Regulation should be clearly defined and should be closely aligned with the work of international organisations working on AI to ensure legal certainty, facilitate international convergence and wide acceptance, while providing the flexibility to accommodate the rapid technological developments in this field. Moreover, the definition should be based on key characteristics of AI systems that distinguish it from simpler traditional software systems or programming approaches and should not cover systems that are based on the rules defined solely by natural persons to automatically execute operations.

means a machine-based system
  • Recital 12: The term ‘machine-based’ refers to the fact that AI systems run on machines.
  • My interpretation: Can be ignored during a classification, as everything we currently have “runs on some form of machine”.

that is designed to operate with varying levels of autonomy and
  • Recital 12: AI systems are designed to operate with varying levels of autonomy, meaning that they have some degree of independence of actions from human involvement and of capabilities to operate without human intervention.
  • My interpretation: From a hacker’s perspective, no autonomy is also a level of autonomy: zero. Reading through, the intention seems to be to imply the possibility to operate without human intervention; as such, being able to do something without interaction is a MUST criterion when classifying something as an AI system.
  • My Ollama chatbot: I wouldn’t call it autonomous, but read below, it is :)

that may exhibit adaptiveness after deployment,
  • Recital 12: The adaptiveness that an AI system could exhibit after deployment refers to self-learning capabilities, allowing the system to change while in use.
  • My interpretation: “Adaptiveness” is combined with a “may” in the original text and a “could” in the description, thus this aspect is treated as a MAY criterion.
  • My Ollama chatbot: While, the way LLMs work, it does use contexts, it does NOT learn or change while in use.

and that, for explicit or implicit objectives,
  • Recital 12: The reference to explicit or implicit objectives underscores that AI systems can operate according to explicit defined objectives or to implicit objectives. The objectives of the AI system may be different from the intended purpose of the AI system in a specific context.
  • My Ollama chatbot: Well, it’s an LLM, so this applies.

infers,
  • Recital 12: A key characteristic of AI systems is their capability to infer. This capability to infer refers to the process of obtaining the outputs, such as predictions, content, recommendations, or decisions, which can influence physical and virtual environments, and to a capability of AI systems to derive models or algorithms, or both, from inputs or data. The techniques that enable inference while building an AI system include machine learning approaches that learn from data how to achieve certain objectives, and logic- and knowledge-based approaches that infer from encoded knowledge or symbolic representation of the task to be solved. The capacity of an AI system to infer transcends basic data processing by enabling learning, reasoning or modelling.
  • My interpretation: Describing “inferring” as a “key characteristic”, the capability of inferring is a MUST criterion when classifying something as an AI system. The existence of a “learning phase” is an optional aspect. That said, following the last sentence, either learning, reasoning or modelling must have been applied.
  • My Ollama chatbot: Now either LLMs infer by definition, or they don’t. I’m going for “they do” in this case. Further details below.

from the input it receives,
  • My Ollama chatbot: It receives input, yes.

how to generate outputs such as
  • My Ollama chatbot: It does generate outputs, yes.

predictions,
  • My Ollama chatbot: A prediction may follow logical patterns, which my LLM can do, so yes.

content,
  • My Ollama chatbot: Well, whatever content is, so yes.

recommendations,
  • My Ollama chatbot: I guess, yes?

or decisions
  • My Ollama chatbot: I like to believe I make the decisions, while the LLM gives me the recommendations, so a soft no.

that can influence
  • My Ollama chatbot: This one is getting tough.

physical or
  • My Ollama chatbot: The LLM is not connected to an API, a shell or anything else; it gives me recommendations, I make the decision, then I influence something. So no!

virtual
  • My Ollama chatbot: Same as above!

environments;
  • Recital 12: For the purposes of this Regulation, environments should be understood to be the contexts in which the AI systems operate, whereas outputs generated by the AI system reflect different functions performed by AI systems and include predictions, content, recommendations or decisions.

Just as an extra note, Recital 12 also states: “AI systems can be used on a stand-alone basis or as a component of a product, irrespective of whether the system is physically integrated into the product (embedded) or serves the functionality of the product without being integrated therein (non-embedded).”

It is important to note the difference between an AI system and an LLM. Reading through the AI Act, an AI system is the resulting system / product wrapped around the LLM (or a different AI approach), including the user interface and potential connections to other systems. The LLM is “just” a model which, depending on its capabilities, might be classified as a General Purpose AI Model.

Here are the applicable definitions:

(63) ‘general-purpose AI model’ means an AI model, including where such an AI model is trained with a large amount of data using self-supervision at scale, that displays significant generality and is capable of competently performing a wide range of distinct tasks regardless of the way the model is placed on the market and that can be integrated into a variety of downstream systems or applications, except AI models that are used for research, development or prototyping activities before they are placed on the market;
(66) ‘general-purpose AI system’ means an AI system which is based on a general-purpose AI model and which has the capability to serve a variety of purposes, both for direct use as well as for integration in other AI systems;

My Self-Hosted LLM Chat Based on Ollama

So is it an AI system? Following the definition and my interpretation, it is not an AI system. While behaving like one, it lacks the direct impact. While it definitely influences me, just the way Google, Wikipedia, Stack Overflow, learn.microsoft.com or any other source of documentation does, it does not affect any system and thus no environment.
Having used the soft phrase ‘influence’, it may be arguable whether it influences an environment indirectly via me. Here I’m opting for a strong no, as whatever I do is my responsibility.
In addition, the described system does not act autonomously, but only responds to my input.

Turning it into an AI System

From my understanding, the first step would be to give the system an interface allowing direct interaction with an environment, for example the ability to execute shell commands on a Linux box.
This way the instruction “Create a new file called cookies.md in /home/sec-bits/” would result in the LLM interpreting my request and the AI system then executing a command on the connected system and, well, hopefully creating a new file. While I’m definitely responsible for the instruction, I have no control over the command the system executes. I can just hope that, by statistical testing, it has been ensured it will not by accident run “rm -rf /”. (That said, in a scenario like this, you should probably prohibit certain commands from being executed, as sketched below.)
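A deliberately naive sketch of such a wrapper; `ask_llm` is a hypothetical stand-in for whatever model call is used, and the crude blocklist only illustrates the “prohibit certain commands” remark rather than being a real safety measure:

```python
import subprocess

FORBIDDEN = ("rm -rf", "mkfs", "dd if=")   # crude, incomplete blocklist

def ask_llm(prompt: str) -> str:
    """Hypothetical model call; expected to return a single shell command."""
    raise NotImplementedError("wire this up to the LLM of your choice")

def run_instruction(instruction: str) -> None:
    command = ask_llm(f"Return exactly one shell command that does: {instruction}")
    if any(bad in command for bad in FORBIDDEN):
        print("refusing to run:", command)
        return
    # The moment this executes, the system directly influences its environment.
    subprocess.run(command, shell=True, check=False)

# run_instruction("Create a new file called cookies.md in /home/sec-bits/")
```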

Autonomy

The final open aspect is autonomy. Is turning my written description of what I want into a shell command and then automatically executing it what is meant by a minimal level of autonomy? I’d be happy to say yes, but am not sure what the authors aimed for.
If autonomy rather implies a temporal factor, the AI system “deciding itself” when it’s time to create a new file, the question arises whether a cron job implies autonomy. Yes, it does run by itself, but it is started based on a timer, which I honestly wouldn’t treat as autonomous, but rather as automated. Coming from here, I would probably only use the word “autonomous” in combination with the self-learning aspect. For me, “switch on the heating when it’s colder than 5°C” is the same as a cron job, just triggered by a temperature value and not by time. Still, both are simply implemented as interrupts or triggers. If, in return, I use my AI system to switch on the heating every time the temperature drops below 5°C, and at some point my AI based building management system does it automatically, that was definitely autonomous, but also a new trick it learned. That said, it could also be modelled using a massive context… but…
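For comparison, the heating example from this paragraph is literally a threshold trigger and nothing more; `read_temperature` and `switch_heating` are hypothetical stand-ins for a sensor and an actuator:

```python
import time

def read_temperature() -> float:
    """Stand-in for reading a real temperature sensor."""
    return 4.2

def switch_heating(on: bool) -> None:
    """Stand-in for driving a real actuator."""
    print("heating on" if on else "heating off")

while True:
    # A plain trigger: no learning, no autonomy - just a threshold check,
    # exactly like a cron job that fires on a timer instead of a temperature.
    switch_heating(read_temperature() < 5.0)
    time.sleep(60)
```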

During my final read of this blog post, I stumbled upon the EU Commission’s guidelines on AI system definition, which confirm my hunch.

(18) For example, a system that requires manually provided inputs to generate an output by itself is a system with ‘some degree of independence of action’, because the system is designed with the capability to generate an output without this output being manually controlled, or explicitly and exactly specified by a human. Likewise, an expert system following a delegation of process automation by humans that is capable, based on input provided by a human, to produce an output on its own such as a recommendation is a system with ‘some degree of independence of action’.

Openly said, I’m a little sceptical about the definition of “a level of autonomy” here, as it seems that, by this definition, everything that infers automatically has a degree of autonomy, even if it does not perform an action, but just generates an output, like a text response.

Influencing Environments

For completeness, here is the extract on “influencing environments” from the EU Commission’s guidelines on AI system definition:

(60) The seventh element of the definition of an AI system is that system’s outputs ‘can influence physical or virtual environments’. That element should be understood to emphasise the fact that AI systems are not passive, but actively impact the environments in which they are deployed. Reference to ‘physical or virtual environments’ indicates that the influence of an AI system may be both to tangible, physical objects (e.g. robot arm) and to virtual environments, including digital spaces, data flows, and software ecosystems.

A Quick Verification: ChatGPT

Is, following my approach, ChatGPT an AI system? Seems like a trick question?
ChatGPT as you see it when going to chatgpt.com is, just like my example above, a chat interface that generates predictions, recommendations and other content, but it does not influence any physical or virtual environments. Thus, it is not an AI system.
Uproar, complaints, pure chaos -> Read on!

But…

The chat on chatgpt.com is not the only way of interfacing with the product. When utilizing ChatGPT via an API, it can trivially be used to turn “Create a new file called cookies.md in /home/sec-bits/” into a shell command, which can then be executed via an existing interface. Thus, ChatGPT may very well be the core part of an AI system as defined by the EU AI Act, as interpreted by myself. But still, the described product utilizing ChatGPT and providing a connection to a system to be able to execute commands would then be the AI system, not ChatGPT itself!

And…

As the ChatGPT chat / webpage is still based on ChatGPT as an LLM, which is a General Purpose AI Model, the underlying component still falls under the regulations of the EU AI Act.
Thus it is vital to distinguish between:

  • ChatGPT: The website with chat, not an AI system, not an AI model and thus not regulated by the EU AI Act
  • ChatGPT: The LLM, not an AI system, but an AI model, which is regulated by the EU AI Act
  • ChatGPT: As integrated in many other products, which are then themselves AI systems and regulated via the EU AI Act

Infer

The term “infer” seems to be the key aspect of the AI system definition.

Collins Dictionary says:

infer in British English

verb
Word forms: -fers, -ferring, -ferred (when tr, may take a clause as object)
1. to conclude (a state of affairs, supposition, etc) by reasoning from evidence; deduce
2. (transitive) to have or lead to as a necessary or logical consequence; indicate
3. (transitive) to hint or imply

and

infer in American English

verb transitive
Word forms: inˈferred, inˈferring
Origin: L inferre, to bring or carry in, infer < in-, in + ferre, to carry, bear
1. obsolete to bring on or about; cause; induce
2. to conclude or decide from something known or assumed; derive by reasoning; draw as a conclusion
3. to indicate indirectly; imply [in this sense, still regarded as a loose usage by many]

Cambridge Dictionary says:

infer
verb [ T ]
UK
to form an opinion or guess that something is true because of the information that you have:
infer something from something What do you infer from her refusal?
[ + that ] I inferred from her expression that she wanted to leave.

and

infer
verb [ T ]
US
to reach an opinion from available information or facts: 
[ + that clause ] He inferred that she was not interested in a relationship from what she said in her letter.

At this point, yet again, I do not want to drift into philosophy or talk about beliefs.
From what I understand, “LLM inference” is simply a term that at some point was defined to be the one to use.
I myself am having a hard time accepting “inferring” as the correct term and would rather treat it as marketing or a buzzword, but even those tend to stick nowadays. In return, I have to admit that an LLM is far more than a lookup table: while the results may be static, the process of getting there is highly dynamic, and we do need a proper term describing what an LLM does. I just believe that LLM inference isn’t the same as the inference we expect from humans . . .

And now?

I have a headache! :)

One Big Question

I give a not-AI system a text file and ask it to, for example, replace the word “cyber” with “clown”. Was this an act of “influencing a virtual environment”? I don’t think so, but I might change my mind. And this might completely break my current interpretation…

Thanks for Reading!

I hope you have everything you need to rate a system as an AI system, or not. My recommendation would be to slightly ignore the autonomy factor and mainly stress the influencing aspect.
Should you have any thoughts on this, feel free to drop me a message!

Further Reading and Watching

Just a few things I enjoyed on my path here