6 Limitations
Large Language Models have limitations that depend on usage and context. Some of these limitations are due to technological barriers, others to their design.
6.1 Context windows
By design, most LLMs have a user interface that resembles a chatbot. This design fosters conversation-like interactions rather than just simple query/response exchanges.
One of the most important advances in LLMs is due to the so-called “attention mechanism.” This neural network architecture allows the model to pay attention not only to the most recent words, but to all the text from the beginning.
Consider how predictive text on cellphones seems to offer suggestions based only on the last one or two words typed. This is not ideal, since the model has no memory of anything said earlier in the conversation. To address this, the attention mechanism incorporates a dynamic encoding of words that captures the evolution of the text.
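To make the contrast concrete, the sketch below (Python with NumPy; a toy illustration, not any production architecture) places a phone-style bigram lookup, which only sees the last word, next to a single scaled dot-product attention step, which weighs every earlier token:

```python
import numpy as np

# Invented lookup table for the phone-style predictor.
BIGRAM = {"thank": "you", "good": "morning"}

def bigram_next_word(history):
    """Phone-style prediction: only the last word typed matters."""
    return BIGRAM.get(history[-1], "?")

def attention(query, keys, values):
    """Scaled dot-product attention: the current position can weigh
    every earlier token, not just the most recent one."""
    scores = keys @ query / np.sqrt(len(query))      # similarity to each past token
    weights = np.exp(scores) / np.exp(scores).sum()  # softmax over the whole history
    return weights @ values                          # weighted summary of all context

print(bigram_next_word(["see", "you", "tomorrow", "thank"]))  # -> "you"
context = np.random.rand(10, 4)                  # 10 past tokens, 4-dim encodings
print(attention(context[-1], context, context))  # blends all 10, not just the last
```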
In practice, this wider attention span consumes resources, so it is usually limited to a particular number of context words, the context window. The context window can be thought of as the background information passed to the model so that it produces outputs specific to our conversation.
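As a rough illustration of this budget, a tokenizer can count how much of the window a piece of text will occupy. The sketch below uses the open-source tiktoken library; the encoding name and the 8,000-token limit are illustrative assumptions, since both vary by model:

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")       # one common encoding; models differ
background = "A lengthy reference document pasted into the chat..."
prompt = "Summarize the document above in two sentences."
used = len(enc.encode(background + prompt))
print(used, "tokens used")                       # everything shares the same window
print(8000 - used, "tokens left")                # 8000 is an illustrative limit
```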
The context window is also used to fine-tune the model’s behavior, that is, to tell the model the specific expertise, language, or persona it should assume.
LLM companies usually offer different tiers of models that allow users, among other things, to select bigger context windows. This becomes relevant when conversations grow long or when reference documents are lengthy.
LLM chatbots usually append previous outputs to the input of each new prompt. In this way, earlier information and context are passed along with new queries, and the chatbots are able to remember what has happened in the conversation.
Similarly, reference documents are passed to the LLM as part of the context; hence the limit on the number of active references that a conversation can handle.
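The following sketch shows this history-threading pattern; `call_model` and `count_tokens` are hypothetical stand-ins for a real API client and tokenizer, and the message format merely mirrors common chat APIs:

```python
MAX_CONTEXT = 8000                                   # illustrative window size

def count_tokens(messages):                          # crude stand-in for a tokenizer
    return sum(len(m["content"].split()) for m in messages)

def call_model(messages):                            # stand-in for a real LLM API call
    return f"(reply produced from {len(messages)} messages of context)"

history = [{"role": "system", "content": "You are a helpful tutor."}]

def ask(user_text):
    history.append({"role": "user", "content": user_text})
    # Once the transcript outgrows the window, drop the oldest exchanges
    # (keeping the system message); this is why long chats "forget".
    while count_tokens(history) > MAX_CONTEXT and len(history) > 2:
        del history[1]
    reply = call_model(history)                      # the full transcript is the prompt
    history.append({"role": "assistant", "content": reply})
    return reply

print(ask("What is a context window?"))
```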
Context windows can be very small in some of the free tiers of commercial LLMs. This is an important aspect to keep in mind, since paid users might have an advantage over free users in the type of tasks that they can effectively complete.
6.2 Data bias
Large Language Models are trained on vast amounts of data. This means that the training data often contains differing perspectives or views on the same subject. Different training algorithms face this challenge in various ways, either by ranking the information or by providing some sort of averaged description.
This limitation shapes the type of responses LLMs are able to provide: they usually give answers that, in some sense, average the information, and are less sensitive to outliers or less common sources. More specialized topics or perspectives can be overlooked.
6.3 Complex tasks
There has been great progress in the so-called reasoning features of LLMs. These usually involve a combination of internal back-and-forth interactions within the LLM, together with explicit planning and step-by-step strategies about the model’s course of action. This is particularly useful for minimizing hallucinations; however, when tasks are very complex and/or involve multiple steps, LLM performance tends to degrade. Currently, this threshold is most noticeable for tasks that require an hour or more of processing.
6.4 Computations
LLMs are probabilistic language models. As such, they are not intrinsically suited to computation. As mentioned before, the way an LLM understands the prompt 7x8= is by finding the most probable characters (tokens) to follow the sequence of characters (tokens) “7”, “x”, “8”, “=”. It might correctly predict that the next character (token) is “5” and the one after it “6”, but it might just as well predict that “4” follows “=” and that “2” comes after that.
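The toy simulation below makes this concrete. The probabilities are invented for illustration and do not come from any real model; the point is only that digits are sampled one token at a time rather than computed:

```python
import random

# Invented next-token probabilities for the prompt "7x8=".
first_digit = {"5": 0.86, "4": 0.09, "6": 0.05}
second_digit = {"5": "6", "4": "2", "6": "3"}    # a likely follow-up for each start

def sample(dist):
    tokens = list(dist)
    return random.choices(tokens, weights=[dist[t] for t in tokens])[0]

d = sample(first_digit)
print("7x8=" + d + second_digit[d])              # usually 56, occasionally 42
```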
Recently, many companies have found a way around this limitation. In the background, when a computation is prompted, the LLM is asked to generate a script (often in Python) that evaluates the expression. This improves performance on computational tasks. Keep in mind, however, that the LLM can hallucinate in the code as well. In the end, if you want to perform computations, just use a calculator.
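A minimal sketch of this “write code, then run it” pattern is shown below; `ask_llm` is a hypothetical stand-in for a real model call, and real systems would sandbox and validate the generated script before trusting it:

```python
import subprocess, sys

def ask_llm(prompt):                   # hypothetical stand-in for a real model call
    return "print(7*8)"                # the model's generated script (it could be wrong)

def compute(expression):
    code = ask_llm(f"Write a Python one-liner that prints {expression}.")
    result = subprocess.run([sys.executable, "-c", code],
                            capture_output=True, text=True, timeout=5)
    return result.stdout.strip()

print(compute("7x8"))                  # -> "56" when the generated code is correct
```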
6.5 Fast-forward v. black-box
Large Language Models are very useful for fast-forwarding tasks. Even at the corporate level, users report that LLMs can help them accomplish tasks in only 20% of the normal time. This approach enhances users’ ability to do what they already know, but faster.
On the other hand, LLMs also enable a black-box approach, in which users have no prior knowledge of a field or task. Here, the models replace, or outsource, the human intervention.
These two approaches can each be useful or dangerous for teaching and learning purposes, depending on the specific goals, usage, and context. It is therefore important to fully understand in which sense instructors and students are using LLMs: to fast-forward a task, or as a black box.
6.6 Vibe-learning
Beyond the black-box approach is what I call vibe-learning. In early 2025 the term vibe-coding was coined to describe the way many developers and computer scientists were using LLMs: “focusing on the goal of what needs to be accomplished and forgetting about the syntax of coding.” In this sense, vibe-coding lets developers set aside the small details and focus on the big picture of the projects they are pursuing.
In general, LLMs can be used with this vibe-ing approach. A word of caution is relevant in the teaching and learning setting, though: one important difference between novices and experts in a field is attention to detail. Experts tend to think more about the big picture and overarching themes, while novices focus on small details. This attention to detail is fundamental to the learning process in any field. The overuse, or misuse, of LLMs in teaching and learning environments can hinder students’ ability to effectively learn concepts by skipping or reducing their attention to detail.
Since LLMs can output products regardless of the user’s expertise or knowledge of a subject area, this usage can falsely lead users to feel that they are engaging in learning. If there is anything more dangerous than ignorance, it is the illusion of knowledge.