Generative AI (GenAI) is changing how businesses build and deliver software products. What started with chatbots and simple automation tools is evolving into something far more powerful: AI systems integrated deeply into software architectures, influencing everything from backend processes to user interfaces. This article explores the current state of GenAI adoption in businesses, highlighting the shift from visible, standalone tools to invisible engines driving core features behind the scenes. For software architects, developers, product designers, and anyone involved in building modern software, understanding this shift is crucial. We’ll examine the challenges of working with AI’s non-deterministic, black-box nature, give examples of how LLMs are replacing complex, custom ML models, and show how AI-powered features will reshape business software. By the end, you’ll have a clearer picture of how to integrate AI into your software stack and harness its potential.
The Chatbot Frenzy: A Short-Term Trend
Right now, businesses are focused on building chatbots and custom GPTs to tackle a variety of problems. These AI-driven tools excel in two areas: making internal knowledge accessible and automating customer support. Chatbots are used to build answer systems that help employees quickly retrieve information from vast internal knowledge bases, breaking down silos and improving efficiency. Whether it’s policy documents, technical guides, or onboarding materials, chatbots allow employees to access critical knowledge on demand without sifting through complex databases or relying on keyword-based search engines that often lack context.
Chatbots are invaluable in automating customer support. By handling common queries, troubleshooting issues, and offering real-time assistance, they help companies manage high volumes of requests without overburdening human teams. This reduces operational costs and improves customer satisfaction by providing instant responses to routine problems.
While these tools are useful, the rapid proliferation of chatbots is a short-term trend that is expected to taper off. Many businesses are jumping into chatbot implementation without considering long-term goals or scalability. For most, chatbots serve a tactical function, addressing immediate needs rather than driving strategic shifts in AI integration across business systems. This rush to deploy chatbot solutions could lead to market saturation, where AI-powered assistants become commonplace but offer diminishing returns due to a lack of innovation or differentiation.
We’re witnessing the first wave of GenAI adoption. While chatbots are useful, they represent only the visible surface of this technology. As companies gain experience, they’ll see that the true value lies in integrating AI beyond visible tools. The future will involve deeper AI-driven features woven into software products without being obvious to end users.
GenAI as an Omnipresent Technology
In the coming years, we’ll see a shift from AI being an explicit, opaque, user-facing tool to being seamlessly integrated into the “feature fabric.” GenAI will enable features like dynamic content generation, intelligent decision-making, and real-time personalization without direct user interaction. This will impact UI design and change how users interact with software.
We’ll see clusters of UI controls like text fields, checkboxes, and sliders giving way to more flexible, AI-powered text areas where users can describe their needs in natural language instead of manually inputting specific parameters. This shift will allow users to interact with applications more fluidly, letting AI translate complex instructions into actions. A prime example is already visible in tools like Photoshop, where the “generative fill” feature no longer requires users to tweak multiple controls. Instead, users describe what they want to fill in a selected area, and AI takes care of the rest. This trend toward natural language input is set to expand across various software, making user interactions more intuitive and less constrained by traditional UI elements.
In terms of feature archetypes, the challenge lies not in a scarcity of use cases but in their abundance. Identifying and prioritizing the most promising opportunities becomes the primary difficulty. Applications could feature AI-driven ideation tools, auto-generated summaries for vast data sets, personalized onboarding processes, or real-time design suggestions within creative applications. Features like intelligent content moderation, sentiment analysis, or dynamic pricing models could all run silently in the background, optimizing business operations without user intervention.
The Commodity Effect of Large Language Models (LLMs) vs. Custom Machine Learning (ML) Models
One of the most remarkable shifts GenAI has introduced to the industry is the commoditization of AI capabilities, driven by the rise of Large Language Models (LLMs). Before LLMs and diffusion models, companies had to invest significant time, effort, and resources into building custom Machine Learning (ML) models tailored to specific problems. These custom ML models required specialized teams to gather domain-specific data, curate features, label data, and develop narrowly focused algorithms. The process was not only expensive but also time-consuming, with long development cycles that limited how fast businesses could innovate.
Large Language Models (LLMs) change the way businesses approach problem-solving. Andrej Karpathy makes the point that we should call them what they are: autoregressive models. These versatile tools can be enhanced with private data using architectures like Retrieval-Augmented Generation (RAG), making their wide-ranging skills available to businesses. In many cases, this eliminates the specialized teams, extensive data labeling, and complex machine learning pipelines previously required for problems that were challenging to solve algorithmically. The vast pre-trained knowledge of LLMs allows them to effectively process and make sense of even unstructured data.
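To make this concrete, here is a minimal sketch of the RAG pattern, assuming an OpenAI-style Python SDK for embeddings and chat completions. The toy in-memory document store, model names, and prompts are illustrative placeholders, not a production setup.

```python
import numpy as np
from openai import OpenAI  # assumes the OpenAI Python SDK is installed

client = OpenAI()

# A toy in-memory "knowledge base" of private company documents.
documents = [
    "Travel expenses must be submitted within 30 days of the trip.",
    "Receipts over 50 EUR require a manager's approval.",
    "The per-diem rate for Berlin is 28 EUR per day.",
]

def embed(texts):
    """Embed a list of texts with a hosted embedding model."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([item.embedding for item in resp.data])

doc_vectors = embed(documents)

def answer(question, top_k=2):
    # 1. Retrieve: find the stored documents most similar to the question.
    #    (These embeddings are unit-normalized, so a dot product is cosine similarity.)
    q_vec = embed([question])[0]
    scores = doc_vectors @ q_vec
    context = "\n".join(documents[i] for i in np.argsort(scores)[::-1][:top_k])

    # 2. Generate: inject the retrieved context into the prompt.
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": f"Answer using only this context:\n{context}"},
            {"role": "user", "content": question},
        ],
    )
    return resp.choices[0].message.content

print(answer("How quickly do I need to hand in my travel receipts?"))
```

The same pattern scales from this toy list to a proper vector database; the retrieval step changes, the generation step does not.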
A key aspect of this commoditization is the availability of LLMs through easy-to-use APIs. Almost every developer today knows how to work with API-based services, which makes integrating these models into existing software ecosystems seamless. Businesses can choose to consume LLMs as a service via APIs, taking advantage of powerful models without needing to manage any of the underlying infrastructure. Alternatively, for companies with specific security or privacy concerns, it’s possible to run these models in different sizes on-premises.
To illustrate, imagine a travel expense tracking and reporting app. Historically, such an app might have relied on a custom-trained ML model to categorize uploaded receipts into accounting categories. This setup required dedicated infrastructure and, ideally, an entire MLOps pipeline to manage data collection, training, and model updates. Today, this custom model could easily be replaced with an LLM, leveraging its general knowledge to handle receipt categorization. What’s more, the multimodal capabilities of LLMs mean that additional services, such as Optical Character Recognition (OCR), could also be eliminated. Whereas previously businesses would have needed to integrate external OCR libraries to process images of receipts, the LLM can now handle both text recognition and categorization in a single system, greatly simplifying the tech stack. Need the LLM to also extract data like net and gross prices or tax rates? Just ask for it in the prompt.
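As a sketch of what such a replacement might look like, assuming an OpenAI-style multimodal chat API: the accounting categories, JSON field names, and model name below are illustrative assumptions, not a fixed schema.

```python
import base64
import json
from openai import OpenAI

client = OpenAI()

def categorize_receipt(image_path):
    """Send a receipt photo to a multimodal LLM and get structured accounting data back."""
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()

    resp = client.chat.completions.create(
        model="gpt-4o",
        response_format={"type": "json_object"},  # ask for machine-readable output
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": (
                    "Extract from this receipt as JSON: "
                    "category (one of: lodging, meals, transport, other), "
                    "vendor, date, net_price, gross_price, tax_rate."
                )},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    )
    return json.loads(resp.choices[0].message.content)

print(categorize_receipt("hotel_receipt.jpg"))
```

One API call covers what previously took an OCR library, a custom classifier, and the glue code between them.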
This “world model” approach, where LLMs bring a broad understanding of context and language, has made it cheap and accessible to solve problems that previously required custom ML solutions. The impact of this commoditization is immense. It allows companies to move faster, iterate more frequently, and significantly reduce development costs. Problems that once required deep expertise in ML can now be tackled with pre-trained models that “understand the world,” freeing up teams to focus on more strategic and creative applications of AI. LLMs can now handle tasks that were once bottlenecks, enabling businesses to scale AI-driven solutions without needing specialized infrastructure.
AI-Powered Features That Weren’t Possible Before
GenAI is enabling heaps of features that were previously too complex, too expensive, or entirely out of reach for most businesses: features that would have required investments in custom machine learning models or complex algorithms. Let’s take a look at some examples.
Vibe-Based Search: Beyond Keywords
One of the most transformative applications of GenAI in business software is the advent of “vibe-based search.” This concept represents a significant leap forward from traditional keyword-based search systems that have dominated information retrieval for decades.
Keyword-based search, exemplified by Google and other search engines, relies on matching specific words or phrases in a query to indexed content. While effective for many purposes, this approach often falls short when users have complex, nuanced needs that are difficult to express in a few keywords.
Vibe-based search, powered by Large Language Models (LLMs), allows users to express their intent in natural language, capturing not just specific terms but the overall “vibe” or context of their request. Consider the difference between these two search queries:
Traditional keyword search: “best restaurants berlin”
Vibe-based search: “I’m a picky connoisseur who loves wine bars that also serve food, preferably locally sourced. Recommend me restaurants in Berlin Mitte and Kreuzberg. Avoid dogmatic Natural Wine Bars.”
In the vibe-based search scenario, the LLM can understand and process:
- The user’s self-description as a “picky connoisseur”
- Their preference for wine bars that serve food
- The desire for locally sourced ingredients
- Specific neighborhood preferences (Mitte and Kreuzberg)
- The distinction between regular wine bars and “dogmatic Natural Wine Bars”
This level of nuance and context-awareness allows the search feature to provide highly personalized and relevant results that truly match the user’s intent, rather than just matching keywords.
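One plausible implementation, sketched below under the assumption of an OpenAI-style chat API with JSON output: let the LLM translate the free-form query into structured filters that feed an existing search index. The schema fields are hypothetical.

```python
import json
from openai import OpenAI

client = OpenAI()

SCHEMA_HINT = (
    "Translate the user's restaurant search request into JSON with the fields: "
    "city, neighborhoods (list), venue_type, must_have (list), avoid (list)."
)

def parse_vibe_query(query):
    """Turn a natural-language 'vibe' into structured filters for a search index."""
    resp = client.chat.completions.create(
        model="gpt-4o",
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": SCHEMA_HINT},
            {"role": "user", "content": query},
        ],
    )
    return json.loads(resp.choices[0].message.content)

filters = parse_vibe_query(
    "I'm a picky connoisseur who loves wine bars that also serve food, "
    "preferably locally sourced. Recommend me restaurants in Berlin Mitte "
    "and Kreuzberg. Avoid dogmatic Natural Wine Bars."
)
# filters might look like:
# {"city": "Berlin", "neighborhoods": ["Mitte", "Kreuzberg"],
#  "venue_type": "wine bar", "must_have": ["food", "locally sourced"],
#  "avoid": ["dogmatic natural wine bars"]}
```

Alternatively, the query and the indexed content can both be embedded and matched by semantic similarity; the two approaches combine well.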
For businesses, implementing vibe-based search can dramatically improve user experience across various applications:
Internal knowledge bases: Employees can find information using natural language queries that describe their specific situation or need.
E-commerce platforms: Customers can describe products in their own words, even if they don’t know the exact terminology.
Customer support systems: Users can describe issues in detail, allowing the system to provide more accurate solutions or route to the appropriate support agent.
Content management systems: Content creators can search for assets or information using descriptive language rather than relying on elaborate tagging or metadata. That’s what Apple and Google do with their respective photo apps on their phone OSes: “photos of my dog playing with that red ball on a beach in Italy.”
Vibe-based search represents a shift from “what you say” to “what you mean,” allowing software to understand and respond to user intent in ways that were previously impossible. As LLMs continue to evolve, we can expect vibe-based search to become an integral part of how users interact with information systems across all types of business software.
Making Sense of Data and Content, with an API Call
Sentiment Analysis
Consider a practical example: imagine an internal system where employees make status posts about their work or general experiences. An executive might want to assess the overall mood in the team during a particular week. In the past, adding sentiment analysis to these posts with a custom ML model would have been a significant challenge, requiring specialized training and careful integration into the existing infrastructure. With LLMs, this complexity is reduced to a simple API call. The result doesn’t even need to be in human-readable language; it could be output as structured JSON data that the system processes to render suitable icons or graphs. Alternatively, the LLM could output something as simple as emojis to represent the sentiments — making it easier for executives to quickly gauge employee morale. Of course, you’d implement such a feature only if your employees gave their consent.
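A minimal sketch of this “sentiment analysis as an API call” idea, assuming an OpenAI-style chat API; the JSON shape and emoji output are illustrative choices, not a standard.

```python
import json
from openai import OpenAI

client = OpenAI()

def team_mood(posts):
    """Classify a week's worth of status posts and return structured sentiment data."""
    resp = client.chat.completions.create(
        model="gpt-4o",
        response_format={"type": "json_object"},
        messages=[{
            "role": "user",
            "content": (
                "For each post below, return JSON of the shape "
                '{"results": [{"sentiment": "positive|neutral|negative", "emoji": "..."}], '
                '"overall_emoji": "..."}\n\n' + "\n".join(f"- {p}" for p in posts)
            ),
        }],
    )
    return json.loads(resp.choices[0].message.content)

mood = team_mood([
    "Shipped the reporting feature a day early!",
    "Spent all week fighting flaky CI. Exhausting.",
])
print(mood["overall_emoji"])  # feed the structured result into icons or graphs instead
```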
Get Insights from Complex Data
Let’s explore another example that illustrates the power of LLMs in making sense of complex data: refrigeration alarm management systems. Traditionally, these systems have focused on providing:
- A graphical alarm dashboard displaying real-time data and alerts
- Complex, filterable tabular representations of time-series data
While these features are undoubtedly useful, they often require significant human interpretation to extract meaningful insights. This is where LLMs can enhance the system’s capabilities, transforming raw data into actionable intelligence on a zero-shot basis ([1], [2]), without custom and expensive Machine Learning models.
With LLM integration, the alarm management system can now offer:
- Automated Report Generation: LLMs can analyze time-series data and generate detailed reports in natural language. These reports could highlight trends, anomalies, and key performance indicators valuable to both technicians and managers. Imagine a Monday morning report that summarizes the past week’s alarms, identifies recurring issues, and suggests areas for improvement.
- In-depth Analysis: Going beyond simple data representation, LLMs can recognize and explain complex patterns in the data. For instance, they could identify sequences of alarms that indicate larger system issues, providing insights that might be missed in a traditional tabular view.
- Predictive Insights: By analyzing historical data, LLMs can make predictions about future system states. This enables proactive maintenance, helping to prevent potential failures before they occur. The system could, for example, alert managers to a pattern of minor temperature fluctuations that, based on historical data, often precede a major system failure.
- Structured Outputs: In addition to natural language reports, LLMs can output structured data (e.g., JSON). This allows for the creation of dynamic, graphical user interfaces that visually represent complex information. A single API call could generate both a human-readable analysis and the data needed to populate an interactive dashboard.
- Natural Language Queries: Technicians could ask the system questions in natural language, such as “Which devices will probably go into failover mode in the coming weeks?” and receive immediate, relevant answers and visualizations. This dramatically lowers the barrier to data access and interpretation. Why not offer technicians this feature via a real-time voice mode on their handheld devices? We’ve seen what immense power ChatGPT’s simple voice mode offers, let alone its advanced version that’s being rolled out at the time of writing.
The key advantage here lies in flexibility and accessibility. With a single LLM and appropriate prompts, various analyses and output formats can be generated without the need to train separate ML models for each new requirement. This makes the system valuable for both technically proficient users who need deep dives into the data, and management personnel who require high-level overviews and trend analyses.
Consider how this might work in practice: A facility manager notices an uptick in alarms on the traditional dashboard. Instead of poring over tables of data, they could simply ask the LLM-enhanced system, “What’s causing the increase in alarms this week?” The system would then analyze the data, consider historical patterns, and quickly provide a concise explanation along with recommended actions.
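Under the assumption of an OpenAI-style chat API with JSON output, that interaction could be wired up roughly like this; the alarm record format and response keys are invented for illustration.

```python
import json
from openai import OpenAI

client = OpenAI()

def explain_alarms(question, alarm_records):
    """Answer a facility manager's natural-language question about alarm data."""
    resp = client.chat.completions.create(
        model="gpt-4o",
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": (
                "You analyze refrigeration alarm logs. Answer as JSON with the keys "
                "'explanation' (plain language) and 'recommended_actions' (a list)."
            )},
            {"role": "user", "content": (
                f"{question}\n\nAlarm log (JSON):\n{json.dumps(alarm_records)}"
            )},
        ],
    )
    return json.loads(resp.choices[0].message.content)

records = [
    {"device": "freezer-07", "alarm": "temp_high", "ts": "2024-09-30T02:14:00"},
    {"device": "freezer-07", "alarm": "temp_high", "ts": "2024-10-01T02:20:00"},
    {"device": "cooler-03", "alarm": "door_open", "ts": "2024-10-01T09:05:00"},
]
answer = explain_alarms("What's causing the increase in alarms this week?", records)
print(answer["explanation"])
```

The same call returns machine-readable keys, so the dashboard can render the recommended actions as UI elements rather than a wall of text.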
This application of LLMs doesn’t just add a layer of convenience; it fundamentally changes how organizations can interact with and derive value from their data. By making complex analyses accessible through simple API calls, we’re democratizing data science and enabling more informed, quick decision-making across all levels of an organization.
Multimodality: The Writing, Speaking, Seeing, Hearing Blackbox
Multimodality is playing a crucial role in expanding what’s possible. Models capable of processing text, images, sound, and voice enable complex feature combinations that were once handled by multiple systems and many human-in-the-middle interactions, or that simply weren’t possible. An example would be an app that helps users generate complex visual content simply by describing what they need in natural language, as modern design tools already offer. These tools integrate features such as image generation, text analysis, and predictive modeling into a single workflow, reducing the cognitive load on users.
Additionally, the ease with which AI can now handle tasks like real-time talking with humans (and other AIs), real-time language translation, voice-based commands, and interactive data analysis is reshaping how users engage with software. Features like sentiment analysis, anomaly detection, and predictive maintenance – previously reserved for high-end, specialized applications – are becoming increasingly available to everyday business software. This means products can offer richer, more interactive features without burdening developers with building custom systems from scratch.
We learned about vibe-based search earlier on. Why not let users add photos, images, or their own voice recordings to a search? We are leaving the age of typing out every single thing we want software to do for us.
These new AI-powered capabilities aren’t just improving existing features but enabling businesses to create solutions that were once difficult or expensive to implement. GenAI is simplifying complex tasks, making them more accessible across various industries and applications.
Technical Limitations and Workarounds
LLMs do come with certain technical limitations. One of the most significant constraints is the context window — the amount of text an LLM can process in a single operation. This limitation can impact the model’s ability to handle large documents or maintain context over extended interactions.
The Context Window Challenge
Most LLMs have a finite context window. For instance, GPT-4o has a context window of 128,000 tokens, while Gemini 1.5 Pro can handle up to 2,000,000 tokens. While this may seem substantial, it can quickly become a bottleneck when dealing with large documents or videos, or with complex tasks that require referencing a vast amount of information.
| Model | Input Context Window Size |
|---|---|
| GPT-4o | 128,000 tokens |
| Claude 3.5 Sonnet | 200,000 tokens |
| Gemini 1.5 Pro | 2,000,000 tokens |
| Llama 3.2 | 128,000 tokens |
| Mistral Large 2 | 128,000 tokens |
Consider a financial software application that needs to analyze heaps of lengthy insurance contracts, or a tool that processes street traffic video streams. In such cases, the context window limitation could prevent the LLM from considering the entire input at once, potentially leading to incomplete or inaccurate analyses.
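Before shipping such inputs to a model, it is worth checking whether they fit at all. A small sketch using the tiktoken library, assuming the o200k_base encoding that GPT-4o uses; other models need their own tokenizers, and the reserved output budget is an arbitrary choice.

```python
import tiktoken

# o200k_base is the tokenizer GPT-4o uses; swap in the encoding for your model.
enc = tiktoken.get_encoding("o200k_base")

def fits_in_context(text, context_window=128_000, reserved_for_output=4_000):
    """Count tokens and check whether the text fits, leaving room for the reply."""
    n_tokens = len(enc.encode(text))
    return n_tokens, n_tokens <= context_window - reserved_for_output

with open("insurance_contract.txt") as f:
    tokens, ok = fits_in_context(f.read())
print(f"{tokens} tokens; fits: {ok}")
```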
Workarounds
Fortunately, there are several strategies available to work around these limitations:
- Chunking and Summarization: This approach involves breaking large documents into smaller, manageable chunks that fit within the context window. Each chunk is processed separately, and the results are then combined. For even larger input, a hierarchical approach can be employed: chunks are summarized, and these summaries are then processed together to create a higher-level understanding (see the sketch after this list).
- Retrieval-Augmented Generation (RAG): RAG combines the “world knowledge” of LLMs with private knowledge bases. Instead of relying solely on the model’s internal knowledge, relevant information is retrieved from a separate database and injected into the prompt. This allows the model to access a much larger pool of information without exceeding its context window.
- Domain Adaptation: While fine-tuning models on specific domains can create versions that handle industry-specific concepts more efficiently, it’s important to approach this with caution. Fine-tuning can sometimes lead to unexpected side effects, particularly in the model’s general language use. A more nuanced approach involves careful prompt engineering and the use of domain-specific knowledge bases in conjunction with general-purpose models. This allows for domain expertise without risking the model’s broader capabilities.
- Sliding Window Technique: For tasks that require analyzing long sequences of text, such as in time-series data or long documents, a sliding window approach can be used. The model processes overlapping segments of the text, maintaining some context while moving through the entire document.
- Multi-step Reasoning: Complex problems can be broken down into a series of smaller steps. Each step uses the LLM within its context window limitations, with the results of previous steps informing subsequent ones. This approach mimics human problem-solving and can handle tasks that would otherwise exceed the context window.
- Hybrid Approaches: Combining LLMs with traditional algorithms can create powerful systems that leverage the strengths of both. For instance, using classic information retrieval techniques like TF-IDF and BM25 to select relevant passages before applying LLM analysis can significantly reduce the amount of text that needs to be processed at once.
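As a concrete illustration of the first workaround above, here is a minimal map-reduce style chunking and summarization sketch, assuming an OpenAI-style chat API. The character-based splitting, chunk size, and prompts are deliberate simplifications; a real system would split on document structure and token counts.

```python
from openai import OpenAI

client = OpenAI()

def ask(prompt):
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def summarize_large_document(text, chunk_chars=8_000):
    """Map-reduce summarization: summarize each chunk, then summarize the summaries."""
    # Map: split the document into chunks that comfortably fit the context window.
    chunks = [text[i:i + chunk_chars] for i in range(0, len(text), chunk_chars)]
    partial = [ask(f"Summarize this document excerpt:\n\n{c}") for c in chunks]

    # Reduce: combine the partial summaries into one higher-level summary.
    joined = "\n\n".join(partial)
    return ask(f"Combine these partial summaries into one coherent summary:\n\n{joined}")

with open("insurance_contract.txt") as f:
    print(summarize_large_document(f.read()))
```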
As this technology continues to evolve, we can expect to see improvements in context window sizes and more efficient processing techniques. The key for businesses is to understand the existing limitations and strategically design their AI implementations to maximize the strengths of LLMs while mitigating their constraints. As with any technology, the most successful applications will come from those who deeply understand both the capabilities and limitations of the tools at their disposal.
Looking Forward: GenAI as a Standard Component in Business Software
As businesses continue to adopt GenAI, it’s clear that AI-driven features will become a standard part of the software landscape. What is now seen as cutting-edge – LLMs, diffusion models, and multimodal systems – will soon be as commonplace as traditional software frameworks, embedded deeply into everyday enterprise applications.
GenAI will no longer just enable specialized features or features-as-products; it will be a foundational component that powers a wide range of functionalities. For businesses, this means adopting AI as part of the standard software development stack and treating it as an integral resource in creating new features and improving existing ones. Future-proofing software development involves not just adding AI-driven tools but preparing infrastructure, design patterns, and operations to accommodate AI’s growing influence.
With this shift, the role of software architects, developers, and product designers will continue to evolve. They’ll need to adopt new skills and strategies for managing AI components, handling non-deterministic outputs, and ensuring seamless integration across different business systems. Collaborating across roles – technical and non-technical – will be more important than ever to create AI systems that are powerful, reliable, and ethically sound.
Businesses that embrace this shift early, learning to work effectively with AI at every layer of the product, will be the ones driving innovation. As GenAI becomes part of the standard feature fabric in software, it will drive forward both business productivity and user experiences in ways that we are only beginning to imagine.
Feedback and discussion
This article is also up on LinkedIn. I’d love to hear your thoughts in the comments there.
Acknowledgments
Special thanks to Philipp Schirmacher and Marco Steinke for their valuable feedback on earlier drafts of this article.