The first wave of enterprise AI demand was heavily shaped by language. Buyers wanted systems that could read documents, summarize reports, answer questions, draft communications, and assist with knowledge work. These are genuinely valuable applications, and the market responded with a wide range of text-based AI products. For a period, this felt like the full scope of enterprise AI.
That scope is expanding. Enterprise buyers are beginning to ask for AI systems that can perceive and reason about more than text. They want systems that understand images, diagrams, floor plans, equipment states, spatial relationships, and physical environments. The shift is not a rejection of text-based AI. It is a recognition that most enterprise operations involve more than language: they involve physical systems, visual observations, and spatial contexts that text alone cannot fully capture.
This demand shift is driven by operational realities. A manufacturing company needs AI that can assess equipment from images, not just process maintenance reports. A logistics company needs AI that understands spatial layouts, not just shipment records. A construction firm needs AI that can reason about site conditions from photographs, not just manage project documents. As enterprises move AI from knowledge management toward operational automation, the limits of text-only systems become more visible.
The implications for AI vendors and enterprise buyers are significant. Vendors that deliver only text-based capability will find their addressable market increasingly constrained as operational AI demand grows. Buyers that evaluate AI systems only on language-task performance will miss the capabilities that matter most for their highest-value automation opportunities. The market is broadening, and the organizations that recognize this shift early will be better positioned to act on it.