Many people have focused on the sophistication of ChatGPT’s text-processing and coding capabilities, but few have noted the chatbot’s ability to process images.
Similarly, some have spent time playing about with image-generating tools like Midjourney, but I believe text-to-image software could do far more than entertain: it could transform the way we engage with technology, particularly when it comes to investment. Recent leaps forward in generative technology have dramatically expanded the possibilities of artificial intelligence, and I believe many of the most exciting applications of the technology are in the visual realm.
Already, some at the technological cutting edge of the finance industry have experimented with investment uses of ChatGPT’s sister service, DALL-E (an image generator that uses the same kind of natural language processing as ChatGPT). Effectively, OpenAI drew on the vast trove of data contained within images on the internet to create a visual version of ChatGPT.
Now, in GPT-4, OpenAI has merged the two services and expanded the accuracy and range of the bot’s image-generating ability. Finance may not feel like the most obvious application of these tools, but we should recognize the importance of approaching early-stage technology like this in a spirit of radical openness.
Human cognitive ability is primarily visual. We respond to images in a way that is more immediate and powerful than our response to other sensory stimuli, or, indeed, to text. First with DALL-E and now with GPT-4, it is possible to distill and streamline complex data into a single image, dense with meaning. Investing is often about animal spirits, about the way that the crowd responds to an event. What if you were able to feed news flow or other information into an image generator and then analyze that image to ascertain its mood? It could be something as seemingly abstract as asking whether the image produced is dark or light, whether it inspires feelings of positivity or loss. But if you can refine a large and diverse amount of information into an image that is easily and immediately readable, you may have found a new way of engaging with information and data and created a new approach to identifying investment signals.
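To make the "dark or light" question concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not a description of any method actually in use: the image is assumed to arrive as rows of 8-bit RGB pixels, brightness is approximated with the standard Rec. 709 luma weights, and the function names and the 0.5 threshold are hypothetical choices made for this example.

```python
from statistics import mean

# Rec. 709 luma weights for the (R, G, B) channels; they sum to 1.
LUMA_WEIGHTS = (0.2126, 0.7152, 0.0722)


def mean_brightness(pixels):
    """Average brightness in [0, 1] for rows of 8-bit (R, G, B) tuples."""
    values = [
        sum(w * c for w, c in zip(LUMA_WEIGHTS, pixel)) / 255.0
        for row in pixels
        for pixel in row
    ]
    return mean(values)


def mood_label(pixels, threshold=0.5):
    """Hypothetical rule: 'light' at or above the threshold, otherwise 'dark'."""
    return "light" if mean_brightness(pixels) >= threshold else "dark"


# Tiny hand-made 2x2 images standing in for generated pictures (assumed data).
gloomy = [[(10, 10, 30), (0, 0, 0)], [(20, 20, 20), (5, 5, 40)]]
sunny = [[(250, 240, 200), (255, 255, 255)], [(230, 230, 180), (255, 250, 210)]]

print(mood_label(gloomy), mood_label(sunny))
```

A single brightness number is, of course, a crude proxy for mood. Whether it carries any investment signal at all is exactly the kind of question that would need to be tested rather than assumed.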
On top of that, in that image you have potentially created a valuable asset. It doesn’t require too much of a leap to imagine algorithms that produce images that effectively and reliably express the investment implications of a particular event or data set. These images could be tradable in themselves, crystallizations of valuable insights, immediately accessible windows into the deep workings of machine learning. And of course, these images would be readily comprehensible to anyone, anywhere, no matter what language they speak, no matter what level of education they have attained. You could imagine a whole new skill set developing, one that puts a premium on intuition and visualization, that rewards those best able to read the messages wrapped within these algorithm-generated images.
This idea — using the immediacy and power of an image to come to swifter, more readily comprehensible conclusions — feels like it could have a transformational effect not just on the way we understand the markets, but also on our engagement with data more broadly. Here the human brain in all its complexity is just one interpretive device employed to decode information. We are always told that we use just a fraction of our brain’s potential. Exploiting the interplay of image and text could be a way of accessing that untapped capacity, drawing on subliminal and visual elements of thought that operate beyond what we consider traditional consciousness, and a way of pairing this elevated brainpower with the vast, superhuman capabilities of the algorithms that underlie the large language models.
At the very beginning of a revolution, openness, creativity, and imagination acquire extraordinary value, first because they enable you to engage with evolving technologies in new and powerful ways, and second because this open-mindedness and inventiveness create a feedback loop to the technology itself, sculpting its development and breaking down barriers. I believe there’s no question that AI is going to revolutionize the way we engage with the world in the same way the internet did a quarter-century ago. GPT-4 and Midjourney 5.1 are merely early staging posts on the journey to true artificial intelligence. But we must all be clear that the way we use technologies like this now, the questions we ask of them and the extent to which we push their capabilities, will define the shape that future, more powerful iterations will take.
Igor Tulchinsky is founder, chairman, and CEO of WorldQuant.