
The Rise of ChatGPT: Revolutionizing AI and Reshaping the Tech Industry

Published on Nov 8, 2023
Image Credit: Andrew Neel

In April 2009, Silicon Valley's startup guru, Paul Graham, wrote an article documenting the five entrepreneurs who had the greatest influence on him. He mentioned Sam Altman, who was 24 years old at the time, alongside the founders of Apple and Google. When mentoring startup companies, Graham would ask, "What would Steve Jobs do?" when it came to design issues, but for matters of strategy or ambition, he would ask, "What would Sam do?"

At that time, Altman had founded only one company, Loopt, a social networking company that failed to establish network effects. Over the next decade, his standing in the Silicon Valley startup community grew along with Graham's incubator, Y Combinator. However, it was only with OpenAI, and specifically with the release of ChatGPT in November of last year, that the world first witnessed his ambition and the strategic ability to match it.

On November 6th, OpenAI held its first developer event. In a small venue reminiscent of early Apple product launches, Altman didn't focus on new technologies but rather provided a comprehensive summary of the developments over the past year, showcasing the technical prowess of OpenAI's large models.

Over the past year, ChatGPT evolved from a web-based application capable of processing only text into an app that can handle text, voice, and image data, attracting about 100 million users every week.

OpenAI transformed from a research institution into a super startup valued at $90 billion, attracting two million developers to build applications using its technology.

The world has also changed in the wave set off by ChatGPT. According to Collins Dictionary, use of the term "AI" quadrupled in 2023 compared with the previous year. Sequoia Capital reports that its inbox is filled with startup pitches such as "AI Salesforce," "AI Adobe," and "AI Instagram," and the entire tech industry is caught up in a frenzy of talent acquisition and GPU procurement.

After experiencing the expansion of internet companies and anti-trust concerns, governments worldwide remain vigilant during this AI wave. Over the past year, the European Union and the United States have been pushing for regulation concerning artificial intelligence at an unprecedented speed.

When ChatGPT was first released last year, it was merely a chatbot with limited information processing capabilities. It would respond to sentences composed of text or code with text and code feedback. Thanks to the massive amount of data it consumed and the extensive training conducted by university students, ChatGPT surpassed all previous chatbots. It was able to provide responses comparable to those of humans for a wide range of questions posed by users.

Although it sometimes produced nonsensical responses (referred to as "hallucinations" within the industry), its repeatedly astonishing replies showed what artificial intelligence can do as it approaches the role of a versatile general-purpose assistant.

In September of this year, working from descriptions of symptoms and examination reports, ChatGPT helped a mother identify the cause of her child's illness: tethered cord syndrome (TCS), a condition with an incidence rate of only 0.025% in newborns. This once again pushed the boundaries of people's understanding of ChatGPT's capabilities. Before receiving the answer from ChatGPT, the mother had sought medical help for her child for three years and consulted 17 different doctors, none of whom had identified the true cause of the illness.

Within a year, the large language model behind ChatGPT evolved from GPT-3.5 to GPT-4. It can now handle not only code and text but also various types of files, automatically generating charts from complex data by invoking Python code and processing a 300-page novel to answer questions about its content. Moreover, hallucinations have become less frequent.

In September, OpenAI opened GPT-4 Vision (GPT-4V) to paying users, enabling the model to process images as a form of information and provide responses to questions based on the understanding of the image's content.

In a 166-page report published in October under the title "The Dawn of LMMs" (large multimodal models), Microsoft researchers stated, "GPT-4V possesses unprecedented capabilities in handling interleaved multimodal (text and image) information and is the most powerful multimodal general artificial intelligence system today."

OpenAI began as an open research institution, and in 2020 it released detailed technical information when introducing GPT-3. After ChatGPT's success ignited the AI market, that openness about GPT-3 made it easier for the rest of the industry to catch up.

Meta's Llama 2, a model on par with GPT-3.5, was released in July of this year. Meta made it open source and allowed commercial use. Overnight, the entire industry reached a new starting point. Some companies claim to have caught up with GPT-4 in certain capabilities, citing models such as Google's PaLM 2, Claude 2 from Anthropic (OpenAI's biggest competitor), and Baidu's ERNIE BOT 4.0.

In certain specific capabilities, competitors have even surpassed OpenAI. In May of this year, Anthropic released the large model Claude-100k, expanding the length of text that a large model can handle to 100k tokens. At a time when GPT-4 could handle a maximum of 32k tokens, this opened up research directions for processing longer texts. Handling longer texts allows large models to be applied in more scenarios, such as finance and law.

However, currently, no large model can compare with GPT-4V in terms of understanding images and videos. The competitors also lack a clear implementation path because, after the success of ChatGPT, OpenAI has gradually reduced the information it discloses. GPT-4 is not open source, and even the data sources and parameter sizes are no longer made public.

In May of this year, Google announced that they had begun developing the multimodal large model Gemini. Some practitioners of large models mentioned in media interviews that YouTube possesses the largest and richest collection of images, audio, and subtitle (text) data on the Internet, making it the "trump card" for Google's development of multimodal large models. However, Google has not yet released Gemini.

Furthermore, OpenAI's actual progress may be greater than what has been publicly disclosed. According to OpenAI's technical report, GPT-4, including the GPT-4V capability released in September, completed training in 2022. Sam Altman stated at an event in early October that OpenAI had already begun training GPT-5 and GPT-6. He said the company will keep advancing toward multimodal capabilities, covering outputs as well as inputs, while improving the models' reliability and developing personalized large models.

OpenAI has been established for eight years and has experimented with at least six different technological products, ranging from robotic hands to AI game robots, in search of a breakthrough for AI popularization. After ChatGPT became popular, it became the carrier for most of OpenAI's technological products. In the past year, OpenAI has devoted its full efforts to continuously incorporating the technologies developed over the years into ChatGPT.

When it was first released, ChatGPT was only a temporary product that could be used through a web interface. Now, OpenAI has developed a mobile application with an intuitive and user-friendly interface, gradually adding features to transform it into a super app:

In May, the iOS application was launched, and plugins were added to extend ChatGPT's capabilities. For example, plugins for search engines and other tools compensate for gaps in the knowledge of ChatGPT's model, whose training data ended in September 2021.

In July, the "Code Interpreter" feature was launched, allowing paid users to analyze various complex data and generate images, among other capabilities.

In August, the "Prompt Examples" feature was introduced, along with support for uploading multiple files in a single inquiry.

In September, DALL-E 3, a text-to-image generation tool, was integrated, so users can enter text and generate images to their requirements. Listening and speaking capabilities were also added, allowing users to talk to ChatGPT directly by voice, as was the ability to understand images supplied by the user and answer questions about them. Paid users were offered the opportunity to invite new users to try GPT-4 for free.

In October, support for uploading various file formats, including PDF, was added. Different plugins can be automatically switched within a conversation to solve different problems.

In November, ChatGPT began analyzing user questions and automatically invoking the most appropriate plugins to provide answers. The model's knowledge cutoff was extended to April of this year.


These advancements reflect OpenAI's continuous efforts to improve and expand the capabilities of ChatGPT, making it a versatile and powerful application for users.

Over the course of one year, ChatGPT has grown from a web page into a product used by 100 million people every week, more than any other internet productivity tool. Altman applied to ChatGPT the growth methodology he taught entrepreneurs during his time at YC: he iterated the product rapidly, aiming to retain the widest range of users. Experience in internet entrepreneurship shows that a platform economy, with its seemingly unlimited income, can be built only by capturing the most users through a single entry point. ChatGPT is currently the largest AI entry point.

However, during the process of building this entry point, OpenAI clashed directly with its major investor, Microsoft. In February of this year, Microsoft launched New Bing, allowing users to invoke ChatGPT while using Bing search. But three months later, OpenAI introduced a plugin that enables users to invoke search engines when asking questions in ChatGPT. Both combinations involve the use of the large GPT model and Bing search, but the difference in entry points determines which company the users belong to. While the ChatGPT mobile application attracted a large number of users, Bing's market share in the global search engine market dropped to the level of 2018.

Image Credit: OpenAI DevDay Screenshot

At its recent developer event, OpenAI introduced the GPTs feature. According to Altman's demonstration, users need only describe their requirements and upload relevant data files to create a customized version of ChatGPT without writing any code.

OpenAI is also planning to launch a GPT Store where users can upload or download customized versions of GPT, much as Apple built the App Store for mobile applications. Its ambition to build the platform product of the AI era is now on full display.

Apart from ChatGPT, with its rapid growth, no company or entrepreneur has yet used large models alone to challenge the established business rules of an industry. Large companies that already have users and a paid business model, such as Microsoft Office, Salesforce, and Adobe, are incorporating large models into their mature products and charging users an additional $10-20 per month.

Some companies are also hoping to use large models to find new stories for their struggling innovative businesses. The most typical example is Meta, which released Meta Smart Glasses, a hardware product equipped with an AI assistant, in September of this year.

"Before the breakthrough of artificial intelligence last year, I believed that only with the introduction of stronger screens, holographic imaging, and other technologies, smart glasses would be ubiquitous," said Meta CEO Mark Zuckerberg. "Now, I believe that artificial intelligence technology is just as important for the popularity of smart glasses and other AR technologies."

Large companies with cloud computing businesses primarily enter the market of large models to sell resources. Companies like Microsoft, Google, Amazon, Alibaba, etc., invest in large model companies and sell their large model APIs as part of their platforms. They also train large models, but mainly as a customer acquisition strategy to attract customers to use their cloud computing resources for training and deploying large models.

Over the past year, a group of startups has started exploring specific applications of large models. For example, Character.AI uses large models to create various virtual characters, Inflection AI develops personal super assistants, Speak offers virtual English teachers, Jasper utilizes large models for marketing campaigns, and Harvey serves as an AI legal assistant.

Whether it is a large company developing products or a small startup creating AI applications, the biggest challenge they currently face is the high cost of using large models. Microsoft's programming assistant GitHub Copilot, based on GPT-4, has attracted over a million paid users, but the computing power it requires is so significant that it reportedly loses an average of $20 per user per month.

Large models differ from other software in that they are expensive not only to train but also to operate. To process user input, a large model essentially has to run the entire model for each word. Models with billions of parameters require multiple A100 GPUs, worth about $10,000 each, for every run. For example, to process a 100-word question, the model effectively has to run 100 times. The same applies when generating a response, which costs even more.

To attract developers to its platform, OpenAI has significantly reduced the price of using GPT-3.5 and GPT-4.

GPT-3.5 is currently the most affordable large model. Processing a 500-word question and generating a 500-word response costs approximately $0.003. This may seem negligible, but at 10 million queries per day the cost would amount to $30,000 per day. With GPT-4, which is priced far higher, the annual cost would be astronomical.
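The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration: the per-query price and daily volume come from the paragraph above, while the function names and the 365-day extrapolation are assumptions introduced here, and real pricing is billed per token rather than per query.

```python
# Figures quoted in the article; everything else is illustrative.
COST_PER_QUERY_USD = 0.003    # ~500-word question plus ~500-word answer on GPT-3.5
QUERIES_PER_DAY = 10_000_000  # the article's hypothetical traffic volume
DAYS_PER_YEAR = 365           # assumption: constant load all year


def daily_cost(cost_per_query: float, queries_per_day: int) -> float:
    """Total spend for one day of traffic at a flat per-query price."""
    return cost_per_query * queries_per_day


def annual_cost(cost_per_query: float, queries_per_day: int,
                days: int = DAYS_PER_YEAR) -> float:
    """Daily spend extrapolated over a year (hypothetical helper)."""
    return daily_cost(cost_per_query, queries_per_day) * days


if __name__ == "__main__":
    print(f"Daily:  ${daily_cost(COST_PER_QUERY_USD, QUERIES_PER_DAY):,.0f}")
    print(f"Annual: ${annual_cost(COST_PER_QUERY_USD, QUERIES_PER_DAY):,.0f}")
```

Under these assumptions, the daily bill is about $30,000 and the yearly bill is roughly $10.95 million, before any switch to a more expensive model.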

As a result, some companies that had planned to collaborate with OpenAI have turned to open-source alternatives. Salesforce, for example, initially planned to use GPT-4 to transform its extensive enterprise service business. However, it has now started developing its own large models or using open-source models in place of GPT-4 to reduce costs. A senior vice president at Salesforce stated, "As AI products reach larger scales, we are starting to focus on cost-effectiveness, and cost will only become more important."

Companies like OpenAI are continuously adjusting the algorithms of their large models to lower the operating costs. However, they cannot escape the "tax" imposed by NVIDIA. According to a report by Robert Castellano, the president of consulting firm The Information Network, NVIDIA purchases critical components from TSMC and SK Hynix for less than $4,000, manufactures the H100 chip, and sells it for $40,000, resulting in a gross profit margin of over 90%.
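The margin figure follows directly from the two prices cited. The short Python sketch below shows the calculation; the function name is hypothetical, and the result counts only the component cost mentioned in the report, not research, software, or other expenses.

```python
def gross_margin(sale_price: float, unit_cost: float) -> float:
    """Gross margin as a fraction of the sale price."""
    if sale_price <= 0:
        raise ValueError("sale_price must be positive")
    return (sale_price - unit_cost) / sale_price


if __name__ == "__main__":
    # Figures from the article: components under $4,000, sale price $40,000.
    margin = gross_margin(sale_price=40_000, unit_cost=4_000)
    print(f"Gross margin at a $4,000 cost: {margin:.0%}")
```

A component cost of exactly $4,000 gives a 90% margin, so any cost below that, as the report describes, pushes the margin above 90%.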

Currently, the entire industry of large models has found two main solutions. One is for tech giants to develop their own chips. Companies such as Google, Amazon, Microsoft, and even OpenAI are considering self-developed chips specifically for AI computations.

The other solution is to have consumers purchase smartphones and computers that are more suitable for large model computations, sharing the computational cost. When Qualcomm and Apple recently released new laptop chips, they emphasized the ability to run large models with billions of parameters, introducing the concept of "AI PCs."

According to a research report by McKinsey in April of this year, 40% of companies have decided to increase their investments in artificial intelligence due to the emergence of generative AI. Goldman Sachs predicts that global enterprises will invest $110.2 billion in the field of artificial intelligence this year, representing a 20% increase from the previous year.

According to media reports, OpenAI's annual revenue has reached $1.3 billion thanks to ChatGPT, which is 43 times higher than its revenue from the previous year. OpenAI has proven for the first time that a company solely based on advanced AI technology can generate substantial income. OpenAI's valuation has also increased from less than $20 billion in October of last year to nearly $90 billion, making it the third-largest unicorn in the world, surpassed only by ByteDance and SpaceX, the rocket manufacturer.

However, OpenAI is not the biggest beneficiary. Capital markets now have more faith in the potential of tech giants in the field of AI.

These tech giants possess essential resources such as data, computing power, use cases, and customer bases, all of which are indispensable in artificial intelligence. These infrastructure and resource advantages are just as scarce as OpenAI's leading large model technology, if not more so.

Since the release of ChatGPT last year, the S&P 500 has risen by only 10%. Microsoft's stock price, however, has grown by nearly 50%, adding over $740 billion in market value. Its $10 billion investment in OpenAI has essentially paid for itself.

Google, initially thought to be vulnerable, has seen its market value increase by over $320 billion. Meta, which caught up with the trend of open-source large models, has experienced a market value surge of nearly $500 billion.

And of course, we cannot overlook NVIDIA. Leveraging its dominant position in the GPU market, NVIDIA's market value has grown by over $710 billion in the past year, making it the first trillion-dollar company in the chip industry. Just a year ago, they were struggling with sluggish graphics card sales and had to resort to price cuts and promotions.

In 2021, the EU proposed a framework for regulating artificial intelligence, but it did not progress further. After all, artificial intelligence was not considered a hot trend at that time.

However, following the release of ChatGPT, AI legislation worldwide gained momentum. In June, the European Parliament, the primary legislative body of the EU, voted to pass the draft Artificial Intelligence Act (A.I. Act) that had been under consideration for two years. The act imposes strict limits on the scenarios and scope in which artificial intelligence technology may be used. Among other requirements, it obliges generative AI systems like ChatGPT to disclose when content has been generated by artificial intelligence, to be designed so that they do not generate harmful content, and to disclose the copyrighted data used to train the model.

Image Credit: Screenshot of ai.gov

Last month, U.S. President Joe Biden signed an executive order to regulate artificial intelligence. According to a briefing from the White House, the United States has focused its oversight on the next generation of large models. Large AI companies are now required to notify the government when developing large models that pose "serious risks to national security, national economic security, etc."

Disruptive new technologies and regulatory policies often find themselves in opposition, though a balance is gradually struck. When the internet emerged, restricting strong encryption of data transmission was seen as a measure to protect against terrorism, which led browsers to ship with different levels of encryption in different countries. Ride-hailing services ran afoul of regulations in various parts of the world, and cryptocurrencies continue to operate in a legal gray area.

Artificial intelligence is among the few emerging industries actively embracing regulation. Six months after the release of ChatGPT, Altman testified before the U.S. Senate and urged lawmakers to regulate artificial intelligence. He emphasized the potential seriousness of the consequences if issues arise with this technology.

At the end of May, the non-profit organization Center for AI Safety published an open letter urging government agencies to prioritize "mitigating the risk of existential threats caused by artificial intelligence" and treat it with the same caution as epidemics and nuclear war. Prominent figures such as OpenAI executives, Google DeepMind head Demis Hassabis, and Anthropic CEO Dario Amodei signed the letter. They presented evidence suggesting that large models could be misused, such as promoting the spread of false information and aiding in the creation of harmful substances. However, there are equally strong voices opposing stringent AI regulation. Supporters of the open-source movement believe that when technology is freely accessible to everyone, the associated risks can be resolved.

Under strict regulation, new entrants and small companies often struggle to afford compliance costs. In industries such as banking, energy, and tobacco, few new companies have emerged since stringent regulations took effect, which benefits the established companies that were already leading those industries.

Ultimately, Altman acknowledges that only a few companies will have the capacity to build powerful models. In his view this has both benefits and dangers; one benefit is that regulators have fewer companies to scrutinize.
