GPT-3: we are at the beginning of a new application ecosystem



The most impressive thing about OpenAI’s natural language processing (NLP) model, GPT-3, is its sheer size. With more than 175 billion weighted connections between words, known as parameters, the transformer-based model blows its 1.5-billion-parameter predecessor, GPT-2, out of the water. That scale is what lets the model generate surprisingly human-like text after being given only a few examples of the task you want it to accomplish.

Its launch in 2020 dominated headlines, and people scrambled to get on the waiting list for access to its API, hosted on OpenAI’s cloud service. Now, months later, as more users gain access to that API (myself included), interesting applications and use cases are emerging every day. For example, Debuild.co has some really interesting demos in which you can build an application by giving the program a few simple instructions in plain English.

Despite the hype, questions remain about whether GPT-3 will be the foundation on which an ecosystem of NLP applications rests, or whether newer, stronger NLP models will knock it off its throne. As companies begin to imagine and develop NLP applications, here is what they should know about GPT-3 and its potential ecosystem.

GPT-3 and the NLP arms race

As I have described previously, there are two broad approaches to pre-training an NLP model: generalized and non-generalized.

A non-generalized approach uses specific pre-training objectives that are aligned with a known use case. In essence, these models go deep on a smaller, more focused data set rather than a massive one. An example is Google’s PEGASUS model, which is built specifically for text summarization. PEGASUS is pre-trained on a data set that closely resembles its final objective and is then fine-tuned on text summarization data sets to deliver state-of-the-art results. The benefit of the non-generalized approach is that it can dramatically increase accuracy for specific tasks. However, it is also significantly less flexible than a generalized model and still requires many training examples before it reaches that accuracy.
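To make the contrast concrete, here is a minimal sketch of running a summarization-specialized PEGASUS checkpoint with the Hugging Face transformers library. The checkpoint name (google/pegasus-xsum), the sample text, and the generation settings are illustrative assumptions on my part, not details from the PEGASUS paper.

# Summarization with a task-specific (non-generalized) model: PEGASUS.
# Requires: pip install transformers torch
from transformers import PegasusTokenizer, PegasusForConditionalGeneration

model_name = "google/pegasus-xsum"  # a checkpoint fine-tuned for summarization
tokenizer = PegasusTokenizer.from_pretrained(model_name)
model = PegasusForConditionalGeneration.from_pretrained(model_name)

article = (
    "GPT-3 is a 175-billion-parameter language model that can attempt many NLP "
    "tasks from just a few examples, while smaller specialized models such as "
    "PEGASUS are pre-trained and fine-tuned for a single task like summarization."
)

inputs = tokenizer(article, truncation=True, padding="longest", return_tensors="pt")
summary_ids = model.generate(**inputs, max_length=60, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))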

A generalized approach, in contrast, goes broad. This is GPT-3’s 175 billion parameters in action: the model is pre-trained on, essentially, the entire internet. That breadth lets GPT-3 handle basically any NLP task with just a few examples, although its accuracy is not always ideal. In fact, the OpenAI team highlights the limits of broad pre-training and even concedes that GPT-3 has “notable weaknesses in text synthesis.”
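For illustration, here is a minimal sketch of few-shot prompting against GPT-3 using the OpenAI Python client as it existed around the model’s launch; the task, the example reviews, and the parameter values are assumptions for demonstration, and the client interface has since changed.

# Few-shot sentiment classification with GPT-3 via the early OpenAI Python client.
# Requires: pip install openai (and an API key from the GPT-3 waitlist)
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

# A handful of examples in the prompt is enough to "program" the task.
prompt = (
    "Classify the sentiment of each review as Positive or Negative.\n\n"
    "Review: The battery lasts all day and the screen is gorgeous.\n"
    "Sentiment: Positive\n\n"
    "Review: It broke after a week and support never replied.\n"
    "Sentiment: Negative\n\n"
    "Review: Setup took five minutes and everything just worked.\n"
    "Sentiment:"
)

response = openai.Completion.create(
    engine="davinci",   # the original 175B GPT-3 engine
    prompt=prompt,
    max_tokens=1,
    temperature=0,
    stop="\n",
)
print(response.choices[0].text.strip())  # expected: Positive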

OpenAI has decided that bigger is better when it comes to accuracy problems, with each version of the model increasing the parameter count by orders of magnitude. Competitors have noticed. Google researchers recently published a paper on the Switch Transformer, an NLP model with 1.6 trillion parameters. That is a staggering number, and it suggests we may see something of an arms race in generalized models. While those are by far the two largest generalized models, Microsoft’s Turing-NLG, at 17 billion parameters, suggests the company may be looking to enter the race as well. And when you consider that training GPT-3 reportedly cost OpenAI nearly $12 million, this arms race could get expensive.

Promising GPT-3 applications

GPT-3’s flexibility is what makes it appealing from an application-ecosystem perspective. You can use it to do just about anything you can imagine doing with language. Predictably, startups have begun exploring how to use GPT-3 to power the next generation of NLP applications. Alex Schmitt of Cherry Ventures has compiled a list of interesting GPT-3 products.

Many of these applications are consumer-facing, such as a “Love Letter Generator,” but there are also more technical ones, such as an “HTML Generator.” As companies consider how and where to incorporate GPT-3 into their business processes, some of the most promising early use cases are in healthcare, finance, and video conferencing.

For healthcare, financial services, and insurance companies, simplifying research is a huge need. The volume of data in these fields is growing exponentially, and staying on top of one’s field is becoming impossible in the face of that flood. NLP applications built on GPT-3 can parse the most recent reports, articles, and findings and contextually summarize the key takeaways, saving researchers time.

And as video conferencing and telehealth became increasingly important during the pandemic, we have seen rising demand for NLP tools that can be applied to video meetings. GPT-3 offers the ability not only to transcribe and take notes on an individual meeting, but also to generate “too long; didn’t read” (TL;DR) summaries.
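As a rough illustration of that TL;DR use case, the sketch below asks GPT-3 to summarize a short meeting transcript with the same early OpenAI client as above; the transcript and generation settings are invented for the example.

# TL;DR summarization of a meeting transcript with GPT-3.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

transcript = (
    "Alice: The Q3 numbers are in and revenue is up 12 percent.\n"
    "Bob: Most of the growth came from the new enterprise tier.\n"
    "Alice: Let's shift two engineers to enterprise onboarding next sprint.\n"
    "Bob: Agreed. I'll update the roadmap by Friday."
)

# Appending "tl;dr:" is a simple way to prompt GPT-3 for a summary.
response = openai.Completion.create(
    engine="davinci",
    prompt=transcript + "\n\ntl;dr:",
    max_tokens=60,
    temperature=0.3,
)
print(response.choices[0].text.strip())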

How companies and startups can differentiate themselves

Despite these promising use cases, the primary inhibitor of a GPT-3 application ecosystem is the ease with which a copycat could replicate the performance of any application developed with the GPT-3 API.

Everyone using the GPT-3 API receives the same NLP model pre-trained on the same data, so the only differentiator is the fine-tuning data an organization uses to specialize the model for its use case. The more fine-tuning data you have, the more differentiated and sophisticated the output can be.
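As a sketch of that point, the snippet below shows one way two organizations could get different behavior out of the identical base model simply by conditioning it on their own example data. The helper function, the example records, and the prompt format are all hypothetical.

# Same pre-trained GPT-3 model, different proprietary examples -> different outputs.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

def build_prompt(own_examples, query):
    # Assemble a few-shot prompt from an organization's own labeled data.
    lines = ["Answer the customer question in the company's support style.\n"]
    for question, answer in own_examples:
        lines.append(f"Q: {question}\nA: {answer}\n")
    lines.append(f"Q: {query}\nA:")
    return "\n".join(lines)

# Hypothetical proprietary examples; a competitor calling the same API with
# different examples would get noticeably different answers.
company_examples = [
    ("How do I reset my password?", "Head to Settings > Security and tap Reset."),
    ("Can I export my data?", "Yes, use the Export button on the Account page."),
]

response = openai.Completion.create(
    engine="davinci",
    prompt=build_prompt(company_examples, "How do I close my account?"),
    max_tokens=40,
    temperature=0.2,
    stop="\nQ:",
)
print(response.choices[0].text.strip())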

What does this mean? Larger organizations, with more users or more data than their competitors, will be best positioned to capitalize on GPT-3’s promise. GPT-3 will not fuel disruptive startups; rather, it will let enterprises and large organizations optimize their offerings thanks to the data advantage they already hold.

What does this mean for companies and startups in the future?

Applications built with the GPT-3 API are only beginning to scratch the surface of possible use cases, so we have yet to see an interesting ecosystem develop much beyond proofs of concept. How such an ecosystem would monetize and mature is also an open question.

Because differentiation in this context depends on fine-tuning data, I expect companies to adopt GPT-3’s generalized model for certain NLP tasks while keeping non-generalized models, such as PEGASUS, for more specialized NLP tasks.

In addition, as parameter counts expand exponentially among the major NLP players, we may see users switch ecosystems depending on who is in the lead at any given moment.

Regardless of whether a GPT-3 application ecosystem matures or is supplanted by another NLP model, companies should be excited by the relative ease with which it is becoming possible to build highly articulate NLP applications. They should explore use cases and consider how to leverage their market position to quickly build added value for their customers and their own business processes.

Dattaraj Rao is an innovation and R&D architect at Persistent Systems and author of the book Keras to Kubernetes: The Journey of a Machine Learning Model to Production. At Persistent Systems, he leads the AI Research Lab. He holds 11 patents in machine learning and computer vision.
