Large Language Models and ChatGPT

Since its launch in November 2022, ChatGPT has become a hot topic and has taken up more and more space in the media sphere. More and more domains are integrating Large Language Models (LLMs) into their services. However, whether AI’s expanding presence in our lives is beneficial or detrimental remains a subject of debate. Current trends suggest that most people are eager to reap the benefits of LLMs and conversational AI. But what exactly are large language models? How are they created, and why are they so powerful? You’ll find the answers to these questions in this blog post!

Before diving into the world of LLMs, it is important to understand the intuition behind them as well as their limitations. This article aims to equip you with the knowledge you need to understand the technology and to shed some light on the (in)consistencies of ChatGPT.

 

This article was written in collaboration with our R&D team members: Firas Hmida, PhD in Machine Learning & NLP, and Nora Lindvall, NLP master’s student.

 

Language Model Intuition

 

Before answering the question “does ChatGPT really rely on Natural Language Understanding (NLU)?”, let’s look at how LMs work and the intuition behind the technology.

If you ask people whether “It is raining cats and mice” sounds natural, most will answer “no, it should be ‘it is raining cats and dogs’”. That answer is only part of the story: from an NLP point of view, the original word sequence is simply infrequent, and therefore improbable. Humans speak using words that “often” occur together in a well-defined order. Even given only the first part of the sentence, “it is raining”, a native English speaker tends to produce “cats and dogs”, because it is a frequent word combination in English. This is exactly how Language Models (LMs) work.
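To make this intuition concrete, here is a minimal sketch of the frequency idea using a toy bigram model in Python. The corpus and the counts are purely illustrative; real LMs learn these statistics implicitly inside a neural network rather than in an explicit table.

    from collections import Counter, defaultdict

    # Tiny toy corpus standing in for the huge text collections real LMs see.
    corpus = (
        "it is raining cats and dogs . "
        "it is raining hard . "
        "cats and dogs are pets ."
    ).split()

    # Count how often each word follows another (a bigram model).
    followers = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        followers[prev][nxt] += 1

    # The most frequent continuation of "and" in this corpus is "dogs".
    print(followers["and"].most_common(1))  # [('dogs', 2)]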

 

Current LMs are Neural Networks trained on texts produced by humans (and therefore considered “ground truth”). The training process teaches the language model to guess the next word for a given sequence of words. For example, the model takes “Once upon a [blank]” as input and should fill the [blank] with “time”, not with “region”. We tend to talk about language models in terms of probability: a “natural” (humanly acceptable) utterance is a sequence of words with high probability.
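As a concrete illustration of “a sequence of words with high probability”, here is a minimal sketch of how a sequence’s probability decomposes, via the chain rule, into the probability of each word given the words before it. The conditional probabilities are invented for illustration:

    import math

    # Chain rule: P(w1 ... wn) = P(w1) * P(w2 | w1) * ... * P(wn | w1 ... wn-1).
    conditionals = [0.01, 0.40, 0.90, 0.95]  # "once", "upon"|..., "a"|..., "time"|...
    p_time = math.prod(conditionals)

    conditionals[-1] = 0.0001                # swap "time" for "region"
    p_region = math.prod(conditionals)

    print(p_time, p_region)  # the "time" ending is thousands of times more probable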

 

To simplify, at each step the language model assigns a probability to every possible next word, and these candidates are ranked by how likely they are to occur in the given utterance. This means the LM must handle every word of a given language: the more words a language includes, the larger the LM.
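In practice, the network outputs a raw score (a “logit”) for every word in its vocabulary, and a softmax turns those scores into a ranked probability distribution. A minimal sketch with a four-word toy vocabulary (the scores are made up):

    import numpy as np

    # Hypothetical raw scores (logits), one per word in a toy vocabulary.
    vocab = ["time", "morning", "dog", "region"]
    logits = np.array([6.0, 3.5, 1.0, 0.5])

    # Softmax turns the scores into a probability distribution over the vocabulary.
    probs = np.exp(logits) / np.exp(logits).sum()

    # Rank the candidates, most probable first.
    for word, p in sorted(zip(vocab, probs), key=lambda pair: -pair[1]):
        print(f"{word}: {p:.3f}")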

 

Large Language Models 

 

We call them “Large” Language Models (LLMs) because these models have high memory and size requirements: they reach hundreds of gigabytes, due to the inclusion of billions of parameters. Parameters can be thought of as adjustable settings that allow the model to learn; with more parameters, the model can grasp more complex concepts. LLMs used in production, like GPT-3, GPT-4, and ChatGPT, run on large numbers of servers in data centers.
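A back-of-the-envelope calculation shows where those gigabytes come from. GPT-3’s 175 billion parameters are a public figure; storing each one as a 16-bit float (2 bytes) is a common deployment choice, assumed here purely for illustration:

    # Rough memory estimate for GPT-3's weights alone (no activations, no overhead).
    params = 175_000_000_000   # 175 billion parameters (GPT-3)
    bytes_per_param = 2        # one 16-bit float per parameter (assumed)

    size_gb = params * bytes_per_param / 1e9
    print(f"~{size_gb:.0f} GB just to hold the weights")  # ~350 GB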

 

These models are trained on ever more massive datasets, leading to continuous growth in size and a significant increase in power. Thanks to the large amount of data they see during training, LLMs can perform a wide range of tasks with limited or no human guidance: writing essays, answering questions about science and technology, summarizing documents, and even coding. Their fundamental purpose, however, remains to predict the next word in a sentence, much like the autocomplete feature you see when composing an email.

 

Why are LLMs so Powerful?

 

Even though LLMs like ChatGPT merely predict which word will come next in a given sentence, it is crucial to understand that, from a human point of view, this amounts to a highly specialized form of “reasoning” or “thinking” – only one way of thinking.

 

In fact, the initial concept of language models was introduced by Claude Shannon in the 1950s. What is new today is the scale of computing power available in data center servers, combined with modern Machine Learning algorithms.

 

So why are they so powerful?

 

There are two essential components that contribute to the success of these models:

  • The first is their ability to blend word contexts in a manner that greatly enhances their proficiency in predicting the next word (see the sketch after this list);
  • The second lies in the training methodology. Large Language Models are trained on massive quantities of data gathered from various online sources: books, blogs, news websites, Wikipedia articles, forum discussions, and conversations from social media.
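The context-blending ingredient mentioned above is, in Transformer-based LLMs, the attention mechanism. Here is a minimal sketch of scaled dot-product attention; the learned projection matrices of a real Transformer are omitted, and the vectors are random stand-ins for word embeddings:

    import numpy as np

    def attention(Q, K, V):
        # Each position's output is a probability-weighted blend of every
        # position's value vector: this is how word contexts get mixed.
        scores = Q @ K.T / np.sqrt(K.shape[-1])
        weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)
        return weights @ V

    x = np.random.rand(3, 4)   # 3 toy "words", 4-dimensional embeddings
    out = attention(x, x, x)   # each row now carries context from all rows
    print(out.shape)           # (3, 4)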

Throughout training, we provide a batch of text sourced from one of these platforms and ask the model to predict the next word. If the model’s prediction is incorrect, we make slight adjustments to the model until it produces the correct answer. The objective of training an LLM, then, is to generate text that could plausibly have been found on the internet. Since the model cannot memorize the entirety of the internet, it relies on compressed, encoded representations and makes compromises – which may occasionally result in slight inaccuracies, hopefully not significant ones.
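Below is a deliberately tiny caricature of that adjust-until-correct loop: a linear toy model updated with the standard softmax cross-entropy gradient. Real LLMs run the same kind of update across billions of parameters; every name and number here is illustrative:

    import numpy as np

    rng = np.random.default_rng(0)
    vocab_size, dim = 5, 8
    W = rng.normal(0, 0.01, (vocab_size, dim))  # toy "model" weights
    context = rng.normal(0, 1, dim)             # encoded context vector
    target = 2                                  # index of the true next word

    for step in range(100):
        logits = W @ context
        probs = np.exp(logits) / np.exp(logits).sum()
        grad = probs.copy()
        grad[target] -= 1.0                     # cross-entropy gradient w.r.t. logits
        W -= 0.1 * np.outer(grad, context)      # the "slight adjustment"

    print(probs.argmax() == target)             # True: the model now predicts the target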

 

Will ChatGPT take over?

 

Don’t let the human-like interaction fool you – ChatGPT may seem to have a life of its own, but that is just an illusion. Behind the scenes, it simply generates output based on the human-written texts it was trained on, predicting the next word from extensive context. It is by no means conscious, nor does it have a will of its own. Contrary to what we see in movies, there is no need to worry about ChatGPT suddenly turning against humanity and seeking world domination. Dry as it may sound, it is just a model spitting out predictions. One last thing: if you want to know the opinion of Yann LeCun, one of the pioneering researchers behind deep learning, we suggest you check out this interview.
