1 00:00:00,000 --> 00:00:05,834 Welcome to this Team Treehouse micro course, Introducing Large Language Models. 2 00:00:05,834 --> 00:00:10,766 This micro course was written with the help of a large language model 3 00:00:10,766 --> 00:00:13,631 called ChatGPT. Let's get started. 4 00:00:16,740 --> 00:00:21,199 Large Language Models, or LLMs, are a type of machine learning model 5 00:00:21,199 --> 00:00:26,141 specifically trained to understand and generate natural language text. 6 00:00:26,141 --> 00:00:30,764 They belong to a subset of artificial intelligence, or AI, that allows computers to 7 00:00:30,764 --> 00:00:36,110 automatically improve their performance on a task by learning from examples. 8 00:00:36,110 --> 00:00:38,808 In order to understand this concept better, 9 00:00:38,808 --> 00:00:41,436 let's first talk about machine learning. 10 00:00:41,436 --> 00:00:46,199 Machine learning is a method of teaching computers to learn from data without 11 00:00:46,199 --> 00:00:48,188 being explicitly programmed. 12 00:00:48,188 --> 00:00:52,449 It's a type of AI that allows systems to automatically improve their 13 00:00:52,449 --> 00:00:54,440 performance with experience. 14 00:00:54,440 --> 00:00:57,984 Let me give you an example to help illustrate this idea. 15 00:00:57,984 --> 00:01:02,317 Imagine you want to teach a computer to recognize pictures of cats. 16 00:01:02,317 --> 00:01:07,103 You would start by providing the computer with pictures of cats along with 17 00:01:07,103 --> 00:01:10,201 pictures of other animals like dogs and birds. 18 00:01:10,201 --> 00:01:14,503 As the computer sees more and more pictures, it learns to recognize 19 00:01:14,503 --> 00:01:18,737 the characteristics of a cat, like its tail, ears, and whiskers. 20 00:01:18,737 --> 00:01:20,924 This process is called training. 
21 00:01:20,924 --> 00:01:25,159 Once the computer has been trained and you show it a new picture, 22 00:01:25,159 --> 00:01:29,013 it will be able to tell you if it's a picture of a cat or not. 23 00:01:31,129 --> 00:01:36,066 The roots of Large Language Models go back to the mid-20th century and the development of 24 00:01:36,066 --> 00:01:37,982 artificial neural networks, 25 00:01:37,982 --> 00:01:42,634 a type of machine learning model inspired by the way the human brain works. 26 00:01:42,634 --> 00:01:46,810 Large Language Models are typically implemented as neural networks. 27 00:01:46,810 --> 00:01:50,866 Neural networks are good at handling large amounts of data and 28 00:01:50,866 --> 00:01:54,454 can be trained to perform well on a wide range of tasks, 29 00:01:54,454 --> 00:01:58,128 so they're the most popular choice for building LLMs. 30 00:01:58,128 --> 00:02:02,769 The parameters, or values learned during the training process, 31 00:02:02,769 --> 00:02:04,820 are used to make predictions. 32 00:02:04,820 --> 00:02:09,195 LLMs are called large because they typically have a large number of 33 00:02:09,195 --> 00:02:12,321 parameters, which allows them to understand and 34 00:02:12,321 --> 00:02:16,086 generate human language with a high degree of accuracy. 35 00:02:16,086 --> 00:02:20,895 LLMs are used in natural language processing, or NLP, a subfield of AI that 36 00:02:20,895 --> 00:02:25,330 focuses on the interaction between human language and computers. 37 00:02:25,330 --> 00:02:28,480 There are many types of machine learning, but 38 00:02:28,480 --> 00:02:32,534 the two main categories are supervised and unsupervised. 39 00:02:32,534 --> 00:02:37,184 In supervised learning, the computer is given a labeled data set, 40 00:02:37,184 --> 00:02:41,685 which means that the correct answer is provided for each example. 41 00:02:41,685 --> 00:02:46,710 In our cat recognition example, each picture is labeled as cat or not cat. 
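To make the idea of a labeled data set concrete, here is a minimal Python sketch. It is not part of the course material: the two features (ear pointiness and whisker length), their values, and the nearest-centroid approach are all assumptions chosen for illustration, and a real image classifier would learn from pixels rather than hand-picked numbers.

```python
import math

# Hypothetical labeled examples: (ear_pointiness, whisker_length) -> label.
# Both the features and the numbers are made up for illustration.
training_data = [
    ((0.9, 0.8), "cat"),
    ((0.8, 0.9), "cat"),
    ((0.2, 0.1), "not cat"),
    ((0.3, 0.2), "not cat"),
]


def train(examples):
    """Average the feature vectors of each label (a nearest-centroid model)."""
    sums, counts = {}, {}
    for features, label in examples:
        total = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {
        label: [v / counts[label] for v in total]
        for label, total in sums.items()
    }


def predict(model, features):
    """Return the label whose average example is closest to the input."""
    return min(model, key=lambda label: math.dist(model[label], features))


model = train(training_data)
print(predict(model, (0.85, 0.7)))
```

Training here just means averaging the examples for each label; prediction picks the label with the nearest average. The labels do the supervising: without them, the model would have no way to know which group is which.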
42 00:02:46,710 --> 00:02:50,033 The computer learns to find the patterns in the data that 43 00:02:50,033 --> 00:02:52,163 correspond to the correct labels. 44 00:02:52,163 --> 00:02:56,634 In unsupervised learning, the computer is given an unlabeled data set, and 45 00:02:56,634 --> 00:03:00,234 it must find the patterns and structure in the data on its own. 46 00:03:00,234 --> 00:03:04,827 Unsupervised learning can be more challenging than supervised learning, 47 00:03:04,827 --> 00:03:09,418 because there is no clear guidance on what the model should be learning, but 48 00:03:09,418 --> 00:03:14,105 it can still be a powerful tool for uncovering patterns and insights in data. 49 00:03:14,105 --> 00:03:19,063 LLMs are trained primarily using unsupervised learning, because it allows them to learn 50 00:03:19,063 --> 00:03:23,725 patterns in large amounts of text data, such as the vast amount of text data 51 00:03:23,725 --> 00:03:28,170 available on the internet, without the need for explicit supervision. 52 00:03:28,170 --> 00:03:33,026 This makes them ideal for tasks such as natural language understanding and 53 00:03:33,026 --> 00:03:33,905 generation. 54 00:03:33,905 --> 00:03:38,257 But it also raises concerns about their potential to perpetuate biases, 55 00:03:38,257 --> 00:03:40,817 which we'll talk about a little later. 56 00:03:40,817 --> 00:03:44,038 Here are some examples of unsupervised learning. 57 00:03:44,038 --> 00:03:48,620 Clustering is a technique used to group similar data points together. 58 00:03:48,620 --> 00:03:53,569 For example, a clustering algorithm can be used to group customers with similar 59 00:03:53,569 --> 00:03:56,201 purchasing habits into the same cluster. 60 00:03:56,201 --> 00:04:01,116 Dimensionality reduction is the process of reducing the number of features in 61 00:04:01,116 --> 00:04:04,920 the data while preserving as much information as possible. 
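As a rough illustration of dimensionality reduction, the Python sketch below reduces two features to one using principal component analysis, one common technique for this. The course does not name a specific method, so the choice of PCA, the function name `pca_to_1d`, and the sample points are assumptions made for this example.

```python
import math


def pca_to_1d(points):
    """Project 2-D points onto their direction of greatest variance."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Entries of the 2x2 covariance matrix of the centered data.
    sxx = sum(x * x for x, _ in centered) / n
    syy = sum(y * y for _, y in centered) / n
    sxy = sum(x * y for x, y in centered) / n

    # Angle of the principal axis (closed form for a symmetric 2x2 matrix).
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    ux, uy = math.cos(theta), math.sin(theta)

    # Each point is now described by one number instead of two.
    return [x * ux + y * uy for x, y in centered]


# Hypothetical data: two features that rise and fall together.
points = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
projected = pca_to_1d(points)
print(projected)
```

Because the two features in this made-up data move together, a single number per point keeps all of the spread in the data. With real data some information is usually lost, and the goal is to lose as little as possible.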
62 00:04:04,920 --> 00:04:08,680 A real-world example is in the field of image compression. 63 00:04:08,680 --> 00:04:13,392 Digital images are often large in size and can take up a lot of storage space. 64 00:04:13,392 --> 00:04:17,190 Dimensionality reduction can be used to reduce the amount of 65 00:04:17,190 --> 00:04:21,068 data needed to represent an image while preserving its overall appearance. 66 00:04:21,068 --> 00:04:26,061 Anomaly detection is a technique used to identify data points that deviate from 67 00:04:26,061 --> 00:04:28,230 the normal or expected behavior. 68 00:04:28,230 --> 00:04:31,338 For instance, this can be used to detect fraud. 69 00:04:31,338 --> 00:04:36,378 If a credit card transaction is abnormal, it can be flagged as suspicious. 70 00:04:36,378 --> 00:04:40,954 Unsupervised learning can help to discover hidden patterns and 71 00:04:40,954 --> 00:04:45,460 relationships in data, even when we don't have any labeled data. 72 00:04:45,460 --> 00:04:50,717 And it can be useful for many industrial applications, such as quality control, 73 00:04:50,717 --> 00:04:52,684 fraud detection, and more. 74 00:04:52,684 --> 00:04:57,388 Next, let's take a look at the ways biases can occur in training data. 75 00:04:57,388 --> 00:05:02,092 Bias in training data refers to the phenomenon where the training data 76 00:05:02,092 --> 00:05:06,717 used to train a machine learning model does not accurately represent 77 00:05:06,717 --> 00:05:08,722 the population of interest. 78 00:05:08,722 --> 00:05:13,083 This can lead to models that perform well on the training data but 79 00:05:13,083 --> 00:05:17,545 poorly on new, unseen data, causing inaccurate or unfair results. 80 00:05:17,545 --> 00:05:21,772 Biases in training data may be introduced in a number of ways. 
81 00:05:21,772 --> 00:05:26,543 Sampling bias occurs when the training data is not randomly sampled from 82 00:05:26,543 --> 00:05:29,985 the population, leading to overrepresentation or 83 00:05:29,985 --> 00:05:32,815 underrepresentation of certain groups. 84 00:05:32,815 --> 00:05:37,569 Measurement bias occurs when the features used to represent the data are not 85 00:05:37,569 --> 00:05:41,440 relevant or are measured differently for different groups. 86 00:05:41,440 --> 00:05:46,383 Demographic bias occurs when the majority of data used to train the model is 87 00:05:46,383 --> 00:05:51,175 from a specific group and not representative of minority groups. 88 00:05:51,175 --> 00:05:55,412 It can lead to a model that performs well on the majority group, but 89 00:05:55,412 --> 00:05:57,237 poorly on minority groups. 90 00:05:57,237 --> 00:06:01,490 Temporal bias occurs when the data used to train the model is from 91 00:06:01,490 --> 00:06:06,000 a specific time period and not representative of the current time. 92 00:06:06,000 --> 00:06:10,120 This can lead to a model that performs well on historical data, but 93 00:06:10,120 --> 00:06:11,677 poorly on current data. 94 00:06:11,677 --> 00:06:15,962 It's important to be aware of potential biases in training data and 95 00:06:15,962 --> 00:06:19,499 address them by collecting more representative data or 96 00:06:19,499 --> 00:06:24,710 by using techniques such as resampling, data preprocessing, or reweighting. 97 00:06:24,710 --> 00:06:31,648 Next, let's look at some LLMs in use today and the types of products that use them. 98 00:06:31,648 --> 00:06:36,132 Some examples of Large Language Models in use today 99 00:06:36,132 --> 00:06:40,000 include GPT-3, BERT, T5, and XLNet. 
100 00:06:40,000 --> 00:06:44,190 These models have been trained on massive amounts of text data and 101 00:06:44,190 --> 00:06:48,636 can perform a variety of language tasks such as translation, 102 00:06:48,636 --> 00:06:51,810 summarization, and question answering. 103 00:06:51,810 --> 00:06:56,693 In fact, the majority of this micro course was written with ChatGPT, 104 00:06:56,693 --> 00:06:58,450 a variation of GPT-3. 105 00:06:58,450 --> 00:07:02,631 ChatGPT is trained on an immense dataset of conversational text, 106 00:07:02,631 --> 00:07:07,182 which allows the model to understand and generate human-like text that is 107 00:07:07,182 --> 00:07:10,930 appropriate for a wide range of conversational contexts. 108 00:07:10,930 --> 00:07:14,545 It can be used in many natural language processing tasks, 109 00:07:14,545 --> 00:07:19,070 such as creating learning resources like the one you're watching now. 110 00:07:19,070 --> 00:07:23,006 Large Language Models are used in a variety of products, 111 00:07:23,006 --> 00:07:26,128 including language translation services. 112 00:07:26,128 --> 00:07:31,098 A great example of a language translation service is Google Translate, 113 00:07:31,098 --> 00:07:36,069 a free online language translation service developed by Google that can 114 00:07:36,069 --> 00:07:41,380 translate text, speech, images, and web pages in over 100 languages. 115 00:07:41,380 --> 00:07:46,534 LLMs are also used to power the conversational abilities of chatbots and 116 00:07:46,534 --> 00:07:48,146 virtual assistants. 117 00:07:48,146 --> 00:07:52,201 They can understand and respond to a wide range of inputs, 118 00:07:52,201 --> 00:07:55,777 from simple questions to complex conversations. 119 00:07:55,777 --> 00:07:58,919 Siri and Alexa are two well-known examples. 120 00:07:59,960 --> 00:08:04,570 LLMs are also used in content generation for social media and other forms of digital media. 
121 00:08:04,570 --> 00:08:08,986 LLMs can generate human-like text, which can be used to produce articles, 122 00:08:08,986 --> 00:08:11,960 product descriptions, and scripts for instructional videos. 123 00:08:13,080 --> 00:08:16,171 Automated writing and content creation tools: 124 00:08:16,171 --> 00:08:20,700 ChatGPT is a recent high-profile example of a product in this category. 125 00:08:21,810 --> 00:08:26,253 Sentiment analysis: LLMs can be used to understand the emotions and 126 00:08:26,253 --> 00:08:29,785 opinions expressed in text and social media content. 127 00:08:29,785 --> 00:08:34,360 Text summarization, analysis, and code assistance tools: GitHub Copilot is one example. 128 00:08:34,360 --> 00:08:39,151 Copilot is a code completion and code assistance tool developed by GitHub, 129 00:08:39,151 --> 00:08:44,018 designed to help developers write code more efficiently and accurately by 130 00:08:44,018 --> 00:08:48,747 providing suggestions and predictions for code snippets as they type. 131 00:08:48,747 --> 00:08:53,432 GitHub Copilot uses machine learning models to understand the context 132 00:08:53,432 --> 00:08:57,499 of the code and predict what the developer is trying to write. 133 00:08:58,570 --> 00:09:03,051 Demand for professionals in these areas has grown rapidly in recent years, 134 00:09:03,051 --> 00:09:06,280 as more and more companies adopt these technologies. 135 00:09:07,710 --> 00:09:10,900 So, what does the future hold for Large Language Models? 136 00:09:12,000 --> 00:09:15,321 The world is likely to see continued advancements and 137 00:09:15,321 --> 00:09:19,897 increased adoption in a wide range of industries as the technology behind 138 00:09:19,897 --> 00:09:23,829 language models improves and becomes more sophisticated. 139 00:09:23,829 --> 00:09:29,150 The outlook for career opportunities in Large Language Models is positive. 
140 00:09:29,150 --> 00:09:32,186 With more and more companies using language models for 141 00:09:32,186 --> 00:09:36,740 tasks like language translation and chatbots, there are many opportunities for 142 00:09:36,740 --> 00:09:41,294 individuals with the right skills and experience to work on these applications, 143 00:09:41,294 --> 00:09:44,280 and to work within the broader field of data science. 144 00:09:45,770 --> 00:09:50,190 LLMs are only one part of the vast field known as data science. 145 00:09:51,280 --> 00:09:53,892 Data science is a rapidly growing field, 146 00:09:53,892 --> 00:09:59,150 encompassing a wide range of techniques, tools, and applications. 147 00:09:59,150 --> 00:10:02,448 With technologies advancing every day, the demand for 148 00:10:02,448 --> 00:10:05,475 data scientists will continue to grow as well. 149 00:10:05,475 --> 00:10:09,160 In fact, it is expected to grow faster than the average for most other occupations. 150 00:10:10,290 --> 00:10:15,240 As new discoveries are made in the field, new professions will emerge too. 151 00:10:16,650 --> 00:10:22,560 So, if you're asking yourself how you can get started in the field of LLMs, 152 00:10:22,560 --> 00:10:24,560 here are a few next steps to consider. 153 00:10:26,020 --> 00:10:30,856 Study the fundamentals of machine learning and natural language processing, 154 00:10:30,856 --> 00:10:34,980 and experiment with pretrained LLMs like GPT-3, BERT, or T5. 155 00:10:36,100 --> 00:10:38,665 This will help you understand how they work and 156 00:10:38,665 --> 00:10:41,110 how to use them in real-world applications. 157 00:10:42,520 --> 00:10:45,391 Join a community or forum where people discuss and 158 00:10:45,391 --> 00:10:47,520 share their experiences with LLMs. 159 00:10:49,040 --> 00:10:51,290 Develop your own LLMs. 
160 00:10:51,290 --> 00:10:54,610 You can use open source libraries such as TensorFlow or 161 00:10:54,610 --> 00:10:58,450 PyTorch to build your own models and gain hands-on experience. 162 00:10:59,880 --> 00:11:04,360 It's worth noting that the field of data science is rapidly evolving. 163 00:11:04,360 --> 00:11:08,000 The skills required to work with LLMs are constantly changing, 164 00:11:08,000 --> 00:11:11,349 so staying up to date with the latest developments is essential for 165 00:11:11,349 --> 00:11:12,860 professionals in this field. 166 00:11:14,270 --> 00:11:18,914 Finally, consider taking a course or earning a degree in a related field such 167 00:11:18,914 --> 00:11:23,070 as computer science, data science, or artificial intelligence. 168 00:11:24,190 --> 00:11:26,950 To continue your learning journey with Treehouse, 169 00:11:26,950 --> 00:11:28,900 check out these Treehouse courses. 170 00:11:30,350 --> 00:11:33,003 In the course Machine Learning Basics, 171 00:11:33,003 --> 00:11:36,210 you'll dive deeper into machine learning frameworks. 172 00:11:37,320 --> 00:11:40,459 You'll learn to use a Python library called scikit-learn, 173 00:11:40,459 --> 00:11:44,800 which includes well-designed tools for performing common machine learning tasks, 174 00:11:44,800 --> 00:11:46,259 as well as Anaconda, 175 00:11:46,259 --> 00:11:51,298 a Python-based platform focused on data science and machine learning. 176 00:11:54,198 --> 00:11:58,930 In Introduction to Algorithms, take your first steps toward understanding 177 00:11:58,930 --> 00:12:03,092 the world of algorithms, time complexity, and data structures. 178 00:12:03,092 --> 00:12:07,592 In this course, our teaching team will examine algorithmic thinking, and 179 00:12:07,592 --> 00:12:10,701 you will learn how to implement algorithms in code. 
180 00:12:12,688 --> 00:12:17,744 And Data Analysis Basics provides a comprehensive overview 181 00:12:17,744 --> 00:12:22,016 of charting, visualizing, and analyzing data. 182 00:12:24,174 --> 00:12:28,456 We hope you found this micro course helpful in introducing you to 183 00:12:28,456 --> 00:12:30,131 Large Language Models. 184 00:12:30,131 --> 00:12:35,033 As we've seen, machine learning and AI are positioned as the future of tech. 185 00:12:35,033 --> 00:12:39,782 We at Treehouse hope this step we've taken into AI-generated 186 00:12:39,782 --> 00:12:43,610 content has been a positive experience for you. 187 00:12:43,610 --> 00:12:47,070 We want to know what you thought of this video. 188 00:12:47,070 --> 00:12:52,530 Please email your honest feedback to feedback@teamtreehouse.com. 189 00:12:52,530 --> 00:12:53,811 Thanks for watching.