1
00:00:00,000 --> 00:00:05,834
Welcome to this Team Treehouse micro
course, Introducing Large Language Models.

2
00:00:05,834 --> 00:00:10,766
This micro course was written with
the help of a large language model

3
00:00:10,766 --> 00:00:13,631
called ChatGPT. Let's get started.

4
00:00:16,740 --> 00:00:21,199
Large Language Models, or
LLMs, are a type of machine learning model,

5
00:00:21,199 --> 00:00:26,141
specifically trained to understand and
generate natural language text.

6
00:00:26,141 --> 00:00:30,764
Machine learning is a subset of artificial
intelligence, or AI, that allows computers to

7
00:00:30,764 --> 00:00:36,110
automatically improve their performance in
a task by learning from examples provided.

8
00:00:36,110 --> 00:00:38,808
In order to understand
this concept better,

9
00:00:38,808 --> 00:00:41,436
let's first talk about machine learning.

10
00:00:41,436 --> 00:00:46,199
Machine learning is a method of teaching
computers to learn from data without

11
00:00:46,199 --> 00:00:48,188
being explicitly programmed.

12
00:00:48,188 --> 00:00:52,449
It's a type of AI that allows systems
to automatically improve their

13
00:00:52,449 --> 00:00:54,440
performance with experience.

14
00:00:54,440 --> 00:00:57,984
Let me give you an example to
help illustrate this idea.

15
00:00:57,984 --> 00:01:02,317
Imagine you want to teach a computer
to recognize pictures of cats.

16
00:01:02,317 --> 00:01:07,103
You would start by providing the computer
with pictures of cats along with

17
00:01:07,103 --> 00:01:10,201
pictures of other animals like dogs and
birds.

18
00:01:10,201 --> 00:01:14,503
As the computer sees more and
more pictures, it learns to recognize

19
00:01:14,503 --> 00:01:18,737
the characteristics of a cat, like
its tail, ears, and whiskers.

20
00:01:18,737 --> 00:01:20,924
This process is called training.

21
00:01:20,924 --> 00:01:25,159
Once the computer has been trained and
you show it a new picture,

22
00:01:25,159 --> 00:01:29,013
it will be able to tell you if
it's a picture of a cat or not.

23
00:01:31,129 --> 00:01:36,066
Large Language Models trace their roots to the
mid-20th century and the development of

24
00:01:36,066 --> 00:01:37,982
artificial neural networks,

25
00:01:37,982 --> 00:01:42,634
a type of machine learning model inspired
by the way the human brain works.

26
00:01:42,634 --> 00:01:46,810
Large Language Models are typically
implemented as neural networks.

27
00:01:46,810 --> 00:01:50,866
Neural networks are good at
handling large amounts of data and

28
00:01:50,866 --> 00:01:54,454
can be trained to perform well
on a wide range of tasks,

29
00:01:54,454 --> 00:01:58,128
so they're the most popular choice for
building LLMs.

30
00:01:58,128 --> 00:02:02,769
The parameters or values that
are learned during the training process

31
00:02:02,769 --> 00:02:04,820
are used to make predictions.

32
00:02:04,820 --> 00:02:09,195
LLMs are called large because they
typically have a large number of

33
00:02:09,195 --> 00:02:12,321
parameters, which allows
them to understand and

34
00:02:12,321 --> 00:02:16,086
generate human language with
a high degree of accuracy.

35
00:02:16,086 --> 00:02:20,895
LLMs are used in natural language
processing, or NLP, a subfield of AI that

36
00:02:20,895 --> 00:02:25,330
focuses on the interaction between
human language and computers.

37
00:02:25,330 --> 00:02:28,480
There are many types of machine learning,
but

38
00:02:28,480 --> 00:02:32,534
the two main categories are supervised and
unsupervised.

39
00:02:32,534 --> 00:02:37,184
In supervised learning,
the computer is given a labeled data set,

40
00:02:37,184 --> 00:02:41,685
which means that the correct answer
is provided for each example.

41
00:02:41,685 --> 00:02:46,710
In our cat recognition example,
each picture is labeled as cat or not cat.

42
00:02:46,710 --> 00:02:50,033
The computer learns to find
the patterns in the data that

43
00:02:50,033 --> 00:02:52,163
correspond to the correct labels.
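
As a rough illustration of supervised learning, here is a minimal Python sketch. The feature names (ear pointiness, whisker length) and all values are made up for this example, and the nearest-centroid method is only one simple way to learn from labeled data, not the method any particular product uses.

```python
# Toy supervised learning: every training example comes with its correct label.
# Feature names and values are hypothetical, chosen only for illustration.

def train(examples):
    """Average the feature vectors of each label (a nearest-centroid model)."""
    sums, counts = {}, {}
    for features, label in examples:
        total = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in total] for label, total in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest to the new example."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# (ear_pointiness, whisker_length) -> label
labeled_data = [
    ((0.9, 0.8), "cat"),
    ((0.8, 0.9), "cat"),
    ((0.2, 0.1), "not cat"),
    ((0.3, 0.2), "not cat"),
]

model = train(labeled_data)            # the "training" step
prediction = predict(model, (0.85, 0.7))  # a new, unseen example
print(prediction)
```

The labels do the guiding here: the model only has to find the pattern that separates the examples marked "cat" from the rest.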

44
00:02:52,163 --> 00:02:56,634
In unsupervised learning, the computer
is given an unlabeled data set, and

45
00:02:56,634 --> 00:03:00,234
it must find the patterns and
structure in the data on its own.

46
00:03:00,234 --> 00:03:04,827
Unsupervised learning can be more
challenging than supervised learning,

47
00:03:04,827 --> 00:03:09,418
because there is no clear guidance on
what the model should be learning, but

48
00:03:09,418 --> 00:03:14,105
it can still be a powerful tool for
uncovering patterns and insights in data.

49
00:03:14,105 --> 00:03:19,063
LLMs are trained using unsupervised
learning, because it allows them to learn

50
00:03:19,063 --> 00:03:23,725
patterns in large amounts of text data,
such as the vast amount of text data

51
00:03:23,725 --> 00:03:28,170
available on the internet, without
the need for explicit supervision.

52
00:03:28,170 --> 00:03:33,026
This makes them ideal for tasks such
as natural language understanding and

53
00:03:33,026 --> 00:03:33,905
generation.

54
00:03:33,905 --> 00:03:38,257
This also raises concerns about
their ability to perpetuate biases,

55
00:03:38,257 --> 00:03:40,817
which we'll talk about a little later.

56
00:03:40,817 --> 00:03:44,038
Here are some examples of
unsupervised learning.

57
00:03:44,038 --> 00:03:48,620
Clustering is a technique used to
group similar data points together.

58
00:03:48,620 --> 00:03:53,569
For example, a clustering algorithm can
be used to group customers with similar

59
00:03:53,569 --> 00:03:56,201
purchasing habits into the same cluster.
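
A small sketch of the clustering idea, assuming each customer is described by a single made-up number (monthly spend). It uses a bare-bones version of k-means; the data, the starting centers, and the function name are hypothetical.

```python
# Toy clustering: group customers by monthly spend without any labels.
# The spend values and starting centers are invented for illustration.

def k_means_1d(values, centers, iterations=10):
    """Repeatedly assign each value to its nearest center, then move
    each center to the average of the values assigned to it."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Keep the old center if a cluster ended up empty.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


monthly_spend = [12, 15, 14, 210, 190, 205, 11]
centers, clusters = k_means_1d(monthly_spend, centers=[0.0, 100.0])
print(clusters)
```

No one told the algorithm which customers are low or high spenders; the two groups emerge from the data alone.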

60
00:03:56,201 --> 00:04:01,116
Dimensionality reduction is the process
of reducing the number of features in

61
00:04:01,116 --> 00:04:04,920
the data while preserving as
much information as possible.

62
00:04:04,920 --> 00:04:08,680
A real-world example is in
the field of image compression.

63
00:04:08,680 --> 00:04:13,392
Digital images are often large in size and
can take up a lot of storage space.

64
00:04:13,392 --> 00:04:17,190
Dimensionality reduction can be
used to reduce the number of

65
00:04:17,190 --> 00:04:21,068
values needed to store an image while
preserving its overall appearance.

66
00:04:21,068 --> 00:04:26,061
Anomaly detection is a technique used to
identify data points that deviate from

67
00:04:26,061 --> 00:04:28,230
the normal or expected behavior.

68
00:04:28,230 --> 00:04:31,338
For instance,
this can be used to detect fraud.

69
00:04:31,338 --> 00:04:36,378
If a credit card transaction is abnormal,
it can be flagged as suspicious.
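
One simple way to sketch this kind of check in Python is to compare each transaction with the typical amount. The amounts and the threshold below are assumptions for illustration only; real fraud systems are far more involved.

```python
# Toy anomaly detection: flag amounts that sit far from the typical value.
# The transaction amounts and the threshold are hypothetical.
from statistics import median


def flag_anomalies(amounts, threshold=5.0):
    """Flag values whose distance from the median is more than
    `threshold` times the median absolute deviation (MAD)."""
    center = median(amounts)
    mad = median([abs(a - center) for a in amounts])
    if mad == 0:
        # All typical values are identical; anything different stands out.
        return [a for a in amounts if a != center]
    return [a for a in amounts if abs(a - center) / mad > threshold]


transactions = [20, 25, 22, 19, 24, 21, 950]
flagged = flag_anomalies(transactions)
print(flagged)
```

The median is used instead of the average so that one extreme value does not distort the notion of "normal" it is being compared against.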

70
00:04:36,378 --> 00:04:40,954
Unsupervised learning
can help to discover hidden patterns and

71
00:04:40,954 --> 00:04:45,460
relationships in data even when
we don't have any labeled data.

72
00:04:45,460 --> 00:04:50,717
And it can be useful for many industrial
applications such as quality control,

73
00:04:50,717 --> 00:04:52,684
fraud detection, and more.

74
00:04:52,684 --> 00:04:57,388
Next, let's take a look at the ways
biases can occur in training data.

75
00:04:57,388 --> 00:05:02,092
Bias in training data refers to
the phenomenon where the training data

76
00:05:02,092 --> 00:05:06,717
used to train a machine learning
model does not accurately represent

77
00:05:06,717 --> 00:05:08,722
the population of interest.

78
00:05:08,722 --> 00:05:13,083
This can lead to models that perform
well on the training data but

79
00:05:13,083 --> 00:05:17,545
poorly on new, unseen data, causing
inaccurate or unfair results.

80
00:05:17,545 --> 00:05:21,772
Biases in training data may be
introduced in a number of ways.

81
00:05:21,772 --> 00:05:26,543
Sampling bias occurs when the training
data is not randomly sampled from

82
00:05:26,543 --> 00:05:29,985
the population,
leading to overrepresentation or

83
00:05:29,985 --> 00:05:32,815
underrepresentation of certain groups.

84
00:05:32,815 --> 00:05:37,569
Measurement bias occurs when the features
used to represent the data are not

85
00:05:37,569 --> 00:05:41,440
relevant or are measured differently for
different groups.

86
00:05:41,440 --> 00:05:46,383
Demographic bias occurs when the majority
of data used to train the model is

87
00:05:46,383 --> 00:05:51,175
from a specific group and not
representative of minority groups.

88
00:05:51,175 --> 00:05:55,412
It can lead to a model that performs
well on the majority group, but

89
00:05:55,412 --> 00:05:57,237
poorly on minority groups.

90
00:05:57,237 --> 00:06:01,490
Temporal bias occurs when the data
used to train the model is from

91
00:06:01,490 --> 00:06:06,000
a specific time period and
not representative of the current time.

92
00:06:06,000 --> 00:06:10,120
This can lead to a model that
performs well on historical data, but

93
00:06:10,120 --> 00:06:11,677
poorly on current data.

94
00:06:11,677 --> 00:06:15,962
It's important to be aware of
potential biases in training data and

95
00:06:15,962 --> 00:06:19,499
address them by collecting
more representative data or

96
00:06:19,499 --> 00:06:24,710
by using techniques such as resampling,
data preprocessing, or reweighting.
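
To make reweighting a little more concrete, here is a minimal sketch, assuming a made-up data set in which group "A" has far more examples than group "B". Each example gets a weight inversely proportional to how common its group is; the group names and counts are hypothetical.

```python
# Toy reweighting: give examples from underrepresented groups more weight.
# Group names and counts are invented for illustration.
from collections import Counter


def inverse_frequency_weights(groups):
    """Return one weight per group so that every group contributes
    the same total weight during training."""
    counts = Counter(groups)
    total = len(groups)
    return {g: total / (len(counts) * n) for g, n in counts.items()}


groups = ["A"] * 8 + ["B"] * 2
weights = inverse_frequency_weights(groups)
print(weights)

# Total weight per group after reweighting.
totals = {g: weights[g] * n for g, n in Counter(groups).items()}
print(totals)
```

After reweighting, both groups carry the same total weight, so a model trained with these weights is less dominated by the larger group.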

97
00:06:24,710 --> 00:06:31,648
Next, let's look at some LLMs in use today
and the types of products that use them.

98
00:06:31,648 --> 00:06:36,132
Some examples of
Large Language Models in use today

99
00:06:36,132 --> 00:06:40,000
include GPT-3, BERT, T5, and XLNet.

100
00:06:40,000 --> 00:06:44,190
These models have been trained on
massive amounts of text data and

101
00:06:44,190 --> 00:06:48,636
can perform a variety of language
tasks such as translation,

102
00:06:48,636 --> 00:06:51,810
summarization, and question answering.

103
00:06:51,810 --> 00:06:56,693
In fact, the majority of this micro
course was written with ChatGPT,

104
00:06:56,693 --> 00:06:58,450
a variation of GPT-3.

105
00:06:58,450 --> 00:07:02,631
ChatGPT is trained on an immense
dataset of conversational text,

106
00:07:02,631 --> 00:07:07,182
which allows the model to understand and
generate human-like text that is

107
00:07:07,182 --> 00:07:10,930
appropriate for
a wide range of conversational contexts.

108
00:07:10,930 --> 00:07:14,545
It can be used in many natural
language processing tasks,

109
00:07:14,545 --> 00:07:19,070
such as creating learning resources
like the one you're watching now.

110
00:07:19,070 --> 00:07:23,006
Large Language Models are used
in a variety of products,

111
00:07:23,006 --> 00:07:26,128
including language translation services.

112
00:07:26,128 --> 00:07:31,098
A great example of a language
translation service is Google Translate,

113
00:07:31,098 --> 00:07:36,069
a free online language translation
service developed by Google that can

114
00:07:36,069 --> 00:07:41,380
translate text, speech, images, and
web pages in over 100 languages.

115
00:07:41,380 --> 00:07:46,534
LLMs are also used to power the
conversational abilities of chatbots and

116
00:07:46,534 --> 00:07:48,146
virtual assistants.

117
00:07:48,146 --> 00:07:52,201
They can understand and
respond to a wide range of inputs,

118
00:07:52,201 --> 00:07:55,777
from simple questions to
complex conversations.

119
00:07:55,777 --> 00:07:58,919
Siri and
Alexa are two well-known examples.

120
00:07:59,960 --> 00:08:04,570
Content generation for social media and
other forms of digital media.

121
00:08:04,570 --> 00:08:08,986
LLMs can generate human-like text,
which can be used to produce articles,

122
00:08:08,986 --> 00:08:11,960
product descriptions, and
instructional video scripts.

123
00:08:13,080 --> 00:08:16,171
Automated writing and
content creation tools:

124
00:08:16,171 --> 00:08:20,700
ChatGPT is a recent high-profile
example of a product in this category.

125
00:08:21,810 --> 00:08:26,253
Sentiment analysis: LLMs can be
used to understand the emotions and

126
00:08:26,253 --> 00:08:29,785
opinions expressed in text and
social media content.

127
00:08:29,785 --> 00:08:34,360
Text summarization, analysis, and coding tools:
GitHub Copilot is one example.

128
00:08:34,360 --> 00:08:39,151
Copilot is a code completion and
code assistance tool developed by GitHub

129
00:08:39,151 --> 00:08:44,018
designed to help developers write code
more efficiently and accurately by

130
00:08:44,018 --> 00:08:48,747
providing suggestions and predictions for
code snippets as they type.

131
00:08:48,747 --> 00:08:53,432
GitHub Copilot uses machine learning
models to understand the context

132
00:08:53,432 --> 00:08:57,499
of the code and predict what
the developer is trying to write.

133
00:08:58,570 --> 00:09:03,051
Demand for professionals in these areas
has grown rapidly in recent years,

134
00:09:03,051 --> 00:09:06,280
as more and
more companies adopt these technologies.

135
00:09:07,710 --> 00:09:10,900
So, what does the future hold for
Large Language Models?

136
00:09:12,000 --> 00:09:15,321
The world is likely to see
continued advancements and

137
00:09:15,321 --> 00:09:19,897
increased adoption in a wide range of
industries as the technology behind

138
00:09:19,897 --> 00:09:23,829
language models improves and
becomes more sophisticated.

139
00:09:23,829 --> 00:09:29,150
The outlook for career opportunities
in Large Language Models is positive.

140
00:09:29,150 --> 00:09:32,186
With more and
more companies using language models for

141
00:09:32,186 --> 00:09:36,740
tasks like language translation and
chatbots, there are many opportunities for

142
00:09:36,740 --> 00:09:41,294
individuals with the right skills and
experience to work on these applications,

143
00:09:41,294 --> 00:09:44,280
and to work within the broader
field of data science.

144
00:09:45,770 --> 00:09:50,190
LLMs are only a subset within
the vast field known as data science.

145
00:09:51,280 --> 00:09:53,892
Data science is a rapidly growing field,

146
00:09:53,892 --> 00:09:59,150
encompassing a wide range of techniques,
tools, and applications.

147
00:09:59,150 --> 00:10:02,448
With technologies advancing every day,
the demand for

148
00:10:02,448 --> 00:10:05,475
data scientists will
continue to grow as well.

149
00:10:05,475 --> 00:10:09,160
In fact, it will grow faster than the
average for most other occupations.

150
00:10:10,290 --> 00:10:15,240
As new discoveries are made in the field,
so too will new professions emerge.

151
00:10:16,650 --> 00:10:22,560
So, if you're asking yourself, how can
I get started in the field of LLMs?

152
00:10:22,560 --> 00:10:24,560
Here are a few next steps to consider.

153
00:10:26,020 --> 00:10:30,856
Study the fundamentals of machine
learning and natural language processing,

154
00:10:30,856 --> 00:10:34,980
and experiment with pretrained
LLMs like GPT-3, BERT, or T5.

155
00:10:36,100 --> 00:10:38,665
This will help you
understand how they work and

156
00:10:38,665 --> 00:10:41,110
how to use them in
real-world applications.

157
00:10:42,520 --> 00:10:45,391
Join a community or
forum where people discuss and

158
00:10:45,391 --> 00:10:47,520
share their experiences with LLMs.

159
00:10:49,040 --> 00:10:51,290
Develop your own LLMs.

160
00:10:51,290 --> 00:10:54,610
You can use open source
libraries such as TensorFlow or

161
00:10:54,610 --> 00:10:58,450
PyTorch to build your own models and
gain hands-on experience.

162
00:10:59,880 --> 00:11:04,360
It's worth noting that the field of
data science is rapidly evolving.

163
00:11:04,360 --> 00:11:08,000
The skills required to work with
LLMs are constantly changing.

164
00:11:08,000 --> 00:11:11,349
So, staying up to date with the latest
developments is essential for

165
00:11:11,349 --> 00:11:12,860
professionals in this field.

166
00:11:14,270 --> 00:11:18,914
Finally, consider taking a course or
earning a degree in a related field such

167
00:11:18,914 --> 00:11:23,070
as computer science, data science,
or artificial intelligence.

168
00:11:24,190 --> 00:11:26,950
To continue your learning
journey with Treehouse,

169
00:11:26,950 --> 00:11:28,900
check out these Treehouse courses.

170
00:11:30,350 --> 00:11:33,003
In the course Machine Learning Basics,

171
00:11:33,003 --> 00:11:36,210
dive deeper into machine
learning frameworks.

172
00:11:37,320 --> 00:11:40,459
You'll learn to use a Python
library called scikit-learn,

173
00:11:40,459 --> 00:11:44,800
which includes well-designed tools for
performing common machine learning tasks.

174
00:11:44,800 --> 00:11:46,259
You'll also work with Anaconda,

175
00:11:46,259 --> 00:11:51,298
a Python-based platform focused on
data science and machine learning.

176
00:11:54,198 --> 00:11:58,930
In Introduction to Algorithms,
take your first steps toward understanding

177
00:11:58,930 --> 00:12:03,092
the world of algorithms,
time complexity, and data structures.

178
00:12:03,092 --> 00:12:07,592
In this course, our teaching team will
examine algorithmic thinking, and

179
00:12:07,592 --> 00:12:10,701
you will learn how to
implement algorithms in code.

180
00:12:12,688 --> 00:12:17,744
And Data Analysis Basics
provides a comprehensive overview

181
00:12:17,744 --> 00:12:22,016
of charting, visualizing,
and analyzing data.

182
00:12:24,174 --> 00:12:28,456
We hope you found this micro course
helpful in introducing you to

183
00:12:28,456 --> 00:12:30,131
Large Language Models.

184
00:12:30,131 --> 00:12:35,033
As we've seen, machine learning and
AI are positioned as the future of tech.

185
00:12:35,033 --> 00:12:39,782
We at Treehouse hope this step
we've taken into AI-generated

186
00:12:39,782 --> 00:12:43,610
content has been a positive experience for
you.

187
00:12:43,610 --> 00:12:47,070
We want to know what you
thought of this video.

188
00:12:47,070 --> 00:12:52,530
Please email your honest feedback
to feedback@teamtreehouse.com.

189
00:12:52,530 --> 00:12:53,811
Thanks for watching.