Summary
In Module 2 of the AI for Everyone course, the focus is on building an intuitive understanding of how AI systems work. The module breaks down complex concepts by showing AI's applications in analyzing tabular data, recognizing images, processing natural language, and understanding speech. Participants are introduced to fundamental mathematical principles such as linear regression and neural networks, demonstrating that AI is essentially sophisticated mathematics. Practical examples, like predicting housing prices and facial recognition systems, are used to illustrate the concepts, highlighting that the math skills acquired in secondary school are foundational for understanding AI.
Highlights
AI is essentially a complex form of maths, making it accessible with basic math skills! 🎓
Predict housing prices efficiently using linear regression techniques. 🏡
Neural networks transform how AI interprets data, from housing prices to speech. 🔍
Facial recognition involves matching vector profiles to stored data using cosine similarity. 🖼️
NLP pipelines convert language into vectors for AI training and sentiment analysis. 📖
Speech processing involves transforming audio waves into vectors for recognition. 🔊
Advanced AI models like Megatron-Turing use billions of parameters for tasks like NLP. 🤖
Key Takeaways
AI is essentially advanced mathematics, not magic! ✨
Linear regression can be used to predict outcomes in various scenarios, like housing prices. 🏠
Neural networks are complex versions of simple mathematical equations. 📐
Facial recognition systems use neural networks trained with vast datasets. 🤳
Natural Language Processing (NLP) transforms text into numeric vectors for analysis. 💬
AI is trained using real data and algorithms such as gradient descent. 📊
Understanding local language nuances is crucial for accurate speech processing. 🗣️
The skills learned in secondary school math are foundational for grasping AI concepts. 📚
Overview
Module 2 dives deeper into understanding how AI operates by simplifying complex systems into relatable examples and accessible math. From predicting real estate prices using linear regression to understanding the fundamentals of neural networks, this module breaks down AI into its mathematical essence. It's all about seeing AI as an extension of what you might have learned in school, even as it tackles tasks as sophisticated as facial recognition.
Participants are guided through the applications of AI in recognizing images and processing language. The power of AI is made apparent through examples like facial recognition systems leveraging trained neural networks. These systems classify images, analyze sentiments in text, and even transform audio waves into interpretable data, demonstrating AI's broad range and capability.
By demystifying AI and connecting it back to math principles, learners are empowered to see AI systems as an extension of mathematical learnings. The module emphasizes that AI is not an enigma but a tool grounded in math fundamentals accessible to those familiar with them, preparing participants for real-world AI applications in future modules.
AI4E V3 Module 2 Transcription
00:00 - 00:30 Welcome back to Module 2 of AI for Everyone. In this section we'll help you get an intuitive understanding of how AI systems work. We're not going to be very robust or rigorous here; the intent is for you to get a good idea of how AI works and to appreciate that AI is really just maths. We'll show you how AI can be used on tabular data, how AI can see, how AI reads, and how AI can hear. Once you understand this, you can better
00:30 - 01:00 understand the various AI applications and systems that you interact with every day. We briefly discussed y = mx + c in Module 1 of this course; let's dive a little deeper here. Let's assume we can predict HDB prices with just the floor area. We collect the prices of recently sold HDB flats and note the corresponding floor areas. Say we collected four data points, as shown.
01:00 - 01:30 It looks like we can build a model by drawing a straight line through the points. One common method to fit a straight line is the least squares method. The basic idea here is to minimize the errors: the distances between the data points and the estimated line. The error function is as shown, and from your secondary school maths, to minimize the function you can differentiate the error function and set the derivative to zero. If you work out the maths, we get the final equations for m and c.
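Worked out, differentiating the squared error with respect to m and c and setting both derivatives to zero gives the familiar closed-form formulas. A minimal Python sketch, using four made-up data points (floor area in square meters, price in dollars) rather than the ones on the slides:

```python
# Least squares fit of y = m*x + c, using the closed-form solution obtained
# by differentiating the squared error and setting the derivatives to zero.
# The data points below are made-up (floor area in m^2, price in dollars).
xs = [70, 90, 110, 130]
ys = [350_000, 430_000, 520_000, 600_000]

n = len(xs)
sum_x = sum(xs)
sum_y = sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)

# The closed-form least squares formulas for slope and intercept.
m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
c = (sum_y - m * sum_x) / n

print(f"m = {m:.1f}, c = {c:.1f}")
print(f"predicted price for 100 m^2: {m * 100 + c:,.0f}")
```

Once m and c are found, predicting a price is just one multiplication and one addition.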
01:30 - 02:00 Once the model has been built, we can then determine the price of an HDB flat by entering the floor area. However, the real world is more complicated: a better model to predict the HDB flat price should probably include not only the floor area but also which floor the flat is on and whether it is near an MRT station or a school.
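A multi-feature model like this is just the straight-line formula with more terms. A sketch with made-up weights for illustration (the feature names and coefficient values are assumptions, not figures from the course):

```python
# Multiple linear regression as a formula:
#   price = m1*x1 + m2*x2 + m3*x3 + m4*x4 + c
# Features: floor area (m^2), storey number, near MRT (1/0), near school (1/0).
# The weights and intercept below are made-up illustrative values, not fitted ones.
weights = {"floor_area": 2000.0, "storey": 3000.0,
           "near_mrt": 20000.0, "near_school": 10000.0}
intercept = 35_000.0

def predict_price(flat: dict) -> float:
    """Weighted sum of the inputs plus the intercept."""
    return sum(weights[k] * flat[k] for k in weights) + intercept

# A 120 m^2 flat on the fifth floor, near a school, not near an MRT station.
flat = {"floor_area": 120, "storey": 5, "near_mrt": 0, "near_school": 1}
print(predict_price(flat))  # 2000*120 + 3000*5 + 0 + 10000 + 35000 = 300000.0
```

To the formula these are just four numbers; the meaning of each feature lives only in our heads, which is the point made below.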
02:00 - 02:30 Fortunately, we can extend the earlier y = mx + c into y = m1x1 + m2x2, and so on. This is known as multiple linear regression. Multiple linear regression is a form of machine learning algorithm: simple, but very powerful and easy to understand. Do note that not all problems can be modeled this way. Assuming the HDB price can be modeled linearly, using the same least
02:30 - 03:00 squares method, but this time for multiple linear regression, we can find the values of m1, m2, m3, m4 and c. With m1, m2, m3 and m4 found, we can now apply the model to, say, a 120 square meter flat near a school, with no MRT nearby, on the fifth floor. If we plug in the numbers, we get 300,000. However, note that the
03:00 - 03:30 algorithms have no concept or idea of what floor area is, which floor the HDB flat is on, or what an MRT station or a school is. To the algorithm, it is just a set of variables. We will now show intuitively how an artificial neuron works, using the HDB price prediction as an example. This neuron was shown earlier: the weighted inputs are summed, and an activation function is then applied. The activation function is modeled after how a brain neuron works:
03:30 - 04:00 if the inputs are strong enough, it will then fire.
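That weighted-sum-plus-activation behavior can be sketched in a few lines. The input values, weights, and the choice of a sigmoid activation here are illustrative assumptions:

```python
import math

def neuron(inputs, weights, bias):
    # Weighted sum of the inputs, exactly as in multiple linear regression...
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # ...followed by a non-linear activation. A sigmoid "fires" (approaches 1)
    # when the weighted input is strongly positive, and stays near 0 otherwise.
    return 1 / (1 + math.exp(-z))

# Made-up inputs and weights for illustration.
print(neuron([1.0, 0.5], [2.0, -1.0], 0.0))  # ≈ 0.82, a fairly strong firing
```

Drop the activation function and this is exactly the multiple linear regression formula from before.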
Now we overlay the HDB example: you can see how the same multiple linear regression can be mapped into the form of a neuron. Intuitively, a neural network is nothing more than a more complicated version of the familiar y = mx + c, and a neat way to represent this is through vectors and matrices,
04:00 - 04:30 again something that you have learned in secondary school. Let's extend our intuition further. The input layer is connected to a node in the hidden layer, which further connects to an output node in the output layer. Typically the hidden layer is a bit more complicated than just one node: we can add another node and connect them as shown,
04:30 - 05:00 and then another node. This would be a typical diagram for a neural network. Also, we can see that the mathematics involved from the input layer to the hidden layer is your secondary school maths of vectors and matrices. Question: how many weights, or m's, are there in this neural network? 15. So we need a way to find the 15 m's, or parameters. Because of the non-linear activation
05:00 - 05:30 function used, we cannot use the least squares
method here; we will need another way. Let's use predicting HDB flat prices again, and the earlier neural network we constructed. We will initialize the weights m1 to m15 with some random values; it doesn't really matter what they are in the beginning. We'll take the first row of the HDB price data and feed it to the neural network. This input data is
05:30 - 06:00 often represented as a vector. We will compute the predicted price, which will obviously be incorrect because the weights were randomly initialized. The computation in this forward pass is basically vector-matrix multiplication. We now compute the error: if the predicted price is higher than the actual price, we do a backward pass and adjust the weights to be smaller.
06:00 - 06:30 Similarly, if the predicted price is lower than the actual price, we increase the weights. We then move on to the next set of price data and continue to adjust the weights based on the errors. This technique of propagating the errors backwards is known as backpropagation. Of course, we do not just adjust the weights randomly; we use maths, specifically minimization of the error function, similar to what you saw in the earlier slides, and a technique known as gradient descent. Again,
06:30 - 07:00 maths which you will likely have learned in secondary school. We may need thousands of training examples to find an optimal set of weights which can predict the price of an HDB flat correctly. One question you may have is: how do I know how many nodes and hidden layers I need, or how many rows of data I need? Well, it is often more art than science, and involves lots of trial and error.
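The loop just described (forward pass, compute the error, adjust the weights, repeat) can be sketched with gradient descent. For brevity this trains a single linear neuron rather than the full 15-weight network, and the data points and learning rate are made up:

```python
# Gradient descent on a single linear neuron: price = m*x + c.
# Forward pass, compute the error, then nudge each weight against its gradient.
# Made-up data: floor area in hundreds of m^2, price in hundreds of thousands.
data = [(0.7, 3.5), (0.9, 4.3), (1.1, 5.2), (1.3, 6.0)]

m, c = 0.0, 0.0          # initial weights (random values would also work)
lr = 0.1                 # learning rate: how big each adjustment step is

for epoch in range(2000):
    for x, y in data:
        pred = m * x + c          # forward pass
        error = pred - y          # positive if the prediction is too high
        m -= lr * error * x       # backward pass: the gradient of the squared
        c -= lr * error           # error is error*x wrt m, and error wrt c

print(f"m = {m:.2f}, c = {c:.2f}")  # settles close to the least squares fit
```

With a non-linear activation and many layers, the updates follow the same idea; backpropagation is just the chain rule organized layer by layer.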
07:00 - 07:30 The earlier neural network only has 15 weights, or parameters, to optimize. Microsoft and NVIDIA announced Megatron-Turing on 11th October 2021, a neural network with 530 billion parameters. It has hundreds of hidden layers, with many nodes per layer. Megatron-Turing is a neural network for natural language processing. Now let's explore how a computer sees. Images are represented as
07:30 - 08:00 pixels in a computer. In black and white pictures, in the areas where the picture appears the pixel would be a one, and where there is no picture it would be a zero. The number one here is shown in a five-by-five pixel square. We can stretch out the pixels vertically, as shown, and it now becomes a vector of zeros and ones, which we can use to train a neural network. This neural network will have an input layer of 25 nodes, and if we use a hidden layer with four nodes, we will end up with
08:00 - 08:30 a hundred m's, or parameters, that we need to find. As we want to predict the digits from one to nine, including zero, we could have an output layer with 10 nodes, each representing one of the values 1 to 9 and 0. There is a total of only 140 parameters in this simple example with a 25-pixel image. Today's cameras, and even your smartphone, have sensors that are 24 megapixels or bigger,
08:30 - 09:00 which means you would have at least 100 million parameters. Neural network researchers have developed more advanced techniques today, such as convolutional neural networks, or CNNs, that reduce the number of parameters required.
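The flattening step can be shown directly: a 5×5 black-and-white digit becomes a 25-long vector of zeros and ones, and the layer sizes give the parameter counts quoted above (25×4 weights into the hidden layer, 4×10 into the output layer, 140 in total, ignoring bias terms). The pixel pattern below is a made-up digit "1":

```python
# A made-up 5x5 black-and-white image of the digit "1": 1 = ink, 0 = blank.
image = [
    [0, 0, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
]

# Stretch the pixels out into a single 25-long vector for the input layer.
vector = [pixel for row in image for pixel in row]
print(len(vector))  # 25

# Parameter counts for the small network described in the text (no biases):
input_to_hidden = 25 * 4    # 100 weights from input layer to hidden layer
hidden_to_output = 4 * 10   # 40 weights from hidden layer to output layer
print(input_to_hidden + hidden_to_output)  # 140
```

Scaling the same fully connected layout up to a 24-megapixel image is what produces the enormous parameter counts mentioned above.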
Now that you understand how a computer can see an image and can be trained to recognize it, let's briefly discuss a common computer vision system you may encounter in the office: facial recognition entry systems. Typically, the vendor would train the
09:00 - 09:30 neural network with millions of pictures of faces, each tagged with a corresponding unique ID, typically a 128-bit-long vector. Once the model has been trained, it will be deployed into your office. Your company will then ask you to provide your latest photo. The photo will be shown to the neural network, and it will generate a unique ID, a 128-bit-long vector. This ID, together with your name and staff code, will be saved into a database, and when you come back to the office after the weekend
09:30 - 10:00 and try to enter, your picture will be captured at the door, and the same neural network will generate a unique 128-bit-long vector. Now the system will find the closest match to this vector, and to see how close two vectors are you can use cosine similarity, a technique you probably studied in your secondary school trigonometry class. Why do we not try to find the exact same 128-bit vector?
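Cosine similarity, the cosine of the angle between two vectors, can be sketched as follows; the short four-element vectors here are made-up stand-ins for the 128-long face IDs:

```python
import math

def cosine_similarity(a, b):
    # cos(theta) = (a . b) / (|a| * |b|); 1.0 means the vectors point the
    # same way, values near 0 mean they are unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

stored = [0.2, 0.8, 0.1, 0.5]      # made-up vector from your submitted photo
at_door = [0.25, 0.75, 0.15, 0.5]  # made-up vector captured at the door
stranger = [0.9, 0.1, 0.7, 0.0]    # made-up vector of someone else

print(cosine_similarity(stored, at_door))   # close to 1: likely the same person
print(cosine_similarity(stored, stranger))  # noticeably lower: a poor match
```

The entry system accepts you when the similarity clears a threshold, which is the "level of closeness" the vendor tunes with your company.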
10:00 - 10:30 Well, remember that the vector stored in the database was based on the photo you submitted, which could have been taken a few months ago, or a few years ago, or Photoshopped. Over the weekend you went to Sentosa and got a tan, or you now sport a different haircut, or you have lost weight. These differences would mean the neural network would have generated a 128-bit vector that is different from the one stored in the database based on your original photo.
10:30 - 11:00 But this is okay: we only need to find the closest match, and the vendor will work with your company to determine the level of closeness that best fits the company's security policies. Now let's discuss natural language processing, or NLP, next. NLP uses techniques from computer science, AI and linguistics. It has been used to classify documents, translate one language into another, power chatbots, and auto-complete sentences, a
11:00 - 11:30 feature I find very useful in the Gmail email client. Computers can only work with numbers, specifically vectors and matrices, so
text data needs to be converted into vectors. Here is a typical NLP pipeline. Say we have the following sentences: "This laksa is spicy. We love it." We can do sentence segmentation
to convert the text into two sentences,
11:30 - 12:00 and then word tokenization to break up the sentences into individual words. We may also want to remove stop words, words like "is", "the" and "this", to reduce the number of words the algorithm must handle. We then apply stemming or lemmatization to get to the root word; for example, the root word for "spicy" here would be "spice". Once the data is clean, we can apply various algorithms to convert it into vectors. Let's see a simple example: how do we convert the sentence "This laksa is spicy. We love it." into a vector? The typical vocabulary of a person
12:00 - 12:30 is thirty thousand words, so assume we have a dictionary of thirty thousand words, as shown. We then place a one in each location of the dictionary where a word in the sentence appears, so we'll end up with a thirty-thousand-long vector with lots of zeros and only seven ones. With the vectors formed, we could then feed it
to the neural network and train it, for example
12:30 - 13:00 for positive and negative sentiment. So here is a positive sentiment, which is tagged with a one; we use supervised learning to train the neural network. And this is another sentence, which has a negative sentiment and is tagged with a minus one. We will of course need thousands of labeled sentences, some negative and some positive, to train the neural network. Once the neural network has been trained, it can then be used to classify positive
13:00 - 13:30 and negative sentiment in text, or comments made on social media, for example. Note that when using the trained neural network, there is no need to label the sentences any more. Thirty-thousand-long vectors don't really make sense, so researchers have developed algorithms to convert text to vectors that are shorter, typically 32 or 128 elements long. There are several methods for text vectorization, but we will not cover them here.
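The whole pipeline described above (tokenize, drop stop words, place ones in a dictionary-sized vector, then train on labeled examples) can be sketched end to end. The tiny dictionary, stop-word list, and ±1-labeled sentences below are made-up stand-ins for the 30,000-word vocabulary and the thousands of real training examples; the classifier is a simple perceptron rather than a full neural network:

```python
# A toy NLP pipeline: tokenize, remove stop words, build a multi-hot vector,
# then train a simple linear (perceptron-style) sentiment classifier.
STOP_WORDS = {"is", "the", "this", "it", "we"}
DICTIONARY = ["laksa", "spicy", "love", "hate", "bland", "great", "awful", "food"]

def vectorize(sentence):
    # Keep only dictionary-relevant words, then place a 1 where each appears.
    words = {w for w in sentence.lower().split() if w not in STOP_WORDS}
    return [1 if w in words else 0 for w in DICTIONARY]

# Made-up labeled training data: +1 = positive sentiment, -1 = negative.
training = [
    ("this laksa is great we love it", 1),
    ("the food is bland we hate it", -1),
    ("we love spicy food", 1),
    ("this is awful", -1),
]

weights = [0.0] * len(DICTIONARY)
for _ in range(10):                       # a few passes over the data
    for sentence, label in training:
        x = vectorize(sentence)
        pred = 1 if sum(w * xi for w, xi in zip(weights, x)) > 0 else -1
        if pred != label:                 # adjust weights only on mistakes
            weights = [w + label * xi for w, xi in zip(weights, x)]

# Classify a new, unlabeled sentence with the trained weights.
score = sum(w * xi for w, xi in zip(weights, vectorize("spicy laksa we love")))
print("positive" if score > 0 else "negative")
```

As in the transcript, once training is done, new sentences need no labels; the learned weights do the classifying.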
13:30 - 14:00 More importantly, word vectors have been found to embed meaning in them. For example, here the words "puppy" and "dog" are close to each other, since a puppy is a young dog, whereas you would not expect to find the word "cat" or "house" close to the "dog" vector. Another interesting aspect of word vectors is that you can do mathematics with them: for example, king − man + woman = queen.
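The king − man + woman arithmetic can be sketched with tiny made-up embeddings; real word vectors are learned from data and have far more dimensions, but the element-wise arithmetic is the same:

```python
import math

# Tiny made-up 3-dimensional word embeddings. Real embeddings are learned,
# not hand-written, and are much longer.
vectors = {
    "king":  [0.9,  0.8, 0.1],
    "man":   [0.1,  0.8, 0.1],
    "woman": [0.1, -0.8, 0.1],
    "queen": [0.9, -0.8, 0.1],
    "dog":   [0.0,  0.0, 0.9],
}

def add(a, b): return [x + y for x, y in zip(a, b)]
def sub(a, b): return [x - y for x, y in zip(a, b)]

def nearest(v):
    # Find the word whose vector is closest (Euclidean distance) to v.
    return min(vectors, key=lambda w: math.dist(v, vectors[w]))

result = add(sub(vectors["king"], vectors["man"]), vectors["woman"])
print(nearest(result))  # queen
```

The arithmetic works because the "gender" direction is shared between the king/queen and man/woman pairs in this toy layout, mirroring what learned embeddings pick up from real text.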
14:00 - 14:30 With images, you converted the images into numbers, vectors specifically. With NLP, the same: you converted the text into vectors, or what are known as word embeddings. With speech, what do you need to do? Yes, convert it into a vector again. Let's say we have an audio clip of the word "hello". We then slice the audio clip into 20-millisecond slices and use the amplitude as the value of the vector.
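Slicing a clip into 20-millisecond frames can be sketched as follows; the sine-wave "audio" and the 8 kHz sample rate are made-up stand-ins for a real recording of "hello":

```python
import math

SAMPLE_RATE = 8_000                   # samples per second (an assumed rate)
FRAME_MS = 20                         # slice length in milliseconds
samples_per_frame = SAMPLE_RATE * FRAME_MS // 1000   # 160 samples per slice

# A made-up 1-second "audio clip": a 440 Hz sine wave instead of real speech.
clip = [math.sin(2 * math.pi * 440 * t / SAMPLE_RATE)
        for t in range(SAMPLE_RATE)]

# Slice into 20 ms frames, and take the peak amplitude of each slice as the
# value for that position in the feature vector.
frames = [clip[i:i + samples_per_frame]
          for i in range(0, len(clip), samples_per_frame)]
vector = [max(abs(s) for s in frame) for frame in frames]

print(len(vector))  # 50 slices of 20 ms in a 1-second clip
```

Real speech systems extract richer features per slice than a single amplitude, but the slicing-into-a-vector idea is the same.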
14:30 - 15:00 And we know its output value, or label, is the word "hello", so we can now use supervised machine learning to train the neural network. Of course, we have to collect thousands of hours of audio clips, slice them, annotate them, and use the annotated audio clips to train the neural network. With a trained neural network, we can now present an unlabeled audio clip, which goes through the same process of slicing the audio into 20-millisecond slices
15:00 - 15:30 to extract the values of the vector, but this time we do not know what the output is. We feed the vector to the neural network, and the neural network will produce an output. It may generate "hello" or "hallo", something close but not necessarily exact. We take the generated output and pass it through a dictionary, and we get the best guess. What is important here
15:30 - 16:00 is that you need properly trained annotators to listen carefully to the spoken sentences and label them correctly, often with local nuances, especially for a language like Singlish. This is particularly important to understand, and that is why speech annotation is often hard to outsource to someone who is not local and may not understand the local slang. I hope by now you can see that AI is really just maths. AI is not magic, and the maths you need to
16:00 - 16:30 understand how AI works intuitively is something that you have learned in secondary school. This ends Module 2. I hope you now have an intuitive understanding of AI. The maths required to understand how AI works is something you have already studied in secondary school, with techniques like the least squares method, differentiation, and finding the minima and maxima of functions. We also showed the maths behind how AI is used for computer vision,
16:30 - 17:00 natural language processing and speech processing. In the next module, we'll walk you through several real-world use cases of AI, done right here in Singapore.