Understanding AI through Intuition

AI4E V3 Module 2

Estimated read time: 1:20


    Summary

    In Module 2 of the AI for Everyone course, the focus is on providing an intuitive understanding of how AI systems work. The module breaks down complex concepts by showing AI's applications in analyzing tabular data, recognizing images, processing natural language, and understanding speech. Participants are introduced to the fundamental mathematical principles such as linear regression and neural networks, demonstrating that AI is essentially sophisticated mathematics. Practical examples, like predicting housing prices and facial recognition systems, are used to illustrate the concepts, highlighting the importance of math skills acquired in secondary school as foundational for understanding AI.

      Highlights

      • AI is essentially a complex form of maths, making it accessible with basic math skills! 🎓
      • Predict housing prices efficiently using linear regression techniques. 🏡
      • Neural networks transform how AI interprets data, from housing prices to speech. 🔍
      • Facial recognition involves matching vector profiles to stored data using cosine similarity. 🖼️
      • NLP pipelines convert language into vectors for AI training and sentiment analysis. 📖
      • Speech processing involves transforming audio waves into vectors for recognition. 🔊
      • Advanced AI models like Megatron-Turing use billions of parameters for tasks like NLP. 🤖

      Key Takeaways

      • AI is essentially advanced mathematics, not magic! ✨
      • Linear regression can be used to predict outcomes in various scenarios, like housing prices. 🏠
      • Neural networks are complex versions of simple mathematical equations. 📐
      • Facial recognition systems use neural networks trained with vast datasets. 🤳
      • Natural Language Processing (NLP) transforms text into numeric vectors for analysis. 💬
      • AI is trained using real data and algorithms such as gradient descent. 📊
      • Understanding local language nuances is crucial for accurate speech processing. 🗣️
      • The skills learned in secondary school math are foundational for grasping AI concepts. 📚

      Overview

      Module 2 dives deeper into understanding how AI operates by simplifying complex systems into relatable examples and accessible math. From predicting real estate prices using linear regression to understanding the fundamentals of neural networks, this module breaks down AI into its mathematical essence. It's all about seeing AI as an extension of what you might have learned in school, even as it tackles tasks as sophisticated as facial recognition.

        Participants are guided through the applications of AI in recognizing images and processing language. The power of AI is made apparent through examples like facial recognition systems leveraging trained neural networks. These systems classify images, analyze sentiments in text, and even transform audio waves into interpretable data, demonstrating AI's broad range and capability.

          By demystifying AI and connecting it back to math principles, learners are empowered to see AI systems as an extension of what they learned in math class. The module emphasizes that AI is not an enigma but a tool grounded in math fundamentals accessible to those familiar with them, preparing participants for real-world AI applications in future modules.

            AI4E V3 Module 2 Transcription

            • 00:00 - 00:30 Welcome back to Module 2 of AI for Everyone. In this section we'll help you get an intuitive understanding of how AI systems work. We're not going to be very robust or rigorous here; the intent is to give you a good idea of how AI works and to appreciate that AI is really just maths. We'll show you how AI can be used on tabular data, how AI can see, how AI reads, and how AI can hear. Once you understand this, you can better
            • 00:30 - 01:00 understand the various AI applications and systems that you interact with every day. We briefly discussed y = mx + c in Module 1 of this course; let's dive a little deeper here. Let's assume we can predict HDB prices with just the floor area. We collect the prices of recently sold HDB flats and note the corresponding floor areas. Say we collected four data points, as shown.
            • 01:00 - 01:30 It looks like we can build a model by drawing a straight line through the points. One common method to fit a straight line is the least squares method. The basic idea is to minimise the errors: the distances between the data points and the estimated line. The error function is as shown, and from your secondary school maths, to minimise the function you differentiate the error function and set the result to zero. If you work out the maths, you get the final equations for m and c.
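The least squares fit just described can be sketched in a few lines of Python. The four (floor area, price) data points below are made up for illustration; the formulas for m and c come from setting the derivatives of the squared-error function to zero.

```python
# Fitting y = m*x + c with least squares. Data points are made up.
areas  = [70.0, 90.0, 100.0, 120.0]            # floor area in square metres
prices = [250_000, 310_000, 340_000, 400_000]  # resale price in SGD

n = len(areas)
mean_x = sum(areas) / n
mean_y = sum(prices) / n

# Differentiating the error function and setting it to zero gives:
m = sum((x - mean_x) * (y - mean_y) for x, y in zip(areas, prices)) \
    / sum((x - mean_x) ** 2 for x in areas)
c = mean_y - m * mean_x

# Predict the price of an 80 m^2 flat with the fitted line.
predicted = m * 80 + c
print(m, c, predicted)   # 3000.0 40000.0 280000.0
```

With these made-up points the line works out to y = 3000x + 40000, so every extra square metre adds $3,000 to the predicted price.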
            • 01:30 - 02:00 Once the model has been built, we can determine the price of an HDB flat by entering the floor area. However, the real world is more complicated: a better model to predict the HDB flat price should probably include not only the floor area but also which floor the flat is on and whether it is near an MRT station or a school.
            • 02:00 - 02:30 Fortunately, we can extend the earlier y = mx + c into y = m1x1 + m2x2 and so on. This is known as multiple linear regression, a form of machine learning algorithm: simple but very powerful and easy to understand. Do note that not all problems can be modelled this way. Assuming the HDB price can be modelled linearly, using the same least
            • 02:30 - 03:00 squares method, but this time for multiple linear regression, we can find the values of m1, m2, m3, m4 and c. With m1, m2, m3, m4 found, we can now apply the model to, say, a 120 square metre flat near a school, no MRT, on the fifth floor. If we plug in the numbers, we get $300,000.
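A least-squares fit with several inputs can be sketched with NumPy. Everything here (the training rows, the prices, the resulting coefficients) is made up for illustration; only the feature layout (floor area, storey, near MRT, near school) follows the transcript.

```python
import numpy as np

# Made-up training rows: [floor area m^2, storey, near MRT (0/1), near school (0/1)]
X = np.array([
    [ 90.0,  3, 1, 0],
    [110.0, 10, 0, 1],
    [ 70.0,  7, 1, 1],
    [130.0, 15, 0, 0],
    [100.0,  5, 1, 0],
])
y = np.array([320_000, 380_000, 300_000, 420_000, 350_000])  # prices in SGD

# Append a column of ones so the intercept c is fitted alongside m1..m4.
A = np.hstack([X, np.ones((len(X), 1))])
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)   # [m1, m2, m3, m4, c]

# Predict a 120 m^2 flat on the 5th floor, near a school, no MRT.
flat = np.array([120.0, 5, 0, 1, 1.0])
print(coeffs, flat @ coeffs)
```

The `@` in the last line is exactly the weighted sum m1x1 + m2x2 + m3x3 + m4x4 + c from the slide.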
            • 03:00 - 03:30 However, note that the algorithm has no concept of what floor area is, which floor the flat is on, or what an MRT or a school is; to the algorithm these are just variables. We will now show intuitively how an artificial neuron works, using the HDB price prediction as an example. This neuron was shown earlier: the weighted inputs are summed, and an activation function is then applied. The activation function is modelled after how a brain neuron works:
            • 03:30 - 04:00 if the inputs are strong enough, it fires. Now we overlay the HDB example, and you can see how the same multiple linear regression can be mapped into the form of a neuron. Intuitively, a neural network is nothing more than a more complicated version of the familiar y = mx + c, and a neat way to represent this is through vectors and matrices,
            • 04:00 - 04:30 again something you learned in secondary school. Let's extend our intuition further: the input layer is connected to a node in the hidden layer, which further connects to an output node in the output layer. Typically the hidden layer is a bit more complicated than just one node; we can add another node and connect them as shown.
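The layered structure just described can be sketched as weighted sums plus an activation function. This tiny network has 4 inputs, 3 hidden nodes, and 1 output, which gives 4×3 + 3×1 = 15 weights; all weight and input values below are arbitrary illustrative numbers, not trained ones.

```python
import math

def sigmoid(z):
    # A common activation function: "fires" (approaches 1) for strong inputs.
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, W_hidden, w_out):
    # Input -> hidden: a weighted sum per node, then the activation function.
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in W_hidden]
    # Hidden -> output: one more weighted sum.
    return sum(w * h for w, h in zip(w_out, hidden))

x = [1.2, 0.5, 0.0, 1.0]            # e.g. scaled flat features
W_hidden = [[0.4, -0.2, 0.1, 0.3],  # 12 weights into the 3 hidden nodes
            [-0.1, 0.5, 0.2, -0.3],
            [0.2, 0.1, -0.4, 0.6]]
w_out = [0.7, -0.4, 0.3]            # 3 weights into the output node
n_weights = sum(len(row) for row in W_hidden) + len(w_out)
print(n_weights, forward(x, W_hidden, w_out))   # 15 weights in total
```

Each list-of-lists here is just the matrix form mentioned above: the forward pass is a vector-matrix multiplication followed by the activation.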
            • 04:30 - 05:00 And then another node: this is now a typical diagram for a neural network. You can also see that the mathematics involved from the input layer to the hidden layer is your secondary school maths of vectors and matrices. Question: how many weights, or m's, are there in this neural network? 15. So we need a way to find the 15 m's, or parameters. Because of the non-linear activation
            • 05:00 - 05:30 function used, we cannot use the least squares method here; we need another way. Let's use predicting HDB flat prices again, with the neural network constructed earlier. We initialise the weights m1 to m15 with some random values; it doesn't really matter what they are in the beginning. We take the first row of HDB price data and feed it to the neural network. This input data is
            • 05:30 - 06:00 often represented as a vector. We compute the predicted price, which will obviously be incorrect because the weights were randomly initialised. The computation in this forward pass is basically vector-matrix multiplication. We now compute the error: if the predicted price is higher than the actual price, we do a backward pass and adjust the weights downwards;
            • 06:00 - 06:30 similarly, if the predicted price is lower than the actual price, we increase the weights. We then move on to the next set of price data and continue to adjust the weights based on the errors. This technique of propagating the errors backwards is known as backpropagation. Of course, we do not just adjust the weights randomly: we use maths, specifically minimisation of the error function, similar to what you saw in the earlier slides, and a technique known as gradient descent; again,
            • 06:30 - 07:00 maths you will likely have learnt in secondary school. We may need thousands of training examples to find an optimal set of weights that can predict the price of an HDB flat correctly. One question you may have is: how do I know how many nodes and hidden layers I need, or how many rows of data? Well, it is often more art than science, and involves lots of trial and error.
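The training loop just described (forward pass, compute the error, nudge the weights against it) can be sketched with gradient descent on the simplest possible model, y = mx + c. The data below is made up and pre-scaled (areas divided by 100, prices by 100,000) so the numbers stay small.

```python
# Made-up, pre-scaled (floor area, price) training data.
data = [(0.7, 2.5), (0.9, 3.1), (1.0, 3.4), (1.2, 4.0)]

m, c = 0.0, 0.0   # initial weights: the first predictions will be wrong
lr = 0.1          # learning rate: how big each correction step is

for _ in range(5000):
    for x, y in data:
        pred = m * x + c      # forward pass
        err = pred - y        # positive if we over-predicted
        # Backward pass: step against the gradient of the squared error,
        # so over-predictions shrink the weights and under-predictions grow them.
        m -= lr * 2 * err * x
        c -= lr * 2 * err

print(m, c)   # converges to roughly m = 3.0, c = 0.4
```

In a real neural network the same idea is applied layer by layer (backpropagation), and the gradient involves the activation functions, but the shape of the loop is identical.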
            • 07:00 - 07:30 The earlier neural network only has 15 weights, or parameters, to optimise. On 11 October 2021, Microsoft and NVIDIA announced Megatron-Turing, a neural network with 530 billion parameters: hundreds of hidden layers with many nodes per layer. Megatron-Turing is a neural network for natural language processing. Now let's explore how a computer sees. Images are represented as
            • 07:30 - 08:00 pixels in a computer. In a black-and-white picture, wherever the picture appears the pixel is a one, and where there is nothing it is a zero. The number one here is shown in a five-by-five pixel square. We can stretch out the pixels vertically, as shown, and it becomes a vector of zeros and ones, which we can use to train a neural network. This neural network will have an input layer of 25 nodes, and if we use a hidden layer with four nodes, we will end up with
            • 08:00 - 08:30 a hundred m's, or parameters, to find. As we want to predict the digits one to nine, including zero, we could have an output layer with 10 nodes, representing the values 0 to 9. There is a total of only 140 parameters in this simple example with a 25-pixel image. Today's cameras, and even your smartphone, have sensors that are 24 megapixels or bigger.
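The stretching-out step and the parameter count can be sketched directly. The 5×5 image of the digit "1" below is a made-up example in the spirit of the slide.

```python
# A 5x5 black-and-white image of the digit "1": 1 = ink, 0 = background.
image = [
    [0, 0, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
]

# Stretch the pixels out into a single 25-long vector of zeros and ones.
vector = [pixel for row in image for pixel in row]

# Weight count for a 25 -> 4 -> 10 fully connected network, as in the slide.
n_params = 25 * 4 + 4 * 10
print(len(vector), n_params)   # 25 140
```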
            • 08:30 - 09:00 Which means at least 100 million parameters. Neural network researchers have developed more advanced techniques, such as convolutional neural networks, or CNNs, that reduce the number of parameters required. Now that you understand how a computer can see an image and be trained to recognise it, let's briefly discuss a common computer vision system you may encounter in the office: facial recognition entry systems. Typically the vendor trains the
            • 09:00 - 09:30 neural network with millions of pictures of faces, each tagged with a corresponding unique ID, typically a 128-dimensional vector. Once the model has been trained, it is deployed into your office. Your company then asks you to provide your latest photo; the photo is shown to the neural network, and it generates a unique ID, a 128-dimensional vector. This ID, together with your name and staff code, is saved into a database. When you come back to the office after the weekend
            • 09:30 - 10:00 and try to enter, your picture is captured at the door, and the same neural network generates a unique 128-dimensional vector. The system then finds the closest match to this vector. To see how close two vectors are, you can use cosine similarity, a technique you probably studied in your secondary school trigonometry class. Why do we not try to find the exact same 128-dimensional vector?
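Cosine similarity is just the dot product of two vectors divided by the product of their lengths: 1.0 means they point the same way, 0 means they are unrelated. The short "face vectors" below are made up (a real system would use all 128 dimensions).

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

stored   = [0.10, 0.80, 0.30, 0.55]   # from the photo you submitted
captured = [0.12, 0.78, 0.33, 0.50]   # from the camera at the door
print(cosine_similarity(stored, captured))   # close to 1.0: likely a match

# Entry would be granted if the similarity clears a vendor-chosen threshold.
```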
            • 10:00 - 10:30 Well, remember that the vector stored in the database was based on the photo you submitted, which could have been taken a few months or a few years ago, or Photoshopped. Over the weekend you went to Sentosa and got a tan, or you now sport a different haircut, or you have lost weight. These differences mean the neural network would generate a 128-dimensional vector that differs from the one stored in the database from your original photo.
            • 10:30 - 11:00 But this is okay: we only need to find the closest match, and the vendor will work with your company to determine the level of closeness that best fits the company's security policies. Now let's discuss natural language processing, or NLP. NLP uses techniques from computer science, AI, and linguistics. It has been used to classify documents, translate between languages, power chatbots, and auto-complete sentences, a
            • 11:00 - 11:30 feature I find very useful in the Gmail email client. Computers can only work with numbers, specifically vectors and matrices, so text data needs to be converted into vectors. Here is a typical NLP pipeline. Say we have the following sentences: "This laksa is spicy. We love it." We can do sentence segmentation to convert the text into two sentences,
            • 11:30 - 12:00 and then word tokenisation to break the sentences into individual words. We may also want to remove stop words, words like "is", "the", "this", to reduce the number of words for the algorithm. We then apply stemming or lemmatisation to get to the root word; for example, the root word for "spicy" here would be "spice". Once the data is clean, we can apply various algorithms to convert it into vectors. Let's see a simple example: how do we convert the sentence "This laksa is spicy, we love it" into a vector? The typical vocabulary of a person
            • 12:00 - 12:30 is thirty thousand words, so assume we have a dictionary of thirty thousand words, as shown. We place a one at each location of the dictionary where a word in the sentence appears, so we end up with a thirty-thousand-long vector with lots of zeros and only seven ones. With the vectors formed, we can feed them to a neural network and train it.
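This "place a one at each dictionary location" scheme is known as a bag-of-words vector, and it can be sketched in a few lines. A real vocabulary would have around 30,000 entries; the made-up dictionary below has 10 so the vector fits on screen.

```python
# A tiny bag-of-words example with a made-up 10-word dictionary.
vocabulary = ["this", "laksa", "is", "spicy", "we", "love",
              "it", "cat", "house", "dog"]

sentence = "this laksa is spicy we love it"

# Place a 1 at each vocabulary position whose word appears in the sentence.
words = set(sentence.split())
vector = [1 if word in words else 0 for word in vocabulary]
print(vector)   # [1, 1, 1, 1, 1, 1, 1, 0, 0, 0] - seven ones, the rest zeros
```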
            • 12:30 - 13:00 For example, for positive and negative sentiments: here is a positive sentiment that is tagged with a 1, and we use supervised learning to train the neural network. This is another sentence with a negative sentiment, tagged with a -1. We will of course need thousands of labelled sentences, some negative and some positive, to train the neural network. Once the neural network has been trained, it can be used to classify positive
            • 13:00 - 13:30 and negative sentiments in text, or in comments made on social media, for example. Note that when using the trained neural network, there is no need to label the sentences any more. Thirty-thousand-long vectors don't really make sense, so researchers have developed algorithms to convert text into shorter vectors, typically 32- or 128-dimensional. There are several methods for text vectorisation, but we will not cover them here.
            • 13:30 - 14:00 More importantly, word vectors have been found to embed meaning. For example, the words "puppy" and "dog" are close to each other, since a puppy is a young dog, whereas you would not expect the word "cat" or "house" to be close to the "dog" vector. Another interesting aspect of word vectors is that you can do mathematics with them: for example, king - man + woman = queen.
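The king - man + woman example can be sketched with tiny hand-written 3-dimensional embeddings. Real word vectors are 100+ dimensional and are learned from text, not written by hand like these; the values below are made up so that the analogy works out.

```python
# Made-up 3-d word embeddings, chosen so king - man + woman lands on queen.
embeddings = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.2, 0.1],
    "man":   [0.5, 0.9, 0.0],
    "woman": [0.5, 0.3, 0.0],
    "dog":   [0.1, 0.4, 0.9],
}

def add(a, b):  return [x + y for x, y in zip(a, b)]
def sub(a, b):  return [x - y for x, y in zip(a, b)]
def dist(a, b): return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

# king - man + woman should land closest to queen.
target = add(sub(embeddings["king"], embeddings["man"]), embeddings["woman"])
closest = min(embeddings, key=lambda w: dist(embeddings[w], target))
print(closest)   # queen
```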
            • 14:00 - 14:30 With images, you converted images into numbers, specifically vectors. With NLP, the same: you converted text into vectors, also known as word embeddings. With speech, what do you need to do? Yes, convert it into a vector again. Let's say we have an audio clip of the word "hello". We slice the audio clip into 20-millisecond slices and use the amplitudes as the values of the vector.
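The slicing step can be sketched as follows. The audio amplitudes below are synthetic stand-ins; a real clip would be read from a WAV file, and real systems extract richer features per slice than a single amplitude.

```python
# Slicing a made-up audio signal into 20 ms frames. At a 16 kHz sampling
# rate, 20 ms corresponds to 16000 * 0.020 = 320 samples per slice.
sample_rate = 16_000
samples_per_slice = int(sample_rate * 0.020)   # 320

# Pretend amplitudes for a 0.1-second clip (1600 samples).
audio = [((i * 37) % 100) / 100.0 for i in range(sample_rate // 10)]

# One representative value per slice (here: the peak amplitude).
vector = [max(audio[i:i + samples_per_slice])
          for i in range(0, len(audio), samples_per_slice)]
print(len(vector))   # 1600 samples / 320 per slice = 5 values
```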
            • 14:30 - 15:00 We know its output value, or label, is the word "hello", so we can now use supervised machine learning to train the neural network. Of course, we have to collect thousands of hours of audio clips, slice them, annotate them, and use the annotated clips to train the neural network. With a trained neural network, we can now present an unlabelled audio clip, which goes through the same process of slicing the audio into 20-millisecond slices
            • 15:00 - 15:30 to extract the values of the vector, but this time we do not know what the output is. We feed the vector to the neural network, and it produces an output. It may generate "hello" or "a low", something close but not necessarily exact. We take the generated output, pass it through a dictionary, and get the best guess. What is important here
            • 15:30 - 16:00 is that you need properly trained annotators to listen carefully to the spoken sentences and label them correctly, often with local nuances, especially for a language like Singlish. This is why speech annotation is often hard to outsource to someone who is not local and may not understand the local slang. I hope by now you can see that AI is really just maths. AI is not magic, and the maths you need to
            • 16:00 - 16:30 understand how AI works intuitively is something you learned in secondary school. This ends Module 2. I hope you now have an intuitive understanding of AI. The maths required to understand how AI works is something you have already studied in secondary school, with techniques like the least squares method, differentiation, and finding the minima and maxima of functions. We also showed the maths behind how AI is used for computer vision,
            • 16:30 - 17:00 natural language processing, and speech processing. In the next module, we'll walk you through several real-world use cases of AI, done right here in Singapore.