Exploring the Depth of RNNs

Recurrent Neural Network (RNN) Tutorial | RNN LSTM Tutorial | Deep Learning Tutorial | Simplilearn

Estimated read time: 1:20

    Summary

    Simplilearn's tutorial on Recurrent Neural Networks (RNNs) dives into the fundamentals of neural networks, emphasizing RNNs which are tailored for sequential data tasks. Richard Kirschner elucidates the vanishing and exploding gradient problems inherent in RNNs and showcases the application of Long Short-Term Memory (LSTM) networks to address these issues. The video delves into how RNNs are pivotal for tasks like language processing, time series prediction, and machine translation. A hands-on example predicting Google stock prices illustrates the power of RNNs in real-world scenarios.

      Highlights

      • RNNs mimic the human brain's sequential data processing. 🧠
      • Vanishing and exploding gradients are major challenges for RNNs. 🚧
      • LSTM networks excel at managing long-term dependencies in data. 🔄
      • Hands-on Google stock prediction using LSTM demonstrates RNN capabilities. 📊

      Key Takeaways

      • Understand the basics of RNNs and why they're crucial for handling sequential data. 🤖
      • Explore how LSTM networks solve the vanishing and exploding gradient problems. 📈
      • Learn how RNNs are used in everyday applications like Google's autocomplete and stock prediction. 🚀

      Overview

      The tutorial begins with a comprehensive overview of neural networks, focusing on Recurrent Neural Networks (RNNs). Richard Kirschner from Simplilearn guides us through the unique features of RNNs, particularly their ability to handle sequential data by maintaining a memory of previous inputs, which makes them essential for time-dependent tasks.
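
      To make that recurrence concrete, here is a minimal sketch (not taken from the video) of the update an RNN applies at each time step: the new hidden state is computed from the previous hidden state and the current input, so earlier inputs keep influencing later outputs. The sizes and weight names are illustrative.

          import numpy as np

          # Toy dimensions and weights; a real network would learn these.
          input_size, hidden_size = 3, 4
          rng = np.random.default_rng(0)
          W_x = rng.normal(size=(hidden_size, input_size))   # input-to-hidden weights
          W_h = rng.normal(size=(hidden_size, hidden_size))  # hidden-to-hidden (recurrent) weights
          b = np.zeros(hidden_size)

          def rnn_step(h_prev, x_t):
              """One time step: combine the previous state with the new input."""
              return np.tanh(W_x @ x_t + W_h @ h_prev + b)

          h = np.zeros(hidden_size)                     # initial hidden state
          for x_t in rng.normal(size=(5, input_size)):  # a toy sequence of five inputs
              h = rnn_step(h, x_t)                      # each step's state feeds the next
          print(h)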

        RNNs face significant challenges like the vanishing and exploding gradient problems, which can hinder their performance. However, Long Short-Term Memory (LSTM) networks are introduced as a robust solution to these issues, with their ability to remember long-term dependencies, enhancing the functionality of traditional RNNs.
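
        To see what those LSTM gates do in practice, the toy step below (not from the video) applies the forget, input, and output gates the tutorial walks through later: forget irrelevant parts of the old cell state, selectively add new information, and expose only part of the result as output. All weights and sizes are illustrative.

            import numpy as np

            def sigmoid(z):
                return 1.0 / (1.0 + np.exp(-z))

            n_in, n_hid = 3, 4
            rng = np.random.default_rng(1)
            # One weight matrix and bias per gate (f = forget, i = input, c = candidate, o = output).
            W = {g: rng.normal(size=(n_hid, n_in + n_hid)) for g in ("f", "i", "c", "o")}
            b = {g: np.zeros(n_hid) for g in ("f", "i", "c", "o")}

            def lstm_step(x_t, h_prev, c_prev):
                z = np.concatenate([h_prev, x_t])
                f = sigmoid(W["f"] @ z + b["f"])      # forget gate: drop irrelevant past state
                i = sigmoid(W["i"] @ z + b["i"])      # input gate: decide what to add
                c_hat = np.tanh(W["c"] @ z + b["c"])  # candidate cell-state values
                c = f * c_prev + i * c_hat            # selectively update the cell state
                o = sigmoid(W["o"] @ z + b["o"])      # output gate: decide what to expose
                h = o * np.tanh(c)
                return h, c

            h, c = np.zeros(n_hid), np.zeros(n_hid)
            for x_t in rng.normal(size=(5, n_in)):    # a toy sequence of five inputs
                h, c = lstm_step(x_t, h, c)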

          A practical use case is presented on predicting Google's stock prices from historical data, showcasing how to implement an LSTM model using Python libraries. The example covers not only the setup and training of the model but also how to evaluate its performance and visualize the results.
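
          As a rough sketch of that workflow, the snippet below strings together the steps described in the video: scale the 'Open' column into the 0-1 range, build 60-step training windows, stack LSTM and Dropout layers in Keras, train with the Adam optimizer and mean-squared-error loss, then predict the test period and plot it against the real prices. The file names, epoch count, and batch size are assumptions for illustration, and the imports use the tensorflow.keras packaging rather than the standalone Keras shown in the video.

              import numpy as np
              import pandas as pd
              import matplotlib.pyplot as plt
              from sklearn.preprocessing import MinMaxScaler
              from tensorflow.keras.models import Sequential
              from tensorflow.keras.layers import Input, Dense, LSTM, Dropout

              # --- Data prep: scale the 'Open' column and build 60-step windows ---
              dataset_train = pd.read_csv('Google_Stock_Price_Train.csv')  # assumed file name
              training_set = dataset_train.iloc[:, 1:2].values             # column 1 = open price

              sc = MinMaxScaler(feature_range=(0, 1))                      # squish values into [0, 1]
              training_set_scaled = sc.fit_transform(training_set)

              X_train, y_train = [], []
              for i in range(60, len(training_set_scaled)):                # each sample: 60 past opens
                  X_train.append(training_set_scaled[i - 60:i, 0])
                  y_train.append(training_set_scaled[i, 0])                # target: the next open
              X_train, y_train = np.array(X_train), np.array(y_train)
              X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))  # (samples, 60, 1)

              # --- Model: stacked LSTM layers, each followed by 20% dropout ---
              regressor = Sequential()
              regressor.add(Input(shape=(X_train.shape[1], 1)))
              regressor.add(LSTM(units=50, return_sequences=True))
              regressor.add(Dropout(0.2))
              regressor.add(LSTM(units=50, return_sequences=True))
              regressor.add(Dropout(0.2))
              regressor.add(LSTM(units=50, return_sequences=True))
              regressor.add(Dropout(0.2))
              regressor.add(LSTM(units=50))                                # last LSTM returns a single vector
              regressor.add(Dropout(0.2))
              regressor.add(Dense(units=1))                                # one output: the predicted price

              regressor.compile(optimizer='adam', loss='mean_squared_error')
              regressor.fit(X_train, y_train, epochs=100, batch_size=32)   # illustrative epoch/batch values

              # --- Evaluate: predict the test period and plot it against the real prices ---
              dataset_test = pd.read_csv('Google_Stock_Price_Test.csv')    # assumed file name
              real_price = dataset_test.iloc[:, 1:2].values

              # Each test prediction still needs the previous 60 opens, so prepend the tail of the training data.
              dataset_total = pd.concat((dataset_train['Open'], dataset_test['Open']), axis=0)
              inputs = dataset_total[len(dataset_total) - len(dataset_test) - 60:].values.reshape(-1, 1)
              inputs = sc.transform(inputs)                                # reuse the scaler fitted on the training data

              X_test = np.array([inputs[i - 60:i, 0] for i in range(60, 60 + len(dataset_test))])
              X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))
              predicted_price = sc.inverse_transform(regressor.predict(X_test))  # back to the original price scale

              plt.plot(real_price, color='red', label='Real Google stock price')
              plt.plot(predicted_price, color='blue', label='Predicted Google stock price')
              plt.title('Google stock price prediction')
              plt.xlabel('Time')
              plt.ylabel('Google stock price')
              plt.legend()
              plt.show()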

            Chapters

            • 00:00 - 00:30: Introduction to RNN Tutorial This chapter provides an introduction to the RNN (Recurrent Neural Network) tutorial led by Richard Kirschner from the Simplilearn team. The focus is on understanding neural networks, popular neural networks, and specifically recurrent neural networks, including their importance and framework. The goal is to build foundational knowledge before delving deeper into RNNs.
            • 00:30 - 05:00: Understanding Recurrent Neural Networks (RNN) This chapter introduces recurrent neural networks (RNNs), emphasizing their working principles and challenges such as the vanishing and exploding gradient problem.
            • 05:00 - 10:00: Neural Networks in Deep Learning The chapter titled 'Neural Networks in Deep Learning' starts with a focus on LSTM, which is essentially a type of RNN (Recurrent Neural Network). It emphasizes the significance of understanding use cases before delving deeply into the technical details. The discussion is geared towards providing an introductory overview of RNNs, a fundamental type of neural network used in deep learning. The transcript mentions the practical application of RNNs using the example of Google's auto-complete feature, which predicts the continuation of a user's sentence. This feature not only exemplifies a real-world application of RNNs but also underscores their importance. It highlights the efficiency and time-saving benefits, like reducing the need for typing when using Google's predictive text capabilities.
            • 10:00 - 15:00: Feed Forward Neural Networks This chapter delves into feed-forward neural networks and their role in analyzing data consisting of frequently occurring sequences of words. It explains how these models are used to predict the next word in a sentence based on past data. For example, when typing a partial search term like 'what is the best food to eat in Las', the network uses historical data to predict 'Vegas' as a likely completion, offering typical autocomplete suggestions.
            • 15:00 - 20:00: Recurrent Neural Networks and Their Applications The chapter delves into the concept of Recurrent Neural Networks (RNNs) and their applications. It begins by discussing the convenience and power of autofill features in technology as a precursor to understanding neural networks. This highlights the adaptability and efficiency of AI in enhancing user experience, as seen in tools like Google Search and Microsoft Word, among others. The text then transitions to a broader explanation of neural networks, foundational to deep learning, which are composed of multiple layers that allow for complex data processing. The chapter promises to explore these networks in greater detail, offering insights into how RNNs stand out in handling sequential data and tasks such as language modeling and time-series prediction.
            • 20:00 - 25:00: Vanishing and Exploding Gradient Problem The chapter discusses how deep learning draws inspiration from the structure and functions of the human brain. It highlights how artificial intelligence, including deep learning, is often evaluated in comparison to human functions. The learning process involves processing huge volumes of data and utilizing complex algorithms for training neural networks. An example is provided involving image pixels of different dog breeds: a floppy-eared Labrador and a German Shepherd. The discussion underscores the importance of such evaluations and comparisons to advance AI technologies. (A brief gradient-clipping sketch, one of the exploding-gradient remedies mentioned in the video, appears just after this chapter list.)
            • 25:00 - 30:00: Long Short Term Memory (LSTM) Networks This chapter explains the process of how image data is fed into a neural network, specifically focusing on LSTM networks. It describes the workflow starting from the input layer where images are formatted to uniform sizes and color content, moving through hidden layers where data is processed, and ending in the output layer. The changes and specifics of data propagation in recurrent neural networks like LSTMs are briefly touched upon, indicating further detailed discussion later in the text.
            • 30:00 - 40:00: LSTM Network Processing Steps The chapter discusses the final output layer of the example network, tailored to identify and classify the breed of a dog, such as a German Shepherd or a Labrador. It highlights the forward propagation characteristic of these networks, emphasizing that they do not require memorizing past outputs as they process data in a forward manner. The speaker also notes in passing that the person pictured in a suit on the slide is not him.
            • 40:00 - 60:00: Case Study: Predicting Stock Prices Using LSTM This chapter begins with an overview of popular neural networks, including the feed-forward neural network for regression and classification problems, the convolutional neural network for image recognition, the deep neural network for acoustic modeling, the deep belief network for cancer detection, and the recurrent neural network for speech recognition. The chapter explores how these networks can be mixed and adapted for various applications, setting the stage for a detailed case study on predicting stock prices using LSTM (Long Short-Term Memory) networks.
            • 60:00 - 80:00: Training and Evaluating the LSTM Model The chapter begins with a discussion on the general applications of LSTM models beyond their classic use cases. It also delves into the architecture of feed-forward neural networks, highlighting how information in these models flows in a linear, unidirectional manner from input to output layers through any hidden layers. The narrative explains that unlike in other network structures, there are no cycles or loops involved in the information flow, signifying a straightforward progression of data processing. The chapter aims to orient the reader on the structural and functional aspects of feed-forward neural networks in the context of training and evaluating LSTM models.
            • 80:00 - 95:30: Visualizing Results and Conclusion The chapter focuses on the limitations of feedforward neural networks, particularly their inability to handle sequential data due to the lack of memory or time scope. This results in decisions being based solely on the current input, without considering past or future inputs, a gap that recurrent neural networks address.
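
            The transcript names gradient clipping as one remedy for the exploding gradient problem covered in the chapter above. The short sketch below (not shown in the video) illustrates what that looks like with a Keras optimizer; the tiny model and the clipping threshold are illustrative choices only.

                from tensorflow.keras.models import Sequential
                from tensorflow.keras.layers import Input, LSTM, Dense
                from tensorflow.keras.optimizers import Adam

                # Throwaway model; the point here is the optimizer settings, not the architecture.
                model = Sequential([Input(shape=(60, 1)), LSTM(50), Dense(1)])

                # clipnorm rescales any gradient whose norm exceeds 1.0 before the update is applied
                # (clipvalue is the element-wise alternative); the threshold is an illustrative choice.
                model.compile(optimizer=Adam(learning_rate=0.001, clipnorm=1.0),
                              loss='mean_squared_error')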

            Recurrent Neural Network (RNN) Tutorial | RNN LSTM Tutorial | Deep Learning Tutorial | Simplilearn Transcription

            • 00:00 - 00:30 welcome to the rnn tutorial that's the recurrent neural network my name is richard kirschner i'm with the simplylearn team that's www.simplylearn.com get certified get ahead what's in it for you we will start with the course of fundamentals what is a neural network in popular neural networks it's important to know the framework we're in and what we're going to be looking at specifically then we'll touch on why a recurrent neural network
            • 00:30 - 01:00 what is a recurrent neural network and how does an rnn work one of the big things about rnns is what they call the vanishing and exploding gradient problem so we'll look at that and then we're going to be using a use case study that's going to be in keras on tensorflow keras is a python module for doing neural networks in deep learning and in there there's the what they call long short term memory lstm and then we'll use the use case to implement our lstm on the keras so when you see that
            • 01:00 - 01:30 lstm that is basically the rnn network and we'll get into that the use case is always my favorite part before we dive into any of this we're going to take a look at what is an rnn or an introduction to the rnn do you know how google's auto complete feature predicts the rest of the words a user is typing i love that auto complete feature as i'm typing away saves me a lot of time i can just kind of hit the enter key and it auto fills everything and i don't have to type as much well first there's a
            • 01:30 - 02:00 collection of large volumes of most frequently occurring consecutive words this is fed into a recurrent neural network that analyzes the data by finding the sequence of words occurring frequently and builds a model to predict the next word in the sentence and then google what is the best food to eat in las i'm guessing you're going to say las mexico no it's going to be las vegas so the google search will take a look at that and say hey the most common autocomplete is going to be vegas in there it usually gives you three or four different
            • 02:00 - 02:30 choices so it's a very powerful tool it saves us a lot of time especially when we're doing a google search or even in microsoft words has a some people get very mad at it auto fills with the wrong stuff but you know you're typing away and it helps you autofill i have that in a lot of my different packages it's just a standard feature that we're all used to now so before we dive into the rnn and getting into the depths let's go ahead and talk about what is a neural network neural networks used in deep learning consist of different layers
            • 02:30 - 03:00 connected to each other and work on the structure and functions of a human brain you're going to see that thread human in human brain and human thinking throughout deep learning the only way we can evaluate an artificial intelligence or anything like that is to compare it to human function very important note on there and it learns from a huge volumes of data and it uses complex algorithm to train a neural net so in here we have image pixels of two different breeds of dog uh one looks like a nice floppy eared lab and one a german shepherd you
            • 03:00 - 03:30 know both wonderful breeds of animals that image then goes into an input layer that input layer might be formatted at some point because you have to let it know like you know different pictures are going to be different sizes and different color content then it'll feed into hidden layers so each of those pixels or each point of data goes in and then splits into the hidden layer which then goes into another hidden layer which then goes to an output layer r and n there's some changes in there which we're going to get into so it's not just a straightforward propagation of data
            • 03:30 - 04:00 like we've covered in many other tutorials and finally you have an output layer and the output layer has two outputs it has one that lights up if it's a german shepherd and another that lights up is if it's a labrador so identify as a dog's breed set networks do not require memorizing the past output so our forward propagation is just that it goes forward and doesn't have to be memorized stuff and you can see there that's not actually me in the picture uh dressed up in my suit i haven't worn a suit in years so as
            • 04:00 - 04:30 we're looking at this we're going to change it up a little bit before we cover that let's talk about popular neural networks first there's the feed forward neural network used in general regression and classification problems and we have the convolution neural network used for image recognition deep neural network used for acoustic modeling deep belief network used for cancer detection and recurrent neural network used for speech recognition now taken a lot of these and mixed them around a little bit so just because it's
            • 04:30 - 05:00 used for one thing doesn't mean it can't be used for other modeling but generally this is where the field is and this is how those models are generally being used right now so we talk about a feed forward neural network in a feed forward neural network information flows only in the forward direction from the input nodes through the hidden layers if any into the output nodes there are no cycles or loops in the network and so you can see here we have our input layer i was talking about how it just goes straight forward into the hidden layers so each one of those connects and then connects to the next hidden layer connects to the output layer and of
            • 05:00 - 05:30 course we have a nice simplified version where it has a predicted output the refer to the input is x a lot of times and the output as y decisions are based on current input no memory about the past no future scope why recurrent neural network issues in feed forward neural network so one of the biggest issues is because it doesn't have a scope of memory or time a feed forward neural network doesn't know how to handle sequential data it only considers only the current input so if you have a
            • 05:30 - 06:00 series of things and because three points back affects what's happening now and what your output affects what's happening that's very important so whatever i put as an output is going to affect the next one a feed forward doesn't look at any of that it just looks at this is what's coming in and it cannot memorize previous inputs so it doesn't have that list of inputs coming in solution to feed forward neural network you'll see here where it says recurrent neural network and we have our x on the bottom going to h going to y that's your feed forward uh but right in the middle it has a value c so there's a
            • 06:00 - 06:30 whole another process so it's memorizing what's going on in the hidden layers and the hidden layers as they produce data feed into the next one so your hidden layer might have an output that goes off to y but that output goes back into the next prediction coming in what this does is this allows it to handle sequential data it considers the current input and also the previously received inputs and if we're going to look at general drawings and solutions we should also look at applications of the rnn image captioning
            • 06:30 - 07:00 rnn is used to caption an image by analyzing the activities present in it a dog catching a ball in midair that's very tough i mean you know we have a lot of stuff that analyzes images of a dog and the image of a ball but it's able to add one more feature in there that's actually catching the ball in midair time series prediction any time series problem like predicting the prices of stocks in a particular month can be solved using rnn and we'll dive into that in our use case and actually take a look at some stock one of the things you
            • 07:00 - 07:30 should know about analyzing stock today is that it is very difficult and if you're analyzing the whole stock the stock market at the new york stock exchange in the u.s produces somewhere in the neighborhood if you count all the individual trades in fluctuations by the second um it's like three terabytes a day of data so we're going to look at one stock just analyzing one stock is really tricky in here we'll give you a little jump on that so that's exciting but don't expect to get rich off of it immediately another application of the rnn is natural language processing text
            • 07:30 - 08:00 mining and sentiment analysis can be carried out using rnn for natural language processing and you can see right here the term natural language processing when you string those three words together is very different than if i said processing language natural so the time series is very important when we're analyzing sentiments it can change the whole value of a sentence just by switching the words around or if you're just counting the words you may get one sentiment where if you actually look at the order they're in you get a completely different sentiment when it rains look
            • 08:00 - 08:30 for rainbows when it's dark look for stars both of these are positive sentiments and they're based upon the order of which the sentence is going in machine translation given an input in one language rnn can be used to translate the input into a different languages as output i myself very linguistically challenged but if you study languages and you're good with languages you know right away that if you're speaking english you would say big cat and if you're speaking spanish you would say cat big so that
            • 08:30 - 09:00 translation is really important to get the right order to get there's all kinds of parts of speech that are important to know by the order of the words here this person is speaking in english and getting translated and you can see here a person is speaking in english in this little diagram i guess that's denoted by the flags i have a flag i own it no um but they're speaking in english and it's getting translated into chinese italian french german and spanish languages some of the tools coming out are just so cool so somebody
            • 09:00 - 09:30 like myself who's very linguistically challenged i can now travel into worlds i would never think of because i can have something translate my english back and forth readily and i'm not stuck with a communication gap so let's dive into what is a recurrent neural network recurrent neural network works on the principle of saving the output of a layer and feeding this back to the input in order to predict the output of the layer sounds a little confusing when we start breaking it down it'll make more sense and usually we have a propagation
            • 09:30 - 10:00 forward neural network with the input layers the hidden layers the output layer with the recurrent neural network we turn that on its side so here it is and now our x comes up from the bottom into the hidden layers into y and they usually draw very simplified x to h with c as a loop a to y where a b and c are the perimeters a lot of times you'll see this kind of drawing in here digging closer and closer into the h and how it works going from left to right you'll see that the c goes in and then the x
            • 10:00 - 10:30 goes in so the x is going upward bound and c is going to the right a is going out and c is also going out that's where it gets a little confusing so here we have xn cn and then we have y out and c out and c is based on ht minus 1. so our value is based on the y and the h value are connected to each other they're not necessarily the same value because h can be its own thing and usually we draw this or we represent it as a function h of t equals a function of c where h of t
            • 10:30 - 11:00 minus 1 that's the last h output and x of t going in so it's the last output of h combined with the new input of x where h t is the new state fc is a function with the parameters c that's a common way of denoting it h t minus 1 is the old state coming out and then x of t is an input vector at time of step t well we need to cover types of recurrent neural networks and so the first one is the most common one which is a
            • 11:00 - 11:30 one-to-one single output one-to-one neural network is usually known as a vanilla neural network used for regular machine learning problems why because vanilla is usually considered kind of a just a real basic flavor but because it's very basic a lot of times they'll call it the vanilla neural network which is not the common term but it is you know kind of a slang term people will know what you're talking about usually if you say that then we run one to mini so you have a single input and you might have a multiple outputs in this case image captioning as we looked at earlier
            • 11:30 - 12:00 where we have not just looking at it as a dog but a dog catching a ball in the air and then you have mini to one network takes in a sequence of inputs examples sentiment analysis where a given sentence can be classified as expressing positive or negative sentiments and we looked at that as we were discussing if it rains look for a rainbow so positive sentiment where rain might be a negative sentiment if you're just adding up the words in there and then the course if you're going to do a one-to-one mini to one one to many
            • 12:00 - 12:30 there's many to many networks takes in a sequence of inputs and generates a sequence of outputs example machine translation so we have a lengthy sentence coming in english and then going out in all the different languages uh you know just a wonderful tool very complicated set of computations you know if you're a translator you realize just how difficult it is to translate into different languages one of the biggest things you need to understand when we're working with this neural network is what's called the vanishing gradient problem while training an rnn your slope
            • 12:30 - 13:00 can be either too small or very large and this makes training difficult when the slope is too small the problem is known as vanishing gradient and you'll see here they have a nice uh image loss of information through time so if you're pushing not enough information forward that information is lost and then when you go to train it you start losing the third word in the sentence or something like that or it doesn't quite follow the full logic of what you're working on exploding gradient problem oh this is one that runs into everybody when you're
            • 13:00 - 13:30 working with this particular neural network when the slope tends to grow exponentially instead of decaying this problem is called exploding gradient issues in gradient problem long tracking time poor performance bad accuracy and i'll add one more in there your computer if you're on a lower end computer testing out a model will lock up and give you the memory error explaining gradient problem consider the following two examples to understand what should be the next word in the sequence the person who took my bike and blank a
            • 13:30 - 14:00 thief the students who got into engineering with blank from asia and you can see in here we have our x value going in we have the previous value going forward and then you back propagate the error like you do with any neural network and as we're looking for that missing word maybe we'll have the person took my bike and blank was a thief and the student who got into engineering with a blank were from asia consider the following example the person who took the bike so we'll go back to the person who took the bike was
            • 14:00 - 14:30 blank a thief in order to understand what would be the next word in the sequence the rnn must memorize the previous context whether the subject was singular noun or a plural noun so was a thief as singular the student who got into engineering well in order to understand what would be the next word in the sequence the rnn must memorize the previous context whether the subject was singular noun or a plural noun and so you can see here the students who got into engineering with blank were from asia it might be sometimes difficult for
            • 14:30 - 15:00 the air to back propagate to the beginning of the sequence to predict what should be the output so when you run into the gradient problem we need a solution the solution to the gradient problem first we're going to look at exploding gradient where we have three different solutions depending on what's going on one is identity initialization so the first thing we want to do is see if we can find a way to minimize the identities coming in instead of having it identify everything just the important information we're looking at
            • 15:00 - 15:30 next is to truncate the back propagation so instead of having whatever information it's sending to the next series we can truncate what it's sending we can lower that particular set of layers make those smaller and finally is a gradient clipping so when we're training it we can clip what that gradient looks like and narrow the training model that we're using when you have a vanishing gradient the option problem we can take a look at weight initialization very similar to the identity but we're going to add more
            • 15:30 - 16:00 weights in there so it can identify different aspects of what's coming in better choosing the right activation function that's huge so we might be activating based on one thing and we need to limit that we haven't talked too much about activation function so we'll look at that just minimally there's a lot of choices out there and then finally there's long short term memory networks the lstms and we can make adjustments to that so just like we can clip the gradient as it comes out we can also expand on that we can increase the
            • 16:00 - 16:30 memory network the size of it so it handles more information and one of the most common problems in today's setup is what they call long-term dependencies suppose we try to predict the last word in the text the clouds are in the and you probably said sky here we do not need any further context it's pretty clear that the last word is going to be sky suppose we try to predict the last word in the text i have been staying in spain for the last 10 years i can speak fluent maybe you said portuguese or
            • 16:30 - 17:00 french no you probably said spanish the word we predict will depend on the previous few words in context here we need the context of spain to predict the last word in the text it's possible that the gap between the relevant information and the point where it is needed to become very large lstms help us solve this problem so the lstms are a special kind of recurrent neural network capable of learning long-term dependencies remembering information for long periods of time is
            • 17:00 - 17:30 their default behavior all recurrent neural networks have the form of a chain of repeating modules of neural network connections in standard rnns this repeating module will have a very simple structure such as a single tangent h layer lstms also have a chain-like structure but the repeating module has a different structure instead of having a single neural network layer there are four interacting layers communicating in a very special way lstms are a special kind of recurrent neural network capable
            • 17:30 - 18:00 of learning long-term dependencies remembering information for long periods of time is their default behavior lstms also have a chain-like structure but the repeating module has a different structure instead of having a single neural network layer there are four interacting layers communicating in a very special way as you can see the deeper we dig into this the more complicated the graphs get in here i want you to note that you have x of t minus one coming in you have x of t coming in and you have x of t plus one
            • 18:00 - 18:30 and you have h of t minus one and h of t coming in and h of t plus one going out and of course uh on the other side is the output a um in the middle we have our tangent h but it occurs in two different places so not only when we're computing the x of t plus one are we getting the tangent h from x of t but we're also getting that value coming in from the x of t minus one so the short of it is as you look at these layers not only does the propagation through the first layer go into the second
            • 18:30 - 19:00 layer back into itself but it's also going into the third layer so now we're kind of stacking those up and this can get very complicated as you grow that in size it also grows in memory too and in the amount of resources it takes but it's a very powerful tool to help us address the problem of complicated long sequential information coming in like we were just looking at in the sentence and when we're looking at our long short term memory network uh there's three steps of processing sensing in the lstms
            • 19:00 - 19:30 that we look at the first one is we want to forget irrelevant parts of the previous state you know a lot of times like you know is as in a unless we're trying to look at whether it's a plural noun or not they don't really play a huge part in the language so we want to get rid of them then selectively update cell state values so we only want to update the cell state values that reflect what we're working on and finally we want to put only output certain parts of the cell state so whatever is coming out we want to limit what's going out too and let's dig a
            • 19:30 - 20:00 little deeper into this let's just see what this really looks like uh so step one decides how much of the past it should remember first step in the lstm is to decide which information to be omitted in from the cell in that particular time step it is decided by the sigmoid function it looks at the previous state h to t minus 1 and the current input x t and computes the function so you can see over here we have a function of t equals the sigmoid function of the weight of f the h at t minus 1 and then
            • 20:00 - 20:30 x of t plus of course you have a bias in there with any of our neural networks so we have a bias function so f of t equals forget gate decides which information to delete that is not important from the previous time step considering an lstm is fed with the following inputs from the previous and present time step alice is good in physics john on the other hand is good in chemistry so previous output john plays football well he told me yesterday over the phone that he had served as a captain of his college
            • 20:30 - 21:00 football team that's our current input so as we look at this the first step is the forget gate realizes there might be a change in context after encountering the first full stop compares with the current input sentence of x a t so we're looking at that full stop and then compares it with the input of the new sentence the next sentence talks about john so the information on alice is deleted okay that's important to know so we have this input coming in and if we're going to continue on with john then that's going to be the primary
            • 21:00 - 21:30 information we're looking at the position of the subject is vacated and is assigned to john and so in this one we've seen that we've weeded out a whole bunch of information and we're only passing information on john since that's now the new topic so step two is then to decide how much should this unit add to the current state in the second layer there are two parts one is the sigmoid function and the other is the tangent h in the sigmoid function it decides which values to let through 0 or 1. tangent h function gives a weightage to the values
            • 21:30 - 22:00 which are passed deciding their level of importance minus one to one and you can see the two formulas that come up uh the i of t equals the sigmoid of the weight of i h to t minus one x of t plus the bias of i and the c of t equals the tangent of h of the weight of c of h to t minus 1 x of t plus the bias of c so our i of t equals the input gate determines which information to let through based on its significance in the current time step if this seems a little complicated don't worry because a lot of
            • 22:00 - 22:30 the programming is already done when we get to the case study understanding though that this is part of the program is important when you're trying to figure out these what to set your settings at you should also note when you're looking at this it should have some semblance to your forward propagation neural networks where we have a value assigned to a weight plus a bias very important steps than any of the neural network layers whether we're propagating into them the information from one to the next or we're just doing a straightforward neural network
            • 22:30 - 23:00 propagation let's take a quick look at this what it looks like from the human standpoint as i step out of my suit again consider the current input at x of t john plays football well he told me yesterday over the phone that he had served as a captain of his college football team that's our input input gate analysis the important information john plays football and he was a captain of his college team is important he told me over the phone yesterday is less important hence it is forgotten this process of adding some new information
            • 23:00 - 23:30 can be done via the input gate now this example is as a human form and we'll look at training this stuff in just a minute but as a human being if i wanted to get this information from a conversation maybe it's a google voice listening in on you or something like that um how do we weed out the information that he was talking to me on the phone yesterday well i don't want to memorize that he talked to me on the phone yesterday or maybe that is important but in this case it's not i want to know that he was the captain of the football team i want to know that he served i want to know that john plays
            • 23:30 - 24:00 football and he was a captain of the college football team those are the two things that i want to take away as a human being again we measure a lot of this from the human viewpoint and that's also how we try to train them so we can understand these neural networks finally we get to step three decides what part of the current cell state makes it to the output the third step is to decide what will be our output first we run a sigmoid layer which decides what parts of the cell state make it to the output then we put the cell state through the tangent h to push the values to be
            • 24:00 - 24:30 between -1 and 1 and multiply it by the output of the sigmoid gate so when we talk about the output of t we set that equal to the sigmoid of the weight of o of the h of t minus 1 and back one step in time by the x of t plus of course the bias the h of t equals the o of t times the tangent h of c of t so our o t equals the output gate allows the passed in information to impact the output in the current time step let's
            • 24:30 - 25:00 consider the example of predicting the next word in the sentence john played tremendously well against the opponent and won for his team for his contributions brave blank was awarded player of the match there could be a lot of choices for the empty space current input brave is an adjective adjectives describe a noun john could be the best output after brave thumbs up for john awarded player of the match and if you were to pull just the nouns out of the sentence team doesn't look right because that's not really the subject we're
            • 25:00 - 25:30 talking about contributions you know brave contributions or brave teen brave player brave match so you look at this and you can start to train this these this neural network so starts looking at and goes oh no john is what we're talking about so brave is an adjective john's going to be the best output and we give john a big thumbs up and then of course we jump into my favorite part the case study use case implementation of lstm let's predict the prices of stocks
            • 25:30 - 26:00 using the lstm network based on the stock price data between 2012 2016. we're going to try to predict the stock prices of 2017 and this will be a narrow set of data we're not going to do the whole stock market it turns out that the new york stock exchange generates roughly three terabytes of data per day that's all the different trades up and down of all the different stocks going on and each individual one second to second or nanosecond to nanosecond but we're going to limit that to just some very basic fundamental
            • 26:00 - 26:30 information so don't think you're going to get rich off this today but at least you can give an eye you can give a step forward in how to start processing something like stock prices a very valid use for machine learning in today's markets use case implementation of lstm let's dive in we're going to import our libraries we're going to import the training set and get the scaling going now if you watch any of our other tutorials a lot of these pieces just start to look very familiar because it's
            • 26:30 - 27:00 very similar setup let's take a look at that and just reminder we're going to be using anaconda and the jupyter notebook so here i have my anaconda navigator when we go under environments i've actually set up a keras python 36 i'm in python36 and nice thing about anaconda especially the newer version i remember a year ago messing with anaconda in different versions of python in different environments anaconda now has a nice interface
            • 27:00 - 27:30 and i have this installed both on a ubuntu linux machine and on windows so it works fine on there you can go in here and open a terminal window and then in here once you're in the terminal window this is where you're going to start installing using pip to install your different modules and everything now we've already pre-installed them so we don't need to do that in here but if you don't have them installed in your particular environment you'll need to do that and of course you don't need to use the anaconda or the jupiter you can use whatever favorite python id you like i'm just a big fan of this because it keeps
            • 27:30 - 28:00 all my stuff separate you can see on this machine i have specifically installed one for keras since we're going to be working with keras under tensorflow we go back to home i've gone up here to application and that's the environment i've loaded on here and then we'll click on the launch jupyter notebook now i've already in my jupyter notebook have set up a lot of stuff so that we're ready to go kind of like martha stewart in the old cooking show so we want to make sure we have all our tools for you so you're not waiting for them to load and if we go up here to where it
            • 28:00 - 28:30 says new you can see where you can create a new python 3. that's what we did here underneath the setup so it already has all the modules installed on it and i'm actually renamed this if you go under file you can rename it we've i'm calling it rnn stock and let's just take a look and start diving into the code let's get into the exciting part now we've looked at the tool and of course you might be using a different tool which is fine let's start putting that code in there and seeing what those imports and uploading everything looks like now first half is kind of boring when we hit the run button because we're
            • 28:30 - 29:00 going to be importing numpy as np that's uh the number python which is your numpy array and the matplot library because we're going to do some plotting at the end and our pandas for our data set our pandas as pd and when i hit run it really doesn't do anything except for load those modules just a quick note let me just do a quick draw here oops shift alt there we go you'll notice when we're doing this setup if i was to divide this up oops i'm going to actually let's overlap these here we go
            • 29:00 - 29:30 this first part that we're going to do is our data prep a lot of prepping involved um in fact depending on what your system is since we're using keras i put an overlap here but you'll find that almost maybe even half of the code we do is all about the data prep and the reason i overlapped this with uh keras let me
            • 29:30 - 30:00 just put that down because that's what we're working in uh is because keras has like their own preset stuff so it's already pre-built in which is really nice so there's a couple steps a lot of times that are in the keras setup we'll take a look at that to see what comes up in our code as we go through and look at stock and the last part is to evaluate and if you're working with shareholders or classroom whatever it is you're working with uh the evaluate is the next biggest piece um so the actual code here for keras
            • 30:00 - 30:30 is a little bit more but when you're working with some of the other packages you might have like three lines that might be it all your stuff is in your pre-processing and your data since keras is cutting edge and you load the individual layers you'll see that there's a few more lines here and keras is a little bit more robust and then you spend a lot of time like i said with the evaluate you want to have something you present to everybody else and say hey this is what i did this is what it looks like so let's go through those steps this is like a kind of just general overview and let's just take a look and see what the next set of code
            • 30:30 - 31:00 looks like and in here we have a data set train and it's going to be read using the pd or pandas.readcsv and it's a googlestockpricetrain.csv and so under this we have trainingset equals datasettrain.ilocation and we've kind of sorted out part of that so what's going on here let's just take a look at let's look at the actual file and see what's going on there now if we look at this uh ignore all the extra files on this i already have a
            • 31:00 - 31:30 train and a test set where it's sorted out this is important to notice because a lot of times we do that as part of the pre-processing of the data we take 20 of the data out so we can test it and then we train the rest of it that's what we use to create our neural network that way we can find out how good it is uh but let's go ahead and just take a look and see what that looks like as far as the file itself and i went ahead and just opened this up in a basic word pad and text editor just so we can take a look at it certainly you can open up an excel or any other kind of spreadsheet
            • 31:30 - 32:00 and we note that this is a comma separated variables we have a date uh open high low close volume this is the standard stuff that we import into our stock or the most basic set of information you can look at in stock it's all free to download in this case we downloaded it from google that's why we call it the google stock price and this specifically is google this is the google stock values from as you can see here we started off at 1 3 2012. so when we look at this first setup up
            • 32:00 - 32:30 here we have a data set train equals pd underscore csv and if you noticed on the original frame let me just go back there they had it set to home ubuntu downloads google stock price train i went ahead and changed that because we're in the same file where i'm running the code so i've saved this particular python code and i don't need to go through any special paths or have the full path on there and then of course we want to take out certain values in here and you're going
            • 32:30 - 33:00 to notice that we're using our data set and we're now in pandas so pandas basically it looks like a spreadsheet and in this case we're going to do i location which is going to get specific locations the first value is going to show us that we're pulling all the rows in the data and the second one is we're only going to look at columns one and two and if you remember here from our data as we switch back on over columns
            • 33:00 - 33:30 we saw we start with zero which is the date and we're going to be looking at open and high which would be one and two we'll just label that right there so you can see now when you go back and do this you certainly can extrapolate and do this on all the columns but for the example let's just limit a little bit here so that we can focus on just some key aspects of stock and then we'll go up here and run the code and again i said the first half is
            • 33:30 - 34:00 very boring whenever we hit the run button it doesn't do anything because we're still just loading the data and setting it up now that we've loaded our data we want to go ahead and scale it we want to do what they call feature scaling and in here we're going to pull it up from the sk learn or the sk kit pre-processing import min max scalar and when you look at this you got to remember that biases in our data we want to get rid of that so if you have something that's like a really high
            • 34:00 - 34:30 value let's just draw a quick graph and i have something here like the maybe the stock has a value one stock has a value of a hundred and another stock has a value of five um you start to get a bias between different stocks and so when we do this we go ahead and say okay 100 is going to be the max and 5 is going to be the min and then everything else goes and then we change this so we just squish it down i like the word squish so it's between 1
            • 34:30 - 35:00 and 0. so 100 equals one or one equals a hundred and zero equals five and you can just multiply it's usually just a simple multiplication we're using uh multiplication so it's going to be uh minus five and then 100 divided or 95 divided by one so or whatever value is is divided by ninety-five and uh once we've actually created our scale we've tolling is going to be from zero to one we wanna take our training set and we're gonna create a training set scaled and we're gonna use our
            • 35:00 - 35:30 scalar sc we're going to fit we're going to fit and transform the training set uh so we can now use the sc this this particular object we'll use it later on our testing set because remember we have to also scale that when we go to test our model and see how it works and we'll go ahead and click on the run again it's not going to have any output yet because we're just setting up all the variables okay so we pasted the data in here and we're going to create the data structure with the 60 time steps and output
            • 35:30 - 36:00 first note we're running 60 time steps and that is where this value here also comes in so the first thing we do is we create our x train and y train variables we set them to an empty python array very important to remember what kind of array we're in what we're working with and then we're going to come in here we're going to go for i in range 60 to 1258 there's our 60 60 time steps and the reason we want to do this is as we're adding the data in there there's nothing below the 60 so if we're going
            • 36:00 - 36:30 to use 60 time steps we have to start at point 60 because it includes everything underneath of it otherwise you'll get a pointer error and then we're going to take our x train and we're going to append training set scaled this is a scaled value between 0 and 1. and then as i is equal to 60 this value is going to be 60 minus 60 is 0. so this actually is 0 to i so it's going to be 0 60 1 to 61. let me just circle this part right here
            • 36:30 - 37:00 1 to 61 2 to 62 and so on and so on and if you remember i said 0 to 60 that's incorrect because it does not count remember it starts at 0 so this is a count of 60. so it's actually 59. important to remember that as we're looking at this and then the second part of this that we're looking at so if you remember correctly here we go we go from 0 to 59 of i and then we have a comma a 0 right here and so finally we're just going to look at the open value now i know we did put it
            • 37:00 - 37:30 in there for one to two if you move quickly it doesn't count the second one so it's just the open value we're looking at just open and then finally we have y train dot append training set i to zero and if you remember correctly i two or i comma zero if you remember correctly this is 0 to 59 so there's 60 values in it so we do i down here this is number 60. so we're going to do this is we're creating an array and we have 0
            • 37:30 - 38:00 to 59 and over here we have number 60 which is going into the y train it's being appended on there and then this just goes all the way up so this is down here is a 0 to 59 and we'll call it 60 since that's the value over here and it goes all the way up to 12 58. that's where this value here comes in that's the length of the data we're loading so we've loaded two arrays we've loaded
            • 38:00 - 38:30 one array that has which is filled with arrays from 0 to 59 and we loaded one array which is just the value and what we're looking at you want to think about this as a time sequence uh here's my open open open openopenopen what's the next one in the series so we're looking at the google stock and each time it opens we want to know what the next one 0 through 59 what's 60 1 through 60 what's 61. 2 through 62 what's 62 and so on and so on going up and then once we've loaded
            • 38:30 - 39:00 those in our for loop we go ahead and take x-train and y-train equals np.array x-train dot np array y-train we're just converting this back into a numpy array that way we can use all the cool tools that we get with numpy array including reshaping so if we take a look and see what's going on here we're going to take our x train we're going to reshape it wow what the heck does reshape mean that means we have an array if you remember correctly so many numbers by 60.
            • 39:00 - 39:30 that's how wide it is and so we're when you when you do x train dot shape that gets one of the shapes and you get um x train dot shape of one gets the other shape and we're just making sure the data is formatted correctly and so you use this to pull the fact that it's 60 by in this case where's that value 60 by 1198 1258 minus 60 is 1198 and we're making sure that that is shaped correctly so the
            • 39:30 - 40:00 data is grouped into 1198 by 60 different arrays and then the one on the end just means at the end because this when you're dealing with shapes and numpy they look at this as layers and so the in layer needs to be one value that's like the leaf of a tree where this is the branch and then it branches out some more um and then you get the leaf np reshape comes from and using the existing shapes to form it we'll go ahead and run this piece of
            • 40:00 - 40:30 code again there's no real output and then we'll import our different keras modules that we need so from keras models we're going to import the sequential model dealing with sequential data we have our dense layers we have actually three layers we're going to bring in our dense our lstm which is what we're focusing on and our dropout and we'll discuss these three layers more in just a moment but you do need the with the lstm you do need the dropout and then the final layer will be the dense but let's go ahead and run this and that'll import our modules
            • 40:30 - 41:00 and you'll see we get an error on here and if you read it closer it's not actually an error it's a warning what does this warning mean these things come up all the time when you're working with such cutting edge modules that are completely being updated all the time we're not going to worry too much about the warning all it's saying is that the h5py module which is part of keras is going to be updated at some point and if you're running new stuff on keras and you start updating your keras system you better make sure that your h5py is updated too otherwise you're going to
            • 41:00 - 41:30 have an error later on and you can actually just run an update on the h5py now if you wanted to not a big deal we're not going to worry about that today and i said we were going to jump in and start looking at what those layers mean i meant that and we're going to start off with initializing the rnn and then we'll start adding those layers in and you'll see that we have the lstm and then the dropout lstm then dropout lstm then dropout what the heck is that doing so let's explore that we'll start by initializing the rnn regressor equals
            • 41:30 - 42:00 sequential because we're using the sequential model and we'll run that and load that up and then we're going to start adding our lstm layer and some dropout regularization and right there should be the q dropout regularization and if we go back here and remember our exploding gradient well that's what we're talking about the dropout drops out unnecessary data so we're not just shifting huge amounts of data through the network so and so we go in here let's just go ahead and add this in i'll
            • 42:00 - 42:30 go ahead and run this and we had three of them so let me go ahead and put all three of them in and then we can go back over them there's the second one and let's put one more in let's put that in and we'll go ahead and put two more in i mean i said one more in but it's actually two more in and then let's add one more after that and as you can see each time i run these they don't actually have an output so let's take a closer look and see what's going on here so we're going to add our first lstm layer in here we're going to have units 50. the units is the positive integer and it's the dimensionality of
            • 42:30 - 43:00 the output space this is what's going out into the next layer so we might have 60 coming in but we have 50 going out we have a return sequence because it is a sequence data so we want to keep that true and then you have to tell it what shape it's in well we already know the shape by just going in here and looking at x train shape so input shape equals the x train shape of one comma one it makes it really easy you don't have to remember all the numbers that put in 60 or whatever else is in there you just let it tell the regressor what model to
            • 43:00 - 43:30 use and so we follow our stm with a dropout layer now understanding the dropout layer is kind of exciting because one of the things that happens is we can over train our network that means that our neural network will memorize such specific data that it has trouble predicting anything that's not in that specific realm to fix for that each time we run through the training mode we're going to take 0.2 or 20 percent of our neurons they just turn them off so we're only going to train on the other ones and it's going to be
            • 43:30 - 44:00 random. That way, each time we pass through, we don't over-train the same nodes; they come back in on the next training cycle and we randomly pick a different 20 percent. Comparing the first block to the second, third, and fourth, the big difference is that we don't have to give the input shape again. The output units here is 50, so the next layer automatically knows this layer is putting out 50 values; because it's the next layer in the stack, it sets that itself. So 50 values come out of the previous layer and go
            • 44:00 - 44:30 into the regressor, through the dropout, into the next LSTM, and so on. For the next three layers we don't have to tell Keras the shape; it understands that automatically. We keep the units the same, still 50, and it's still a sequence coming through: 50 units and a sequence. The next piece of code is what brings it all together, so let's take a look at that. We add the output layer, the Dense layer; if you remember, we had three layer types: LSTM,
            • 44:30 - 45:00 Dropout, and Dense. Dense just says we're going to bring this all down into one output: instead of putting out a sequence, we just want the answer at this point. Let's go ahead and run that. Notice that all we're doing is setting things up one step at a time: way up at the top we brought in our data and our different modules, we formatted the data for training, we set up X_train and y_train as the source data and the known answers we're feeding in, and we've reshaped all of that.
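To make the layer stack concrete, here is a minimal sketch of what the model definition being described might look like, assuming the Keras Sequential API and the X_train array prepared earlier in the walkthrough (the unit counts and dropout rate follow the narration; the exact notebook may differ slightly):

```python
from keras.models import Sequential
from keras.layers import Dense, LSTM, Dropout

# Initialize the RNN as a sequential stack of layers
regressor = Sequential()

# First LSTM layer: 50 output units, return the full sequence,
# and declare the input shape (timesteps, features) from X_train
regressor.add(LSTM(units=50, return_sequences=True,
                   input_shape=(X_train.shape[1], 1)))
regressor.add(Dropout(0.2))  # randomly disable 20% of units each training pass

# Later LSTM layers infer their input shape from the previous layer
regressor.add(LSTM(units=50, return_sequences=True))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units=50, return_sequences=True))
regressor.add(Dropout(0.2))

# Final LSTM layer returns only its last output rather than a sequence
regressor.add(LSTM(units=50))
regressor.add(Dropout(0.2))

# Dense output layer collapses everything down to a single predicted value
regressor.add(Dense(units=1))
```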
            • 45:00 - 45:30 We've built our Keras model and imported our different layers, and if you look, we have, what, five total layers in the stack. Now, Keras is a little different from a lot of other systems: a lot of other frameworks put this all in one line and do it automatically, but they don't give you options for how those layers interface or how the data comes in. Keras is cutting edge for this reason, so even though there are a lot of extra steps in building the model, that has a huge impact on the output and what
            • 45:30 - 46:00 we can do with these models in Keras. So we've brought in our Dense layer and we have our full model put together as a regressor. Now we need to compile it and then fit the data: we compile the pieces so they all come together, and then we run our training data through it so the regressor is ready to be used. Let's go ahead and compile that, and I can run it. If you've been looking at any of our other tutorials on neural networks, you'll see we're going to use the
            • 46:00 - 46:30 Adam optimizer. Adam is well suited to big data; there are a couple of other optimizers out there that are beyond the scope of this tutorial, but Adam will work well for this. And loss equals mean squared error, so when we're training, this is what we base the loss on: how bad is our error? We use mean squared error for the error and the Adam optimizer for the gradient updates. You don't have to know the math behind them, but it certainly helps to know what they're doing and where they fit into the bigger model. Then finally we do our fit, fitting the RNN to the training set:
            • 46:30 - 47:00 we have regressor.fit with X_train, y_train, epochs, and batch size. We know what these are: X_train is our sequential input data coming in, and y_train is the answer we're looking for. Epochs is how many times we're going to go over the whole data set, where each row of X_train includes a time sequence of 60 values.
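A rough sketch of the compile and fit calls just described, using the Adam optimizer, mean squared error loss, and the epoch count and batch size quoted in the run output a moment later:

```python
# Compile the model: Adam optimizer, mean squared error loss
regressor.compile(optimizer='adam', loss='mean_squared_error')

# Train: 100 passes over the data set, 32 rows per batch
regressor.fit(X_train, y_train, epochs=100, batch_size=32)
```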
            • 47:00 - 47:30 And batch size is another one of those things where Keras really shines: if you were pulling the data from a large file, instead of trying to load it all into RAM it can pick up smaller batches and load those incrementally. We're not worried about pulling them off a file today, because this data set isn't big enough to give the computer much of a problem; it's not too straining on the resources. But as we run this, imagine what would happen if I were doing a lot more than just one column from one stock, in this case Google. Imagine if I were doing this across all the stocks, and instead of just the open I had open, close, high, and low: you
            • 47:30 - 48:00 can actually find yourself with about 13 different variables times 60, because it's a time sequence, and suddenly you're loading a gigabyte of memory into your RAM. If you're not on multiple computers or a cluster, you can start running into resource problems, but for this we don't have to worry about that. So let's go ahead and run this. It will take a little while on my computer because it's an older laptop, so give it a second to kick in.
            • 48:00 - 48:30 There we go. So we have epochs: this is telling me it's running the first pass through all the data, and as it goes it's batching them in groups of 32, so 32 rows each time, and there are 1198 of them (I think I said 1199 earlier, but it's 1198; I was off by one). Each epoch takes about 13 seconds, so you can imagine this is roughly 20 to 30 minutes of run time on this computer; like I said, it's an older laptop running at about 0.9 gigahertz on a dual processor, and that's fine. What I'll do is go get a drink of coffee, come back, and see what happens
            • 48:30 - 49:00 at the end and where this takes us. And like any good cooking show, I've gone and gotten my latte. I also have some other stuff running in the background, so you'll see the per-epoch times jump up to around 15 or 19 seconds, but you can scroll through and see we've run it through all 100 epochs. So the question is, what does all this mean? One of the first things you'll notice is the loss over here: it stops at about 0.0014, and you can see it keeps going down until it sits at roughly that value several epochs in a row, so we guessed our epoch count pretty well, since the loss has leveled off.
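As an aside, if you assign the return value of that fit call to a variable, you can check the loss curve programmatically instead of scrolling back through the console output; a minimal sketch, assuming the same regressor, X_train, and y_train as above:

```python
# fit() returns a History object whose .history dict records the loss per epoch
history = regressor.fit(X_train, y_train, epochs=100, batch_size=32)

# Print the last few loss values to confirm training has leveled off
print(history.history['loss'][-5:])
```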
            • 49:00 - 49:30 To find out how we did, we're going to load up our test data, the data we didn't process yet, and build the real stock price from the test data set with iloc, the same thing we did when we prepped the data in the first place. So let's go through this code; you can see we've labeled it part three, making the predictions and visualizing the results. The first thing we need to do is read the data in from our test CSV, and you'll see I've changed the path on it for my computer, and
            • 49:30 - 50:00 then we'll call it the real stock price. Again we're taking just the one column, the values from iloc, so it's all the rows and just that one location, the stock open. Let's go ahead and run that so it's loaded in there. Then we create our inputs, and this should all look familiar, because it's the same thing we did before: for our data set total we do a little pandas concat starting from the dataset_train. Now remember, the end of the
            • 50:00 - 50:30 dataset_train is part of the data going in. Let's visualize that a little: here's our training data, let me just mark it tr for train, and it ran up to this value here, but each one of those values generated a row of columns, 60 across, where this value here corresponds to this one, and this one to this one, and so on, so we need these top 60 values to go into our new data
            • 51:30 - 52:00 because they're part of the input for the next windows, or actually the top 59 plus the new point. So that's what this first setup over here is doing: we're going in, we're taking the real stock price, and we're going to just
            • 52:00 - 52:30 take the data set test, load that in, and then the real stock price is dataset_test.iloc, so we're just looking at that one column, the open price. Then for our data set total we take pandas and concat our dataset_train Open and our dataset_test Open. This is one way you can reference these columns; we've referenced them a couple of different ways, earlier with the numeric 1:2 index, but we know it's labeled in the pandas DataFrame as Open, so pandas is great that way, lots of versatility there.
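Sketched out, that loading and concatenation step might look like the following; the CSV file name is illustrative (the video only says it's the test CSV with a path changed for the presenter's machine), and dataset_train is the training DataFrame loaded earlier in the tutorial:

```python
import pandas as pd

# Read the test CSV (file name/path here is illustrative)
dataset_test = pd.read_csv('Google_Stock_Price_Test.csv')

# The "real" prices we'll compare against: all rows, just the Open column
real_stock_price = dataset_test.iloc[:, 1:2].values

# Stack the training and test Open columns end to end,
# this time referencing the columns by their 'Open' label
dataset_total = pd.concat((dataset_train['Open'], dataset_test['Open']), axis=0)
```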
            • 52:30 - 53:00 And then we'll go back up here and run this. There we go. You'll notice this is the same as what we did before: we have our open data set where we concatenated our two data sets together, and we have inputs equal to the data set total from len(dataset_total) minus len(dataset_test) minus 60 onward, as values. So we're going to run this over all of them, and you'll see why this works, because normally when you run your test set versus your training set you run them completely separately, but when we graph this you'll see that we're just going to be
            • 53:00 - 53:30 looking at the part we didn't train it with, to see how well it tracks. We have inputs equals inputs.reshape, reshaping like we did before, and we're transforming our inputs, which, if you remember, scales them between zero and one. Finally we create our X_test: for i in range(60, 80) we append inputs from i minus 60 up to i, column 0, so each row is the previous 60 values from just the first column, our open column.
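Pulling that input-window construction together, a rough sketch (here sc is assumed to be the MinMaxScaler fitted on the training data earlier in the tutorial, and the range of 60 to 80 reflects the 20 test days in the narration):

```python
import numpy as np

# The last 60 training values plus the test period form the prediction inputs
inputs = dataset_total[len(dataset_total) - len(dataset_test) - 60:].values
inputs = inputs.reshape(-1, 1)
inputs = sc.transform(inputs)  # scale between 0 and 1 with the scaler fitted earlier

# Build a sliding 60-step window for each test day
X_test = []
for i in range(60, 80):
    X_test.append(inputs[i - 60:i, 0])
X_test = np.array(X_test)

# Reshape to (samples, timesteps, features) as the LSTM expects
X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))
```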
            • 53:30 - 54:00 Once again we take our X_test, convert it to a numpy array, and do the same reshape we did before. Then we get down to the final two lines, and here we have something new, so let me highlight those: predicted stock price equals regressor.predict(X_test), so we're predicting over these windows, which draw on both the training and the testing data, and then we take this prediction and inverse the transform. Remember we scaled everything between zero and one; a float between zero and one isn't going to mean very much to me to look at, I want the dollar amount, the cash value.
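Those final two lines would look roughly like this, again reusing the assumed scaler sc to convert the scaled predictions back into dollar values:

```python
# Predict on the test windows, then undo the 0-1 scaling to get dollar amounts
predicted_stock_price = regressor.predict(X_test)
predicted_stock_price = sc.inverse_transform(predicted_stock_price)
```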
            • 54:00 - 54:30 We'll go ahead and run this, and you'll see it runs much quicker than the training. That's what's so wonderful about these neural networks: once you put them together, it takes just a second to run the same network that took us, what, half an hour to train. Now we plot the data: we're going to plot what we think it's going to be, and we're going to plot it against the real data, what the Google stock actually did. So let's take a look at that in code and pull it up.
            • 54:30 - 55:00 We have our plt; if you remember from the very beginning, back up at the top, we imported matplotlib.pyplot as plt, and that's where that comes in. Down here we're going to plot, so let me get my drawing tool out again. plt is basically kind of like an object, and it's one of the things that always threw me when doing graphs in Python, because I always think you have to create an object and it loads that class in. In this case, plt is like a canvas you're putting stuff on, so if you've done HTML5 you'll
            • 55:00 - 55:30 recognize the canvas object; this is the canvas. We're going to plot the real stock price, what it actually was, and give it the color red, so it's in bright red, labeled Real Google Stock Price. Then we plot our predicted stock price in blue, labeled as the prediction. We'll give it a title, since it's always nice to give your graph a title, especially if you're going to present it to somebody, say your shareholders at the office, and the x label is going to be Time, because it's
            • 55:30 - 56:00 a time series; we didn't put the actual dates and times on here, but that's fine, we just know they're incremented by time. And then of course the y label is the actual stock price. plt.legend tells it to build the legend, so that the color red and Real Google Stock Price show up on there, and then plt.show gives us the actual graph. So let's go ahead and run this and see what it looks like.
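The plotting block being described, as a sketch (the exact title and label strings are illustrative):

```python
import matplotlib.pyplot as plt

plt.plot(real_stock_price, color='red', label='Real Google Stock Price')
plt.plot(predicted_stock_price, color='blue', label='Predicted Google Stock Price')
plt.title('Google Stock Price Prediction')  # always worth titling a graph you present
plt.xlabel('Time')                          # points are in time order, dates omitted
plt.ylabel('Google Stock Price')
plt.legend()                                # draws the red/blue legend described above
plt.show()
```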
            • 56:00 - 56:30 And you can see here we have a nice graph. Let's talk a little about this graph before we wrap up. Here's the legend I was telling you about, which is why we build it, to show which price is which, and we have our title and everything. You'll notice on the bottom we have a time sequence; we didn't put the actual dates in here. We could have, since we know what the dates are, and plotted this against dates, but we also know it's only the last piece of the data we're looking at, somewhere probably around here on the graph, I think about 20 percent of the data or probably less. We have the Google price, and the Google price has
            • 56:30 - 57:00 this little jump up and then down, and you'll see that the actual Google price, instead of turning down here, just didn't go up as high and didn't drop as low. So our prediction has the same pattern, but the overall value is pretty far off as far as the stock goes. Then again, we're only looking at one column, the open price; we're not looking at how much volume was traded. Like I was pointing out earlier, when we talk about stock, right off the bat there are six columns: open, high, low, close, volume, meaning
            • 57:00 - 57:30 the volume of shares traded, and then there's the adjusted open, adjusted high, adjusted low, and adjusted close, which use a special formula to estimate what the stock would really be worth based on the value of the stock. From there, there's all kinds of other data you can put in, so we're only looking at one small aspect, the opening price of the stock. As you can see, we did a pretty good job: this curve follows the real curve pretty well. It has little jumps in it, bins that don't quite match up, so this bin here does not quite match
            • 57:30 - 58:00 up with that bin there, but it's pretty darn close. We have the basic shape of it, and the prediction isn't too far off. You can imagine that as we add more data and look at different aspects within the stock domain, we should be able to get a better representation each time we drill in deeper. Of course, this took half an hour for my computer to train, so you can imagine that if I were running it across all those different variables it might take quite a bit longer to train, which is not so good for a quick tutorial
            • 58:00 - 58:30 like this. So we covered a lot of really cool things in this tutorial today. Hopefully you'll be able to get rich predicting stock, or maybe get a job if you're familiar with the stock and business domains, but you can see how this can be applied to all kinds of different tools and trades across different domains. The biggest takeaways are: we had an introduction to the RNN and its setup, we went over the popular neural networks that are out there, we discussed what a recurrent neural network is, and we went into one of the big problems with the RNN,
            • 58:30 - 59:00 the exploding gradient problem. We also discussed long short-term memory networks, the LSTM, which is part of the RNN setup we've been working with, and finally we went through and predicted the Google stock to see how it did. I want to thank you for joining us today. Again, my name is Richard Kirschner, one of the Simplilearn team; that's www.simplylearn.com. For more information please visit our website, feel free to ask any questions, and check out the different courses we have. You can also place comments below the YouTube
            • 59:00 - 59:30 video, and we will keep monitoring those and try to address them. Again, thank you for joining us today. Hi there, if you liked this video, subscribe to the Simplilearn YouTube channel and click here to watch similar videos. Turn it up and get certified: click here.