Abstract: This article is a verbatim transcript of Lesson 7, "Cost Function", from Chapter 2, "Linear Regression with One Variable", of Andrew Ng's Machine Learning course. I transcribed it word for word while watching the videos so I could review it later, and I am sharing it here. If you find any errors, corrections are warmly welcomed and sincerely appreciated. I hope it is helpful for your studies as well.

In this video (article), we'll define something called the cost function. This will let us figure out how to fit the best possible straight line to our data.


[Slide 1]

In linear regression we have a training set like that shown here. Remember our notation: $m$ is the number of training examples, so maybe $m = 47$. And the form of the hypothesis, which we use to make predictions, is this linear function: $h_\theta(x) = \theta_0 + \theta_1 x$. To introduce a little bit more terminology, $\theta_0$ and $\theta_1$, these $\theta_i$, are what I call the parameters of the model. What we are going to do in this video (article) is talk about how to go about choosing these two parameter values, $\theta_0$ and $\theta_1$.
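The lecture itself gives no code (the course's programming exercises use Octave/MATLAB), but as a minimal sketch in Python, the hypothesis might look like this; the function name and the parameter values in the example are my own, not from the lecture:

```python
def hypothesis(theta0: float, theta1: float, x: float) -> float:
    """Univariate linear hypothesis: h_theta(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x

# Hypothetical example: predict the price of a 1250-square-foot house
# with made-up parameters theta0 = 50, theta1 = 0.12 (price in $1000s).
print(hypothesis(50.0, 0.12, 1250.0))  # -> 200.0
```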

[Slide 2]

With different choices of the parameters $\theta_0$ and $\theta_1$, we get different hypotheses, different hypothesis functions. I know some of you will probably already be familiar with what I'm going to do on this slide, but just to review, here are a few examples. If $\theta_0 = 1.5$ and $\theta_1 = 0$, then the hypothesis function will look like this. Right, because your hypothesis function will be $h(x) = 1.5 + 0 \cdot x$, it is flat at 1.5. If $\theta_0 = 0$ and $\theta_1 = 0.5$, then the hypothesis will look like this. And it should pass through the point $(2, 1)$, since you now have $h(x) = 0.5x$, which looks like that. And if $\theta_0 = 1$ and $\theta_1 = 0.5$, then we end up with a hypothesis that looks like this. Let's see, it should pass through the point $(2, 2)$, like so. And this is my new $h_\theta(x)$. All right, well, you remember that this is $h_\theta(x)$, but as a shorthand, sometimes I just write $h(x)$.
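To make those three examples concrete, the short sketch below (mine, not the lecture's) evaluates each hypothesis at $x = 2$ and reproduces the points described above:

```python
def h(theta0: float, theta1: float, x: float) -> float:
    return theta0 + theta1 * x

# The three parameter choices from the slide, evaluated at x = 2:
for theta0, theta1 in [(1.5, 0.0), (0.0, 0.5), (1.0, 0.5)]:
    print(f"theta0={theta0}, theta1={theta1}: h(2) = {h(theta0, theta1, 2.0)}")
# -> 1.5 (flat line), 1.0 (passes through (2, 1)), 2.0 (passes through (2, 2))
```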

[Slide 3]

In linear regression we have a training set like maybe the one I've plotted here. What we want to do is come up with values for the parameters $\theta_0$ and $\theta_1$, so that the straight line we get out of this corresponds to a straight line that somehow fits the data well, like maybe that line over there. So how do we come up with values $\theta_0$, $\theta_1$ that correspond to a good fit to the data? The idea is that we're going to choose our parameters $\theta_0$ and $\theta_1$ so that $h_\theta(x)$, meaning the value we predict on input $x$, is at least close to the values $y$ for the examples in our training set. So, in our training set we're given a number of examples where we know $x$, the size of the house, and we know the actual price $y$ it was sold for. So, let's try to choose values for the parameters so that, at least on the training set, given the $x$'s in the training set, we make reasonably accurate predictions for the $y$ values. Let's formalize this. In linear regression, what we're going to do is solve a minimization problem. So, I'm going to write: minimize over $\theta_0$, $\theta_1$. And I want this to be small, right, I want the difference between $h(x)$ and $y$ to be small. And one thing I might do is try to minimize the squared difference between the output of the hypothesis and the actual price of the house. Okay? So, let's fill in some details. Remember that I was using the notation $(x^{(i)}, y^{(i)})$ to represent the $i$-th training example. So, what I really want is to sum over my training set, from $i = 1$ through $m$, the squared difference between the prediction of my hypothesis when it is given the size of house number $i$ as input, and the actual price that house number $i$ was sold for. That is, I want to minimize

$$\sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$

And just to remind you of the notation, $m$ here is the size of my training set, right, so $m$ is my number of training examples (the hash sign # on the slide is the abbreviation for "number" of training examples). Okay? And to make the math a little bit easier, I'm actually going to look at $\frac{1}{2m}$ times that. So, we're going to try to minimize my average error, which means we're going to minimize

$$\frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$

Putting the constant one half in front just makes some of the math a little easier. So, minimizing one half of something should give you the same values of the parameters $\theta_0$, $\theta_1$ as minimizing the original function. And just to make sure this equation is clear, right?
This expression in here, $h_\theta(x^{(i)})$, is our usual, right? That's equal to $\theta_0 + \theta_1 x^{(i)}$. And this notation, minimize over $\theta_0$ and $\theta_1$, means: find me the values of $\theta_0$ and $\theta_1$ that cause this expression to be minimized. And this expression depends on $\theta_0$ and $\theta_1$. Okay? So, just to recap, we're posing this problem as: find me the values of $\theta_0$ and $\theta_1$ so that the average, really one over $2m$ times the sum of squared errors between my predictions on the training set and the actual values of the houses in the training set, is minimized. So, this is going to be my overall objective function for linear regression. And just to rewrite this out a little bit more cleanly, what I'm going to do, by convention, is define a cost function, which is going to be exactly the formula that I have up here:

$$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$

And what I want to do is minimize over $\theta_0$ and $\theta_1$ my function $J(\theta_0, \theta_1)$. Just to write this out, this is my cost function. This cost function is also called the squared error function, or sometimes the squared error cost function. So why do we take the squares of the errors? It turns out that the squared error cost function is a reasonable choice and will work well for most problems, for most regression problems. There are other cost functions that will work pretty well, but the squared error cost function is probably the most commonly used one for regression problems. Later in this class we'll also talk about alternative cost functions, but the choice that we just had should be a pretty reasonable thing to try for most linear regression problems. Okay, so, that's the cost function. So far we've just seen a mathematical definition of the cost function, and in case this function $J(\theta_0, \theta_1)$ seems a little bit abstract and you still don't have a good sense of what it's doing, in the next couple of videos (articles) we're actually going to go a little bit deeper into what the cost function $J$ is doing and try to give you better intuition about what it's computing and why we want to use it.
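As an illustration of the definition above (again my own sketch in Python; the lecture gives no code, and the toy training set below is made up), here is the squared error cost $J(\theta_0, \theta_1)$ computed directly from its formula:

```python
def cost(theta0: float, theta1: float, xs: list[float], ys: list[float]) -> float:
    """Squared error cost: J = (1/2m) * sum_i (h(x_i) - y_i)^2."""
    m = len(xs)  # number of training examples
    squared_errors = ((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys))
    return sum(squared_errors) / (2 * m)

# Toy training set: the line y = 0.5x fits these points exactly,
# so J(0, 0.5) is 0, and any other parameters give J > 0.
xs = [1.0, 2.0, 3.0]
ys = [0.5, 1.0, 1.5]
print(cost(0.0, 0.5, xs, ys))  # -> 0.0
print(cost(0.0, 0.0, xs, ys))  # -> (0.25 + 1.0 + 2.25) / 6 ≈ 0.5833
```

Note that the $\frac{1}{2}$ factor only scales $J$; as the transcript says, it does not change which $(\theta_0, \theta_1)$ minimizes it.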

<end>
