Please go here

http://en.wikipedia.org/wiki/Linear_approximation

Perhaps we can now proceed from known properties of a Taylor series.

You might consider this, which proves more rigorously that the linear approximation is the best one.

http://higheredbcs.wiley.com/legacy/col … ctione.pdf

Also, and I hate doing this to anyone, if you go here you will see that this is also somewhat of an intuitive definition.

http://math.stackexchange.com/questions … few-doubts

I like saying that the tangent line and the function agree exactly at a and have the same rate of change there, so it is clear that near a they are very close. This is a property of any Taylor polynomial.

Finally, you can go here and see the proof. It is difficult and long-winded for such a simple concept. His error analysis is somewhat better than mine but my wardrobe is better.

]]>We are talking about this

y - f(a) = f '(a)(x - a)?

Yes.

It is the best linear approximation around a

Why?

At an intuitive level this is clear, but I'm interested in the proof of WHY it is the best linear approximation. Besides the fact that all calculus books hammer this into your head a million times, there has to be a WHY: an explanation of why any other line provides a worse approximation.

I'm being meticulous on purpose.
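One way to make the "why" precise is sketched below. It assumes that f is differentiable at a and that "best" means the error shrinks faster than x - a as x approaches a; that reading of "best" is an assumption here, not something fixed by the thread. The symbols L_m and m are introduced only for this sketch.

```latex
\begin{align*}
L_m(x) &= f(a) + m\,(x - a)
  && \text{(any line through } (a, f(a)) \text{ with slope } m\text{)} \\
\frac{f(x) - L_m(x)}{x - a} &= \frac{f(x) - f(a)}{x - a} - m \\
\lim_{x \to a} \frac{f(x) - L_m(x)}{x - a} &= f'(a) - m
\end{align*}
```

The limit is zero exactly when m = f '(a). For every other slope the error behaves like a nonzero multiple of x - a near a, and a line that misses the point (a, f(a)) has a nonzero error at a itself. This uses only the definition of the derivative, so no Taylor series is needed.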

]]>We are talking about this

y - f(a) = f '(a)(x - a)?

As I understand it, that arises from the point-slope formula. It is the best linear approximation around a. This does suggest truncating the Taylor series around a as a means to derive that formula. Taylor and Maclaurin approximations have exactly that property of being the best approximations near the center a. I do not think questions of circularity apply, but I will look for a more geometric solution.
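A quick numerical check of that claim, as a minimal sketch: it compares the error of the tangent line with the error of a second line through the same point, each divided by the distance from a. The function exp, the point a = 1, the helper error_ratio, and the competing slope are all arbitrary choices made for illustration.

```python
import math

def error_ratio(f, a, slope, h):
    """Return |f(a + h) - L(a + h)| / |h| for the line L through (a, f(a)) with the given slope."""
    line_value = f(a) + slope * h
    return abs(f(a + h) - line_value) / abs(h)

a = 1.0
f = math.exp
tangent_slope = math.exp(a)         # f'(a) when f = exp
other_slope = tangent_slope + 0.1   # hypothetical competing line, slightly off

# As h shrinks, the tangent's ratio goes to 0; the other line's ratio settles near 0.1.
for h in (1e-1, 1e-2, 1e-3):
    print(h, error_ratio(f, a, tangent_slope, h), error_ratio(f, a, other_slope, h))
```

The second column keeps shrinking with h, while the third stays close to the slope mismatch of 0.1, which is the sense in which no other line through the point does as well near a.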

]]>Now, this looks like we are getting into some heavy machinery for what seems to be a pretty basic and fundamental problem: showing that the derivative gives the best linear approximation. There must be a simpler way of proving this fundamental fact. Anyway, Taylor series and the like depend on the derivative and its properties, namely that it is THE best linear approximation. We would be using circular logic if we were to prove this simple fact using such machinery.

Do you have any idea how to prove this using basic tools? Geometry or something? I'm at a loss.

Thanks again for this conversation

]]>But I see a problem in this: if I remember correctly, you use least squares to do a best fit to a finite set of data points. How do you overcome the fact that an interval has infinitely many points?

That depends on a lot of things. Ordinary least squares is typically used for discrete data. Fitting continuous data or functions can be done using collocation, Taylor series, or Fourier series.

For instance,

1 - x^2 / 2 is a good fit for cos(x) on the interval [-1, 1].
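To put a number on "good fit", the sketch below samples the interval and reports the largest gap between cos(x) and 1 - x^2 / 2. The helper name taylor_cos2 and the grid size are arbitrary choices for this example.

```python
import math

def taylor_cos2(x):
    """Degree-2 Taylor polynomial of cos about 0."""
    return 1.0 - x * x / 2.0

# Sample [-1, 1] on a uniform grid (the grid size is an arbitrary choice).
n = 2001
xs = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
max_error = max(abs(math.cos(x) - taylor_cos2(x)) for x in xs)
print(max_error)
```

The largest gap occurs at the endpoints and is roughly 0.04, which sits just under the standard remainder bound x^4 / 24 evaluated at x = 1.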

]]>But I see a problem in this: if I remember correctly, you use least squares to do a best fit to a finite set of data points. How do you overcome the fact that an interval has infinitely many points?

]]>I have never heard of doing that.

Me neither. I was just shooting ideas of what came to mind.

In Numerical Analysis we try to come up with the greatest error in some interval

But that's just one way of looking at the error, I guess.

What does "error of the whole interval" mean?

First of all, I didn't say "the error OF the whole interval", I said "the error ON the whole interval". My idea was this: I know what the error between F and L is at a single point. Then I wondered: how do I measure the error between F and L over the whole interval I? A plausible answer (although of course it may be wrong, which is why I'm asking here) was to take the error between F and L at every single point of the interval and add them up; this is exactly the integral over I of |F - L|. This is what I will call the error between F and L over the interval I. By looking at this integral, I'm assigning to the line L an "error" over the whole interval I.
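Both ways of measuring the error mentioned in this thread, the greatest error on the interval and the integral of |F - L| over it, can be computed side by side. The sketch below does this for F = sin and its tangent line at c = 0 on shrinking intervals [c - r, c + r]; the function, the point, the radii, and the helper interval_errors are assumptions chosen only for illustration, and the integral is approximated with a simple midpoint rule.

```python
import math

def interval_errors(f, line, lo, hi, n=10_000):
    """Approximate max |f - line| and the integral of |f - line| on [lo, hi] (midpoint rule)."""
    width = (hi - lo) / n
    mids = [lo + (i + 0.5) * width for i in range(n)]
    gaps = [abs(f(x) - line(x)) for x in mids]
    return max(gaps), sum(gaps) * width

c = 0.0

def tangent(x):
    """Tangent line to sin at c = 0, which is simply y = x."""
    return math.sin(c) + math.cos(c) * (x - c)

for r in (0.5, 0.25, 0.125):
    sup_err, int_err = interval_errors(math.sin, tangent, c - r, c + r)
    print(r, sup_err, int_err)
```

Under either measure the error of the tangent line falls much faster than r itself as the interval shrinks. On one fixed interval a different line may do slightly better by either measure, so "best" for the tangent line is a statement about what happens as the interval closes in on the point.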

I suspect we may have a misunderstanding. I'm not completely sure I have been able to explain conceptually what I'm trying to understand.

]]>I have never heard of doing that. In Numerical Analysis we try to come up with the greatest error in some interval. What does "error of the whole interval" mean?

]]>Can you be more specific?

How do you measure the error between a function F and a line L on an interval I around a point c? At each point x in I, I can measure the error as just the absolute value of the difference: |F(x) - L(x)|. But that's just the error at the point x. What's the error on the whole interval? The integral of |F - L| over the interval I?

Thanks.

]]>Of course, we must speak of the error between the function values and the values of the line in the interval I, but how do we formalize this?

That is exactly how I do it.

]]>Thanks very much.

]]>