Discussion about math, puzzles, games and fun.

**LRKM** (Member)
- Registered: 2013-01-31
- Posts: 6

To keep it simple, take the case of a function from R to R and a point c in R. What does it mean exactly that the derivative gives THE best linear approximation? I guess it means that, in the set of all lines passing through (c, f(c)), the line with slope f'(c), i.e. the tangent line, is the one that provides the best approximation on a small interval I around c. But how do we decide when one line is a "better approximation" than another? Of course, we must speak of the error between the function values and the values of the line in the interval I, but how do we formalize this?

Thanks very much.

**bobbym** (Administrator)
- From: Bumpkinland
- Registered: 2009-04-12
- Posts: 106,337

> Of course, we must speak of the error between the function values and the values of the line in the interval I, but how do we formalize this?

That is exactly how I do it.

**In mathematics, you don't understand things. You just get used to them.** **If it ain't broke, fix it until it is.** **No great discovery was ever made without a bold guess.**

**LRKM** (Member)

Hello,

Can you be more specific?

How do you measure the error between a function F and a line L on an interval I around a point c? At each point x in I, I can measure the error as the absolute value of the difference |F(x) - L(x)|. But that is just the error at the point x. What is the error on the whole interval? The integral of |F - L| over I?
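As a quick numerical sanity check of this idea (a hypothetical example, not from the thread: F(x) = x², c = 1, and a small interval around c), one can approximate the integral of |F - L| by a Riemann sum and compare the tangent line against a line with a slightly different slope:

```python
import numpy as np

def integrated_abs_error(f, line, a, b, n=200_001):
    # Riemann-sum approximation of the integral of |f - line| over [a, b].
    x = np.linspace(a, b, n)
    return np.mean(np.abs(f(x) - line(x))) * (b - a)

f = lambda x: x**2              # example function
c, h = 1.0, 0.1                 # point of tangency and half-width of I

tangent = lambda x: f(c) + 2.0 * c * (x - c)          # slope f'(c) = 2c
other   = lambda x: f(c) + (2.0 * c + 0.1) * (x - c)  # slightly wrong slope

e_tan = integrated_abs_error(f, tangent, c - h, c + h)
e_oth = integrated_abs_error(f, other, c - h, c + h)
print(e_tan < e_oth)  # the tangent line has the smaller integrated error
```

On this example the tangent's integrated error shrinks like h³ as the interval shrinks, while any other line's shrinks only like h², so the tangent wins on every sufficiently small interval.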

Thanks.

*Last edited by LRKM (2013-01-31 21:05:24)*

**bobbym** (Administrator)

Hi;

I have never heard of doing that. In Numerical Analysis we try to come up with the greatest error in some interval. What does "the error of the whole interval" mean?

**LRKM** (Member)

> I have never heard of doing that.

Me neither. I was just shooting ideas of what came to mind.

> In Numerical Analysis we try to come up with the greatest error in some interval.

But that's just one way of looking at the error, I guess.

> What does "the error of the whole interval" mean?

First of all, I didn't say "the error OF the whole interval"; I said "the error ON the whole interval". My idea was this: I know what the error between F and L is at a single point. Then I wondered how to measure the error between F and L over the whole interval I. A plausible answer (although of course it may be wrong, which is why I'm asking here) was to take the error between F and L at every single point of the interval and add them all up; this is exactly the integral of |F - L| over I. That is what I call the error between F and L over the interval I. By looking at this integral, I am assigning an "error" over the whole interval I to the line L.

I suspect we may have a misunderstanding. I'm not completely sure I have been able to explain conceptually what I'm trying to understand.

*Last edited by LRKM (2013-01-31 21:18:36)*

**bobbym** (Administrator)

I think that has already been done. You take the sum of the squares of the errors; this is called least squares. Squares are used because they are simpler than absolute values for computation.
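For a discrete data set this is easy to demonstrate (a hypothetical example using NumPy's `polyfit`, which solves the polynomial least-squares problem):

```python
import numpy as np

# Noisy samples of the line y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
noise = np.array([0.05, -0.02, 0.03, -0.04, 0.01])
y = 2.0 * x + 1.0 + noise

# Degree-1 least-squares fit: minimizes the sum of squared errors.
slope, intercept = np.polyfit(x, y, 1)
print(slope, intercept)  # close to 2 and 1
```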

**LRKM** (Member)

Yes. This is what I meant when I said "But that's just one way of looking at the error". You can define your error to be whatever you want, as long as it is sensible. You can define the error to be the sum of the absolute values of the differences (like I first said), but then you run into non-differentiability problems because of the absolute value, and the minimization tools of calculus cannot be applied. That is why one uses least squares: the sum of squares is differentiable.

But I see a problem with this: if I remember correctly, least squares is used to do a best fit to a finite set of data points. How do you overcome the fact that an interval has infinitely many points?

*Last edited by LRKM (2013-01-31 21:40:49)*

**bobbym** (Administrator)

Hi;

> But I see a problem with this: if I remember correctly, least squares is used to do a best fit to a finite set of data points. How do you overcome the fact that an interval has infinitely many points?

That depends on a lot of things. Least squares is for discrete data. Fitting a continuous function can be done using collocation, Taylor series, or Fourier series.

For instance:

1 - x^2 / 2 is a good fit for cos(x) on the interval [-1, 1].
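That claim is easy to check numerically (a small sketch; 1 - x²/2 is the degree-2 Taylor polynomial of cos at 0):

```python
import numpy as np

# Worst-case error of 1 - x^2/2 against cos(x) on [-1, 1].
x = np.linspace(-1.0, 1.0, 10_001)
err = np.max(np.abs(np.cos(x) - (1.0 - x**2 / 2.0)))
print(err)  # about 0.04, attained at the endpoints x = ±1
```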

**LRKM** (Member)

Good to learn, thanks!

Now, it looks like we are getting into some heavy machinery for what seems to be a pretty basic and fundamental problem: showing that the derivative gives the best linear approximation. There must be a simpler way of proving this fundamental fact. Anyway, Taylor series and the like depend on the derivative and its properties, namely that it is THE best linear approximation, so we would be using circular logic if we proved this simple fact with such machinery.

Do you have any idea how to prove this using basic tools? Geometry or something? I'm at a loss.

Thanks again for this conversation

*Last edited by LRKM (2013-01-31 21:56:52)*

**bobbym** (Administrator)

Hi;

We are talking about this

y - f(a) = f'(a)(x - a)?

As I understand it, that arises from the point-slope formula. It is the best linear approximation around a. This suggests truncating the Taylor series around a as a means of deriving that formula. Taylor and Maclaurin approximations have exactly that property of being the best approximations at the center a. I do not think questions of circularity apply, but I will look for a more geometric solution.
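In symbols, the connection runs through Taylor's theorem with remainder (a standard statement, sketched here under the assumption that f is twice differentiable near a):

```latex
f(x) = f(a) + f'(a)(x - a) + R_1(x),
\qquad R_1(x) = \frac{f''(\xi)}{2}(x - a)^2
\quad \text{for some } \xi \text{ between } a \text{ and } x.
```

Dropping the remainder leaves y = f(a) + f'(a)(x - a), and the remainder is quadratically small in (x - a), which is what makes the truncation a good local approximation.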

**LRKM** (Member)

> We are talking about this
>
> y - f(a) = f'(a)(x - a)?

Yes.

> It is the best linear approximation around a.

Why?

Obviously at an intuitive level this is clear; I'm interested in a proof of WHY it is the best linear approximation. Besides the fact that all calculus books hammer this into your head a million times, there has to be a WHY: an explanation of why any other line provides a worse approximation.

I'm being meticulous on purpose.

*Last edited by LRKM (2013-01-31 22:11:02)*

**bobbym** (Administrator)

So far all I can say is that Taylor's theorem already states that somewhere, so you would be using a theorem to prove the linear approximation.

Please go here

http://en.wikipedia.org/wiki/Linear_approximation

perhaps we can now proceed from known properties of a Taylor series.

You might consider this, which proves more rigorously that the linear approximation is the best one.

http://higheredbcs.wiley.com/legacy/col … ctione.pdf

Also, and I hate doing this to anyone, but if you go here you will see that this is also somewhat of an intuitive definition.

http://math.stackexchange.com/questions … few-doubts

I like saying that the tangent and the function agree exactly at a and have the same rate of change there, so it is obvious that close to a they are very close. This is a property of any Taylor polynomial.

Finally, you can go here and see the proof. It is difficult and long-winded for such a simple concept. His error analysis is somewhat better than mine, but my wardrobe is better.
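For completeness, here is the short version of the argument that such proofs make precise (a sketch using only the definition of the derivative):

```latex
\text{For a line } L_m(x) = f(a) + m(x - a), \text{ the error is }
E_m(x) = f(x) - f(a) - m(x - a).
\text{By the definition of the derivative,}
\lim_{x \to a} \frac{E_m(x)}{x - a} = f'(a) - m,
\text{so } E_m(x) = o(x - a) \text{ precisely when } m = f'(a).
```

In that sense the tangent line is "best": it is the unique line whose error vanishes faster than the distance to a itself, while every other line has an error comparable to |x - a|.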
