Math Is Fun Forum
  Discussion about math, puzzles, games and fun.   Useful symbols: ÷ × ½ √ ∞ ≠ ≤ ≥ ≈ ⇒ ± ∈ Δ θ ∴ ∑ ∫ • π ƒ -¹ ² ³ °

You are not logged in.

#1 This is Cool » A proof of the residue theorem (complex analysis) » 2013-05-15 21:23:27

Ivar Sand
Replies: 0

In short. Polar coordinates are used to prove the residue theorem for functions represented by a single Laurent series. The path of integration may have a general shape. The reader should be familiar with complex analysis.

Background. The proof emerged from the idea that the proof should contain terms with no contribution to the integral corresponding to the fact that there is no "height difference between the starting and ending points when walking around a mountain top and ending at the starting point". This, and a wish to use as little as possible of the theory of complex analysis, lead to the use of Laurent series and the introduction of a polar coordinate representation in the complex plane.

Complex analysis is the study of complex functions mapping the complex space ℂ to ℂ. A number z in ℂ has two components, a real component and an imaginary component, which is a real number multiplied by i where i=√-1. Nevertheless, algebraically, we regard z as we do a number in ℝ, i.e. as a single entity.

An analytic function is a function that can be represented by a Taylor series. A meromorphic function is a function that is analytic except at isolated points where it has poles. Locally around a point in ℂ, a meromorphic function can be represented by a Laurent series about that point. A Laurent series is a Taylor series to which terms with negative exponents are added. To represent a meromorphic function throughout its domain, several Laurent series may be required.

An integral of a complex function is like a line integral of a scalar real function mapping ℝ² into ℝ.

Laurent series. We identify a meromorphic function f(z) by its Laurent series about a point c,


   

where z ∈ D, an open disc with radius > 0 and centre at c. f is analytic everywhere on D except at c where f may have a pole[sup]1[/sup].

A Laurent series can be viewed as the sum of two series. The part with index n ranging from -∞ to -1 converges on ℂ - {c}, and the other part with n ranging from 0 to ∞ converges on D. These two series are power series, in 1/(z-c) and z-c respectively.

A Laurent series is integrable term by term along rectifiable paths within closed and bounded subsets of D - {c}.[sup]2[/sup]

Parameterised integration path. Let P be a rectifiable (i.e. P has a finite length)[sup]3[/sup], closed, and connected path, which is contained in a closed and bounded subset of D - {c}. Let a parameterisation of P be given,
    P: z=z(t), t∈[α,β]⊂ℝ, α<β,
    z(β)=z(α),
    the angular part of z(t)-c increases by 2π as t varies from α to β,
    z(t) is continuous,
    z(t) is piecewise continuously differentiable[sup]4[/sup] on [α,β].
The equation
    z(t) = r(t)e[sup]iθ(t)[/sup] + c
defines the radial function r(t) = |z(t)-c| and the angular function[sup]5[/sup] θ(t) = arg(z(t)-c). The properties of r(t) and θ(t) correspond to the ones of z(t) (stated without proof) and are,
    r(β)=r(α),
    r(t) is continuous,
    r(t) is piecewise continuously differentiable[sup]4[/sup] on [α,β]
and
    θ(β)=θ(α)+2π,
    θ(t) is continuous,
    θ(t) is piecewise continuously differentiable[sup]4[/sup] on [α,β].

Polar coordinates in ℂ. We calculate the derivative of z(t)-c, which equals z'(t),


   

   

   

r(t) in the denominator above is never 0 on P because P does not contain c.
We use the last expression for z'(t) above in the first of the following definitions,
    dz = z'(t)dt
    dr = r'(t)dt
    dθ = θ'(t)dt
and get,
    dz = z'(t)dt
    = r'(t)(z(t)-c)/r(t)dt + i(z(t)-c)θ'(t)dt
    = (z(t)-c)/r(t)r'(t)dt + i(z(t)-c)θ'(t)dt
    = (z-c)/r dr            + i(z-c)dθ
from which we identify the radial part, d[sub]r[/sub]z, and angular part, d[sub]a[/sub]z, of dz,

    dz = d[sub]r[/sub]z + d[sub]a[/sub]z
    d[sub]r[/sub]z = (z-c)/r dr
    d[sub]a[/sub]z = i(z-c)dθ.
Note in the expressions of the differentials of the complex polar coordinates, d[sub]r[/sub]z and d[sub]a[/sub]z, the corresponding differentials of the real polar coordinates, dr and dθ.

The residue theorem for a function represented by a single Laurent series.


   

The direction of integration is counterclockwise around the point c, see the description of P above.
Proof. Since P is a rectifiable path in a closed and bounded subset of D - {c}, we can integrate the Laurent series term by term,

   

Thus we are led to consider ∮[sub]P[/sub](z-c)[sup]n[/sup] dz, n= …, -2, -1, 0, 1, 2, …, and we need to prove,

   

where δ[sub]n,-1[/sub] = 1 if n = -1 otherwise 0.

Now we introduce polar coordinates in the left-hand side of (1),


   

   

   

   


We separate the n=-1 part from the rest,

   

   

We use the fact that n=-1 in the n=-1 part. Also, in the third integral, we make use of the parameterisation of P,

   

   

(Note that since r'(t) is only piecewise continuous on [α,β], the integrand of the integral above over [α,β] may not always be defined. This problem is circumvented by defining the value of this integral as the sum of the integrals with the same integrands as the integral above but with integration intervals that are the subintervals of [α,β] where r'(t) is continuous.[sup]4[/sup])
We observe ∮[sub]P[/sub] 1/r dr=0; this is an example of the "walking around a mountain" effect we talked about. Also, since we integrate once around c in a counterclockwise direction, ∮[sub]P[/sub]dθ=2π,

   

   

We observe that we have already gotten the result we are looking for in line one, so we aim at proving that the expression within the last parenthesis is zero,

   

   

r[sup]n[/sup] is the derivative of r[sup](n+1)[/sup]/(n+1),

   

   

Since r(t) and θ(t) are continuous and piecewise continuously differentiable on [α,β], so are r(t)^(n+1)/(n+1) and e^iθ(t)(n+1). This allows us to use integration by parts for the integral over [α,β], see Lemma 1 below,

   

   

   

   


   

   

   

Now we use r(β)=r(α) and θ(β)=θ(α)+2π,

   

   

   

   

   


   

   

   

e[sup]i2π(n+1)[/sup]=1 yields,

   

   

   

   

Two elements have opposite signs and cancel each other; they are another example of the "walking around a mountain" effect. We are left with only two integrals within the last parenthesis,

   

   

   

   

We find that these two integrals cancel each other,

   

   

   

   

   


   

   

   

   


   

   

   


This is the right-hand side of (1), and the proof is complete.

Appendix.
Lemma 1 (integration by parts).


   

where g and h are complex-valued functions.
Proof. We separate the real and the imaginary parts, g(t) = g[sub]1[/sub](t) + ig[sub]2[/sub](t) and h(t) = h[sub]1[/sub](t) + ih[sub]2[/sub](t) where g[sub]1[/sub](t), g[sub]2[/sub](t), h[sub]1[/sub](t), and h[sub]2[/sub](t) are real-valued functions.

We substitute the real-valued functions for the complex-valued functions on the left-hand side of the equation in the lemma,


   

   

   


   

   

   

The functions in the integrands of these four integrals are real-valued functions, and that allows us to use integration by parts,[sup]6[/sup]

   

   

   

   

   


   

   


   

   


   

   


   

This expression is identical to the right hand side of the equation in the lemma, and the proof is complete.

References.
1. Laurent series http://www.encyclopediaofmath.org/index.php/Laurent_series
2. Laurent series – Uniqueness http://en.wikipedia.org/wiki/Laurent_series#Uniqueness
3. Curve – Lengths of curves http://en.wikipedia.org/wiki/Rectifiable_path#Lengths_of_curves
4. Reinhold Remmert, Theory of Complex Functions http://books.google.no/books?id=uP8SF4jf7GEC&pg=PA173&lpg=PA173&dq=integrate+piecewise+continuously+differentiable&source=bl&ots=yMrRr6wE4v&sig=izem7hy8bHrw8TkKS-NEvhujMrs&hl=no&sa=X&ei=dpPWUbSzGoOv4QS_poC4DA&redir_esc=y#v=onepage&q=integrate%20piecewise%20continuously%20differentiable&f=false, p. 173-174.
5. Argument (complex analysis) http://en.wikipedia.org/wiki/Argument_(complex_analysis)
6. Integration by parts http://en.wikipedia.org/wiki/Integration_by_parts

#2 This is Cool » The divisibility by 3 rule – proof by induction » 2013-04-14 20:41:34

Ivar Sand
Replies: 0

In short. The divisibility by 3 rule is shown by using proof by induction. The objects of induction are sets consisting of three consecutive integers.

Background. While out walking, I tried from time to time through a period of several years to figure out how to prove that a number is divisible by 3 if and only if the sum of its digits is divisible by 3. I was not able to do that. It seemed obvious that proof by induction would have to be employed. The first step of proof by induction was easy and natural, but the next step seemed to be difficult. I found that it was necessary to sit down and concentrate in order to be able to write a proof.

Now I know that the proof is naively simple. The proof is also smart, I think, a proof that I would not be likely to find myself unless by accident.

The divisibility by 3 rule was obviously discovered by accident. I think one never suspected it to be true in the first place and therefore never started looking for a proof. If that had been the situation, my idea of there being two kinds of mathematicians, the exploring mathematicians who pioneer the results, and the smart mathematicians who write the elegant proofs that survive to the text books (see my thread "Exploring mathematicians and smart mathematicians"), might apply.

Admittedly, on this background it looks stupid to make an "exploring mathematician's" proof. Nevertheless I wanted to try to construct such a proof because I found the work challenging. Anyway, we get an exercise of doing an exploring trip into the realm of mathematics. Below, Proof 1 is the smart proof and Proof 2 is my proof.

Theorem 1. An integer is divisible by 3 if and only if the sum of its digits is divisible by 3.
Proof 1¹. I'll only sketch the proof here. One observes that any integer x can be written as a sum of products, every product being of the form of a digit of x multiplied by 1 or 10 or 100 or … 100…0, where the last number, 100…0, has n digits where n is the number of digits of x. One then observes that the numbers 10, 100, and … 100…0 can be written as respectively 9 + 1, 99 + 1, … 99…9 + 1. When these sums are multiplied by their respective digits of x and are summed, one gets one sum of products of factors including the factor 9 plus another sum. The first sum is clearly divisible by 3 so that the divisibility by 3 of x is the same as the divisibility by 3 of the second sum, which proves to be the sum of the digits of x.

Proof 2. We limit ourselves to non-negative integers. The proof for non-positive integers is similar.

First we observe that the theorem is true for a number consisting of one digit because, in this case, the number itself and the sum of its digit(s) have the same value and therefore have the same divisibility by 3 property, i.e. they are either both divisible by 3 or both not divisible by 3. Now, we intend to use proof by induction. We must decide what objects the proof by induction shall refer to. I suggest these objects to be sets of three consecutive integers. The first set consists of 0, 1, and 2, call this set S[sub]1[/sub]. The maximum element of S[sub]i[/sub], where i ≥ 1, is 1 less than the least element of S[sub]i+1[/sub] so that the union of all the S[sub]i[/sub] sets is the set of the non-negative integers, the set referred to in the theorem. Let us call S[sub]i[/sub] valid if the theorem is proved for all three numbers in it.

We have already seen that the theorem is true for each of the numbers in S[sub]1[/sub], so S[sub]1[/sub] is valid. Suppose S[sub]i[/sub] is valid for some i ≥ 1. We have to prove that S[sub]i+1[/sub] is valid, as well.

Definitions:
x is the least number in S[sub]i+1[/sub].
s() is a function that gives the sum of the digits of its non-negative integer argument.

Now take a look at "A proof diagram for the proof of Theorem 1" in the appendix.

Let us start with x, the least number of S[sub]i+1[/sub]. It is easy to see that x equals 3*i, which is clearly divisible by 3. We must show that the sum of the digits of x, s(x), also is divisible by 3.

x-3 in S[sub]i[/sub] is divisible by 3 because x is.

Since S[sub]i[/sub] is valid, s(x-3) is divisible by 3 because x-3 is divisible by 3.

Lemma 1 gives us that s(x) is divisible by 3 because s(x-3) is divisible by 3. This concludes the proof for x.

Let us consider x+1, the middle number of S[sub]i+1[/sub]. x+1 is 1 higher than a number divisible by 3 and is therefore itself not divisible by 3. We must show that s(x+1) is not divisible by 3 either.

We send x+1 down to S[sub]i[/sub] by subtracting 3: x+1 - 3 = x-2. x-2 is not divisible by 3 because x is divisible by 3 while the difference between x and x-2, 2, is not divisible by 3.

Since S[sub]i[/sub] is valid, s(x-2) is not divisible by 3 because x-2 is not divisible by 3.

Lemma 1 gives us that s(x+1) is not divisible by 3 because s(x-2) is not divisible by 3. This concludes the proof for x+1.

Let us consider x+2, the highest number of S[sub]i+1[/sub] and the last one we have to consider. Its case is similar to the case of x+1, and the proof is left to the reader. QED.

Appendix.
Lemma 1. If the sum of the digits of a non-negative integer x is divisible by 3 then the same holds for the sum of the digits of x + 3, and if the sum of the digits of x is not divisible by 3 then the same holds for the sum of the digits of x + 3.
Proof. Definitions:
d[sub]j[/sub]() is a function that produces digit number j of its non-negative integer argument where j = 1 for the least significant digit of the argument.
s() is a function that gives the sum of the digits of its non-negative integer argument.

We need to show that s(x+3) and s(x) are either both divisible by 3 or both not divisible by 3.

If d[sub]1[/sub](x) ≤ 6 then d[sub]1[/sub](x+3) = d[sub]1[/sub](x) + 3 and d[sub]j[/sub](x+3) = d[sub]j[/sub](x) for j > 1. Therefore, for such an x, s(x+3) = s(x) + 3. Since s(x+3) - s(x) is divisible by 3, s(x+3) and s(x) are either both divisible by 3 or both not divisible by 3, and this finishes the treatment of this case.

If d[sub]1[/sub](x) ≥ 7 then d[sub]1[/sub](x+3) = d[sub]1[/sub](x) - 7 and there is a carry equalling 1. Now suppose, without limiting the situation in any way, that d[sub]j[/sub](x) = 9 for j = 2, 3, … k and d[sub]k+1[/sub](x) < 9 for a k ≥ 1. Then d[sub]j[/sub](x+3) = 0 for j = 2, 3, … k and d[sub]k+1[/sub](x+3) = d[sub]k+1[/sub](x) + 1. We find s(x+3) = s(x) - 7 + (-9)*(k-1) + 1 = s(x) + (-9)*(k-1) - 6. Since s(x+3) - s(x) is divisible by 3, s(x+3) and s(x) share the same divisibility by 3 property, and this concludes the proof.

A proof diagram for the proof of Theorem 1.
Below is a proof diagram where y = x, x+1, or x+2. All y, y-3, s(y-3), and s(y) in the diagram have the same divisibility by 3 property.

  S[sub]i+1[/sub]:                   y             s(y)
                            |              ∧
Subtract 3 ------->   |              | <------- Lemma 1
                            ∨              |
  S[sub]i[/sub]:                     y-3 ----> s(y-3)
                                   ∧
                                   |
                              S[sub]i[/sub] is valid


Reference.
1. http://mathforum.org/k12/mathtips/division.tips.html.

#3 Re: This is Cool » Exploring mathematicians and smart mathematicians » 2013-03-25 00:02:12

Thanks for your post, cmowla.

To be honest, I guess I used the idea of there being smart mathematicians and exploring mathematicians as a way of marketing my proof, hehe. However, a Physics professor at the university where I studied did say to me once that he liked an A- (or B+) student, or not quite first-rate student, better than an A student because an A- student is more innovative than an A student, who is conservative. An A- student is familiar with making mistakes and is not afraid of making them. An A student, on the other hand, is afraid of making mistakes and does not dare say things that might prove to be false.

I have posted a comment on your post about the basic limit laws after having spent some time trying to understand your post. I hope you appreciate my comment although I understand that you are not into calculus at the moment.

Well, I think there is such a thing as mathematical maturity. I remember taking a course in linear algebra once. It was a spring course, and I had the feeling of understanding nothing before Easter and everything after Easter.

I cannot comment on Rubik's cube as I am not into that area.

Regarding your item [1]: I guess we have to accept that mathematics is difficult; there seems to be general acceptance of this. This causes papers to be poorly understood, which is sad. However, maybe someone someday will develop a mathematics language that would make mathematics easy to understand.

Regarding your item [2]: I think an idea behind listing in a paper the prerequisite knowledge is to suggest to the unqualified reader to stop reading the paper so as to avoid wasting his or her time reading it. By the way, I have now included the text "(calculus)" in the name of this thread so that readers not interested in calculus can skip the thread. (Oops, no, I didn't — it is not possible.)

By the way, perhaps there is a simple contribution in my post #1 in regard of the effort of making papers easy to read. There, I try to align in successive lines the same texts in mathematical expressions. That way, it is easy for the reader to see what is unchanged in a pair of lines so as to find these most interesting parts of the lines fast.

#4 Re: Help Me ! » Easier Proofs for the Basic Limit Laws? » 2013-03-19 21:08:44

In your post #1 you state that you evaluated the absolute value(s) in the second quote to produce the statement in the third quote. However, I have a counterexample. Set x = a + δ/2 and f(a + δ/2) = L - 2ε. In this case the statement in the second quote is false, whereas the statement in the third quote is true.

Also, in the proof of Sum Law, the theorem statement has ε whereas the last statement in the proof has ε[sub]2[/sub]. Shouldn't these two epsilons have been the same? The same or something similar goes for Constant Law and Product Law.

#5 This is Cool » Exploring mathematicians and smart mathematicians » 2013-03-11 23:36:35

Ivar Sand
Replies: 3

In short. This post presents the idea that there are two kinds of mathematicians, the exploring mathematicians who pioneer the results, and the smart mathematicians who follow and produce the elegant proofs that survive to the text books. A theorem with a proof that is considered smart is chosen from a text book, and then a more elementary proof is presented.

Background. One day, not so long ago, I started wondering about how to prove that a continuous function defined on a closed interval is bounded. I remembered that I had been thinking about the same problem many years earlier and that I had not been able to find a proof. I remembered that I had a runaround experience while trying to locate a point where the function approached infinity. So I gave up trying to construct a proof and looked up the proof in a text book. I found that the proof was smart, a proof that I could not be depend upon being able to find myself.

In spite of this, I continued to think that a straightforward proof ought to exist for this theorem. I fantasised about there being two kinds of mathematicians, the exploring mathematician and the smart mathematician; the exploring mathematicians being the pioneer who is the first one to prove a theorem. The exploring mathematicians would apply engineering-like methods to construct a proof, which might prove to be long and complicated. Then a smart mathematician would follow and replace the exploring mathematician's proof with a smarter and more elegant proof. The new proof would stand the test of time better than the original proof and survive to the text books. Whether this distinction has anything to do with reality, I don't know.

Nevertheless I wanted to try to construct a more elementary proof of the theorem. So I did (or tried to), and I posted this article to demonstrate how such an exploring mathematician's proof might look like. Below, Proof 1 is the "smart" proof and Proof 2 is my proof.

The theorem¹. A continuous function defined on a closed interval is bounded.
Proof 1. I'll only sketch the proof here. One assumes the contrary — that the continuous function is unbounded. One divides the interval into two equal parts. One observes that the function has to be unbounded on at least one of the two parts. One chooses one of the parts, divides that interval into two equal parts, chooses the part where the function is unbounded, and so on. The lengths of the chosen intervals approach zero, and the leftmost endpoints of the intervals approach a point in the function's interval of definition. One uses the concept of supremum to prove that the sequence of endpoints converges. The function's continuity at that point is then shown to imply that the function cannot be unbounded on an interval around that point after all.

Comment. One needs to verify that the unboundedness property and the continuity property collide. I think a difficulty lies in the fact that the unboundedness property is the property of an area or interval whereas the continuity property is the property of a point. (The unboundedness property is not the property of a point in the interval of definition because if a function were unbounded or infinite at a point it would be undefined there.) I guess the smartness of Proof 1 lies in the fact that the function's unboundedness property is included in every step of the process. So to speak, one keeps the unboundedness property in hand all the way while one searches for a point in order to use the function's continuity there.

Proof 2. Let the function's name be f and its closed interval of definition be [a,b]. As in Proof 1, we assume the opposite that f is unbounded, and aim at a contradiction.

We start looking for f's unboundedness at a. Since f is defined at a it is finite there, so a is not the point we are searching for. If a = b f cannot be unbounded, so we deduce that b must be greater than a.

Now let the argument of f, call it x, increase continuously from the point a while we look for a point where |f(x)| = |f(a)| + 1. Since f is continuous, f(x) varies smoothly as x increases – there are no jumps in the function value. The first point we encounter that satisfies |f(x)| = |f(a)| + 1 we call x[sub]1[/sub]. We proceed from x[sub]1[/sub] moving continuously towards higher values of x looking for a point where |f(x)| = |f(x[sub]1[/sub])| + 1 and call the first such point x[sub]2[/sub]. Proceeding in this way we get a sequence of points {x[sub]1[/sub], x[sub]2[/sub], …}.

Suppose we reach b, and we find that our sequence contains a finite number of elements. If the sequence is empty, f is bounded on [a,b] by |f(a)| + 1, otherwise f is bounded by |f(x[sub]n[/sub])| + 1 where x[sub]n[/sub] is the last element in {x[sub]1[/sub], x[sub]2[/sub], …}. In both cases f is bounded. We conclude that the number of points in {x[sub]1[/sub], x[sub]2[/sub], …} is infinite.

The function value at a point x[sub]i[/sub] in {x[sub]1[/sub], x[sub]2[/sub], …} for an i > 1 satisfies |f(x[sub]i[/sub])| = |f(x[sub]i-1[/sub])| + 1. We find |f(x[sub]i[/sub])| = |f(a)| + i for i ≥ 1 which implies that |f(x[sub]i[/sub])| approaches ∞ as i increases.

Since the monotonic sequence {x[sub]1[/sub], x[sub]2[/sub], …} is bounded (by b), it converges². Call the point of convergence x[sub]0[/sub].

Choose an ε > 0. We have:
1. Since f is continuous at x[sub]0[/sub], there is a δ > 0 so that |f(x) - f(x[sub]0[/sub])| < ε for all x satisfying |x - x[sub]0[/sub])| < δ. Now
|f(x) - f(x[sub]0[/sub])| < ε ⇒
|f(x) - f(x[sub]0[/sub])| + |f(x[sub]0[/sub])| < ε + |f(x[sub]0[/sub])| ⇒
|f(x) - f(x[sub]0[/sub]) + f(x[sub]0[/sub])| ≤ |f(x) - f(x[sub]0[/sub])| + |f(x[sub]0[/sub])| < ε + |f(x[sub]0[/sub])| ⇒
|f(x)|                     ≤ |f(x) - f(x[sub]0[/sub])| + |f(x[sub]0[/sub])| < ε + |f(x[sub]0[/sub])| ⇒
|f(x)| < ε + |f(x[sub]0[/sub])|
We conclude:
|f(x)| < |f(x[sub]0[/sub])| + ε whenever |x - x[sub]0[/sub])| < δ (1).
2. Since {x[sub]1[/sub], x[sub]2[/sub], …} converges towards x[sub]0[/sub], there is an integer m1 so that
|x[sub]i[/sub] - x[sub]0[/sub]| < δ whenever i > m1 (2).
3. Since |f(x[sub]i[/sub])| approaches ∞ with i, there is an integer m2 so that
|f(x[sub]i[/sub])| > |f(x[sub]0[/sub])| + ε whenever i > m2 (3).

Now choose m = max (m1,m2). This gives us new versions of (2) and (3):
|x[sub]i[/sub] - x[sub]0[/sub]| < δ whenever i > m (2')
|f(x[sub]i[/sub])| > |f(x[sub]0[/sub])| + ε whenever i > m (3').

The continuity condition (1) expresses that f is bounded by |f(x[sub]0[/sub])| + ε on the open interval (x[sub]0[/sub] - δ, x[sub]0[/sub] + δ) whereas the unboundedness or divergence condition (3') expresses that |f(x[sub]i[/sub])| is greater than this bound for an i satisfying i > m, call it j. Now since x[sub]j[/sub] lies in (x[sub]0[/sub] - δ, x[sub]0[/sub] + δ) according to the convergence condition (2'), we deduce from (1):
|f(x[sub]j[/sub])| < |f(x[sub]0[/sub])| + ε
and from (3'):
|f(x[sub]j[/sub])| > |f(x[sub]0[/sub])| + ε.
These two inequalities are inconsistent, and this contradiction allows us to conclude that our original assumption of f being unbounded is false. QED.

References.
1. Tom M. Apostol, Calculus, Volume 1, One-Variable Calculus, with an Introduction to Linear Algebra, Second edition, 1967, p. 150.
2. Ibid., p. 381.

#6 This is Cool » Behind the scenes of proof by contradiction » 2013-03-03 21:40:40

Ivar Sand
Replies: 0

In short. We start with an example: A ∧ B ⇒ C, C ⇒ D, … D ⇒ E, where B is an assumed condition and E is a contradictory condition. According to proof by contradiction, we conclude ¬B. We demonstrate the truth of this conclusion by proving B ⇒ C ∧ D ∧ … E and showing that this implication reduces to ¬B.

Background. When we construct proofs, we often use the technique of assuming something to be true while suspecting it is false and then proceeding to derive a contradiction which allows us to conclude (and prove) that what we assumed to be true is instead false.

I often wondered how the logic of this works. Maybe I learned it some long time ago, but I am not sure. As a matter of fact, I think I never learned it. Therefore, I thought it might be fun to do a bit of exploring in order to find out what the logic of proof by contradiction looks like, and this thread is the result.

Outline. Section A shows an example of the use of proof by contradiction, and section B reveals the logic behind the results in section A.

Proven statements are identified by numbers within parentheses.

A. An example of the use of proof by contradiction.
I guess there are many ways a proof by contradiction can go. I have chosen to demonstrate one of them.

1. First we start with a result we have proved, the statement or condition A:
A (1)
2. Then we assume B to be true. We deduce something from A and B:
A ∧ B ⇒ C (2)
3. Then we deduce something from C:
C ⇒ D (3)
4. Then another deducing step:
D ⇒ E (4)
where E is the contradictory condition we are seeking. We conclude from this:
¬B (5)

B. An analysis of the proof by contradiction in section A.
From the combined implications in section A, we here produce what we might call the grand formula of proof by contradiction. Then we proceed by showing that this formula reduces to (5).
1. First we use (1) in (2). Note: I use = instead of ⇔ since either means the same on Bool.
A ∧ B ⇒ C =
true ∧ B ⇒ C =
B ⇒ C (6)

2. Next we combine (6) and (3) into one condition:
(B ⇒ C) ∧ (C ⇒ D) (7)
We manipulate (7) by using (3) and A ⇒ B ≡ ¬A ∨ B:
(B ⇒ C) ∧ (C ⇒ D) =
(¬B ∨ C) ∧ (C ⇒ D) =
¬B ∧ (C ⇒ D) ∨ C ∧ (C ⇒ D) =
¬B ∧ (true)  ∨ C ∧ (¬C ∨ D) =
¬B           ∨ (C ∧ ¬C ∨ C ∧ D) =
¬B           ∨ (false ∨ C ∧ D) =
¬B           ∨          C ∧ D =
B ⇒ C ∧ D (8)
Let us compare (8) and (7) to see what these deductions have accomplished.
C ⇒ D in (7) has the effect on B ⇒ C in (7) that "∧ D" has been appended to B ⇒ C as B ⇒ C appears in (8). Also, it looks as if "C) ∧ (C ⇒" in (7) has been replaced by "C ∧" in (8).

3. Now we combine (8) and (4):
(B ⇒ C ∧ D) ∧ (D ⇒ E) (9)
By using (4), we treat (9) in the same way as we did (7). Essentially, the only difference between (9) and (7) is that in (9) B implies a double condition, C ∧ D, while in (7) a single condition, C. This complication proves to pose no problem though:
(B ⇒ C ∧ D) ∧ (D ⇒ E) =
(¬B ∨ C ∧ D) ∧ (D ⇒ E) =
¬B ∧ (D ⇒ E) ∨ C ∧ D ∧ (D ⇒ E) =
¬B ∧ (true)   ∨ C ∧ D ∧ (¬D ∨ E) =
¬B            ∨ (C ∧ D ∧ ¬D ∨ C ∧ D ∧ E) =
¬B            ∨ (false      ∨ C ∧ D ∧ E) =
¬B            ∨               C ∧ D ∧ E =
B ⇒ C ∧ D ∧ E (10)
(10) is an example of what I call the grand formula of proof by contradiction.

4. Now I would like to study the contradictory condition E a bit closer so as to make the inconsistency evident, i.e. by producing "false". In practice, we don't usually do this, and we didn't in section A either. Assume that E has the form:
E ≡ E1 ∧ E2 (11)
where E1 and E2 contradict each other. Assume that we have proved:
E1 ⇒ ¬E2 (12)
Here, the inconsistency is clear since E2 in (11) and ¬E2 in (12) contradict each other per definition.
Now let us continue in order to produce "false". We manipulate the right hand side of (11) by using (12):
E1 ∧ E2 =
E1 ∧ E2 ∧ true =
E1 ∧ E2 ∧ (E1 ⇒ ¬E2) =
E1 ∧ E2 ∧ (¬E1 ∨ ¬E2) =
(E1 ∧ E2 ∧ ¬E1 ∨ E1 ∧ E2 ∧ ¬E2) =
(false         ∨ false) =
false
We use this to substitute "false" for E1 ∧ E2 in (11) and get:
E = false (13)

5. Now we use (13) to substitute "false" for E in (10):
B ⇒ C ∧ D ∧ E =
B ⇒ C ∧ D ∧ false =
B ⇒ false =
¬B ∨ false =
¬B
which is the same as (5), and this finishes our analysis.

#7 Re: This is Cool » Topsy-turvy continuity » 2013-02-03 23:26:59

Now I have updated post #4. I have removed the word differentiable and also made other changes in order to make the mathematics more precise.

#8 Re: This is Cool » Topsy-turvy continuity » 2013-01-31 21:16:55

I see, Bob. I agree that f is not differentiable at x[sub]0[/sub]. I'm sorry about that. I propose to replace in post #4:
"Suppose f is differentiable at a point x[sub]0[/sub] and that its derivative at x[sub]0[/sub] is infinite"
by:
"Suppose f is not differentiable at a point x[sub]0[/sub] because its derivative is infinite there"

#9 Re: This is Cool » Topsy-turvy continuity » 2013-01-30 22:44:14

Well, anonimnystefy, let's see. We go to the Wikipedia page named "Continuous function", the paragraph "Weierstrass definition (epsilon-delta) of continuous functions" and the sentence "For any number ε > 0, …". This sentence expresses something like: given an ε > 0, a δ > 0 exists …, where ε is part of a condition in the codomain and δ is part of a condition in the domain. Topsy-turvy continuity, on the other hand, looks like: given a δ > 0, a k > 0 exists …, where δ is part of a condition in the domain and k is part of a condition in the codomain.

To summarise:
The continuity of a function f at x[sub]0[/sub] is defined:
Given an ε > 0, a δ > 0 exists so that ∣f(x) - f(x[sub]0[/sub])∣ < ε whenever ∣x - x[sub]0[/sub]∣ < δ.
The topsy-turvy continuity of a function f at x[sub]0[/sub] is defined:
Given a δ > 0 a k > 0 exists so that |f(x) - f(x[sub]0[/sub])| ≤ k |x - x[sub]0[/sub]| whenever |x - x[sub]0[/sub]| < δ.

#10 Re: This is Cool » Topsy-turvy continuity » 2013-01-30 22:07:24

No, Bob, I haven't. I suppose you are thinking about whether continuity implies topsy-turvy continuity. I don't think this is true so a counter-proof would be called for. I believe the problem is connected to points in f's domain where f has an infinite derivative. Let's have a closer look.

Let f have an infinite derivative at a point x[sub]0[/sub]. This means that (f(x) - f(x[sub]0[/sub])) / (x - x[sub]0[/sub]) approaches infinity as x approaches x[sub]0[/sub], or, in other words, that for every M > 0 a δ > 0 exists so that (f(x) - f(x[sub]0[/sub])) / (x - x[sub]0[/sub]) > M whenever |x - x[sub]0[/sub]| < δ and x ≠ x[sub]0[/sub]. (We need the condition x ≠ x[sub]0[/sub] to avoid 0/0, which is undefined.) Now we take the absolute value of the left hand side of this inequality and multiply both sides by |x - x[sub]0[/sub]|, and we get
|f(x) - f(x[sub]0[/sub])| > M |x - x[sub]0[/sub]| whenever |x - x[sub]0[/sub]| < δ and x ≠ x[sub]0[/sub] (1)

Now, topsy-turvy continuity at x[sub]0[/sub] means: given a δ' > 0, a k > 0 exists so that
|f(x) - f(x[sub]0[/sub])| ≤ k |x - x[sub]0[/sub]| whenever |x - x[sub]0[/sub]| < δ' (2)

Let us try to prove that f is not topsy-turvy continuous at x[sub]0[/sub]. We do that by proposing (2) to be true and see if we can derive a contradiction.

We choose M = k + 1 in (1). Regard the point x[sub]1[/sub] = x[sub]0[/sub] + min(δ',δ)/2. Since |x[sub]1[/sub] - x[sub]0[/sub]| < δ and |x[sub]1[/sub] - x[sub]0[/sub]| < δ' x[sub]1[/sub] fits into both (1) and (2). From (1) we get:
|f(x[sub]1[/sub]) - f(x[sub]0[/sub])| > (k + 1) |x[sub]1[/sub] - x[sub]0[/sub]|,
and from (2) we get:
|f(x[sub]1[/sub]) - f(x[sub]0[/sub])| ≤ k |x[sub]1[/sub] - x[sub]0[/sub]|.
These two inequalities are contradictory to each other, and we conclude that our proposition of (2) being true is false. Therefore f is not topsy-turvy continuous at x[sub]0[/sub].

#11 This is Cool » Topsy-turvy continuity » 2013-01-30 00:09:39

Ivar Sand
Replies: 7

Background. As you would now, the continuity[sup]1‚2[/sup] of a function f at x[sub]0[/sub], a point in the domain of f, is defined using a condition in f's codomain first ("given ε > 0 … ∣f(x) - f(x[sub]0[/sub])∣ < ε") and then proceeds with a condition in f's domain ("… then there exists a δ > 0 … ∣x - x[sub]0[/sub]∣ < δ").

Once a friend of mine remarked that he felt that the definition of continuity is unnatural in that it starts in a function's codomain instead of in its domain. I remember that I had the same feeling myself when I learned about continuity. I guess that the natural approach, as indicated by my friend, was adopted first, and that the standard definition came later and proved to be more fruitful than the natural approach in some way. (This seems indeed to have been the case[sup]2[/sup].)

I thought that it might be fun to see what such a natural approach would look like. (If you disagree this is the place to stop reading this thread.) So I put my mathematical explorer's boots on and dived into the jungle of mathematics. I wanted to see if I could build myself a natural definition of continuity and compare this concept with the existing definition of continuity. I decided to call this natural definition of continuity topsy-turvy continuity because it does things in the opposite way as compared to the existing definition of continuity. The result follows below.

Definition. A function f is said to be topsy-turvy continuous at a point x[sub]0[/sub] if for a given δ > 0 a k > 0 exists such that whenever |x - x[sub]0[/sub]| < δ, it is true that |f(x) - f(x[sub]0[/sub])| ≤ k |x - x[sub]0[/sub]|.

Comments:
- This is just a simple definition of topsy-turvy continuity.
- Note that k may depend on x[sub]0[/sub] and δ.
- Note that ≤ is used instead of <. This is to cover the case x = x[sub]0[/sub] (a < would have given 0 < 0).

Now we must check that this definition relates to the standard definition of continuity in a proper way. A topsy-turvy continuous function should be continuous. However, the topsy-turvy continuity concept may be weaker than standard continuity so that a continuous function may not be topsy-turvy continuous at all points. Therefore, topsy-turvy continuity should imply continuity. Indeed it does, as is stated in the following theorem.

Theorem. A topsy-turvy continuous function f is continuous.
Proof: Let ε > 0 be given. Choose some point x[sub]0[/sub] in the domain of f. Since f is topsy-turvy continuous at x[sub]0[/sub], a positive k exists so that when |x - x[sub]0[/sub]| < ε it is true that |f(x) - f(x[sub]0[/sub])| ≤ k |x - x[sub]0[/sub]|.

Define δ = min (ε, ε/k). We observe that δ > 0. Since we have δ ≤ ε, the topsy-turvy continuity condition |f(x) - f(x[sub]0[/sub])| ≤ k |x - x[sub]0[/sub]| still holds with the same constant k when |x - x[sub]0[/sub]| < δ.

When |x - x[sub]0[/sub]| < δ it is also true that |x - x[sub]0[/sub]| < ε/k since δ ≤ ε/k, and multiplying both sides by k yields k |x - x[sub]0[/sub]| < k*ε/k = ε. Using this in |f(x) - f(x[sub]0[/sub])| ≤ k |x - x[sub]0[/sub]| gives |f(x) - f(x[sub]0[/sub])| < ε still maintaining |x - x[sub]0[/sub]| < δ.

We conclude that for an ε > 0 a δ > 0 exists so that |f(x) - f(x[sub]0[/sub])| < ε whenever |x - x[sub]0[/sub]| < δ, and this means that f is continuous at x[sub]0[/sub]. QED.

References:
1 Tom M. Apostol, Calculus, Volume 1, One-Variable Calculus, with an Introduction to Linear Algebra, Second edition, 1967, p. 130-131.
2 The wikipedia page for Continuous function.

#13 Re: This is Cool » How to reduce the time checking if two lists contain the same elements » 2013-01-23 21:22:13

bobbym Thanks for your welcome, bobbym. I am glad to be here.

scientia Thanks for your post, scientia. I am not as much familiar with the various kinds of mappings as you are. As you probably have observed, I do not use mappings in my proof, I simply use the fact that if A is a subset of B and B is a subset of A then A and B are identical.

I'll see if I can squeeze my brain into solving one of your exercises.

#14 This is Cool » How to reduce the time checking if two lists contain the same elements » 2013-01-22 23:51:43

Ivar Sand
Replies: 3

Suppose you have two lists of elements, and you are wondering if they contain the same elements. You observe that the lists have the same lengths. You check that every element of the first list exists among the elements of the second list. To complete the process, you must find out if every element of the second list exists among the elements of the first list. Now you stop and think: is it necessary to do this last step? No, it is not because the following theorem exists:

Theorem: If two finite sets have the same number of elements and one set is a subset of the other, the two sets are equal.

Proof: We name the sets A and B, and let A be the subset. We argue by contradiction: We assume that the theorem is false, meaning that A and B are not equal. Since A is a subset of B, there must be an element of B which is not in A because otherwise A and B would be equal. Since B contains A and an element that does not exist in A, the number of elements of B ≥ the number of elements of A + 1. However, this is against the requirement that the number of elements of B = number of elements of A. We have thus arrived at a contradiction and have proved the theorem.

This theorem has some practical benefit. Suppose, for example, that you are considering two lists of file names. They are sorted differently, and you suspect them to contain the same elements. To see that the lists have the same lengths is often easy. To compare them, if the lists have e.g. 3 elements, glancing back and forth between them a few times should be sufficient, but if the lists have 5-6 elements or more, this eye method becomes unreliable and a more systematic method, like one of the two methods indicated above, must be used. However, both of these methods have the weakness that to compare the two lists element by element is often a complicated, tedious, and time-consuming task. It is the amount of time spent comparing the lists that you can use the theorem to reduce.

Board footer

Powered by FluxBB