Posts by Ivar Sand

Ivar Sand · This is Cool

An integer is said to be divisible by 3 if the remainder of its division by 3 is zero.

The divisibility by 3 rule says that an integer is divisible by 3 if and only if its cross sum is divisible by 3.

This post treats Proof 1 of Theorem 1 of the post The divisibility by 3 rule – proof by induction (b).
The treatment of Proof 1 in that post was sketchy.
Here, we present the theorem in two ways, an informal version and a formal version.

The informal version does not require any knowledge of mathematics except that the reader should have learned the divisibility by 3 rule. The formal version, on the other hand, requires some knowledge of mathematics.

The informal version can be presented orally. I imagine that this could happen at a class reunion of a class where the divisibility by 3 rule has been taught. The proof of the divisibility by 3 rule may not be a popular subject, though, because everybody does not like mathematics. But one could wait until there is a pause in the conversations and then say, "I have something to say" and then just roll it out – don't let anyone get away!

Informal version.
Consider the number 123.
We shall focus on the structure of this number.
We have:
123 = 1*100 + 2*10 + 3
We use that 100 = 99+1 and 10 = 9+1:
123 = 1*(99 + 1) + 2*(9+1) + 3
We separate out the parts on the right hand side with 9's in them and put them first:
123 = (1*99 + 2*9) + (1*1 + 2*1) + 3
We call (1*99 + 2*9) the 9-part of 123 and (1*1 + 2*1) the 1-part:
123 = 9-part of 123 + 1-part of 123 + 3
We observe that the sum of the 1-part of 123 and 3, noting that 3 is the last digit of 123, is the cross sum of 123:
123 = 9-part of 123 + cross sum of 123
If we now stop and think for a minute we see that our analysis works for any integer x with any number of digits.
Therefore, we have shown that:
x = 9-part of x + cross sum of x
We apply the divisibility condition to either side of this equation:
the divisibility of x by 3 = the divisibility of (9-part of x + cross sum of x) by 3
We observe that the 9-part of x is divisible by 3 so that this part does not contribute to the remainder of the division of (9-part of x + cross sum of x) by 3, so it does not contribute to the divisibility of (9-part of x + cross sum of x) by 3 either.
Therefore, we can remove the 9-part of x from the equation to get:
the divisibility of x by 3 = the divisibility of (cross sum of x) by 3
Voila, our "proof" is finished!

Formal version.
Let x be a positive integer, x >= 0.
Let x have n digits, n >= 1.
Let xi be digit number i of x, where the numbering goes from right to left and the index value of the utmost digit to the right is 0.

Define
"y"⌃i = y...y
where i >= 1 and the integer y...y consists of i digits all of which are equal to y.

We have that:
10⌃i = 10...0
where i >= 0 and 10...0 has i zeroes.
Note that:
10⌃i - 1 = 9...9
where i >= 1 and the integer 9...9 has i digits so that:
10⌃i - 1 = "9"⌃i
according to the definition of "y"⌃i above.

Theorem.
Let x be a positive integer, x >= 0.
Let x have n+1 digits where n >= 0.
Let the digits of x be signified by xi where 0 <= i <= n.

Then x is divisible by 3 if and only if the cross sum of x is divisible by 3.

Proof.
First note the following:
The mod function or the modulo operation (a) produces the remainder of a division.
It is termed: x mod y, where x is the dividend and y is the divisor.
So, that an integer x is divisible by 3 is equivalent to saying that x mod 3 = 0.
Therefore, we define:
the divisibility of x by 3 = (x mod 3 = 0).

We have:
x = sum(i=0,... n) xi * 10⌃i
= sum(i=0,... n) xi * ((10⌃i - 1) + 1)
= sum(i=0,... n) xi * (10⌃i - 1) + sum(i=0,... n) xi * (1)
= sum(i=1,... n) xi * "9"⌃i + sum(i=0,... n) xi
= 3 * sum(i=1,... n) xi * "3"⌃i + sum(i=0,... n) xi as we observe that "9"⌃i = 3 * "3"⌃i

We now introduce the mod function on both sides of the preceding equation:
x mod 3 = (3 * sum(i=1,... n) xi * "3"⌃i + cross sum of x) mod 3 as we note that sum(i=0,... n) xi equals the cross sum of x
= (cross sum of x) mod 3 as the expression 3 * sum(i=1,... n) xi * "3"⌃i does not contribute to the remainder of the division by 3 (as implied by the mod function) because this expression is divisible by 3

Now, suppose that the cross sum of x is divisible by 3.
This means that (cross sum of x) mod 3 = 0.
The equation:
x mod 3 = (cross sum of x) mod 3
gives that x mod 3 too is equal to zero.
Therefore, x is divisible by 3.
So, we have proved that the cross sum of x being divisible by 3 implies that x is divisible by 3.

Next, suppose that x is divisible by 3.
This means that x mod 3 = 0.
The equation:
x mod 3 = (cross sum of x) mod 3
read from right to left gives that (cross sum of x) mod 3 too is equal to zero.
Therefore, the cross sum of x is divisible by 3.
So, we have proved that x being divisible by 3 implies that the cross sum of x is divisible by 3.

All in all, we have proved that x being divisible by 3 is equivalent to the cross sum of x being divisible by 3.
QED

References.
(a) Modulo https://en.wikipedia.org/wiki/Modulo.
(b) The divisibility by 3 rule – proof by induction https://www.mathisfunforum.com/viewtopic.php?id=19234.

Ivar Sand · This is Cool

Thanks for the information, Jai Ganesh.

The hide tag does not seem to work because it produces an extra line feed so that the hide tag appears on the next line. Therefore, it is not possible to use the hide tag inside a line, which is what I need it for. The url tag is ok, though.

Ivar Sand · This is Cool

(See also Behind the scenes of proof by contradiction at https://www.mathisfunforum.com/viewtopic.php?id=19059)

This post treats the same subject as Behind the scenes of proof by contradiction except in a bit more general way as it treats any number of implications while Behind the scenes of proof by contradiction treats only three. Also, the formulation here uses a sequence of theorems, one theorem referencing the next theorem and I hope that this formulation is more understandable than the formulation found in Behind the scenes of proof by contradiction.

Since these two posts treat the same subject, the introduction found in Behind the scenes of proof by contradiction works for this post as well, so it is not repeated here. Therefore, please look in Behind the scenes of proof by contradiction for the introduction to the subject.

A lemma that treats the special case that the two contradictory propositions are each other's negation is included at the end.

First a note on inconsistency or contradiction:
Two propositions (a) X and Y are inconsistent or contradictory if and only if:
X ∧ Y => false    see reference (b)
Further investigation gives:
(X ∧ Y => false) =
(¬(X ∧ Y) ∨ false) = by the rule of inference (d)
¬(X ∧ Y)    by the law of identity (c1)
which expresses that X and Y cannot be true at the same time, and furthermore:
¬(X ∧ Y) =
¬X ∨ ¬Y    by the first of De Morgan's laws (f)
which expresses that either X or Y is false. (The difference between inconsistency and contradiction is that the Boolean values of two contradictory propositions are different whereas inconsistent propositions may both be false (e).)

Theorem 1:
Let n be a natural number, n>=1.
Let A B1 B2 ... Bn be propositions.
Let
A => B1,
B1 => B2,
...
Bn-1 => Bn
all be true, i.e. proven propositions or theorems.
Let A and Bn be inconsistent or contradictory propositions.

Then:
¬A
is true.

Proof:
We have that:
A => B1,
B1 => B2,
...
Bn-1 => Bn
are true.
Then:
A ⇒ B1 ∧ B2 ... ∧ Bn
is true by Theorem 2.

We also have that A and Bn are inconsistent or contradictory (b) which means that (see the note on inconsistency or contradiction above):
¬A ∨ ¬Bn
is true.

We combine these two propositions using a logical and to get:
(A ⇒ B1 ∧ B2 ... ∧ Bn) ∧ (¬A ∨ ¬Bn)
which is true by the definition of conjunction (g) since both (A ⇒ B1 ∧ B2 ... ∧ Bn) and (¬A ∨ ¬Bn) are true.
We manipulate this proposition:
(A ⇒ B1 ∧ B2 ... ∧ Bn) ∧ (¬A ∨ ¬Bn) =
(¬A ∨ B1 ∧ B2 ... ∧ Bn) ∧ (¬A ∨ ¬Bn) = by the rule of inference (d)
¬A ∨ (B1 ∧ B2 ... ∧ Bn ∧ ¬Bn) =    by the distributivity law (c1) (switch the sides of the equation of the distributivity law)
¬A ∨ (B1 ∧ B2 ... ∧ Bn-1 ∧ false) =    by the complements law (c2)
¬A ∨ (false) =    by the repeated use of the definition of conjunction (g)
¬A    by the identity law (c1)
Accordingly, we have proved that:
¬A
is true since the left-hand side of the first equality in the sequence of equalities is true.
This completes the proof.
QED

Theorem 2:
Let n be a natural number, n>=1.
Let A B1 B2 ... Bn be propositions.
Let
A => B1,
B1 => B2,
...
Bn-1 => Bn
all be true.

Then:
A ⇒ B1 ∧ B2 ... ∧ Bn
is true.

Proof:
We use proof by induction (h).
The proof is trivial for n = 1, the base case of proof by induction.

Now for the induction step of proof by induction.
Suppose that the theorem holds for n = i where i is a natural number, i >= 1.
We need to prove that the theorem holds for n=i+1.

We have that:
A ⇒ B1 ∧ B2 ... ∧ Bi and
Bi ⇒ Bi+1
are true propositions.
We need to prove that:
A ⇒ B1 ∧ B2 ... ∧ Bi ∧ Bi+1
is true.

We combine the propositions A ⇒ B1 ∧ B2 ... ∧ Bi and Bi ⇒ Bi+1:
(A ⇒ B1 ∧ B2 ... ∧ Bi) ∧ (Bi ⇒ Bi+1)
which is true by the definition of conjunction (g) since both (A ⇒ B1 ∧ B2 ... ∧ Bi) and (Bi ⇒ Bi+1) are true
We manipulate this proposition:
(A ⇒ B1 ∧ B2 ... ∧ Bi) ∧ (Bi ⇒ Bi+1) =
(¬A ∨ B1 ∧ B2 ... ∧ Bi) ∧ (Bi ⇒ Bi+1) =    by the rule of inference (d)
¬A ∧ (Bi ⇒ Bi+1) ∨ B1 ∧ B2 ... ∧ Bi ∧ (Bi ⇒ Bi+1) = by the distributivity law (c2) (the commutativity law (c2) is also needed)
¬A ∧ (true) ∨ B1 ∧ B2 ... ∧ Bi ∧ (Bi ⇒ Bi+1) =    as Bi ⇒ Bi+1 is true
¬A ∨ B1 ∧ B2 ... ∧ Bi ∧ (Bi ⇒ Bi+1) =    by the identity law (c2)
¬A ∨ B1 ∧ B2 ... ∧ Bi ∧ (¬Bi ∨ Bi+1) =    by the rule of inference (d)
¬A ∨ B1 ∧ B2 ... ∧ Bi ∧ ¬Bi ∨ B1 ∧ B2 ... ∧ Bi ∧ Bi+1 = by the distributivity law (c2)
¬A ∨ B1 ∧ B2 ... ∧ false ∨ B1 ∧ B2 ... ∧ Bi ∧ Bi+1 = by the complements law (c2)
¬A ∨ false ∨ B1 ∧ B2 ... ∧ Bi ∧ Bi+1 =    by the repeated use of the definition of conjunction (g)
¬A ∨ B1 ∧ B2 ... ∧ Bi ∧ Bi+1 =    by the identity law (c1)
A ⇒ B1 ∧ B2 ... ∧ Bi ∧ Bi+1    by the rule of inference (d)
This means that:
A ⇒ B1 ∧ B2 ... ∧ Bi ∧ Bi+1
is true since the left-hand side of the first equality in the sequence of equalities is true.
This completes the induction procedure and the proof of the theorem.
QED

Lemma:
Let n be a natural number, n>=1.
Let A B1 B2 ... Bn be propositions.
Let
A => B1,
B1 => B2,
...
Bn-1 => Bn
all be true.
Let
Bn = ¬A
be true.

Then:
¬A
is true.

Proof:
We have that the propositions:
A => B1,
B1 => B2,
...
Bn-1 => Bn
are all true.

We intend to prove that A and Bn are inconsistent or contradictory (b).
We do this by evaluating an expression for inconsistency or contradiction.
We choose the expression ¬(A ∧ Bn), see the note on inconsistency or contradiction above.
We have:
   ¬(A ∧ Bn) =
   ¬(A ∧ ¬A) = as Bn = ¬A
   ¬false =    by the law of complements (c2)
   true by reference (i)
from which we conclude that A and Bn are inconsistent or contradictory.

Then it follows by Theorem 1 that:
¬A
is true.
QED

References.
(a) Proposition https://en.wikipedia.org/wiki/Proposition.
(b) Contradiction, https://en.wikipedia.org/wiki/Contradiction.
Note: the formulation "a proposition is a contradiction if false can be derived from it, using the rules of the logic" is a bit more general than the one we use here.
(c) Boolean algebra (structure), https://en.wikipedia.org/wiki/Boolean_a … structure), the table under the header Definition.
(c1) the left column of the table.
(c2) the right column of the table.
(d) Material implication (rule of inference), https://en.wikipedia.org/wiki/Material_ … inference).
(e) 9.1: Recognizing Inconsistency and Contradiction https://human.libretexts.org/Bookshelve … tradiction.
(f) De Morgan's laws https://en.wikipedia.org/wiki/De_Morgan%27s_laws, see for instance the first equation under the header Boolean algebra.
(g) Boolean algebra https://en.wikipedia.org/wiki/Boolean_algebra, the table under the header Basic operations.
(h) Mathematical induction https://en.wikipedia.org/wiki/Mathematical_induction.
(i) Boolean domain https://en.wikipedia.org/wiki/Boolean_domain, the information under the header Generalizations.

Ivar Sand · This is Cool

Example of adding two numbers a and b with the result s:

c 101
a 404
b 678
s 1082

Here, c stands for "carry"(2).

The general schema for adding two non-negative integers(1) a and b looks like this:

where n is a natural number(1).
Note that the sequence(5) of digits(3) for c (the carry) is placed one position to the left with respect to the sequences of digits for a and b.

Below, we shall benefit from the definition of x modulo 10(4):
x mod 10 = x1
and from the definition of x // 10:
x // 10 = (x - x1) / 10
where x stands for a non-negative integer(1) with x1 as its least significant digit.
We note that:
x mod 10 picks out the least significant digit of x
x // 10 shifts the digits of x one position to the right and discards the digit to the right of the decimal point

Let n be a natural number(1).

Let the non-negative integers(1) c, a, b, and s be represented by their sequences(5) of digits(3):

so that the values of a, b, and s can be expressed, respectively:

The number of digits is set to n for both a and b.
If one of a and b should at the outset have less than n number of digits, its number of digits is increased by extending its sequence of digits to the left with digits equal to 0 until its number of digits equals n.

According to the general schema for adding two non-negative integers, we define the digits of c like this:

and we define the digits of s like this:

Then:

in other words:

We take a closer look at s:

as stated in the theorem.

Let n be a natural number(1).
Let x be a non-negative integer(1) represented by its sequence(5) of digits(3):

Let:

Let:

Then:

We have:

as stated in the lemma.

References:
(1) Natural number (non-negative integers) https://en.wikipedia.org/wiki/Natural_number
(2) Carry (arithmetic) https://en.wikipedia.org/wiki/Carry_(arithmetic)
(3) Numerical digit https://en.wikipedia.org/wiki/Numerical_digit
(4) Modulo Operation https://www.mathsisfun.com/numbers/modulo.html
(5) Sequences https://www.mathsisfun.com/algebra/sequences-series.html

Ivar Sand · This is Cool

We intend to prove that 2 + 2 = 4.

We start by defining the sequence of natural numbers (1):
1, 2, 3, ... s,t, ...
Here, s, and t represent two consecutive natural numbers.

So far in our presentation, the natural numbers are only symbols.
We define their values by introducing the value equation, which defines uniquely the value of every natural number except for 1 in terms of the value of 1:
for every natural number s: t = s + 1
The natural number s should go through the natural numbers sequentially from 1 so that the right-hand side of the value equation is defined for every value of s.

To make the definition of the value equation complete, we should define the +-operator.
However, we choose instead to introduce an axiom of the natural numbers, the associativity axiom (2):
for every natural number a, b, and c: a + (b + c) = (a + b) + c

We are now in a position to prove:

Theorem:
2 + 2 = 4
Proof:
We operate on the left-hand side of the equation 2 + 2 = 4:
2 + 2
= 2 + (2) as a parenthesis can be put around a singular natural number
= 2 + (1 + 1) as 2 = 1 + 1 by the value equation
= (2 + 1) + 1 by the associativity axiom
= (3) + 1 as 3 = 2 + 1 by the value equation
= 3 + 1 as a parenthesis around a singular natural number can be removed
= 4 as 4 = 3 + 1 by the value equation
We see from this that we have proved that 2 + 2 = 4.
This is what we intended to prove, so the proof is complete.

References:
1 Natural number https://en.wikipedia.org/wiki/Natural_number
2 Associative Laws https://www.mathsisfun.com/associative-commutative-distributive.html

Ivar Sand · This is Cool

In short. Polar coordinates are used to prove the residue theorem for functions represented by a single Laurent series. The path of integration may have a general shape. The reader should be familiar with complex analysis.

Background. The proof emerged from the idea that the proof should contain terms with no contribution to the integral corresponding to the fact that there is no "height difference between the starting and ending points when walking around a mountain top and ending at the starting point". This, and a wish to use as little as possible of the theory of complex analysis, lead to the use of Laurent series and the introduction of a polar coordinate representation in the complex plane.

Complex analysis is the study of complex functions mapping the complex space ℂ to ℂ. A number z in ℂ has two components, a real component and an imaginary component, which is a real number multiplied by i where i=√-1. Nevertheless, algebraically, we regard z as we do a number in ℝ, i.e. as a single entity.

An analytic function is a function that can be represented by a Taylor series. A meromorphic function is a function that is analytic except at isolated points where it has poles. Locally around a point in ℂ, a meromorphic function can be represented by a Laurent series about that point. A Laurent series is a Taylor series to which terms with negative exponents are added. To represent a meromorphic function throughout its domain, several Laurent series may be required.

An integral of a complex function is like a line integral of a scalar real function mapping ℝ² into ℝ.

Laurent series. We identify a meromorphic function f(z) by its Laurent series about a point c,

where z ∈ D, an open disc with radius > 0 and centre at c. f is analytic everywhere on D except at c where f may have a pole(1).

A Laurent series can be viewed as the sum of two series. The part with index n ranging from -∞ to -1 converges on ℂ - {c}, and the other part with n ranging from 0 to ∞ converges on D. These two series are power series, in 1/(z-c) and z-c respectively.

A Laurent series is integrable term by term along rectifiable paths within closed and bounded subsets of D - {c}.(2)

Parameterised integration path. Let P be a rectifiable (i.e. P has a finite length)(3), closed, and connected path, which is contained in a closed and bounded subset of D - {c}. Let a parameterisation of P be given,
P: z=z(t), t∈[α,β]⊂ℝ, α<β,
z(β)=z(α),
the angular part of z(t)-c increases by 2π as t varies from α to β,
z(t) is continuous,
z(t) is piecewise continuously differentiable(4) on [α,β].
The equation
z(t) = r(t)e^(iθ(t)) + c
defines the radial function r(t) = |z(t)-c| and the angular function(5) θ(t) = arg(z(t)-c). The properties of r(t) and θ(t) correspond to the ones of z(t) (stated without proof) and are,
r(β)=r(α),
r(t) is continuous,
r(t) is piecewise continuously differentiable(4) on [α,β]
and
θ(β)=θ(α)+2π,
θ(t) is continuous,
θ(t) is piecewise continuously differentiable(4) on [α,β].

Polar coordinates in ℂ. We calculate the derivative of z(t)-c, which equals z'(t),

r(t) in the denominator above is never 0 on P because P does not contain c.
We use the last expression for z'(t) above in the first of the following definitions,
dz = z'(t)dt
dr = r'(t)dt
dθ = θ'(t)dt
and get,
dz = z'(t)dt
= r'(t)(z(t)-c)/r(t)dt + i(z(t)-c)θ'(t)dt
= (z(t)-c)/r(t)r'(t)dt + i(z(t)-c)θ'(t)dt
= (z-c)/r dr + i(z-c)dθ
from which we identify the radial part, dz_r, and angular part, dz_a, of dz,
dz = dz_r + dz_a
dz_r = (z-c)/r dr
dz_a = i(z-c)dθ.
Note in the expressions of the differentials of the complex polar coordinates, dz_r and dz_a, the corresponding differentials of the real polar coordinates, dr and dθ.

The residue theorem for a function represented by a single Laurent series.

The direction of integration is counterclockwise around the point c, see the description of P above.
Proof. Since P is a rectifiable path in a closed and bounded subset of D - {c}, we can integrate the Laurent series term by term,

Thus we are led to consider ∮_P (z-c)^n dz, n = ... -2, -1, 0, 1, 2, ..., and we need to prove,

where δ_(n,-1) = 1 if n = -1 otherwise 0.

Now we introduce polar coordinates in the left-hand side of [1],

We separate the n=-1 part from the rest,

We use the fact that n=-1 in the n=-1 part. Also, in the third integral, we make use of the parameterisation of P,

(Note that since r'(t) is only piecewise continuous on [α,β], the integrand of the integral above over [α,β] may not always be defined. This problem is circumvented by defining the value of this integral as the sum of the integrals with the same integrands as the integral above but with integration intervals that are the subintervals of [α,β] where r'(t) is continuous.(4))
We observe ∮_P 1/r dr=0; this is an example of the "walking around a mountain" effect we talked about. Also, since we integrate once around c in a counterclockwise direction, ∮_P dθ=2π,

We observe that we have already gotten the result we are looking for in line one, so we aim at proving that the expression within the last parenthesis is zero,

r^n is the derivative of r^(n+1)/(n+1),

Since r(t) and θ(t) are continuous and piecewise continuously differentiable on [α,β], so are r(t)^(n+1)/(n+1) and e^(iθ(t)(n+1)). This allows us to use integration by parts for the integral over [α,β], see Lemma 1 below,

Now we use r(β)=r(α) and θ(β)=θ(α)+2π,

e^(i2π(n+1))=1 yields,

Two elements have opposite signs and cancel each other; they are another example of the "walking around a mountain" effect. We are left with only two integrals within the last parenthesis,

We find that these two integrals cancel each other,

This is the right-hand side of [1], and the proof is complete.

Appendix.
Lemma 1 (integration by parts).

where g and h are complex-valued functions.
Proof. We separate the real and the imaginary parts, g(t) = g_1(t) + ig_2(t) and h(t) = h_1(t) + ih_2(t) where g_1(t), g_2(t), h_1(t), and h_2(t) are real-valued functions.

We substitute the real-valued functions for the complex-valued functions on the left-hand side of the equation in the lemma,

The functions in the integrands of these four integrals are real-valued functions, and that allows us to use integration by parts(6),

This expression is identical to the right hand side of the equation in the lemma, and the proof is complete.

References.
(1) Laurent series http://www.encyclopediaofmath.org/index.php/Laurent_series
(2) Laurent series (Uniqueness) http://en.wikipedia.org/wiki/Laurent_series#Uniqueness
(3) Arc length https://en.wikipedia.org/wiki/Arc_length
(4) Reinhold Remmert, Theory of Complex Functions http://books.google.no/books?id=uP8SF4jf7GEC&pg=PA173&lpg=PA173&dq=integrate+piecewise+continuously+differentiable&source=bl&ots=yMrRr6wE4v&sig=izem7hy8bHrw8TkKS-NEvhujMrs&hl=no&sa=X&ei=dpPWUbSzGoOv4QS_poC4DA&redir_esc=y#v=onepage&q=integrate%20piecewise%20continuously%20differentiable&f=false, p. 173-174.
(5) Argument (complex analysis) http://en.wikipedia.org/wiki/Argument_(complex_analysis)
(6) Integration by parts http://en.wikipedia.org/wiki/Integration_by_parts

Ivar Sand · This is Cool

(See also The divisibility by 3 rule – direct proof at https://www.mathisfunforum.com/viewtopic.php?id=33973)

In short. The divisibility by 3 rule is shown by using proof by induction. The objects of induction are sets consisting of three consecutive integers.

Background. While out walking, I tried from time to time through a period of several years to figure out how to prove that a number is divisible by 3 if and only if the sum of its digits is divisible by 3. I was not able to do that. It seemed obvious that proof by induction would have to be employed. The first step of proof by induction was easy and natural, but the next step seemed to be difficult. I found that it was necessary to sit down and concentrate in order to be able to write a proof.

Now I know that the proof is naively simple. The proof is also smart, I think, a proof that I would not be likely to find myself unless by accident.

The divisibility by 3 rule was obviously discovered by accident. I think one never suspected it to be true in the first place and therefore never started looking for a proof. If that had been the situation, my idea of there being two kinds of mathematicians, the exploring mathematicians who pioneer the results, and the smart mathematicians who write the elegant proofs that survive to the text books (see my thread "Exploring mathematicians and smart mathematicians"), might apply.

Admittedly, on this background it looks stupid to make an "exploring mathematician's" proof. Nevertheless I wanted to try to construct such a proof because I found the work challenging. Anyway, we get an exercise of doing an exploring trip into the realm of mathematics. Below, Proof 1 is the smart proof and Proof 2 is my proof.

Theorem 1. An integer is divisible by 3 if and only if the sum of its digits is divisible by 3.
Proof 1¹. I'll only sketch the proof here. One observes that any integer x can be written as a sum of products, every product being of the form of a digit of x multiplied by 1, 10, 100, or ... 10 ... 0, where the last number, 10 ... 0, has n digits where n is the number of digits of x. One then observes that the numbers 10, 100, and ... 10 ... 0 can be written as respectively 9 + 1, 99 + 1, ... 9 ... + 1. When these sums are multiplied by their respective digits of x and are summed, one gets one sum of products of factors including the factor 9 plus another sum. The first sum is clearly divisible by 3 so that the divisibility by 3 of x is the same as the divisibility by 3 of the second sum plus the least significant digit of x, which proves to be the sum of the digits of x.

Proof 2. We limit ourselves to non-negative integers. The proof for non-positive integers is similar.

First we observe that the theorem is true for a number consisting of one digit because, in this case, the number itself and the sum of its digit(s) have the same value and therefore have the same divisibility by 3 property, i.e. they are either both divisible by 3 or both not divisible by 3. Now, we intend to use proof by induction. We must decide what objects the proof by induction shall refer to. I suggest these objects to be sets of three consecutive integers. The first set consists of 0, 1, and 2, call this set S_1. The maximum element of S_i, where i ≥ 1, is 1 less than the least element of S_(i+1) so that the union of all the S_i sets is the set of the non-negative integers, the set referred to in the theorem. Let us call S_i valid if the theorem is proved for all three numbers in it.

We have already seen that the theorem is true for each of the numbers in S_1, so S_1 is valid. Suppose S_i is valid for some i ≥ 1. We have to prove that S_(i+1) is valid, as well.

Definitions:
x is the least number in S_(i+1).
s() is a function that gives the sum of the digits of its non-negative integer argument.

Now take a look at "A proof diagram for the proof of Theorem 1" in the appendix.

Let us start with x, the least number of S_(i+1). It is easy to see that x equals 3*i, which is clearly divisible by 3. We must show that the sum of the digits of x, s(x), also is divisible by 3.

x-3 in S_i is divisible by 3 because x is.

Since S_i is valid, s(x-3) is divisible by 3 because x-3 is divisible by 3.

Lemma 1 gives us that s(x) is divisible by 3 because s(x-3) is divisible by 3. This concludes the proof for x.

Let us consider x+1, the middle number of S_(i+1). x+1 is 1 higher than a number divisible by 3 and is therefore itself not divisible by 3. We must show that s(x+1) is not divisible by 3 either.

We send x+1 down to S_i by subtracting 3: x+1 - 3 = x-2. x-2 is not divisible by 3 because x is divisible by 3 while the difference between x and x-2, 2, is not divisible by 3.

Since S_i is valid, s(x-2) is not divisible by 3 because x-2 is not divisible by 3.

Lemma 1 gives us that s(x+1) is not divisible by 3 because s(x-2) is not divisible by 3. This concludes the proof for x+1.

Let us consider x+2, the highest number of S_(i+1) and the last one we have to consider. Its case is similar to the case of x+1, and the proof is left to the reader. QED.

Appendix.
Lemma 1. If the sum of the digits of a non-negative integer x is divisible by 3 then the same holds for the sum of the digits of x + 3, and if the sum of the digits of x is not divisible by 3 then the same holds for the sum of the digits of x + 3.
Proof. Definitions:
d_j() is a function that produces digit number j of its non-negative integer argument where j = 1 for the least significant digit of the argument.
s() is a function that gives the sum of the digits of its non-negative integer argument.

We need to show that s(x+3) and s(x) are either both divisible by 3 or both not divisible by 3.

If d_1(x) ≤ 6 then d_1(x+3) = d_1(x) + 3 and d_j(x+3) = d_j(x) for j > 1. Therefore, for such an x, s(x+3) = s(x) + 3. Since s(x+3) - s(x) is divisible by 3, s(x+3) and s(x) are either both divisible by 3 or both not divisible by 3, and this finishes the treatment of this case.

If d_1(x) ≥ 7 then d_1(x+3) = d_1(x) - 7 and there is a carry equalling 1. Now suppose, without limiting the situation in any way, that d_j(x) = 9 for j = 2, 3, ... k and d_(k+1)(x) < 9 for a k ≥ 1. Then d_j(x+3) = 0 for j = 2, 3, ... k and d_(k+1)(x+3) = d_(k+1)(x) + 1. We find s(x+3) = s(x) - 7 + (-9)*(k-1) + 1 = s(x) + (-9)*(k-1) - 6. Since s(x+3) - s(x) is divisible by 3, s(x+3) and s(x) share the same divisibility by 3 property, and this concludes the proof.

A proof diagram for the proof of Theorem 1.
Below is a proof diagram where y = x, x+1, or x+2. All y, y-3, s(y-3), and s(y) in the diagram have the same divisibility by 3 property.

Reference.
1. http://mathforum.org/k12/mathtips/division.tips.html. (This web page does not exist any more.)

Ivar Sand · This is Cool

Thanks for your post, cmowla.

To be honest, I guess I used the idea of there being smart mathematicians and exploring mathematicians as a way of marketing my proof, hehe. However, a Physics professor at the university where I studied did say to me once that he liked an A- (or B+) student, or not quite first-rate student, better than an A student because an A- student is more innovative than an A student, who is conservative. An A- student is familiar with making mistakes and is not afraid of making them. An A student, on the other hand, is afraid of making mistakes and does not dare say things that might prove to be false.

I have posted a comment on your post about the basic limit laws after having spent some time trying to understand your post. I hope you appreciate my comment although I understand that you are not into calculus at the moment.

Well, I think there is such a thing as mathematical maturity. I remember taking a course in linear algebra once. It was a spring course, and I had the feeling of understanding nothing before Easter and everything after Easter.

I cannot comment on Rubik's cube as I am not into that area.

Regarding your item [1]: I guess we have to accept that mathematics is difficult; there seems to be general acceptance of this. This causes papers to be poorly understood, which is sad. However, maybe someone someday will develop a mathematics language that would make mathematics easy to understand.

Regarding your item [2]: I think an idea behind listing in a paper the prerequisite knowledge is to suggest to the unqualified reader to stop reading the paper so as to avoid wasting his or her time reading it. By the way, I have now included the text "(calculus)" in the name of this thread so that readers not interested in calculus can skip the thread.

By the way, perhaps there is a simple contribution in my post #1 in regard of the effort of making papers easy to read. There, I try to align in successive lines the same texts in mathematical expressions. That way, it is easy for the reader to see what is unchanged in a pair of lines so as to find these most interesting parts of the lines fast.

Ivar Sand · Help Me !

In your post #1 you state that you evaluated the absolute value(s) in the second quote to produce the statement in the third quote. However, I have a counterexample. Set x = a + δ/2 and f(a + δ/2) = L - 2ε. In this case the statement in the second quote is false, whereas the statement in the third quote is true.

Also, in the proof of Sum Law, the theorem statement has ε whereas the last statement in the proof has ε_2. Shouldn't these two epsilons have been the same? The same or something similar goes for Constant Law and Product Law.

Ivar Sand · This is Cool

In short. This post presents the idea that there are two kinds of mathematicians, the exploring mathematicians who pioneer the results, and the smart mathematicians who follow and produce the elegant proofs that survive to the text books. A theorem with a proof that is considered smart is chosen from a text book, and then a more elementary proof is presented.

Background. One day, not so long ago, I started wondering about how to prove that a continuous function defined on a closed interval is bounded. I remembered that I had been thinking about the same problem many years earlier and that I had not been able to find a proof. I remembered that I had a runaround experience while trying to locate a point where the function approached infinity. So I gave up trying to construct a proof and looked up the proof in a text book. I found that the proof was smart, a proof that I could not be depend upon being able to find myself.

In spite of this, I continued to think that a straightforward proof ought to exist for this theorem. I fantasised about there being two kinds of mathematicians, the exploring mathematician and the smart mathematician; the exploring mathematicians being the pioneer who is the first one to prove a theorem. The exploring mathematicians would apply engineering-like methods to construct a proof, which might prove to be long and complicated. Then a smart mathematician would follow and replace the exploring mathematician's proof with a smarter and more elegant proof. The new proof would stand the test of time better than the original proof and survive to the text books. Whether this distinction has anything to do with reality, I don't know.

Nevertheless I wanted to try to construct a more elementary proof of the theorem. So I did (or tried to), and I posted this article to demonstrate how such an exploring mathematician's proof might look like. Below, Proof 1 is the "smart" proof and Proof 2 is my proof.

The theorem¹. A continuous function defined on a closed interval is bounded.
Proof 1. I'll only sketch the proof here. One assumes the contrary, that the continuous function is unbounded. One divides the interval into two equal parts. One observes that the function has to be unbounded on at least one of the two parts. One chooses one of the parts, divides that interval into two equal parts, chooses the part where the function is unbounded, and so on. The lengths of the chosen intervals approach zero, and the leftmost endpoints of the intervals approach a point in the function's interval of definition. One uses the concept of supremum to prove that the sequence of endpoints converges. The function's continuity at that point is then shown to imply that the function cannot be unbounded on an interval around that point after all.

Comment. One needs to verify that the unboundedness property and the continuity property collide. I think a difficulty lies in the fact that the unboundedness property is the property of an area or interval whereas the continuity property is the property of a point. (The unboundedness property is not the property of a point in the interval of definition because if a function were unbounded or infinite at a point it would be undefined there.) I guess the smartness of Proof 1 lies in the fact that the function's unboundedness property is included in every step of the process. So to speak, one keeps the unboundedness property in hand all the way while one searches for a point in order to use the function's continuity there.

Proof 2. Let the function's name be f and its closed interval of definition be [a,b]. As in Proof 1, we assume the opposite that f is unbounded, and aim at a contradiction.

We start looking for f's unboundedness at a. Since f is defined at a it is finite there, so a is not the point we are searching for. If a = b f cannot be unbounded, so we deduce that b must be greater than a.

Now let the argument of f, call it x, increase continuously from the point a while we look for a point where |f(x)| = |f(a)| + 1. Since f is continuous, f(x) varies smoothly as x increases, so there are no jumps in the function value. The first point we encounter that satisfies |f(x)| = |f(a)| + 1 we call x_1. We proceed from x_1 moving continuously towards higher values of x looking for a point where |f(x)| = |f(x_1)| + 1 and call the first such point x_2. Proceeding in this way we get a sequence of points {x_1, x_2, ...}.

Suppose we reach b, and we find that our sequence contains a finite number of elements. If the sequence is empty, f is bounded on [a,b] by |f(a)| + 1, otherwise f is bounded by |f(x_n)| + 1 where x_n is the last element in {x_1, x_2, ...}. In both cases f is bounded. We conclude that the number of points in {x_1, x_2, ...} is infinite.

The function value at a point x_i in {x_1, x_2, ...} for an i > 1 satisfies |f(x_i)| = |f(x_(i-1))| + 1. We find |f(x_i)| = |f(a)| + i for i ≥ 1 which implies that |f(x_i)| approaches ∞ as i increases.

Since the monotonic sequence {x_1, x_2, ...} is bounded (by b), it converges². Call the point of convergence x_0.

Choose an ε > 0. We have:
1. Since f is continuous at x_0, there is a δ > 0 so that |f(x) - f(x_0)| < ε for all x satisfying |x - x_0)| < δ. Now
|f(x) - f(x_0)| < ε ⇒
|f(x) - f(x_0)| + |f(x_0)| < ε + |f(x_0)| ⇒
|f(x) - f(x_0) + f(x_0)| ≤ |f(x) - f(x_0)| + |f(x_0)| < ε + |f(x_0)| ⇒
|f(x)| ≤ |f(x) - f(x_0)| + |f(x_0)| < ε + |f(x_0)| ⇒
|f(x)| < ε + |f(x_0)|
We conclude:
|f(x)| < |f(x_0)| + ε whenever |x - x_0)| < δ (1).
2. Since {x_1, x_2, ...} converges towards x_0, there is an integer m1 so that
|x_i - x_0| < δ whenever i > m1 (2).
3. Since |f(x_i)| approaches ∞ with i, there is an integer m2 so that
|f(x_i)| > |f(x_0)| + ε whenever i > m2 (3).

Now choose m = max (m1,m2). This gives us new versions of (2) and (3):
|x_i - x_0| < δ whenever i > m (2')
|f(x_i)| > |f(x_0)| + ε whenever i > m (3').

The continuity condition (1) expresses that f is bounded by |f(x_0)| + ε on the open interval (x_0 - δ, x_0 + δ) whereas the unboundedness or divergence condition (3') expresses that |f(x_i)| is greater than this bound for an i satisfying i > m, call it j. Now since x_j lies in (x_0 - δ, x_0 + δ) according to the convergence condition (2'), we deduce from (1):
|f(x_j)| < |f(x_0)| + ε
and from (3'):
|f(x_j)| > |f(x_0)| + ε.
These two inequalities are inconsistent, and this contradiction allows us to conclude that our original assumption of f being unbounded is false. QED.

References.
1. Tom M. Apostol, Calculus, Volume 1, One-Variable Calculus, with an Introduction to Linear Algebra, Second edition, 1967, p. 150.
2. Ibid., p. 381.

Ivar Sand · This is Cool

(See also Behind the scenes of proof by contradiction II at https://www.mathisfunforum.com/viewtopic.php?id=33858)

In short. We start with an example: A ⇒ B, B ⇒ C, C ⇒ D, where A is an assumed condition and D is a condition contradictory to A. According to proof by contradiction(a), we conclude ¬A. We demonstrate the truth of this conclusion by proving that A ⇒ B ∧ C ∧ D and then showing that this implication reduces to ¬A.

Background. When we construct proofs, we often use the technique of assuming something to be true while suspecting it is false and then proceeding to derive a contradiction(b) which allows us to conclude (and prove) that what we assumed to be true is instead false.

I often wondered what the logic behind this kind of reasoning was like. Maybe I learned it some long time ago, but I am not sure. As a matter of fact, I think I never learned it. Therefore, I thought it might be fun to do a bit of exploring in order to find out what the logic of proof by contradiction might look like, and this thread is the result.

(Note that we treat a special case of proof by contradiction. I believe that the proof of the general case lies in mathematical logic, a branch of mathematics that I never studied, and that the proof of the general case is completely different from the proof presented here.)

Outline. Section A shows an example of the use of proof by contradiction, and section B reveals the logic behind the results in section A.

Proven statements are identified by numbers within parentheses.
The right part of the line containing a statement may contain a comment indicating how the statement was produced.

A. An example of the use of proof by contradiction.
I guess there are many ways a proof by contradiction(a) can go. I have chosen to demonstrate one of them.

First we assume A to be true. Note that this does not mean that we consider A to be a proven statement. We procede by deducing something from A:
A ⇒ B (1)
Then we deduce something from B:
B ⇒ C (2)
Then another deducing step:
C ⇒ D (3)
where D is the contradictory condition(b) we are seeking. We conclude from this by the principle of proof by contradiction(a) that:
¬A (4)

B. An analysis of the proof by contradiction in section A.
From the combined implications in section A, we here produce what we might call the grand formula of proof by contradiction. Then we proceed by showing that this formula reduces to (4) which means that ¬A is indeed true.

1. First we combine (1) and (2) into one condition:
(A ⇒ B) ∧ (B ⇒ C) (5) (1) and (2)
which is true because (1) and (2) are true.
Next we manipulate (5). Note: I use = instead of ⇔ since either means the same on the set Bool(e).
(A ⇒ B) ∧ (B ⇒ C) = (5)
(¬A ∨ B) ∧ (B ⇒ C) = by A ⇒ B is defined as ¬A ∨ B(d)
¬A ∧ (B ⇒ C) ∨ B ∧ (B ⇒ C) = by law of distributivity(c)
¬A ∧ (true) ∨ B ∧ (¬B ∨ C) = by (2) and B ⇒ C is defined as ¬B ∨ C(d)
¬A ∨ (B ∧ ¬B ∨ B ∧ C) = by law of identity(c) and law of distributivity(c)
¬A ∨ (false ∨ B ∧ C) = by law of complements(c)
¬A ∨ B ∧ C = by law of identity(c)
A ⇒ B ∧ C (6) by A ⇒ B ∧ C is defined as ¬A ∨ B ∧ C(d)

2. Now we combine (6) and (3):
(A ⇒ B ∧ C) ∧ (C ⇒ D) (7) (6) and (3)
which is true because (6) and (3) are true.
By using (3), we treat (7) in the same way as we used (2) to treat (5):
(A ⇒ B ∧ C) ∧ (C ⇒ D) = (7)
(¬A ∨ B ∧ C) ∧ (C ⇒ D) = by A ⇒ B ∧ C is defined as ¬A ∨ B ∧ C(d)
¬A ∧ (C ⇒ D) ∨ B ∧ C ∧ (C ⇒ D) = by law of distributivity(c)
¬A ∧ (true) ∨ B ∧ C ∧ (¬C ∨ D) = by (3) and C ⇒ D is defined as ¬C ∨ D(d)
¬A ∨ (B ∧ C ∧ ¬C ∨ B ∧ C ∧ D) = by law of identity(c) and law of distributivity(c)
¬A ∨ (false ∨ B ∧ C ∧ D) = by law of complements(c)
¬A ∨ B ∧ C ∧ D = by law of identity(c)
A ⇒ B ∧ C ∧ D (8) by A ⇒ B ∧ C ∧ D is defined as ¬A ∨ B ∧ C ∧ D(d)
(8) is an example of what I call the grand formula of proof by contradiction. It expresses that we can combine the statements (1) (2) and (3) into one statement, i.e. (A ⇒ B) ∧ (B ⇒ C) ∧ (C ⇒ D), and replacing "B) ∧ (B ⇒" with "B ∧" and "C) ∧ (C ⇒" with "C ∧".

3. Let us study the contradictory(b) situation between A and D a bit closer. This situation means that A and D cannot be true at the same time. Therefore, if A is true then D is false and vice versa.
Let us have a closer look at (8):
A ⇒ B ∧ C ∧ D = (8)
¬A ∨ B ∧ C ∧ D = by A ⇒ B ∧ C ∧ D is defined as ¬A ∨ B ∧ C ∧ D(d)
true ∧ (¬A ∨ B ∧ C ∧ D) = by law of identity(c)
(A = true ∨ A = false) ∧ (¬A ∨ B ∧ C ∧ D) = as (A = true ∨ A = false) is true as A is either true or false
(A = true) ∧ (¬A ∨ B ∧ C ∧ D) ∨ (A = false) ∧ (¬A ∨ B ∧ C ∧ D) =
by law of distributivity(c)
(A = true) ∧ (false ∨ B ∧ C ∧ false) ∨ (A = false) ∧ (true ∨ B ∧ C ∧ D) =
as D (the first one) is false when A is true
(A = true) ∧ (false ∨ false) ∨ (A = false) ∧ (true) =
as B ∧ C ∧ false = false(f1) and true ∨ B ∧ C ∧ D = true(f2)
false ∨ ¬A = by law of identity(c) and (A = true) ∧ false = false(f1) and (A = false) = ¬A
¬A by law of identity(c)
which is the same as (4), which is what we wanted to prove.

References.
(a) Proof by contradiction, https://en.wikipedia.org/wiki/Proof_by_contradiction.
(b) Contradiction, https://en.wikipedia.org/wiki/Contradiction.
(c) Boolean algebra (structure), https://en.wikipedia.org/wiki/Boolean_algebra_(structure).
(d) Material implication (rule of inference), https://en.wikipedia.org/wiki/Material_implication_(rule_of_inference).
(e) Logical biconditional, https://en.wikipedia.org/wiki/Logical_biconditional (see Truth table, the column A↔B).
(f) Boolean algebra, https://en.wikipedia.org/wiki/Boolean_algebra.
1. See Logical operation Conjunction.
2. See Logical operation Disjunction.

Ivar Sand · This is Cool

Now I have updated post #4. I have removed the word differentiable and also made other changes in order to make the mathematics more precise.

Ivar Sand · This is Cool

I see, Bob. I agree that f is not differentiable at x_0. I'm sorry about that. I propose to replace in post #4:
"Suppose f is differentiable at a point x_0 and that its derivative at x_0 is infinite"
by:
"Suppose f is not differentiable at a point x_0 because its derivative is infinite there"

Ivar Sand · This is Cool

Well, anonimnystefy, let's see. We go to the Wikipedia page named "Continuous function", the paragraph "Weierstrass and Jordan definitions (epsilon–delta) of continuous functions" and the sentence "For any positive real number ..." This sentence expresses something like: given an ε > 0, a δ > 0 exists ..., where ε is part of a condition in the codomain and δ is part of a condition in the domain. Topsy-turvy continuity, on the other hand, looks like: given a δ > 0, a k > 0 exists ..., where δ is part of a condition in the domain and k is part of a condition in the codomain.

To summarise:
The continuity of a function f at x_0 is defined:
Given an ε > 0, a δ > 0 exists so that ∣f(x) - f(x_0)∣ < ε whenever ∣x - x_0∣ < δ.
The topsy-turvy continuity of a function f at x_0 is defined:
Given a δ > 0 a k > 0 exists so that |f(x) - f(x_0)| ≤ k |x - x_0| whenever |x - x_0| < δ.

Ivar Sand · This is Cool

No, Bob, I haven't. I suppose you are thinking about whether continuity implies topsy-turvy continuity. I don't think this is true so a counter-proof would be called for. I believe the problem is connected to points in f's domain where f has an infinite derivative. Let's have a closer look.

Let f have an infinite derivative at a point x_0. This means that (f(x) - f(x_0)) / (x - x_0) approaches infinity as x approaches x_0, or, in other words, that for every M > 0 a δ > 0 exists so that (f(x) - f(x_0)) / (x - x_0) > M whenever |x - x_0| < δ and x ≠ x_0. (We need the condition x ≠ x_0 to avoid 0/0, which is undefined.) Now we take the absolute value of the left hand side of this inequality and multiply both sides by |x - x_0|, and we get
|f(x) - f(x_0)| > M |x - x_0| whenever |x - x_0| < δ and x ≠ x_0 (1)

Now, topsy-turvy continuity at x_0 means: given a δ' > 0, a k > 0 exists so that
|f(x) - f(x_0)| ≤ k |x - x_0| whenever |x - x_0| < δ' (2)

Let us try to prove that f is not topsy-turvy continuous at x_0. We do that by proposing (2) to be true and see if we can derive a contradiction.

We choose M = k + 1 in (1). Regard the point x_1 = x_0 + min(δ',δ)/2. Since |x_1 - x_0| < δ and |x_1 - x_0| < δ' x_1 fits into both (1) and (2). From (1) we get:
|f(x_1) - f(x_0)| > (k + 1) |x_1 - x_0|,
and from (2) we get:
|f(x_1) - f(x_0)| ≤ k |x_1 - x_0|.
These two inequalities are contradictory to each other, and we conclude that our proposition of (2) being true is false. Therefore f is not topsy-turvy continuous at x_0.

Ivar Sand · This is Cool

Background. As you would now, the continuity(1)(2) of a function f at x_0, a point in the domain of f, is defined using a condition in f's codomain first ("given ε > 0 ∣f(x) - f(x_0)∣ < ε") and then proceeding with a condition in f's domain ("then there exists a δ > 0 so that ∣x - x_0∣ < δ").

Once a friend of mine remarked that he felt that the definition of continuity is unnatural in that it starts in a function's codomain instead of in its domain. I remember that I had the same feeling myself when I learned about continuity. I guess that the natural approach, as indicated by my friend, was adopted first, and that the standard definition came later and proved to be more fruitful than the natural approach in some way. (This seems indeed to have been the case.(2))

I thought that it might be fun to see what such a natural approach would look like. (If you disagree this is the place to stop reading this thread.) So I put my mathematical explorer's boots on and dived into the jungle of mathematics. I wanted to see if I could build myself a natural definition of continuity and compare this concept with the existing definition of continuity. I decided to call this natural definition of continuity topsy-turvy continuity because it does things in the opposite way as compared to the existing definition of continuity. The result follows below.

Definition. A function f is said to be topsy-turvy continuous at a point x_0 if for a given δ > 0 a k > 0 exists such that whenever |x - x_0| < δ, it is true that |f(x) - f(x_0)| ≤ k |x - x_0|.

Comments:
- This is just a simple definition of topsy-turvy continuity.
- Note that k may depend on x_0 and δ.
- Note that ≤ is used instead of <. This is to cover the case x = x_0 (a < would have given 0 < 0).

Now we must check that this definition relates to the standard definition of continuity in a proper way. A topsy-turvy continuous function should be continuous. However, the topsy-turvy continuity concept may be weaker than standard continuity so that a continuous function may not be topsy-turvy continuous at all points. Therefore, topsy-turvy continuity should imply continuity. Indeed it does, as is stated in the following theorem.

Theorem. A topsy-turvy continuous function f is continuous.
Proof: Let ε > 0 be given. Choose some point x_0 in the domain of f. Since f is topsy-turvy continuous at x_0, a positive k exists so that when |x - x_0| < ε it is true that |f(x) - f(x_0)| ≤ k |x - x_0|.

Define δ = min (ε, ε/k). We observe that δ > 0. Since we have δ ≤ ε, the topsy-turvy continuity condition |f(x) - f(x_0)| ≤ k |x - x_0| still holds with the same constant k when |x - x_0| < δ.

When |x - x_0| < δ it is also true that |x - x_0| < ε/k since δ ≤ ε/k, and multiplying both sides by k yields k |x - x_0| < k*ε/k = ε. Using this in |f(x) - f(x_0)| ≤ k |x - x_0| gives |f(x) - f(x_0)| < ε still maintaining |x - x_0| < δ.

We conclude that for an ε > 0 a δ > 0 exists so that |f(x) - f(x_0)| < ε whenever |x - x_0| < δ, and this means that f is continuous at x_0. QED.

References:
(1) Tom M. Apostol, Calculus, Volume 1, One-Variable Calculus, with an Introduction to Linear Algebra, Second edition, 1967, p. 130-131.
(2) The wikipedia page for Continuous function.

Ivar Sand · Exercises

Ivar Sand · This is Cool

bobbym Thanks for your welcome, bobbym. I am glad to be here.

scientia Thanks for your post, scientia. I am not as much familiar with the various kinds of mappings as you are. As you probably have observed, I do not use mappings in my proof, I simply use the fact that if A is a subset of B and B is a subset of A then A and B are identical.

I'll see if I can squeeze my brain into solving one of your exercises.

Ivar Sand · This is Cool

Suppose you have two lists of elements, and you are wondering if they contain the same elements. You observe that the lists have the same lengths. You check that every element of the first list exists among the elements of the second list. To complete the process, you must find out if every element of the second list exists among the elements of the first list. Now you stop and think: is it necessary to do this last step? No, it is not because the following theorem exists:

Theorem: If two finite sets have the same number of elements and one set is a subset of the other, the two sets are equal.

Proof: We name the sets A and B, and let A be the subset. We argue by contradiction: We assume that the theorem is false, meaning that A and B are not equal. Since A is a subset of B, there must be an element of B which is not in A because otherwise A and B would be equal. Since B contains A and an element that does not exist in A, the number of elements of B ≥ the number of elements of A + 1. However, this is against the requirement that the number of elements of B = number of elements of A. We have thus arrived at a contradiction and have proved the theorem.

This theorem has some practical benefit. Suppose, for example, that you are considering two lists of file names. They are sorted differently, and you suspect them to contain the same elements. To see that the lists have the same lengths is often easy. To compare them, if the lists have e.g. 3 elements, glancing back and forth between them a few times should be sufficient, but if the lists have 5-6 elements or more, this eye method becomes unreliable and a more systematic method, like one of the two methods indicated above, must be used. However, both of these methods have the weakness that to compare the two lists element by element is often a complicated, tedious, and time-consuming task. It is the amount of time spent comparing the lists that you can use the theorem to reduce.

Math Is Fun Forum

#1 This is Cool » The divisibility by 3 rule – direct proof » 2025-09-04 01:47:10

#2 Re: This is Cool » Behind the scenes of proof by contradiction II » 2025-08-26 22:46:30

#3 This is Cool » Behind the scenes of proof by contradiction II » 2025-08-25 01:56:57

#4 This is Cool » Proof that the standard procedure for adding two numbers is correct » 2023-09-10 23:11:44

#5 This is Cool » 2 + 2 = 4 » 2023-08-18 01:02:45

#6 This is Cool » A proof of the residue theorem (complex analysis) » 2013-05-15 21:23:27

#7 This is Cool » The divisibility by 3 rule – proof by induction » 2013-04-14 20:41:34

#8 Re: This is Cool » Exploring mathematicians and smart mathematicians » 2013-03-25 00:02:12

#9 Re: Help Me ! » Easier Proofs for the Basic Limit Laws? » 2013-03-19 21:08:44

#10 This is Cool » Exploring mathematicians and smart mathematicians » 2013-03-11 23:36:35

#11 This is Cool » Behind the scenes of proof by contradiction » 2013-03-03 21:40:40

#12 Re: This is Cool » Topsy-turvy continuity » 2013-02-03 23:26:59

#13 Re: This is Cool » Topsy-turvy continuity » 2013-01-31 21:16:55

#14 Re: This is Cool » Topsy-turvy continuity » 2013-01-30 22:44:14

#15 Re: This is Cool » Topsy-turvy continuity » 2013-01-30 22:07:24

#16 This is Cool » Topsy-turvy continuity » 2013-01-30 00:09:39

#17 Re: Exercises » Scientias exercises » 2013-01-24 23:52:16

#18 Re: This is Cool » How to reduce the time checking if two lists contain the same elements » 2013-01-23 21:22:13

#19 This is Cool » How to reduce the time checking if two lists contain the same elements » 2013-01-22 23:51:43

Board footer