In short. The divisibility by 3 rule is shown by using proof by induction. The objects of induction are sets consisting of three consecutive integers.
Background. While out walking, I tried from time to time through a period of several years to figure out how to prove that a number is divisible by 3 if and only if the sum of its digits is divisible by 3. I was not able to do that. It seemed obvious that proof by induction would have to be employed. The first step of proof by induction was easy and natural, but the next step seemed to be difficult. I found that it was necessary to sit down and concentrate in order to be able to write a proof.
Now I know that the proof is naively simple. The proof is also smart, I think, a proof that I would not be likely to find myself unless by accident.
The divisibility by 3 rule was obviously discovered by accident. I think one never suspected it to be true in the first place and therefore never started looking for a proof. If that had been the situation, my idea of there being two kinds of mathematicians, the exploring mathematicians who pioneer the results, and the smart mathematicians who write the elegant proofs that survive to the text books (see my thread "Exploring mathematicians and smart mathematicians"), might apply.
Admittedly, on this background it looks stupid to make an "exploring mathematician's" proof. Nevertheless I wanted to try to construct such a proof because I found the work challenging. Anyway, we get an exercise of doing an exploring trip into the realm of mathematics. Below, Proof 1 is the smart proof and Proof 2 is my proof.
Theorem 1. An integer is divisible by 3 if and only if the sum of its digits is divisible by 3.
Proof 1¹. I'll only sketch the proof here. One observes that any integer x can be written as a sum of products, every product being of the form of a digit of x multiplied by 1 or 10 or 100 or … 100…0, where the last number, 100…0, has n digits where n is the number of digits of x. One then observes that the numbers 10, 100, and … 100…0 can be written as respectively 9 + 1, 99 + 1, … 99…9 + 1. When these sums are multiplied by their respective digits of x and are summed, one gets one sum of products of factors including the factor 9 plus another sum. The first sum is clearly divisible by 3 so that the divisibility by 3 of x is the same as the divisibility by 3 of the second sum, which proves to be the sum of the digits of x.
Proof 2. We limit ourselves to nonnegative integers. The proof for nonpositive integers is similar.
First we observe that the theorem is true for a number consisting of one digit because, in this case, the number itself and the sum of its digit(s) have the same value and therefore have the same divisibility by 3 property, i.e. they are either both divisible by 3 or both not divisible by 3. Now, we intend to use proof by induction. We must decide what objects the proof by induction shall refer to. I suggest these objects to be sets of three consecutive integers. The first set consists of 0, 1, and 2, call this set S_{1}. The maximum element of S_{i}, where i ≥ 1, is 1 less than the least element of S_{i+1} so that the union of all the S_{i} sets is the set of the nonnegative integers, the set referred to in the theorem. Let us call S_{i} valid if the theorem is proved for all three numbers in it.
We have already seen that the theorem is true for each of the numbers in S_{1}, so S_{1} is valid. Suppose S_{i} is valid for some i ≥ 1. We have to prove that S_{i+1} is valid, as well.
Definitions:
x is the least number in S_{i+1}.
s() is a function that gives the sum of the digits of its nonnegative integer argument.
Now take a look at "A proof diagram for the proof of Theorem 1" in the appendix.
Let us start with x, the least number of S_{i+1}. It is easy to see that x equals 3*i, which is clearly divisible by 3. We must show that the sum of the digits of x, s(x), also is divisible by 3.
x3 in S_{i} is divisible by 3 because x is.
Since S_{i} is valid, s(x3) is divisible by 3 because x3 is divisible by 3.
Lemma 1 gives us that s(x) is divisible by 3 because s(x3) is divisible by 3. This concludes the proof for x.
Let us consider x+1, the middle number of S_{i+1}. x+1 is 1 higher than a number divisible by 3 and is therefore itself not divisible by 3. We must show that s(x+1) is not divisible by 3 either.
We send x+1 down to S_{i} by subtracting 3: x+1  3 = x2. x2 is not divisible by 3 because x is divisible by 3 while the difference between x and x2, 2, is not divisible by 3.
Since S_{i} is valid, s(x2) is not divisible by 3 because x2 is not divisible by 3.
Lemma 1 gives us that s(x+1) is not divisible by 3 because s(x2) is not divisible by 3. This concludes the proof for x+1.
Let us consider x+2, the highest number of S_{i+1} and the last one we have to consider. Its case is similar to the case of x+1, and the proof is left to the reader. QED.
Appendix.
Lemma 1. If the sum of the digits of a nonnegative integer x is divisible by 3 then the same holds for the sum of the digits of x + 3, and if the sum of the digits of x is not divisible by 3 then the same holds for the sum of the digits of x + 3.
Proof. Definitions:
d_{j}() is a function that produces digit number j of its nonnegative integer argument where j = 1 for the least significant digit of the argument.
s() is a function that gives the sum of the digits of its nonnegative integer argument.
We need to show that s(x+3) and s(x) are either both divisible by 3 or both not divisible by 3.
If d_{1}(x) ≤ 6 then d_{1}(x+3) = d_{1}(x) + 3 and d_{j}(x+3) = d_{j}(x) for j > 1. Therefore, for such an x, s(x+3) = s(x) + 3. Since s(x+3)  s(x) is divisible by 3, s(x+3) and s(x) are either both divisible by 3 or both not divisible by 3, and this finishes the treatment of this case.
If d_{1}(x) ≥ 7 then d_{1}(x+3) = d_{1}(x)  7 and there is a carry equalling 1. Now suppose, without limiting the situation in any way, that d_{j}(x) = 9 for j = 2, 3, … k and d_{k+1}(x) < 9 for a k ≥ 1. Then d_{j}(x+3) = 0 for j = 2, 3, … k and d_{k+1}(x+3) = d_{k+1}(x) + 1. We find s(x+3) = s(x)  7 + (9)*(k1) + 1 = s(x) + (9)*(k1)  6. Since s(x+3)  s(x) is divisible by 3, s(x+3) and s(x) share the same divisibility by 3 property, and this concludes the proof.
A proof diagram for the proof of Theorem 1.
Below is a proof diagram where y = x, x+1, or x+2. All y, y3, s(y3), and s(y) in the diagram have the same divisibility by 3 property.
S_{i+1}: y s(y)
 ∧
Subtract 3 >   < Lemma 1
∨ 
S_{i}: y3 > s(y3)
∧

S_{i} is valid
Reference.
1. http://mathforum.org/k12/mathtips/division.tips.html.