I'm not sure, but this is the
best way I have found so far to
get the number in human readable
form.
Here is pi, shown backwards as
it came into memory on my PC:
24 93 97 58 53 26 59 41 31
Here it is turned around:
314159265358979324
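For anyone who wants to check the turn-around by hand, here is a rough Python sketch of it. It assumes the nine bytes above are packed BCD with the least significant byte first; the helper name `bcd_to_digits` is my own invention, not anything from the FPU or FASM.

```python
# Packed BCD as it sat in memory: least significant byte first,
# two decimal digits per byte (high nibble = tens, low nibble = ones).
raw = bytes.fromhex("24 93 97 58 53 26 59 41 31")

def bcd_to_digits(data):  # hypothetical helper name
    """Return the decimal digit string for little-endian packed BCD bytes."""
    # Walk from the last (most significant) byte back to the first.
    return "".join(f"{b >> 4}{b & 0x0F}" for b in reversed(data))

print(bcd_to_digits(raw))  # 314159265358979324
```

Reversing whole bytes, not single digits, is the point here: each byte keeps its two digits in reading order.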
I multiplied pi by
one thousand, five times, and
then multiplied it by ten,
twice. I did that because
the "fbstp" instruction for
conversion to BCD (binary
coded decimal) rounds the
number to a whole number, so
I would have lost the fractional
part if I hadn't moved the
decimal point over first.
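The arithmetic behind that scaling can be sketched in Python. This is only an illustration: it uses a 20-decimal approximation of pi rather than the FPU's 64-bit mantissa, and it assumes the default round-to-nearest mode that the FPU is in after "fninit".

```python
from decimal import Decimal, ROUND_HALF_EVEN

PI = Decimal("3.14159265358979323846")  # decimal stand-in for the fldpi value

# Five multiplications by 1000 and two by 10 shift the point 17 places.
scale = 1000**5 * 10**2
scaled = PI * scale

# Rounding to a whole number, as fbstp does, now keeps 18 digits.
as_integer = scaled.quantize(Decimal(1), rounding=ROUND_HALF_EVEN)
print(as_integer)

# Without the scaling, the same rounding leaves only the 3.
unscaled = PI.quantize(Decimal(1), rounding=ROUND_HALF_EVEN)
print(unscaled)
```

With the scaling the rounded value comes out as 314159265358979324, which matches the digits shown above.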
Also, I learned to use the "fbld"
instruction, which loads a whole
number of up to 18 BCD digits;
I use it for the constants
1000 and 10.
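To see what bytes "fbld" expects for those constants, here is a small Python sketch of the encoding. As far as I can tell from the Intel manuals, a packed BCD operand is 10 bytes: 9 bytes holding 18 digits, low byte first, plus a sign byte. The function `to_packed_bcd` is a made-up helper for illustration only.

```python
DIGITS = 18  # digits in one x87 packed BCD operand

def to_packed_bcd(value):  # hypothetical helper, not an x87 or FASM API
    """Encode a non-negative integer as 10 bytes of little-endian packed BCD."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    digits = str(value).zfill(DIGITS)
    if len(digits) > DIGITS:
        raise ValueError("more than 18 digits")
    out = bytearray()
    # Take digit pairs from the right: the lowest pair becomes byte 0.
    for i in range(DIGITS, 0, -2):
        hi, lo = int(digits[i - 2]), int(digits[i - 1])
        out.append((hi << 4) | lo)
    out.append(0x00)  # sign byte: 0x00 means positive
    return bytes(out)

print(to_packed_bcd(1000).hex(" "))  # 00 10 00 00 00 00 00 00 00 00
print(to_packed_bcd(10).hex(" "))    # 10 00 00 00 00 00 00 00 00 00
```

The first bytes agree with the `db` lines for the two constants: 0x00,0x10 for 1000 and 0x10 for 10.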
Here is some of the code to enjoy reading
that displayed the 18 digits of pi
backwards like this:
24 93 97 58 53 26 59 41 31
jf_thousand db 0x00,0x10,0, 0,0,0, 0,0,0, 0,0,0, 0,0,0, 0,0,0, 0,0,0
;;A packed BCD operand is 10 bytes long (18 digits plus a sign byte); I have 21 bytes here to make sure.
;;
;;This is low byte first, two digits per byte, hence 0x00 then 0x10 for 1000.
jf_ten db 0x10,0,0, 0,0,0, 0,0,0, 0,0,0, 0,0,0, 0,0,0, 0,0,0 ;21 bytes, only need 10 according to documentation.
section '.code' code readable executable
start:
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
fwait
fninit
fwait
fldpi
fbld tword[jf_thousand]
fmulp
fbld tword[jf_thousand]
fmulp
fbld tword[jf_thousand]
fmulp
fbld tword[jf_thousand]
fmulp
fbld tword[jf_thousand]
fmulp
fbld tword[jf_ten]
fmulp
fbld tword[jf_ten]
fmulp
fbstp tword[jf_print_area + 20]
fwait
;;;;Below loop makes the number human readable
;;;;by enlarging each nibble or half-byte into
;;;;a whole byte character '0'..'9' and 'A'..'F'
;;;;Note for BCD, only '0' to '9' appear.
mov ebx,0
@@: ;;beginLOOPHEXupNUMat20
mov al,[jf_print_area + 20 + ebx]
mov cl,al
and al,0x0F
or al,0x30
cmp al,0x39
jbe jf_skip_09hex_notAtoF_AAAD
add al,7 ;;because of ASCII chart 0x3A --> 0x41, letter A (numbers and letters are not contiguous like HEX)
jf_skip_09hex_notAtoF_AAAD:
shr cl,4
and cl,0x0F
or cl,0x30
cmp cl,0x39
jbe jf_skip_hex_again_9A_border_AAAD
add cl,7
jf_skip_hex_again_9A_border_AAAD:
mov [jf_print_area + ebx + ebx],cl
mov [jf_print_area + ebx + ebx + 1],al
inc ebx
cmp ebx, 9
jbe @B ;;endLOOPHEXupNUMat20
;;output was 35C26821A2DA0FC900405Âh!¢ÚÉ
;;The twenty Hex digits represent the 80 bit code for pi, least significant byte first.
;;Read in reverse byte order they are 4000 C90FDAA22168C235 (sign and exponent, then mantissa).
fldz
fstp tword[jf_print_area + 30]
fldpi
fstp tword[jf_print_area + 20]
fwait
mov ebx,0
@@:
mov al,[jf_print_area + 20 + ebx]
mov cl,al
and al,0x0F
or al,0x30
cmp al,0x39
jbe jf_skip_09hex_notAtoF_AAAE ;;label renamed so it does not clash with the first loop
add al,7 ;;because of ASCII chart 0x3A --> 0x41, letter A (numbers and letters are not contiguous like HEX)
jf_skip_09hex_notAtoF_AAAE:
shr cl,4
and cl,0x0F
or cl,0x30
cmp cl,0x39
jbe jf_skip_hex_again_9A_border_AAAE
add cl,7
jf_skip_hex_again_9A_border_AAAE:
mov [jf_print_area + ebx + ebx],cl
mov [jf_print_area + ebx + ebx + 1],al
inc ebx
cmp ebx, 9
jbe @B
;;output was 35C26821A2DA0FC90040 5Âh!¢ÚÉ
Here's the output:
5Âh!¢ÚÉ
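Here is a rough Python check of what those twenty hex digits mean, assuming they are the ten bytes of the 80-bit extended value in memory order (low byte first). The layout used is the standard one: 1 sign bit, a 15-bit exponent with bias 16383, and a 64-bit mantissa with an explicit leading bit. The last line is a guess at where the odd characters come from: they look like the first eight raw bytes printed as Latin-1 text, up to the 0x00 byte.

```python
dump = bytes.fromhex("35C26821A2DA0FC90040")  # the 20 hex digits, in memory order

value = int.from_bytes(dump, "little")         # undo the low-byte-first order
sign = value >> 79                             # top bit
exponent = (value >> 64) & 0x7FFF              # 15 bits, bias 16383
mantissa = value & 0xFFFFFFFFFFFFFFFF          # 64 bits, integer bit included

print(f"{value:020X}")                         # 4000C90FDAA22168C235

# value = (-1)^sign * (mantissa / 2^63) * 2^(exponent - 16383)
result = (-1) ** sign * mantissa / 2**63 * 2.0 ** (exponent - 16383)
print(result)                                  # 3.141592653589793

# The same raw bytes shown as characters instead of hex digits.
as_text = dump[:8].decode("latin-1")
print(repr(as_text))
```

So the digits are not mixed up, only stored low byte first.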
Now I have to do some more learning...
The purpose of this is to be able to compute trig in the
fastest possible language known to the PC.
That's not exactly correct. Everything that runs on the computer, whether it is programmed in Java, Lisp, C++, or Whitespace, runs as machine code.
However, the translation from something such as C++ code to machine code may be inefficient, and by programming directly in assembly (which is machine code, for all practical purposes) you may be able to write the code more efficiently.
Typically the most you can do is reduce the number of times variables need to be loaded into or flushed out of the registers. Memory access is one of the slowest operations for the CPU, so this can speed up a program a good deal.
As for trig functions, you have the potential to speed them up in assembly. I've known others who have somehow gotten access to the graphics card through assembly, and were able to use it as a second processor. For those who are not aware, the graphics card is like its own specialized processor. It has its own PU (processing unit) and memory.
Perhaps the trig functions on a GPU are much more specialized than those on the CPU. After all, the number of trig calculations required in graphics is amazing.