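Obsessed Orz
I couldn't sleep for thinking about toyasm, and somehow ended up finishing the spec just now Orz. The design of JK-extended TOY assembly loosely imitates the early relationship between C++ and C: every new feature is implemented in terms of the old language, and a few details are improved along the way. The full text so far is as follows.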
NAME
toyasm - Joshsoft TOY Assembler
SYNOPSIS
toyasm [-c] [-a] [-o outfile] infile
OPTIONS
-c Use classical (cyy) TOY assembly. If not specified, JK-extended TOY assembly will be used.
-a Assemble only and generate object file; do not link.
-o outfile Specify the output file name. If not specified, the default name is 'a.toy' or 'a.obj'.
DESCRIPTION
The first part describes classical (cyy) TOY assembly. JK-extended TOY assembly is explained later.
The TOY assembly language is case-insensitive and line-oriented. Instructions and directives cannot span multiple lines, and a line can contain at most one instruction or directive. Within a line, you may use any amount of whitespace.
Comments begin with semicolons and extend to the end of the line.
There are two types of numeric literals: decimal and hexadecimal. Hexadecimal literals begin with 0x.
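The lexical rules above (comments, decimal and hexadecimal literals, case-insensitivity) can be sketched as follows. This is only an illustration, not toyasm's actual code: the function names are hypothetical, and the sketch assumes literals are unsigned, since the spec does not mention a sign.

```python
import re

# Assumption: literals are unsigned; the spec does not mention a sign.
_DECIMAL = re.compile(r"[0-9]+")
_HEX = re.compile(r"0x[0-9a-f]+")


def strip_comment(line: str) -> str:
    """Drop everything from the first semicolon to the end of the line."""
    return line.split(";", 1)[0].strip()


def parse_literal(token: str) -> int:
    """Parse a decimal literal or a hexadecimal literal beginning with 0x."""
    text = token.strip().lower()  # the language is case-insensitive
    if _HEX.fullmatch(text):
        return int(text[2:], 16)
    if _DECIMAL.fullmatch(text):
        return int(text, 10)
    raise ValueError(f"not a numeric literal: {token!r}")
```

For example, `parse_literal("0xFE")` and `parse_literal("254")` yield the same value, and `strip_comment("A DW 32 ; variable A")` leaves `A DW 32`.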
Two assembly directives for data declaration:
DW for declaring a single variable, optionally initialized
DUP for declaring an uninitialized array
Examples:
    A DW 32     ; variable A initialized to 32
    B DW        ; uninitialized variable B
    C DUP 10    ; array C of length 10
Data declarations must precede the first instruction.
Labels consist of letters and digits, and they must begin with a letter. No colons after labels.
The program starts at the first instruction the assembler encounters.
Instructions:
    0  hlt
    1  add RD, RS, RT
    2  sub RD, RS, RT
    3  and RD, RS, RT
    4  xor RD, RS, RT
    5  shl RD, RS, RT
    6  shr RD, RS, RT
    7  lda RD, addr
    8  ld  RD, addr
    9  st  RD, addr
    A  ldi RD, RT
    B  sti RD, RT
    C  bz  RD, addr
    D  bp  RD, addr
    E  jr  RD
    F  jl  RD, addr
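As a rough illustration of how the table above might map to machine words, here is a small sketch. The mnemonic-to-opcode mapping is taken from the table. The 16-bit word layout (opcode, RD, RS, RT as four 4-bit fields, or opcode, RD and an 8-bit address) is an assumption borrowed from the Princeton TOY machine; this spec does not state it. The function names are hypothetical.

```python
# Opcode numbers exactly as listed in the instruction table.
OPCODES = {
    "hlt": 0x0, "add": 0x1, "sub": 0x2, "and": 0x3,
    "xor": 0x4, "shl": 0x5, "shr": 0x6, "lda": 0x7,
    "ld": 0x8, "st": 0x9, "ldi": 0xA, "sti": 0xB,
    "bz": 0xC, "bp": 0xD, "jr": 0xE, "jl": 0xF,
}


def _check(value: int, limit: int, what: str) -> int:
    if not 0 <= value <= limit:
        raise ValueError(f"{what} out of range: {value}")
    return value


def encode_registers(mnemonic: str, rd: int, rs: int = 0, rt: int = 0) -> int:
    """Encode 'op RD, RS, RT' (assumed layout: four 4-bit fields)."""
    op = OPCODES[mnemonic.lower()]  # mnemonics are case-insensitive
    rd, rs, rt = (_check(r, 0xF, "register") for r in (rd, rs, rt))
    return (op << 12) | (rd << 8) | (rs << 4) | rt


def encode_address(mnemonic: str, rd: int, addr: int) -> int:
    """Encode 'op RD, addr' (assumed layout: 4-bit op, 4-bit RD, 8-bit addr)."""
    op = OPCODES[mnemonic.lower()]
    return (op << 12) | (_check(rd, 0xF, "register") << 8) | _check(addr, 0xFF, "address")
```

Under that assumed layout, `add R1, R2, R3` would encode as 0x1123 and `ld R1, 0x10` as 0x8110.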
The rest of this document describes JK-extended TOY assembly. The greatest change from classical to JK-extended TOY assembly is that the latter supports a simulated stack and, on top of it, implements several "built-in" stack operations and procedure-related directives. It also introduces label scope. To avoid confusion and for convenience, RF always holds the constant 1 throughout the program, and RB is read-only inside a procedure. RE also plays a special role. These are explained below.
By default, all labels (including variables and procedures) are local to the translation unit. You can precede a label with the keyword "export" to make the label globally visible. Labels inside a procedure cannot be exported.
Examples:
    A DW                    ; A is local to this translation unit
    export B DW             ; B can be seen from other translation units
    PROC p                  ; p is local to this translation unit
        ; ...
    loop1 sub R1, R1, R2    ; loop1 is local to p
        ; ...
    ENDPROC
    export PROC q           ; q can be called from other translation units
        ; ...
    loop1 add R3, R1, R2    ; ok, this loop1 does not conflict with the loop1 in p
        ; ...
    ENDPROC
The stack is located at the tail of memory and grows from higher addresses to lower ones. RE is the stack pointer; it points to the slot where the next pushed element will be stored. When the program starts, RE is initialized to 0xFE and RF is initialized to 1. RF is guaranteed to remain 1, so you can use it as the constant 1. The two stack operations and their equivalent classical code are listed below:
    push RX     sti RX, RE
                sub RE, RE, RF
    pop RX      ; RX cannot be RF. Inside a procedure, RX cannot be RB, either.
                add RE, RE, RF
                ldi RX, RE
    pop         add RE, RE, RF
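The stack operations above can be modeled with a short simulation. This is a sketch under assumptions, not part of toyasm: the `Machine` class is hypothetical, memory is assumed to be 256 words (consistent with RE starting at 0xFE), and the in-procedure restriction on RB is not modeled.

```python
RE, RF = 0xE, 0xF  # register numbers of the stack pointer and the constant 1


class Machine:
    """Hypothetical model of the registers and memory the stack relies on."""

    def __init__(self) -> None:
        self.mem = [0] * 256   # assumption: 256 words of memory
        self.reg = [0] * 16
        self.reg[RE] = 0xFE    # initial stack pointer, per the spec
        self.reg[RF] = 1       # RF is the constant 1, per the spec

    def push(self, rx: int) -> None:
        self.mem[self.reg[RE]] = self.reg[rx]   # sti RX, RE
        self.reg[RE] -= self.reg[RF]            # sub RE, RE, RF

    def pop(self, rx=None) -> None:
        if rx == RF:
            raise ValueError("RX cannot be RF")
        self.reg[RE] += self.reg[RF]            # add RE, RE, RF
        if rx is not None:
            self.reg[rx] = self.mem[self.reg[RE]]  # ldi RX, RE
```

Because RE always points at the next free slot, `push` stores first and then decrements, while `pop` increments first and then loads.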
Procedures begin with a procedure declaration, which consists of the keyword PROC and the name of the procedure, and end with the keyword ENDPROC. When the assembler sees a procedure, it automatically does the following translation, building up the stack frame:
    export PROC p           export p sti RF, RE     ; push RF (the return address)
        ; code ...    ==>            lda RF, 1      ; restore RF to constant 1
    ENDPROC                          sub RE, RE, RF
                                     sti RB, RE     ; save previous RB
                                     add RB, RE, R0 ; RB <- RE
                                     sub RE, RE, RF
                                     ; code ...
                                     ret            ; if the line above ENDPROC
                                                    ; is not "ret"
If a translation unit contains only procedures and data declarations (i.e., no "exposed" instructions), it can only be assembled into an object file.
Two directives "call" and "ret" are used to transfer control to and from a procedure.
    call p      ; p is a visible procedure
                jl RF, p
                lda RF, 1
    ret         ; valid only inside a procedure
                add RE, RB, R0  ; RE <- RB, clear local variables
                ldi RB, RE      ; restore previous RB
                add RE, RE, RF
                ldi RF, RE      ; return address now in RF
                jr RF
The following figures show the stages of a procedure call and the status of the stack:
1. pushing arguments (caller)
                       push R1                  push R2
          |    |                   |    |               RE -> |____|
          |    |       ==>   RE -> |____|     ==>             |0001|
    RE -> |____|                   |0002|                     |0002|
2. calling the procedure (caller & callee)
                       call p
          |    |               RE -> |____|
    RE -> |____|                     |00F1| <- RB
          |001D|       ==>           |001D|
          |0001|                     |0001|
          |0002|                     |0002|
        RB = 00F1
3. allocating spaces for local variables (callee)
    sub RE, RE, RF

    RE -> |____|
          |????|
          |00F1| <- RB
          |001D|
          |0001|
          |0002|
4. returning from the procedure (callee & caller)
                                ret
    RE -> |____|                   |    |                     |    |
          |????|                   |    |                     |    |
          |00F1| <- RB  ==>  RE -> |____|       ==>           |    |
          |001D|                   |001D|               RE -> |____|
          |0001|                   |0001|                     |0001|
          |0002|                   |0002|                     |0002|
                                 RB = 00F1                  RF = 001D
5. clearing arguments (caller)
                        pop                    pop
    RE -> |____|                 |    |                 |    |
          |0001|    ==>    RE -> |____|    ==>          |    |
          |0002|                 |0002|           RE -> |____|
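The five stages above can be traced with a small simulation that applies the documented expansions of push, pop, the PROC prologue, and ret one instruction at a time. It is a sketch only: the helper names are hypothetical, memory is assumed to be 256 words, and the values 0x00F1 (caller's RB) and 0x001D (return address) are simply the ones used in the figures.

```python
RB, RE, RF = 0xB, 0xE, 0xF  # register numbers


def push(reg, mem, rx):
    mem[reg[RE]] = reg[rx]      # sti RX, RE
    reg[RE] -= reg[RF]          # sub RE, RE, RF


def pop(reg, mem):
    reg[RE] += reg[RF]          # add RE, RE, RF


def call(reg, mem, return_address):
    reg[RF] = return_address    # jl RF, p (RF receives the return address)
    mem[reg[RE]] = reg[RF]      # sti RF, RE    ; push the return address
    reg[RF] = 1                 # lda RF, 1     ; restore RF to constant 1
    reg[RE] -= reg[RF]          # sub RE, RE, RF
    mem[reg[RE]] = reg[RB]      # sti RB, RE    ; save previous RB
    reg[RB] = reg[RE]           # add RB, RE, R0
    reg[RE] -= reg[RF]          # sub RE, RE, RF


def ret(reg, mem):
    reg[RE] = reg[RB]           # add RE, RB, R0 ; drop local variables
    reg[RB] = mem[reg[RE]]      # ldi RB, RE     ; restore previous RB
    reg[RE] += reg[RF]          # add RE, RE, RF
    reg[RF] = mem[reg[RE]]      # ldi RF, RE     ; return address now in RF
    target = reg[RF]            # jr RF
    reg[RF] = 1                 # the caller's "lda RF, 1" after jl
    return target


def trace():
    reg, mem = [0] * 16, [0] * 256   # assumption: 256 words of memory
    reg[RE], reg[RF], reg[RB] = 0xFE, 1, 0x00F1
    reg[1], reg[2] = 0x0002, 0x0001
    push(reg, mem, 1)                # stage 1: push the arguments
    push(reg, mem, 2)
    call(reg, mem, 0x001D)           # stage 2: call p
    frame_base = reg[RB]
    reg[RE] -= reg[RF]               # stage 3: one local variable
    target = ret(reg, mem)           # stage 4: return
    pop(reg, mem)                    # stage 5: clear the arguments
    pop(reg, mem)
    return target, reg[RE], reg[RB], frame_base
```

After `trace()` runs, RE and RB are back at their starting values, which is the invariant the prologue and `ret` are meant to preserve.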
AUTHOR
Josh Ko, Department of Computer Science and Information Engineering, National Taiwan University.
ACKNOWLEDGMENTS
Department of Computer Science, Princeton for inventing the TOY machine.
Yung-Yu Chuang (cyy) at Dept. of CSIE, NTU for designing the classical TOY assembly and instructing the course on Computer Organization and Assembly Languages.
--
cyy's powers are truly boundless Orz. I hope I don't end up being lured into doing graphics by him XD.
Oh my XD