Z80 Routines:Graphic:Fastcopy
Latest revision as of 14:17, 26 October 2009
IonFastCopy
Warning: The routines presented below may fail on some calculators due to manufacturing defects. Before using any of them, read the Safe Copy section below.
The Fastcopy routine is used to copy the contents of the graph buffer to the screen. It applies to all TI Z80 calculators except the TI-85 and TI-86, which have a special RAM area mapped directly to the screen.
FastCopy is widely used because the ROM call _GrBufCpy waits too long between successive outputs to the LCD driver. Using FastCopy instead of _GrBufCpy significantly speeds up a program that refreshes the display often, as many games do. Most shells (ION, MirageOS, Venus ...) provide this routine built in.
Here is Joe Wingbermuehle's version, which is the one used in ION.
 ;-----> Copy the gbuf to the screen (fast)
 ;Input: nothing
 ;Output: graph buffer is copied to the screen
 fastCopy:
         di
         ld a,$80
         out ($10),a
         ld hl,gbuf-12-(-(12*64)+1)
         ld a,$20
         ld c,a
         inc hl
         dec hl
 fastCopyAgain:
         ld b,64
         inc c
         ld de,-(12*64)+1
         out ($10),a
         add hl,de
         ld de,10
 fastCopyLoop:
         add hl,de
         inc hl
         inc hl
         inc de
         ld a,(hl)
         out ($11),a
         dec de
         djnz fastCopyLoop
         ld a,c
         cp $2B+1
         jr nz,fastCopyAgain
         ret
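As a rough illustration of where the routine fits in a program, here is a minimal sketch of a frame loop built around fastCopy. The labels frameLoop, drawFrame and checkQuit are hypothetical and not part of ION; only fastCopy and gbuf come from the listing above.

```asm
; Minimal sketch of a frame loop (assumption: drawFrame and checkQuit
; are routines supplied by the program; they are NOT part of ION).
frameLoop:
        ld   hl,gbuf
        ld   de,gbuf+1
        ld   bc,767
        ld   (hl),0
        ldir                    ; zero all 768 bytes of the graph buffer
        call drawFrame          ; hypothetical: render the next frame into gbuf
        call fastCopy           ; send the graph buffer to the LCD
        call checkQuit          ; hypothetical: returns Z when the user wants to exit
        jr   nz,frameLoop
        ret
```

Note that fastCopy executes di and never ei, so interrupts are still disabled when it returns; a program that relies on interrupts has to re-enable them itself.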
Remarks and Improvements
- Some instructions in Joe Wingbermuehle's Fastcopy exist only to provide enough delay between two outputs to the LCD driver. These otherwise useless instructions can be replaced with ones that clear the graph buffer at the same time:
 ;-----> Copy the gbuf to the screen and clear graph buffer (fast)
 ;Input: nothing
 ;Output: graph buffer is copied to the screen and subsequently cleared
 fastCopy:
         di
         ld a,$80
         out ($10),a
         ld hl,gbuf-12-(-(12*64)+1)
         ld a,$20
         ld c,a
         inc hl
         dec hl
 fastCopyAgain:
         ld b,64
         inc c
         ld de,-(12*64)+1
         out ($10),a
         add hl,de
         ld de,12
 fastCopyLoop:
         add hl,de
         nop
         ret z
         ld a,(hl)
         ld (hl),0 ; clears the graph buffer at the same time
         out ($11),a
         ret z
         djnz fastCopyLoop
         ld a,c
         cp $2B+1
         jr nz,fastCopyAgain
         ret
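ld (hl),0 takes 10 cycles, so we replace both inc hl instructions (6 cycles each) with a single nop (4 cycles) and load 12 into de instead of 10 to compensate. This is still 2 cycles slower than the original routine, so we also replace inc de and dec de (6 cycles each) with two ret z instructions (5 cycles each when not taken). The Z flag is never set inside the loop, so ret z never actually returns; it acts purely as a 5-cycle delay.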
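The cycle arithmetic can be checked by tallying both inner loops side by side. The listing below is only an annotated sketch: the labels origLoop and clearLoop are renamed for illustration, and the counts are the standard documented Z80 T-states, assuming no extra wait states.

```asm
; Inner loop of the original fastCopy
origLoop:
        add  hl,de          ; 11
        inc  hl             ;  6
        inc  hl             ;  6
        inc  de             ;  6
        ld   a,(hl)         ;  7
        out  ($11),a        ; 11
        dec  de             ;  6
        djnz origLoop       ; 13 (branch taken)
                            ; total = 66 T-states per byte

; Inner loop of the buffer-clearing variant
clearLoop:
        add  hl,de          ; 11
        nop                 ;  4
        ret  z              ;  5 (Z clear, not taken)
        ld   a,(hl)         ;  7
        ld   (hl),0         ; 10
        out  ($11),a        ; 11
        ret  z              ;  5 (Z clear, not taken)
        djnz clearLoop      ; 13 (branch taken)
                            ; total = 66 T-states per byte
```

Both loops take the same time per byte, so the spacing between consecutive writes to port $11 is unchanged.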
Safe Copy
Many recently manufactured TI calculators contain a buggy LCD driver that requires different (or varying) delays between accesses. Using the fast copy routines above with these LCDs causes garbled output on the display. However, we can solve this problem with some additional hardware work, by waiting until we know the LCD is ready to accept a command:
- Bit 7 of port $10 is set while the LCD is busy; once it is clear, the LCD can accept an instruction
- SE only: Bit 1 of port 2 tells us whether enough time has passed since the last LCD access
TI-OS and other apps generally use port 2, but because bit 7 of port $10 is the sign bit, we can test it directly with in f,(c) and jp m, which is a faster way of waiting.
 ;-----> Copy the gbuf to the screen, guaranteed
 ;Input: nothing
 ;Output: graph buffer is copied to the screen, no matter the speed settings
 ;
 ;in f,(c) is an unofficial instruction.
 ;It must be noted that you cannot specify any other register. Only f works.
 ;You may have to add it in order for the routine to work.
 ;if addinstr doesn't work, you may manually insert the opcodes .db 0EDh,070h
 .addinstr IN F,(C) 70ED 2 NOP 1
 SafeCopy:
         di                 ;DI is only required if an interrupt will alter the lcd.
         ld hl,PlotSScreen  ;This can be commented out or another entry placed after it
                            ;so the buffer can be provided by option.
         ld c,$10
         ld a,$80
 setrow:
         in f,(c)
         jp m,setrow
         out ($10),a
         ld de,12
         ld a,$20
 col:
         in f,(c)
         jp m,col
         out ($10),a
         push af
         ld b,64
 row:
         ld a,(hl)
 rowwait:
         in f,(c)
         jp m,rowwait
         out ($11),a
         add hl,de
         djnz row
         pop af
         dec h
         dec h
         dec h
         inc hl
         inc a
         cp $2c
         jp nz,col
         ret
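To make the role of in f,(c) clearer, the sketch below contrasts it with a busy-wait written using only documented instructions. The labels waitLcdA and waitLcdF are hypothetical helper names, not routines from TI-OS or any shell.

```asm
; Busy-wait using only documented instructions (destroys A)
waitLcdA:
        in   a,($10)        ; read the LCD status byte into A
        rla                 ; rotate bit 7 (busy) into the carry flag
        jr   c,waitLcdA     ; carry set: LCD still busy, poll again
        ret

; Busy-wait using the unofficial in f,(c) (preserves A)
waitLcdF:
        ld   c,$10          ; C = LCD command/status port
waitLcdFLoop:
        in   f,(c)          ; opcode $ED,$70: sets flags from the port, stores nothing
        jp   m,waitLcdFLoop ; sign flag = bit 7 = busy, poll again
        ret
```

The practical advantage of the second form is that A is left untouched, which is why SafeCopy can load the next byte with ld a,(hl) before waiting and send it immediately afterwards.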
Double-buffered copy
This routine uses two buffers. The first, (HL), should contain the current contents of the LCD and the second, (DE), contains the new data to be sent to the LCD. The routine compares the contents of the two buffers and only sends the bytes which have been altered. This can save a lot of time in programs which don't alter the screen much, though it will cost extra time if a lot is changing.
This routine is also "safe" - it will wait on the LCD busy bit rather than assuming it is ok to send.
 ;-------------------------------------------------------------------------------
 ;
 ; === DoubleBufferFlip ===
 ;
 ; Sends data to the LCD driver by comparing the new frame (DE) with the
 ; old one (HL) and only sending the bytes which have been altered.
 ;
 ; The front-buffer (HL) is updated to keep in-sync with the back-buffer (DE).
 ;
 ; INPUTS:
 ;
 ; REGISTERS
 ; * HL - Address of front-buffer (holding current contents of the display).
 ; * DE - Address of back-buffer (holding new contents of the display).
 ;
 ; OUTPUTS:
 ;
 ; MEMORY
 ; * (HL) - Synchronised with (DE)
 ;
 ; DESTROYED:
 ;
 ; REGISTERS
 ; * AF, BC
 ;
 ;-------------------------------------------------------------------------------
 DoubleBufferFlip:

     ;-------------------------------------------------------------------
     ; We will be writing to the LCD in Y auto-decrement mode, so
     ; set the mode first and then add 767 to the buffer addresses,
     ; since we will be reading them backwards.
     ;-------------------------------------------------------------------
             ld c, $10               ; [7] command port number.
             ld a, $06               ; [7]
             in f, (c)               ; [12] wait on LCD.
             jp m, $-2               ; [10]
             out ($10), a            ; [11] set y auto-decrement
             ld bc, 767              ; [10]
             add hl, bc              ; [11]
             ex de, hl               ; [4] front-buffer <-> back-buffer
             add hl, bc              ; [11]
             ex de, hl               ; [4] back-buffer <-> front-buffer

     ;-------------------------------------------------------------------
     ; The accumulator will be used to hold the current row command
     ; which will be sent to the LCD at the beginning of each line.
     ; This value will be kept on the stack through most of the code
     ; since we only need it every now and then. C will contain the
     ; LCD command port ($10) for the entire routine.
     ;-------------------------------------------------------------------
             ld a, $bf               ; [7] row command counter.
             ld c, $10               ; [7] command port number.

     ;-------------------------------------------------------------------
     ; This is the beginning of the outer loop which we come into at
     ; the start of each line. We load B with 12 to serve as a column
     ; counter. This is why we go backwards - so we can use DJNZ in
     ; the inner loop and then offset B to get the set-column command
     ; when we need it. At this point we also output the row command
     ; before pushing AF.
     ;-------------------------------------------------------------------
 _nextLine:  ld b, $0c               ; [7] reset column counter.
             in f, (c)               ; [12] wait on LCD.
             jp m, $-2               ; [10]
             out (c), a              ; [12] output set row command.
             push af                 ; [11]

     ;-------------------------------------------------------------------
     ; This is where we compare two buffer bytes to see if we need
     ; to write anything. If we don't, we keep looping until we do or
     ; we reach the end of the line.
     ;-------------------------------------------------------------------
 _testByte:  ld a, (de)              ; [7] load new byte
             cp (hl)                 ; [7] compare existing
             jr nz, _putByte         ; [12/7]
             dec de                  ; [6]
             dec hl                  ; [6]
             djnz _testByte          ; [13/8]

     ;-------------------------------------------------------------------
     ; This is the tail of the outer loop (the row loop). We pop
     ; the row counter and see if there are any more lines left to
     ; process. If there aren't we play nice with TI-OS and reset
     ; x auto-increment mode before returning. We also reset HL and
     ; DE to their input values.
     ;-------------------------------------------------------------------
 _loopTail:  pop af                  ; [10]
             dec a                   ; [4]
             jp m, _nextLine         ; [10]
             in f, (c)               ; [12]
             jp m, $-2               ; [10]
             ld a, $05               ; [7] reset x auto-increment
             out (c), a              ; [12]
             inc hl                  ; [6]
             inc de                  ; [6]
             ret                     ; [10]

     ;-------------------------------------------------------------------
     ; We come here when a difference has been found between a byte in
     ; the front and back buffers. We offset B (the column counter)
     ; to get the column command and then send it to the LCD.
     ;-------------------------------------------------------------------
 _putByte:   ld a, $1f               ; [7]
             add a, b                ; [4]
             in f, (c)               ; [12]
             jp m, $-2               ; [10]
             out (c), a              ; [12] set column

     ;-------------------------------------------------------------------
     ; Here we update the front buffer, load the byte to send to the
     ; LCD, wait for the LCD to become ready, and then send it.
     ;-------------------------------------------------------------------
 _streamPut: ld a, (de)              ; [7] load byte to send.
             ld (hl), a              ; [7] update front-buffer.
             dec de                  ; [6]
             dec hl                  ; [6]
             in f, (c)               ; [12] wait on LCD.
             jp m, $-2               ; [10]
             out ($11), a            ; [11]
             djnz _streamChk         ; [13/8]
             jp _loopTail            ; [10]

     ;-------------------------------------------------------------------
     ; Finally, we get here when we have just sent a byte to the LCD
     ; and there are still more bytes left in this line. Rather than
     ; going back to the inner loop, we check to see if the next byte
     ; also needs to be sent. Doing this saves us the need to output
     ; the column command between consecutive writes.
     ;-------------------------------------------------------------------
 _streamChk: ld a, (de)              ; [7]
             cp (hl)                 ; [7]
             jp nz, _streamPut + 1   ; [10] skip the one-byte ld a, (de); A is already loaded
             dec de                  ; [6]
             dec hl                  ; [6]
             djnz _testByte          ; [13/8]
             jp _loopTail            ; [10]
 ;-------------------------------------------------------------------------------
 ; End of DoubleBufferFlip
 ;-------------------------------------------------------------------------------
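One possible calling sequence is sketched below. It assumes the graph buffer PlotSScreen serves as the front buffer and the 768-byte saveSScreen area serves as the back buffer; that pairing, and the routines drawToBackBuffer and checkQuit, are hypothetical choices made for illustration only.

```asm
; Minimal sketch, assuming PlotSScreen = front buffer and saveSScreen = back buffer.
startDisplay:
        call SafeCopy           ; one-time full copy so the LCD matches the front buffer
flipLoop:
        call drawToBackBuffer   ; hypothetical: render the next frame into saveSScreen
        ld   hl,PlotSScreen     ; HL = front buffer (what the LCD currently shows)
        ld   de,saveSScreen     ; DE = back buffer (what the LCD should show next)
        call DoubleBufferFlip   ; sends only the bytes that differ
        call checkQuit          ; hypothetical: returns Z when the user wants to exit
        jr   nz,flipLoop
        ret
```

The initial full copy matters because DoubleBufferFlip trusts the front buffer: any byte that matches between the two buffers is never sent, so a front buffer that does not reflect the real LCD contents would leave stale pixels on screen.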