<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="https://wikiti.brandonw.net/skins/common/feed.css?303"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
		<id>https://wikiti.brandonw.net/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Galandros</id>
		<title>WikiTI - User contributions [en]</title>
		<link rel="self" type="application/atom+xml" href="https://wikiti.brandonw.net/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Galandros"/>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Special:Contributions/Galandros"/>
		<updated>2026-06-05T00:58:02Z</updated>
		<subtitle>User contributions</subtitle>
		<generator>MediaWiki 1.23.5</generator>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Z80_Routines:Math:Square_root</id>
		<title>Z80 Routines:Math:Square root</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Z80_Routines:Math:Square_root"/>
				<updated>2011-01-29T09:22:20Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: xor a destroys a, or a doesn't and it indeed clears the carry&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Z80 Routines:Math|Square root]]&lt;br /&gt;
[[Category:Z80 Routines|Square root]]&lt;br /&gt;
&lt;br /&gt;
==Size Optimization==&lt;br /&gt;
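This version is optimized for size. It subtracts successive odd numbers (1, 3, 5, ...) from HL, which is equivalent to comparing HL against each perfect square in turn until one that is larger is found. It is obviously slower than the routines below, but it gets the job done in only 12 bytes.&lt;br /&gt;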
 &amp;lt;nowiki&amp;gt;;-------------------------------&lt;br /&gt;
;Square Root&lt;br /&gt;
;Inputs:&lt;br /&gt;
;HL = number to be square rooted&lt;br /&gt;
;Outputs:&lt;br /&gt;
;A  = square root&lt;br /&gt;
&lt;br /&gt;
sqrt:&lt;br /&gt;
   ld a,$ff&lt;br /&gt;
   ld de,1&lt;br /&gt;
sqrtloop:&lt;br /&gt;
   inc a&lt;br /&gt;
   dec e&lt;br /&gt;
   dec de&lt;br /&gt;
   add hl,de&lt;br /&gt;
   jr c,sqrtloop&lt;br /&gt;
   ret &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
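A minimal usage sketch for the routine above, assuming it is assembled under the label sqrt as shown. The routine returns the integer part of the root, and HL and DE do not survive the call, so save them first if they are still needed:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld hl,1000     ; number to be square rooted&lt;br /&gt;
 call sqrt      ; 31*31 = 961 and 32*32 = 1024, so A = 31&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;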
&lt;br /&gt;
==Speed Optimization==&lt;br /&gt;
This version uses the high school (digit-by-digit) method of finding a square root, so it is much faster, running in about 850 T-states. Unfortunately, it requires 180 bytes and is quite obfuscated.&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;;-------------------------------&lt;br /&gt;
;Square Root&lt;br /&gt;
;Inputs:&lt;br /&gt;
;DE = number to be square rooted&lt;br /&gt;
;Outputs:&lt;br /&gt;
;A  = square root&lt;br /&gt;
&lt;br /&gt;
sqrt:&lt;br /&gt;
    xor a&lt;br /&gt;
    ld h,a&lt;br /&gt;
    ld l,a&lt;br /&gt;
    ld b,a&lt;br /&gt;
    rl d&lt;br /&gt;
    rl l&lt;br /&gt;
    rl d&lt;br /&gt;
    rl l&lt;br /&gt;
    ld c,1&lt;br /&gt;
    sbc hl,bc&lt;br /&gt;
    jp c,$+3+2+1&lt;br /&gt;
    sbc hl,bc&lt;br /&gt;
    inc a&lt;br /&gt;
    add hl,bc&lt;br /&gt;
    add a,a&lt;br /&gt;
    rl d&lt;br /&gt;
    rl l&lt;br /&gt;
    rl d&lt;br /&gt;
    rl l&lt;br /&gt;
    ld c,a&lt;br /&gt;
    scf&lt;br /&gt;
    rl c&lt;br /&gt;
    sbc hl,bc&lt;br /&gt;
    jp c,$+3+2+1&lt;br /&gt;
    sbc hl,bc&lt;br /&gt;
    inc a&lt;br /&gt;
    add hl,bc&lt;br /&gt;
    add a,a&lt;br /&gt;
    rl d&lt;br /&gt;
    rl l&lt;br /&gt;
    rl d&lt;br /&gt;
    rl l&lt;br /&gt;
    ld c,a&lt;br /&gt;
    scf&lt;br /&gt;
    rl c&lt;br /&gt;
    sbc hl,bc&lt;br /&gt;
    jp c,$+3+2+1&lt;br /&gt;
    sbc hl,bc&lt;br /&gt;
    inc a&lt;br /&gt;
    add hl,bc&lt;br /&gt;
    add a,a&lt;br /&gt;
    rl d&lt;br /&gt;
    rl l&lt;br /&gt;
    rl d&lt;br /&gt;
    rl l&lt;br /&gt;
    ld c,a&lt;br /&gt;
    scf&lt;br /&gt;
    rl c&lt;br /&gt;
    sbc hl,bc&lt;br /&gt;
    jp c,$+3+2+1&lt;br /&gt;
    sbc hl,bc&lt;br /&gt;
    inc a&lt;br /&gt;
    add hl,bc&lt;br /&gt;
    add a,a&lt;br /&gt;
    rl e&lt;br /&gt;
    adc hl,hl&lt;br /&gt;
    rl e&lt;br /&gt;
    adc hl,hl&lt;br /&gt;
    ld c,a&lt;br /&gt;
    scf&lt;br /&gt;
    rl c&lt;br /&gt;
    sbc hl,bc&lt;br /&gt;
    jp c,$+3+2+1&lt;br /&gt;
    sbc hl,bc&lt;br /&gt;
    inc a&lt;br /&gt;
    add hl,bc&lt;br /&gt;
    add a,a&lt;br /&gt;
    rl e&lt;br /&gt;
    adc hl,hl&lt;br /&gt;
    rl e&lt;br /&gt;
    adc hl,hl&lt;br /&gt;
    ld c,a&lt;br /&gt;
    scf&lt;br /&gt;
    rl c&lt;br /&gt;
    sbc hl,bc&lt;br /&gt;
    jp c,$+3+2+1&lt;br /&gt;
    sbc hl,bc&lt;br /&gt;
    inc a&lt;br /&gt;
    add hl,bc&lt;br /&gt;
    add a,a&lt;br /&gt;
    rl e&lt;br /&gt;
    adc hl,hl&lt;br /&gt;
    rl e&lt;br /&gt;
    adc hl,hl&lt;br /&gt;
    ld c,a&lt;br /&gt;
    scf&lt;br /&gt;
    rl c&lt;br /&gt;
    sbc hl,bc&lt;br /&gt;
    jp c,$+3+2+1&lt;br /&gt;
    sbc hl,bc&lt;br /&gt;
    inc a&lt;br /&gt;
    add hl,bc&lt;br /&gt;
    add a,a&lt;br /&gt;
    rl e&lt;br /&gt;
    adc hl,hl&lt;br /&gt;
    rl e&lt;br /&gt;
    adc hl,hl&lt;br /&gt;
    ld c,a&lt;br /&gt;
    scf&lt;br /&gt;
    rl c&lt;br /&gt;
    rl b&lt;br /&gt;
    sbc hl,bc&lt;br /&gt;
    jp c,$+3+2+1&lt;br /&gt;
    sbc hl,bc&lt;br /&gt;
    inc a&lt;br /&gt;
    add hl,bc&lt;br /&gt;
    ret&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Balanced Optimization==&lt;br /&gt;
This version is a balance between speed and size. It also uses the high school method and runs in under 1200 T-states. It costs only 41 bytes.&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;;-------------------------------&lt;br /&gt;
;Square Root&lt;br /&gt;
;Inputs:&lt;br /&gt;
;DE = number to be square rooted&lt;br /&gt;
;Outputs:&lt;br /&gt;
;A  = square root&lt;br /&gt;
&lt;br /&gt;
Sqrt:&lt;br /&gt;
    ld hl,0&lt;br /&gt;
    ld c,l&lt;br /&gt;
    ld b,h&lt;br /&gt;
    ld a,8&lt;br /&gt;
Sqrtloop:&lt;br /&gt;
    sla e&lt;br /&gt;
    rl d&lt;br /&gt;
    adc hl,hl&lt;br /&gt;
    sla e&lt;br /&gt;
    rl d&lt;br /&gt;
    adc hl,hl&lt;br /&gt;
    scf               ;Can be optimised&lt;br /&gt;
    rl c              ;with SL1 instruction&lt;br /&gt;
    rl b&lt;br /&gt;
    sbc hl,bc&lt;br /&gt;
    jr nc,Sqrtaddbit&lt;br /&gt;
    add hl,bc&lt;br /&gt;
    dec c&lt;br /&gt;
Sqrtaddbit:&lt;br /&gt;
    inc c&lt;br /&gt;
    res 0,c&lt;br /&gt;
    dec a&lt;br /&gt;
    jr nz,Sqrtloop&lt;br /&gt;
    ld a,c&lt;br /&gt;
    rr b&lt;br /&gt;
    rra&lt;br /&gt;
    ret&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Presumably the best ==&lt;br /&gt;
&lt;br /&gt;
This code was found on Z80 Bits and has the advantage of being both faster than all three versions above and smaller than the last two: it runs in under 720 T-states (under 640 if fully unrolled) and takes a mere 29 bytes. On the other hand, it takes a somewhat unconventional input: it computes the square root of the 16-bit number formed by LA (L is the high byte, A the low byte) and places the result in D.&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
sqrt_la:&lt;br /&gt;
	ld	de, 0040h	; 40h appends &amp;quot;01&amp;quot; to D&lt;br /&gt;
	ld	h, d&lt;br /&gt;
	&lt;br /&gt;
	ld	b, 7&lt;br /&gt;
	&lt;br /&gt;
	; need to clear the carry beforehand&lt;br /&gt;
	or	a&lt;br /&gt;
	&lt;br /&gt;
_loop:&lt;br /&gt;
	sbc	hl, de&lt;br /&gt;
	jr	nc, $+3&lt;br /&gt;
	add	hl, de&lt;br /&gt;
	ccf&lt;br /&gt;
	rl	d&lt;br /&gt;
	rla&lt;br /&gt;
	adc	hl, hl&lt;br /&gt;
	rla&lt;br /&gt;
	adc	hl, hl&lt;br /&gt;
	&lt;br /&gt;
	djnz	_loop&lt;br /&gt;
	&lt;br /&gt;
	sbc	hl, de		; optimised last iteration&lt;br /&gt;
	ccf&lt;br /&gt;
	rl	d&lt;br /&gt;
	&lt;br /&gt;
	ret&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
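Because of the unusual input, a caller usually has to shuffle registers first. A minimal sketch, assuming the 16-bit value is stored at a hypothetical label Number:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld hl,(Number)   ; Number is a hypothetical 2-byte variable&lt;br /&gt;
 ld a,l           ; A = low byte of the value&lt;br /&gt;
 ld l,h           ; L = high byte of the value&lt;br /&gt;
 call sqrt_la     ; D = square root, the routine clears H itself&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;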
&lt;br /&gt;
==Other Options==&lt;br /&gt;
A binary search of a table of squares would yield a much better best case, with a worst case similar to the high school method. However, it would also require a 512-byte table, making it significantly larger than the other routines. Of course, the table could also serve as a rapid squaring method.&lt;br /&gt;
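&lt;br /&gt;
To illustrate the last point, here is a minimal sketch of squaring by table lookup. SquareTable is a hypothetical label for the 256-entry table of 16-bit squares (0, 1, 4, 9, ...) that the binary search would use; the table itself is not shown:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;;-------------------------------&lt;br /&gt;
;Square by table lookup (sketch)&lt;br /&gt;
;Inputs:&lt;br /&gt;
;A  = number to be squared&lt;br /&gt;
;Outputs:&lt;br /&gt;
;HL = A*A&lt;br /&gt;
;Destroys A and DE&lt;br /&gt;
&lt;br /&gt;
Square:&lt;br /&gt;
    ld l,a&lt;br /&gt;
    ld h,0&lt;br /&gt;
    add hl,hl          ;two bytes per table entry&lt;br /&gt;
    ld de,SquareTable  ;hypothetical table: .dw 0,1,4,9,...&lt;br /&gt;
    add hl,de&lt;br /&gt;
    ld a,(hl)          ;low byte of the square&lt;br /&gt;
    inc hl&lt;br /&gt;
    ld h,(hl)          ;high byte of the square&lt;br /&gt;
    ld l,a&lt;br /&gt;
    ret&amp;lt;/nowiki&amp;gt;&lt;br /&gt;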
&lt;br /&gt;
== Credits and Contributions ==&lt;br /&gt;
* '''James Montelongo'''&lt;br /&gt;
* '''Milos &amp;quot;baze&amp;quot; Bazelides''' (or possibly one of the contributors to [http://baze.au.com/misc/z80bits.html z80bits])&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=TI_websites</id>
		<title>TI websites</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=TI_websites"/>
				<updated>2010-11-17T21:08:56Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: /* Guides and Tutorials */ updated&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Official =&lt;br /&gt;
&lt;br /&gt;
== General ==&lt;br /&gt;
* [http://www.ti.com/ Texas Instruments] (TI)&lt;br /&gt;
* [http://education.ti.com/ Official TI calculator site]&lt;br /&gt;
* [http://www.zilog.com/ z80 CPU Site]&lt;br /&gt;
&lt;br /&gt;
= Community =&lt;br /&gt;
&lt;br /&gt;
== Archives ==&lt;br /&gt;
* [http://www.ticalc.org/ ticalc.org]&lt;br /&gt;
* [http://calcg.org/ calcG.org] [http://www.calcgames.org/ CalcGames]&lt;br /&gt;
&lt;br /&gt;
== News ==&lt;br /&gt;
* [http://www.ticalc.org/ ticalc.org]&lt;br /&gt;
&lt;br /&gt;
== Forums ==&lt;br /&gt;
* [http://www.unitedti.org/forum United-TI Forum] (UTI Forum)&lt;br /&gt;
* [http://www.maxcoderz.com/ MaxCoderz Forum] (MC Forum)&lt;br /&gt;
* [http://www.revsoft.org/phpBB2/ Revsoft Forum] (RS Forum)&lt;br /&gt;
* [http://www.cemetech.net/forum/ Cemetech Forum]&lt;br /&gt;
* [http://www.omnimaga.org Omnimaga Forum]&lt;br /&gt;
* [http://www.detachedsolutions.com/forum Detached Solutions Forum] (DS Forum)&lt;br /&gt;
* [http://tifreakware.ath.cx/ TI-Freakware Forum]&lt;br /&gt;
* [http://otbp.tifreakware.net/phpBB2/ Outside the Box Programming Forum] (OTBP Forum)&lt;br /&gt;
&lt;br /&gt;
== Wikis ==&lt;br /&gt;
* [http://tibasicdev.wikidot.com/ TI-Basic Developer] (everything about TI-BASIC on the Z80, 68k and Nspire calculators)&lt;br /&gt;
* [http://wikiti.brandonw.net/ WikiTI] (this wiki)&lt;br /&gt;
* [http://hackspire.unsads.com/ Hackspire]&lt;br /&gt;
* [http://z80-heaven.wikidot.com/ z80 Heaven]&lt;br /&gt;
&lt;br /&gt;
== Guides and Tutorials ==&lt;br /&gt;
* [http://www.technicalc.org/ technicalc.org]&lt;br /&gt;
* [http://guide.ticalc.org/ The Guide]&lt;br /&gt;
* [http://karma.ticalc.org/ TI-82 ASM Corner]&lt;br /&gt;
* [http://sgate.emt.bme.hu/patai/publications/z80guide/ Z80 Assembly]&lt;br /&gt;
* [http://www.ticalc.org/archives/files/fileinfo/268/26877.html Learn Z80 in 28 Days]&lt;br /&gt;
* [http://www.ticalc.org/archives/files/fileinfo/429/42937.html Hot Dog's TI-83 Plus Z80 ASM Lessons (Preview/Beta Stage)]&lt;br /&gt;
&lt;br /&gt;
== Resources ==&lt;br /&gt;
* [http://www.ballyalley.com/ml/z80_docs/z80_docs.html Z80 Docs]&lt;br /&gt;
* [http://baze.au.com/misc/z80bits.html Z80 Bits]&lt;br /&gt;
* [http://www.z80.info/ Z80 INFO]&lt;br /&gt;
&lt;br /&gt;
== Active Teams ==&lt;br /&gt;
* [http://www.omnimaga.org/ Coders of Tomorrow]&lt;br /&gt;
* [http://www.revsoft.org/ Revolution Software]&lt;br /&gt;
* [http://www.junemann.nl/maxcoderz/ MaxCoderz]&lt;br /&gt;
* [http://www.detachedsolutions.com/ Detached Solutions]&lt;br /&gt;
&lt;br /&gt;
== Inactive/Ended Teams ==&lt;br /&gt;
* [http://tift.tuxfamily.org/ TIFT]&lt;br /&gt;
* [http://void.ticalc.org/ Void]&lt;br /&gt;
* [http://sicode.ticalc.org/ SiCoDe]&lt;br /&gt;
* [http://www.radicalsoft.org/ Radical Software]&lt;br /&gt;
&lt;br /&gt;
== Miscellaneous Community Websites ==&lt;br /&gt;
* [http://tifreakware.net/ TI-Freakware]&lt;br /&gt;
&lt;br /&gt;
== IRC ==&lt;br /&gt;
* [http://www.omnimaga.org/irc #omnimaga] (efnet)&lt;br /&gt;
* [http://www.omnimaga.org/irc #unitedti]&lt;br /&gt;
* [http://chat.efnet.org/ #tcpa] (efnet)&lt;br /&gt;
* [http://chat.efnet.org/ #ti] (efnet)&lt;br /&gt;
* [http://tcpa.calcg.org/ TCPA site] (efnet)&lt;br /&gt;
&lt;br /&gt;
== Programmers' sites (experiments, projects...) ==&lt;br /&gt;
* [http://brandonw.net/ Brandon Wilson site]&lt;br /&gt;
* [http://benryves.com/ Ben Ryves blog]&lt;br /&gt;
* [http://www.michaelv.org/programs/calcs/ Michael Vincent site]&lt;br /&gt;
* [http://www.geocities.com/jimm09876/calc/ James Montelongo experiments]&lt;br /&gt;
* [http://sami.ticalc.org/ Sami TI Calculators page]&lt;br /&gt;
* [http://www.dwedit.org/ Dwedit's Website]&lt;br /&gt;
&lt;br /&gt;
* [http://weregoose.unitedti.org/ Weregoose TI-BASIC archive]&lt;br /&gt;
* [http://antipi.omnimaga.org/ {AP} Site]&lt;br /&gt;
* [http://www.kalanrock.us/ kalan_vod Site]&lt;br /&gt;
&lt;br /&gt;
== French Sites ==&lt;br /&gt;
* [http://ti.bank.free.fr/ ti.bank.free.fr]&lt;br /&gt;
* [http://www.ti-fr.com/ ti-fr.com]&lt;br /&gt;
* [http://ti83.free.fr/ ti83.free.fr]&lt;br /&gt;
* [http://tibank.forumactif.com/forum.htm tibank.forumactif.com]&lt;br /&gt;
* [http://membres.lycos.fr/virtuaart/ticalculette/tiprogs.htm lycos.fr]&lt;br /&gt;
* [http://tift.tuxfamily.org/ tift]&lt;br /&gt;
* [http://ti-wiki.pbwiki.com/ ti-wiki.pbwiki]&lt;br /&gt;
* [http://www.yaronet.com/ yaronet]&lt;br /&gt;
&lt;br /&gt;
== German Sites ==&lt;br /&gt;
* [http://mobifiles.bytefox.de/ MobiFiles]&lt;br /&gt;
&lt;br /&gt;
== Dutch Sites ==&lt;br /&gt;
* [http://www.ti-wereld.nl TI-Wereld]&lt;br /&gt;
* [http://www.scholieren.be Scholieren.be] (mostly 'inhabited' by school students, not by programmers)&lt;br /&gt;
&amp;lt;!-- contributed by Mapar007 --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Projects ==&lt;br /&gt;
&lt;br /&gt;
===3rd party OS===&lt;br /&gt;
* [http://sourceforge.net/projects/lifos/ LIFOS]&lt;br /&gt;
* [http://lifos.sourceforge.net/wordpress/ LIFOS blog]&lt;br /&gt;
* [http://vera.timendus.com/ Vera]&lt;br /&gt;
* [http://pongos.sourceforge.net/ PongOS]&lt;br /&gt;
* [http://brandonw.net/ OS2]&lt;br /&gt;
* [http://code.google.com/p/8xpos/ XOS]&lt;br /&gt;
* KnightOS by SirCmpwn&lt;br /&gt;
&lt;br /&gt;
===Others===&lt;br /&gt;
* [http://api.timendus.com/ z80 assembly API]&lt;br /&gt;
* [http://clap.timendus.com/ CLAP project]&lt;br /&gt;
* [http://bell.timendus.com/ BELL project]&lt;br /&gt;
* [http://usb8x.sourceforge.net/ USB8X]&lt;br /&gt;
* [http://msd8x.denglend.net/ MSD8X]&lt;br /&gt;
* [http://usbtools.denglend.net/ USBTools]&lt;br /&gt;
* [http://dcs.cemetech.net/ DoorsCS]&lt;br /&gt;
&lt;br /&gt;
== Online Tools ==&lt;br /&gt;
* [http://www.cemetech.net/projects/basicelite/sourcecoder2.php SourceCoder2] (Online TI-BASIC editor, syntax colouring and optimizer)&lt;br /&gt;
* [http://ti.zewaren.net/ TI.ZEWAREN.NET - BETA] (general calculator variables editor)&lt;br /&gt;
* [http://galandrosdev.2kool4u.net/online_asm_unsquish.php Online Hex Disassembler] (for typing asm programs on calculator)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
= Memorable TI Websites =&lt;br /&gt;
Sites that are no longer online (RIP).&lt;br /&gt;
* TI groups discussion (closed in ...)&lt;br /&gt;
* Cirrus (merged into UTI)&lt;br /&gt;
* Kevtiva (82 TI-BASIC and ASM stuff)&lt;br /&gt;
* Alienhead&lt;br /&gt;
* Assemblers Coders Zenith&lt;br /&gt;
* Badja TI Programs&lt;br /&gt;
&lt;br /&gt;
Note: use the Wayback Machine to see how they looked: [http://www.archive.org/]&lt;br /&gt;
&lt;br /&gt;
= Other TI websites lists =&lt;br /&gt;
* [http://tifreakware.net/admin/link.php?catag=index TI-Freakware list]&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=83Plus:OS:Hooks</id>
		<title>83Plus:OS:Hooks</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=83Plus:OS:Hooks"/>
				<updated>2010-11-17T20:20:07Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: added link to hooks&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:83Plus:OS Information|Hooks]]&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
=Description=&lt;br /&gt;
Hooks are a hidden feature of the TI OS that was originally included for the official TI flash apps to use. Hooks allow a program or app to gain control at different points during the operating system's execution, in order to modify values or add features. Although the hooks were designed to be used by flash apps, they also work for programs in RAM.&lt;br /&gt;
&lt;br /&gt;
=Usage=&lt;br /&gt;
*The first step in using a hook is to decide which one you need. The list of [[:Category:83Plus:Hooks:By_Name|hooks]] can help you decide. Just be sure that the hook you pick won't interfere with other OS processes.&lt;br /&gt;
*Next, you must find an area of memory for your hook that won't get destroyed during the life of the hook. Typically hooks are stored in flash apps, so this is not a problem, as you can't overwrite an app. However, if you must put your hook in RAM, try to find an unusual place if you want your hook to survive. AppBackUpScreen is definitely not the place to put a long-term hook.&lt;br /&gt;
*Once you have the location figured out, put the address in HL and the page in A (if the hook is in RAM, the page is 1), then call the specific bcall for that hook.&lt;br /&gt;
*For the hook itself, it is imperative that the first byte is .db $83. The OS uses this as a marker for hooks. It is a safety check so that the OS doesn't jump to a hook that has been destroyed.&lt;br /&gt;
*When you are done with your hook, just call the specific disable bcall for that hook.&lt;br /&gt;
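The steps above can be sketched as follows for a hook kept in RAM. The names _EnableSomeHook and _DisableSomeHook are placeholders, not real bcalls: substitute the enable and disable bcalls documented for the hook you picked. The bcall( ) macro is assumed to come from your include file, and what the hook body must do and return depends on the specific hook:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld hl,MyHook             ; address of the hook&lt;br /&gt;
 ld a,1                   ; page, 1 because the hook is in RAM&lt;br /&gt;
 bcall(_EnableSomeHook)   ; placeholder name, see above&lt;br /&gt;
...&lt;br /&gt;
 bcall(_DisableSomeHook)  ; placeholder name, call when done&lt;br /&gt;
...&lt;br /&gt;
MyHook:&lt;br /&gt;
 .db $83                  ; marker the OS checks for&lt;br /&gt;
 ; hook code goes here&lt;br /&gt;
 ret&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;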
&lt;br /&gt;
=Memory=&lt;br /&gt;
*Each hook has 3 bytes of memory and a flag.&lt;br /&gt;
**The first two bytes are the address of the hook. The third byte is the page it is on.&lt;br /&gt;
**The flag is simple: 1 is on, 0 is off.&lt;br /&gt;
&lt;br /&gt;
=Restoring and Chaining=&lt;br /&gt;
Since the hooks are so useful, many different programs make use of them. This can create problems when two different programs want to use the same hook at the same time. There are three options in this case: ignore the first hook, save the first hook and restore it later, or chain the hooks.&lt;br /&gt;
==Restoring==&lt;br /&gt;
Restoring the hook is the easier of the last two options.&lt;br /&gt;
*The first task is to save the hook's three bytes and its active flag somewhere they won't get destroyed.&lt;br /&gt;
*Next install your hook and let it run its course.&lt;br /&gt;
*When it is time to uninstall your hook, just restore the three bytes and the active flag.&lt;br /&gt;
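A minimal sketch of saving and restoring, assuming the flag is a bit in the system flags reached through IY. The names someHookPtr, someHookFlags and someHookActive are placeholders for the 3-byte block and the flag of the hook in question, so look up the real ones on that hook's page. SavedHook and SavedHookFlag must themselves live somewhere that won't get destroyed:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
SaveHook:&lt;br /&gt;
 ld hl,someHookPtr        ; placeholder: the hook's 3-byte block&lt;br /&gt;
 ld de,SavedHook&lt;br /&gt;
 ld bc,3&lt;br /&gt;
 ldir                     ; copy address (2 bytes) and page (1 byte)&lt;br /&gt;
 xor a&lt;br /&gt;
 bit someHookActive,(iy+someHookFlags)&lt;br /&gt;
 jr z,SaveFlag&lt;br /&gt;
 inc a                    ; A = 1 if the hook was active&lt;br /&gt;
SaveFlag:&lt;br /&gt;
 ld (SavedHookFlag),a&lt;br /&gt;
 ret&lt;br /&gt;
RestoreHook:&lt;br /&gt;
 ld hl,SavedHook&lt;br /&gt;
 ld de,someHookPtr&lt;br /&gt;
 ld bc,3&lt;br /&gt;
 ldir                     ; put the old address and page back&lt;br /&gt;
 res someHookActive,(iy+someHookFlags)&lt;br /&gt;
 ld a,(SavedHookFlag)&lt;br /&gt;
 or a&lt;br /&gt;
 ret z                    ; it was off, so leave it off&lt;br /&gt;
 set someHookActive,(iy+someHookFlags)&lt;br /&gt;
 ret&lt;br /&gt;
SavedHook:&lt;br /&gt;
 .db 0,0,0&lt;br /&gt;
SavedHookFlag:&lt;br /&gt;
 .db 0&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;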
==Chaining==&lt;br /&gt;
This is much harder than restoring, but the benefits of having one hook occur after the other can be worth the hassle.&lt;br /&gt;
*Just like restoring the hook, the first step is to save the hook's three bytes and its active flag.&lt;br /&gt;
*Install your hook like normal.&lt;br /&gt;
*When your hook is finished, it now needs to jump to the address you saved in the first step. When this jump is made, care should be taken to put the calculator in exactly the same state that you found it when the hook was called. Remember: the other hook thinks it is being called by the OS.&lt;br /&gt;
A few notes:&lt;br /&gt;
*Watch out for double chaining. Say program A installs a hook. Program B sees A's hook and chains with it, so the current order is B-&amp;gt;A. But now program A gets run again. It sees program B's hook and chains to it. Program A has no idea that it is already being chained to, so now the order is A-&amp;gt;B-&amp;gt;A, and program A's hook gets called twice. While the situation is not completely avoidable, remember that it is a possibility if you chain hooks.&lt;br /&gt;
*When chaining hooks, if you need to return a value directly to the OS, it is probably safe, depending on the situation, to skip the jump to the other hook and just return to the OS. If you have to return a value, you don't have much of a choice anyway.&lt;br /&gt;
&lt;br /&gt;
=See also=&lt;br /&gt;
[[:Category:83Plus:Hooks:By_Name]]&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Category:83Plus:Quirks</id>
		<title>Category:83Plus:Quirks</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Category:83Plus:Quirks"/>
				<updated>2010-11-17T20:19:58Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: added software quirks&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Hardware Quirks ==&lt;br /&gt;
&lt;br /&gt;
The TI-83+ family of calculators has some interesting quirks in the hardware.&lt;br /&gt;
&lt;br /&gt;
*The calculator will crash if PC is greater than or equal to C000h, provided an even-numbered RAM page is swapped into the upper bank. This is the default (page 80h), and it is where the 8 KB limit comes from. It seems to be an imposed hardware constraint.&lt;br /&gt;
*Later models of calculators are missing RAM pages&lt;br /&gt;
*Later models of calculators have a longer LCD delay&lt;br /&gt;
*If a LD A,I or LD A,R instruction is interrupted, then the P/V flag is reset, even if interrupts were enabled beforehand.&lt;br /&gt;
&lt;br /&gt;
== TIOS Quirks ==&lt;br /&gt;
*In the TI-BASIC interpreter, an If inside a For( loop whose closing parenthesis was omitted can cause an abnormal amount of time to be wasted in interpretation.&lt;br /&gt;
*There are many bugs, especially in older OS versions, as [http://www.omnimaga.org/index.php?topic=1090.0 this topic] describes.&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Calculator_Documentation</id>
		<title>Calculator Documentation</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Calculator_Documentation"/>
				<updated>2010-11-17T20:18:48Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: /* TI-83 Plus Family */ quirks for hardware and OS so general name&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== TI-83 (Regular) ==&lt;br /&gt;
&lt;br /&gt;
* [[:Category:83:ROMCalls:By_Name|ROMCalls (System Entrypoints)]]&lt;br /&gt;
* [[:Category:83:Flags:By_Name|Flags]]&lt;br /&gt;
* [[:Category:83:OS_Information|General OS Documentation]]&lt;br /&gt;
* [[:Category:83:Ports:By_Address|Ports]]&lt;br /&gt;
* [[:Category:83:RAM:By_Name|RAM Areas]]&lt;br /&gt;
&lt;br /&gt;
== TI-83 Plus Family ==&lt;br /&gt;
&lt;br /&gt;
* [[:Category:83Plus:BCALLs:By_Name|B_CALLs (System Entrypoints)]]&lt;br /&gt;
* [[:Category:83Plus:Flags:By_Name|Flags]]&lt;br /&gt;
* [[:Category:83Plus:OS_Information|General OS Documentation]]&lt;br /&gt;
* [[:Category:83Plus:Hooks:By_Name|Hooks]]&lt;br /&gt;
* [[:Category:83Plus:Ports:By_Address|Ports]]&lt;br /&gt;
* [[:Category:83Plus:RAM:By_Name|RAM Areas]]&lt;br /&gt;
* [[:Category:83Plus:Basic|TI-83+ Basic]]&lt;br /&gt;
* [[:Category:83Plus:Quirks|Quirks]]&lt;br /&gt;
&lt;br /&gt;
== TI-86 ==&lt;br /&gt;
&lt;br /&gt;
* [[:Category:86:ROMCalls:By_Name|ROMCalls (System Entrypoints)]]&lt;br /&gt;
* [[:Category:86:Flags:By_Name|Flags]]&lt;br /&gt;
* [[:Category:86:OS_Information|General OS Documentation]]&lt;br /&gt;
* [[:Category:86:Ports:By_Address|Ports]]&lt;br /&gt;
* [[:Category:86:RAM:By_Name|RAM Areas]]&lt;br /&gt;
&lt;br /&gt;
== M68K Family ==&lt;br /&gt;
&lt;br /&gt;
* [[:Category:68k:Ports:By_Address|Ports]]&lt;br /&gt;
&lt;br /&gt;
== Z80 programming ==&lt;br /&gt;
* [[Meta-tutorial]]&lt;br /&gt;
* [[:Category:Z80_Routines|Z80 Routines]]&lt;br /&gt;
* [[Z80 Instruction Set]]&lt;br /&gt;
* [[Z80 Good Programming Practices]]&lt;br /&gt;
* [[Z80 Optimization]]&lt;br /&gt;
* [[Programming cross z80 calculators]]&lt;br /&gt;
* [[Programming APPS vs. Ram Programs]]&lt;br /&gt;
* [[Programming under Unix-like operating systems]]&lt;br /&gt;
* [[Programming an OS for z80 calculators]]&lt;br /&gt;
&lt;br /&gt;
== Calculator Software ==&lt;br /&gt;
* [[Notable programs]]&lt;br /&gt;
* [[:Category:83Plus:Software|83Plus Software Documentation]]&lt;br /&gt;
* [[Experiments]]&lt;br /&gt;
* [[83Plus:OS:TIOS Alternatives|TIOS Alternatives]]&lt;br /&gt;
&lt;br /&gt;
== Computer Software ==&lt;br /&gt;
* [[Link software]]&lt;br /&gt;
* [[:Category:Emulators|Emulators]]&lt;br /&gt;
* [[Assemblers]]&lt;br /&gt;
* [[Compilers]]&lt;br /&gt;
* [[IDEs]]&lt;br /&gt;
* [[Disassemblers]]&lt;br /&gt;
&lt;br /&gt;
== TI Community ==&lt;br /&gt;
&lt;br /&gt;
* [[Beginners|Beginners' manual]]&lt;br /&gt;
* [[Calculator General FAQ]]&lt;br /&gt;
* [[History of the TI Z80 community]]&lt;br /&gt;
* [[:Category:Teams|Programming Teams List]]&lt;br /&gt;
* [[TI websites|TI Web Sites List]]&lt;br /&gt;
&lt;br /&gt;
== Contributing ==&lt;br /&gt;
&lt;br /&gt;
Please read our page on [[Contributing]] before you start contributing to WikiTI.&lt;br /&gt;
&lt;br /&gt;
If you feel like contributing but do not know where, see the [[To Do List]].&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Category_talk:83Plus:Quirks</id>
		<title>Category talk:83Plus:Quirks</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Category_talk:83Plus:Quirks"/>
				<updated>2010-11-17T20:12:19Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: /* Category/page */ new section&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;quot;The calculator will crash if PC is greater than or equal to C000, provided an even-numbered RAM page is swapped in the upper bank. This is the default (page 80h). This is where the 8kb limit comes from &amp;quot;&lt;br /&gt;
&lt;br /&gt;
This is not a &amp;quot;quirk&amp;quot; but a relatively well documented hardware constraint known as memory (execution) protection.&lt;br /&gt;
&lt;br /&gt;
I don't know that &amp;quot;quirk&amp;quot; is well defined.  I intended it to be something that was useful for a programmer to know about, but that didn't really fit anywhere else.&lt;br /&gt;
&lt;br /&gt;
== Category/page ==&lt;br /&gt;
&lt;br /&gt;
This should be a page rather than a category.&lt;br /&gt;
&lt;br /&gt;
I guess the mistake came from new users using a category as a template, as they had seen in most other links under the &amp;quot;TI-83 Plus Family&amp;quot; section.&lt;br /&gt;
&lt;br /&gt;
Is an admin able to change this to a page, instead of creating a page and deleting the category?&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Category:83Plus:Hooks:By_Name</id>
		<title>Category:83Plus:Hooks:By Name</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Category:83Plus:Hooks:By_Name"/>
				<updated>2010-11-17T20:08:12Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: added new general hooks page&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:83Plus:Hooks|Hooks by Name]]&lt;br /&gt;
&lt;br /&gt;
See also the [[:Category:83Plus:Hooks:By Address|list of hooks by address]] and the [[83Plus:OS:Hooks|Hooks explanation]] for what hooks are and how to use them in general.&lt;br /&gt;
&lt;br /&gt;
Please read our page on [[Contributing]] before editing these pages!&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Category:83Plus:Hooks:By_Address</id>
		<title>Category:83Plus:Hooks:By Address</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Category:83Plus:Hooks:By_Address"/>
				<updated>2010-11-17T20:08:07Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: added new general hooks page&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:83Plus:Hooks|Hooks by Address]]&lt;br /&gt;
&lt;br /&gt;
See also the [[:Category:83Plus:Hooks:By Name|list of hooks by name]] and the [[83Plus:OS:Hooks|Hooks explanation]] for what hooks are and how to use them in general.&lt;br /&gt;
&lt;br /&gt;
Please read our page on [[Contributing]] before editing these pages!&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Category:83Plus:Hooks:By_Address</id>
		<title>Category:83Plus:Hooks:By Address</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Category:83Plus:Hooks:By_Address"/>
				<updated>2010-11-17T20:05:29Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: added new general hooks page link&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:83Plus:Hooks|Hooks by Address]]&lt;br /&gt;
&lt;br /&gt;
See also [[:Category:83Plus:Hooks:By Name|list of hooks by name]] and [[83Plus:OS:Hooks]].&lt;br /&gt;
&lt;br /&gt;
Please read our page on [[Contributing]] before editing these pages!&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Category:83Plus:Basic</id>
		<title>Category:83Plus:Basic</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Category:83Plus:Basic"/>
				<updated>2010-11-17T19:56:49Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: little changes&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:83Plus|Basic]]&lt;br /&gt;
&lt;br /&gt;
TI-BASIC is the built-in interpreted language of the TI-83 family. Its syntax is similar to that of BASIC on the PC, and it includes very simple commands for I/O (Disp, Output, Text, etc.) and flow control (If, For, While, etc.), among others.&lt;br /&gt;
&lt;br /&gt;
The main advantage of writing programs in BASIC is that it is a very high-level language, so programs can be written faster and more easily than in assembly. Also, BASIC is very unlikely to crash your calculator. The disadvantages are its execution speed (since programs are run by an interpreter) and its limited power (the built-in graphics commands are extremely limited, for example), although some assembly programs, such as xLib, can provide extra functions (such as sprites) for BASIC programs to use.&lt;br /&gt;
&lt;br /&gt;
Programs can be created on a desktop computer with TI's Graph Link software or edited on the calculator with the built-in program editor. The manual that comes with the calculator will usually have a reference for all the BASIC commands, but there are also tutorials such as [http://www.ticalc.org/archives/files/fileinfo/290/29088.html this one] or [http://www.ticalc.org/archives/files/fileinfo/137/13770.html this one] which also teach optimization and programming tricks that the manual does not cover.&lt;br /&gt;
&lt;br /&gt;
See also [http://tibasicdev.wikidot.com/home TI-Basic Developer Home] for complete documentation.&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Z80_Good_Programming_Practices</id>
		<title>Z80 Good Programming Practices</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Z80_Good_Programming_Practices"/>
				<updated>2010-11-17T19:53:31Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: improved&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
= Programming Techniques =&lt;br /&gt;
&lt;br /&gt;
== Lookup table ==&lt;br /&gt;
&lt;br /&gt;
If you have a place in your code where a value is tested to choose between a lot of things, like subroutines or data, it can be a good idea to use lookup tables instead of a series of tests. It makes the code more readable, concise and extensible.&lt;br /&gt;
&lt;br /&gt;
In terms of optimisation, though, it should be used when the data is not sequentially ordered or when the objects being pointed to are not the same size.  For example, using LUTs (lookup tables) to find a tile in a block of memory that contains only tiles would be both slower and cost more memory.  Using LUTs to find a particular string would be quicker but would use more memory than a linear search.  Using LUTs as a jump table to different code blocks located throughout a program would be faster and smaller than the alternative.&lt;br /&gt;
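&lt;br /&gt;
A minimal sketch of such a jump table, using hypothetical labels Handler0 to Handler2 for the code blocks and assuming Number always holds a valid index:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld a,(Number)&lt;br /&gt;
 add a,a          ; two bytes per entry, so at most 128 entries&lt;br /&gt;
 ld l,a&lt;br /&gt;
 ld h,0&lt;br /&gt;
 ld de,JumpTable&lt;br /&gt;
 add hl,de&lt;br /&gt;
 ld a,(hl)        ; low byte of the target address&lt;br /&gt;
 inc hl&lt;br /&gt;
 ld h,(hl)        ; high byte of the target address&lt;br /&gt;
 ld l,a&lt;br /&gt;
 jp (hl)          ; jump to the selected code block&lt;br /&gt;
...&lt;br /&gt;
JumpTable&lt;br /&gt;
 .dw Handler0&lt;br /&gt;
 .dw Handler1&lt;br /&gt;
 .dw Handler2&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;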
&lt;br /&gt;
However, if there aren't many jumps and many of the values of A are sequential, it would be more efficient to do something like:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld a,(Number)&lt;br /&gt;
 or a&lt;br /&gt;
 jp z,A_is_0&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_1&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_2&lt;br /&gt;
 sub 2&lt;br /&gt;
 jp z,A_is_4&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_5&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
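&lt;br /&gt;
For the tile case mentioned above, where every object has the same size, the address can be computed directly from the index instead of being read from a table. A minimal sketch, assuming 8-byte tiles and the hypothetical names TileNumber and TileData (neither appears elsewhere on this page):&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld a,(TileNumber) ; hypothetical variable holding the tile index&lt;br /&gt;
 ld l,a&lt;br /&gt;
 ld h,0&lt;br /&gt;
 add hl,hl         ; index*2&lt;br /&gt;
 add hl,hl         ; index*4&lt;br /&gt;
 add hl,hl         ; index*8 (assumes 8 bytes per tile)&lt;br /&gt;
 ld de,TileData    ; hypothetical label: start of the tile block&lt;br /&gt;
 add hl,de         ; hl = address of the requested tile&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;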
&lt;br /&gt;
Examples:&lt;br /&gt;
&amp;lt;table border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;2&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;tr&amp;gt;&amp;lt;th&amp;gt;Without&amp;lt;/th&amp;gt;&amp;lt;th&amp;gt;With&amp;lt;/th&amp;gt;&amp;lt;/tr&amp;gt;&lt;br /&gt;
&amp;lt;tr&amp;gt;&lt;br /&gt;
&amp;lt;td&amp;gt;&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld a,(SpriteNumber)&lt;br /&gt;
 cp 0&lt;br /&gt;
 jp z,ChooseSprite0&lt;br /&gt;
 cp 1&lt;br /&gt;
 jp z,ChooseSprite1&lt;br /&gt;
 cp 2&lt;br /&gt;
 jp z,ChooseSprite2&lt;br /&gt;
 cp 3&lt;br /&gt;
 jp z,ChooseSprite3&lt;br /&gt;
 cp 4&lt;br /&gt;
 jp z,ChooseSprite4&lt;br /&gt;
 cp 5&lt;br /&gt;
 jp z,ChooseSprite5&lt;br /&gt;
...&lt;br /&gt;
ChooseSprite0&lt;br /&gt;
 ld hl,Sprite0&lt;br /&gt;
 jp DisplaySprite&lt;br /&gt;
ChooseSprite1&lt;br /&gt;
 ld hl,Sprite1&lt;br /&gt;
 jp DisplaySprite&lt;br /&gt;
ChooseSprite2&lt;br /&gt;
 ld hl,Sprite2&lt;br /&gt;
 jp DisplaySprite&lt;br /&gt;
ChooseSprite3&lt;br /&gt;
 ld hl,Sprite3&lt;br /&gt;
 jp DisplaySprite&lt;br /&gt;
ChooseSprite4&lt;br /&gt;
 ld hl,Sprite4&lt;br /&gt;
 jp DisplaySprite&lt;br /&gt;
ChooseSprite5&lt;br /&gt;
 ld hl,Sprite5&lt;br /&gt;
 jp DisplaySprite&lt;br /&gt;
...&lt;br /&gt;
DisplaySprite&lt;br /&gt;
 ld bc,(coordinates)&lt;br /&gt;
 call SpriteRoutine&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/td&amp;gt;&lt;br /&gt;
&amp;lt;td&amp;gt;&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld a,(SpriteNumber)&lt;br /&gt;
 add a,a   ; a*2 (SpriteNumber must be below 128) &lt;br /&gt;
 ld h,0 &lt;br /&gt;
 ld l,a &lt;br /&gt;
 ld de,SpriteAddressLUT&lt;br /&gt;
 add hl,de&lt;br /&gt;
 ld a,(hl)&lt;br /&gt;
 inc hl&lt;br /&gt;
 ld h,(hl)&lt;br /&gt;
 ld l,a&lt;br /&gt;
 ld bc,(coordinates)&lt;br /&gt;
 jp SpriteRoutine&lt;br /&gt;
...&lt;br /&gt;
SpriteAddressLUT&lt;br /&gt;
 .dw Sprite0&lt;br /&gt;
 .dw Sprite1&lt;br /&gt;
 .dw Sprite2&lt;br /&gt;
 .dw Sprite3&lt;br /&gt;
 .dw Sprite4&lt;br /&gt;
 .dw Sprite5&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/td&amp;gt;&lt;br /&gt;
&amp;lt;/tr&amp;gt;&lt;br /&gt;
&amp;lt;/table&amp;gt;&lt;br /&gt;
&lt;br /&gt;
And this one:&lt;br /&gt;
&amp;lt;table border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;2&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;tr&amp;gt;&amp;lt;th&amp;gt;Without&amp;lt;/th&amp;gt;&amp;lt;th&amp;gt;With&amp;lt;/th&amp;gt;&amp;lt;/tr&amp;gt;&lt;br /&gt;
&amp;lt;tr&amp;gt;&lt;br /&gt;
&amp;lt;td&amp;gt;&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld a,(MenuChoice)&lt;br /&gt;
 cp 0&lt;br /&gt;
 jp z,Choice0&lt;br /&gt;
 cp 1&lt;br /&gt;
 jp z,Choice1&lt;br /&gt;
 cp 2&lt;br /&gt;
 jp z,Choice2&lt;br /&gt;
 cp 3&lt;br /&gt;
 jp z,Choice3&lt;br /&gt;
 cp 4&lt;br /&gt;
 jp z,Choice4&lt;br /&gt;
 cp 5&lt;br /&gt;
 jp z,Choice5&lt;br /&gt;
 cp 6&lt;br /&gt;
 jp z,Choice6&lt;br /&gt;
 cp 7&lt;br /&gt;
 jp z,Choice7&lt;br /&gt;
...&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/td&amp;gt;&lt;br /&gt;
&amp;lt;td&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld a,(MenuChoice)&lt;br /&gt;
 add a,a ; a*2&lt;br /&gt;
 ld h,0&lt;br /&gt;
 ld l,a&lt;br /&gt;
 ld de,CodeBranchLUT&lt;br /&gt;
 add hl,de&lt;br /&gt;
 ld a,(hl)&lt;br /&gt;
 inc hl&lt;br /&gt;
 ld h,(hl)&lt;br /&gt;
 ld l,a&lt;br /&gt;
 jp (hl)&lt;br /&gt;
...&lt;br /&gt;
CodeBranchLUT:&lt;br /&gt;
 .dw Choice0&lt;br /&gt;
 .dw Choice1&lt;br /&gt;
 .dw Choice2&lt;br /&gt;
 .dw Choice3&lt;br /&gt;
 .dw Choice4&lt;br /&gt;
 .dw Choice5&lt;br /&gt;
 .dw Choice6&lt;br /&gt;
 .dw Choice7&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&amp;lt;/td&amp;gt;&lt;br /&gt;
&amp;lt;/tr&amp;gt;&lt;br /&gt;
&amp;lt;/table&amp;gt;&lt;br /&gt;
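&lt;br /&gt;
Both table lookups above trust the index they are given, so an out-of-range value would load an address from whatever follows the table. One possible guard, sketched under the assumption that the code runs inside a subroutine and that NumChoices is a hypothetical equate holding the number of table entries:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld a,(MenuChoice)&lt;br /&gt;
 cp NumChoices     ; carry is set only when a is below NumChoices&lt;br /&gt;
 ret nc            ; out of range: return without jumping&lt;br /&gt;
 add a,a           ; a*2, each entry is a 2-byte address&lt;br /&gt;
 ld h,0&lt;br /&gt;
 ld l,a&lt;br /&gt;
 ld de,CodeBranchLUT&lt;br /&gt;
 add hl,de&lt;br /&gt;
 ld a,(hl)&lt;br /&gt;
 inc hl&lt;br /&gt;
 ld h,(hl)&lt;br /&gt;
 ld l,a&lt;br /&gt;
 jp (hl)&lt;br /&gt;
...&lt;br /&gt;
NumChoices .equ 8  ; hypothetical; TASM-style, equate syntax varies by assembler&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
If the code is not inside a subroutine, the ret nc can be replaced by a conditional jump to an error handler of your choice.&lt;br /&gt;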
&lt;br /&gt;
= Source Code =&lt;br /&gt;
Some advice to take into consideration:&lt;br /&gt;
* Split the source into several files in a logical way (header, main, subroutines, data) once it grows many pages long.&lt;br /&gt;
* Document every routine with its inputs, outputs, destroyed registers and, when appropriate, a short description.&lt;br /&gt;
* Document ugly and difficult parts of the code well.&lt;br /&gt;
&lt;br /&gt;
= Related topics =&lt;br /&gt;
* [http://www.unitedti.org/forum/index.php?showtopic=8461 Common Mistakes and Good techniques]&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Calculator_Documentation</id>
		<title>Calculator Documentation</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Calculator_Documentation"/>
				<updated>2010-11-17T19:46:46Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: /* Computer Software */ made it smaller&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== TI-83 (Regular) ==&lt;br /&gt;
&lt;br /&gt;
* [[:Category:83:ROMCalls:By_Name|ROMCalls (System Entrypoints)]]&lt;br /&gt;
* [[:Category:83:Flags:By_Name|Flags]]&lt;br /&gt;
* [[:Category:83:OS_Information|General OS Documentation]]&lt;br /&gt;
* [[:Category:83:Ports:By_Address|Ports]]&lt;br /&gt;
* [[:Category:83:RAM:By_Name|RAM Areas]]&lt;br /&gt;
&lt;br /&gt;
== TI-83 Plus Family ==&lt;br /&gt;
&lt;br /&gt;
* [[:Category:83Plus:BCALLs:By_Name|B_CALLs (System Entrypoints)]]&lt;br /&gt;
* [[:Category:83Plus:Flags:By_Name|Flags]]&lt;br /&gt;
* [[:Category:83Plus:OS_Information|General OS Documentation]]&lt;br /&gt;
* [[:Category:83Plus:Hooks:By_Name|Hooks]]&lt;br /&gt;
* [[:Category:83Plus:Ports:By_Address|Ports]]&lt;br /&gt;
* [[:Category:83Plus:RAM:By_Name|RAM Areas]]&lt;br /&gt;
* [[:Category:83Plus:Basic|TI-83+ Basic]]&lt;br /&gt;
* [[:Category:83Plus:Quirks|Hardware Quirks]]&lt;br /&gt;
&lt;br /&gt;
== TI-86 ==&lt;br /&gt;
&lt;br /&gt;
* [[:Category:86:ROMCalls:By_Name|ROMCalls (System Entrypoints)]]&lt;br /&gt;
* [[:Category:86:Flags:By_Name|Flags]]&lt;br /&gt;
* [[:Category:86:OS_Information|General OS Documentation]]&lt;br /&gt;
* [[:Category:86:Ports:By_Address|Ports]]&lt;br /&gt;
* [[:Category:86:RAM:By_Name|RAM Areas]]&lt;br /&gt;
&lt;br /&gt;
== M68K Family ==&lt;br /&gt;
&lt;br /&gt;
* [[:Category:68k:Ports:By_Address|Ports]]&lt;br /&gt;
&lt;br /&gt;
== Z80 programming ==&lt;br /&gt;
* [[Meta-tutorial]]&lt;br /&gt;
* [[:Category:Z80_Routines|Z80 Routines]]&lt;br /&gt;
* [[Z80 Instruction Set]]&lt;br /&gt;
* [[Z80 Good Programming Practices]]&lt;br /&gt;
* [[Z80 Optimization]]&lt;br /&gt;
* [[Programming cross z80 calculators]]&lt;br /&gt;
* [[Programming APPS vs. Ram Programs]]&lt;br /&gt;
* [[Programming under Unix-like operating systems]]&lt;br /&gt;
* [[Programming an OS for z80 calculators]]&lt;br /&gt;
&lt;br /&gt;
== Calculator Software ==&lt;br /&gt;
* [[Notable programs]]&lt;br /&gt;
* [[:Category:83Plus:Software|83Plus Software Documentation]]&lt;br /&gt;
* [[Experiments]]&lt;br /&gt;
* [[83Plus:OS:TIOS Alternatives|TIOS Alternatives]]&lt;br /&gt;
&lt;br /&gt;
== Computer Software ==&lt;br /&gt;
* [[Link software]]&lt;br /&gt;
* [[:Category:Emulators|Emulators]]&lt;br /&gt;
* [[Assemblers]]&lt;br /&gt;
* [[Compilers]]&lt;br /&gt;
* [[IDEs]]&lt;br /&gt;
* [[Disassemblers]]&lt;br /&gt;
&lt;br /&gt;
== TI Community ==&lt;br /&gt;
&lt;br /&gt;
* [[Beginners|Beginners' manual]]&lt;br /&gt;
* [[Calculator General FAQ]]&lt;br /&gt;
* [[History of the TI Z80 community]]&lt;br /&gt;
* [[:Category:Teams|Programming Teams List]]&lt;br /&gt;
* [[TI websites|TI Web Sites List]]&lt;br /&gt;
&lt;br /&gt;
== Contributing ==&lt;br /&gt;
&lt;br /&gt;
Please read our page on [[Contributing]] before you start contributing to WikiTI.&lt;br /&gt;
&lt;br /&gt;
If you feel like contributing but do not know where, see the [[To Do List]].&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Link_software</id>
		<title>Link software</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Link_software"/>
				<updated>2010-11-17T19:38:47Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: links and description&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Windows ==&lt;br /&gt;
* TI-Connect&lt;br /&gt;
Official TI (Texas Instruments) linking software for calculators (except Nspire). ([http://education.ti.com/educationportal link]; start your search there) &amp;lt;!--because TI changes their site almost every year, and every time it seems harder to find software--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* TiLP and TiLP2&lt;br /&gt;
Third-party linking software supporting the variables of all Texas Instruments calculators. ([http://lpg.ticalc.org/prj_tilp/download/setup.exe latest beta] [http://www.omnimaga.org/index.php?topic=1413.0 forum link])&lt;br /&gt;
&lt;br /&gt;
* TI-Nspire™ Computer Link Software ([http://education.ti.com/educationportal link])&lt;br /&gt;
Official TI software for linking with an Nspire.&lt;br /&gt;
&lt;br /&gt;
* TI-Graphlink&lt;br /&gt;
Old TI linking software. No longer supported.&lt;br /&gt;
&lt;br /&gt;
* (Other third-party software exists, although rarely with full support for all calculator models and their variables.)&lt;br /&gt;
&lt;br /&gt;
== Mac ==&lt;br /&gt;
* TI-Connect&lt;br /&gt;
&lt;br /&gt;
== Linux ==&lt;br /&gt;
* TiLP and TiLP2 ([http://lpg.ticalc.org/prj_tilp/download/install_tilp.sh install script, read it])&lt;br /&gt;
&lt;br /&gt;
== See also ==&lt;br /&gt;
&lt;br /&gt;
[[Link_cables]]&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Axe</id>
		<title>Axe</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Axe"/>
				<updated>2010-11-17T19:16:23Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: added to TI-83 Plus Software category&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:83Plus:Software|Axe Parser]]&lt;br /&gt;
&lt;br /&gt;
=== '''Axe Parser''' ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Axe Parser is a new programming language for the calculator.  It is typed directly into a program just like BASIC and has a similar syntax.  Unlike BASIC, however, it is a compiled language, not an interpreted one: the program gets compiled into an assembly program.  In the future you will be able to make Ion, MirageOS, and Doors programs; possibly apps too.&lt;br /&gt;
&lt;br /&gt;
'''Advantages:'''&lt;br /&gt;
You basically get the simplicity of BASIC programming with nearly the same size, speed, and compatibility as assembly programs.  You won’t need “Shells” or “Libraries” to run the programs.  They are just like any other assembly program.&lt;br /&gt;
&lt;br /&gt;
'''Syntax:'''&lt;br /&gt;
It is similar to BASIC, but also very different.  First, it has an extremely loose syntax.  You know how you can leave the end parenthesis off of BASIC commands and do multiple same-line DelVars?  It’s like that on steroids (if you so choose).  For instance: the store “-&amp;gt;” can be used in expressions like A+B-&amp;gt;C+1-&amp;gt;D so now C holds A+B and D holds A+B+1.&lt;br /&gt;
&lt;br /&gt;
'''Differences With BASIC:'''&lt;br /&gt;
A lot of commands will be re-defined.  Most are usually unused anyway, but some are not.  For instance, “DiagnosticOff” turns off the run indicator.  But “sub()” now runs a subroutine since you will be able to take characters from a string the same way you do with lists in the future.&lt;br /&gt;
&lt;br /&gt;
'''Variables and Numbers:'''&lt;br /&gt;
All numbers and the letters A-Z are 16-bit unsigned integers.  Unlike BASIC variables, they don’t reside in user RAM, so they take up zero memory.  You might want to read about unsigned numbers on Wikipedia or elsewhere if you are not familiar with them.&lt;br /&gt;
&lt;br /&gt;
'''User Defined Variables:'''&lt;br /&gt;
Things like strings, lists, sprites, and floats will be defined by the user.  &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Axe was created and is developed by Quigibo/Kevin Horowitz.&lt;br /&gt;
&lt;br /&gt;
Download link: [http://www.omnimaga.org/index.php?action=dlattach;topic=1463.0;attach=3502 Download Here]&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Programming_an_OS_for_z80_calculators</id>
		<title>Programming an OS for z80 calculators</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Programming_an_OS_for_z80_calculators"/>
				<updated>2010-11-15T21:42:54Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: for now that's it&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;It is now fairly easy for an intermediate to advanced Z80 assembly programmer to write their own operating system (OS from here on) for their calculator, mainly because there is a lot of documentation available on Z80 calculator hardware, along with emulators and example OSes.&lt;br /&gt;
&lt;br /&gt;
An OS can also be easily distributed and installed in .8xu format.&lt;br /&gt;
&lt;br /&gt;
== Main concerns before starting ==&lt;br /&gt;
{{stub}}&lt;br /&gt;
&lt;br /&gt;
== Sample basic OS functionality ==&lt;br /&gt;
&lt;br /&gt;
Every OS needs these, just to boot:&lt;br /&gt;
* Page 00 boot code&lt;br /&gt;
* A valid OS header&lt;br /&gt;
&lt;br /&gt;
But if you want it to actually do anything useful, it needs to: &lt;br /&gt;
* Initialize the LCD&lt;br /&gt;
* Set up memory&lt;br /&gt;
* Initialize the stack&lt;br /&gt;
&lt;br /&gt;
== Memory layout ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Hardware ==&lt;br /&gt;
&lt;br /&gt;
== Tools for building the OS ==&lt;br /&gt;
You will need a capable assembler. SPASM and Brass are well suited, but others may be used.&lt;br /&gt;
You will also need a program to convert the binary into a .8xu file:&lt;br /&gt;
http://www.ticalc.org/archives/files/fileinfo/383/38392.html&lt;br /&gt;
&lt;br /&gt;
Here is a script using spasm to assemble and sign the OS.&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
$ spasm main.asm main.bin&lt;br /&gt;
$ &lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Testing ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== See also ==&lt;br /&gt;
* http://www.cemetech.net/forum/viewtopic.php?t=5008&lt;br /&gt;
* other OSes source code&lt;br /&gt;
* ports&lt;br /&gt;
* (many more I will add)&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
* SirCmpwn for initial topics on forums&lt;br /&gt;
* Brandonw for giving more documentation about this&lt;br /&gt;
* etc.&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Programming_an_OS_for_z80_calculators</id>
		<title>Programming an OS for z80 calculators</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Programming_an_OS_for_z80_calculators"/>
				<updated>2010-11-15T21:39:48Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: more stuff&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;It is now fairly easy for an intermediate to advanced Z80 assembly programmer to write their own operating system (OS from here on) for their calculator, mainly because there is a lot of documentation available on Z80 calculator hardware, along with emulators and example OSes.&lt;br /&gt;
&lt;br /&gt;
An OS can also be easily distributed and installed in .8xu format.&lt;br /&gt;
&lt;br /&gt;
== Main concerns before starting ==&lt;br /&gt;
{{stub}}&lt;br /&gt;
&lt;br /&gt;
== Sample basic OS functionality ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Every OS needs these, just to boot:&lt;br /&gt;
* Page 00 boot code&lt;br /&gt;
* A valid OS header&lt;br /&gt;
&lt;br /&gt;
But if you want it to actually do anything useful, it needs to: &lt;br /&gt;
* Initialize the LCD&lt;br /&gt;
* Set up memory&lt;br /&gt;
* Initialize the stack&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Hardware ==&lt;br /&gt;
&lt;br /&gt;
== Tools ==&lt;br /&gt;
You will need a capable assembler. SPASM and Brass are well suited, but others may be used.&lt;br /&gt;
You will also need a program to convert the binary into a .8xu file:&lt;br /&gt;
http://www.ticalc.org/archives/files/fileinfo/383/38392.html&lt;br /&gt;
&lt;br /&gt;
Here is a script using spasm to assemble and sign the OS.&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
$ spasm main.asm main.bin&lt;br /&gt;
$ &lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== See also ==&lt;br /&gt;
* http://www.cemetech.net/forum/viewtopic.php?t=5008&lt;br /&gt;
* other OSes source code&lt;br /&gt;
* ports&lt;br /&gt;
* (many more I will add)&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
* SirCmpwn for initial topics on forums&lt;br /&gt;
* Brandonw for giving more documentation about this&lt;br /&gt;
* etc.&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Programming_an_OS_for_z80_calculators</id>
		<title>Programming an OS for z80 calculators</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Programming_an_OS_for_z80_calculators"/>
				<updated>2010-11-15T21:35:59Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: created and sketched...&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;It is now fairly easy for an intermediate to advanced Z80 assembly programmer to write their own operating system (OS from here on) for their calculator, mainly because there is a lot of documentation available on Z80 calculator hardware, along with emulators and example OSes.&lt;br /&gt;
&lt;br /&gt;
An OS can also be easily distributed and installed in .8xu format.&lt;br /&gt;
&lt;br /&gt;
== Main concerns before starting ==&lt;br /&gt;
{{stub}}&lt;br /&gt;
&lt;br /&gt;
== Sample basic OS functionality ==&lt;br /&gt;
&lt;br /&gt;
== Hardware ==&lt;br /&gt;
&lt;br /&gt;
== Tools ==&lt;br /&gt;
You will need a capable assembler. SPASM and Brass are well suited, but others may be used.&lt;br /&gt;
You will also need a program to convert the binary into a .8xu file:&lt;br /&gt;
http://www.ticalc.org/archives/files/fileinfo/383/38392.html&lt;br /&gt;
&lt;br /&gt;
Here is a script using spasm to assemble and sign the OS.&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
$ spasm main.asm main.bin&lt;br /&gt;
$ &lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== See also ==&lt;br /&gt;
* http://www.cemetech.net/forum/viewtopic.php?t=5008&lt;br /&gt;
* other OSes source code&lt;br /&gt;
* ports&lt;br /&gt;
* (many more I will add)&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Calculator_Documentation</id>
		<title>Calculator Documentation</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Calculator_Documentation"/>
				<updated>2010-11-15T21:28:16Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: /* Z80 programming */ added new page, maybe I will move later&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== TI-83 (Regular) ==&lt;br /&gt;
&lt;br /&gt;
* [[:Category:83:ROMCalls:By_Name|ROMCalls (System Entrypoints)]]&lt;br /&gt;
* [[:Category:83:Flags:By_Name|Flags]]&lt;br /&gt;
* [[:Category:83:OS_Information|General OS Documentation]]&lt;br /&gt;
* [[:Category:83:Ports:By_Address|Ports]]&lt;br /&gt;
* [[:Category:83:RAM:By_Name|RAM Areas]]&lt;br /&gt;
&lt;br /&gt;
== TI-83 Plus Family ==&lt;br /&gt;
&lt;br /&gt;
* [[:Category:83Plus:BCALLs:By_Name|B_CALLs (System Entrypoints)]]&lt;br /&gt;
* [[:Category:83Plus:Flags:By_Name|Flags]]&lt;br /&gt;
* [[:Category:83Plus:OS_Information|General OS Documentation]]&lt;br /&gt;
* [[:Category:83Plus:Hooks:By_Name|Hooks]]&lt;br /&gt;
* [[:Category:83Plus:Ports:By_Address|Ports]]&lt;br /&gt;
* [[:Category:83Plus:RAM:By_Name|RAM Areas]]&lt;br /&gt;
* [[:Category:83Plus:Basic|TI-83+ Basic]]&lt;br /&gt;
* [[:Category:83Plus:Quirks|Hardware Quirks]]&lt;br /&gt;
&lt;br /&gt;
== TI-86 ==&lt;br /&gt;
&lt;br /&gt;
* [[:Category:86:ROMCalls:By_Name|ROMCalls (System Entrypoints)]]&lt;br /&gt;
* [[:Category:86:Flags:By_Name|Flags]]&lt;br /&gt;
* [[:Category:86:OS_Information|General OS Documentation]]&lt;br /&gt;
* [[:Category:86:Ports:By_Address|Ports]]&lt;br /&gt;
* [[:Category:86:RAM:By_Name|RAM Areas]]&lt;br /&gt;
&lt;br /&gt;
== M68K Family ==&lt;br /&gt;
&lt;br /&gt;
* [[:Category:68k:Ports:By_Address|Ports]]&lt;br /&gt;
&lt;br /&gt;
== Z80 programming ==&lt;br /&gt;
* [[Meta-tutorial]]&lt;br /&gt;
* [[:Category:Z80_Routines|Z80 Routines]]&lt;br /&gt;
* [[Z80 Instruction Set]]&lt;br /&gt;
* [[Z80 Good Programming Practices]]&lt;br /&gt;
* [[Z80 Optimization]]&lt;br /&gt;
* [[Programming cross z80 calculators]]&lt;br /&gt;
* [[Programming APPS vs. Ram Programs]]&lt;br /&gt;
* [[Programming under Unix-like operating systems]]&lt;br /&gt;
* [[Programming an OS for z80 calculators]]&lt;br /&gt;
&lt;br /&gt;
== Calculator Software ==&lt;br /&gt;
* [[Notable programs]]&lt;br /&gt;
* [[:Category:83Plus:Software|83Plus Software Documentation]]&lt;br /&gt;
* [[Experiments]]&lt;br /&gt;
* [[83Plus:OS:TIOS Alternatives|TIOS Alternatives]]&lt;br /&gt;
&lt;br /&gt;
== Computer Software ==&lt;br /&gt;
* [[:Category:Emulators|Emulators]]&lt;br /&gt;
&lt;br /&gt;
=== Linking ===&lt;br /&gt;
* [[Link cables]]&lt;br /&gt;
* [[Link software]]&lt;br /&gt;
&lt;br /&gt;
=== Programming ===&lt;br /&gt;
* [[Assemblers]]&lt;br /&gt;
* [[Compilers]]&lt;br /&gt;
* [[IDEs]]&lt;br /&gt;
* [[:Category:Emulators|Emulators and Debuggers]]&lt;br /&gt;
* [[Disassemblers]]&lt;br /&gt;
&lt;br /&gt;
== TI Community ==&lt;br /&gt;
&lt;br /&gt;
* [[Beginners|Beginners' manual]]&lt;br /&gt;
* [[Calculator General FAQ]]&lt;br /&gt;
* [[History of the TI Z80 community]]&lt;br /&gt;
* [[:Category:Teams|Programming Teams List]]&lt;br /&gt;
* [[TI websites|TI Web Sites List]]&lt;br /&gt;
&lt;br /&gt;
== Contributing ==&lt;br /&gt;
&lt;br /&gt;
Please read our page on [[Contributing]] before you start contributing to WikiTI.&lt;br /&gt;
&lt;br /&gt;
If you feel like contributing but do not know where, see the [[To Do List]].&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Talk:Calculator_Documentation</id>
		<title>Talk:Calculator Documentation</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Talk:Calculator_Documentation"/>
				<updated>2010-11-14T15:15:14Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: /* Nspire in WikiTI */ on question of minimazing Calculator Documentation page&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;I changed some headings and moved stuff around a bit. I think it's more logical this way, but I don't mind if you change it back. [[User:Jib|Jib]] 02:37, 5 November 2006 (PST)&lt;br /&gt;
: The only thing that bugs me about it is that there's a &amp;quot;software&amp;quot; heading that includes both PC software (emulators) and calculator software (TIOS Alternatives), not to mention that software documentation is listed under a different heading.  What do you think about:&lt;br /&gt;
 Z80 Calculators&lt;br /&gt;
  TI-83&lt;br /&gt;
   Programming&lt;br /&gt;
    Ports&lt;br /&gt;
    RAM Areas&lt;br /&gt;
    B_CALLs&lt;br /&gt;
    etc&lt;br /&gt;
   Software&lt;br /&gt;
    Calculator Software&lt;br /&gt;
    Emulators&lt;br /&gt;
    Linking Software&lt;br /&gt;
    TIOS Alternatives&lt;br /&gt;
  TI-83 Plus Family&lt;br /&gt;
   Programming&lt;br /&gt;
    etc...&lt;br /&gt;
   Software&lt;br /&gt;
    etc...&lt;br /&gt;
  General Z80 Info&lt;br /&gt;
   Instruction Set&lt;br /&gt;
   etc.&lt;br /&gt;
&lt;br /&gt;
 68k Calculators&lt;br /&gt;
  TI-89/92+&lt;br /&gt;
   Programming&lt;br /&gt;
    etc.&lt;br /&gt;
   Software&lt;br /&gt;
    etc.&lt;br /&gt;
  TI-92&lt;br /&gt;
   etc.&lt;br /&gt;
  General 68k Info&lt;br /&gt;
   Instruction Set&lt;br /&gt;
   etc.&lt;br /&gt;
&lt;br /&gt;
 TI Community&lt;br /&gt;
  History&lt;br /&gt;
  Programming Teams&lt;br /&gt;
  etc.&lt;br /&gt;
&lt;br /&gt;
For example, in that organization, the TI-83 Emulators page would basically just be a list of the emulators compatible with the 83, each of which would have their own separate page.  So the VTI page might be linked from the TI-83 emulators page, the TI-85 emulators page, the TI-89 emulators page, etc.  I like this organization because it makes it clear what calculators everything is applicable to, and it decreases the number of top-level categories.  But maybe it's too complicated or has too many levels.  Thoughts?  --[[User:Dan Englender|Dan Englender]] 09:11, 5 November 2006 (PST)&lt;br /&gt;
:Yeah, it seems like a good idea to me. [[User:Jib|Jib]] 12:35, 5 November 2006 (PST)&lt;br /&gt;
&lt;br /&gt;
== Descriptions ==&lt;br /&gt;
&lt;br /&gt;
I was just made aware that searching doesn't always work so well when looking for certain things.  I think adding little descriptions after each link may help new users find things that aren't easy to find using search. I would do this myself; however, I have the grammar and spelling skills of a jar of mayonnaise.&lt;br /&gt;
&lt;br /&gt;
Also, right now the Wiki isn't very friendly to newcomers. Perhaps brainstorming some ways to improve ease of use is in order.--[[User:Jim e|Jim e]] 23:03, 13 February 2007 (PST)&lt;br /&gt;
&lt;br /&gt;
:Maybe I'm &amp;quot;thinking divergently&amp;quot;, but from the very beginning, I thought the WikiTI might be a bit more useful if one could export the contents directly to a &amp;quot;wikiti.inc&amp;quot; file or something, where we can select which information we need exported (choose a calc model, choose whether you want all the bcalls, the port info, flags, and/or the RAM addresses, with or without their accompanying documentation).  Maybe, since ti83plus.inc has recently been added, a script or something could use it as a basis, adding documentation as comments such that one can simply export the necessary WikiTI data to a file, and include it directly in one's source. [[User:Saibot84|Saibot84]] 06:03, 15 February 2007 (PST)&lt;br /&gt;
&lt;br /&gt;
If you type up descriptions, I'll be more than happy to make sure the grammar and spelling is clear and correct. [[User:Threefingeredguy|threefingeredguy]] 02:36, 14 February 2007 (PST)&lt;br /&gt;
&lt;br /&gt;
== A suggestion ==&lt;br /&gt;
* What about pages to specifications of each calculator like in:&lt;br /&gt;
http://www.ticalc.org/basics/calculators/&lt;br /&gt;
They are on wanted pages, also.&lt;br /&gt;
I am not sure how to add it.&lt;br /&gt;
* And what about TI-82 and TI-85?&lt;br /&gt;
[[User:Galandros|Galandros]] 20:12, 25 October 2009 (UTC)&lt;br /&gt;
* This maybe needs serious reorganization but I don't know how exactly. It is hard.&lt;br /&gt;
[[User:Galandros|Galandros]] 15:15, 2 November 2009 (UTC)&lt;br /&gt;
&lt;br /&gt;
== Nspire in WikiTI ==&lt;br /&gt;
&lt;br /&gt;
WikiTI does not have Nspire documentation in it but I guess it can be left for Hackspire.&lt;br /&gt;
&lt;br /&gt;
At least I think we should link to hackspire. Should it be linked in Calculator Documentation page or only in TI websites page?&lt;br /&gt;
{{unsigned|Galandros|04:15, 13 November 2010}}&lt;br /&gt;
&lt;br /&gt;
:As a practical matter, it makes sense to keep the documentation of the Nspire OS and hardware on Hackspire.  (I don't personally see any problem with putting that stuff here on WikiTI - I don't think Brandon would, either, but I can't speak for him - but in any case we should avoid duplicating effort.)  On the other hand, since a large part of WikiTI is devoted to Z80 programming, it would make a lot of sense for us to include documentation about the Nspire's TI-84 Plus emulation mode: which I/O ports are emulated, what the special opcodes do, what differences are known from one version to the next.  And any special features of the Nspire-84+ operating systems, in terms of BCALLs and memory areas, are worth documenting.  I'd recommend that this stuff be added to the regular 83Plus documentation pages, but make it clear which parts of the page apply to the real hardware and which to the emulated hardware.&lt;br /&gt;
:I'd be inclined not to put external links on the Calculator Documentation page, so as to avoid cluttering it - if anything, I'd say that page should be trimmed down a bit.  But I don't feel strongly either way.  [[User:FloppusMaximus|FloppusMaximus]] 04:04, 14 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
:: I forgot to sign my earlier message, it happens after going away from WikiTI for too much time. :)&lt;br /&gt;
:: I have linked to hackspire in http://wikiti.brandonw.net/index.php?title=TI_websites . We really should focus efforts of Nspire documentation in one wiki, so this question is almost settled.&lt;br /&gt;
:: About emulated TI-84PSE I started already here: http://wikiti.brandonw.net/index.php?title=Programming_cross_z80_calculators#Emulated_TI-84.2BSE_by_Nspire . (lacks further technical and precise documentation but is a start) But if it gets big, we can move to a new page.&lt;br /&gt;
:: I see some possible ways of reducing the Calculator Documentation page, but that would require merging some pages, moving things around or creating a category for each calculator. But that disrupts the current logic and puts those pages one more click away. I read some tips on Wikipedia, which suggest keeping the original flow of pages and categories unless there is consensus on an alternative that is superior in almost every respect. I use them as guidelines.&lt;br /&gt;
:: Actually, I have reorganized Calculator Documentation in the past, so I have thought about these problems thoroughly.&lt;br /&gt;
:: [[User:Galandros|Galandros]] 15:07, 14 November 2010 (UTC)&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Talk:Calculator_Documentation</id>
		<title>Talk:Calculator Documentation</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Talk:Calculator_Documentation"/>
				<updated>2010-11-14T15:07:06Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: /* Nspire in WikiTI */ reply to Floppus&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;I changed some headings and moved stuff around a bit. I think it's more logical this way, but I don't mind if you change it back. [[User:Jib|Jib]] 02:37, 5 November 2006 (PST)&lt;br /&gt;
: The only thing that bugs me about it is that there's a &amp;quot;software&amp;quot; heading that includes both PC software (emulators) and calculator software (TIOS Alternatives), not to mention that software documentation is listed under a different heading.  What do you think about:&lt;br /&gt;
 Z80 Calculators&lt;br /&gt;
  TI-83&lt;br /&gt;
   Programming&lt;br /&gt;
    Ports&lt;br /&gt;
    RAM Areas&lt;br /&gt;
    B_CALLs&lt;br /&gt;
    etc&lt;br /&gt;
   Software&lt;br /&gt;
    Calculator Software&lt;br /&gt;
    Emulators&lt;br /&gt;
    Linking Software&lt;br /&gt;
    TIOS Alternatives&lt;br /&gt;
  TI-83 Plus Family&lt;br /&gt;
   Programming&lt;br /&gt;
    etc...&lt;br /&gt;
   Software&lt;br /&gt;
    etc...&lt;br /&gt;
  General Z80 Info&lt;br /&gt;
   Instruction Set&lt;br /&gt;
   etc.&lt;br /&gt;
&lt;br /&gt;
 68k Calculators&lt;br /&gt;
  TI-89/92+&lt;br /&gt;
   Programming&lt;br /&gt;
    etc.&lt;br /&gt;
   Software&lt;br /&gt;
    etc.&lt;br /&gt;
  TI-92&lt;br /&gt;
   etc.&lt;br /&gt;
  General 68k Info&lt;br /&gt;
   Instruction Set&lt;br /&gt;
   etc.&lt;br /&gt;
&lt;br /&gt;
 TI Community&lt;br /&gt;
  History&lt;br /&gt;
  Programming Teams&lt;br /&gt;
  etc.&lt;br /&gt;
&lt;br /&gt;
For example, in that organization, the TI-83 Emulators page would basically just be a list of the emulators compatible with the 83, each of which would have their own separate page.  So the VTI page might be linked from the TI-83 emulators page, the TI-85 emulators page, the TI-89 emulators page, etc.  I like this organization because it makes it clear what calculators everything is applicable to, and it decreases the number of top-level categories.  But maybe it's too complicated or has too many levels.  Thoughts?  --[[User:Dan Englender|Dan Englender]] 09:11, 5 November 2006 (PST)&lt;br /&gt;
:Yeah, it seems like a good idea to me. [[User:Jib|Jib]] 12:35, 5 November 2006 (PST)&lt;br /&gt;
&lt;br /&gt;
== Descriptions ==&lt;br /&gt;
&lt;br /&gt;
I was just made aware that searching doesn't always work so well when looking for certain things.  I think adding little descriptions after each link may help new users find things that aren't easy to find using search. I would do this myself; however, I have the grammar and spelling skills of a jar of mayonnaise.&lt;br /&gt;
&lt;br /&gt;
Also, right now the Wiki isn't very friendly to newcomers. Perhaps brainstorming some ways to improve ease of use is in order.--[[User:Jim e|Jim e]] 23:03, 13 February 2007 (PST)&lt;br /&gt;
&lt;br /&gt;
:Maybe I'm &amp;quot;thinking divergently&amp;quot;, but from the very beginning, I thought the WikiTI might be a bit more useful if one could export the contents directly to a &amp;quot;wikiti.inc&amp;quot; file or something, where we can select which information we need exported (choose a calc model, choose whether you want all the bcalls, the port info, flags, and/or the RAM addresses, with or without their accompanying documentation).  Maybe, since ti83plus.inc has recently been added, a script or something could use it as a basis, adding documentation as comments such that one can simply export the necessary WikiTI data to a file, and include it directly in one's source. [[User:Saibot84|Saibot84]] 06:03, 15 February 2007 (PST)&lt;br /&gt;
&lt;br /&gt;
If you type up descriptions, I'll be more than happy to make sure the grammar and spelling is clear and correct. [[User:Threefingeredguy|threefingeredguy]] 02:36, 14 February 2007 (PST)&lt;br /&gt;
&lt;br /&gt;
== A suggestion ==&lt;br /&gt;
* What about pages with the specifications of each calculator, like those at:&lt;br /&gt;
http://www.ticalc.org/basics/calculators/&lt;br /&gt;
They are also listed on the wanted pages.&lt;br /&gt;
I am not sure how to add them.&lt;br /&gt;
* And what about TI-82 and TI-85?&lt;br /&gt;
[[User:Galandros|Galandros]] 20:12, 25 October 2009 (UTC)&lt;br /&gt;
* This may need serious reorganization, but I don't know exactly how. It is hard.&lt;br /&gt;
[[User:Galandros|Galandros]] 15:15, 2 November 2009 (UTC)&lt;br /&gt;
&lt;br /&gt;
== Nspire in WikiTI ==&lt;br /&gt;
&lt;br /&gt;
WikiTI does not have Nspire documentation in it but I guess it can be left for Hackspire.&lt;br /&gt;
&lt;br /&gt;
At the very least, I think we should link to Hackspire. Should it be linked on the Calculator Documentation page or only on the TI websites page?&lt;br /&gt;
{{unsigned|Galandros|04:15, 13 November 2010}}&lt;br /&gt;
&lt;br /&gt;
:As a practical matter, it makes sense to keep the documentation of the Nspire OS and hardware on Hackspire.  (I don't personally see any problem with putting that stuff here on WikiTI - I don't think Brandon would, either, but I can't speak for him - but in any case we should avoid duplicating effort.)  On the other hand, since a large part of WikiTI is devoted to Z80 programming, it would make a lot of sense for us to include documentation about the Nspire's TI-84 Plus emulation mode: which I/O ports are emulated, what the special opcodes do, what differences are known from one version to the next.  And any special features of the Nspire-84+ operating systems, in terms of BCALLs and memory areas, are worth documenting.  I'd recommend that this stuff be added to the regular 83Plus documentation pages, but make it clear which parts of the page apply to the real hardware and which to the emulated hardware.&lt;br /&gt;
:I'd be inclined not to put external links on the Calculator Documentation page, so as to avoid cluttering it - if anything, I'd say that page should be trimmed down a bit.  But I don't feel strongly either way.  [[User:FloppusMaximus|FloppusMaximus]] 04:04, 14 November 2010 (UTC)&lt;br /&gt;
&lt;br /&gt;
:: I forgot to sign my earlier message; that happens after being away from WikiTI for too long. :)&lt;br /&gt;
:: I have linked to Hackspire in http://wikiti.brandonw.net/index.php?title=TI_websites . We really should focus Nspire documentation efforts on one wiki, so this question is almost settled.&lt;br /&gt;
:: About the emulated TI-84PSE, I have already started here: http://wikiti.brandonw.net/index.php?title=Programming_cross_z80_calculators#Emulated_TI-84.2BSE_by_Nspire . (It lacks further precise technical documentation, but it is a start.) If it gets big, we can move it to a new page.&lt;br /&gt;
:: [[User:Galandros|Galandros]] 15:07, 14 November 2010 (UTC)&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Category:Z80_Routines:Data:Quicksort</id>
		<title>Category:Z80 Routines:Data:Quicksort</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Category:Z80_Routines:Data:Quicksort"/>
				<updated>2010-11-13T14:02:45Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: created&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Z80 Routines:Sort|Quicksort]]&lt;br /&gt;
[[Category:Z80 Routines:Data|Quicksort]]&lt;br /&gt;
[[Category:Z80 Routines|Quicksort]]&lt;br /&gt;
&lt;br /&gt;
=== Description ===&lt;br /&gt;
This code snippet sorts a list using the quicksort algorithm.&lt;br /&gt;
To learn about this algorithm, read [http://en.wikipedia.org/wiki/Quicksort Quicksort on Wikipedia].&lt;br /&gt;
&lt;br /&gt;
=== Code ===&lt;br /&gt;
&lt;br /&gt;
Note: this routine sorts 1-byte numbers; it needs adapting for real-world use.&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;&lt;br /&gt;
; &amp;gt;&amp;gt;&amp;gt; Quicksort routine v1.1 &amp;lt;&amp;lt;&amp;lt;&lt;br /&gt;
; by Frank Yaul 7/14/04&lt;br /&gt;
;&lt;br /&gt;
; Usage: bc-&amp;gt;first, de-&amp;gt;last,&lt;br /&gt;
;        call qsort&lt;br /&gt;
; Destroys: abcdefhl&lt;br /&gt;
;&lt;br /&gt;
qsort:  ld      hl,0&lt;br /&gt;
        push    hl&lt;br /&gt;
qsloop: ld      h,b&lt;br /&gt;
        ld      l,c&lt;br /&gt;
        or      a&lt;br /&gt;
        sbc     hl,de&lt;br /&gt;
        jp      c,next1 ;loop until lo&amp;lt;hi&lt;br /&gt;
        pop     bc&lt;br /&gt;
        ld      a,b&lt;br /&gt;
        or      c&lt;br /&gt;
        ret     z       ;bottom of stack&lt;br /&gt;
        pop     de&lt;br /&gt;
        jp      qsloop&lt;br /&gt;
next1:  push    de      ;save hi,lo&lt;br /&gt;
        push    bc&lt;br /&gt;
        ld      a,(bc)  ;pivot&lt;br /&gt;
        ld      h,a&lt;br /&gt;
        dec     bc&lt;br /&gt;
        inc     de&lt;br /&gt;
fleft:  inc     bc      ;do i++ while cur&amp;lt;piv&lt;br /&gt;
        ld      a,(bc)&lt;br /&gt;
        cp      h&lt;br /&gt;
        jp      c,fleft&lt;br /&gt;
fright: dec     de      ;do j-- while cur&amp;gt;piv&lt;br /&gt;
        ld      a,(de)&lt;br /&gt;
        ld      l,a&lt;br /&gt;
        ld      a,h&lt;br /&gt;
        cp      l&lt;br /&gt;
        jp      c,fright&lt;br /&gt;
        push    hl      ;save pivot&lt;br /&gt;
        ld      h,d     ;exit if lo&amp;gt;hi&lt;br /&gt;
        ld      l,e&lt;br /&gt;
        or      a&lt;br /&gt;
        sbc     hl,bc&lt;br /&gt;
        jp      c,next2&lt;br /&gt;
        ld      a,(bc)  ;swap (bc),(de)&lt;br /&gt;
        ld      h,a&lt;br /&gt;
        ld      a,(de)&lt;br /&gt;
        ld      (bc),a&lt;br /&gt;
        ld      a,h&lt;br /&gt;
        ld      (de),a&lt;br /&gt;
        pop     hl      ;restore pivot&lt;br /&gt;
        jp      fleft&lt;br /&gt;
next2:  pop     hl      ;restore pivot&lt;br /&gt;
        pop     hl      ;pop lo&lt;br /&gt;
        push    bc      ;stack=left-hi&lt;br /&gt;
        ld      b,h&lt;br /&gt;
        ld      c,l     ;bc=lo,de=right&lt;br /&gt;
        jp      qsloop&lt;br /&gt;
;&lt;br /&gt;
; &amp;gt;&amp;gt;&amp;gt; end Quicksort &amp;lt;&amp;lt;&amp;lt;&lt;br /&gt;
;&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
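&lt;br /&gt;
A minimal usage sketch for the routine above, assuming TASM/SPASM-style directives. The label sortme and its contents are made up for illustration; the routine compares with cp, so the bytes should end up in ascending unsigned order:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
        ld      bc,sortme       ;bc-&amp;gt;first byte of the list&lt;br /&gt;
        ld      de,sortme+7     ;de-&amp;gt;last byte of the list&lt;br /&gt;
        call    qsort           ;sorts the 8 bytes in place&lt;br /&gt;
        ret&lt;br /&gt;
sortme: .db     5,3,8,1,9,2,7,4 ;hypothetical data, must live in RAM&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;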
&lt;br /&gt;
Source: http://frank_y.scripts.mit.edu/pages/z80qsort/&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Z80_Optimization</id>
		<title>Z80 Optimization</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Z80_Optimization"/>
				<updated>2010-11-13T09:44:27Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: added table alignment&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
Sometimes you need some extra speed in ASM, or you need to make your game smaller so that it fits on the calculator. Examples: programs that use a lot of graphics or data, and graphics code for mapping, grayscale and 3D graphics.&lt;br /&gt;
&lt;br /&gt;
If you are just looking to cut a few bytes, go straight to the Small Tricks section of this page.&lt;br /&gt;
&lt;br /&gt;
== Registers and Memory ==&lt;br /&gt;
Good Z80 algorithms generally use the registers in an appropriate way.&lt;br /&gt;
It is also good practice to keep a convention and plan how you are going to use the registers.&lt;br /&gt;
&lt;br /&gt;
General use of registers:&lt;br /&gt;
* a - 8-bit accumulator&lt;br /&gt;
* b - counter&lt;br /&gt;
* c, d, e, h, l - auxiliary to the accumulator, or a copy of b or a&lt;br /&gt;
&lt;br /&gt;
* hl - 16-bit accumulator/pointer to a memory address&lt;br /&gt;
* de - pointer to a destination memory address&lt;br /&gt;
* bc - 16-bit counter&lt;br /&gt;
* ix - index register/pointer to a table in memory/saved copy of hl/pointer to memory when hl and de are being used&lt;br /&gt;
* iy - index register/pointer to a table in memory (use it when there is no other option or you need optimal execution; disable interrupts first and restore the original value on exit, because TI-OS uses it)&lt;br /&gt;
&lt;br /&gt;
=== 8-bit vs. 16-bit Operations ===&lt;br /&gt;
&lt;br /&gt;
The Z80 processor operates faster on 8-bit values.&lt;br /&gt;
Code dealing with 16-bit registers tends to be bigger and slower, because the equivalent 16-bit instruction is either slower or does not exist and has to be replaced with several instructions. Sometimes the equivalent 16-bit instruction is also 1 byte bigger.&lt;br /&gt;
If you use the ix or iy registers, operations are even slower and each instruction is always 1 byte bigger. So try to convert your code to use hl and de instead of ix and iy.&lt;br /&gt;
&lt;br /&gt;
In a practical example, imagine:&lt;br /&gt;
* you pass a value to a routine through the accumulator&lt;br /&gt;
* the only valid values of the accumulator range from 0 to 63, and the routine needs to multiply the accumulator by, say, 12, so the result (up to 756) has to be stored in a 16-bit register pair&lt;br /&gt;
* but you can multiply a by 4 before it overflows (63*4 = 252, which is smaller than 255) and take advantage of this to optimize&lt;br /&gt;
&lt;br /&gt;
Now on the code:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; The most usual way is pass A (the accumulator) right in the start to HL&lt;br /&gt;
	ld h,0&lt;br /&gt;
	ld l,a&lt;br /&gt;
	add a,a&lt;br /&gt;
	ld d,h&lt;br /&gt;
	ld e,a&lt;br /&gt;
	add hl,de&lt;br /&gt;
	add hl,hl&lt;br /&gt;
	add hl,hl	; hl=a*12&lt;br /&gt;
; 9 bytes, 56 clocks&lt;br /&gt;
&lt;br /&gt;
; But given a is between 0 and 63 you can multiply by 4 without overflowing the 8-bit limit (255)&lt;br /&gt;
	add a,a&lt;br /&gt;
	add a,a		; a*4&lt;br /&gt;
	ld l,a&lt;br /&gt;
	ld e,a&lt;br /&gt;
	ld h,0&lt;br /&gt;
	ld d,h		; hl=a*4 and de=a*4&lt;br /&gt;
	add hl,hl	; hl=a*8&lt;br /&gt;
	add hl,de	; hl=a*12&lt;br /&gt;
; 9 bytes, 49 clocks&lt;br /&gt;
&lt;br /&gt;
; hey, minus 7 clock cycles&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In this example you only shaved off a few clock cycles, but sometimes you can save some bytes, too.&lt;br /&gt;
You can do this with registers other than the accumulator.&lt;br /&gt;
&lt;br /&gt;
For example, if the value is passed in l and l is always lower than 64, you can do &amp;quot;sla l \ sla l \ ld h,0&amp;quot; to multiply l by four and then use hl for 16-bit operations. In this case you are trading size for speed: each sla instruction is 2 bytes, while add hl,hl is only 1 byte.&lt;br /&gt;
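&lt;br /&gt;
The trade-off is easier to see with the two variants next to each other. This is only a sketch using the standard Z80 timings; it assumes the value arrives in l and is below 64:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; 16-bit shifts: smaller but slower&lt;br /&gt;
	ld h,0&lt;br /&gt;
	add hl,hl&lt;br /&gt;
	add hl,hl	; hl=l*4&lt;br /&gt;
; 4 bytes, 29 clocks&lt;br /&gt;
&lt;br /&gt;
; 8-bit shifts: bigger but faster&lt;br /&gt;
	sla l&lt;br /&gt;
	sla l		; l*4 still fits in 8 bits because l is below 64&lt;br /&gt;
	ld h,0		; hl=l*4&lt;br /&gt;
; 6 bytes, 23 clocks&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;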
&lt;br /&gt;
Mind that these optimizations can introduce bugs and make the code somewhat hard to follow, so comment them.&lt;br /&gt;
I recommend applying this kind of optimization only when you really need the speed and the code is bug free.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
A common trick for multiplying by 256 is simply to load the low byte register into the high byte register. This works because in binary a multiplication by 256 is the same as shifting left by 8 bits, with zeros entering on the right. Examples:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; multiply a by 256 and store in hl&lt;br /&gt;
	ld h,a&lt;br /&gt;
	ld l,0&lt;br /&gt;
; multiply hl by 256 and store in ade (pseudo 24-bit pair register)&lt;br /&gt;
	ld a,h&lt;br /&gt;
	ld d,l&lt;br /&gt;
	ld e,0&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
If you are out of registers, try using ixh/ixl/iyh/iyl, or even the i register, for loop counters instead of maintaining a counter in memory or pushing/popping an already used register inside a loop. Using ixh/ixl/iyh/iyl will break compatibility with the TI-84+SE emulated by the Nspire. You can only use the i register for other purposes if you disable interrupts first (di).&lt;br /&gt;
&lt;br /&gt;
=== Shadow registers ===&lt;br /&gt;
&lt;br /&gt;
In some rare cases, when you run out of registers and can neither refactor your algorithm(s) nor rely on RAM storage, you may want to use the shadow registers : af', bc', de' and hl'&lt;br /&gt;
&lt;br /&gt;
These registers behave like their &amp;quot;standard&amp;quot; counterparts (af, bc, de, hl) and you can swap the two register sets using the following instructions :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ex af, af'  ; swaps af and af' as the mnemonic indicates&lt;br /&gt;
&lt;br /&gt;
 exx         ; swaps bc, de, hl and bc', de', hl'&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Shadow registers are fairly commonly used for arithmetic on big integers (16-bit to 32-bit) or for BCD operations without relying on RAM storage or on pushing and popping to the stack. Example:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
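; 32-BIT MULTIPLY USING THE SHADOW REGISTERS&lt;br /&gt;
; INPUTS:  BC'BC = MULTIPLIER (MPR), DE'DE = MULTIPLICAND (MPD)&lt;br /&gt;
; OUTPUT:  H'L'HL = LOW 32 BITS OF THE PRODUCT&lt;br /&gt;
; NOTE:    RETURNS WITH INTERRUPTS DISABLED&lt;br /&gt;
MUL32:&lt;br /&gt;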
        DI&lt;br /&gt;
        AND     A               ; RESET CARRY FLAG&lt;br /&gt;
        SBC     HL,HL           ; LOWER RESULT = 0&lt;br /&gt;
        EXX&lt;br /&gt;
        SBC     HL,HL           ; HIGHER RESULT = 0&lt;br /&gt;
        LD      A,B             ; MPR IS AC'BC&lt;br /&gt;
        LD      B,32            ; INITIALIZE LOOP COUNTER&lt;br /&gt;
MUL32LOOP:&lt;br /&gt;
        SRA     A               ; RIGHT SHIFT MPR&lt;br /&gt;
        RR      C&lt;br /&gt;
        EXX&lt;br /&gt;
        RR      B&lt;br /&gt;
        RR      C               ; LOWEST BIT INTO CARRY&lt;br /&gt;
        JR      NC,MUL32NOADD&lt;br /&gt;
        ADD     HL,DE           ; RESULT += MPD&lt;br /&gt;
        EXX&lt;br /&gt;
        ADC     HL,DE&lt;br /&gt;
        EXX&lt;br /&gt;
MUL32NOADD:&lt;br /&gt;
        SLA     E               ; LEFT SHIFT MPD&lt;br /&gt;
        RL      D&lt;br /&gt;
        EXX&lt;br /&gt;
        RL      E&lt;br /&gt;
        RL      D&lt;br /&gt;
        DJNZ    MUL32LOOP&lt;br /&gt;
        EXX&lt;br /&gt;
       &lt;br /&gt;
; RESULT IN H'L'HL&lt;br /&gt;
        RET&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Shadow registers can be a great help, but they come with two drawbacks :&lt;br /&gt;
&lt;br /&gt;
* they cannot coexist with the &amp;quot;standard&amp;quot; registers : you cannot use ld to assign from a standard to a shadow or vice-versa. Instead you must use nasty constructs such as :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; loads hl' with the contents of hl&lt;br /&gt;
 push hl&lt;br /&gt;
 exx&lt;br /&gt;
 pop hl&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* they require interrupts to be disabled, since they were originally intended for use in Interrupt Service Routines. There are situations where this is affordable and others where it isn't. Regardless, it is generally good policy to restore the previous interrupt status (enabled/disabled) upon return instead of leaving it up to the caller. Fortunately it is relatively easy to do (though it does add 4 bytes and 29/33 T-states to the routine) :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  ld a, i  ; this is the core of the trick, it sets P/V to the value of IFF so P/V is set iff interrupts were enabled at that point&lt;br /&gt;
  push af  ; save flags&lt;br /&gt;
  di       ; disable interrupts&lt;br /&gt;
  &lt;br /&gt;
  ; do something with shadow registers here&lt;br /&gt;
&lt;br /&gt;
  pop af   ; get back flags&lt;br /&gt;
  ret po   ; po = P/V reset so in this case it means interrupts were disabled before the routine was called&lt;br /&gt;
  ei       ; re-enable interrupts&lt;br /&gt;
  ret&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
: Note that this produces code that is ugly and very hard to follow, so comment it very well to make later understanding and debugging easier.&lt;br /&gt;
&lt;br /&gt;
=== SP register ===&lt;br /&gt;
&lt;br /&gt;
This register is pressed into service as a general-purpose register only in desperate situations, generally in an interrupt loop that demands as much speed as possible while all the normal registers are already in use (notably in James Montelongo's 4-level grayscale interlace in graylib2.inc).&lt;br /&gt;
You need to know these valid and not generally known instructions:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld sp,6&lt;br /&gt;
 add hl,sp&lt;br /&gt;
 sbc hl,sp&lt;br /&gt;
 inc sp&lt;br /&gt;
 dec sp&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now an example of such a situation:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld (saveSP),sp&lt;br /&gt;
;init hl,de,bc,a&lt;br /&gt;
 ld sp,6&lt;br /&gt;
loop:&lt;br /&gt;
;code&lt;br /&gt;
 add hl,sp  ;get next row of a table for example&lt;br /&gt;
;code using bc,de,ix,a&lt;br /&gt;
 ld a,b&lt;br /&gt;
 or c&lt;br /&gt;
 jp nz,loop&lt;br /&gt;
;code&lt;br /&gt;
 ld sp,(saveSP)&lt;br /&gt;
 ret    ;finish interrupt&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt; &lt;br /&gt;
&lt;br /&gt;
When you use sp in this way, you cannot push/pop registers and no calls are allowed.&lt;br /&gt;
Mind again that this should only be used as a last resort. Don't forget to save and restore sp as the example shows.&lt;br /&gt;
&lt;br /&gt;
=== Stack ===&lt;br /&gt;
&lt;br /&gt;
When you run out of registers, the stack may offer an interesting alternative to fixed RAM locations for temporary storage.&lt;br /&gt;
&lt;br /&gt;
==== Allocation ====&lt;br /&gt;
&lt;br /&gt;
You can allocate stack space with repeated push instructions, which lets you initialize the data but restricts the allocated space to multiples of 2.&lt;br /&gt;
An alternative is to allocate uninitialized stack space (hl may be replaced with an index register) :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; allocates 7 bytes of stack space : 5 bytes, 27 T-states instead of 4 bytes, 44 T-states with 4 push which would have forced the alloc of 8 bytes&lt;br /&gt;
 ld hl, -7&lt;br /&gt;
 add hl, sp&lt;br /&gt;
 ld sp, hl&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Access ====&lt;br /&gt;
&lt;br /&gt;
The most common way of accessing data allocated on the stack is through an index register, since all allocated &amp;quot;variables&amp;quot; can then be accessed without inc/dec, but this is obviously not a strict requirement. Beware though: using stack space is not always optimal in terms of speed, depending (among other things) on your register allocation strategy :&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; 3 bytes, 19 T-states&lt;br /&gt;
 ld c, (ix + n)   ; n is an immediate value in -128..127&lt;br /&gt;
 &lt;br /&gt;
 ; 4 bytes, 17 T-states, destroys a&lt;br /&gt;
 ld a, (somelocation)&lt;br /&gt;
 ld c, a&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If your needs go beyond simple load/store, however, this method starts to show its real power, since it vastly simplifies some operations that are complicated to do with fixed storage locations (and that generally clobber registers in the process).&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; 3 bytes, 19 T-states&lt;br /&gt;
 cp (ix + n)&lt;br /&gt;
&lt;br /&gt;
 sub (ix + n)&lt;br /&gt;
 sbc a, (ix + n)&lt;br /&gt;
 add a, (ix + n)&lt;br /&gt;
 adc a, (ix + n)&lt;br /&gt;
&lt;br /&gt;
 ; 3 bytes, 23 T-states&lt;br /&gt;
 inc (ix + n)&lt;br /&gt;
 dec (ix + n)&lt;br /&gt;
&lt;br /&gt;
 and (ix + n)&lt;br /&gt;
 or (ix + n)&lt;br /&gt;
 xor (ix + n)&lt;br /&gt;
&lt;br /&gt;
 ; 4 bytes, 23 T-states&lt;br /&gt;
 rl (ix + n)&lt;br /&gt;
 rr (ix + n)&lt;br /&gt;
 rlc (ix + n)&lt;br /&gt;
 rrc (ix + n)&lt;br /&gt;
 sla (ix + n)&lt;br /&gt;
 sra (ix + n)&lt;br /&gt;
 sll (ix + n)      ; undocumented instruction&lt;br /&gt;
 srl (ix + n)&lt;br /&gt;
 bit k, (ix + n)   ; k is an immediate value in 0..7 ; bit is the exception at 20 T-states&lt;br /&gt;
 set k, (ix + n)&lt;br /&gt;
 res k, (ix + n)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Again, choose wisely between hl and an index register: depending on the structure of your data, the smallest/fastest solution may vary (the hl equivalents of these instructions are generally 2 bytes smaller and 12 T-states faster, but they do not allow indexing and so may require intermediate inc/dec).&lt;br /&gt;
&lt;br /&gt;
==== Deallocation ====&lt;br /&gt;
&lt;br /&gt;
If you need to drop an entry from the stack but must preserve all registers, remember that sp can be incremented/decremented like any 16-bit register :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; drops the top stack entry : waste 1 byte and 2 T-states but may enable better register allocation...&lt;br /&gt;
 inc sp&lt;br /&gt;
 inc sp&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you have a large amount of stack space to drop and a spare 16 bit register (hl, index, or de that you can easily swap with hl) :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; drop 16 bytes of stack space : 5 bytes, 27 T-states instead of 8 bytes, 80 T-states for 8 pop&lt;br /&gt;
 ld hl, 16&lt;br /&gt;
 add hl, sp&lt;br /&gt;
 ld sp, hl&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt; &lt;br /&gt;
The larger the space to drop, the more T-states you will save, and at some point you'll start saving space as well (beyond 10 bytes, i.e. more than 5 pop instructions).&lt;br /&gt;
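&lt;br /&gt;
Putting allocation, access and deallocation together, a routine with a small stack frame might look like the sketch below. The routine name and the use of ix as a frame pointer are only illustrative assumptions, not a required convention:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; hypothetical routine keeping 4 bytes of temporaries on the stack&lt;br /&gt;
framedemo:&lt;br /&gt;
 push ix          ; preserve the caller's ix&lt;br /&gt;
 ld ix, -4&lt;br /&gt;
 add ix, sp&lt;br /&gt;
 ld sp, ix        ; allocate 4 bytes, ix points to the lowest one&lt;br /&gt;
 ld (ix + 0), a   ; store a temporary&lt;br /&gt;
 ld (ix + 1), b   ; store another one&lt;br /&gt;
 ; ... work that needs a and b for something else (pushes and pops must balance) ...&lt;br /&gt;
 ld a, (ix + 0)   ; read the first temporary back&lt;br /&gt;
 add a, (ix + 1)  ; a = first temporary + second temporary&lt;br /&gt;
 ld hl, 4&lt;br /&gt;
 add hl, sp&lt;br /&gt;
 ld sp, hl        ; drop the 4 bytes (destroys hl)&lt;br /&gt;
 pop ix           ; restore the caller's ix&lt;br /&gt;
 ret&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;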
&lt;br /&gt;
== General Algorithms ==&lt;br /&gt;
&lt;br /&gt;
Register and memory use is very important when writing concise and fast Z80 code. General optimization comes next.&lt;br /&gt;
&lt;br /&gt;
First, try to optimize the most frequently executed code: subroutines and large loops. Finding the bottleneck and fixing it is enough for many programs.&lt;br /&gt;
&lt;br /&gt;
Do not forget that in Z80 assembly vector tables (or look-up tables) often give smaller and faster code than blocks of comparisons and jumps. At other times, using a chunk of data for a task is better than a more conventional programming method (notably in graphical screen effects).&lt;br /&gt;
See [[Z80 Good Programming Practices]] for examples.&lt;br /&gt;
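&lt;br /&gt;
As a rough illustration of the vector table idea, here is a sketch of a dispatcher. The labels are hypothetical, the .dw directive assumes a TASM/SPASM-style assembler, and a is assumed to have been checked already to be in the range 0..2:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; jump to handler number a (0..2)&lt;br /&gt;
dispatch:&lt;br /&gt;
 add a,a          ; each table entry is 2 bytes&lt;br /&gt;
 ld e,a&lt;br /&gt;
 ld d,0&lt;br /&gt;
 ld hl,vectors&lt;br /&gt;
 add hl,de        ; hl points to the entry&lt;br /&gt;
 ld a,(hl)&lt;br /&gt;
 inc hl&lt;br /&gt;
 ld h,(hl)&lt;br /&gt;
 ld l,a           ; hl = handler address (stored low byte first)&lt;br /&gt;
 jp (hl)&lt;br /&gt;
vectors:&lt;br /&gt;
 .dw handler0&lt;br /&gt;
 .dw handler1&lt;br /&gt;
 .dw handler2&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
The dispatch code stays the same size no matter how many handlers are added; only the table grows, by 2 bytes per entry, whereas a chain of cp / jp z grows by 5 bytes per case.&lt;br /&gt;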
&lt;br /&gt;
Look through a complete instruction set reference for instructions that could optimize some part of your code.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
A list of things to keep in mind:&lt;br /&gt;
* Rework conditionals to be more efficient.&lt;br /&gt;
* Make sure the most common checks come first. In other words, check the special and rare cases last.&lt;br /&gt;
* Move special-case checks out of the main loop if they aren't needed there.&lt;br /&gt;
* Rearrange the program flow.&lt;br /&gt;
* When you can afford a bigger setup overhead in exchange for getting code out of the main loop, do it.&lt;br /&gt;
* When your code looks like it won't be efficient enough even after optimization, try another approach or algorithm. Search for other algorithms on Wikipedia, for instance.&lt;br /&gt;
* Rewriting code from scratch can bring new ideas (use this in desperate situations only, because of all the work needed to rewrite it).&lt;br /&gt;
* Remember that it is almost always better to leave optimization to the end. Optimizing too early can bring headaches with crashes and debugging. And because ASM is very fast, and sometimes even smaller than higher level languages, further optimization may not be needed at all.&lt;br /&gt;
* Document wacky optimizations so you can understand the code later (Z80 optimization leads to code that is very hard to understand).&lt;br /&gt;
&lt;br /&gt;
== Self Modifying Code ==&lt;br /&gt;
&lt;br /&gt;
If your code is in RAM, it can write to itself to change its own instructions. Having an instruction set reference that explains the opcodes is useful here.&lt;br /&gt;
Although self modifying code can be applied to any instruction, it is most commonly used with instructions that load constants into registers.&lt;br /&gt;
&lt;br /&gt;
Generally it is used to save a value that will be needed later (often seen with masks). Examples:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld (savemask),a&lt;br /&gt;
;...code...&lt;br /&gt;
savemask = $+1&lt;br /&gt;
 ld a,$00   ; $00 is just a placeholder&lt;br /&gt;
&lt;br /&gt;
 ld (something),hl&lt;br /&gt;
;... code&lt;br /&gt;
something = $+1&lt;br /&gt;
 ld de,$0000&lt;br /&gt;
&lt;br /&gt;
 ld (saveSP),sp&lt;br /&gt;
;... code ...&lt;br /&gt;
saveSP = $+1&lt;br /&gt;
 ld sp,$0000  ; restore sp&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SMC (Self Modifying Code) is often used together with unrolling and relative jumps. Example:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld (jpmodify),a&lt;br /&gt;
;...&lt;br /&gt;
jpmodify = $+1&lt;br /&gt;
 jr $+2      ; the offset byte (0 here) is just a placeholder; a = number of rrca to skip&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Another SMC trick is to modify load instructions that use (ix+0), changing the 0 to other values in order to read and write the nth element of a list really quickly without using any extra registers.&lt;br /&gt;
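&lt;br /&gt;
A minimal sketch of this trick, assuming that the code runs from RAM, that ix already points to the start of the list and that a holds the element index (0 to 127); the label elemindex is made up:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld (elemindex),a ; patch the displacement byte of the load below&lt;br /&gt;
;...&lt;br /&gt;
elemindex = $+2   ; ld a,(ix+d) is encoded as $DD $7E d&lt;br /&gt;
 ld a,(ix+0)      ; 0 is just a placeholder; this reads the element selected above&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;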
&lt;br /&gt;
== Small Tricks ==&lt;br /&gt;
&lt;br /&gt;
Note that the following tricks act much like a peep-hole optimizer and are the last optimization step : remember to optimize your algorithm and register allocation first, before applying any of the following, if you really want the fastest speed and the smallest code.&lt;br /&gt;
&lt;br /&gt;
Also note that nearly every trick makes the code less understandable, so documenting them is a good idea. It is easy to forget how they work after a while away from that part of the code.&lt;br /&gt;
&lt;br /&gt;
Be warned that some tricks are not exactly equivalent to the normal way and have exceptions to their use; the comments warn about them. Some tricks apply to other cases as well, but again you have to be careful.&lt;br /&gt;
&lt;br /&gt;
Some tricks are nothing more than the correct use of the instructions available on the Z80. Keeping an instruction set summary at hand helps you see what you can do while coding.&lt;br /&gt;
&lt;br /&gt;
=== Optimize size and speed ===&lt;br /&gt;
&lt;br /&gt;
==== Loading stuff ====&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of:&lt;br /&gt;
 ld a,0&lt;br /&gt;
;Try this:&lt;br /&gt;
 xor a    ;disadvantages: changes flags&lt;br /&gt;
;or&lt;br /&gt;
 sub a    ;disadvantages: changes flags&lt;br /&gt;
; -&amp;gt; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	ld b,$20&lt;br /&gt;
	ld c,$30&lt;br /&gt;
;try this&lt;br /&gt;
	ld bc,$2030&lt;br /&gt;
;or this&lt;br /&gt;
	ld bc,(b_num * 256) + c_num		;where b_num goes to b register and c_num to c register&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
  ld a,$42&lt;br /&gt;
  ld (hl),a&lt;br /&gt;
;try this&lt;br /&gt;
  ld (hl),$42&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	xor a&lt;br /&gt;
	ld (data1),a&lt;br /&gt;
	ld (data2),a&lt;br /&gt;
	ld (data3),a&lt;br /&gt;
	ld (data4),a&lt;br /&gt;
	ld (data5),a	;if data1 to data5 are one after the other&lt;br /&gt;
;try this&lt;br /&gt;
	ld hl,data1&lt;br /&gt;
	ld de,data1+1&lt;br /&gt;
	xor a&lt;br /&gt;
	ld (hl),a&lt;br /&gt;
	ld bc,4&lt;br /&gt;
	ldir&lt;br /&gt;
; -&amp;gt; save 3 bytes for every additional ld (dataX),a once past the initial overhead (size only: ldir is slower per byte)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	ld a,(var)&lt;br /&gt;
	inc a&lt;br /&gt;
	ld (var),a&lt;br /&gt;
;try this	;Note: if hl is not tied up, use indirection:&lt;br /&gt;
	ld hl,var&lt;br /&gt;
	inc (hl)&lt;br /&gt;
	ld a,(hl) ;if you don't need (hl) in a, delete this line&lt;br /&gt;
; -&amp;gt; save 2 bytes and 2 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; Instead of :&lt;br /&gt;
 ld a, (hl)&lt;br /&gt;
 ld (de), a&lt;br /&gt;
 inc hl&lt;br /&gt;
 inc de&lt;br /&gt;
; Use :&lt;br /&gt;
 ldi&lt;br /&gt;
 inc bc&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Never use:&lt;br /&gt;
    push BC&lt;br /&gt;
;    ...&lt;br /&gt;
    pop BC&lt;br /&gt;
    ld D,B&lt;br /&gt;
    ld E,C&lt;br /&gt;
;Use instead:&lt;br /&gt;
    push BC&lt;br /&gt;
;    ...&lt;br /&gt;
    pop DE      ;we only want DE to hold the pushed BC (no need for a copy in BC)&lt;br /&gt;
; -&amp;gt; save 2 bytes and 8 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Math and Logic tricks ====&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of:&lt;br /&gt;
 cp 0&lt;br /&gt;
;Use&lt;br /&gt;
 or a&lt;br /&gt;
; -&amp;gt; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  cp 1&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  dec a   ;changes a!&lt;br /&gt;
; -&amp;gt; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  xor %11111111&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  cpl&lt;br /&gt;
; -&amp;gt; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
    ld de,767&lt;br /&gt;
    or a       ;reset carry so sbc works as a sub&lt;br /&gt;
    sbc hl,de&lt;br /&gt;
;try this&lt;br /&gt;
    ld de,-767 ;negated constant&lt;br /&gt;
    add hl,de&lt;br /&gt;
; -&amp;gt; save 2 bytes and 8 T-states (but the flags differ: carry is inverted and zero is not updated)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
    ld de,-767&lt;br /&gt;
    add hl,de&lt;br /&gt;
;try this&lt;br /&gt;
    dec h  ; -256&lt;br /&gt;
    dec h  ; -512&lt;br /&gt;
    dec h  ; -768&lt;br /&gt;
    inc hl  ; -767&lt;br /&gt;
;Note that this works in many other cases&lt;br /&gt;
; -&amp;gt; save 3 T-states and keep de free&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	srl a&lt;br /&gt;
	srl a&lt;br /&gt;
	srl a&lt;br /&gt;
;try this&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	and %00011111&lt;br /&gt;
; -&amp;gt; save 1 byte and 5 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	neg&lt;br /&gt;
	add a,N   ;you want to calculate N-A&lt;br /&gt;
;Do it this way:&lt;br /&gt;
	cpl&lt;br /&gt;
	add a,N+1    ;neg is practically equivalent to cpl \ inc a&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Never use:&lt;br /&gt;
    ld A,B&lt;br /&gt;
    neg&lt;br /&gt;
;Instead use:&lt;br /&gt;
    xor A&lt;br /&gt;
    sub B&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Never use:&lt;br /&gt;
    ld A,D&lt;br /&gt;
    sub $D3&lt;br /&gt;
    neg&lt;br /&gt;
;Instead use:&lt;br /&gt;
    ld A,$D3&lt;br /&gt;
    sub D&lt;br /&gt;
; -&amp;gt; save 2 bytes and 8 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  sla l&lt;br /&gt;
  rl h         ; I've actually seen this!&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  add hl,hl&lt;br /&gt;
; -&amp;gt; save 3 bytes and 5 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Conditionals ====&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  and 1&lt;br /&gt;
  cp 1&lt;br /&gt;
  jr z,foo&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  and 1         ;and sets zero flag, no need for cp&lt;br /&gt;
  jr nz,foo&lt;br /&gt;
; -&amp;gt; save 2 bytes and 7 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  and 1&lt;br /&gt;
  cp 1         ;a not needed after this&lt;br /&gt;
  jr z,foo&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  rra&lt;br /&gt;
  jr c,foo&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  bit 0,a&lt;br /&gt;
  call z,foo&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  rra&lt;br /&gt;
  call nc,foo&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  bit 7,a&lt;br /&gt;
  jr z,foo&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  rla&lt;br /&gt;
  jr nc,foo&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  bit 2,a&lt;br /&gt;
  ret nz&lt;br /&gt;
  xor a&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  and %100&lt;br /&gt;
  ret nz&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  cp 9        ;if a&amp;lt;=9 then goto label&lt;br /&gt;
  jp c,label&lt;br /&gt;
  jp z,label&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  cp 9+1      ;if a&amp;lt;10 then goto label&lt;br /&gt;
  jp c,label&lt;br /&gt;
; -&amp;gt; save 3 bytes and 10 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
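&lt;br /&gt;
The same adjustment of the constant covers every unsigned comparison of a against a constant with a single conditional jump. A short summary sketch, where N and label are placeholders and the N+1 forms assume N is at most 254:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  cp N        ;if a&amp;lt;N then goto label&lt;br /&gt;
  jp c,label&lt;br /&gt;
&lt;br /&gt;
  cp N        ;if a&amp;gt;=N then goto label&lt;br /&gt;
  jp nc,label&lt;br /&gt;
&lt;br /&gt;
  cp N+1      ;if a&amp;lt;=N then goto label&lt;br /&gt;
  jp c,label&lt;br /&gt;
&lt;br /&gt;
  cp N+1      ;if a&amp;gt;N then goto label&lt;br /&gt;
  jp nc,label&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;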
&lt;br /&gt;
==== Code Flow ====&lt;br /&gt;
&lt;br /&gt;
Almost never call and return...&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
 call xxxx&lt;br /&gt;
 ret&lt;br /&gt;
;try this&lt;br /&gt;
 jp xxxx&lt;br /&gt;
;only do this if the called routine does not use the return address pushed by the call (for example, some kind of inline vputs that reads its data from there).&lt;br /&gt;
; -&amp;gt; save 1 byte and 17 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Never use:&lt;br /&gt;
    dec B&lt;br /&gt;
    jr NZ,loop    ;I have seen this...&lt;br /&gt;
;Use:&lt;br /&gt;
    djnz loop&lt;br /&gt;
; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; Instead of&lt;br /&gt;
 ld a,(Number)&lt;br /&gt;
 cp 0&lt;br /&gt;
 jp z,A_is_0&lt;br /&gt;
 cp 1&lt;br /&gt;
 jp z,A_is_1&lt;br /&gt;
 cp 2&lt;br /&gt;
 jp z,A_is_2&lt;br /&gt;
 cp 3&lt;br /&gt;
 jp z,A_is_3&lt;br /&gt;
 cp 4&lt;br /&gt;
 jp z,A_is_4&lt;br /&gt;
 cp 5&lt;br /&gt;
 jp z,A_is_5&lt;br /&gt;
&lt;br /&gt;
; This is a little better&lt;br /&gt;
 ld a,(Number)&lt;br /&gt;
 or a&lt;br /&gt;
 jp z,A_is_0&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_1&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_2&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_3&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_4&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_5&lt;br /&gt;
&lt;br /&gt;
; Even better&lt;br /&gt;
 ld a,(Number)&lt;br /&gt;
 add a,a   ; a*2 (Number must be below 128)&lt;br /&gt;
 ld h,0 &lt;br /&gt;
 ld l,a &lt;br /&gt;
 ld de,VectorTable&lt;br /&gt;
 add hl,de&lt;br /&gt;
 ld a,(hl)&lt;br /&gt;
 inc hl&lt;br /&gt;
 ld h,(hl)&lt;br /&gt;
 ld l,a&lt;br /&gt;
 jp (hl)&lt;br /&gt;
&lt;br /&gt;
VectorTable:&lt;br /&gt;
 .dw A_is_0&lt;br /&gt;
 .dw A_is_1&lt;br /&gt;
 .dw A_is_2&lt;br /&gt;
 .dw A_is_3&lt;br /&gt;
 .dw A_is_4&lt;br /&gt;
 .dw A_is_5&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
Also see [[Z80 Good Programming Practices]]&lt;br /&gt;
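&lt;br /&gt;
The vector table dispatch jumps through whatever bytes follow the table when Number is out of range. A minimal sketch of a guarded version, assuming the table holds 6 entries; OutOfRange is a hypothetical label for the error path:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld a,(Number)&lt;br /&gt;
 cp 6              ; 6 = number of .dw entries in VectorTable&lt;br /&gt;
 jp nc,OutOfRange  ; hypothetical label: Number is 6 or more&lt;br /&gt;
 add a,a           ; a*2, cannot overflow because a is at most 5&lt;br /&gt;
 ld e,a&lt;br /&gt;
 ld d,0&lt;br /&gt;
 ld hl,VectorTable&lt;br /&gt;
 add hl,de         ; hl = address of the table entry&lt;br /&gt;
 ld a,(hl)         ; low byte of the target&lt;br /&gt;
 inc hl&lt;br /&gt;
 ld h,(hl)         ; high byte of the target&lt;br /&gt;
 ld l,a&lt;br /&gt;
 jp (hl)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
The check costs 5 bytes and 17 T-states per dispatch, which is usually cheap compared with a jump into random memory.&lt;br /&gt;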
&lt;br /&gt;
'''Fallthrough looping'''&lt;br /&gt;
&lt;br /&gt;
If you need to repeat a routine several times but can't spare registers for a loop counter or unroll the routine, try structuring the routine so it can call itself several times and fall through at the end. For example:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
foo:&lt;br /&gt;
  ld hl, data&lt;br /&gt;
  call bar      ; Run routine once&lt;br /&gt;
  call bar      ; .. twice&lt;br /&gt;
  call bar      ; .. three times&lt;br /&gt;
bar:&lt;br /&gt;
  ld a, (hl)    ; .. fourth and final time&lt;br /&gt;
  inc l&lt;br /&gt;
  and $0F&lt;br /&gt;
  out (c), a&lt;br /&gt;
  ret&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Others ====&lt;br /&gt;
&lt;br /&gt;
'''Toggling values in loops'''&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
loop:&lt;br /&gt;
 ld a,2&lt;br /&gt;
;code1&lt;br /&gt;
 ld a,0&lt;br /&gt;
;code2&lt;br /&gt;
 djnz loop&lt;br /&gt;
&lt;br /&gt;
;try this&lt;br /&gt;
 ld a,2&lt;br /&gt;
loop:&lt;br /&gt;
;code1&lt;br /&gt;
 xor 2     ; the trick: xor flips a between 2 and 0 each time it executes (code1 and code2 must preserve a)&lt;br /&gt;
;code2&lt;br /&gt;
 djnz loop&lt;br /&gt;
; -&amp;gt; save size and time depending on its use&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Table alignment'''&lt;br /&gt;
&lt;br /&gt;
If you align tables to a 256-byte boundary, you can access the contents by placing the index in a register such as l and the high byte of the table address in h. This is faster than loading the full unaligned 16-bit address and adding a 16-bit index to it, and makes accessing tables with a size of 256 bytes or less very convenient:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld h, (sineTable &amp;gt;&amp;gt; 8) &amp;amp; $FF    ; Get MSB of table&lt;br /&gt;
 ld a, (frame_count)             ; Get index&lt;br /&gt;
 ld l, a&lt;br /&gt;
 ld a, (hl)                      ; Look up value&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Instead of:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld hl, sineTable                ; Get address of table&lt;br /&gt;
 xor a&lt;br /&gt;
 ld d, a                         ; Set index high byte to zero&lt;br /&gt;
 ld a, (frame_count)&lt;br /&gt;
 ld e, a                         ; Set index low byte&lt;br /&gt;
 add hl, de                      ; Add offset to base&lt;br /&gt;
 ld a, (hl)                      ; Look up value&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Size vs. Speed ===&lt;br /&gt;
&lt;br /&gt;
Trading size against speed is the classic optimization problem in computer programming, and the Z80 is no exception.&lt;br /&gt;
In ASM, size is most frequently what matters: the code is generally fast enough already, and it is nice to give the user a smaller program that doesn't use up most of the RAM.&lt;br /&gt;
&lt;br /&gt;
==== For the sake of size ====&lt;br /&gt;
&lt;br /&gt;
* Use relative jumps (jr label) whenever possible. When the target is out of reach of a relative jump (outside -128 to +127 bytes) and there is a jp to the same target nearby, do a relative jump to that absolute jump. Example:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;lots of code (more than 128 bytes worth of code)&lt;br /&gt;
somelabel2:&lt;br /&gt;
 jp somelabel&lt;br /&gt;
;less than 128 bytes&lt;br /&gt;
 jr somelabel2   ;instead of an absolute jump directly to somelabel, jump to a jump to somelabel.&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Relative jumps are 2 bytes and absolute jumps 3. In terms of speed, jp always takes 10 T-states, while jr takes 12 T-states when the jump occurs and 7 when it doesn't. So jp is faster when the jump occurs and jr is faster when it doesn't.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
 dec bc&lt;br /&gt;
 ld a,b&lt;br /&gt;
 or c&lt;br /&gt;
 ret z&lt;br /&gt;
;try this&lt;br /&gt;
 cpi              ;increments HL&lt;br /&gt;
 ret po&lt;br /&gt;
; save 1 byte at the cost of 2 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Passing inline data'''&lt;br /&gt;
&lt;br /&gt;
When you call a routine, the address of the instruction after the call (pc + 3) is pushed on the stack. You can pop it and use it as a pointer to data placed right after the call. A very nifty use is with strings. To return, move the pointer past the data and jp (hl).&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of:&lt;br /&gt;
 ld hl,string&lt;br /&gt;
 bcall(_vputs)&lt;br /&gt;
 ret&lt;br /&gt;
;Try this:&lt;br /&gt;
  call Disp&lt;br /&gt;
  .db &amp;quot;This is some text&amp;quot;,0&lt;br /&gt;
  ret&lt;br /&gt;
;Not a speed optimization, but it eliminates 2-byte pointers, since it just uses the call's return address.&lt;br /&gt;
;It also heavily disturbs disassembly.&lt;br /&gt;
Disp:&lt;br /&gt;
  pop hl&lt;br /&gt;
  bcall(_vputs)&lt;br /&gt;
  jp (hl)&lt;br /&gt;
; -&amp;gt; save 3 bytes for each use (the ld hl,string is gone), at the cost of 5 bytes of overhead (Disp routine)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This routine can be expanded to pass the coordinates where the text should appear.&lt;br /&gt;
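&lt;br /&gt;
A possible sketch of that expansion, assuming the usual TI-83 Plus layout in which penCol and penRow are adjacent bytes (penCol first), and relying on _vputs leaving hl after the terminator as the routine above does. DispAt is a hypothetical name:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Usage:&lt;br /&gt;
  call DispAt&lt;br /&gt;
  .db 10,20                ;pen column 10, pen row 20&lt;br /&gt;
  .db &amp;quot;Some text&amp;quot;,0&lt;br /&gt;
  ;execution continues here&lt;br /&gt;
&lt;br /&gt;
DispAt:&lt;br /&gt;
  pop hl                   ;hl = return address = pointer to the inline data&lt;br /&gt;
  ld e,(hl)                ;e = pen column&lt;br /&gt;
  inc hl&lt;br /&gt;
  ld d,(hl)                ;d = pen row&lt;br /&gt;
  inc hl                   ;hl now points to the string&lt;br /&gt;
  ld (penCol),de           ;penCol = e, penRow = d (assumes penRow = penCol+1)&lt;br /&gt;
  bcall(_vputs)            ;draw the string, hl ends up after the terminator&lt;br /&gt;
  jp (hl)                  ;resume after the inline data&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
Note that this version destroys de in addition to hl.&lt;br /&gt;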
&lt;br /&gt;
'''Wasting time to delay'''&lt;br /&gt;
&lt;br /&gt;
There are those funny times when you need some delay between operations like reads/writes to ports '''''and there is nothing useful to do'''''. Because nops are not very size friendly, think of other slower but smaller instructions. Example:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
 ld a,KEY_GROUP&lt;br /&gt;
 out (1),a&lt;br /&gt;
 nop&lt;br /&gt;
 nop&lt;br /&gt;
 in a,(1)&lt;br /&gt;
;Try this:&lt;br /&gt;
 ld a,KEY_GROUP&lt;br /&gt;
 out (1),a&lt;br /&gt;
 ld a,(de)    ;a doesn't need to be preserved because it will hold what the port has.&lt;br /&gt;
 in a,(1)&lt;br /&gt;
; -&amp;gt; save 1 byte and 1 T-state (well 1 T-state less is almost the same time)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When you need to delay and cannot afford to alter registers or flags, there are still ways to get more delay per byte than with nops:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; 2 bytes, 8 T-states&lt;br /&gt;
 nop&lt;br /&gt;
 nop&lt;br /&gt;
&lt;br /&gt;
; 2 bytes, 12 T-states&lt;br /&gt;
 inc hl&lt;br /&gt;
 dec hl&lt;br /&gt;
&lt;br /&gt;
; 2 bytes, 12 T-states&lt;br /&gt;
 jr $+2&lt;br /&gt;
&lt;br /&gt;
; 2 bytes, 21 T-states&lt;br /&gt;
 push af&lt;br /&gt;
 pop af&lt;br /&gt;
&lt;br /&gt;
; 2 bytes, 38 T-states&lt;br /&gt;
 ex (sp), hl&lt;br /&gt;
 ex (sp), hl&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you need a small adjustable delay:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;4 bytes, 13*b+2 T-states including the ld b (b=0 gives 256 passes)&lt;br /&gt;
	ld b,255	; initial delay&lt;br /&gt;
	djnz $		; do it&lt;br /&gt;
;b=0 on exit&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* There are many other instructions that you can use&lt;br /&gt;
* Beware that not all instructions preserve registers or flags&lt;br /&gt;
* For delays between frames of games or other longer delays, you can use the 'halt' instruction if interrupts are enabled. It makes the calculator enter low power mode until an interrupt is triggered. To fine-tune this delay mechanism you can alter the interrupt mask and the interrupt timer speed beforehand (and possibly restore their values afterwards).&lt;br /&gt;
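&lt;br /&gt;
A sketch of such a halt-based delay, assuming interrupts can safely be enabled at this point and that the interrupt handler returns normally; the count of 10 and the label name are arbitrary:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
	ei            ; halt never wakes up if interrupts are disabled&lt;br /&gt;
	ld b,10       ; number of interrupts to wait for&lt;br /&gt;
waitloop:&lt;br /&gt;
	halt          ; sleep until the next interrupt&lt;br /&gt;
	djnz waitloop&lt;br /&gt;
;b=0 on exit&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
The real duration depends on the interrupt rate, so it has to be tuned on the target hardware.&lt;br /&gt;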
&lt;br /&gt;
==== Unrolling code ====&lt;br /&gt;
&lt;br /&gt;
'''General Unrolling'''&lt;br /&gt;
You can repeat the body of a loop several times instead of looping; this is frequently done in multiplication routines.&lt;br /&gt;
This trades memory for speed, and most of the time you will prefer size to speed.&lt;br /&gt;
&lt;br /&gt;
'''Unroll commands'''&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; &amp;quot;Classic&amp;quot; way : ~21 T-states per byte copied&lt;br /&gt;
 ld hl,src&lt;br /&gt;
 ld de,dest&lt;br /&gt;
 ld bc,size&lt;br /&gt;
 ldir&lt;br /&gt;
&lt;br /&gt;
; Unrolled n times: (16 * n + 10) / n T-states per byte copied -&amp;gt; ~17 T-states per byte when unrolling 8 times&lt;br /&gt;
 ld hl,src&lt;br /&gt;
 ld de,dest&lt;br /&gt;
 ld bc,size  ; if the size is not a multiple of the number of unrolled ldi then a small trick must be used to jump appropriately inside the loop for the first iteration&lt;br /&gt;
loopldi:    ;you can use this entry for a call&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 jp pe, loopldi    ; jp used as it is faster and in the case of a loop unrolling we assume speed matters more than size&lt;br /&gt;
; ret if this is a subroutine and use the unrolled ldi's with a call.&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
The same unrolling works with ldd, and with outi if the loop is closed with jp nz (outi counts with b and sets the zero flag when b reaches 0).&lt;br /&gt;
&lt;br /&gt;
==== Looping with 16 bit counter ====&lt;br /&gt;
There are two ways to make loops with a 16-bit counter:&lt;br /&gt;
* the naive one, which results in smaller code but increased loop overhead (24 * n T-states) and destroys a&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  ld bc, ...&lt;br /&gt;
loop:&lt;br /&gt;
  ; loop body here&lt;br /&gt;
 &lt;br /&gt;
  dec bc&lt;br /&gt;
  ld  a, b&lt;br /&gt;
  or  c&lt;br /&gt;
  jp  nz,loop&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
* the slightly trickier one, which takes a few more bytes but has a much lower overhead (about 13 * n + 9 * (n / 256) T-states)&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  dec  de&lt;br /&gt;
  ld  b, e&lt;br /&gt;
  inc  b&lt;br /&gt;
  inc  d&lt;br /&gt;
loop2:&lt;br /&gt;
  ; loop body here&lt;br /&gt;
  &lt;br /&gt;
  djnz loop2&lt;br /&gt;
  dec  d&lt;br /&gt;
  jp  nz,loop2&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
The rationale behind the second method is to reduce the overhead of the &amp;quot;inner&amp;quot; loop as much as possible and to use the fact that when b gets down to zero it will be treated as 256 by djnz. &lt;br /&gt;
&lt;br /&gt;
You can therefore use the following macros to compute the 8-bit loop counters from a 16-bit counter if you want to do the conversion at compile time:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  #define inner_counter8(counter16) (((counter16) - 1) &amp;amp; 0xff) + 1&lt;br /&gt;
  #define outer_counter8(counter16) (((counter16) - 1) &amp;gt;&amp;gt; 8) + 1&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
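&lt;br /&gt;
As a worked example (the count of 1000 is only an illustration): 1000 - 1 = 999 = 3 * 256 + 231, so the inner counter is 231 + 1 = 232 and the outer counter is 3 + 1 = 4. The first inner pass runs 232 times and the three remaining passes run 256 times each, which gives 232 + 3 * 256 = 1000 iterations:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  ld  b, inner_counter8(1000)   ; 232&lt;br /&gt;
  ld  d, outer_counter8(1000)   ; 4&lt;br /&gt;
loop2:&lt;br /&gt;
  ; loop body here&lt;br /&gt;
&lt;br /&gt;
  djnz loop2&lt;br /&gt;
  dec  d&lt;br /&gt;
  jp  nz,loop2&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;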
&lt;br /&gt;
=== Preserve Registers ===&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; description: both routines compare b to 0, same size and speed but the second preserves accumulator&lt;br /&gt;
; remarks: - inc/dec doesn't affect carry flag&lt;br /&gt;
;          - inc/dec doesn't affect any flags on 16-bit registers, so do not extrapolate to 16-bit registers.&lt;br /&gt;
	ld a,b&lt;br /&gt;
	or b&lt;br /&gt;
	jr z,label&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
	inc b&lt;br /&gt;
	dec b&lt;br /&gt;
	jr z,label&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; description: add a to hl without using another register pair (a is destroyed)&lt;br /&gt;
;normal way:&lt;br /&gt;
	ld d,$00&lt;br /&gt;
	ld e,a&lt;br /&gt;
	add hl,de&lt;br /&gt;
;4 bytes and 22 clock cycles&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
	add a,l&lt;br /&gt;
	ld l,a&lt;br /&gt;
	jr nc, $+3&lt;br /&gt;
	inc h&lt;br /&gt;
;5 bytes, 19/20 clock cycles&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
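&lt;br /&gt;
A branch-free variant is also possible. This is a sketch of a commonly used sequence, not one of the routines above; it always takes the same time and also destroys a:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
	add a,l    ; a = a + l, carry set if the low byte overflowed&lt;br /&gt;
	ld l,a     ; new low byte&lt;br /&gt;
	adc a,h    ; a = l + h + carry&lt;br /&gt;
	sub l      ; a = h + carry&lt;br /&gt;
	ld h,a     ; new high byte&lt;br /&gt;
;5 bytes, 20 clock cycles&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
The constant timing can matter in cycle-counted code, and no jump offset has to be maintained by hand.&lt;br /&gt;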
&lt;br /&gt;
== Setting flags ==&lt;br /&gt;
On some occasions you might want to selectively set/reset a flag.&lt;br /&gt;
&lt;br /&gt;
Here are the most common uses :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; set Carry flag&lt;br /&gt;
 scf&lt;br /&gt;
&lt;br /&gt;
; reset Carry flag (alters Sign and Zero flags as defined)&lt;br /&gt;
 or a&lt;br /&gt;
&lt;br /&gt;
; alternate reset Carry flag (alters Sign and Zero flags as defined)&lt;br /&gt;
 and a&lt;br /&gt;
&lt;br /&gt;
; set Zero flag (resets Carry flag, alters Sign flag as defined)&lt;br /&gt;
 cp a&lt;br /&gt;
&lt;br /&gt;
; reset Zero flag (alters a, reset Carry flag, alters Sign flag as defined)&lt;br /&gt;
 or 1&lt;br /&gt;
&lt;br /&gt;
; set Sign flag (negative) (alters a, reset Zero and Carry flags)&lt;br /&gt;
 or $80&lt;br /&gt;
&lt;br /&gt;
; reset Sign flag (positive) (set a to zero, set Zero flag, reset Carry flag)&lt;br /&gt;
 xor a&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Other possible uses (much rarer) :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Set parity/overflow (even):&lt;br /&gt;
 xor a&lt;br /&gt;
&lt;br /&gt;
;Reset parity/overflow (odd):&lt;br /&gt;
 sub a&lt;br /&gt;
&lt;br /&gt;
;Set half carry (hardly ever useful but still...)&lt;br /&gt;
 and a&lt;br /&gt;
&lt;br /&gt;
;Reset half carry (hardly ever useful but still...)&lt;br /&gt;
 or a&lt;br /&gt;
&lt;br /&gt;
;Set bit 5 of f:&lt;br /&gt;
 or %00100000&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
As you can see, these are extremely simple, small and fast ways to alter flags,&lt;br /&gt;
which makes them interesting as outputs of routines to indicate error/success or&lt;br /&gt;
other status bits that do not require a full register.&lt;br /&gt;
&lt;br /&gt;
If you use them, remember that these flag (re)setting tricks frequently&lt;br /&gt;
overlap, so a special combination of flags might require slightly&lt;br /&gt;
more elaborate tricks. As a rule of thumb, always alter the carry last in&lt;br /&gt;
such cases because the scf and ccf instructions leave the Sign, Zero and Parity/Overflow flags untouched.&lt;br /&gt;
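&lt;br /&gt;
For instance, a routine can report failure through the carry flag and keep every register free for real results. A minimal sketch, where IsDigit and NotADigit are hypothetical names:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;IsDigit: carry reset if a holds an ASCII digit '0'..'9', carry set otherwise&lt;br /&gt;
;a is preserved&lt;br /&gt;
IsDigit:&lt;br /&gt;
  cp '0'&lt;br /&gt;
  ret c          ;a is below '0': carry is already set&lt;br /&gt;
  cp '9'+1       ;carry set if a is '9' or below&lt;br /&gt;
  ccf            ;invert: carry reset for a digit, set above '9'&lt;br /&gt;
  ret&lt;br /&gt;
&lt;br /&gt;
;caller&lt;br /&gt;
  call IsDigit&lt;br /&gt;
  jr c,NotADigit ;hypothetical label for the error path&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;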
&lt;br /&gt;
More advanced ways of manipulating flags follow:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;get the zero flag in carry &lt;br /&gt;
	scf&lt;br /&gt;
	jr z,$+3&lt;br /&gt;
	ccf&lt;br /&gt;
&lt;br /&gt;
;Put carry flag into zero flag (destroys a: a = 0 if carry was set, $FF otherwise)&lt;br /&gt;
	ccf&lt;br /&gt;
	sbc a, a&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Tools of the job ==&lt;br /&gt;
&lt;br /&gt;
Want to test your optimizations or try new ones? Then check these:&lt;br /&gt;
* Keep a Z80 instruction set reference at hand so you don't forget a useful instruction or which flags it affects. (see [[Z80_Instruction_Set|Z80_Instruction_Set]])&lt;br /&gt;
* Use an assembler that has a &amp;quot;.echo&amp;quot; directive and use it in the source to count size: (see [[Assemblers|Assemblers]])&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;SomeCodeorData:&lt;br /&gt;
;code or data goes here&lt;br /&gt;
End:&lt;br /&gt;
 .echo &amp;quot;size of the code/data:&amp;quot;&lt;br /&gt;
 .echo End-SomeCodeorData&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
* Get a nice IDE of z80 that counts code ([[IDEs|IDE's]])&lt;br /&gt;
* Make use of the counting capabilities of an emulator ([[:Category:Emulators|Emulators]]) (see wabbitemu)&lt;br /&gt;
&lt;br /&gt;
== Very specific optimizations (hardly practical) ==&lt;br /&gt;
&lt;br /&gt;
=== Table alignment ===&lt;br /&gt;
If a table sits at a 256-byte aligned address such as $8000 (theoretical example) and is at most 256 bytes long ($8000 to $80FF), you can step to the next byte with inc l instead of inc hl (4 T-states instead of 6).&lt;br /&gt;
&lt;br /&gt;
== Crazy, &amp;quot;magick&amp;quot;, hacks and obscure optimization's tricks ==&lt;br /&gt;
&lt;br /&gt;
These are not normally recommended because some of them disturb disassembly and make the code harder to understand, even for other coders.&lt;br /&gt;
&lt;br /&gt;
=== Better else ===&lt;br /&gt;
So you normally have an if-else-endif block like this:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
jr nz,else    ;the IF&lt;br /&gt;
;some code&lt;br /&gt;
jr endif&lt;br /&gt;
else:&lt;br /&gt;
;some code&lt;br /&gt;
endif:&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
But here's a crazy trick for when the else code is a single 2-byte instruction:&lt;br /&gt;
Instead of the &amp;quot;jr endif&amp;quot; line, you use the first byte of a 3-byte instruction with no side effects!&lt;br /&gt;
So if you had code like this:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
cp 7&lt;br /&gt;
jr nz,else&lt;br /&gt;
ld a,3&lt;br /&gt;
jr endif&lt;br /&gt;
else:&lt;br /&gt;
ld a,4&lt;br /&gt;
endif:&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You could replace it with this:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
cp 7&lt;br /&gt;
jr nz,else&lt;br /&gt;
ld a,3&lt;br /&gt;
.db $C2  ;jp nz,xxxx&lt;br /&gt;
else:&lt;br /&gt;
ld a,4&lt;br /&gt;
endif:&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Instead of branching over the ld a,4 instruction, the code now executes a jp nz,XXXX instruction where XXXX is made of the two bytes of the next instruction. You already know what the flags will be here (zero is set, because jr nz,else was not taken and ld a,3 does not change flags), so the jump is never taken. The net effect is to skip the next two bytes of execution! Who needs to branch over them?&lt;br /&gt;
&lt;br /&gt;
The if path now takes 31 T-states instead of 33, because a jp nz that is not taken costs 10 T-states where jr endif costs 12. A small saving, but it could be useful in tight loops, and it saves 1 byte!&lt;br /&gt;
The only reasons not to use this with 1-byte instructions as well would be code readability and bug safety. Watch those flags!&lt;br /&gt;
&lt;br /&gt;
=== Conditional rst ===&lt;br /&gt;
&lt;br /&gt;
For a smaller conditional rst $38, use jr cc, -1. This will cause a conditional jump to the displacement byte ($FF) which is the rst $38 opcode. &lt;br /&gt;
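&lt;br /&gt;
Written out for the zero condition (any condition supported by jr works the same way). The byte values are the standard opcodes; the layout and the skip label are only an illustration:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
  jr nz,skip&lt;br /&gt;
  rst $38&lt;br /&gt;
skip:&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  jr z,$+1     ;assembles to $28,$FF: when taken, execution lands on the $FF byte = rst $38&lt;br /&gt;
;when the jump is not taken, execution continues after the two bytes&lt;br /&gt;
; -&amp;gt; save 1 byte&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;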
&lt;br /&gt;
=== DAA trick ===&lt;br /&gt;
&lt;br /&gt;
Normally the DAA instruction is used for BCD math, but it can also convert a value from 0 to 15 in a into its ASCII hexadecimal digit (0 to 9, A to F):&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
	cp 10&lt;br /&gt;
	ccf&lt;br /&gt;
	adc a, 30h&lt;br /&gt;
	daa&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
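&lt;br /&gt;
Wrapped as a routine, the sequence can be used as follows. This is only a sketch: NibbleToHex is a hypothetical name and the caller is expected to keep a in the range 0 to 15:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Input:  a = 0..15&lt;br /&gt;
;Output: a = ASCII '0'..'9' or 'A'..'F'&lt;br /&gt;
NibbleToHex:&lt;br /&gt;
	cp 10        ;carry set when a is 0..9&lt;br /&gt;
	ccf          ;carry set when a is 10..15&lt;br /&gt;
	adc a,30h    ;0..9 become 30h..39h, 10..15 become 3Bh..40h&lt;br /&gt;
	daa          ;3Bh..40h become 41h..46h, 30h..39h stay unchanged&lt;br /&gt;
	ret&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
For example a = 12 gives 0Ch + 30h + 1 = 3Dh, which daa adjusts to 43h, the letter C.&lt;br /&gt;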
&lt;br /&gt;
== Related topics ==&lt;br /&gt;
* [http://www.junemann.nl/maxcoderz/viewtopic.php?f=5&amp;amp;t=675 MaxCoderz TI-ASM optimization]&lt;br /&gt;
* ticalc archives: [http://www.ticalc.org/archives/files/fileinfo/108/10821.html 1] [http://www.ticalc.org/archives/files/fileinfo/285/28502.html 2]&lt;br /&gt;
* [http://www.ballyalley.com/ml/z80_docs/z80_docs.html Bally Alley Z80 Machine Language Documentation]&lt;br /&gt;
* [http://map.grauw.nl/articles/fast_loops.php Fast loops in MSX Assembly Page]&lt;br /&gt;
* [http://shiar.nl/calc/z80/optimize Shiar z80 optimization page]&lt;br /&gt;
* [http://www.smspower.org/Development/Z80ProgrammingTechniques SMS Power! dev wiki z80 Techniques]&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
* fullmetalcoder&lt;br /&gt;
* Galandros&lt;br /&gt;
* Dwedit for sharing in MaxCoderz the &amp;quot;Better else&amp;quot;&lt;br /&gt;
* MaxCoderz participants in assembly optimizing topic (Jim e,CoBB,...)&lt;br /&gt;
* SMS Power wiki&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Assemblers</id>
		<title>Assemblers</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Assemblers"/>
				<updated>2010-11-13T09:34:24Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: completed spasm&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:PC Software|Assemblers]]&lt;br /&gt;
{{stub}}&lt;br /&gt;
{{wikify}}&lt;br /&gt;
&lt;br /&gt;
== Z80 Assemblers ==&lt;br /&gt;
&lt;br /&gt;
=== tasm ===&lt;br /&gt;
&lt;br /&gt;
* need ''' linker to TI-files'''&lt;br /&gt;
* command-line usage syntax&lt;br /&gt;
* features&lt;br /&gt;
* documents (included readme)&lt;br /&gt;
* warn it is shareware&lt;br /&gt;
&lt;br /&gt;
=== spasm ===&lt;br /&gt;
[http://wabbit.codeplex.com/releases/view/45088 SPASM2 stable build]&lt;br /&gt;
* command-line usage syntax&lt;br /&gt;
&lt;br /&gt;
Features:&lt;br /&gt;
* blazing fast assembling&lt;br /&gt;
* linking to all TI calculators&lt;br /&gt;
* powerful macros&lt;br /&gt;
* import bitmap pictures (.bmp)&lt;br /&gt;
* TASM compatibility (TASM to spasm) (partial?)&lt;br /&gt;
&lt;br /&gt;
=== Brass ===&lt;br /&gt;
&lt;br /&gt;
Features:&lt;br /&gt;
* TASM compatibility (partial?)&lt;br /&gt;
* macros&lt;br /&gt;
&lt;br /&gt;
=== tpasm ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== phasm ===&lt;br /&gt;
&lt;br /&gt;
== 68k Assemblers ==&lt;br /&gt;
* none&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=TI_websites</id>
		<title>TI websites</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=TI_websites"/>
				<updated>2010-11-13T09:30:08Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: various stuff added or reordered&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Official =&lt;br /&gt;
&lt;br /&gt;
== General ==&lt;br /&gt;
* [http://www.ti.com/ Texas Instruments] (TI)&lt;br /&gt;
* [http://education.ti.com/ Official TI calculator site]&lt;br /&gt;
* [http://www.zilog.com/ z80 CPU Site]&lt;br /&gt;
&lt;br /&gt;
= Community =&lt;br /&gt;
&lt;br /&gt;
== Archives ==&lt;br /&gt;
* [http://www.ticalc.org/ ticalc.org]&lt;br /&gt;
* [http://calcg.org/ calcG.org] [http://www.calcgames.org/ CalcGames]&lt;br /&gt;
&lt;br /&gt;
== News ==&lt;br /&gt;
* [http://www.ticalc.org/ ticalc.org]&lt;br /&gt;
&lt;br /&gt;
== Forums ==&lt;br /&gt;
* [http://www.unitedti.org/forum United-TI Forum] (UTI Forum)&lt;br /&gt;
* [http://www.maxcoderz.com/ MaxCoderz Forum] (MC Forum)&lt;br /&gt;
* [http://www.revsoft.org/phpBB2/ Revsoft Forum] (RS Forum)&lt;br /&gt;
* [http://www.cemetech.net/forum/ Cemetech Forum]&lt;br /&gt;
* [http://www.omnimaga.org Omnimaga Forum]&lt;br /&gt;
* [http://www.detachedsolutions.com/forum Detached Solutions Forum] (DS Forum)&lt;br /&gt;
* [http://tifreakware.ath.cx/ TI-Freakware Forum]&lt;br /&gt;
* [http://otbp.tifreakware.net/phpBB2/ Outside the Box Programming Forum] (OTBP Forum)&lt;br /&gt;
&lt;br /&gt;
== Wikis ==&lt;br /&gt;
* [http://tibasicdev.wikidot.com/ TI-Basic Developer] (everything about TI-BASIC on z80, 68k and Nspire)&lt;br /&gt;
* [http://wikiti.brandonw.net/ WikiTI] (this wiki)&lt;br /&gt;
* [http://hackspire.unsads.com/ Hackspire]&lt;br /&gt;
* [http://z80-heaven.wikidot.com/ z80 Heaven]&lt;br /&gt;
&lt;br /&gt;
== Guides and Tutorials ==&lt;br /&gt;
* [http://www.technicalc.org/ technicalc.org]&lt;br /&gt;
* [http://guide.ticalc.org/ The Guide]&lt;br /&gt;
* [http://karma.ticalc.org/ TI-82 ASM Corner]&lt;br /&gt;
* [http://users.hszk.bme.hu/%7Epg429/z80guide/index.htm]&lt;br /&gt;
* [http://www.ticalc.org/archives/files/fileinfo/268/26877.html Learn TI-83 Plus Assembly In 28 Days]&lt;br /&gt;
* [http://www.ticalc.org/archives/files/fileinfo/429/42937.html Hot Dog's TI-83 Plus Z80 ASM Lessons]&lt;br /&gt;
&lt;br /&gt;
== Resources ==&lt;br /&gt;
* [http://www.ballyalley.com/ml/z80_docs/z80_docs.html Z80 Docs]&lt;br /&gt;
* [http://baze.au.com/misc/z80bits.html Z80 Bits]&lt;br /&gt;
* [http://www.z80.info/ Z80 INFO]&lt;br /&gt;
&lt;br /&gt;
== Active Teams ==&lt;br /&gt;
* [http://www.omnimaga.org/ Coders of Tomorrow]&lt;br /&gt;
* [http://www.revsoft.org/ Revolution Software]&lt;br /&gt;
* [http://www.junemann.nl/maxcoderz/ MaxCoderz]&lt;br /&gt;
* [http://www.detachedsolutions.com/ Detached Solutions]&lt;br /&gt;
&lt;br /&gt;
== Inactive/Ended Teams ==&lt;br /&gt;
* [http://tift.tuxfamily.org/ TIFT]&lt;br /&gt;
* [http://void.ticalc.org/ Void]&lt;br /&gt;
* [http://sicode.ticalc.org/ SiCoDe]&lt;br /&gt;
* [http://www.radicalsoft.org/ Radical Software]&lt;br /&gt;
&lt;br /&gt;
== Miscellaneous Community Websites ==&lt;br /&gt;
* [http://tifreakware.net/ TI-Freakware]&lt;br /&gt;
&lt;br /&gt;
== IRC ==&lt;br /&gt;
* [http://www.omnimaga.org/irc #omnimaga] (efnet)&lt;br /&gt;
* [http://www.omnimaga.org/irc #unitedti]&lt;br /&gt;
* [http://chat.efnet.org/ #tcpa] (efnet)&lt;br /&gt;
* [http://chat.efnet.org/ #ti] (efnet)&lt;br /&gt;
* [http://tcpa.calcg.org/ TCPA site] (efnet)&lt;br /&gt;
&lt;br /&gt;
== Programmers site (experiments, projects...) ==&lt;br /&gt;
* [http://brandonw.net/ Brandon Wilson site]&lt;br /&gt;
* [http://benryves.com/ Ben Ryves blog]&lt;br /&gt;
* [http://www.michaelv.org/programs/calcs/ Michael Vincent site]&lt;br /&gt;
* [http://www.geocities.com/jimm09876/calc/ James Montelongo experiments]&lt;br /&gt;
* [http://sami.ticalc.org/ Sami TI Calculators page]&lt;br /&gt;
* [http://www.dwedit.org/ Dwedit's Website]&lt;br /&gt;
&lt;br /&gt;
* [http://weregoose.unitedti.org/ Weregoose TI-BASIC archive]&lt;br /&gt;
* [http://antipi.omnimaga.org/ {AP} Site]&lt;br /&gt;
* [http://www.kalanrock.us/ kalan_vod Site]&lt;br /&gt;
&lt;br /&gt;
== French Sites ==&lt;br /&gt;
* [http://ti.bank.free.fr/ ti.bank.free.fr]&lt;br /&gt;
* [http://www.ti-fr.com/ ti-fr.com]&lt;br /&gt;
* [http://ti83.free.fr/ ti83.free.fr]&lt;br /&gt;
* [http://tibank.forumactif.com/forum.htm tibank.forumactif.com]&lt;br /&gt;
* [http://membres.lycos.fr/virtuaart/ticalculette/tiprogs.htm lycos.fr]&lt;br /&gt;
* [http://tift.tuxfamily.org/ tift]&lt;br /&gt;
* [http://ti-wiki.pbwiki.com/ ti-wiki.pbwiki]&lt;br /&gt;
* [http://www.yaronet.com/ yaronet]&lt;br /&gt;
&lt;br /&gt;
== German Sites ==&lt;br /&gt;
* [http://mobifiles.bytefox.de/ MobiFiles]&lt;br /&gt;
&lt;br /&gt;
== Dutch Sites ==&lt;br /&gt;
* [http://www.ti-wereld.nl TI-Wereld]&lt;br /&gt;
* [http://www.scholieren.be Scholieren.be] (mostly 'inhabited' by school students, not by programmers)&lt;br /&gt;
&amp;lt;!-- contributed by Mapar007 --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Projects ==&lt;br /&gt;
&lt;br /&gt;
===3rd party OS===&lt;br /&gt;
* [http://sourceforge.net/projects/lifos/ LIFOS]&lt;br /&gt;
* [http://lifos.sourceforge.net/wordpress/ LIFOS blog]&lt;br /&gt;
* [http://vera.timendus.com/ Vera]&lt;br /&gt;
* [http://pongos.sourceforge.net/ PongOS]&lt;br /&gt;
* [http://brandonw.net/ OS2]&lt;br /&gt;
* [http://code.google.com/p/8xpos/ XOS]&lt;br /&gt;
* KnightOS by SirCmpwn&lt;br /&gt;
&lt;br /&gt;
===Others===&lt;br /&gt;
* [http://api.timendus.com/ z80 assembly API]&lt;br /&gt;
* [http://clap.timendus.com/ CLAP project]&lt;br /&gt;
* [http://bell.timendus.com/ BELL project]&lt;br /&gt;
* [http://usb8x.sourceforge.net/ USB8X]&lt;br /&gt;
* [http://msd8x.denglend.net/ MSD8X]&lt;br /&gt;
* [http://usbtools.denlend.net/ USBTools]&lt;br /&gt;
* [http://dcs.cemetech.net/ DoorsCS]&lt;br /&gt;
&lt;br /&gt;
== Online Tools ==&lt;br /&gt;
* [http://www.cemetech.net/projects/basicelite/sourcecoder2.php SourceCoder2] (Online TI-BASIC editor, syntax colouring and optimizer)&lt;br /&gt;
* [http://ti.zewaren.net/ TI.ZEWAREN.NET - BETA] (general calculator variables editor)&lt;br /&gt;
* [http://galandrosdev.2kool4u.net/online_asm_unsquish.php Online Hex Disassembler] (for typing asm programs on calculator)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
= Memorable TI Websites =&lt;br /&gt;
That are no longer online, RIP.&lt;br /&gt;
* TI groups discussion (closed in ...)&lt;br /&gt;
* Cirrus (merged in UTI)&lt;br /&gt;
* Kevtiva (82 TI-BASIC and ASM stuff)&lt;br /&gt;
* Alienhead&lt;br /&gt;
* Assemblers Coders Zenith&lt;br /&gt;
* Badja TI Programs&lt;br /&gt;
&lt;br /&gt;
Note: use the wayback machine to see how they were: [http://www.archive.org/]&lt;br /&gt;
&lt;br /&gt;
= Other TI websites lists =&lt;br /&gt;
* [http://tifreakware.net/admin/link.php?catag=index TI-Freakware list]&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Meta-tutorial</id>
		<title>Meta-tutorial</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Meta-tutorial"/>
				<updated>2010-11-13T09:23:39Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: added Hot Dog's tut&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Assembly ==&lt;br /&gt;
&lt;br /&gt;
=== For beginners ===&lt;br /&gt;
* [http://www.ticalc.org/archives/files/fileinfo/429/42937.html Hot Dog's TI-83 Plus Z80 ASM Lessons]&lt;br /&gt;
* [http://www.ticalc.org/archives/files/fileinfo/268/26877.html Learn TI-83 Plus Assembly In 28 Days]&lt;br /&gt;
* [http://z80-heaven.wikidot.com/ z80 Heaven]&lt;br /&gt;
&lt;br /&gt;
=== For intermediates ===&lt;br /&gt;
* [http://www.ticalc.org/archives/files/fileinfo/112/11269.html IonGuru]&lt;br /&gt;
&lt;br /&gt;
== TI-BASIC ==&lt;br /&gt;
* [http://tibasicdev.wikidot.com/ TI-Basic Developer]&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=TI_websites</id>
		<title>TI websites</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=TI_websites"/>
				<updated>2010-11-13T09:21:13Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: /* Guides and Tutorials */ added links&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Official =&lt;br /&gt;
&lt;br /&gt;
== General ==&lt;br /&gt;
* [http://www.ti.com/ Texas Instruments] (TI)&lt;br /&gt;
* [http://education.ti.com/ Official TI calculator site]&lt;br /&gt;
* [http://www.zilog.com/ z80 CPU Site]&lt;br /&gt;
&lt;br /&gt;
= Community =&lt;br /&gt;
&lt;br /&gt;
== Archives ==&lt;br /&gt;
* [http://www.ticalc.org/ ticalc.org]&lt;br /&gt;
* [http://calcg.org/ calcG.org] [http://www.calcgames.org/ CalcGames]&lt;br /&gt;
&lt;br /&gt;
== News ==&lt;br /&gt;
* [http://www.ticalc.org/ ticalc.org]&lt;br /&gt;
&lt;br /&gt;
== Forums ==&lt;br /&gt;
* [http://www.unitedti.org/forum United-TI Forum] (UTI Forum)&lt;br /&gt;
* [http://www.maxcoderz.com/ MaxCoderz Forum] (MC Forum)&lt;br /&gt;
* [http://www.revsoft.org/phpBB2/ Revsoft Forum] (RS Forum)&lt;br /&gt;
* [http://www.cemetech.net/forum/ Cemetech Forum]&lt;br /&gt;
* [http://www.omnimaga.org Omnimaga Forum]&lt;br /&gt;
* [http://www.detachedsolutions.com/forum Detached Solutions Forum] (DS Forum)&lt;br /&gt;
* [http://tifreakware.ath.cx/ TI-Freakware Forum]&lt;br /&gt;
* [http://otbp.tifreakware.net/phpBB2/ Outside the Box Programming Forum] (OTBP Forum)&lt;br /&gt;
&lt;br /&gt;
== Wikis ==&lt;br /&gt;
* [http://tibasicdev.wikidot.com/ TI-Basic Developer]&lt;br /&gt;
* [http://z80-heaven.wikidot.com/ z80 Heaven]&lt;br /&gt;
* [http://wikiti.brandonw.net/ WikiTI] (this wiki)&lt;br /&gt;
* [http://hackspire.unsads.com/ Hackspire]&lt;br /&gt;
&lt;br /&gt;
== Guides and Tutorials ==&lt;br /&gt;
* [http://www.technicalc.org/ technicalc.org]&lt;br /&gt;
* [http://guide.ticalc.org/ The Guide]&lt;br /&gt;
* [http://karma.ticalc.org/ TI-82 ASM Corner]&lt;br /&gt;
* [http://users.hszk.bme.hu/%7Epg429/z80guide/index.htm]&lt;br /&gt;
* [http://www.ticalc.org/archives/files/fileinfo/268/26877.html Learn TI-83 Plus Assembly In 28 Days]&lt;br /&gt;
* [http://www.ticalc.org/archives/files/fileinfo/429/42937.html Hot Dog's TI-83 Plus Z80 ASM Lessons]&lt;br /&gt;
&lt;br /&gt;
== Resources ==&lt;br /&gt;
* [http://www.ballyalley.com/ml/z80_docs/z80_docs.html Z80 Docs]&lt;br /&gt;
* [http://baze.au.com/misc/z80bits.html Z80 Bits]&lt;br /&gt;
* [http://www.z80.info/ Z80 INFO]&lt;br /&gt;
&lt;br /&gt;
== Active Teams ==&lt;br /&gt;
* [http://www.omnimaga.org/ Coders of Tomorrow]&lt;br /&gt;
* [http://www.revsoft.org/ Revolution Software]&lt;br /&gt;
* [http://www.junemann.nl/maxcoderz/ MaxCoderz]&lt;br /&gt;
* [http://www.detachedsolutions.com/ Detached Solutions]&lt;br /&gt;
&lt;br /&gt;
== Inactive/Ended Teams ==&lt;br /&gt;
* [http://tift.tuxfamily.org/ TIFT]&lt;br /&gt;
* [http://void.ticalc.org/ Void]&lt;br /&gt;
* [http://sicode.ticalc.org/ SiCoDe]&lt;br /&gt;
* [http://www.radicalsoft.org/ Radical Software]&lt;br /&gt;
&lt;br /&gt;
== Miscellaneous Community Websites ==&lt;br /&gt;
* [http://tifreakware.net/ TI-Freakware]&lt;br /&gt;
&lt;br /&gt;
== IRC ==&lt;br /&gt;
* [http://www.omnimaga.org/irc #omnimaga]&lt;br /&gt;
* [http://www.omnimaga.org/irc #unitedti]&lt;br /&gt;
* [http://chat.efnet.org/ #tcpa]&lt;br /&gt;
* [http://chat.efnet.org/ #ti]&lt;br /&gt;
* [http://tcpa.calcg.org/ TCPA site]&lt;br /&gt;
&lt;br /&gt;
== Programmers site (experiments, projects...) ==&lt;br /&gt;
* [http://brandonw.net/ Brandon Wilson site]&lt;br /&gt;
* [http://benryves.com/ Ben Ryves blog]&lt;br /&gt;
* [http://www.michaelv.org/programs/calcs/ Michael Vincent]&lt;br /&gt;
* [http://www.geocities.com/jimm09876/calc/ James Montelongo experiments]&lt;br /&gt;
* [http://sami.ticalc.org/ Sami TI Calculators page]&lt;br /&gt;
* [http://www.dwedit.org/ Dwedit's Website]&lt;br /&gt;
&lt;br /&gt;
* [http://weregoose.unitedti.org/ Weregoose TI-BASIC archive]&lt;br /&gt;
* [http://antipi.omnimaga.org/ {AP} Site]&lt;br /&gt;
* [http://www.kalanrock.us/ kalan_vod Site]&lt;br /&gt;
&lt;br /&gt;
== French Sites ==&lt;br /&gt;
* [http://www.ti-fr.com/ ti-fr.com]&lt;br /&gt;
* [http://ti.bank.free.fr/ ti.bank.free.fr]&lt;br /&gt;
* [http://ti83.free.fr/ ti83.free.fr]&lt;br /&gt;
* [http://tibank.forumactif.com/forum.htm tibank.forumactif.com]&lt;br /&gt;
* [http://membres.lycos.fr/virtuaart/ticalculette/tiprogs.htm lycos.fr]&lt;br /&gt;
* [http://tift.tuxfamily.org/ tift]&lt;br /&gt;
* [http://ti-wiki.pbwiki.com/ ti-wiki.pbwiki]&lt;br /&gt;
* [http://www.yaronet.com/ yaronet]&lt;br /&gt;
&lt;br /&gt;
== German Sites ==&lt;br /&gt;
* [http://mobifiles.bytefox.de/ MobiFiles]&lt;br /&gt;
&lt;br /&gt;
== Dutch Sites ==&lt;br /&gt;
* [http://www.ti-wereld.nl TI-Wereld]&lt;br /&gt;
* [http://www.scholieren.be Scholieren.be] (mostly 'inhabited' by school students, not by programmers)&lt;br /&gt;
&amp;lt;!-- there is at least one, mapar007 knows --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Edit by Mapar: added two of them--&amp;gt;&lt;br /&gt;
== Projects ==&lt;br /&gt;
&lt;br /&gt;
===3rd party OS===&lt;br /&gt;
* [http://sourceforge.net/projects/lifos/ LIFOS]&lt;br /&gt;
* [http://lifos.sourceforge.net/wordpress/ LIFOS blog]&lt;br /&gt;
* [http://vera.timendus.com/ Vera]&lt;br /&gt;
* [http://pongos.sourceforge.net/ PongOS]&lt;br /&gt;
* [http://brandonw.net/ OS2]&lt;br /&gt;
* [http://code.google.com/p/8xpos/ XOS]&lt;br /&gt;
&lt;br /&gt;
===Others===&lt;br /&gt;
* [http://api.timendus.com/ z80 assembly API]&lt;br /&gt;
* [http://clap.timendus.com/ CLAP project]&lt;br /&gt;
* [http://bell.timendus.com/ BELL project]&lt;br /&gt;
* [http://usb8x.sourceforge.net/ USB8X]&lt;br /&gt;
* [http://msd8x.denglend.net/ MSD8X]&lt;br /&gt;
* [http://usbtools.denlend.net/ USBTools]&lt;br /&gt;
* [http://dcs.cemetech.net/ DoorsCS]&lt;br /&gt;
&lt;br /&gt;
== Online Tools ==&lt;br /&gt;
* [http://www.cemetech.net/projects/basicelite/sourcecoder2.php SourceCoder2] (Online TI-BASIC editor, syntax colouring and optimizer)&lt;br /&gt;
* [http://ti.zewaren.net/ TI.ZEWAREN.NET - BETA]&lt;br /&gt;
* [http://galandrosdev.2kool4u.net/online_asm_unsquish.php Online Hex Disassembler]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
= Memorable TI Websites =&lt;br /&gt;
That are no longer online, RIP.&lt;br /&gt;
* TI groups discussion (closed in ...)&lt;br /&gt;
* Cirrus (merged in UTI)&lt;br /&gt;
* Kevtiva (82 TI-BASIC and ASM stuff)&lt;br /&gt;
* Alienhead&lt;br /&gt;
* Assemblers Coders Zenith&lt;br /&gt;
* Badja TI Programs&lt;br /&gt;
&lt;br /&gt;
Note: use the Wayback Machine to see what they looked like: [http://www.archive.org/]&lt;br /&gt;
&lt;br /&gt;
= Other TI websites lists =&lt;br /&gt;
* [http://tifreakware.net/admin/link.php?catag=index TI-Freakware list]&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=TI_websites</id>
		<title>TI websites</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=TI_websites"/>
				<updated>2010-11-13T09:18:31Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: /* Wikis */ added hackspire&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Official =&lt;br /&gt;
&lt;br /&gt;
== General ==&lt;br /&gt;
* [http://www.ti.com/ Texas Instruments] (TI)&lt;br /&gt;
* [http://education.ti.com/ Official TI calculator site]&lt;br /&gt;
* [http://www.zilog.com/ z80 CPU Site]&lt;br /&gt;
&lt;br /&gt;
= Community =&lt;br /&gt;
&lt;br /&gt;
== Archives ==&lt;br /&gt;
* [http://www.ticalc.org/ ticalc.org]&lt;br /&gt;
* [http://calcg.org/ calcG.org] [http://www.calcgames.org/ CalcGames]&lt;br /&gt;
&lt;br /&gt;
== News ==&lt;br /&gt;
* [http://www.ticalc.org/ ticalc.org]&lt;br /&gt;
&lt;br /&gt;
== Forums ==&lt;br /&gt;
* [http://www.unitedti.org/forum United-TI Forum] (UTI Forum)&lt;br /&gt;
* [http://www.maxcoderz.com/ MaxCoderz Forum] (MC Forum)&lt;br /&gt;
* [http://www.revsoft.org/phpBB2/ Revsoft Forum] (RS Forum)&lt;br /&gt;
* [http://www.cemetech.net/forum/ Cemetech Forum]&lt;br /&gt;
* [http://www.omnimaga.org Omnimaga Forum]&lt;br /&gt;
* [http://www.detachedsolutions.com/forum Detached Solutions Forum] (DS Forum)&lt;br /&gt;
* [http://tifreakware.ath.cx/ TI-Freakware Forum]&lt;br /&gt;
* [http://otbp.tifreakware.net/phpBB2/ Outside the Box Programming Forum] (OTBP Forum)&lt;br /&gt;
&lt;br /&gt;
== Wikis ==&lt;br /&gt;
* [http://tibasicdev.wikidot.com/ TI-Basic Developer]&lt;br /&gt;
* [http://z80-heaven.wikidot.com/ z80 Heaven]&lt;br /&gt;
* [http://wikiti.brandonw.net/ WikiTI] (this wiki)&lt;br /&gt;
* [http://hackspire.unsads.com/ Hackspire]&lt;br /&gt;
&lt;br /&gt;
== Guides and Tutorials ==&lt;br /&gt;
* [http://www.technicalc.org/ technicalc.org]&lt;br /&gt;
* [http://guide.ticalc.org/ The Guide]&lt;br /&gt;
* [http://karma.ticalc.org/ TI-82 ASM Corner]&lt;br /&gt;
* [http://users.hszk.bme.hu/%7Epg429/z80guide/index.htm]&lt;br /&gt;
* [z80 in 28 days online]&lt;br /&gt;
&lt;br /&gt;
== Resources ==&lt;br /&gt;
* [http://www.ballyalley.com/ml/z80_docs/z80_docs.html Z80 Docs]&lt;br /&gt;
* [http://baze.au.com/misc/z80bits.html Z80 Bits]&lt;br /&gt;
* [http://www.z80.info/ Z80 INFO]&lt;br /&gt;
&lt;br /&gt;
== Active Teams ==&lt;br /&gt;
* [http://www.omnimaga.org/ Coders of Tomorrow]&lt;br /&gt;
* [http://www.revsoft.org/ Revolution Software]&lt;br /&gt;
* [http://www.junemann.nl/maxcoderz/ MaxCoderz]&lt;br /&gt;
* [http://www.detachedsolutions.com/ Detached Solutions]&lt;br /&gt;
&lt;br /&gt;
== Inactive/Ended Teams ==&lt;br /&gt;
* [http://tift.tuxfamily.org/ TIFT]&lt;br /&gt;
* [http://void.ticalc.org/ Void]&lt;br /&gt;
* [http://sicode.ticalc.org/ SiCoDe]&lt;br /&gt;
* [http://www.radicalsoft.org/ Radical Software]&lt;br /&gt;
&lt;br /&gt;
== Miscellaneous Community Websites ==&lt;br /&gt;
* [http://tifreakware.net/ TI-Freakware]&lt;br /&gt;
&lt;br /&gt;
== IRC ==&lt;br /&gt;
* [http://www.omnimaga.org/irc #omnimaga]&lt;br /&gt;
* [http://www.omnimaga.org/irc #unitedti]&lt;br /&gt;
* [http://chat.efnet.org/ #tcpa]&lt;br /&gt;
* [http://chat.efnet.org/ #ti]&lt;br /&gt;
* [http://tcpa.calcg.org/ TCPA site]&lt;br /&gt;
&lt;br /&gt;
== Programmers site (experiments, projects...) ==&lt;br /&gt;
* [http://brandonw.net/ Brandon Wilson site]&lt;br /&gt;
* [http://benryves.com/ Ben Ryves blog]&lt;br /&gt;
* [http://www.michaelv.org/programs/calcs/ Michael Vincent]&lt;br /&gt;
* [http://www.geocities.com/jimm09876/calc/ James Montelongo experiments]&lt;br /&gt;
* [http://sami.ticalc.org/ Sami TI Calculators page]&lt;br /&gt;
* [http://www.dwedit.org/ Dwedit's Website]&lt;br /&gt;
&lt;br /&gt;
* [http://weregoose.unitedti.org/ Weregoose TI-BASIC archive]&lt;br /&gt;
* [http://antipi.omnimaga.org/ {AP} Site]&lt;br /&gt;
* [http://www.kalanrock.us/ kalan_vod Site]&lt;br /&gt;
&lt;br /&gt;
== French Sites ==&lt;br /&gt;
* [http://www.ti-fr.com/ ti-fr.com]&lt;br /&gt;
* [http://ti.bank.free.fr/ ti.bank.free.fr]&lt;br /&gt;
* [http://ti83.free.fr/ ti83.free.fr]&lt;br /&gt;
* [http://tibank.forumactif.com/forum.htm tibank.forumactif.com]&lt;br /&gt;
* [http://membres.lycos.fr/virtuaart/ticalculette/tiprogs.htm lycos.fr]&lt;br /&gt;
* [http://tift.tuxfamily.org/ tift]&lt;br /&gt;
* [http://ti-wiki.pbwiki.com/ ti-wiki.pbwiki]&lt;br /&gt;
* [http://www.yaronet.com/ yaronet]&lt;br /&gt;
&lt;br /&gt;
== German Sites ==&lt;br /&gt;
* [http://mobifiles.bytefox.de/ MobiFiles]&lt;br /&gt;
&lt;br /&gt;
== Dutch Sites ==&lt;br /&gt;
* [http://www.ti-wereld.nl TI-Wereld]&lt;br /&gt;
* [http://www.scholieren.be Scholieren.be] (mostly 'inhabited' by school students, not by programmers)&lt;br /&gt;
&amp;lt;!-- there is at least one, mapar007 knows --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Edit by Mapar: added two of them--&amp;gt;&lt;br /&gt;
== Projects ==&lt;br /&gt;
&lt;br /&gt;
===3rd party OS===&lt;br /&gt;
* [http://sourceforge.net/projects/lifos/ LIFOS]&lt;br /&gt;
* [http://lifos.sourceforge.net/wordpress/ LIFOS blog]&lt;br /&gt;
* [http://vera.timendus.com/ Vera]&lt;br /&gt;
* [http://pongos.sourceforge.net/ PongOS]&lt;br /&gt;
* [http://brandonw.net/ OS2]&lt;br /&gt;
* [http://code.google.com/p/8xpos/ XOS]&lt;br /&gt;
&lt;br /&gt;
===Others===&lt;br /&gt;
* [http://api.timendus.com/ z80 assembly API]&lt;br /&gt;
* [http://clap.timendus.com/ CLAP project]&lt;br /&gt;
* [http://bell.timendus.com/ BELL project]&lt;br /&gt;
* [http://usb8x.sourceforge.net/ USB8X]&lt;br /&gt;
* [http://msd8x.denglend.net/ MSD8X]&lt;br /&gt;
* [http://usbtools.denlend.net/ USBTools]&lt;br /&gt;
* [http://dcs.cemetech.net/ DoorsCS]&lt;br /&gt;
&lt;br /&gt;
== Online Tools ==&lt;br /&gt;
* [http://www.cemetech.net/projects/basicelite/sourcecoder2.php SourceCoder2] (Online TI-BASIC editor, syntax colouring and optimizer)&lt;br /&gt;
* [http://ti.zewaren.net/ TI.ZEWAREN.NET - BETA]&lt;br /&gt;
* [http://galandrosdev.2kool4u.net/online_asm_unsquish.php Online Hex Disassembler]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
= Memorable TI Websites =&lt;br /&gt;
That are no longer online, RIP.&lt;br /&gt;
* TI groups discussion (closed in ...)&lt;br /&gt;
* Cirrus (merged in UTI)&lt;br /&gt;
* Kevtiva (82 TI-BASIC and ASM stuff)&lt;br /&gt;
* Alienhead&lt;br /&gt;
* Assemblers Coders Zenith&lt;br /&gt;
* Badja TI Programs&lt;br /&gt;
&lt;br /&gt;
Note: use the Wayback Machine to see what they looked like: [http://www.archive.org/]&lt;br /&gt;
&lt;br /&gt;
= Other TI websites lists =&lt;br /&gt;
* [http://tifreakware.net/admin/link.php?catag=index TI-Freakware list]&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Talk:Calculator_Documentation</id>
		<title>Talk:Calculator Documentation</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Talk:Calculator_Documentation"/>
				<updated>2010-11-13T09:15:56Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: /* Nspire in WikiTI */ new section&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;I changed some headings and moved stuff around a bit. I think it's more logical this way, but I don't mind if you change it back. [[User:Jib|Jib]] 02:37, 5 November 2006 (PST)&lt;br /&gt;
: The only thing that bugs me about it is that there's a &amp;quot;software&amp;quot; heading that includes both PC software (emulators) and calculator software (TIOS Alternatives), not to mention that software documentation is listed under a different heading.  What do you think about:&lt;br /&gt;
 Z80 Calculators&lt;br /&gt;
  TI-83&lt;br /&gt;
   Programming&lt;br /&gt;
    Ports&lt;br /&gt;
    RAM Areas&lt;br /&gt;
    B_CALLs&lt;br /&gt;
    etc&lt;br /&gt;
   Software&lt;br /&gt;
    Calculator Software&lt;br /&gt;
    Emulators&lt;br /&gt;
    Linking Software&lt;br /&gt;
    TIOS Alternatives&lt;br /&gt;
  TI-83 Plus Family&lt;br /&gt;
   Programming&lt;br /&gt;
    etc...&lt;br /&gt;
   Software&lt;br /&gt;
    etc...&lt;br /&gt;
  General Z80 Info&lt;br /&gt;
   Instruction Set&lt;br /&gt;
   etc.&lt;br /&gt;
&lt;br /&gt;
 68k Calculators&lt;br /&gt;
  TI-89/92+&lt;br /&gt;
   Programming&lt;br /&gt;
    etc.&lt;br /&gt;
   Software&lt;br /&gt;
    etc.&lt;br /&gt;
  TI-92&lt;br /&gt;
   etc.&lt;br /&gt;
  General 68k Info&lt;br /&gt;
   Instruction Set&lt;br /&gt;
   etc.&lt;br /&gt;
&lt;br /&gt;
 TI Community&lt;br /&gt;
  History&lt;br /&gt;
  Programming Teams&lt;br /&gt;
  etc.&lt;br /&gt;
&lt;br /&gt;
For example, in that organization, the TI-83 Emulators page would basically just be a list of the emulators compatible with the 83, each of which would have their own separate page.  So the VTI page might be linked from the TI-83 emulators page, the TI-85 emulators page, the TI-89 emulators page, etc.  I like this organization because it makes it clear what calculators everything is applicable to, and it decreases the number of top-level categories.  But maybe it's too complicated or has too many levels.  Thoughts?  --[[User:Dan Englender|Dan Englender]] 09:11, 5 November 2006 (PST)&lt;br /&gt;
:Yeah, it seems like a good idea to me. [[User:Jib|Jib]] 12:35, 5 November 2006 (PST)&lt;br /&gt;
&lt;br /&gt;
== Descriptions ==&lt;br /&gt;
&lt;br /&gt;
I was just made aware that searching doesn't always work so well when looking for certain things.  I think adding little descriptions after each link may help new users find things that aren't easy to find using search. I would do this, but I have the grammar and spelling skills of a jar of mayonnaise.&lt;br /&gt;
&lt;br /&gt;
Also, right now the wiki isn't very friendly to newcomers. Perhaps brainstorming some ways to improve ease of use is in order.--[[User:Jim e|Jim e]] 23:03, 13 February 2007 (PST)&lt;br /&gt;
&lt;br /&gt;
:Maybe I'm &amp;quot;thinking divergently&amp;quot;, but from the very beginning, I thought the WikiTI might be a bit more useful if one could export the contents directly to a &amp;quot;wikiti.inc&amp;quot; file or something, where we can select which information we need exported (choose a calc model, choose whether you want all the bcalls, the port info, flags, and/or the RAM addresses, with or without their accompanying documentation).  Maybe, since ti83plus.inc has recently been added, a script or something could use it as a basis, adding documentation as comments such that one can simply export the necessary WikiTI data to a file, and include it directly in one's source. [[User:Saibot84|Saibot84]] 06:03, 15 February 2007 (PST)&lt;br /&gt;
&lt;br /&gt;
If you type up descriptions, I'll be more than happy to make sure the grammar and spelling is clear and correct. [[User:Threefingeredguy|threefingeredguy]] 02:36, 14 February 2007 (PST)&lt;br /&gt;
&lt;br /&gt;
== A suggestion ==&lt;br /&gt;
* What about pages to specifications of each calculator like in:&lt;br /&gt;
http://www.ticalc.org/basics/calculators/&lt;br /&gt;
They are on wanted pages, also.&lt;br /&gt;
I am not sure how to add it.&lt;br /&gt;
* And what about TI-82 and TI-85?&lt;br /&gt;
[[User:Galandros|Galandros]] 20:12, 25 October 2009 (UTC)&lt;br /&gt;
* This maybe needs serious reorganization but I don't know how exactly. It is hard.&lt;br /&gt;
[[User:Galandros|Galandros]] 15:15, 2 November 2009 (UTC)&lt;br /&gt;
&lt;br /&gt;
== Nspire in WikiTI ==&lt;br /&gt;
&lt;br /&gt;
WikiTI does not have Nspire documentation in it but I guess it can be left for Hackspire.&lt;br /&gt;
&lt;br /&gt;
At least I think we should link to Hackspire. Should it be linked on the Calculator Documentation page or only on the TI websites page?&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=To_Do_List</id>
		<title>To Do List</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=To_Do_List"/>
				<updated>2010-11-13T09:10:49Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: small edit&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= To Do List =&lt;br /&gt;
&lt;br /&gt;
== Special Pages ==&lt;br /&gt;
* [[Special:WantedCategories]]&lt;br /&gt;
* [[Special:WantedPages]]&lt;br /&gt;
* Stub pages&lt;br /&gt;
* Wikify pages&lt;br /&gt;
* Check [[Special:SpecialPages]] yourself and see what is needed&lt;br /&gt;
&lt;br /&gt;
== WikiTI ==&lt;br /&gt;
* Guide to Editing&lt;br /&gt;
* WikiTI Conventions&lt;br /&gt;
* Automatic way to delete old spam&lt;br /&gt;
&lt;br /&gt;
== Z80 Calculators Documentation ==&lt;br /&gt;
* ???&lt;br /&gt;
&lt;br /&gt;
== 68K Calculators Documentation ==&lt;br /&gt;
* ???&lt;br /&gt;
&lt;br /&gt;
== Calculator related PC Software ==&lt;br /&gt;
* ???&lt;br /&gt;
&lt;br /&gt;
== TI Community ==&lt;br /&gt;
* add more links to [[TI websites]]&lt;br /&gt;
* more complete and updated [[History of the TI Z80 community]]&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Z80_Routines:Graphic:putLargeSprite</id>
		<title>Z80 Routines:Graphic:putLargeSprite</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Z80_Routines:Graphic:putLargeSprite"/>
				<updated>2010-06-25T09:15:23Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: letters case&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Z80 Routines:Graphic|PutLargeSprite]]&lt;br /&gt;
[[Category:Z80 Routines|PutLargeSprite]]&lt;br /&gt;
The '''Largesprite''' routine copies a variable-sized sprite to the graph buffer.&lt;br /&gt;
&lt;br /&gt;
== Code ==&lt;br /&gt;
Here is Joe Wingbermuehle's version, which is the one used in ION. The label gbuf must be defined before the routine is used.&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;=======================&lt;br /&gt;
;LargeSprite&lt;br /&gt;
;by Joe Wingbermuehle&lt;br /&gt;
;=======================&lt;br /&gt;
;Does:   Copy a sprite to the gbuf&lt;br /&gt;
;Input:  ix=sprite address, a='x', l='y', b='height' (in pixels), c='width' (in bytes, e.g. 2 would be 16)&lt;br /&gt;
;Output: The sprite is copied to the gbuf&lt;br /&gt;
;-----------------------&lt;br /&gt;
largeSprite:&lt;br /&gt;
   di                                 ;turn interrupts off (we want to use shadow registers)&lt;br /&gt;
   ex   af,af'&lt;br /&gt;
                                      ;exchange af with af'     \&lt;br /&gt;
   ld   a,c                           ;ld c in a (a = 'width')  | for not destroying a ('x')&lt;br /&gt;
   push   af                          ;push a                   |&lt;br /&gt;
      ex   af,af'&lt;br /&gt;
                                      ;exchange back            | and 'width' is now in a' (saved)&lt;br /&gt;
      ld   e,l                        ;e = 'y'&lt;br /&gt;
      ld   h,$00                      ;h =  0&lt;br /&gt;
      ld   d,h                        ;d =  0&lt;br /&gt;
      add   hl,de                     ;'y' *2  \&lt;br /&gt;
      add   hl,de                     ;    *3  | calculate 'y' *12 because 'y' is 'in rows'&lt;br /&gt;
      add   hl,hl                     ;    *6  |   (each screen row is 12 bytes wide)&lt;br /&gt;
      add   hl,hl                     ;    *12 /&lt;br /&gt;
      ld   e,a                        ;e = 'x'&lt;br /&gt;
      and   $07                       ;and %00000111&lt;br /&gt;
      ld   c,a                        ;last 3 bits in c (amount of bits to shift all bytes)&lt;br /&gt;
      srl   e                         ;e/2   | shifting e ('x') 3 bits to the right&lt;br /&gt;
      srl   e                         ; /4   |   %11111111 becomes %00011111 for example&lt;br /&gt;
      srl   e                         ; /8   /&lt;br /&gt;
      add   hl,de                     ;hl = 'y'*12; de = 'x'/8 (rounded down) | add them&lt;br /&gt;
      ld   de, gbuf                   ;de = the address of the graph buffer&lt;br /&gt;
      add   hl,de                     ;add hl to the address of the gbuf&lt;br /&gt;
largeSpriteLoop1:&lt;br /&gt;
      push   hl                       ;save address&lt;br /&gt;
largeSpriteLoop2:&lt;br /&gt;
         ld   d,(ix)                  ;first sprite data in d&lt;br /&gt;
         ld   e,$00                   ;e = 0&lt;br /&gt;
         ld   a,c                     ;a = c (to not destroy c)&lt;br /&gt;
         or   a                       ;is a = 0? (same as cp 0)&lt;br /&gt;
         jr   z,largeSpriteSkip1      ;if there's nothing to shift (a = 0), skip the shift loop&lt;br /&gt;
largeSpriteLoop3:&lt;br /&gt;
         srl   d                      ;shift one bit to the right; put the destroyed bit in the carry flag&lt;br /&gt;
         rr   e                       ;put the carry flag in e (%00000000 becomes %10000000 if carry flag = 1)&lt;br /&gt;
         dec   a                      ;decrease counter (which was 'the amount of bits to shift')&lt;br /&gt;
         jr   nz,largeSpriteLoop3     ;if the counter is not 0 loop back&lt;br /&gt;
largeSpriteSkip1:&lt;br /&gt;
         ld   a,(hl)                  ;graphbyte in a&lt;br /&gt;
         xor   d                      ;xor first byte of sprite (that can be changed to 'or d' if you want a OR-routine)&lt;br /&gt;
         ld   (hl),a                  ;back to buffer&lt;br /&gt;
         inc   hl                     ;increase pointer&lt;br /&gt;
         ld   a,(hl)                  ;graphbyte in a&lt;br /&gt;
         xor   e                      ;xor with shifted sprite byte (change to 'or e' for OR-routine)&lt;br /&gt;
         ld   (hl),a                  ;back to buffer&lt;br /&gt;
         inc   ix                     ;increase sprite address&lt;br /&gt;
         ex   af,af'&lt;br /&gt;
                                      ;exchange af with af' ( a is now the 'width' from the first line)&lt;br /&gt;
         dec   a                      ;decrease 'width'&lt;br /&gt;
         push   af                    ;push the 'width'&lt;br /&gt;
            ex   af,af'&lt;br /&gt;
                                      ;exchange back&lt;br /&gt;
         pop   af                     ;pop the 'width'&lt;br /&gt;
         jr   nz,largeSpriteLoop2     ;if a is not 0 (if a = 0 then we would be done) loop it&lt;br /&gt;
      pop   hl                        ;pop gbuf address (matches the push hl at largeSpriteLoop1)&lt;br /&gt;
   pop   af                           ;pop  | to restore the real 'width'&lt;br /&gt;
   push   af                          ;push /&lt;br /&gt;
      ex   af,af'&lt;br /&gt;
                                      ;af' must be the original 'width' when loop 'largeSpriteLoop1'&lt;br /&gt;
      ld   de,$0C                     ;ld de,12&lt;br /&gt;
      add   hl,de                     ;next line&lt;br /&gt;
      djnz   largeSpriteLoop1         ;if not b = 0 loop (b = height of sprite)&lt;br /&gt;
   pop   af                           ;pop because we don't want a stack problem :)&lt;br /&gt;
   ret                                ;return&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
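&lt;br /&gt;
The address arithmetic in the routine above can be modelled away from the calculator. The following is only a sketch in Python, not part of ION: the names ROW_BYTES, gbuf_offset and shift_sprite_byte are made up for illustration, and it assumes the usual 96x64 screen with 12 bytes per row.&lt;br /&gt;

```python
# Hypothetical model of the offset maths in largeSprite; not part of the routine.
ROW_BYTES = 12  # 96 pixels per row / 8 pixels per byte


def gbuf_offset(x, y):
    """Return (byte offset into the buffer, bit shift) for pixel (x, y)."""
    return y * ROW_BYTES + x // 8, x % 8


def shift_sprite_byte(value, shift):
    """Shift one sprite byte right by `shift` bits, as the srl d / rr e loop does.

    Returns (first, second): the byte for the current buffer cell and the
    bits that spill over into the next cell.
    """
    wide = (value * 256) // (2 ** shift)  # 16-bit value held in the d:e pair
    return divmod(wide, 256)
```

With x = 16 and y = 8, gbuf_offset gives offset 98 and shift 0, so no shifting is needed. A shift of 3 turns $FF into $1F followed by $E0, which is why every sprite byte is written across two buffer bytes.&lt;br /&gt;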
&lt;br /&gt;
== Example ==&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
   ;...&lt;br /&gt;
   ld   l,8   ;y&lt;br /&gt;
   ld   a,16  ;x&lt;br /&gt;
   ld   b,8   ;height&lt;br /&gt;
   ld   c,2   ;width in bytes&lt;br /&gt;
   ld   ix,sprite&lt;br /&gt;
   call largeSprite&lt;br /&gt;
   call fastcopy&lt;br /&gt;
   ;...&lt;br /&gt;
sprite:&lt;br /&gt;
   .db %11111111,%11111111&lt;br /&gt;
   .db %10000000,%00000001&lt;br /&gt;
   .db %10000000,%00000001&lt;br /&gt;
   .db %10000000,%00000001&lt;br /&gt;
   .db %10000000,%00000001&lt;br /&gt;
   .db %10000000,%00000001&lt;br /&gt;
   .db %10000000,%00000001&lt;br /&gt;
   .db %11111111,%11111111&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Version without shadow registers ==&lt;br /&gt;
&lt;br /&gt;
By Tijl Coosemans, made for Venus. It is compatible with ION's routine; screenBuf must be defined.&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
iLargeSprite&lt;br /&gt;
	ld	h,0&lt;br /&gt;
	ld	d,h&lt;br /&gt;
	ld	e,l&lt;br /&gt;
	add	hl,de&lt;br /&gt;
	add	hl,de&lt;br /&gt;
	add	hl,hl&lt;br /&gt;
	add	hl,hl&lt;br /&gt;
	ld	e,a&lt;br /&gt;
	srl	e&lt;br /&gt;
	srl	e&lt;br /&gt;
	srl	e&lt;br /&gt;
	add	hl,de&lt;br /&gt;
	ld	de,screenBuf&lt;br /&gt;
	add	hl,de&lt;br /&gt;
	and	7&lt;br /&gt;
	ld	e,a&lt;br /&gt;
iLargeSpriteLoop1&lt;br /&gt;
	push	bc&lt;br /&gt;
	push	hl&lt;br /&gt;
	ld	b,c&lt;br /&gt;
iLargeSpriteLoop2&lt;br /&gt;
	ld	c,(ix)&lt;br /&gt;
	ld	d,0&lt;br /&gt;
	inc	ix&lt;br /&gt;
	ld	a,e&lt;br /&gt;
	or	a&lt;br /&gt;
	jr	z,iLargeSprite1&lt;br /&gt;
iLargeSpriteLoop3&lt;br /&gt;
	srl	c&lt;br /&gt;
	rr	d&lt;br /&gt;
	dec	a&lt;br /&gt;
	jr	nz,iLargeSpriteLoop3&lt;br /&gt;
iLargeSprite1&lt;br /&gt;
	ld	a,c&lt;br /&gt;
	xor	(hl)&lt;br /&gt;
	ld	(hl),a&lt;br /&gt;
	inc	hl&lt;br /&gt;
	ld	a,d&lt;br /&gt;
	xor	(hl)&lt;br /&gt;
	ld	(hl),a&lt;br /&gt;
	djnz	iLargeSpriteLoop2&lt;br /&gt;
	pop	hl&lt;br /&gt;
	ld	c,12&lt;br /&gt;
	add	hl,bc&lt;br /&gt;
	pop	bc&lt;br /&gt;
	djnz	iLargeSpriteLoop1&lt;br /&gt;
	ret&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Version with Clipping ==&lt;br /&gt;
This is a version that supports clipping the large sprite.  It is considerably larger and requires self-modifying code (SMC).  The SMC can be removed without too much difficulty. It takes different inputs than ION's routine.&lt;br /&gt;
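&lt;br /&gt;
As a rough illustration of what a clipping routine has to work out, here is a small sketch in Python. It is a pixel-level model only; the assembly below works on whole bytes plus masks instead, and every name here (visible_span, clip, SCREEN_W, SCREEN_H) is hypothetical.&lt;br /&gt;

```python
# Hypothetical sketch of the clipping arithmetic; names are illustrative only.
SCREEN_W, SCREEN_H = 96, 64


def visible_span(start, length, limit):
    """Return (first visible index within the sprite, count of visible pixels)."""
    first = max(0, -start)             # pixels cut off before coordinate 0
    last = min(length, limit - start)  # one past the last visible pixel
    return first, max(0, last - first)


def clip(x, y, width_bytes, height):
    """Return the visible (columns, rows) of a sprite, or None if off screen."""
    cols = visible_span(x, width_bytes * 8, SCREEN_W)
    rows = visible_span(y, height, SCREEN_H)
    if cols[1] == 0 or rows[1] == 0:
        return None  # corresponds to the early-out checks at the top
    return cols, rows
```

A 16x8 sprite drawn at x = -4, y = -2 keeps columns 4 to 15 and rows 2 to 7, while the same sprite at x = 96 is rejected outright.&lt;br /&gt;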
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;;--------------------------------&lt;br /&gt;
;Clip Big Sprite&lt;br /&gt;
;by James Montelongo&lt;br /&gt;
;MAX SIZE: 64x64&lt;br /&gt;
;ix - Sprite&lt;br /&gt;
;b  - height&lt;br /&gt;
;c  - width in bytes&lt;br /&gt;
;d  - x&lt;br /&gt;
;e  - y&lt;br /&gt;
&lt;br /&gt;
ClipBigSprite:&lt;br /&gt;
; Early out: check if it's even remotely on screen&lt;br /&gt;
	ld a,e&lt;br /&gt;
	cp 64&lt;br /&gt;
	ret p&lt;br /&gt;
	add a,b&lt;br /&gt;
	ret m&lt;br /&gt;
	ret z&lt;br /&gt;
	ld a,d&lt;br /&gt;
	cp 96&lt;br /&gt;
	ret p&lt;br /&gt;
	ld a,c&lt;br /&gt;
	add a,a&lt;br /&gt;
	add a,a&lt;br /&gt;
	add a,a&lt;br /&gt;
	add a,d&lt;br /&gt;
	ret m&lt;br /&gt;
	ret z&lt;br /&gt;
&lt;br /&gt;
	ld a,e&lt;br /&gt;
	or a&lt;br /&gt;
	jp p,Check_clip_bottom&lt;br /&gt;
	neg&lt;br /&gt;
	push de&lt;br /&gt;
	ld hl,0&lt;br /&gt;
	ld d,l&lt;br /&gt;
	ld e,a&lt;br /&gt;
	bit 2,c&lt;br /&gt;
	jr z,$+2+1&lt;br /&gt;
	add hl,de&lt;br /&gt;
	add hl,hl	&lt;br /&gt;
	bit 1,c&lt;br /&gt;
	jr z,$+2+1&lt;br /&gt;
	add hl,de&lt;br /&gt;
	add hl,hl	&lt;br /&gt;
	bit 0,c&lt;br /&gt;
	jr z,$+2+1&lt;br /&gt;
	add hl,de&lt;br /&gt;
	pop de&lt;br /&gt;
	ex de,hl&lt;br /&gt;
	add ix,de		;Here you can save the top offset&lt;br /&gt;
	ex de,hl&lt;br /&gt;
	ld e,0&lt;br /&gt;
	neg&lt;br /&gt;
	add a,b&lt;br /&gt;
	ld b,a&lt;br /&gt;
Check_clip_bottom:&lt;br /&gt;
&lt;br /&gt;
	ld a,e&lt;br /&gt;
	add a,b&lt;br /&gt;
	sub 64&lt;br /&gt;
	jp m,Check_clip_Left&lt;br /&gt;
	neg&lt;br /&gt;
	add a,b&lt;br /&gt;
	ld b,a&lt;br /&gt;
Check_clip_Left:&lt;br /&gt;
				; at this point you may want to save b&lt;br /&gt;
	xor a&lt;br /&gt;
	ld (bigskip),a&lt;br /&gt;
	ld a,Clipleftsize &lt;br /&gt;
	ld (Do_Clipleft),a&lt;br /&gt;
	ld a,d&lt;br /&gt;
	or a&lt;br /&gt;
	jp p,Check_clip_right&lt;br /&gt;
	cpl&lt;br /&gt;
	and $F8&lt;br /&gt;
	rra&lt;br /&gt;
	rra&lt;br /&gt;
	rra&lt;br /&gt;
	ex de,hl		;save the clipped left offset&lt;br /&gt;
	ld e,a&lt;br /&gt;
	ld d,0&lt;br /&gt;
	add ix,de&lt;br /&gt;
	ld (bigskip),a&lt;br /&gt;
	ex de,hl&lt;br /&gt;
	inc a&lt;br /&gt;
	neg&lt;br /&gt;
	add a,c&lt;br /&gt;
	ld c,a&lt;br /&gt;
	xor a&lt;br /&gt;
	ld (Do_Clipleft),a&lt;br /&gt;
	ld a,d&lt;br /&gt;
	and $07&lt;br /&gt;
	ld d,a&lt;br /&gt;
Check_clip_right:&lt;br /&gt;
&lt;br /&gt;
	ld a,Cliprightsize &lt;br /&gt;
	ld (Do_Clipright),a&lt;br /&gt;
	ld a,c&lt;br /&gt;
	add a,a&lt;br /&gt;
	add a,a&lt;br /&gt;
	add a,a&lt;br /&gt;
	add a,d&lt;br /&gt;
	sub 96&lt;br /&gt;
	jp m,Check_clip_middle&lt;br /&gt;
	and $F8&lt;br /&gt;
	rra&lt;br /&gt;
	rra&lt;br /&gt;
	rra&lt;br /&gt;
	ld l,a&lt;br /&gt;
	ld a,(bigskip)&lt;br /&gt;
	add a,l&lt;br /&gt;
	inc a&lt;br /&gt;
	ld (bigskip),a&lt;br /&gt;
	neg&lt;br /&gt;
	add a,c&lt;br /&gt;
	ld c,a&lt;br /&gt;
	xor a&lt;br /&gt;
	ld (Do_Clipright),a&lt;br /&gt;
Check_clip_middle:&lt;br /&gt;
				; This is where C should be saved.&lt;br /&gt;
	xor a&lt;br /&gt;
	ld (Do_ClipMiddle),a&lt;br /&gt;
	ld a,c&lt;br /&gt;
	or a&lt;br /&gt;
	jp nz,dontskipmiddle&lt;br /&gt;
	ld a,ClipMiddlesize &lt;br /&gt;
	ld (Do_ClipMiddle),a&lt;br /&gt;
dontskipmiddle:&lt;br /&gt;
	ld l,e&lt;br /&gt;
	ld a,d	&lt;br /&gt;
	ld h,0&lt;br /&gt;
	ld d,h&lt;br /&gt;
	add hl,hl&lt;br /&gt;
	add hl,de&lt;br /&gt;
	add hl,hl&lt;br /&gt;
	add hl,hl&lt;br /&gt;
	ld e,a&lt;br /&gt;
	and $07&lt;br /&gt;
	xor 7&lt;br /&gt;
	ld (BigRot1),a&lt;br /&gt;
	ld (BigRot2),a&lt;br /&gt;
	ld (BigRot3),a&lt;br /&gt;
	add a,a&lt;br /&gt;
	ld (clipbigrot1),a&lt;br /&gt;
	ld a,$ff&lt;br /&gt;
clipbigrot1 = $+1&lt;br /&gt;
	jr $&lt;br /&gt;
	srl a&lt;br /&gt;
	srl a&lt;br /&gt;
	srl a&lt;br /&gt;
	srl a&lt;br /&gt;
	srl a&lt;br /&gt;
	srl a&lt;br /&gt;
	srl a&lt;br /&gt;
	srl e&lt;br /&gt;
	srl e&lt;br /&gt;
	srl e&lt;br /&gt;
	add hl,de&lt;br /&gt;
	ld de,gbuf&lt;br /&gt;
	add hl,de&lt;br /&gt;
				; This is where gbuf offset should be saved.&lt;br /&gt;
	ld d,a&lt;br /&gt;
	cpl&lt;br /&gt;
	ld e,a&lt;br /&gt;
				;masks should be saved too&lt;br /&gt;
BigSpriteRow:&lt;br /&gt;
	push bc&lt;br /&gt;
	push hl&lt;br /&gt;
	ld b,c&lt;br /&gt;
Do_Clipleft = $+1&lt;br /&gt;
	jr Clipleft&lt;br /&gt;
	ld a,(ix)&lt;br /&gt;
	inc ix&lt;br /&gt;
BigRot1 = $+1&lt;br /&gt;
	jr $&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
BigMask0:&lt;br /&gt;
	and e&lt;br /&gt;
	or (hl)&lt;br /&gt;
	ld (hl),a&lt;br /&gt;
Clipleft:&lt;br /&gt;
Clipleftsize = Clipleft-(Do_Clipleft+1)&lt;br /&gt;
&lt;br /&gt;
Do_ClipMiddle = $+1&lt;br /&gt;
	jr $+2&lt;br /&gt;
BigSpriteloop:&lt;br /&gt;
	ld a,(ix)&lt;br /&gt;
	inc ix&lt;br /&gt;
BigRot2 = $+1&lt;br /&gt;
	jr $&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	ld c,a&lt;br /&gt;
BigMask1:&lt;br /&gt;
	and d&lt;br /&gt;
	or (hl)&lt;br /&gt;
	ld (hl),a&lt;br /&gt;
	inc hl&lt;br /&gt;
	ld a,c&lt;br /&gt;
BigMask2:&lt;br /&gt;
	and e&lt;br /&gt;
	or (hl)&lt;br /&gt;
	ld (hl),a&lt;br /&gt;
	djnz BigSpriteloop&lt;br /&gt;
ClipMiddle:&lt;br /&gt;
ClipMiddlesize = ClipMiddle-(Do_ClipMiddle+1)&lt;br /&gt;
&lt;br /&gt;
Do_ClipRight = $+1&lt;br /&gt;
	jr ClipRight&lt;br /&gt;
	ld a,(ix)&lt;br /&gt;
BigRot3 = $+1&lt;br /&gt;
	jr $&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
BigMask3:&lt;br /&gt;
	and d&lt;br /&gt;
	or (hl)&lt;br /&gt;
	ld (hl),a&lt;br /&gt;
ClipRight:&lt;br /&gt;
Cliprightsize = ClipRight-(Do_ClipRight+1)&lt;br /&gt;
	pop hl&lt;br /&gt;
&lt;br /&gt;
	ld bc,12			;width of the screen&lt;br /&gt;
	add hl,bc&lt;br /&gt;
&lt;br /&gt;
bigskip = $+1&lt;br /&gt;
	ld bc,0&lt;br /&gt;
	add ix,bc&lt;br /&gt;
	pop bc&lt;br /&gt;
	djnz BigSpriteRow&lt;br /&gt;
	ret&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Z80_Routines:Graphic:putLargeSprite</id>
		<title>Z80 Routines:Graphic:putLargeSprite</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Z80_Routines:Graphic:putLargeSprite"/>
				<updated>2010-06-25T09:14:41Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: category position&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Z80 Routines:Graphic|putLargeSprite]]&lt;br /&gt;
[[Category:Z80 Routines|putLargeSprite]]&lt;br /&gt;
The '''Largesprite''' routine copies a variable-sized sprite to the graph buffer.&lt;br /&gt;
&lt;br /&gt;
== Code ==&lt;br /&gt;
Here is Joe Wingbermuehle's version, which is the one used in ION. gbuf must be defined before the routine is used.&lt;br /&gt;
&lt;br /&gt;
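Which address gbuf refers to depends on the calculator and on the include file in use. As a minimal sketch, assuming a TI-83 Plus (where the graph buffer, plotSScreen, sits at $9340) and assuming no include file already defines the name, it could be declared like this:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Assumption: TI-83 Plus, graph buffer (plotSScreen) at $9340.&lt;br /&gt;
;Leave this out if your shell's include file already defines gbuf.&lt;br /&gt;
gbuf = $9340&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;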
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;=======================&lt;br /&gt;
;LargeSprite&lt;br /&gt;
;by Joe Wingbermuehle&lt;br /&gt;
;=======================&lt;br /&gt;
;Does:   Copy a sprite to the gbuf&lt;br /&gt;
;Input:  ix=sprite address, a='x', l='y', b='height' (in pixels), c='width' (in bytes, e.g. 2 would be 16 pixels)&lt;br /&gt;
;Output: The sprite is copied to the gbuf&lt;br /&gt;
;-----------------------&lt;br /&gt;
largeSprite:&lt;br /&gt;
   di                                 ;turn interrupts off (we want to use shadow registers)&lt;br /&gt;
   ex   af,af'&lt;br /&gt;
                                      ;exchange af with af'     \&lt;br /&gt;
   ld   a,c                           ;ld c in a (a = 'width')  | for not destroying a ('x')&lt;br /&gt;
   push   af                          ;push a                   |&lt;br /&gt;
      ex   af,af'&lt;br /&gt;
                                      ;exchange back            | and 'width' is now in a' (saved)&lt;br /&gt;
      ld   e,l                        ;e = 'y'&lt;br /&gt;
      ld   h,$00                      ;h =  0&lt;br /&gt;
      ld   d,h                        ;d =  0&lt;br /&gt;
      add   hl,de                     ;'y' *2  \&lt;br /&gt;
      add   hl,de                     ;    *3  | calculate 'y' *12 because 'y' is 'in rows'&lt;br /&gt;
      add   hl,hl                     ;    *6  |   (each screen row is 12 bytes wide)&lt;br /&gt;
      add   hl,hl                     ;    *12 /&lt;br /&gt;
      ld   e,a                        ;e = 'x'&lt;br /&gt;
      and   $07                       ;and %00000111&lt;br /&gt;
      ld   c,a                        ;last 3 bits in c (amount of bits to shift all bytes)&lt;br /&gt;
      srl   e                         ;e/2   | shifting e ('x') 3 bits to the right&lt;br /&gt;
      srl   e                         ; /4   |   %11111111 becomes %00011111 for example&lt;br /&gt;
      srl   e                         ; /8   /&lt;br /&gt;
      add   hl,de                     ;hl = 'y'*12, de = 'x'/8 (rounded down) | add them&lt;br /&gt;
      ld   de, gbuf                   ;de = the address of the graph buffer&lt;br /&gt;
      add   hl,de                     ;add hl to the address of the gbuf&lt;br /&gt;
largeSpriteLoop1:&lt;br /&gt;
      push   hl                       ;save address&lt;br /&gt;
largeSpriteLoop2:&lt;br /&gt;
         ld   d,(ix)                  ;first sprite data in d&lt;br /&gt;
         ld   e,$00                   ;e = 0&lt;br /&gt;
         ld   a,c                     ;a = c (to not destroy c)&lt;br /&gt;
         or   a                       ;is a = 0? (same as cp 0)&lt;br /&gt;
         jr   z,largeSpriteSkip1      ;if there's nothing to shift (a = 0), skip the shift loop&lt;br /&gt;
largeSpriteLoop3:&lt;br /&gt;
         srl   d                      ;shift one bit to the right; the shifted-out bit goes into the carry flag&lt;br /&gt;
         rr   e                       ;put the carry flag in e (%00000000 becomes %10000000 if carry flag = 1)&lt;br /&gt;
         dec   a                      ;decrease counter (which was 'the amount of bits to shift')&lt;br /&gt;
         jr   nz,largeSpriteLoop3     ;if the counter is not 0 loop back&lt;br /&gt;
largeSpriteSkip1:&lt;br /&gt;
         ld   a,(hl)                  ;graphbyte in a&lt;br /&gt;
         xor   d                      ;xor with the first sprite byte (change to 'or d' for an OR routine)&lt;br /&gt;
         ld   (hl),a                  ;back to buffer&lt;br /&gt;
         inc   hl                     ;increase pointer&lt;br /&gt;
         ld   a,(hl)                  ;graphbyte in a&lt;br /&gt;
         xor   e                      ;xor with shifted sprite byte (change to 'or e' for OR-routine)&lt;br /&gt;
         ld   (hl),a                  ;back to buffer&lt;br /&gt;
         inc   ix                     ;increase sprite address&lt;br /&gt;
         ex   af,af'&lt;br /&gt;
                                      ;exchange af with af' ( a is now the 'width' from the first line)&lt;br /&gt;
         dec   a                      ;decrease 'width'&lt;br /&gt;
         push   af                    ;push the 'width'&lt;br /&gt;
            ex   af,af'&lt;br /&gt;
                                      ;exchange back&lt;br /&gt;
         pop   af                     ;pop the 'width'&lt;br /&gt;
         jr   nz,largeSpriteLoop2     ;if a is not 0 (if a = 0 then we would be done) loop it&lt;br /&gt;
      pop   hl                        ;pop gbuf address (matches the last push hl)&lt;br /&gt;
   pop   af                           ;pop  | to restore the real 'width'&lt;br /&gt;
   push   af                          ;push /&lt;br /&gt;
      ex   af,af'&lt;br /&gt;
                                      ;af' must be the original 'width' when loop 'largeSpriteLoop1'&lt;br /&gt;
      ld   de,$0C                     ;ld de,12&lt;br /&gt;
      add   hl,de                     ;next line&lt;br /&gt;
      djnz   largeSpriteLoop1         ;if not b = 0 loop (b = height of sprite)&lt;br /&gt;
   pop   af                           ;pop because we don't want a stack problem :)&lt;br /&gt;
   ret                                ;return&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Example ==&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
   ;...&lt;br /&gt;
   ld   l,8   ;y&lt;br /&gt;
   ld   a,16  ;x&lt;br /&gt;
   ld   b,8   ;height&lt;br /&gt;
   ld   c,2   ;width in bytes&lt;br /&gt;
   ld   ix,sprite&lt;br /&gt;
   call largeSprite&lt;br /&gt;
   call fastcopy&lt;br /&gt;
   ;...&lt;br /&gt;
sprite:&lt;br /&gt;
   .db %11111111,%11111111&lt;br /&gt;
   .db %10000000,%00000001&lt;br /&gt;
   .db %10000000,%00000001&lt;br /&gt;
   .db %10000000,%00000001&lt;br /&gt;
   .db %10000000,%00000001&lt;br /&gt;
   .db %10000000,%00000001&lt;br /&gt;
   .db %10000000,%00000001&lt;br /&gt;
   .db %11111111,%11111111&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Version without shadow registers ==&lt;br /&gt;
&lt;br /&gt;
Written by Tijl Coosemans for Venus. It is compatible with ION's routine; screenBuf must be defined before use.&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
iLargeSprite&lt;br /&gt;
	ld	h,0&lt;br /&gt;
	ld	d,h&lt;br /&gt;
	ld	e,l&lt;br /&gt;
	add	hl,de&lt;br /&gt;
	add	hl,de&lt;br /&gt;
	add	hl,hl&lt;br /&gt;
	add	hl,hl&lt;br /&gt;
	ld	e,a&lt;br /&gt;
	srl	e&lt;br /&gt;
	srl	e&lt;br /&gt;
	srl	e&lt;br /&gt;
	add	hl,de&lt;br /&gt;
	ld	de,screenBuf&lt;br /&gt;
	add	hl,de&lt;br /&gt;
	and	7&lt;br /&gt;
	ld	e,a&lt;br /&gt;
iLargeSpriteLoop1&lt;br /&gt;
	push	bc&lt;br /&gt;
	push	hl&lt;br /&gt;
	ld	b,c&lt;br /&gt;
iLargeSpriteLoop2&lt;br /&gt;
	ld	c,(ix)&lt;br /&gt;
	ld	d,0&lt;br /&gt;
	inc	ix&lt;br /&gt;
	ld	a,e&lt;br /&gt;
	or	a&lt;br /&gt;
	jr	z,iLargeSprite1&lt;br /&gt;
iLargeSpriteLoop3&lt;br /&gt;
	srl	c&lt;br /&gt;
	rr	d&lt;br /&gt;
	dec	a&lt;br /&gt;
	jr	nz,iLargeSpriteLoop3&lt;br /&gt;
iLargeSprite1&lt;br /&gt;
	ld	a,c&lt;br /&gt;
	xor	(hl)&lt;br /&gt;
	ld	(hl),a&lt;br /&gt;
	inc	hl&lt;br /&gt;
	ld	a,d&lt;br /&gt;
	xor	(hl)&lt;br /&gt;
	ld	(hl),a&lt;br /&gt;
	djnz	iLargeSpriteLoop2&lt;br /&gt;
	pop	hl&lt;br /&gt;
	ld	c,12&lt;br /&gt;
	add	hl,bc&lt;br /&gt;
	pop	bc&lt;br /&gt;
	djnz	iLargeSpriteLoop1&lt;br /&gt;
	ret&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Version with Clipping ==&lt;br /&gt;
This version supports clipping the large sprite. It is considerably larger and relies on self-modifying code (SMC), though the SMC can be removed without too much difficulty. Its inputs differ from ION's.&lt;br /&gt;
&lt;br /&gt;
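A minimal usage sketch for the routine that follows, assuming the sprite data and fastcopy from the example above; the coordinates are made up for illustration. Here x = -4, so the leftmost four pixel columns of the sprite should be clipped:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
   ;...&lt;br /&gt;
   ld   ix,sprite   ;16x8 sprite data from the example above&lt;br /&gt;
   ld   b,8         ;height in pixels&lt;br /&gt;
   ld   c,2         ;width in bytes&lt;br /&gt;
   ld   d,-4        ;x, treated as signed (illustrative value: 4 pixels off the left edge)&lt;br /&gt;
   ld   e,28        ;y (illustrative value)&lt;br /&gt;
   call ClipBigSprite&lt;br /&gt;
   call fastcopy    ;assumed to be available, as in the example above&lt;br /&gt;
   ;...&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;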
 &amp;lt;nowiki&amp;gt;;--------------------------------&lt;br /&gt;
;Clip Big Sprite&lt;br /&gt;
;by James Montelongo&lt;br /&gt;
;MAX SIZE: 64x64&lt;br /&gt;
;ix - Sprite&lt;br /&gt;
;b  - height&lt;br /&gt;
;c  - width in bytes&lt;br /&gt;
;d  - x&lt;br /&gt;
;e  - y&lt;br /&gt;
&lt;br /&gt;
ClipBigSprite:&lt;br /&gt;
; Early out: check whether it's even remotely on screen&lt;br /&gt;
	ld a,e&lt;br /&gt;
	cp 64&lt;br /&gt;
	ret p&lt;br /&gt;
	add a,b&lt;br /&gt;
	ret m&lt;br /&gt;
	ret z&lt;br /&gt;
	ld a,d&lt;br /&gt;
	cp 96&lt;br /&gt;
	ret p&lt;br /&gt;
	ld a,c&lt;br /&gt;
	add a,a&lt;br /&gt;
	add a,a&lt;br /&gt;
	add a,a&lt;br /&gt;
	add a,d&lt;br /&gt;
	ret m&lt;br /&gt;
	ret z&lt;br /&gt;
&lt;br /&gt;
	ld a,e&lt;br /&gt;
	or a&lt;br /&gt;
	jp p,Check_clip_bottom&lt;br /&gt;
	neg&lt;br /&gt;
	push de&lt;br /&gt;
	ld hl,0&lt;br /&gt;
	ld d,l&lt;br /&gt;
	ld e,a&lt;br /&gt;
	bit 2,c&lt;br /&gt;
	jr z,$+2+1&lt;br /&gt;
	add hl,de&lt;br /&gt;
	add hl,hl	&lt;br /&gt;
	bit 1,c&lt;br /&gt;
	jr z,$+2+1&lt;br /&gt;
	add hl,de&lt;br /&gt;
	add hl,hl	&lt;br /&gt;
	bit 0,c&lt;br /&gt;
	jr z,$+2+1&lt;br /&gt;
	add hl,de&lt;br /&gt;
	pop de&lt;br /&gt;
	ex de,hl&lt;br /&gt;
	add ix,de		;Here you can save the top offset&lt;br /&gt;
	ex de,hl&lt;br /&gt;
	ld e,0&lt;br /&gt;
	neg&lt;br /&gt;
	add a,b&lt;br /&gt;
	ld b,a&lt;br /&gt;
Check_clip_bottom:&lt;br /&gt;
&lt;br /&gt;
	ld a,e&lt;br /&gt;
	add a,b&lt;br /&gt;
	sub 64&lt;br /&gt;
	jp m,Check_clip_Left&lt;br /&gt;
	neg&lt;br /&gt;
	add a,b&lt;br /&gt;
	ld b,a&lt;br /&gt;
Check_clip_Left:&lt;br /&gt;
				; at this point you may want to save b&lt;br /&gt;
	xor a&lt;br /&gt;
	ld (bigskip),a&lt;br /&gt;
	ld a,Clipleftsize &lt;br /&gt;
	ld (Do_Clipleft),a&lt;br /&gt;
	ld a,d&lt;br /&gt;
	or a&lt;br /&gt;
	jp p,Check_clip_right&lt;br /&gt;
	cpl&lt;br /&gt;
	and $F8&lt;br /&gt;
	rra&lt;br /&gt;
	rra&lt;br /&gt;
	rra&lt;br /&gt;
	ex de,hl		;save the clipped left offset&lt;br /&gt;
	ld e,a&lt;br /&gt;
	ld d,0&lt;br /&gt;
	add ix,de&lt;br /&gt;
	ld (bigskip),a&lt;br /&gt;
	ex de,hl&lt;br /&gt;
	inc a&lt;br /&gt;
	neg&lt;br /&gt;
	add a,c&lt;br /&gt;
	ld c,a&lt;br /&gt;
	xor a&lt;br /&gt;
	ld (Do_Clipleft),a&lt;br /&gt;
	ld a,d&lt;br /&gt;
	and $07&lt;br /&gt;
	ld d,a&lt;br /&gt;
Check_clip_right:&lt;br /&gt;
&lt;br /&gt;
	ld a,Cliprightsize &lt;br /&gt;
	ld (Do_ClipRight),a&lt;br /&gt;
	ld a,c&lt;br /&gt;
	add a,a&lt;br /&gt;
	add a,a&lt;br /&gt;
	add a,a&lt;br /&gt;
	add a,d&lt;br /&gt;
	sub 96&lt;br /&gt;
	jp m,Check_clip_middle&lt;br /&gt;
	and $F8&lt;br /&gt;
	rra&lt;br /&gt;
	rra&lt;br /&gt;
	rra&lt;br /&gt;
	ld l,a&lt;br /&gt;
	ld a,(bigskip)&lt;br /&gt;
	add a,l&lt;br /&gt;
	inc a&lt;br /&gt;
	ld (bigskip),a&lt;br /&gt;
	neg&lt;br /&gt;
	add a,c&lt;br /&gt;
	ld c,a&lt;br /&gt;
	xor a&lt;br /&gt;
	ld (Do_ClipRight),a&lt;br /&gt;
Check_clip_middle:&lt;br /&gt;
				; This is where C should be saved.&lt;br /&gt;
	xor a&lt;br /&gt;
	ld (Do_ClipMiddle),a&lt;br /&gt;
	ld a,c&lt;br /&gt;
	or a&lt;br /&gt;
	jp nz,dontskipmiddle&lt;br /&gt;
	ld a,ClipMiddlesize &lt;br /&gt;
	ld (Do_ClipMiddle),a&lt;br /&gt;
dontskipmiddle:&lt;br /&gt;
	ld l,e&lt;br /&gt;
	ld a,d	&lt;br /&gt;
	ld h,0&lt;br /&gt;
	ld d,h&lt;br /&gt;
	add hl,hl&lt;br /&gt;
	add hl,de&lt;br /&gt;
	add hl,hl&lt;br /&gt;
	add hl,hl&lt;br /&gt;
	ld e,a&lt;br /&gt;
	and $07&lt;br /&gt;
	xor 7&lt;br /&gt;
	ld (BigRot1),a&lt;br /&gt;
	ld (BigRot2),a&lt;br /&gt;
	ld (BigRot3),a&lt;br /&gt;
	add a,a&lt;br /&gt;
	ld (clipbigrot1),a&lt;br /&gt;
	ld a,$ff&lt;br /&gt;
clipbigrot1 = $+1&lt;br /&gt;
	jr $&lt;br /&gt;
	srl a&lt;br /&gt;
	srl a&lt;br /&gt;
	srl a&lt;br /&gt;
	srl a&lt;br /&gt;
	srl a&lt;br /&gt;
	srl a&lt;br /&gt;
	srl a&lt;br /&gt;
	srl e&lt;br /&gt;
	srl e&lt;br /&gt;
	srl e&lt;br /&gt;
	add hl,de&lt;br /&gt;
	ld de,gbuf&lt;br /&gt;
	add hl,de&lt;br /&gt;
				; This is where gbuf offset should be saved.&lt;br /&gt;
	ld d,a&lt;br /&gt;
	cpl&lt;br /&gt;
	ld e,a&lt;br /&gt;
				;masks should be saved too&lt;br /&gt;
BigSpriteRow:&lt;br /&gt;
	push bc&lt;br /&gt;
	push hl&lt;br /&gt;
	ld b,c&lt;br /&gt;
Do_Clipleft = $+1&lt;br /&gt;
	jr Clipleft&lt;br /&gt;
	ld a,(ix)&lt;br /&gt;
	inc ix&lt;br /&gt;
BigRot1 = $+1&lt;br /&gt;
	jr $&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
BigMask0:&lt;br /&gt;
	and e&lt;br /&gt;
	or (hl)&lt;br /&gt;
	ld (hl),a&lt;br /&gt;
Clipleft:&lt;br /&gt;
Clipleftsize = Clipleft-(Do_Clipleft+1)&lt;br /&gt;
&lt;br /&gt;
Do_ClipMiddle = $+1&lt;br /&gt;
	jr $+2&lt;br /&gt;
BigSpriteloop:&lt;br /&gt;
	ld a,(ix)&lt;br /&gt;
	inc ix&lt;br /&gt;
BigRot2 = $+1&lt;br /&gt;
	jr $&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	ld c,a&lt;br /&gt;
BigMask1:&lt;br /&gt;
	and d&lt;br /&gt;
	or (hl)&lt;br /&gt;
	ld (hl),a&lt;br /&gt;
	inc hl&lt;br /&gt;
	ld a,c&lt;br /&gt;
BigMask2:&lt;br /&gt;
	and e&lt;br /&gt;
	or (hl)&lt;br /&gt;
	ld (hl),a&lt;br /&gt;
	djnz BigSpriteloop&lt;br /&gt;
ClipMiddle:&lt;br /&gt;
ClipMiddlesize = ClipMiddle-(Do_ClipMiddle+1)&lt;br /&gt;
&lt;br /&gt;
Do_ClipRight = $+1&lt;br /&gt;
	jr ClipRight&lt;br /&gt;
	ld a,(ix)&lt;br /&gt;
BigRot3 = $+1&lt;br /&gt;
	jr $&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
BigMask3:&lt;br /&gt;
	and d&lt;br /&gt;
	or (hl)&lt;br /&gt;
	ld (hl),a&lt;br /&gt;
ClipRight:&lt;br /&gt;
Cliprightsize = ClipRight-(Do_ClipRight+1)&lt;br /&gt;
	pop hl&lt;br /&gt;
&lt;br /&gt;
	ld bc,12			;width of the screen&lt;br /&gt;
	add hl,bc&lt;br /&gt;
&lt;br /&gt;
bigskip = $+1&lt;br /&gt;
	ld bc,0&lt;br /&gt;
	add ix,bc&lt;br /&gt;
	pop bc&lt;br /&gt;
	djnz BigSpriteRow&lt;br /&gt;
	ret&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Z80_Routines:Graphic:largesprite</id>
		<title>Z80 Routines:Graphic:largesprite</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Z80_Routines:Graphic:largesprite"/>
				<updated>2010-06-25T09:13:36Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: redirect&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;#REDIRECT [[Z80_Routines:Graphic:putLargeSprite]]&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Z80_Routines:Graphic:largesprite</id>
		<title>Z80 Routines:Graphic:largesprite</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Z80_Routines:Graphic:largesprite"/>
				<updated>2010-06-25T09:12:25Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: delete&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Z80_Routines:Graphic:putLargeSprite</id>
		<title>Z80 Routines:Graphic:putLargeSprite</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Z80_Routines:Graphic:putLargeSprite"/>
				<updated>2010-06-25T09:12:13Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: moved&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Z80 Routines:Graphic|LargeSprite]][[Category:Z80 Routines|LargeSprite]]&lt;br /&gt;
The '''Largesprite''' routine copies a variable-sized sprite to the graph buffer.&lt;br /&gt;
&lt;br /&gt;
== Code ==&lt;br /&gt;
Here is Joe Wingbermuehle's version, which is the one used in ION. gbuf must be defined before the routine is used.&lt;br /&gt;
&lt;br /&gt;
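Which address gbuf refers to depends on the calculator and on the include file in use. As a minimal sketch, assuming a TI-83 Plus (where the graph buffer, plotSScreen, sits at $9340) and assuming no include file already defines the name, it could be declared like this:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Assumption: TI-83 Plus, graph buffer (plotSScreen) at $9340.&lt;br /&gt;
;Leave this out if your shell's include file already defines gbuf.&lt;br /&gt;
gbuf = $9340&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;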
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;=======================&lt;br /&gt;
;LargeSprite&lt;br /&gt;
;by Joe Wingbermuehle&lt;br /&gt;
;=======================&lt;br /&gt;
;Does:   Copy a sprite to the gbuf&lt;br /&gt;
;Input:  ix=sprite address, a='x', l='y', b='height' (in pixels), c='width' (in bytes, e.g. 2 would be 16 pixels)&lt;br /&gt;
;Output: The sprite is copied to the gbuf&lt;br /&gt;
;-----------------------&lt;br /&gt;
largeSprite:&lt;br /&gt;
   di                                 ;turn interrupts off (we want to use shadow registers)&lt;br /&gt;
   ex   af,af'&lt;br /&gt;
                                      ;exchange af with af'     \&lt;br /&gt;
   ld   a,c                           ;ld c in a (a = 'width')  | for not destroying a ('x')&lt;br /&gt;
   push   af                          ;push a                   |&lt;br /&gt;
      ex   af,af'&lt;br /&gt;
                                      ;exchange back            | and 'width' is now in a' (saved)&lt;br /&gt;
      ld   e,l                        ;e = 'y'&lt;br /&gt;
      ld   h,$00                      ;h =  0&lt;br /&gt;
      ld   d,h                        ;d =  0&lt;br /&gt;
      add   hl,de                     ;'y' *2  \&lt;br /&gt;
      add   hl,de                     ;    *3  | calculate 'y' *12 because 'y' is 'in rows'&lt;br /&gt;
      add   hl,hl                     ;    *6  |   (each screen row is 12 bytes wide)&lt;br /&gt;
      add   hl,hl                     ;    *12 /&lt;br /&gt;
      ld   e,a                        ;e = 'x'&lt;br /&gt;
      and   $07                       ;and %00000111&lt;br /&gt;
      ld   c,a                        ;last 3 bits in c (amount of bits to shift all bytes)&lt;br /&gt;
      srl   e                         ;e/2   | shifting e ('x') 3 bits to the right&lt;br /&gt;
      srl   e                         ; /4   |   %11111111 becomes %00011111 for example&lt;br /&gt;
      srl   e                         ; /8   /&lt;br /&gt;
      add   hl,de                     ;hl = 'y'*12, de = 'x'/8 (rounded down) | add them&lt;br /&gt;
      ld   de, gbuf                   ;de = the address of the graph buffer&lt;br /&gt;
      add   hl,de                     ;add hl to the address of the gbuf&lt;br /&gt;
largeSpriteLoop1:&lt;br /&gt;
      push   hl                       ;save address&lt;br /&gt;
largeSpriteLoop2:&lt;br /&gt;
         ld   d,(ix)                  ;first sprite data in d&lt;br /&gt;
         ld   e,$00                   ;e = 0&lt;br /&gt;
         ld   a,c                     ;a = c (to not destroy c)&lt;br /&gt;
         or   a                       ;is a = 0? (same as cp 0)&lt;br /&gt;
         jr   z,largeSpriteSkip1      ;if there's nothing to shift (a = 0), skip the shift loop&lt;br /&gt;
largeSpriteLoop3:&lt;br /&gt;
         srl   d                      ;shift one bit to the right; the shifted-out bit goes into the carry flag&lt;br /&gt;
         rr   e                       ;put the carry flag in e (%00000000 becomes %10000000 if carry flag = 1)&lt;br /&gt;
         dec   a                      ;decrease counter (which was 'the amount of bits to shift')&lt;br /&gt;
         jr   nz,largeSpriteLoop3     ;if the counter is not 0 loop back&lt;br /&gt;
largeSpriteSkip1:&lt;br /&gt;
         ld   a,(hl)                  ;graphbyte in a&lt;br /&gt;
         xor   d                      ;xor with the first sprite byte (change to 'or d' for an OR routine)&lt;br /&gt;
         ld   (hl),a                  ;back to buffer&lt;br /&gt;
         inc   hl                     ;increase pointer&lt;br /&gt;
         ld   a,(hl)                  ;graphbyte in a&lt;br /&gt;
         xor   e                      ;xor with shifted sprite byte (change to 'or e' for OR-routine)&lt;br /&gt;
         ld   (hl),a                  ;back to buffer&lt;br /&gt;
         inc   ix                     ;increase sprite address&lt;br /&gt;
         ex   af,af'&lt;br /&gt;
                                      ;exchange af with af' ( a is now the 'width' from the first line)&lt;br /&gt;
         dec   a                      ;decrease 'width'&lt;br /&gt;
         push   af                    ;push the 'width'&lt;br /&gt;
            ex   af,af'&lt;br /&gt;
                                      ;exchange back&lt;br /&gt;
         pop   af                     ;pop the 'width'&lt;br /&gt;
         jr   nz,largeSpriteLoop2     ;if a is not 0 (if a = 0 then we would be done) loop it&lt;br /&gt;
      pop   hl                        ;pop gbuf address (matches the last push hl)&lt;br /&gt;
   pop   af                           ;pop  | to restore the real 'width'&lt;br /&gt;
   push   af                          ;push /&lt;br /&gt;
      ex   af,af'&lt;br /&gt;
                                      ;af' must be the original 'width' when loop 'largeSpriteLoop1'&lt;br /&gt;
      ld   de,$0C                     ;ld de,12&lt;br /&gt;
      add   hl,de                     ;next line&lt;br /&gt;
      djnz   largeSpriteLoop1         ;if not b = 0 loop (b = height of sprite)&lt;br /&gt;
   pop   af                           ;pop because we don't want a stack problem :)&lt;br /&gt;
   ret                                ;return&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Example ==&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
   ;...&lt;br /&gt;
   ld   l,8   ;y&lt;br /&gt;
   ld   a,16  ;x&lt;br /&gt;
   ld   b,8   ;height&lt;br /&gt;
   ld   c,2   ;width in bytes&lt;br /&gt;
   ld   ix,sprite&lt;br /&gt;
   call largeSprite&lt;br /&gt;
   call fastcopy&lt;br /&gt;
   ;...&lt;br /&gt;
sprite:&lt;br /&gt;
   .db %11111111,%11111111&lt;br /&gt;
   .db %10000000,%00000001&lt;br /&gt;
   .db %10000000,%00000001&lt;br /&gt;
   .db %10000000,%00000001&lt;br /&gt;
   .db %10000000,%00000001&lt;br /&gt;
   .db %10000000,%00000001&lt;br /&gt;
   .db %10000000,%00000001&lt;br /&gt;
   .db %11111111,%11111111&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Version without shadow registers ==&lt;br /&gt;
&lt;br /&gt;
Written by Tijl Coosemans for Venus. It is compatible with ION's routine; screenBuf must be defined before use.&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
iLargeSprite&lt;br /&gt;
	ld	h,0&lt;br /&gt;
	ld	d,h&lt;br /&gt;
	ld	e,l&lt;br /&gt;
	add	hl,de&lt;br /&gt;
	add	hl,de&lt;br /&gt;
	add	hl,hl&lt;br /&gt;
	add	hl,hl&lt;br /&gt;
	ld	e,a&lt;br /&gt;
	srl	e&lt;br /&gt;
	srl	e&lt;br /&gt;
	srl	e&lt;br /&gt;
	add	hl,de&lt;br /&gt;
	ld	de,screenBuf&lt;br /&gt;
	add	hl,de&lt;br /&gt;
	and	7&lt;br /&gt;
	ld	e,a&lt;br /&gt;
iLargeSpriteLoop1&lt;br /&gt;
	push	bc&lt;br /&gt;
	push	hl&lt;br /&gt;
	ld	b,c&lt;br /&gt;
iLargeSpriteLoop2&lt;br /&gt;
	ld	c,(ix)&lt;br /&gt;
	ld	d,0&lt;br /&gt;
	inc	ix&lt;br /&gt;
	ld	a,e&lt;br /&gt;
	or	a&lt;br /&gt;
	jr	z,iLargeSprite1&lt;br /&gt;
iLargeSpriteLoop3&lt;br /&gt;
	srl	c&lt;br /&gt;
	rr	d&lt;br /&gt;
	dec	a&lt;br /&gt;
	jr	nz,iLargeSpriteLoop3&lt;br /&gt;
iLargeSprite1&lt;br /&gt;
	ld	a,c&lt;br /&gt;
	xor	(hl)&lt;br /&gt;
	ld	(hl),a&lt;br /&gt;
	inc	hl&lt;br /&gt;
	ld	a,d&lt;br /&gt;
	xor	(hl)&lt;br /&gt;
	ld	(hl),a&lt;br /&gt;
	djnz	iLargeSpriteLoop2&lt;br /&gt;
	pop	hl&lt;br /&gt;
	ld	c,12&lt;br /&gt;
	add	hl,bc&lt;br /&gt;
	pop	bc&lt;br /&gt;
	djnz	iLargeSpriteLoop1&lt;br /&gt;
	ret&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Version with Clipping ==&lt;br /&gt;
This version supports clipping the large sprite. It is considerably larger and relies on self-modifying code (SMC), though the SMC can be removed without too much difficulty. Its inputs differ from ION's.&lt;br /&gt;
&lt;br /&gt;
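A minimal usage sketch for the routine that follows, assuming the sprite data and fastcopy from the example above; the coordinates are made up for illustration. Here x = -4, so the leftmost four pixel columns of the sprite should be clipped:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
   ;...&lt;br /&gt;
   ld   ix,sprite   ;16x8 sprite data from the example above&lt;br /&gt;
   ld   b,8         ;height in pixels&lt;br /&gt;
   ld   c,2         ;width in bytes&lt;br /&gt;
   ld   d,-4        ;x, treated as signed (illustrative value: 4 pixels off the left edge)&lt;br /&gt;
   ld   e,28        ;y (illustrative value)&lt;br /&gt;
   call ClipBigSprite&lt;br /&gt;
   call fastcopy    ;assumed to be available, as in the example above&lt;br /&gt;
   ;...&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;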
 &amp;lt;nowiki&amp;gt;;--------------------------------&lt;br /&gt;
;Clip Big Sprite&lt;br /&gt;
;by James Montelongo&lt;br /&gt;
;MAX SIZE: 64x64&lt;br /&gt;
;ix - Sprite&lt;br /&gt;
;b  - height&lt;br /&gt;
;c  - width in bytes&lt;br /&gt;
;d  - x&lt;br /&gt;
;e  - y&lt;br /&gt;
&lt;br /&gt;
ClipBigSprite:&lt;br /&gt;
; Early out: check whether it's even remotely on screen&lt;br /&gt;
	ld a,e&lt;br /&gt;
	cp 64&lt;br /&gt;
	ret p&lt;br /&gt;
	add a,b&lt;br /&gt;
	ret m&lt;br /&gt;
	ret z&lt;br /&gt;
	ld a,d&lt;br /&gt;
	cp 96&lt;br /&gt;
	ret p&lt;br /&gt;
	ld a,c&lt;br /&gt;
	add a,a&lt;br /&gt;
	add a,a&lt;br /&gt;
	add a,a&lt;br /&gt;
	add a,d&lt;br /&gt;
	ret m&lt;br /&gt;
	ret z&lt;br /&gt;
&lt;br /&gt;
	ld a,e&lt;br /&gt;
	or a&lt;br /&gt;
	jp p,Check_clip_bottom&lt;br /&gt;
	neg&lt;br /&gt;
	push de&lt;br /&gt;
	ld hl,0&lt;br /&gt;
	ld d,l&lt;br /&gt;
	ld e,a&lt;br /&gt;
	bit 2,c&lt;br /&gt;
	jr z,$+2+1&lt;br /&gt;
	add hl,de&lt;br /&gt;
	add hl,hl	&lt;br /&gt;
	bit 1,c&lt;br /&gt;
	jr z,$+2+1&lt;br /&gt;
	add hl,de&lt;br /&gt;
	add hl,hl	&lt;br /&gt;
	bit 0,c&lt;br /&gt;
	jr z,$+2+1&lt;br /&gt;
	add hl,de&lt;br /&gt;
	pop de&lt;br /&gt;
	ex de,hl&lt;br /&gt;
	add ix,de		;Here you can save the top offset&lt;br /&gt;
	ex de,hl&lt;br /&gt;
	ld e,0&lt;br /&gt;
	neg&lt;br /&gt;
	add a,b&lt;br /&gt;
	ld b,a&lt;br /&gt;
Check_clip_bottom:&lt;br /&gt;
&lt;br /&gt;
	ld a,e&lt;br /&gt;
	add a,b&lt;br /&gt;
	sub 64&lt;br /&gt;
	jp m,Check_clip_Left&lt;br /&gt;
	neg&lt;br /&gt;
	add a,b&lt;br /&gt;
	ld b,a&lt;br /&gt;
Check_clip_Left:&lt;br /&gt;
				; at this point you may want to save b&lt;br /&gt;
	xor a&lt;br /&gt;
	ld (bigskip),a&lt;br /&gt;
	ld a,Clipleftsize &lt;br /&gt;
	ld (Do_Clipleft),a&lt;br /&gt;
	ld a,d&lt;br /&gt;
	or a&lt;br /&gt;
	jp p,Check_clip_right&lt;br /&gt;
	cpl&lt;br /&gt;
	and $F8&lt;br /&gt;
	rra&lt;br /&gt;
	rra&lt;br /&gt;
	rra&lt;br /&gt;
	ex de,hl		;save the clipped left offset&lt;br /&gt;
	ld e,a&lt;br /&gt;
	ld d,0&lt;br /&gt;
	add ix,de&lt;br /&gt;
	ld (bigskip),a&lt;br /&gt;
	ex de,hl&lt;br /&gt;
	inc a&lt;br /&gt;
	neg&lt;br /&gt;
	add a,c&lt;br /&gt;
	ld c,a&lt;br /&gt;
	xor a&lt;br /&gt;
	ld (Do_Clipleft),a&lt;br /&gt;
	ld a,d&lt;br /&gt;
	and $07&lt;br /&gt;
	ld d,a&lt;br /&gt;
Check_clip_right:&lt;br /&gt;
&lt;br /&gt;
	ld a,Cliprightsize &lt;br /&gt;
	ld (Do_ClipRight),a&lt;br /&gt;
	ld a,c&lt;br /&gt;
	add a,a&lt;br /&gt;
	add a,a&lt;br /&gt;
	add a,a&lt;br /&gt;
	add a,d&lt;br /&gt;
	sub 96&lt;br /&gt;
	jp m,Check_clip_middle&lt;br /&gt;
	and $F8&lt;br /&gt;
	rra&lt;br /&gt;
	rra&lt;br /&gt;
	rra&lt;br /&gt;
	ld l,a&lt;br /&gt;
	ld a,(bigskip)&lt;br /&gt;
	add a,l&lt;br /&gt;
	inc a&lt;br /&gt;
	ld (bigskip),a&lt;br /&gt;
	neg&lt;br /&gt;
	add a,c&lt;br /&gt;
	ld c,a&lt;br /&gt;
	xor a&lt;br /&gt;
	ld (Do_ClipRight),a&lt;br /&gt;
Check_clip_middle:&lt;br /&gt;
				; This is where C should be saved.&lt;br /&gt;
	xor a&lt;br /&gt;
	ld (Do_ClipMiddle),a&lt;br /&gt;
	ld a,c&lt;br /&gt;
	or a&lt;br /&gt;
	jp nz,dontskipmiddle&lt;br /&gt;
	ld a,ClipMiddlesize &lt;br /&gt;
	ld (Do_ClipMiddle),a&lt;br /&gt;
dontskipmiddle:&lt;br /&gt;
	ld l,e&lt;br /&gt;
	ld a,d	&lt;br /&gt;
	ld h,0&lt;br /&gt;
	ld d,h&lt;br /&gt;
	add hl,hl&lt;br /&gt;
	add hl,de&lt;br /&gt;
	add hl,hl&lt;br /&gt;
	add hl,hl&lt;br /&gt;
	ld e,a&lt;br /&gt;
	and $07&lt;br /&gt;
	xor 7&lt;br /&gt;
	ld (BigRot1),a&lt;br /&gt;
	ld (BigRot2),a&lt;br /&gt;
	ld (BigRot3),a&lt;br /&gt;
	add a,a&lt;br /&gt;
	ld (clipbigrot1),a&lt;br /&gt;
	ld a,$ff&lt;br /&gt;
clipbigrot1 = $+1&lt;br /&gt;
	jr $&lt;br /&gt;
	srl a&lt;br /&gt;
	srl a&lt;br /&gt;
	srl a&lt;br /&gt;
	srl a&lt;br /&gt;
	srl a&lt;br /&gt;
	srl a&lt;br /&gt;
	srl a&lt;br /&gt;
	srl e&lt;br /&gt;
	srl e&lt;br /&gt;
	srl e&lt;br /&gt;
	add hl,de&lt;br /&gt;
	ld de,gbuf&lt;br /&gt;
	add hl,de&lt;br /&gt;
				; This is where gbuf offset should be saved.&lt;br /&gt;
	ld d,a&lt;br /&gt;
	cpl&lt;br /&gt;
	ld e,a&lt;br /&gt;
				;masks should be saved too&lt;br /&gt;
BigSpriteRow:&lt;br /&gt;
	push bc&lt;br /&gt;
	push hl&lt;br /&gt;
	ld b,c&lt;br /&gt;
Do_Clipleft = $+1&lt;br /&gt;
	jr Clipleft&lt;br /&gt;
	ld a,(ix)&lt;br /&gt;
	inc ix&lt;br /&gt;
BigRot1 = $+1&lt;br /&gt;
	jr $&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
BigMask0:&lt;br /&gt;
	and e&lt;br /&gt;
	or (hl)&lt;br /&gt;
	ld (hl),a&lt;br /&gt;
Clipleft:&lt;br /&gt;
Clipleftsize = Clipleft-(Do_Clipleft+1)&lt;br /&gt;
&lt;br /&gt;
Do_ClipMiddle = $+1&lt;br /&gt;
	jr $+2&lt;br /&gt;
BigSpriteloop:&lt;br /&gt;
	ld a,(ix)&lt;br /&gt;
	inc ix&lt;br /&gt;
BigRot2 = $+1&lt;br /&gt;
	jr $&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	ld c,a&lt;br /&gt;
BigMask1:&lt;br /&gt;
	and d&lt;br /&gt;
	or (hl)&lt;br /&gt;
	ld (hl),a&lt;br /&gt;
	inc hl&lt;br /&gt;
	ld a,c&lt;br /&gt;
BigMask2:&lt;br /&gt;
	and e&lt;br /&gt;
	or (hl)&lt;br /&gt;
	ld (hl),a&lt;br /&gt;
	djnz BigSpriteloop&lt;br /&gt;
ClipMiddle:&lt;br /&gt;
ClipMiddlesize = ClipMiddle-(Do_ClipMiddle+1)&lt;br /&gt;
&lt;br /&gt;
Do_ClipRight = $+1&lt;br /&gt;
	jr ClipRight&lt;br /&gt;
	ld a,(ix)&lt;br /&gt;
BigRot3 = $+1&lt;br /&gt;
	jr $&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
BigMask3:&lt;br /&gt;
	and d&lt;br /&gt;
	or (hl)&lt;br /&gt;
	ld (hl),a&lt;br /&gt;
ClipRight:&lt;br /&gt;
Cliprightsize = ClipRight-(Do_ClipRight+1)&lt;br /&gt;
	pop hl&lt;br /&gt;
&lt;br /&gt;
	ld bc,12			;width of the screen&lt;br /&gt;
	add hl,bc&lt;br /&gt;
&lt;br /&gt;
bigskip = $+1&lt;br /&gt;
	ld bc,0&lt;br /&gt;
	add ix,bc&lt;br /&gt;
	pop bc&lt;br /&gt;
	djnz BigSpriteRow&lt;br /&gt;
	ret&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Z80_Routines:Graphic:putLargeSprite2</id>
		<title>Z80 Routines:Graphic:putLargeSprite2</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Z80_Routines:Graphic:putLargeSprite2"/>
				<updated>2010-06-25T09:11:03Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: moved Z80 Routines:Graphic:putLargeSprite to Z80 Routines:Graphic:putLargeSprite2:&amp;amp;#32;mistake&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
&lt;br /&gt;
The '''largeSprite''' routine plots a large sprite, given its height in pixels and its width in bytes.&lt;br /&gt;
&lt;br /&gt;
== Code ==&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;-----&amp;gt; Draw a picture&lt;br /&gt;
;Input:	ix-&amp;gt;sprite&lt;br /&gt;
;	a=x&lt;br /&gt;
;	l=y&lt;br /&gt;
;	b=height	(in pixels)&lt;br /&gt;
;	c=width		(in bytes, e.g. 2 would be 16 pixels)&lt;br /&gt;
;Output: nothing&lt;br /&gt;
; All registers are destroyed except bc', de', hl'&lt;br /&gt;
; Interrupts are disabled on entry (di) and are not re-enabled on return&lt;br /&gt;
largeSprite:&lt;br /&gt;
	di&lt;br /&gt;
	ex	af,af'&lt;br /&gt;
	ld	a,c&lt;br /&gt;
	push	af&lt;br /&gt;
	ex	af,af'&lt;br /&gt;
	ld	e,l&lt;br /&gt;
	ld	h,$00&lt;br /&gt;
	ld	d,h&lt;br /&gt;
	add	hl,de&lt;br /&gt;
	add	hl,de&lt;br /&gt;
	add	hl,hl&lt;br /&gt;
	add	hl,hl&lt;br /&gt;
	ld	e,a&lt;br /&gt;
	and	$07&lt;br /&gt;
	ld	c,a&lt;br /&gt;
	srl	e&lt;br /&gt;
	srl	e&lt;br /&gt;
	srl	e&lt;br /&gt;
	add	hl,de&lt;br /&gt;
	ld	de,gbuf&lt;br /&gt;
	add	hl,de&lt;br /&gt;
largeSpriteLoop1:&lt;br /&gt;
	push	hl&lt;br /&gt;
largeSpriteLoop2:&lt;br /&gt;
	ld	d,(ix)&lt;br /&gt;
	ld	e,$00&lt;br /&gt;
	ld	a,c&lt;br /&gt;
	or	a&lt;br /&gt;
	jr	z,largeSpriteSkip1&lt;br /&gt;
largeSpriteLoop3:&lt;br /&gt;
	srl	d&lt;br /&gt;
	rr	e&lt;br /&gt;
	dec	a&lt;br /&gt;
	jr	nz,largeSpriteLoop3&lt;br /&gt;
largeSpriteSkip1:&lt;br /&gt;
	ld	a,(hl)&lt;br /&gt;
	xor	d&lt;br /&gt;
	ld	(hl),a&lt;br /&gt;
	inc	hl&lt;br /&gt;
	ld	a,(hl)&lt;br /&gt;
	xor	e&lt;br /&gt;
	ld	(hl),a&lt;br /&gt;
	inc	ix&lt;br /&gt;
	ex	af,af'&lt;br /&gt;
	dec	a&lt;br /&gt;
	push	af&lt;br /&gt;
	ex	af,af'&lt;br /&gt;
	pop	af&lt;br /&gt;
	jr	nz,largeSpriteLoop2&lt;br /&gt;
	pop	hl&lt;br /&gt;
	pop	af&lt;br /&gt;
	push	af&lt;br /&gt;
	ex	af,af'&lt;br /&gt;
	ld	de,$0C&lt;br /&gt;
	add	hl,de&lt;br /&gt;
	djnz	largeSpriteLoop1&lt;br /&gt;
	pop	af&lt;br /&gt;
	ret&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Example ==&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
   ;...&lt;br /&gt;
   ld   l,8   ;y (the routine takes y in l)&lt;br /&gt;
   ld   a,16  ;x&lt;br /&gt;
   ld   b,10  ;height&lt;br /&gt;
   ld   c,3   ;width&lt;br /&gt;
   ld   ix,sprite&lt;br /&gt;
   call largeSprite&lt;br /&gt;
   call fastcopy&lt;br /&gt;
   ;...&lt;br /&gt;
&lt;br /&gt;
sprite:&lt;br /&gt;
   .db %11111111,%11111111,%11111111&lt;br /&gt;
   .db %11111111,%11111111,%11111111&lt;br /&gt;
   .db %10000001,%10000001,%11111111&lt;br /&gt;
   .db %10000001,%10000001,%11111111&lt;br /&gt;
   .db %10000001,%10000001,%11111111&lt;br /&gt;
   .db %10000001,%10000001,%11111111&lt;br /&gt;
   .db %10000001,%10000001,%11111111&lt;br /&gt;
   .db %10000001,%10000001,%11111111&lt;br /&gt;
   .db %11111111,%11111111,%11111111&lt;br /&gt;
   .db %11111111,%11111111,%11111111&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Z80_Routines:Graphic:putLargeSprite</id>
		<title>Z80 Routines:Graphic:putLargeSprite</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Z80_Routines:Graphic:putLargeSprite"/>
				<updated>2010-06-25T09:11:03Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: moved Z80 Routines:Graphic:putLargeSprite to Z80 Routines:Graphic:putLargeSprite2:&amp;amp;#32;mistake&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;#REDIRECT [[Z80 Routines:Graphic:putLargeSprite2]]&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Z80_Routines:Graphic:putLargeSprite2</id>
		<title>Z80 Routines:Graphic:putLargeSprite2</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Z80_Routines:Graphic:putLargeSprite2"/>
				<updated>2010-06-25T09:10:48Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: mistake&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
&lt;br /&gt;
The '''largeSprite''' routine plots a large sprite, given its height in pixels and its width in bytes.&lt;br /&gt;
&lt;br /&gt;
== Code ==&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;-----&amp;gt; Draw a picture&lt;br /&gt;
;Input:	ix-&amp;gt;sprite&lt;br /&gt;
;	a=x&lt;br /&gt;
;	l=y&lt;br /&gt;
;	b=height	(in pixels)&lt;br /&gt;
;	c=width		(in bytes, e.g. 2 would be 16 pixels)&lt;br /&gt;
;Output: nothing&lt;br /&gt;
; All registers are destroyed except bc', de', hl'&lt;br /&gt;
; Interrupts are disabled on entry (di) and are not re-enabled on return&lt;br /&gt;
largeSprite:&lt;br /&gt;
	di&lt;br /&gt;
	ex	af,af'&lt;br /&gt;
	ld	a,c&lt;br /&gt;
	push	af&lt;br /&gt;
	ex	af,af'&lt;br /&gt;
	ld	e,l&lt;br /&gt;
	ld	h,$00&lt;br /&gt;
	ld	d,h&lt;br /&gt;
	add	hl,de&lt;br /&gt;
	add	hl,de&lt;br /&gt;
	add	hl,hl&lt;br /&gt;
	add	hl,hl&lt;br /&gt;
	ld	e,a&lt;br /&gt;
	and	$07&lt;br /&gt;
	ld	c,a&lt;br /&gt;
	srl	e&lt;br /&gt;
	srl	e&lt;br /&gt;
	srl	e&lt;br /&gt;
	add	hl,de&lt;br /&gt;
	ld	de,gbuf&lt;br /&gt;
	add	hl,de&lt;br /&gt;
largeSpriteLoop1:&lt;br /&gt;
	push	hl&lt;br /&gt;
largeSpriteLoop2:&lt;br /&gt;
	ld	d,(ix)&lt;br /&gt;
	ld	e,$00&lt;br /&gt;
	ld	a,c&lt;br /&gt;
	or	a&lt;br /&gt;
	jr	z,largeSpriteSkip1&lt;br /&gt;
largeSpriteLoop3:&lt;br /&gt;
	srl	d&lt;br /&gt;
	rr	e&lt;br /&gt;
	dec	a&lt;br /&gt;
	jr	nz,largeSpriteLoop3&lt;br /&gt;
largeSpriteSkip1:&lt;br /&gt;
	ld	a,(hl)&lt;br /&gt;
	xor	d&lt;br /&gt;
	ld	(hl),a&lt;br /&gt;
	inc	hl&lt;br /&gt;
	ld	a,(hl)&lt;br /&gt;
	xor	e&lt;br /&gt;
	ld	(hl),a&lt;br /&gt;
	inc	ix&lt;br /&gt;
	ex	af,af'&lt;br /&gt;
	dec	a&lt;br /&gt;
	push	af&lt;br /&gt;
	ex	af,af'&lt;br /&gt;
	pop	af&lt;br /&gt;
	jr	nz,largeSpriteLoop2&lt;br /&gt;
	pop	hl&lt;br /&gt;
	pop	af&lt;br /&gt;
	push	af&lt;br /&gt;
	ex	af,af'&lt;br /&gt;
	ld	de,$0C&lt;br /&gt;
	add	hl,de&lt;br /&gt;
	djnz	largeSpriteLoop1&lt;br /&gt;
	pop	af&lt;br /&gt;
	ret&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Example ==&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
   ;...&lt;br /&gt;
   ld   l,8   ;y (the routine takes y in l)&lt;br /&gt;
   ld   a,16  ;x&lt;br /&gt;
   ld   b,10  ;height&lt;br /&gt;
   ld   c,3   ;width&lt;br /&gt;
   ld   ix,sprite&lt;br /&gt;
   call largeSprite&lt;br /&gt;
   call fastcopy&lt;br /&gt;
   ;...&lt;br /&gt;
&lt;br /&gt;
sprite:&lt;br /&gt;
   .db %11111111,%11111111,%11111111&lt;br /&gt;
   .db %11111111,%11111111,%11111111&lt;br /&gt;
   .db %10000001,%10000001,%11111111&lt;br /&gt;
   .db %10000001,%10000001,%11111111&lt;br /&gt;
   .db %10000001,%10000001,%11111111&lt;br /&gt;
   .db %10000001,%10000001,%11111111&lt;br /&gt;
   .db %10000001,%10000001,%11111111&lt;br /&gt;
   .db %10000001,%10000001,%11111111&lt;br /&gt;
   .db %11111111,%11111111,%11111111&lt;br /&gt;
   .db %11111111,%11111111,%11111111&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Z80_Routines:Graphic:putLargeSprite2</id>
		<title>Z80 Routines:Graphic:putLargeSprite2</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Z80_Routines:Graphic:putLargeSprite2"/>
				<updated>2010-06-25T09:06:39Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: created&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Z80 Routines:Graphic|PutLargeSprite]]&lt;br /&gt;
[[Category:Z80 Routines|PutLargeSprite]]&lt;br /&gt;
&lt;br /&gt;
The '''largeSprite''' routine plots a large sprite, given its height in pixels and its width in bytes.&lt;br /&gt;
&lt;br /&gt;
== Code ==&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;-----&amp;gt; Draw a picture&lt;br /&gt;
;Input:	ix-&amp;gt;sprite&lt;br /&gt;
;	a=x&lt;br /&gt;
;	l=y&lt;br /&gt;
;	b=height	(in pixels)&lt;br /&gt;
;	c=width		(in bytes, e.g. 2 would be 16 pixels)&lt;br /&gt;
;Output: nothing&lt;br /&gt;
; All registers are destroyed except bc', de', hl'&lt;br /&gt;
; Interrupts are disabled on entry (di) and are not re-enabled on return&lt;br /&gt;
largeSprite:&lt;br /&gt;
	di&lt;br /&gt;
	ex	af,af'&lt;br /&gt;
	ld	a,c&lt;br /&gt;
	push	af&lt;br /&gt;
	ex	af,af'&lt;br /&gt;
	ld	e,l&lt;br /&gt;
	ld	h,$00&lt;br /&gt;
	ld	d,h&lt;br /&gt;
	add	hl,de&lt;br /&gt;
	add	hl,de&lt;br /&gt;
	add	hl,hl&lt;br /&gt;
	add	hl,hl&lt;br /&gt;
	ld	e,a&lt;br /&gt;
	and	$07&lt;br /&gt;
	ld	c,a&lt;br /&gt;
	srl	e&lt;br /&gt;
	srl	e&lt;br /&gt;
	srl	e&lt;br /&gt;
	add	hl,de&lt;br /&gt;
	ld	de,gbuf&lt;br /&gt;
	add	hl,de&lt;br /&gt;
largeSpriteLoop1:&lt;br /&gt;
	push	hl&lt;br /&gt;
largeSpriteLoop2:&lt;br /&gt;
	ld	d,(ix)&lt;br /&gt;
	ld	e,$00&lt;br /&gt;
	ld	a,c&lt;br /&gt;
	or	a&lt;br /&gt;
	jr	z,largeSpriteSkip1&lt;br /&gt;
largeSpriteLoop3:&lt;br /&gt;
	srl	d&lt;br /&gt;
	rr	e&lt;br /&gt;
	dec	a&lt;br /&gt;
	jr	nz,largeSpriteLoop3&lt;br /&gt;
largeSpriteSkip1:&lt;br /&gt;
	ld	a,(hl)&lt;br /&gt;
	xor	d&lt;br /&gt;
	ld	(hl),a&lt;br /&gt;
	inc	hl&lt;br /&gt;
	ld	a,(hl)&lt;br /&gt;
	xor	e&lt;br /&gt;
	ld	(hl),a&lt;br /&gt;
	inc	ix&lt;br /&gt;
	ex	af,af'&lt;br /&gt;
	dec	a&lt;br /&gt;
	push	af&lt;br /&gt;
	ex	af,af'&lt;br /&gt;
	pop	af&lt;br /&gt;
	jr	nz,largeSpriteLoop2&lt;br /&gt;
	pop	hl&lt;br /&gt;
	pop	af&lt;br /&gt;
	push	af&lt;br /&gt;
	ex	af,af'&lt;br /&gt;
	ld	de,$0C&lt;br /&gt;
	add	hl,de&lt;br /&gt;
	djnz	largeSpriteLoop1&lt;br /&gt;
	pop	af&lt;br /&gt;
	ret&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Example ==&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
   ;...&lt;br /&gt;
   ld   l,8   ;y (the routine takes y in l)&lt;br /&gt;
   ld   a,16  ;x&lt;br /&gt;
   ld   b,10  ;height&lt;br /&gt;
   ld   c,3   ;width&lt;br /&gt;
   ld   ix,sprite&lt;br /&gt;
   call largeSprite&lt;br /&gt;
   call fastcopy&lt;br /&gt;
   ;...&lt;br /&gt;
&lt;br /&gt;
sprite:&lt;br /&gt;
   .db %11111111,%11111111,%11111111&lt;br /&gt;
   .db %11111111,%11111111,%11111111&lt;br /&gt;
   .db %10000001,%10000001,%11111111&lt;br /&gt;
   .db %10000001,%10000001,%11111111&lt;br /&gt;
   .db %10000001,%10000001,%11111111&lt;br /&gt;
   .db %10000001,%10000001,%11111111&lt;br /&gt;
   .db %10000001,%10000001,%11111111&lt;br /&gt;
   .db %10000001,%10000001,%11111111&lt;br /&gt;
   .db %11111111,%11111111,%11111111&lt;br /&gt;
   .db %11111111,%11111111,%11111111&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Z80_Optimization</id>
		<title>Z80 Optimization</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Z80_Optimization"/>
				<updated>2010-06-18T13:36:53Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: /* Others */ small typo&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
Sometimes an ASM program needs some extra speed, or a game has to be made smaller to fit on the calculator. Typical cases are programs that carry a lot of graphics or data, and graphics code for tile mapping, grayscale and 3D.&lt;br /&gt;
&lt;br /&gt;
If you only want to cut a few bytes, go straight to the Small Tricks section of this page.&lt;br /&gt;
&lt;br /&gt;
== Registers and Memory ==&lt;br /&gt;
Good z80 algorithms generally make appropriate use of the registers.&lt;br /&gt;
It is also good practice to keep to a convention and to plan how you are going to use the registers.&lt;br /&gt;
&lt;br /&gt;
General use of registers:&lt;br /&gt;
* a - 8-bit accumulator&lt;br /&gt;
* b - counter&lt;br /&gt;
* c, d, e, h, l - auxiliary to the accumulator, or a copy of b or a&lt;br /&gt;
&lt;br /&gt;
* hl - 16-bit accumulator/pointer to a memory address&lt;br /&gt;
* de - pointer to a destination memory address&lt;br /&gt;
* bc - 16-bit counter&lt;br /&gt;
* ix - index register/pointer to a table in memory/saved copy of hl/pointer to memory when hl and de are in use&lt;br /&gt;
* iy - index register/pointer to a table in memory (use it only when there is no other option or you need optimal execution; TI-OS uses iy, so disable interrupts while you use it and restore its original value on exit)&lt;br /&gt;
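&lt;br /&gt;
As a small illustration of these conventions (a sketch, not taken from any particular program; the label sumLoop is made up for the example), the loop below sums b bytes starting at hl, with a as the accumulator, b as the counter and hl as the pointer:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; Input:  hl = address of the first byte, b = number of bytes (0 means 256)&lt;br /&gt;
; Output: a = 8-bit sum (overflow is ignored), hl = address after the last byte, b = 0&lt;br /&gt;
	xor a		; a = 0, the accumulator&lt;br /&gt;
sumLoop:&lt;br /&gt;
	add a,(hl)	; hl is the pointer&lt;br /&gt;
	inc hl&lt;br /&gt;
	djnz sumLoop	; b is the counter&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;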
&lt;br /&gt;
=== 8-bit vs. 16-bit Operations ===&lt;br /&gt;
&lt;br /&gt;
The z80 processor operates faster on 8-bit values.&lt;br /&gt;
Code dealing with 16-bit registers tends to be bigger and slower, because the equivalent 16-bit instruction is either slower or does not exist and has to be replaced with several instructions. Sometimes the equivalent 16-bit instruction is also 1 byte bigger.&lt;br /&gt;
Operations on the ix or iy registers are slower still, and each instruction is at least 1 byte bigger. So try to convert your code to use hl and de instead of ix and iy.&lt;br /&gt;
&lt;br /&gt;
As a practical example, imagine that:&lt;br /&gt;
* you pass a value to a routine in the accumulator&lt;br /&gt;
* the only valid values range from 0 to 63, and the routine needs to multiply the accumulator by, say, 12, so the result (up to 756) has to be stored in a 16-bit register pair&lt;br /&gt;
* you can still multiply a by 4 before it overflows (63*4 = 252, which is smaller than 255), and take advantage of this to optimize&lt;br /&gt;
&lt;br /&gt;
In code:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; The usual way is to move A (the accumulator) into HL right at the start&lt;br /&gt;
	ld h,0&lt;br /&gt;
	ld l,a&lt;br /&gt;
	add a,a&lt;br /&gt;
	ld d,h&lt;br /&gt;
	ld e,a&lt;br /&gt;
	add hl,de&lt;br /&gt;
	add hl,hl&lt;br /&gt;
	add hl,hl	; hl=a*12&lt;br /&gt;
; 9 bytes, 56 clocks&lt;br /&gt;
&lt;br /&gt;
; But given a is between 0 and 63 you can multiply by 4 without overflowing the 8-bit limit (255)&lt;br /&gt;
	add a,a&lt;br /&gt;
	add a,a		; a*4&lt;br /&gt;
	ld l,a&lt;br /&gt;
	ld e,a&lt;br /&gt;
	ld h,0&lt;br /&gt;
	ld d,h		; hl=a*4 and de=a*4&lt;br /&gt;
	add hl,hl	; hl=a*8&lt;br /&gt;
	add hl,de	; hl=a*12&lt;br /&gt;
; 9 bytes, 49 clocks&lt;br /&gt;
&lt;br /&gt;
; hey, minus 7 clock cycles&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In this example you only shave off a few clock cycles, but sometimes you can save some bytes, too.&lt;br /&gt;
You can do the same with registers other than the accumulator.&lt;br /&gt;
&lt;br /&gt;
For example, if a value is passed in l and is always lower than 64, you can do &amp;quot;sla l \ sla l \ ld h,0&amp;quot; to multiply l by four and then use hl for 16-bit operations. In this case you are trading size for speed: each sla instruction is 2 bytes, while add hl,hl is only 1 byte.&lt;br /&gt;
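&lt;br /&gt;
The trade-off can be counted instruction by instruction. The comparison below is a sketch that assumes l holds a value from 0 to 63 and that hl = l*4 is wanted:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; Variant A: extend to 16 bits first, then shift&lt;br /&gt;
	ld h,0		; 2 bytes,  7 T-states&lt;br /&gt;
	add hl,hl	; 1 byte,  11 T-states&lt;br /&gt;
	add hl,hl	; 1 byte,  11 T-states&lt;br /&gt;
; 4 bytes, 29 T-states&lt;br /&gt;
&lt;br /&gt;
; Variant B: shift in 8 bits while the value still fits, then extend&lt;br /&gt;
	sla l		; 2 bytes,  8 T-states&lt;br /&gt;
	sla l		; 2 bytes,  8 T-states&lt;br /&gt;
	ld h,0		; 2 bytes,  7 T-states&lt;br /&gt;
; 6 bytes, 23 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;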
&lt;br /&gt;
Mind that these optimizations can introduce bugs and make the code harder to follow, so comment them.&lt;br /&gt;
I recommend applying this kind of optimization only when you really need the speed and the code is already bug free.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
One common trick for multiplying by 256 is to simply load the low byte register into the high byte register. This works because, in binary, multiplying by 256 is the same as shifting left by 8 bits and filling with zeros. Examples:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; multiply a by 256 and store in hl&lt;br /&gt;
	ld h,a&lt;br /&gt;
	ld l,0&lt;br /&gt;
; multiply hl by 256 and store in ade (pseudo 24-bit pair register)&lt;br /&gt;
	ld a,h&lt;br /&gt;
	ld d,l&lt;br /&gt;
	ld e,0&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
If you are out of registers, try using ixh/ixl/iyh/iyl, or even the i register, for loop counters instead of keeping a counter in memory or pushing/popping an already used register inside a loop. Using ixh/ixl/iyh/iyl will break compatibility with the TI-84+SE emulated by the Nspire. You can only use the i register for other purposes if you disable interrupts first (di).&lt;br /&gt;
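&lt;br /&gt;
A sketch of the i register used as an outer loop counter (the count of 10 and the label outerLoop are only illustrative). ld i,a leaves the flags alone, so the Z flag set by dec a is still valid at the jr. Interrupts stay disabled afterwards, so re-enable them with ei if the rest of the program needs them:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
	di		; required before i is reused&lt;br /&gt;
	ld a,10&lt;br /&gt;
	ld i,a		; i = loop counter&lt;br /&gt;
outerLoop:&lt;br /&gt;
	; ... loop body, free to use a, bc, de and hl ...&lt;br /&gt;
	ld a,i&lt;br /&gt;
	dec a		; sets Z when the counter reaches 0&lt;br /&gt;
	ld i,a		; does not change the flags&lt;br /&gt;
	jr nz,outerLoop&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;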
&lt;br /&gt;
=== Shadow registers ===&lt;br /&gt;
&lt;br /&gt;
In some rare cases, when you run out of registers and can neither refactor your algorithm(s) nor rely on RAM storage, you may want to use the shadow registers: af', bc', de' and hl'.&lt;br /&gt;
&lt;br /&gt;
These registers behave like their &amp;quot;standard&amp;quot; counterparts (af, bc, de, hl), and you can swap the two register sets using the following instructions:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ex af, af'  ; swaps af and af' as the mnemonic indicates&lt;br /&gt;
&lt;br /&gt;
 exx         ; swaps bc, de, hl and bc', de', hl'&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Shadow registers are fairly commonly used for arithmetic on big integers (16-bit to 32-bit) or for BCD operations, without relying on RAM storage or on pushing and popping to the stack. Example:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
MUL32:&lt;br /&gt;
        DI&lt;br /&gt;
        AND     A               ; RESET CARRY FLAG&lt;br /&gt;
        SBC     HL,HL           ; LOWER RESULT = 0&lt;br /&gt;
        EXX&lt;br /&gt;
        SBC     HL,HL           ; HIGHER RESULT = 0&lt;br /&gt;
        LD      A,B             ; MPR IS AC'BC&lt;br /&gt;
        LD      B,32            ; INITIALIZE LOOP COUNTER&lt;br /&gt;
MUL32LOOP:&lt;br /&gt;
        SRA     A               ; RIGHT SHIFT MPR&lt;br /&gt;
        RR      C&lt;br /&gt;
        EXX&lt;br /&gt;
        RR      B&lt;br /&gt;
        RR      C               ; LOWEST BIT INTO CARRY&lt;br /&gt;
        JR      NC,MUL32NOADD&lt;br /&gt;
        ADD     HL,DE           ; RESULT += MPD&lt;br /&gt;
        EXX&lt;br /&gt;
        ADC     HL,DE&lt;br /&gt;
        EXX&lt;br /&gt;
MUL32NOADD:&lt;br /&gt;
        SLA     E               ; LEFT SHIFT MPD&lt;br /&gt;
        RL      D&lt;br /&gt;
        EXX&lt;br /&gt;
        RL      E&lt;br /&gt;
        RL      D&lt;br /&gt;
        DJNZ    MUL32LOOP&lt;br /&gt;
        EXX&lt;br /&gt;
       &lt;br /&gt;
; RESULT IN H'L'HL&lt;br /&gt;
        RET&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Shadow registers can be of great help, but they come with two drawbacks:&lt;br /&gt;
&lt;br /&gt;
* they cannot coexist with the &amp;quot;standard&amp;quot; registers: you cannot use ld to assign from a standard register to a shadow register or vice versa. Instead you must use nasty constructs such as:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; loads hl' with the contents of hl&lt;br /&gt;
 push hl&lt;br /&gt;
 exx&lt;br /&gt;
 pop hl&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* they require interrupts to be disabled, since they were originally intended for use in an Interrupt Service Routine. There are situations where that is affordable and others where it isn't. Regardless, it is generally good policy to restore the previous interrupt status (enabled/disabled) upon return instead of leaving it up to the caller. Fortunately, this is relatively easy to do (though it does add 4 bytes and 29/33 T-states to the routine):&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  ld a, i  ; this is the core of the trick, it sets P/V to the value of IFF so P/V is set iff interrupts were enabled at that point&lt;br /&gt;
  push af  ; save flags&lt;br /&gt;
  di       ; disable interrupts&lt;br /&gt;
  &lt;br /&gt;
  ; do something with shadow registers here&lt;br /&gt;
&lt;br /&gt;
  pop af   ; get back flags&lt;br /&gt;
  ret po   ; po = P/V reset so in this case it means interrupts were disabled before the routine was called&lt;br /&gt;
  ei       ; re-enable interrupts&lt;br /&gt;
  ret&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
: Note that this produces ugly code that is very hard to follow, so comment it very well to make later understanding and debugging easier.&lt;br /&gt;
&lt;br /&gt;
=== SP register ===&lt;br /&gt;
&lt;br /&gt;
This register is used in desperate situations, generally in an interrupt loop that demands as much speed as possible when all the normal registers are already in use (notably in James Montelongo's 4-level grayscale interlace routine in graylib2.inc).&lt;br /&gt;
You need to know these valid but not widely known instructions:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld sp,6&lt;br /&gt;
 add hl,sp&lt;br /&gt;
 sbc hl,sp&lt;br /&gt;
 inc sp&lt;br /&gt;
 dec sp&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
An example of such a situation:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld (saveSP),sp&lt;br /&gt;
;init hl,de,bc,a&lt;br /&gt;
 ld sp,6&lt;br /&gt;
loop:&lt;br /&gt;
;code&lt;br /&gt;
 add hl,sp  ;get next row of a table for example&lt;br /&gt;
;code using bc,de,ix,a&lt;br /&gt;
 ld a,b&lt;br /&gt;
 or c&lt;br /&gt;
 jp nz,loop&lt;br /&gt;
;code&lt;br /&gt;
 ld sp,(saveSP)&lt;br /&gt;
 ret    ;finish interrupt&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt; &lt;br /&gt;
&lt;br /&gt;
When you use sp in this way, you cannot push/pop registers and no calls are allowed. Interrupts must also stay disabled, because an interrupt would push its return address at whatever address sp holds.&lt;br /&gt;
Mind again that this should only be used as a last resort. Don't forget to save and restore sp as the example shows.&lt;br /&gt;
&lt;br /&gt;
=== Stack ===&lt;br /&gt;
&lt;br /&gt;
When you run out of registers, the stack may offer an interesting alternative to fixed RAM locations for temporary storage.&lt;br /&gt;
&lt;br /&gt;
==== Allocation ====&lt;br /&gt;
&lt;br /&gt;
You can allocate stack space with repeated push instructions, which lets you initialize the data but restricts the allocated space to multiples of 2 bytes.&lt;br /&gt;
An alternative is to allocate uninitialized stack space (hl may be replaced with an index register):&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; allocates 7 bytes of stack space : 5 bytes, 27 T-states instead of 4 bytes, 44 T-states with 4 push which would have forced the alloc of 8 bytes&lt;br /&gt;
 ld hl, -7&lt;br /&gt;
 add hl, sp&lt;br /&gt;
 ld sp, hl&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Access ====&lt;br /&gt;
&lt;br /&gt;
The most common way of accessing data allocated on the stack is through an index register, since all allocated &amp;quot;variables&amp;quot; can then be accessed without inc/dec, but this is not a strict requirement. Beware though: using stack space is not always optimal in terms of speed, depending (among other things) on your register allocation strategy:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; 3 bytes, 19 T-states&lt;br /&gt;
 ld c, (ix + n)   ; n is an immediate value in -128..127&lt;br /&gt;
 &lt;br /&gt;
 ; 4 bytes, 17 T-states, destroys a&lt;br /&gt;
 ld a, (somelocation)&lt;br /&gt;
 ld c, a&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If your needs go beyond simple load/store, however, this method starts to show its real power, since it vastly simplifies some operations that are complicated to do with a fixed storage location (and that generally trash registers in the process).&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; 3 bytes, 19 T-states&lt;br /&gt;
 cp (ix + n)&lt;br /&gt;
&lt;br /&gt;
 sub (ix + n)&lt;br /&gt;
 sbc a, (ix + n)&lt;br /&gt;
 add a, (ix + n)&lt;br /&gt;
 adc a, (ix + n)&lt;br /&gt;
&lt;br /&gt;
 and (ix + n)&lt;br /&gt;
 or (ix + n)&lt;br /&gt;
 xor (ix + n)&lt;br /&gt;
&lt;br /&gt;
 ; 3 bytes, 23 T-states&lt;br /&gt;
 inc (ix + n)&lt;br /&gt;
 dec (ix + n)&lt;br /&gt;
&lt;br /&gt;
 ; 4 bytes, 23 T-states&lt;br /&gt;
 rl (ix + n)&lt;br /&gt;
 rr (ix + n)&lt;br /&gt;
 rlc (ix + n)&lt;br /&gt;
 rrc (ix + n)&lt;br /&gt;
 sla (ix + n)&lt;br /&gt;
 sra (ix + n)&lt;br /&gt;
 sll (ix + n)      ; undocumented instruction&lt;br /&gt;
 srl (ix + n)&lt;br /&gt;
 set k, (ix + n)   ; k is an immediate value in 0..7&lt;br /&gt;
 res k, (ix + n)&lt;br /&gt;
&lt;br /&gt;
 ; 4 bytes, 20 T-states&lt;br /&gt;
 bit k, (ix + n)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Again, choose wisely between hl and an index register: depending on the structure of your data, the smallest/fastest solution may vary (the hl equivalents of these instructions are generally 2 bytes smaller and 12 T-states faster, but do not allow indexing and so may require intermediate inc/dec).&lt;br /&gt;
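&lt;br /&gt;
Putting allocation, access and release together, a routine with a small stack frame could look like the sketch below. The frame size of 4 bytes is arbitrary, ix is assumed to be free to use as the frame pointer once it has been saved, and hl is destroyed by the release sequence:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
	push ix		; preserve the caller's ix&lt;br /&gt;
	ld ix,-4&lt;br /&gt;
	add ix,sp&lt;br /&gt;
	ld sp,ix	; 4 bytes allocated at (ix+0)..(ix+3)&lt;br /&gt;
	ld (ix+0),a	; store a in the first local byte&lt;br /&gt;
	; ... more code; balanced push/pop and calls are still safe here ...&lt;br /&gt;
	ld a,(ix+0)	; read the local byte back&lt;br /&gt;
	ld hl,4&lt;br /&gt;
	add hl,sp&lt;br /&gt;
	ld sp,hl	; release the 4 bytes (destroys hl)&lt;br /&gt;
	pop ix		; restore the caller's ix&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;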
&lt;br /&gt;
==== Deallocation ====&lt;br /&gt;
&lt;br /&gt;
If you need to pop an entry from the stack but must preserve all registers, remember that sp can be incremented/decremented like any 16-bit register:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; drops the top stack entry : waste 1 byte and 2 T-states but may enable better register allocation...&lt;br /&gt;
 inc sp&lt;br /&gt;
 inc sp&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you have a large amount of stack space to drop and a spare 16 bit register (hl, index, or de that you can easily swap with hl) :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; drop 16 bytes of stack space : 5 bytes, 27 T-states instead of 8 bytes, 80 T-states for 8 pop&lt;br /&gt;
 ld hl, 16&lt;br /&gt;
 add hl, sp&lt;br /&gt;
 ld sp, hl&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt; &lt;br /&gt;
The larger the space to drop, the more T-states you save, and at some point you start saving space as well (once you drop more than 10 bytes, i.e. more than 5 pop instructions).&lt;br /&gt;
&lt;br /&gt;
== General Algorithms ==&lt;br /&gt;
&lt;br /&gt;
Register and memory use is very important for writing concise and fast z80 code. General optimization comes next.&lt;br /&gt;
&lt;br /&gt;
First, try to optimize the most frequently executed code: subroutines and large loops. Finding the bottleneck and fixing it is enough for many programs.&lt;br /&gt;
&lt;br /&gt;
Do not forget that in z80 assembly, vector tables (or look-up tables) give smaller and faster code than blocks of comparisons and jumps. At other times, using a chunk of data for a task is better than a more conventional programming method (notably in graphics screen effects).&lt;br /&gt;
See [[Z80 Good Programming Practices]] for examples.&lt;br /&gt;
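&lt;br /&gt;
As a minimal sketch of a vector table (the names vectorTable and handler0 to handler2 are placeholders, and the index in a is assumed to be already range-checked), the dispatch below takes the same time for every entry, whereas a chain of cp/jr pairs gets longer and slower with each extra case:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; a = command number, 0 to 2 here&lt;br /&gt;
	add a,a		; each table entry is 2 bytes (valid while a is below 128)&lt;br /&gt;
	ld e,a&lt;br /&gt;
	ld d,0&lt;br /&gt;
	ld hl,vectorTable&lt;br /&gt;
	add hl,de	; hl = address of the table entry&lt;br /&gt;
	ld a,(hl)&lt;br /&gt;
	inc hl&lt;br /&gt;
	ld h,(hl)&lt;br /&gt;
	ld l,a		; hl = address of the handler&lt;br /&gt;
	jp (hl)&lt;br /&gt;
&lt;br /&gt;
vectorTable:&lt;br /&gt;
	.dw handler0&lt;br /&gt;
	.dw handler1&lt;br /&gt;
	.dw handler2&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;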
&lt;br /&gt;
Look through a complete instruction set reference for instructions that could optimize some part of your code.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
A list of things to keep in mind:&lt;br /&gt;
* Rework conditionals to be more efficient.&lt;br /&gt;
* Make sure the most common checks come first. In other words, check the more special and rare cases last.&lt;br /&gt;
* Move special-case checks out of the main loop if they aren't needed there.&lt;br /&gt;
* Rearrange program flow.&lt;br /&gt;
* When you can afford a bigger overhead in exchange for getting code out of the main loop, do it.&lt;br /&gt;
* When your code seems unlikely to be efficient enough even after optimization, try another approach or algorithm. Search for other algorithms on Wikipedia, for instance.&lt;br /&gt;
* Rewriting code from scratch can bring new ideas (use this in desperate situations, because of all the work needed to write it again).&lt;br /&gt;
* Remember that it is almost always better to leave optimization to the end. Optimizing too early can bring headaches with crashes and debugging. And because ASM is very fast, and sometimes even smaller than higher-level languages, further optimization may not be needed.&lt;br /&gt;
* Document wacky optimizations so you can understand the code later (z80 optimization leads to code that is very hard to understand).&lt;br /&gt;
&lt;br /&gt;
== Self Modifying Code ==&lt;br /&gt;
&lt;br /&gt;
If your code is in RAM, it can write to itself to change its own instructions. Having an instruction set reference that lists the opcodes is useful.&lt;br /&gt;
Although self modifying code can be applied to any instruction, it is most common with instructions that load constants into registers.&lt;br /&gt;
&lt;br /&gt;
Generally it is used to save any value to be used later (usually seen in masks). Examples:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld (savemask),a&lt;br /&gt;
;...code...&lt;br /&gt;
savemask = $+1&lt;br /&gt;
 ld a,$00   ; $00 is just a placeholder&lt;br /&gt;
&lt;br /&gt;
 ld (something),hl&lt;br /&gt;
;... code&lt;br /&gt;
something = $+1&lt;br /&gt;
 ld de,$0000&lt;br /&gt;
&lt;br /&gt;
 ld (saveSP),sp&lt;br /&gt;
;... code ...&lt;br /&gt;
saveSP = $+1&lt;br /&gt;
 ld sp,$0000  ; restore sp&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SMC (Self Modifying Code) is often combined with unrolling and relative jumps. Example:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld (jpmodify),a   ; a = number of rrca to skip (0 to 8)&lt;br /&gt;
;...&lt;br /&gt;
jpmodify = $+1&lt;br /&gt;
 jr $+2            ; placeholder; the displacement byte is overwritten&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Another use of SMC is to modify load instructions that use (ix+0), changing the 0 to other values, to read or write the nth element of a list very quickly without using any extra registers.&lt;br /&gt;
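&lt;br /&gt;
A minimal sketch of this trick, assuming the code runs from RAM, ix already points to the list, and the element number fits in a displacement (0 to 127). The names listIndex and elemOffset are made up for the example:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
	ld a,(listIndex)	; a = element number&lt;br /&gt;
	ld (elemOffset),a	; patch the displacement byte below&lt;br /&gt;
;...&lt;br /&gt;
elemOffset = $+2		; ld a,(ix+d) is encoded as $DD,$7E,d&lt;br /&gt;
	ld a,(ix+0)		; 0 is just a placeholder&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;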
&lt;br /&gt;
== Small Tricks ==&lt;br /&gt;
&lt;br /&gt;
Note that the following tricks act much like a peephole optimizer and are the last optimization step: remember to optimize your algorithm and register allocation first, before applying any of the following, if you really want the fastest speed and the smallest code.&lt;br /&gt;
&lt;br /&gt;
Also note that nearly every trick makes the code less understandable, so documenting them is a good idea. You can easily forget how they work after a while away from that part of the code.&lt;br /&gt;
&lt;br /&gt;
Be warned that some tricks are not exactly equivalent to the normal way and may have exceptions to their use; the comments warn about them. Some tricks apply to other cases as well, but again you have to be careful.&lt;br /&gt;
&lt;br /&gt;
Some tricks are nothing more than the correct use of the instructions available on the z80. Keeping an instruction set summary at hand helps you see what you can do while coding.&lt;br /&gt;
&lt;br /&gt;
=== Optimize size and speed ===&lt;br /&gt;
&lt;br /&gt;
==== Loading stuff ====&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of:&lt;br /&gt;
 ld a,0&lt;br /&gt;
;Try this:&lt;br /&gt;
 xor a    ;disadvantages: changes flags&lt;br /&gt;
;or&lt;br /&gt;
 sub a    ;disadvantages: changes flags&lt;br /&gt;
; -&amp;gt; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	ld b,$20&lt;br /&gt;
	ld c,$30&lt;br /&gt;
;try this&lt;br /&gt;
	ld bc,$2030&lt;br /&gt;
;or this&lt;br /&gt;
	ld bc,(b_num * 256) + c_num		;where b_num goes to b register and c_num to c register&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
  ld a,$42&lt;br /&gt;
  ld (hl),a&lt;br /&gt;
;try this&lt;br /&gt;
  ld (hl),$42&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	xor a&lt;br /&gt;
	ld (data1),a&lt;br /&gt;
	ld (data2),a&lt;br /&gt;
	ld (data3),a&lt;br /&gt;
	ld (data4),a&lt;br /&gt;
	ld (data5),a	;if data1 to data5 are one after the other&lt;br /&gt;
;try this&lt;br /&gt;
	ld hl,data1&lt;br /&gt;
	ld de,data1+1&lt;br /&gt;
	xor a&lt;br /&gt;
	ld (hl),a&lt;br /&gt;
	ld bc,4&lt;br /&gt;
	ldir&lt;br /&gt;
; -&amp;gt; save 3 bytes for every ld (dataX),a after passing the initial overhead (the ldir version is slower, though)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	ld a,(var)&lt;br /&gt;
	inc a&lt;br /&gt;
	ld (var),a&lt;br /&gt;
;try this	;Note: if hl is not tied up, use indirection:&lt;br /&gt;
	ld hl,var&lt;br /&gt;
	inc (hl)&lt;br /&gt;
	ld a,(hl) ;if you don't need (hl) in a, delete this line&lt;br /&gt;
; -&amp;gt; save 2 bytes and 2 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; Instead of :&lt;br /&gt;
 ld a, (hl)&lt;br /&gt;
 ld (de), a&lt;br /&gt;
 inc hl&lt;br /&gt;
 inc de&lt;br /&gt;
; Use :&lt;br /&gt;
 ldi&lt;br /&gt;
 inc bc&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Never use:&lt;br /&gt;
    push BC&lt;br /&gt;
;    ...&lt;br /&gt;
    pop BC&lt;br /&gt;
    ld D,B&lt;br /&gt;
    ld E,C&lt;br /&gt;
;Use instead:&lt;br /&gt;
    push BC&lt;br /&gt;
;    ...&lt;br /&gt;
    pop DE      ;we only want DE to hold the pushed BC (no copy needs to stay in BC)&lt;br /&gt;
; -&amp;gt; save 2 bytes and 8 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Math and Logic tricks ====&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of:&lt;br /&gt;
 cp 0&lt;br /&gt;
;Use&lt;br /&gt;
 or a&lt;br /&gt;
; -&amp;gt; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  cp 1&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  dec a   ;changes a!&lt;br /&gt;
; -&amp;gt; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  xor %11111111&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  cpl&lt;br /&gt;
; -&amp;gt; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
    ld de,767&lt;br /&gt;
    or a       ;reset carry so sbc works as a sub&lt;br /&gt;
    sbc hl,de&lt;br /&gt;
;try this&lt;br /&gt;
    ld de,-767 ;negation of de&lt;br /&gt;
    add hl,de&lt;br /&gt;
; -&amp;gt; save 2 bytes and 8 T-states (but the resulting carry flag is inverted compared to sbc)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
    ld de,-767&lt;br /&gt;
    add hl,de&lt;br /&gt;
;try this&lt;br /&gt;
    dec h  ; -256&lt;br /&gt;
    dec h  ; -512&lt;br /&gt;
    dec h  ; -768&lt;br /&gt;
    inc hl  ; -767&lt;br /&gt;
;Note that works in many other cases&lt;br /&gt;
; -&amp;gt; save 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	srl a&lt;br /&gt;
	srl a&lt;br /&gt;
	srl a&lt;br /&gt;
;try this&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	and %00011111&lt;br /&gt;
; -&amp;gt; save 1 byte and 5 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	neg&lt;br /&gt;
	add a,N   ;you want to calculate N-A&lt;br /&gt;
;Do it this way:&lt;br /&gt;
	cpl&lt;br /&gt;
	add a,N+1    ;neg is practically equivalent to cpl \ inc a&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Never use:&lt;br /&gt;
    ld A,B&lt;br /&gt;
    neg&lt;br /&gt;
;Instead use:&lt;br /&gt;
    xor A&lt;br /&gt;
    sub B&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Never use:&lt;br /&gt;
    ld A,D&lt;br /&gt;
    sub $D3&lt;br /&gt;
    neg&lt;br /&gt;
;Instead use:&lt;br /&gt;
    ld A,$D3&lt;br /&gt;
    sub D&lt;br /&gt;
; -&amp;gt; save 2 bytes and 8 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  sla l&lt;br /&gt;
  rl h         ; I've actually seen this!&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  add hl,hl&lt;br /&gt;
; -&amp;gt; save 3 bytes and 5 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Conditionals ====&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  and 1&lt;br /&gt;
  cp 1&lt;br /&gt;
  jr z,foo&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  and 1         ;and sets zero flag, no need for cp&lt;br /&gt;
  jr nz,foo&lt;br /&gt;
; -&amp;gt; save 2 bytes and 7 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  and 1&lt;br /&gt;
  cp 1         ;a not needed after this&lt;br /&gt;
  jr z,foo&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  rra&lt;br /&gt;
  jr c,foo&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  bit 0,a&lt;br /&gt;
  call z,foo&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  rra&lt;br /&gt;
  call nc,foo&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  bit 7,a&lt;br /&gt;
  jr z,foo&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  rla&lt;br /&gt;
  jr nc,foo&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  bit 2,a&lt;br /&gt;
  ret nz&lt;br /&gt;
  xor a&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  and %100&lt;br /&gt;
  ret nz&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  cp 9        ;if a&amp;lt;=9 then goto label&lt;br /&gt;
  jp c,label&lt;br /&gt;
  jp z,label&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  cp 9+1      ;if a&amp;lt;10 then goto label (the same test)&lt;br /&gt;
  jp c,label&lt;br /&gt;
; -&amp;gt; save 3 bytes and 10 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Code Flow ====&lt;br /&gt;
&lt;br /&gt;
Almost never call and return...&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
 call xxxx&lt;br /&gt;
 ret&lt;br /&gt;
;try this&lt;br /&gt;
 jp xxxx&lt;br /&gt;
;only do this if the called routine does not use the return address pushed by the call. Example of a routine that does: some kind of inline vputs.&lt;br /&gt;
; -&amp;gt; save 1 byte and 17 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Never use:&lt;br /&gt;
    dec B&lt;br /&gt;
    jr NZ,loop    ;I have seen this...&lt;br /&gt;
;Use:&lt;br /&gt;
    djnz loop&lt;br /&gt;
; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; Instead of&lt;br /&gt;
 ld a,(Number)&lt;br /&gt;
 cp 0&lt;br /&gt;
 jp z,A_is_0&lt;br /&gt;
 cp 1&lt;br /&gt;
 jp z,A_is_1&lt;br /&gt;
 cp 2&lt;br /&gt;
 jp z,A_is_2&lt;br /&gt;
 cp 3&lt;br /&gt;
 jp z,A_is_3&lt;br /&gt;
 cp 4&lt;br /&gt;
 jp z,A_is_4&lt;br /&gt;
 cp 5&lt;br /&gt;
 jp z,A_is_5&lt;br /&gt;
&lt;br /&gt;
; This is a little better&lt;br /&gt;
 ld a,(Number)&lt;br /&gt;
 or a&lt;br /&gt;
 jp z,A_is_0&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_1&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_2&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_3&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_4&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_5&lt;br /&gt;
&lt;br /&gt;
; Even better&lt;br /&gt;
 ld a,(Number)&lt;br /&gt;
 add a,a   ; a*2 (Number must be below 128)&lt;br /&gt;
 ld h,0 &lt;br /&gt;
 ld l,a &lt;br /&gt;
 ld de,VectorTable&lt;br /&gt;
 add hl,de&lt;br /&gt;
 ld a,(hl)&lt;br /&gt;
 inc hl&lt;br /&gt;
 ld h,(hl)&lt;br /&gt;
 ld l,a&lt;br /&gt;
 jp (hl)&lt;br /&gt;
&lt;br /&gt;
VectorTable:&lt;br /&gt;
 .dw A_is_0&lt;br /&gt;
 .dw A_is_1&lt;br /&gt;
 .dw A_is_2&lt;br /&gt;
 .dw A_is_3&lt;br /&gt;
 .dw A_is_4&lt;br /&gt;
 .dw A_is_5&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
Also see [[Z80 Good Programming Practices]]&lt;br /&gt;
&lt;br /&gt;
'''Fallthrough looping'''&lt;br /&gt;
If you need to repeat a routine several times but can't spare registers for a loop counter or unroll the routine, try structuring the routine so it can call itself several times and fall through at the end. For example:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
foo:&lt;br /&gt;
  ld hl, data&lt;br /&gt;
  call bar      ; Run routine once&lt;br /&gt;
  call bar      ; .. twice&lt;br /&gt;
  call bar      ; .. three times&lt;br /&gt;
bar:&lt;br /&gt;
  ld a, (hl)    ; .. fourth and final time&lt;br /&gt;
  inc l&lt;br /&gt;
  and $0F&lt;br /&gt;
  out (c), a&lt;br /&gt;
  ret&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Others ====&lt;br /&gt;
&lt;br /&gt;
Toggling values in loops.&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
loop:&lt;br /&gt;
 ld a,2&lt;br /&gt;
;code1&lt;br /&gt;
 ld a,0&lt;br /&gt;
;code2&lt;br /&gt;
 djnz loop&lt;br /&gt;
&lt;br /&gt;
;try this&lt;br /&gt;
 ld a,2&lt;br /&gt;
loop:&lt;br /&gt;
;code1&lt;br /&gt;
 xor 2     ; the trick: xor flips a between 2 and 0 every time it runs, so a alternates from one pass to the next (code1 and code2 must preserve a)&lt;br /&gt;
;code2&lt;br /&gt;
 djnz loop&lt;br /&gt;
; -&amp;gt; save size and time depending on its use&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Table alignment'''&lt;br /&gt;
&lt;br /&gt;
If you align tables to a 256-byte boundary, you can access the contents by placing the index in a register such as l and the table address in h. This is faster than loading the full unaligned 16-bit address and adding a 16-bit index to it, and makes accessing tables with a size of 256 bytes or less very convenient: &lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld h, (sineTable &amp;gt;&amp;gt; 8) &amp;amp; $FF    ; Get MSB of table&lt;br /&gt;
 ld a, (frame_count)             ; Get index&lt;br /&gt;
 ld l, a&lt;br /&gt;
 ld a, (hl)                      ; Look up value&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Instead of:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld hl, sineTable                ; Get address of table&lt;br /&gt;
 xor a&lt;br /&gt;
 ld d, a                         ; Set index high byte to zero&lt;br /&gt;
 ld a, (frame_count)&lt;br /&gt;
 ld e, a                         ; Set index low byte&lt;br /&gt;
 add hl, de                      ; Add offset to base&lt;br /&gt;
 ld a, (hl)                      ; Look up value&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Size vs. Speed ===&lt;br /&gt;
&lt;br /&gt;
This is the classic trade-off of optimization in computer programming, and the Z80 is no exception.&lt;br /&gt;
In ASM, size is most often what matters, because ASM is generally fast enough and it is nice to give the user a smaller program that doesn't use up most of the RAM.&lt;br /&gt;
&lt;br /&gt;
==== For the sake of size ====&lt;br /&gt;
&lt;br /&gt;
* Use relative jumps (jr label) whenever possible. When the target is out of reach of a relative jump (outside -128 to 127 bytes) and there is a jp to the same target nearby, do a relative jump to that absolute jump. Example:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;lots of code (more than 128 bytes worth of code)&lt;br /&gt;
somelabel2:&lt;br /&gt;
 jp somelabel&lt;br /&gt;
;less than 128 bytes&lt;br /&gt;
 jr somelabel2   ;instead of an absolute jump directly to somelabel, jump to a jump to somelabel.&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Relative jumps are 2 bytes and absolute jumps 3. In terms of speed, jp always takes 10 T-states, while jr takes 12 T-states when the jump occurs and 7 when it doesn't. So jp is faster when the jump is taken and jr is faster when it is not.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
 dec bc&lt;br /&gt;
 ld a,b&lt;br /&gt;
 or c&lt;br /&gt;
 ret z&lt;br /&gt;
;try this&lt;br /&gt;
 cpi              ;increments HL&lt;br /&gt;
 ret po&lt;br /&gt;
; save 1 byte at the cost of 2 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Passing inline data'''&lt;br /&gt;
&lt;br /&gt;
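When you call a routine, the address of the instruction after the call (pc + 3) is pushed on the stack. The routine can pop it and use it as a pointer to data placed right after the call. A very nifty use is with strings. To return, advance the pointer past the data and jp (hl).&lt;br /&gt;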
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of:&lt;br /&gt;
 ld hl,string&lt;br /&gt;
 bcall(_vputs)&lt;br /&gt;
 ret&lt;br /&gt;
;Try this:&lt;br /&gt;
  call Disp&lt;br /&gt;
  .db &amp;quot;This is some text&amp;quot;,0&lt;br /&gt;
  ret&lt;br /&gt;
;Not a speed optimization, but it eliminates 2-byte pointers, since it just uses the call's return address.&lt;br /&gt;
;It also heavily disturbs disassembly.&lt;br /&gt;
Disp:&lt;br /&gt;
  pop hl&lt;br /&gt;
  bcall(_vputs)&lt;br /&gt;
  jp (hl)&lt;br /&gt;
; -&amp;gt; save 3 bytes for each use (ld hl,string is gone and the bcall moves into Disp), but 5 bytes of overhead (Disp routine)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This routine can be expanded to pass the coordinates where the text should appear.&lt;br /&gt;
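&lt;br /&gt;
A rough sketch of that expansion follows. It assumes the coordinates are stored as two bytes in front of the string, and that the penCol and penRow system variables are adjacent in memory (penCol first), as in the usual ti83plus.inc definitions. DispAt is a made-up name.&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  call DispAt&lt;br /&gt;
  .db 20,10                   ;hypothetical data layout: column, row&lt;br /&gt;
  .db &amp;quot;Some text&amp;quot;,0&lt;br /&gt;
  ret&lt;br /&gt;
DispAt:&lt;br /&gt;
  pop hl                      ;hl = address of the inline data&lt;br /&gt;
  ld e,(hl)                   ;e = column&lt;br /&gt;
  inc hl&lt;br /&gt;
  ld d,(hl)                   ;d = row&lt;br /&gt;
  inc hl                      ;hl = start of the string&lt;br /&gt;
  ld (penCol),de              ;assumption: writes penCol (e) and penRow (d) in one go&lt;br /&gt;
  bcall(_vputs)               ;leaves hl just past the terminating zero&lt;br /&gt;
  jp (hl)                     ;resume after the inline data&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;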
&lt;br /&gt;
'''Wasting time to delay'''&lt;br /&gt;
&lt;br /&gt;
There are times when you need a short delay between operations such as reads/writes to ports '''''and there is nothing useful to do'''''. Because nop's are not very size friendly, think of other slower but smaller instructions. Example:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
 ld a,KEY_GROUP&lt;br /&gt;
 out (1),a&lt;br /&gt;
 nop&lt;br /&gt;
 nop&lt;br /&gt;
 in a,(1)&lt;br /&gt;
;Try this:&lt;br /&gt;
 ld a,KEY_GROUP&lt;br /&gt;
 out (1),a&lt;br /&gt;
 ld a,(de)    ;a doesn't need to be preserved because it will hold what the port has.&lt;br /&gt;
 in a,(1)&lt;br /&gt;
; -&amp;gt; save 1 byte and 1 T-state (well 1 T-state less is almost the same time)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When you need to delay and cannot afford to alter registers or flags there are still ways to delay that waste less size than nop's :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; 2 bytes, 8 T-states&lt;br /&gt;
 nop&lt;br /&gt;
 nop&lt;br /&gt;
&lt;br /&gt;
; 2 bytes, 12 T-states&lt;br /&gt;
 inc hl&lt;br /&gt;
 dec hl&lt;br /&gt;
&lt;br /&gt;
; 2 bytes, 12 T-states&lt;br /&gt;
 jr $+2&lt;br /&gt;
&lt;br /&gt;
; 2 bytes, 21 T-states&lt;br /&gt;
 push af&lt;br /&gt;
 pop af&lt;br /&gt;
&lt;br /&gt;
; 2 bytes, 38 T-states&lt;br /&gt;
 ex (sp), hl&lt;br /&gt;
 ex (sp), hl&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you need a small adjustable delay:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;4 bytes, b*13+2 T-states (variable): 7 for ld b, 13 for each taken djnz, 8 for the last one&lt;br /&gt;
	ld b,255	; initial delay&lt;br /&gt;
	djnz $		; do it&lt;br /&gt;
;b=0 on exit&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* There are many other instructions that you can use&lt;br /&gt;
* Beware that not all instructions preserve registers or flags&lt;br /&gt;
* For delays between frames of a game or other longer delays, you can use the 'halt' instruction if interrupts are enabled. It makes the calculator enter low power mode until an interrupt is triggered. To fine-tune this delay mechanism you can alter the interrupt mask and the interrupt timer speed beforehand (and possibly restore their values afterwards).&lt;br /&gt;
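&lt;br /&gt;
A minimal sketch of such a halt-based delay, assuming interrupts may be enabled at this point and that b is free. The label name and the count of 10 interrupts are arbitrary.&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  ei                ;halt never wakes up if interrupts are disabled&lt;br /&gt;
  ld b,10           ;number of interrupts to wait for (arbitrary)&lt;br /&gt;
waitFrames:&lt;br /&gt;
  halt              ;sleep until the next interrupt&lt;br /&gt;
  djnz waitFrames&lt;br /&gt;
;b=0 on exit&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;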
&lt;br /&gt;
==== Unrolling code ====&lt;br /&gt;
&lt;br /&gt;
'''General Unrolling'''&lt;br /&gt;
You can unroll a loop several times instead of looping; this is used frequently in math routines such as multiplication.&lt;br /&gt;
This means you are spending memory to gain speed. Most of the time you will prefer size to speed, so unroll only where it pays off.&lt;br /&gt;
&lt;br /&gt;
'''Unroll commands'''&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; &amp;quot;Classic&amp;quot; way : ~21 T-states per byte copied&lt;br /&gt;
 ld hl,src&lt;br /&gt;
 ld de,dest&lt;br /&gt;
 ld bc,size&lt;br /&gt;
 ldir&lt;br /&gt;
&lt;br /&gt;
; Unrolled : 16 + 10/n T-states per byte copied -&amp;gt; ~17 T-states when unrolling 8 times (n = 8)&lt;br /&gt;
 ld hl,src&lt;br /&gt;
 ld de,dest&lt;br /&gt;
 ld bc,size  ; if the size is not a multiple of the number of unrolled ldi then a small trick must be used to jump appropriately inside the loop for the first iteration&lt;br /&gt;
loopldi:    ;you can use this entry for a call&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 jp pe, loopldi    ; jp used as it is faster and in the case of a loop unrolling we assume speed matters more than size&lt;br /&gt;
; ret if this is a subroutine and use the unrolled ldi's with a call.&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
This unroll of ldi also works with ldd. It works with outi too, but outi counts in b and sets the zero flag, so the loop test becomes jp nz.&lt;br /&gt;
&lt;br /&gt;
==== Looping with 16 bit counter ====&lt;br /&gt;
There are two ways to make loops with a 16bit counter :&lt;br /&gt;
* the naive one, which results in smaller code but increased loop overhead (24 * n T-states) and destroys a&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  ld bc, ...&lt;br /&gt;
loop:&lt;br /&gt;
  ; loop body here&lt;br /&gt;
 &lt;br /&gt;
  dec bc&lt;br /&gt;
  ld  a, b&lt;br /&gt;
  or  c&lt;br /&gt;
  jp  nz,loop&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
* the slightly trickier one, which takes a couple more bytes but has a much lower overhead (about 13 * n + 9 * (n / 256) T-states)&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
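  dec  de        ; de = 16-bit loop count on entry&lt;br /&gt;
  ld  b, e&lt;br /&gt;
  inc  b         ; b = inner count: ((count - 1) &amp;amp; 255) + 1&lt;br /&gt;
  inc  d         ; d = outer count: ((count - 1) &amp;gt;&amp;gt; 8) + 1&lt;br /&gt;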
loop2:&lt;br /&gt;
  ; loop body here&lt;br /&gt;
  &lt;br /&gt;
  djnz loop2&lt;br /&gt;
  dec  d&lt;br /&gt;
  jp  nz,loop2&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
The rationale behind the second method is to reduce the overhead of the &amp;quot;inner&amp;quot; loop as much as possible and to use the fact that when b gets down to zero it will be treated as 256 by djnz. &lt;br /&gt;
&lt;br /&gt;
You can therefore use the following macros for setting proper values of 8bit loop counters given a 16bit counter in case you want to do the conversion at compile time :&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  #define inner_counter8(counter16) (((counter16) - 1) &amp;amp; 0xff) + 1&lt;br /&gt;
  #define outer_counter8(counter16) (((counter16) - 1) &amp;gt;&amp;gt; 8) + 1&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
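&lt;br /&gt;
For instance, working the macros by hand for a count of 600 (an arbitrary value chosen for illustration) gives inner_counter8(600) = (599 &amp;amp; 0xff) + 1 = 88 and outer_counter8(600) = (599 &amp;gt;&amp;gt; 8) + 1 = 3. The first inner pass then runs 88 times and the two remaining ones 256 times each, and 88 + 2 * 256 = 600. The label name is made up.&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  ld  b, 88        ; inner_counter8(600)&lt;br /&gt;
  ld  d, 3         ; outer_counter8(600)&lt;br /&gt;
loop600:&lt;br /&gt;
  ; loop body here, runs 600 times&lt;br /&gt;
&lt;br /&gt;
  djnz loop600&lt;br /&gt;
  dec  d&lt;br /&gt;
  jp  nz,loop600&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;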
&lt;br /&gt;
=== Preserve Registers ===&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; description: both routines compare b to 0, same size and speed but the second preserves accumulator&lt;br /&gt;
; remarks: - inc/dec doesn't affect carry flag&lt;br /&gt;
;          - inc/dec doesn't affect any flags on 16-bit registers, so do not extrapolate to 16-bit registers.&lt;br /&gt;
	ld a,b&lt;br /&gt;
	or b&lt;br /&gt;
	jr z,label&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
	inc b&lt;br /&gt;
	dec b&lt;br /&gt;
	jr z,label&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; description: add a to hl without using another register pair (the second version destroys a instead of de)&lt;br /&gt;
;normal way:&lt;br /&gt;
	ld d,$00&lt;br /&gt;
	ld e,a&lt;br /&gt;
	add hl,de&lt;br /&gt;
;4 bytes and 22 clock cycles&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
	add a,l&lt;br /&gt;
	ld l,a&lt;br /&gt;
	jr nc, $+3&lt;br /&gt;
	inc h&lt;br /&gt;
;5 bytes, 19/20 clock cycles&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Setting flags ==&lt;br /&gt;
On some occasions you might want to selectively set/reset a flag.&lt;br /&gt;
&lt;br /&gt;
Here are the most common uses :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; set Carry flag&lt;br /&gt;
 scf&lt;br /&gt;
&lt;br /&gt;
; reset Carry flag (alters Sign and Zero flags as defined)&lt;br /&gt;
 or a&lt;br /&gt;
&lt;br /&gt;
; alternate reset Carry flag (alters Sign and Zero flags as defined)&lt;br /&gt;
 and a&lt;br /&gt;
&lt;br /&gt;
; set Zero flag (resets Carry flag, alters Sign flag as defined)&lt;br /&gt;
 cp a&lt;br /&gt;
&lt;br /&gt;
; reset Zero flag (alters a, reset Carry flag, alters Sign flag as defined)&lt;br /&gt;
 or 1&lt;br /&gt;
&lt;br /&gt;
; set Sign flag (negative) (alters a, reset Zero and Carry flags)&lt;br /&gt;
 or $80&lt;br /&gt;
&lt;br /&gt;
; reset Sign flag (positive) (set a to zero, set Zero flag, reset Carry flag)&lt;br /&gt;
 xor a&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Other possible uses (much rarer) :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Set parity/overflow (even):&lt;br /&gt;
 xor a&lt;br /&gt;
&lt;br /&gt;
;Reset parity/overflow (odd):&lt;br /&gt;
 sub a&lt;br /&gt;
&lt;br /&gt;
;Set half carry (hardly ever useful but still...)&lt;br /&gt;
 and a&lt;br /&gt;
&lt;br /&gt;
;Reset half carry (hardly ever useful but still...)&lt;br /&gt;
 or a&lt;br /&gt;
&lt;br /&gt;
;Set bit 5 of f:&lt;br /&gt;
 or %00100000&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
As you can see these are extremely simple, small and fast ways to alter flags&lt;br /&gt;
which make them interesting as output of routines to indicate error/success or&lt;br /&gt;
other status bits that do not require a full register.&lt;br /&gt;
&lt;br /&gt;
If you use this, remember that these flag (re)setting tricks frequently&lt;br /&gt;
overlap, so if you need a special combination of flags it might require slightly&lt;br /&gt;
more elaborate tricks. As a rule of thumb, always alter the carry last in&lt;br /&gt;
such cases, because the scf and ccf instructions do not disturb the Sign, Zero or Parity/Overflow flags.&lt;br /&gt;
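&lt;br /&gt;
As an illustration of returning a status in the carry flag, here is a small sketch. IsDigit and NotADigit are made-up names, and the character literals assume an assembler that accepts '0' style constants.&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;returns carry reset if a holds an ASCII digit, carry set otherwise (a is preserved)&lt;br /&gt;
IsDigit:&lt;br /&gt;
  cp '0'&lt;br /&gt;
  ret c             ;a &amp;lt; '0' : cp already set the carry&lt;br /&gt;
  cp '9'+1          ;carry set if a &amp;lt;= '9'&lt;br /&gt;
  ccf               ;invert: carry reset = digit, carry set = not a digit&lt;br /&gt;
  ret&lt;br /&gt;
&lt;br /&gt;
;usage&lt;br /&gt;
  call IsDigit&lt;br /&gt;
  jr c,NotADigit&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;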
&lt;br /&gt;
More advanced ways of manipulating flags follow:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;get the zero flag in carry &lt;br /&gt;
	scf&lt;br /&gt;
	jr z,$+3&lt;br /&gt;
	ccf&lt;br /&gt;
&lt;br /&gt;
;Put carry flag into zero flag (destroys a and leaves the carry inverted).&lt;br /&gt;
	ccf&lt;br /&gt;
	sbc a, a&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Tools of the job ==&lt;br /&gt;
&lt;br /&gt;
Want to test your optimizations or try out new ones? Then check these:&lt;br /&gt;
* Keep a z80 instruction set reference at hand so you don't forget a useful instruction or which flags it affects. (see [[Z80_Instruction_Set|Z80_Instruction_Set]])&lt;br /&gt;
* Use an assembler that has &amp;quot;.echo&amp;quot; and use this in the source to count size: (see [[Assemblers|Assemblers]])&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;SomeCodeorData:&lt;br /&gt;
;code or data goes here&lt;br /&gt;
End:&lt;br /&gt;
 .echo &amp;quot;size of the code/data:&amp;quot;&lt;br /&gt;
 .echo End-SomeCodeorData&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
* Get a nice IDE of z80 that counts code ([[IDEs|IDE's]])&lt;br /&gt;
* Make use of the counting capabilities of an emulator ([[:Category:Emulators|Emulators]])&lt;br /&gt;
&lt;br /&gt;
== Crazy, &amp;quot;magick&amp;quot;, hacks and obscure optimization's tricks ==&lt;br /&gt;
&lt;br /&gt;
These are not normally recommended, because some of them disturb disassembly and make the code harder for other coders to understand.&lt;br /&gt;
&lt;br /&gt;
=== Better else ===&lt;br /&gt;
So you normally have an if-else-endif block like this:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
jr nz,else    ;the IF&lt;br /&gt;
;some code&lt;br /&gt;
jr endif&lt;br /&gt;
else:&lt;br /&gt;
;some code&lt;br /&gt;
endif:&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
But here's a crazy trick for when the Else code is a single 2-byte instruction:&lt;br /&gt;
You use the first byte of a 3 byte instruction with no side effects instead of the &amp;quot;jr endif&amp;quot; line!&lt;br /&gt;
So if you had code like this:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
cp 7&lt;br /&gt;
jr nz,else&lt;br /&gt;
ld a,3&lt;br /&gt;
jr endif&lt;br /&gt;
else:&lt;br /&gt;
ld a,4&lt;br /&gt;
endif:&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You could replace it with this:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
cp 7&lt;br /&gt;
jr nz,else&lt;br /&gt;
ld a,3&lt;br /&gt;
.db $C2  ;jp nz,xxxx&lt;br /&gt;
else:&lt;br /&gt;
ld a,4&lt;br /&gt;
endif:&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Instead of branching over the ld a,4 instruction, it now executes a jp nz,XXXX instruction where the XXXX is the two bytes of the next instruction. You already know what the flags will be here, so you can make the jump never taken. You can use this to skip the next two bytes of execution! Who needs to branch over it?&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
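This takes 31 T-states for the If path instead of 33 (the jp nz that is never taken costs 10 T-states, the jr endif it replaces costs 12). A small saving, but it could be useful in tight loops, and it saves 1 byte!&lt;br /&gt;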
The only reason not to use this for 1-byte instructions would be code readability and bug safety. Watch those flags!&lt;br /&gt;
&lt;br /&gt;
=== Conditional rst ===&lt;br /&gt;
&lt;br /&gt;
For a smaller conditional rst $38, use jr cc, -1. This will cause a conditional jump to the displacement byte ($FF) which is the rst $38 opcode. &lt;br /&gt;
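&lt;br /&gt;
A sketch of the byte-level view, using the carry flag as the condition. The $+1 form assumes an assembler where $ is the address of the current instruction.&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
  jr nc,skip&lt;br /&gt;
  rst $38&lt;br /&gt;
skip:&lt;br /&gt;
;try this&lt;br /&gt;
  jr c,$+1          ;assembles to $38,$FF : the jump lands on the $FF byte, which executes as rst $38&lt;br /&gt;
; -&amp;gt; save 1 byte&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;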
&lt;br /&gt;
=== DAA trick ===&lt;br /&gt;
&lt;br /&gt;
Normally the DAA instruction is used for BCD math, but it can also convert a value from 0 to 15 in a into its ASCII hexadecimal digit ('0'-'9', 'A'-'F').&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
	cp 10&lt;br /&gt;
	ccf&lt;br /&gt;
	adc a, 30h&lt;br /&gt;
	daa&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
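&lt;br /&gt;
Tracing two values through the sequence shows how it works. The wrapper below is only a sketch: NibbleToHex is a made-up name, and the and $0F is added so that any byte can be passed in.&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;input: a = value, output: a = ASCII hex digit of the low nibble&lt;br /&gt;
NibbleToHex:&lt;br /&gt;
	and $0F&lt;br /&gt;
	cp 10           ;carry set if a &amp;lt; 10&lt;br /&gt;
	ccf             ;carry set if a &amp;gt;= 10&lt;br /&gt;
	adc a, 30h      ;0-9 become $30-$39, 10-15 become $3B-$40&lt;br /&gt;
	daa             ;leaves $30-$39 alone, adds 6 to the rest, giving $41-$46&lt;br /&gt;
	ret&lt;br /&gt;
;a = $07 : adc gives $37, daa keeps $37 ('7')&lt;br /&gt;
;a = $0C : adc gives $3D, daa gives $43 ('C')&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;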
&lt;br /&gt;
== Related topics ==&lt;br /&gt;
* [http://www.junemann.nl/maxcoderz/viewtopic.php?f=5&amp;amp;t=675 MaxCoderz TI-ASM optimization]&lt;br /&gt;
* ticalc archives: [http://www.ticalc.org/archives/files/fileinfo/108/10821.html 1] [http://www.ticalc.org/archives/files/fileinfo/285/28502.html 2]&lt;br /&gt;
* [http://www.ballyalley.com/ml/z80_docs/z80_docs.html Bally Alley Z80 Machine Language Documentation]&lt;br /&gt;
* [http://map.grauw.nl/articles/fast_loops.php Fast loops in MSX Assembly Page]&lt;br /&gt;
* [http://shiar.nl/calc/z80/optimize Shiar z80 optimization page]&lt;br /&gt;
* [http://www.smspower.org/dev/docs/wiki/?n=Z80.ProgrammingTechniques SMS Power! dev wiki z80 Techniques]&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
* fullmetalcoder&lt;br /&gt;
* Galandros&lt;br /&gt;
* Dwedit for sharing in MaxCoderz the &amp;quot;Better else&amp;quot;&lt;br /&gt;
* MaxCoderz participants in assembly optimizing topic (Jim e,CoBB,...)&lt;br /&gt;
* SMS Power wiki&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Z80_Optimization</id>
		<title>Z80 Optimization</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Z80_Optimization"/>
				<updated>2010-06-18T13:36:17Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: /* Others */ divided and added more tricks&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
Sometimes you need some extra speed in ASM, or you need to make your game smaller so it fits on the calculator. Examples: programs that consume a lot of graphics or data, and graphics code for mapping, grayscale and 3D graphics.&lt;br /&gt;
&lt;br /&gt;
If you are just looking to cut some bytes, go straight to the small tricks in this topic.&lt;br /&gt;
&lt;br /&gt;
== Registers and Memory ==&lt;br /&gt;
Good z80 algorithms generally use the registers in an appropriate way.&lt;br /&gt;
It is also good practice to keep a convention and plan how you are going to use the registers.&lt;br /&gt;
&lt;br /&gt;
General use of registers:&lt;br /&gt;
* a - 8-bit accumulator&lt;br /&gt;
* b - counter&lt;br /&gt;
* c,d,e,h,l - auxiliary to the accumulator, or copies of b or a&lt;br /&gt;
&lt;br /&gt;
* hl - 16-bit accumulator/pointer to a memory address&lt;br /&gt;
* de - pointer to a destination memory address&lt;br /&gt;
* bc - 16-bit counter&lt;br /&gt;
* ix - index register/pointer to table in memory/save copy of hl/pointer to memory when hl and de are being used&lt;br /&gt;
* iy - index register/pointer to table in memory (use when there is no other option or you need optimal execution) (disable interrupts and restore the original value on exit, because TI-OS uses it)&lt;br /&gt;
&lt;br /&gt;
=== 8-bit vs. 16-bit Operations ===&lt;br /&gt;
&lt;br /&gt;
The z80 processor performs operations on 8-bit values faster.&lt;br /&gt;
Code dealing with 16-bit registers tends to be bigger and slower, because the equivalent 16-bit instruction is slower, or does not exist and has to be replaced with several instructions. Sometimes the equivalent 16-bit instruction is also 1 byte bigger.&lt;br /&gt;
If you use the ix or iy registers, operations are even slower and each instruction is always 1 byte bigger. So try to convert your code to use hl and de instead of ix and iy.&lt;br /&gt;
&lt;br /&gt;
In a practical example, imagine:&lt;br /&gt;
- you pass a value to a routine through the accumulator&lt;br /&gt;
- the only valid values of the accumulator range from 0 to 63, and the routine needs to multiply the accumulator by, say, 12, so the result has to be stored in a 16-bit register pair&lt;br /&gt;
- but you can multiply a by 4 before it overflows (63*4 = 252, which is smaller than 255) and take advantage of this to optimize&lt;br /&gt;
&lt;br /&gt;
Now on the code:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; The most usual way is to move A (the accumulator) into HL right at the start&lt;br /&gt;
	ld h,0&lt;br /&gt;
	ld l,a&lt;br /&gt;
	add a,a&lt;br /&gt;
	ld d,h&lt;br /&gt;
	ld e,a&lt;br /&gt;
	add hl,de&lt;br /&gt;
	add hl,hl&lt;br /&gt;
	add hl,hl	; hl=a*12&lt;br /&gt;
; 9 bytes, 56 clocks&lt;br /&gt;
&lt;br /&gt;
; But given a is between 0 and 63 you can multiply by 4 without overflowing the 8-bit limit (255)&lt;br /&gt;
	add a,a&lt;br /&gt;
	add a,a		; a*4&lt;br /&gt;
	ld l,a&lt;br /&gt;
	ld e,a&lt;br /&gt;
	ld h,0&lt;br /&gt;
	ld d,h		; hl=a*4 and de=a*4&lt;br /&gt;
	add hl,hl	; hl=a*8&lt;br /&gt;
	add hl,de	; hl=a*12&lt;br /&gt;
; 9 bytes, 49 clocks&lt;br /&gt;
&lt;br /&gt;
; hey, minus 7 clock cycles&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In this example you only shaved a few clock cycles, but sometimes you can save some bytes, too.&lt;br /&gt;
You can do this for registers other than the accumulator.&lt;br /&gt;
&lt;br /&gt;
For example, if the value is passed in l and l is always lower than 64, you can do &amp;quot; sla l \ sla l \ ld h,0	&amp;quot; to multiply l by four and then use hl for 16-bit operations. In this case you are trading size for speed: each sla instruction is 2 bytes and add hl,hl is only 1 byte.&lt;br /&gt;
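&lt;br /&gt;
Side by side, the two variants look like this (a sketch, assuming l is below 64 on entry so that no bit is shifted out of l):&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; smaller&lt;br /&gt;
	ld h,0&lt;br /&gt;
	add hl,hl&lt;br /&gt;
	add hl,hl	; hl=l*4&lt;br /&gt;
; 4 bytes, 29 clocks&lt;br /&gt;
&lt;br /&gt;
; faster&lt;br /&gt;
	sla l&lt;br /&gt;
	sla l		; l*4 still fits in 8 bits&lt;br /&gt;
	ld h,0		; hl=l*4&lt;br /&gt;
; 6 bytes, 23 clocks&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;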
&lt;br /&gt;
Mind that these optimizations can introduce bugs and make the code somewhat hard to follow, so comment them.&lt;br /&gt;
I recommend applying them only when you really need the speed and the code is bug free.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
One common trick for multiplication by 256 is to simply load the low byte register into the high byte register. This works because in binary a multiplication by 256 is a shift left by 8 bits, with zeros shifted in. Examples:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; multiply a by 256 and store in hl&lt;br /&gt;
	ld h,a&lt;br /&gt;
	ld l,0&lt;br /&gt;
; multiply hl by 256 and store in ade (pseudo 24-bit pair register)&lt;br /&gt;
	ld a,h&lt;br /&gt;
	ld d,l&lt;br /&gt;
	ld e,0&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
If you are out of registers, try using ixh/ixl/iyh/iyl and even the i register for loop counters instead of maintaining a counter in memory or pushing/popping an already used register inside a loop. Using ixh/ixl/iyh/iyl will break compatibility with the TI-84+SE emulated by the Nspire. You can only use the i register for other purposes if you disable interrupts first (di).&lt;br /&gt;
&lt;br /&gt;
=== Shadow registers ===&lt;br /&gt;
&lt;br /&gt;
In some rare cases, when you run out of registers and can neither refactor your algorithm(s) nor rely on RAM storage, you may want to use the shadow registers : af', bc', de' and hl'&lt;br /&gt;
&lt;br /&gt;
These registers behave like their &amp;quot;standard&amp;quot; counterparts (af, bc, de, hl) and you can swap the two register sets using the following instructions :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ex af, af'  ; swaps af and af' as the mnemonic indicates&lt;br /&gt;
&lt;br /&gt;
 exx         ; swaps bc, de, hl and bc', de', hl'&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Shadow registers are somewhat common for doing arithmetic operations on big integers (16-bit to 32-bit) or BCD operations without relying on RAM storage or pushing and popping to the stack. Example:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
MUL32:&lt;br /&gt;
        DI&lt;br /&gt;
        AND     A               ; RESET CARRY FLAG&lt;br /&gt;
        SBC     HL,HL           ; LOWER RESULT = 0&lt;br /&gt;
        EXX&lt;br /&gt;
        SBC     HL,HL           ; HIGHER RESULT = 0&lt;br /&gt;
        LD      A,B             ; MPR IS AC'BC&lt;br /&gt;
        LD      B,32            ; INITIALIZE LOOP COUNTER&lt;br /&gt;
MUL32LOOP:&lt;br /&gt;
        SRA     A               ; RIGHT SHIFT MPR&lt;br /&gt;
        RR      C&lt;br /&gt;
        EXX&lt;br /&gt;
        RR      B&lt;br /&gt;
        RR      C               ; LOWEST BIT INTO CARRY&lt;br /&gt;
        JR      NC,MUL32NOADD&lt;br /&gt;
        ADD     HL,DE           ; RESULT += MPD&lt;br /&gt;
        EXX&lt;br /&gt;
        ADC     HL,DE&lt;br /&gt;
        EXX&lt;br /&gt;
MUL32NOADD:&lt;br /&gt;
        SLA     E               ; LEFT SHIFT MPD&lt;br /&gt;
        RL      D&lt;br /&gt;
        EXX&lt;br /&gt;
        RL      E&lt;br /&gt;
        RL      D&lt;br /&gt;
        DJNZ    MUL32LOOP&lt;br /&gt;
        EXX&lt;br /&gt;
       &lt;br /&gt;
; RESULT IN H'L'HL&lt;br /&gt;
        RET&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Shadow registers can be of a great help but they come with two drawbacks :&lt;br /&gt;
&lt;br /&gt;
* they cannot coexist with the &amp;quot;standard&amp;quot; registers : you cannot use ld to assign from a standard to a shadow or vice-versa. Instead you must use nasty constructs such as :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; loads hl' with the contents of hl&lt;br /&gt;
 push hl&lt;br /&gt;
 exx&lt;br /&gt;
 pop hl&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* they require interrupts to be disabled, since they were originally intended for use in Interrupt Service Routines. There are situations where that is affordable and others where it isn't. Regardless, it is generally good policy to restore the previous interrupt status (enabled/disabled) upon return instead of leaving it up to the caller. Fortunately, this is relatively easy to do (though it does add 4 bytes and 29/33 T-states to the routine):&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  ld a, i  ; this is the core of the trick, it sets P/V to the value of IFF so P/V is set iff interrupts were enabled at that point&lt;br /&gt;
  push af  ; save flags&lt;br /&gt;
  di       ; disable interrupts&lt;br /&gt;
  &lt;br /&gt;
  ; do something with shadow registers here&lt;br /&gt;
&lt;br /&gt;
  pop af   ; get back flags&lt;br /&gt;
  ret po   ; po = P/V reset so in this case it means interrupts were disabled before the routine was called&lt;br /&gt;
  ei       ; re-enable interrupts&lt;br /&gt;
  ret&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
: Note that this produces code that is ugly and very hard to follow, so comment it well to make later understanding and debugging easier.&lt;br /&gt;
&lt;br /&gt;
=== SP register ===&lt;br /&gt;
&lt;br /&gt;
This register is used in desperate situations, generally in an interrupt loop that demands as much speed as possible when all the normal registers are already in use (notably in James Montelongo's 4-level grayscale interlace in graylib2.inc).&lt;br /&gt;
You need to know these valid but not widely known instructions:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld sp,6&lt;br /&gt;
 add hl,sp&lt;br /&gt;
 sbc hl,sp&lt;br /&gt;
 inc sp&lt;br /&gt;
 dec sp&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now an example of such a situation:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld (saveSP),sp&lt;br /&gt;
;init hl,de,bc,a&lt;br /&gt;
 ld sp,6&lt;br /&gt;
loop:&lt;br /&gt;
;code&lt;br /&gt;
 add hl,sp  ;get next row of a table for example&lt;br /&gt;
;code using bc,de,ix,a&lt;br /&gt;
 ld a,b&lt;br /&gt;
 or c&lt;br /&gt;
 jp nz,loop&lt;br /&gt;
;code&lt;br /&gt;
 ld sp,(saveSP)&lt;br /&gt;
 ret    ;finish interrupt&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt; &lt;br /&gt;
&lt;br /&gt;
When you use sp in this way, you cannot push/pop registers and no calls are allowed. Interrupts must also stay disabled, because an interrupt pushes its return address at the address held in sp.&lt;br /&gt;
Keep in mind that this is only a last resort. Don't forget to save and restore sp as the example shows.&lt;br /&gt;
&lt;br /&gt;
=== Stack ===&lt;br /&gt;
&lt;br /&gt;
When you run out of registers, the stack may offer an interesting alternative to fixed RAM locations for temporary storage.&lt;br /&gt;
&lt;br /&gt;
==== Allocation ====&lt;br /&gt;
&lt;br /&gt;
You can allocate stack space with repeated push instructions, which lets you initialize the data but restricts the allocated space to multiples of 2.&lt;br /&gt;
An alternative is to allocate uninitialized stack space (hl may be replaced with an index register):&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; allocates 7 bytes of stack space : 5 bytes, 27 T-states instead of 4 bytes, 44 T-states with 4 push which would have forced the alloc of 8 bytes&lt;br /&gt;
 ld hl, -7&lt;br /&gt;
 add hl, sp&lt;br /&gt;
 ld sp, hl&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Access ====&lt;br /&gt;
&lt;br /&gt;
The most common way of accessing data allocated on the stack is to use an index register, since all allocated &amp;quot;variables&amp;quot; can then be accessed without inc/dec, but this is not a strict requirement. Beware, though: using stack space is not always optimal in terms of speed, depending (among other things) on your register allocation strategy:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; 3 bytes, 19 T-states&lt;br /&gt;
 ld c, (ix + n)   ; n is an immediate value in -128..127&lt;br /&gt;
 &lt;br /&gt;
 ; 4 bytes, 17 T-states, destroys a&lt;br /&gt;
 ld a, (somelocation)&lt;br /&gt;
 ld c, a&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If your needs go beyond simple load/store, however, this method starts to show its real power, since it vastly simplifies some operations that are complicated to do with fixed storage locations (and that generally clobber registers in the process).&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; 3 bytes, 19 T-states (inc and dec take 23 T-states)&lt;br /&gt;
 cp (ix + n)&lt;br /&gt;
&lt;br /&gt;
 sub (ix + n)&lt;br /&gt;
 sbc a, (ix + n)&lt;br /&gt;
 add a, (ix + n)&lt;br /&gt;
 adc a, (ix + n)&lt;br /&gt;
&lt;br /&gt;
 inc (ix + n)&lt;br /&gt;
 dec (ix + n)&lt;br /&gt;
&lt;br /&gt;
 and (ix + n)&lt;br /&gt;
 or (ix + n)&lt;br /&gt;
 xor (ix + n)&lt;br /&gt;
&lt;br /&gt;
 ; 4 bytes, 23 T-states (bit takes 20 T-states)&lt;br /&gt;
 rl (ix + n)&lt;br /&gt;
 rr (ix + n)&lt;br /&gt;
 rlc (ix + n)&lt;br /&gt;
 rrc (ix + n)&lt;br /&gt;
 sla (ix + n)&lt;br /&gt;
 sra (ix + n)&lt;br /&gt;
 sll (ix + n)      ; undocumented instruction&lt;br /&gt;
 srl (ix + n)&lt;br /&gt;
 bit k, (ix + n)   ; k is an immediate value in 0..7&lt;br /&gt;
 set k, (ix + n)&lt;br /&gt;
 res k, (ix + n)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Again, choose wisely between hl and an index register: depending on the structure of your data, the smallest/fastest solution may vary (the hl equivalents are generally 2 bytes smaller and 12 T-states faster, but they do not allow indexing and so may require intermediate inc/dec).&lt;br /&gt;
&lt;br /&gt;
==== Deallocation ====&lt;br /&gt;
&lt;br /&gt;
If you need to pop an entry from the stack but must preserve all registers, remember that sp can be incremented/decremented like any 16-bit register:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; drops the top stack entry : wastes 1 byte and 2 T-states compared to a pop, but may enable better register allocation...&lt;br /&gt;
 inc sp&lt;br /&gt;
 inc sp&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you have a large amount of stack space to drop and a spare 16-bit register (hl, an index register, or de, which you can easily swap with hl):&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; drop 16 bytes of stack space : 5 bytes, 27 T-states instead of 8 bytes, 80 T-states for 8 pop&lt;br /&gt;
 ld hl, 16&lt;br /&gt;
 add hl, sp&lt;br /&gt;
 ld sp, hl&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt; &lt;br /&gt;
The larger the space to drop, the more T-states you will save, and at some point you'll start saving space as well (when dropping more than 10 bytes, that is, when replacing more than 5 pops).&lt;br /&gt;
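&lt;br /&gt;
Putting allocation, access and deallocation together, a routine with a small stack frame might look like the following sketch. The label, the frame size and the choice of ix as frame pointer are illustrative assumptions, not a fixed convention:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
withframe:&lt;br /&gt;
 push ix         ; preserve the caller's ix&lt;br /&gt;
 ld ix,-4&lt;br /&gt;
 add ix,sp&lt;br /&gt;
 ld sp,ix        ; allocate 4 bytes, ix points at the lowest one&lt;br /&gt;
 ld (ix+0),a     ; store a in the first local&lt;br /&gt;
 ld (ix+1),0     ; clear the second local&lt;br /&gt;
;code free to use all the main registers&lt;br /&gt;
 inc (ix+1)      ; update a local in place&lt;br /&gt;
 ld a,(ix+0)     ; get the saved value back&lt;br /&gt;
 ld hl,4&lt;br /&gt;
 add hl,sp&lt;br /&gt;
 ld sp,hl        ; drop the 4 bytes (destroys hl)&lt;br /&gt;
 pop ix          ; restore the caller's ix&lt;br /&gt;
 ret&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
Any push inside the body must be matched by a pop before the frame is dropped, since the deallocation assumes sp is back at the frame base.&lt;br /&gt;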
&lt;br /&gt;
== General Algorithms ==&lt;br /&gt;
&lt;br /&gt;
Register and memory use is very important for writing concise and fast z80 code. General optimization comes next.&lt;br /&gt;
&lt;br /&gt;
First, try to optimize the most frequently executed code: subroutines and large loops. Finding the bottleneck and fixing it is enough for many programs.&lt;br /&gt;
&lt;br /&gt;
Do not forget that in z80 assembly, vector tables (or lookup tables) give smaller and faster code than blocks of comparisons and jumps. At other times, using a chunk of data for a task is better than a more conventional programming method (notably in graphics screen effects).&lt;br /&gt;
See [[Z80 Good Programming Practices]] for examples.&lt;br /&gt;
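&lt;br /&gt;
As a small illustration of a data lookup table, the sketch below turns a bit number into a bit mask with a single table read instead of a shift loop or a chain of comparisons. The routine and table names are hypothetical:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Input:  a = bit number (0..7)&lt;br /&gt;
;Output: a = mask with only that bit set, hl destroyed&lt;br /&gt;
getmask:&lt;br /&gt;
 ld hl,masktable&lt;br /&gt;
 add a,l&lt;br /&gt;
 ld l,a&lt;br /&gt;
 jr nc,$+3       ; skip the inc h when there is no carry out of l&lt;br /&gt;
 inc h&lt;br /&gt;
 ld a,(hl)&lt;br /&gt;
 ret&lt;br /&gt;
masktable:&lt;br /&gt;
 .db %00000001,%00000010,%00000100,%00001000&lt;br /&gt;
 .db %00010000,%00100000,%01000000,%10000000&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;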
&lt;br /&gt;
Look through a complete instruction set reference for instructions that could optimize some part of the code.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
A list of things to keep in mind:&lt;br /&gt;
* Rework conditionals to be more efficient.&lt;br /&gt;
* Make sure the most common checks come first. In other words, check for the rarer special cases last.&lt;br /&gt;
* Move special-case checks out of the main loop if they aren't needed there.&lt;br /&gt;
* Rearrange program flow&lt;br /&gt;
* When you can afford a bigger overhead in exchange for moving code out of the main loop, do it.&lt;br /&gt;
* When your code seems unlikely to be efficient enough even after optimization, try another approach or algorithm. Search for other algorithms on Wikipedia, for instance.&lt;br /&gt;
* Rewriting code from scratch can bring new ideas (reserve this for desperate situations because of all the work needed).&lt;br /&gt;
* Remember that it is almost always better to leave optimization to the end. Optimizing too early can bring headaches with crashes and debugging. And because ASM is very fast and sometimes even smaller than higher-level languages, further optimization may not be needed.&lt;br /&gt;
* Document wacky optimizations so that you can understand the code later (z80 optimization leads to code that is very hard to understand).&lt;br /&gt;
&lt;br /&gt;
== Self Modifying Code ==&lt;br /&gt;
&lt;br /&gt;
If your code is in RAM, it can write to itself to change its own instructions. Having an instruction set reference that lists the opcodes is useful.&lt;br /&gt;
Although self-modifying code can be applied to any instruction, it is most common with instructions that load constants into registers.&lt;br /&gt;
&lt;br /&gt;
Generally it is used to save any value to be used later (usually seen in masks). Examples:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld (savemask),a&lt;br /&gt;
;...code...&lt;br /&gt;
savemask = $+1&lt;br /&gt;
 ld a,$00   ; $00 is just a placeholder&lt;br /&gt;
&lt;br /&gt;
 ld (something),hl&lt;br /&gt;
;... code&lt;br /&gt;
something = $+1&lt;br /&gt;
 ld de,$0000&lt;br /&gt;
&lt;br /&gt;
 ld (saveSP),sp&lt;br /&gt;
;... code ...&lt;br /&gt;
saveSP = $+1&lt;br /&gt;
 ld sp,$0000  ; restore sp&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SMC (Self Modifying Code) is often used with unrolling and relative jumps. Example:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld (jpmodify),a&lt;br /&gt;
;...&lt;br /&gt;
jpmodify = $+1&lt;br /&gt;
 jr $+2    ; assembles with offset byte 0, which is just a placeholder&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
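&lt;br /&gt;
To show how the stored value is chosen, here is a sketch that rotates a right by a variable count. Each rrca is one byte, so the jr offset is simply the number of rrca instructions to skip. The label rotskip is hypothetical:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Input: a = value, b = number of rotations (0..8)&lt;br /&gt;
 ld c,a          ; keep the value while a computes the offset&lt;br /&gt;
 ld a,8&lt;br /&gt;
 sub b           ; offset = 8 - b rrca instructions to skip&lt;br /&gt;
 ld (rotskip),a&lt;br /&gt;
 ld a,c&lt;br /&gt;
rotskip = $+1&lt;br /&gt;
 jr $+2          ; the offset byte is overwritten before it executes&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;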
&lt;br /&gt;
Another use of SMC is to modify load instructions that use (ix+0), changing the 0 to other values to read and write the nth element of a list very quickly without using any extra registers.&lt;br /&gt;
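&lt;br /&gt;
A minimal sketch of this, assuming a hypothetical variable named index and a list whose base address is already in ix. The displacement is the third byte of the instruction, hence the $+2:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld a,(index)        ; index must be in 0..127 to stay a positive displacement&lt;br /&gt;
 ld (elemoffset),a&lt;br /&gt;
;...code...&lt;br /&gt;
elemoffset = $+2     ; ld a,(ix+d) is encoded as $DD,$7E,d&lt;br /&gt;
 ld a,(ix+0)         ; reads element number index of the list at ix&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;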
&lt;br /&gt;
== Small Tricks ==&lt;br /&gt;
&lt;br /&gt;
Note that the following tricks act much like a peep-hole optimizer and are the last optimization step : remember to first optimize your algorithm and register allocation before applying any of the following if you really want the fastest speed and the smallest code.&lt;br /&gt;
&lt;br /&gt;
Also note that nearly every trick makes the code less understandable, so documenting them is a good idea. After a while away from the code, you can easily forget how it works.&lt;br /&gt;
&lt;br /&gt;
Be warned that some tricks are not exactly equivalent to the normal way and have exceptions to their use; the comments warn about them. Some tricks apply to other cases too, but again you have to be careful.&lt;br /&gt;
&lt;br /&gt;
Some tricks are nothing more than the correct use of the instructions available on the z80. Keeping an instruction set summary at hand helps you see what you can do while coding.&lt;br /&gt;
&lt;br /&gt;
=== Optimize size and speed ===&lt;br /&gt;
&lt;br /&gt;
==== Loading stuff ====&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of:&lt;br /&gt;
 ld a,0&lt;br /&gt;
;Try this:&lt;br /&gt;
 xor a    ;disadvantages: changes flags&lt;br /&gt;
;or&lt;br /&gt;
 sub a    ;disadvantages: changes flags&lt;br /&gt;
; -&amp;gt; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	ld b,$20&lt;br /&gt;
	ld c,$30&lt;br /&gt;
;try this&lt;br /&gt;
	ld bc,$2030&lt;br /&gt;
;or this&lt;br /&gt;
	ld bc,(b_num * 256) + c_num		;where b_num goes to b register and c_num to c register&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
  ld a,$42&lt;br /&gt;
  ld (hl),a&lt;br /&gt;
;try this&lt;br /&gt;
  ld (hl),$42&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	xor a&lt;br /&gt;
	ld (data1),a&lt;br /&gt;
	ld (data2),a&lt;br /&gt;
	ld (data3),a&lt;br /&gt;
	ld (data4),a&lt;br /&gt;
	ld (data5),a	;if data1 to data5 are one after the other&lt;br /&gt;
;try this&lt;br /&gt;
	ld hl,data1&lt;br /&gt;
	ld de,data1+1&lt;br /&gt;
	xor a&lt;br /&gt;
	ld (hl),a&lt;br /&gt;
	ld bc,4&lt;br /&gt;
	ldir&lt;br /&gt;
; -&amp;gt; save 3 bytes for every ld (dataX), after passing the initial overhead (smaller but slower: ldir takes 21 T-states per byte against 13 for ld (dataX),a)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	ld a,(var)&lt;br /&gt;
	inc a&lt;br /&gt;
	ld (var),a&lt;br /&gt;
;try this	;Note: if hl is not tied up, use indirection:&lt;br /&gt;
	ld hl,var&lt;br /&gt;
	inc (hl)&lt;br /&gt;
	ld a,(hl) ;if you don't need (hl) in a, delete this line&lt;br /&gt;
; -&amp;gt; save 2 bytes and 2 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; Instead of :&lt;br /&gt;
 ld a, (hl)&lt;br /&gt;
 ld (de), a&lt;br /&gt;
 inc hl&lt;br /&gt;
 inc de&lt;br /&gt;
; Use :&lt;br /&gt;
 ldi&lt;br /&gt;
 inc bc&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Never use:&lt;br /&gt;
    push BC&lt;br /&gt;
;    ...&lt;br /&gt;
    pop BC&lt;br /&gt;
    ld D,B&lt;br /&gt;
    ld E,C&lt;br /&gt;
;Use instead:&lt;br /&gt;
    push BC&lt;br /&gt;
;    ...&lt;br /&gt;
    pop DE      ;we only want DE to hold the pushed BC (no need for a copy in BC)&lt;br /&gt;
; -&amp;gt; save 2 bytes and 8 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Math and Logic tricks ====&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of:&lt;br /&gt;
 cp 0&lt;br /&gt;
;Use&lt;br /&gt;
 or a&lt;br /&gt;
; -&amp;gt; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  cp 1&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  dec a   ;changes a! only the zero flag matches cp 1 (carry is not updated)&lt;br /&gt;
; -&amp;gt; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  xor %11111111&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  cpl&lt;br /&gt;
; -&amp;gt; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
    ld de,767&lt;br /&gt;
    or a       ;reset carry so sbc works as a sub&lt;br /&gt;
    sbc hl,de&lt;br /&gt;
;try this&lt;br /&gt;
    ld de,-767 ;negated constant&lt;br /&gt;
    add hl,de&lt;br /&gt;
; -&amp;gt; save 2 bytes and 8 T-states (carry comes out inverted compared to sbc, and the zero flag is not updated)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
    ld de,-767&lt;br /&gt;
    add hl,de&lt;br /&gt;
;try this&lt;br /&gt;
    dec h  ; -256&lt;br /&gt;
    dec h  ; -512&lt;br /&gt;
    dec h  ; -768&lt;br /&gt;
    inc hl  ; -767&lt;br /&gt;
;Note that this works in many other cases and leaves de untouched&lt;br /&gt;
; -&amp;gt; save 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	srl a&lt;br /&gt;
	srl a&lt;br /&gt;
	srl a&lt;br /&gt;
;try this&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	and %00011111&lt;br /&gt;
; -&amp;gt; save 1 byte and 5 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	neg&lt;br /&gt;
	add a,N   ;you want to calculate N-A&lt;br /&gt;
;Do it this way:&lt;br /&gt;
	cpl&lt;br /&gt;
	add a,N+1    ;neg is practically equivalent to cpl \ inc a&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Never use:&lt;br /&gt;
    ld A,B&lt;br /&gt;
    neg&lt;br /&gt;
;Instead use:&lt;br /&gt;
    xor A&lt;br /&gt;
    sub B&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Never use:&lt;br /&gt;
    ld A,D&lt;br /&gt;
    sub $D3&lt;br /&gt;
    neg&lt;br /&gt;
;Instead use:&lt;br /&gt;
    ld A,$D3&lt;br /&gt;
    sub D&lt;br /&gt;
; -&amp;gt; save 2 bytes and 8 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  sla l&lt;br /&gt;
  rl h         ; I've actually seen this!&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  add hl,hl&lt;br /&gt;
; -&amp;gt; save 3 bytes and 5 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Conditionals ====&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  and 1&lt;br /&gt;
  cp 1&lt;br /&gt;
  jr z,foo&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  and 1         ;and sets zero flag, no need for cp&lt;br /&gt;
  jr nz,foo&lt;br /&gt;
; -&amp;gt; save 2 bytes and 7 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  and 1&lt;br /&gt;
  cp 1         ;a not needed after this&lt;br /&gt;
  jr z,foo&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  rra&lt;br /&gt;
  jr c,foo&lt;br /&gt;
; -&amp;gt; save 3 bytes and 10 T-states (a is rotated)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  bit 0,a&lt;br /&gt;
  call z,foo&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  rra&lt;br /&gt;
  call nc,foo&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states (a is rotated)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  bit 7,a&lt;br /&gt;
  jr z,foo&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  rla&lt;br /&gt;
  jr nc,foo&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states (a is rotated)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  bit 2,a&lt;br /&gt;
  ret nz&lt;br /&gt;
  xor a&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  and %100&lt;br /&gt;
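  ret nz&lt;br /&gt;
; -&amp;gt; save 1 byte and up to 5 T-states (when returning, a holds %100 instead of its old value)&lt;br /&gt;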
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  cp 9        ;if a&amp;lt;=9 then goto label&lt;br /&gt;
  jp c,label&lt;br /&gt;
  jp z,label&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  cp 9+1      ;if a&amp;lt;10 then goto label&lt;br /&gt;
  jp c,label&lt;br /&gt;
; -&amp;gt; save 3 bytes and 10 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Code Flow ====&lt;br /&gt;
&lt;br /&gt;
Almost never call and return...&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
 call xxxx&lt;br /&gt;
 ret&lt;br /&gt;
;try this&lt;br /&gt;
 jp xxxx&lt;br /&gt;
;only do this if the called routine does not use the return address pushed on the stack (some kind of inline vputs does, for example).&lt;br /&gt;
; -&amp;gt; save 1 byte and 17 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Never use:&lt;br /&gt;
    dec B&lt;br /&gt;
    jr NZ,loop    ;I have seen this...&lt;br /&gt;
;Use:&lt;br /&gt;
    djnz loop&lt;br /&gt;
; -&amp;gt; save 1 byte and 3 T-states (djnz does not change the flags)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; Instead of&lt;br /&gt;
 ld a,(Number)&lt;br /&gt;
 cp 0&lt;br /&gt;
 jp z,A_is_0&lt;br /&gt;
 cp 1&lt;br /&gt;
 jp z,A_is_1&lt;br /&gt;
 cp 2&lt;br /&gt;
 jp z,A_is_2&lt;br /&gt;
 cp 3&lt;br /&gt;
 jp z,A_is_3&lt;br /&gt;
 cp 4&lt;br /&gt;
 jp z,A_is_4&lt;br /&gt;
 cp 5&lt;br /&gt;
 jp z,A_is_5&lt;br /&gt;
&lt;br /&gt;
; This is a little better&lt;br /&gt;
 ld a,(Number)&lt;br /&gt;
 or a&lt;br /&gt;
 jp z,A_is_0&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_1&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_2&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_3&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_4&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_5&lt;br /&gt;
&lt;br /&gt;
; Even better&lt;br /&gt;
 ld a,(Number)&lt;br /&gt;
 add a,a   ; a*2 (Number must be below 128)&lt;br /&gt;
 ld h,0 &lt;br /&gt;
 ld l,a &lt;br /&gt;
 ld de,VectorTable&lt;br /&gt;
 add hl,de&lt;br /&gt;
 ld a,(hl)&lt;br /&gt;
 inc hl&lt;br /&gt;
 ld h,(hl)&lt;br /&gt;
 ld l,a&lt;br /&gt;
 jp (hl)&lt;br /&gt;
&lt;br /&gt;
VectorTable:&lt;br /&gt;
 .dw A_is_0&lt;br /&gt;
 .dw A_is_1&lt;br /&gt;
 .dw A_is_2&lt;br /&gt;
 .dw A_is_3&lt;br /&gt;
 .dw A_is_4&lt;br /&gt;
 .dw A_is_5&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
Also see [[Z80 Good Programming Practices]]&lt;br /&gt;
&lt;br /&gt;
Fallthrough looping&lt;br /&gt;
If you need to repeat a routine several times but can't spare registers for a loop counter or unroll the routine, try structuring the routine so it can call itself several times and fall through at the end. For example:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
foo:&lt;br /&gt;
  ld hl, data&lt;br /&gt;
  call bar      ; Run routine once&lt;br /&gt;
  call bar      ; .. twice&lt;br /&gt;
  call bar      ; .. three times&lt;br /&gt;
bar:&lt;br /&gt;
  ld a, (hl)    ; .. fourth and final time&lt;br /&gt;
  inc l         ; assumes data does not cross a 256-byte boundary&lt;br /&gt;
  and $0F&lt;br /&gt;
  out (c), a&lt;br /&gt;
  ret&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Others ===&lt;br /&gt;
&lt;br /&gt;
Toggling values in loops.&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
loop:&lt;br /&gt;
 ld a,2&lt;br /&gt;
;code1&lt;br /&gt;
 ld a,0&lt;br /&gt;
;code2&lt;br /&gt;
 djnz loop&lt;br /&gt;
&lt;br /&gt;
;try this&lt;br /&gt;
 ld a,2&lt;br /&gt;
loop:&lt;br /&gt;
;code1&lt;br /&gt;
 xor $02   ; the trick: xor with the difference of the two values (2 xor 0 = 2) makes a alternate between them on each pass&lt;br /&gt;
;code2&lt;br /&gt;
 djnz loop&lt;br /&gt;
; -&amp;gt; save size and time depending on its use&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
:Table alignment&lt;br /&gt;
&lt;br /&gt;
If you align tables to a 256-byte boundary, you can access the contents by placing the index in a register such as l and the table address in h. This is faster than loading the full unaligned 16-bit address and adding a 16-bit index to it, and makes accessing tables with a size of 256 bytes or less very convenient: &lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld h, (sineTable &amp;gt;&amp;gt; 8) &amp;amp; $FF    ; Get MSB of table&lt;br /&gt;
 ld a, (frame_count)             ; Get index&lt;br /&gt;
 ld l, a&lt;br /&gt;
 ld a, (hl)                      ; Look up value&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Instead of:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld hl, sineTable                ; Get address of table&lt;br /&gt;
 xor a&lt;br /&gt;
 ld d, a                         ; Set index high byte to zero&lt;br /&gt;
 ld a, (frame_count)&lt;br /&gt;
 ld e, a                         ; Set index low byte&lt;br /&gt;
 add hl, de                      ; Add offset to base&lt;br /&gt;
 ld a, (hl)                      ; Look up value&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
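&lt;br /&gt;
The same idea can be stretched to 16-bit entries by splitting the table in two. This is only a sketch under the assumption that two hypothetical 256-byte tables, sineLow and sineHigh, sit on consecutive 256-byte pages, so that moving from one to the other is a single inc h:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld h, (sineLow &amp;gt;&amp;gt; 8) &amp;amp; $FF      ; Get MSB of the low-byte table&lt;br /&gt;
 ld a, (frame_count)             ; Get index&lt;br /&gt;
 ld l, a&lt;br /&gt;
 ld e, (hl)                      ; Low byte of the entry&lt;br /&gt;
 inc h                           ; Same index in sineHigh, one page further&lt;br /&gt;
 ld d, (hl)                      ; High byte of the entry, de = value&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;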
&lt;br /&gt;
=== Size vs. Speed ===&lt;br /&gt;
&lt;br /&gt;
This is the classical optimization trade-off in computer programming, and the Z80 is no exception.&lt;br /&gt;
In ASM, size is most often what matters, because ASM is generally fast enough and it is nice to give users a smaller program that doesn't take up most of the RAM.&lt;br /&gt;
&lt;br /&gt;
==== For the sake of size ====&lt;br /&gt;
&lt;br /&gt;
* Use relative jumps (jr label) whenever possible. When the target is out of reach of a relative jump (outside -128 to 127 bytes) but there is a nearby jp to the same target, do a relative jump to that absolute jump. Example:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;lots of code (more than 128 bytes worth of code)&lt;br /&gt;
somelabel2:&lt;br /&gt;
 jp somelabel&lt;br /&gt;
;less than 128 bytes&lt;br /&gt;
 jr somelabel2   ;instead of an absolute jump directly to somelabel, jump to a jump to somelabel.&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Relative jumps are 2 bytes and absolute jumps are 3. In terms of speed, jp always takes 10 T-states, while jr takes 12 T-states when the jump occurs and 7 when it doesn't, so jp is faster when the jump is taken and jr is faster when it is not.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
 dec bc&lt;br /&gt;
 ld a,b&lt;br /&gt;
 or c&lt;br /&gt;
 ret z&lt;br /&gt;
;try this&lt;br /&gt;
 cpi              ;increments HL&lt;br /&gt;
 ret po&lt;br /&gt;
; save 1 byte at the cost of 2 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Passing inline data'''&lt;br /&gt;
&lt;br /&gt;
When you call a routine, pc + 3 (the address after the call) is pushed. You can pop it and use it as a pointer to data. A very nifty use is with strings. To return, advance the pointer past the data and jp (hl).&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of:&lt;br /&gt;
 ld hl,string&lt;br /&gt;
 bcall(_vputs)&lt;br /&gt;
 ret&lt;br /&gt;
;Try this:&lt;br /&gt;
  call Disp&lt;br /&gt;
  .db &amp;quot;This is some text&amp;quot;,0&lt;br /&gt;
  ret&lt;br /&gt;
;Not a speed optimization, but it eliminates 2-byte pointers, since it just uses the call's return address.&lt;br /&gt;
;It also heavily disturbs disassembly.&lt;br /&gt;
Disp:&lt;br /&gt;
  pop hl&lt;br /&gt;
  bcall(_vputs)&lt;br /&gt;
  jp (hl)&lt;br /&gt;
; -&amp;gt; save 3 bytes for each use (the ld hl,string), but 5 bytes of overhead (Disp routine)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This routine can be expanded to pass the coordinates where the text should appear.&lt;br /&gt;
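&lt;br /&gt;
One possible expansion is sketched below. The routine name DispAt is hypothetical, and the single 16-bit store assumes that penRow directly follows penCol in RAM, as it does in the usual ti83plus.inc equates. Like Disp, it relies on _vputs leaving hl just past the terminating zero:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  call DispAt&lt;br /&gt;
  .db 10,20                  ; pen column, pen row&lt;br /&gt;
  .db &amp;quot;This is some text&amp;quot;,0&lt;br /&gt;
  ret&lt;br /&gt;
&lt;br /&gt;
DispAt:&lt;br /&gt;
  pop hl                     ; hl = address of the inline data&lt;br /&gt;
  ld e,(hl)                  ; column&lt;br /&gt;
  inc hl&lt;br /&gt;
  ld d,(hl)                  ; row&lt;br /&gt;
  inc hl                     ; hl now points at the string&lt;br /&gt;
  ld (penCol),de             ; e goes to penCol, d to penRow&lt;br /&gt;
  bcall(_vputs)&lt;br /&gt;
  jp (hl)                    ; return to the code after the string&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;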
&lt;br /&gt;
'''Wasting time to delay'''&lt;br /&gt;
&lt;br /&gt;
There are times when you need some delay between operations, such as reads/writes to ports, '''''and there is nothing useful to do'''''. Because nops are not very size-friendly, think of other slower but smaller instructions. Example:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
 ld a,KEY_GROUP&lt;br /&gt;
 out (1),a&lt;br /&gt;
 nop&lt;br /&gt;
 nop&lt;br /&gt;
 in a,(1)&lt;br /&gt;
;Try this:&lt;br /&gt;
 ld a,KEY_GROUP&lt;br /&gt;
 out (1),a&lt;br /&gt;
 ld a,(de)    ;a doesn't need to be preserved because it will hold what the port has.&lt;br /&gt;
 in a,(1)&lt;br /&gt;
; -&amp;gt; save 1 byte and 1 T-state (well 1 T-state less is almost the same time)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When you need to delay and cannot afford to alter registers or flags there are still ways to delay that waste less size than nop's :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; 2 bytes, 8 T-states&lt;br /&gt;
 nop&lt;br /&gt;
 nop&lt;br /&gt;
&lt;br /&gt;
; 2 bytes, 12 T-states&lt;br /&gt;
 inc hl&lt;br /&gt;
 dec hl&lt;br /&gt;
&lt;br /&gt;
; 2 bytes, 12 T-states&lt;br /&gt;
 jr $+2&lt;br /&gt;
&lt;br /&gt;
; 2 bytes, 21 T-states&lt;br /&gt;
 push af&lt;br /&gt;
 pop af&lt;br /&gt;
&lt;br /&gt;
; 2 bytes, 38 T-states&lt;br /&gt;
 ex (sp), hl&lt;br /&gt;
 ex (sp), hl&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you need a small adjustable delay:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;4 bytes, b*13+2 T-states including the ld b (variable)&lt;br /&gt;
	ld b,255	; initial delay&lt;br /&gt;
	djnz $		; do it&lt;br /&gt;
;b=0 on exit&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* There are many other instructions that you can use&lt;br /&gt;
* Beware that not all instructions preserve registers or flags&lt;br /&gt;
* For delays between frames of games or other longer delays, you can use the 'halt' instruction if interrupts are enabled. It makes the calculator enter low power mode until an interrupt is triggered. To fine-tune the effect of this delay mechanism, you can alter the interrupt mask and interrupt timer speed beforehand (and possibly restore their values afterwards).&lt;br /&gt;
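&lt;br /&gt;
A minimal sketch of a halt-based delay follows. The count of 10 and the label are arbitrary, and the real duration depends on how often interrupts fire on the calculator in question:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ei              ; halt never wakes up if interrupts are disabled&lt;br /&gt;
 ld b,10         ; number of interrupts to wait for&lt;br /&gt;
waitloop:&lt;br /&gt;
 halt            ; sleep until the next interrupt&lt;br /&gt;
 djnz waitloop&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;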
&lt;br /&gt;
==== Unrolling code ====&lt;br /&gt;
&lt;br /&gt;
'''General Unrolling'''&lt;br /&gt;
You can unroll a loop body several times instead of looping; this is frequently done in multiplication routines.&lt;br /&gt;
This means you are spending memory to gain speed. Most of the time, though, you will prefer size to speed.&lt;br /&gt;
&lt;br /&gt;
'''Unroll commands'''&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; &amp;quot;Classic&amp;quot; way : ~21 T-states per byte copied&lt;br /&gt;
 ld hl,src&lt;br /&gt;
 ld de,dest&lt;br /&gt;
 ld bc,size&lt;br /&gt;
 ldir&lt;br /&gt;
&lt;br /&gt;
; Unrolled : (16 * n + 10) / n T-states per byte with n ldi per pass -&amp;gt; ~17 T-states per byte copied when unrolling 8 times&lt;br /&gt;
 ld hl,src&lt;br /&gt;
 ld de,dest&lt;br /&gt;
 ld bc,size  ; if the size is not a multiple of the number of unrolled ldi then a small trick must be used to jump appropriately inside the loop for the first iteration&lt;br /&gt;
loopldi:    ;you can use this entry for a call&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 jp pe, loopldi    ; jp used as it is faster and in the case of a loop unrolling we assume speed matters more than size&lt;br /&gt;
; ret if this is a subroutine and use the unrolled ldi's with a call.&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
This unrolling of ldi also works with ldd and, with the loop test changed to jp nz (outi counts down in b and sets the zero flag), with outi.&lt;br /&gt;
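&lt;br /&gt;
When the size is known at assembly time, the jump into the middle of the unrolled block can be computed by the assembler. The sketch below assumes a hypothetical nonzero constant COPYSIZE and an assembler that accepts the &amp;amp; operator in expressions. Each ldi is 2 bytes, so the first pass copies only the remainder and every later pass copies a full 8 bytes:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
COPYSIZE = 19      ; 19 = 3 + 2 * 8, so the first pass must copy 3 bytes&lt;br /&gt;
 ld hl,src&lt;br /&gt;
 ld de,dest&lt;br /&gt;
 ld bc,COPYSIZE&lt;br /&gt;
 jp copyloop + 2 * ((8 - (COPYSIZE &amp;amp; 7)) &amp;amp; 7)   ; skip 5 ldi here&lt;br /&gt;
copyloop:&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 jp pe,copyloop    ; P/V stays set while bc is not zero&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;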
&lt;br /&gt;
==== Looping with 16 bit counter ====&lt;br /&gt;
There are two ways to make loops with a 16bit counter :&lt;br /&gt;
* the naive one, which results in smaller code but increased loop overhead (24 * n T-states) and destroys a&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  ld bc, ...&lt;br /&gt;
loop:&lt;br /&gt;
  ; loop body here&lt;br /&gt;
 &lt;br /&gt;
  dec bc&lt;br /&gt;
  ld  a, b&lt;br /&gt;
  or  c&lt;br /&gt;
  jp  nz,loop&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
* the slightly trickier one, which takes a couple more bytes but has a much lower overhead (about 13 * n + 9 * (n / 256) T-states)&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
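  dec  de        ; de holds the 16-bit count&lt;br /&gt;
  ld  b, e&lt;br /&gt;
  inc  b         ; b = inner counter: ((count - 1) &amp;amp; $FF) + 1&lt;br /&gt;
  inc  d         ; d = outer counter: ((count - 1) &amp;gt;&amp;gt; 8) + 1&lt;br /&gt;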
loop2:&lt;br /&gt;
  ; loop body here&lt;br /&gt;
  &lt;br /&gt;
  djnz loop2&lt;br /&gt;
  dec  d&lt;br /&gt;
  jp  nz,loop2&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
The rationale behind the second method is to reduce the overhead of the &amp;quot;inner&amp;quot; loop as much as possible and to use the fact that when b gets down to zero it will be treated as 256 by djnz. &lt;br /&gt;
&lt;br /&gt;
You can therefore use the following macros for setting proper values of 8bit loop counters given a 16bit counter in case you want to do the conversion at compile time :&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  #define inner_counter8(counter16) ((((counter16) - 1) &amp;amp; 0xff) + 1)&lt;br /&gt;
  #define outer_counter8(counter16) ((((counter16) - 1) &amp;gt;&amp;gt; 8) + 1)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
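&lt;br /&gt;
As a worked example, assuming a constant count of 1000 and an assembler that supports macros with arguments as written above, the two counters come out as 232 and 4:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; inner_counter8(1000) = ((1000 - 1) &amp;amp; 0xff) + 1 = 231 + 1 = 232&lt;br /&gt;
; outer_counter8(1000) = ((1000 - 1) &amp;gt;&amp;gt; 8) + 1 = 3 + 1 = 4&lt;br /&gt;
; total = 232 + 3 * 256 = 1000 iterations&lt;br /&gt;
  ld  b, inner_counter8(1000)   ; the first pass is the partial one&lt;br /&gt;
  ld  d, outer_counter8(1000)   ; number of passes through djnz&lt;br /&gt;
loop3:&lt;br /&gt;
  ; loop body here&lt;br /&gt;
&lt;br /&gt;
  djnz loop3&lt;br /&gt;
  dec  d&lt;br /&gt;
  jp  nz,loop3&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;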
&lt;br /&gt;
=== Preserve Registers ===&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; description: both routines compare b to 0, same size and speed but the second preserves accumulator&lt;br /&gt;
; remarks: - inc/dec doesn't affect carry flag&lt;br /&gt;
;          - inc/dec doesn't affect any flags on 16-bit registers, so do not extrapolate to 16-bit registers.&lt;br /&gt;
	ld a,b&lt;br /&gt;
	or b&lt;br /&gt;
	jr z,label&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
	inc b&lt;br /&gt;
	dec b&lt;br /&gt;
	jr z,label&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; description: add a to hl without using a 16-bit register&lt;br /&gt;
;normal way:&lt;br /&gt;
	ld d,$00&lt;br /&gt;
	ld e,a&lt;br /&gt;
	add hl,de&lt;br /&gt;
;4 bytes and 22 clock cycles&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
	add a,l&lt;br /&gt;
	ld l,a&lt;br /&gt;
	jr nc, $+3&lt;br /&gt;
	inc h&lt;br /&gt;
;5 bytes, 19/20 clock cycles&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Setting flags ==&lt;br /&gt;
On some occasions you may want to selectively set or reset a flag.&lt;br /&gt;
&lt;br /&gt;
Here are the most common uses:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; set Carry flag&lt;br /&gt;
 scf&lt;br /&gt;
&lt;br /&gt;
; reset Carry flag (alters Sign and Zero flags as defined)&lt;br /&gt;
 or a&lt;br /&gt;
&lt;br /&gt;
; alternate reset Carry flag (alters Sign and Zero flags as defined)&lt;br /&gt;
 and a&lt;br /&gt;
&lt;br /&gt;
; set Zero flag (resets Carry flag, alters Sign flag as defined)&lt;br /&gt;
 cp a&lt;br /&gt;
&lt;br /&gt;
; reset Zero flag (alters a, reset Carry flag, alters Sign flag as defined)&lt;br /&gt;
 or 1&lt;br /&gt;
&lt;br /&gt;
; set Sign flag (negative) (alters a, reset Zero and Carry flags)&lt;br /&gt;
 or $80&lt;br /&gt;
&lt;br /&gt;
; reset Sign flag (positive) (set a to zero, set Zero flag, reset Carry flag)&lt;br /&gt;
 xor a&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Other possible uses (much rarer) :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Set parity/overflow (even):&lt;br /&gt;
 xor a&lt;br /&gt;
&lt;br /&gt;
;Reset parity/overflow (odd):&lt;br /&gt;
 sub a&lt;br /&gt;
&lt;br /&gt;
;Set half carry (hardly ever useful but still...)&lt;br /&gt;
 and a&lt;br /&gt;
&lt;br /&gt;
;Reset half carry (hardly ever useful but still...)&lt;br /&gt;
 or a&lt;br /&gt;
&lt;br /&gt;
;Set bit 5 of f:&lt;br /&gt;
 or %00100000&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
As you can see, these are extremely simple, small and fast ways to alter flags,&lt;br /&gt;
which makes them interesting as outputs of routines to indicate error/success or&lt;br /&gt;
other status bits that do not require a full register.&lt;br /&gt;
&lt;br /&gt;
If you use them, remember that these flag (re)setting tricks frequently&lt;br /&gt;
overlap, so a special combination of flags may require slightly&lt;br /&gt;
more elaborate tricks. As a rule of thumb, always alter the carry last in&lt;br /&gt;
such cases, because the scf and ccf instructions leave the Sign, Zero and P/V flags untouched.&lt;br /&gt;
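&lt;br /&gt;
A minimal sketch of a routine that reports its status through the carry flag follows. The name isdigit and the convention &amp;quot;carry set means error&amp;quot; are assumptions chosen for the example:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; input:  a = character&lt;br /&gt;
; output: carry reset if a is '0'..'9', carry set otherwise (a is preserved)&lt;br /&gt;
isdigit:&lt;br /&gt;
 cp '0'&lt;br /&gt;
 ret c        ; a is below '0': carry is already set&lt;br /&gt;
 cp '9'+1     ; carry set if a is at most '9'&lt;br /&gt;
 ccf          ; invert it: carry reset = digit, carry set = error&lt;br /&gt;
 ret&lt;br /&gt;
&lt;br /&gt;
; caller&lt;br /&gt;
 call isdigit&lt;br /&gt;
 jr c,notdigit   ; notdigit is a hypothetical label&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
When a routine has to return both the Zero flag and the carry, set the Zero flag first (cp a) and fix the carry afterwards (scf), since scf leaves the Zero flag alone.&lt;br /&gt;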
&lt;br /&gt;
More advanced ways of manipulating flags follow:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;get the zero flag in carry &lt;br /&gt;
	scf&lt;br /&gt;
	jr z,$+3&lt;br /&gt;
	ccf&lt;br /&gt;
&lt;br /&gt;
;Put carry flag into zero flag (destroys a).&lt;br /&gt;
	ccf&lt;br /&gt;
	sbc a, a&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Tools of the job ==&lt;br /&gt;
&lt;br /&gt;
Want to test your optimizations or try new ones? Then keep these at hand:&lt;br /&gt;
* Keep a Z80 instruction set reference nearby so you do not overlook a useful instruction or the flags it affects (see [[Z80_Instruction_Set|Z80_Instruction_Set]]).&lt;br /&gt;
* Use an assembler that supports &amp;quot;.echo&amp;quot; and use it in the source to report code size (see [[Assemblers|Assemblers]]):&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;SomeCodeorData:&lt;br /&gt;
;code or data goes here&lt;br /&gt;
End:&lt;br /&gt;
 .echo &amp;quot;size of the code/data:&amp;quot;&lt;br /&gt;
 .echo End-SomeCodeorData&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
* Get a nice Z80 IDE that counts code size ([[IDEs|IDE's]])&lt;br /&gt;
* Make use of the counting capabilities of an emulator ([[:Category:Emulators|Emulators]])&lt;br /&gt;
&lt;br /&gt;
== Crazy, &amp;quot;magick&amp;quot;, hacks and obscure optimization's tricks ==&lt;br /&gt;
&lt;br /&gt;
These are not normally recommended, because some of them confuse disassemblers and even the programmers trying to understand the code.&lt;br /&gt;
&lt;br /&gt;
=== Better else ===&lt;br /&gt;
So you normally have an if-else-endif block like this:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
jr nz,else    ;the IF&lt;br /&gt;
;some code&lt;br /&gt;
jr endif&lt;br /&gt;
else:&lt;br /&gt;
;some code&lt;br /&gt;
endif:&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
But here is a crazy trick for when the Else code is a single 2-byte instruction:&lt;br /&gt;
replace the &amp;quot;jr endif&amp;quot; line with the first byte of a 3-byte instruction that has no side effects!&lt;br /&gt;
So if you had code like this:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
cp 7&lt;br /&gt;
jr nz,else&lt;br /&gt;
ld a,3&lt;br /&gt;
jr endif&lt;br /&gt;
else:&lt;br /&gt;
ld a,4&lt;br /&gt;
endif:&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You could replace it with this:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
cp 7&lt;br /&gt;
jr nz,else&lt;br /&gt;
ld a,3&lt;br /&gt;
.db $C2  ;jp nz,xxxx&lt;br /&gt;
else:&lt;br /&gt;
ld a,4&lt;br /&gt;
endif:&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Instead of branching over the ld a,4 instruction, the CPU now executes a jp nz,XXXX instruction whose address XXXX is made of the two bytes of the next instruction. You already know what the flags will be at this point (Z is set, since jr nz,else was not taken), so the jump is never taken. The net effect is to skip the next two bytes without branching over them.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The skip now takes 10 T-states instead of 12 and saves 1 byte (jr endif is 2 bytes, the .db is 1). A small saving, but it can be useful in tight loops.&lt;br /&gt;
The same idea works for a 1-byte Else instruction, using the first byte of a 2-byte instruction instead; the only reasons not to use it are code readability and bug safety. Watch those flags!&lt;br /&gt;
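&lt;br /&gt;
A sketch of the 1-byte variant, assuming the flags are not needed after the block: $FE is the opcode of cp n, so it swallows the following 1-byte instruction as its operand and only changes the flags.&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
or a&lt;br /&gt;
jr nz,else&lt;br /&gt;
inc b&lt;br /&gt;
.db $FE  ;cp n, the operand is the dec b opcode below&lt;br /&gt;
else:&lt;br /&gt;
dec b&lt;br /&gt;
endif:&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;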
&lt;br /&gt;
=== Conditional rst ===&lt;br /&gt;
&lt;br /&gt;
For a smaller conditional rst $38, use jr cc, -1. This will cause a conditional jump to the displacement byte ($FF) which is the rst $38 opcode. &lt;br /&gt;
&lt;br /&gt;
=== DAA trick ===&lt;br /&gt;
&lt;br /&gt;
The DAA instruction is normally used for BCD math, but it can also convert a value from 0 to 15 in a into its ASCII hexadecimal digit ('0'-'9', 'A'-'F'):&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
	cp 10&lt;br /&gt;
	ccf&lt;br /&gt;
	adc a, 30h&lt;br /&gt;
	daa&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
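&lt;br /&gt;
As a usage sketch, the four instructions can be wrapped in a helper that converts a whole byte into two digits. The routine names and the choice of d and e for the results are assumptions made for the example:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; input:  a = byte&lt;br /&gt;
; output: d = ASCII digit of the high nibble, e = ASCII digit of the low nibble&lt;br /&gt;
bytetohex:&lt;br /&gt;
	ld e,a&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca		; high nibble is now in the low 4 bits&lt;br /&gt;
	call nibble&lt;br /&gt;
	ld d,a&lt;br /&gt;
	ld a,e&lt;br /&gt;
	call nibble&lt;br /&gt;
	ld e,a&lt;br /&gt;
	ret&lt;br /&gt;
nibble:&lt;br /&gt;
	and $0F		; keep one nibble: 0..15&lt;br /&gt;
	cp 10&lt;br /&gt;
	ccf&lt;br /&gt;
	adc a,$30&lt;br /&gt;
	daa		; 5 -&amp;gt; $35 '5', 10 -&amp;gt; $41 'A', 15 -&amp;gt; $46 'F'&lt;br /&gt;
	ret&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;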
&lt;br /&gt;
== Related topics ==&lt;br /&gt;
* [http://www.junemann.nl/maxcoderz/viewtopic.php?f=5&amp;amp;t=675 MaxCoderz TI-ASM optimization]&lt;br /&gt;
* ticalc archives: [http://www.ticalc.org/archives/files/fileinfo/108/10821.html 1] [http://www.ticalc.org/archives/files/fileinfo/285/28502.html 2]&lt;br /&gt;
* [http://www.ballyalley.com/ml/z80_docs/z80_docs.html Bally Alley Z80 Machine Language Documentation]&lt;br /&gt;
* [http://map.grauw.nl/articles/fast_loops.php Fast loops in MSX Assembly Page]&lt;br /&gt;
* [http://shiar.nl/calc/z80/optimize Shiar z80 optimization page]&lt;br /&gt;
* [http://www.smspower.org/dev/docs/wiki/?n=Z80.ProgrammingTechniques SMS Power! dev wiki z80 Techniques]&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
* fullmetalcoder&lt;br /&gt;
* Galandros&lt;br /&gt;
* Dwedit for sharing in MaxCoderz the &amp;quot;Better else&amp;quot;&lt;br /&gt;
* MaxCoderz participants in assembly optimizing topic (Jim e,CoBB,...)&lt;br /&gt;
* SMS Power wiki&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Z80_Optimization</id>
		<title>Z80 Optimization</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Z80_Optimization"/>
				<updated>2010-06-18T13:30:36Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: update with what I found&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
Sometimes you need some extra speed in ASM, or you need to make your game smaller so it fits on the calculator. Typical examples are programs that are heavy on graphics or data, and graphics code such as tile mapping, grayscale and 3D graphics.&lt;br /&gt;
&lt;br /&gt;
If you are just looking to cut a few bytes, go straight to the Small Tricks section of this page.&lt;br /&gt;
&lt;br /&gt;
== Registers and Memory ==&lt;br /&gt;
Good z80 algorithms generally use the registers in an appropriate way.&lt;br /&gt;
It is also good practice to keep a convention and to plan how you are going to use the registers.&lt;br /&gt;
&lt;br /&gt;
General use of registers:&lt;br /&gt;
* a - 8-bit accumulator&lt;br /&gt;
* b - counter&lt;br /&gt;
* c,d,e,h,l - auxiliary to the accumulator, or a copy of b or a&lt;br /&gt;
&lt;br /&gt;
* hl - 16-bit accumulator/pointer to a memory address&lt;br /&gt;
* de - pointer to a destination memory address&lt;br /&gt;
* bc - 16-bit counter&lt;br /&gt;
* ix - index register/pointer to a table in memory/saved copy of hl/pointer to memory when hl and de are in use&lt;br /&gt;
* iy - index register/pointer to a table in memory (use it when there is no other option or when you need optimal execution; disable interrupts and restore the original value on exit, because TI-OS uses it)&lt;br /&gt;
&lt;br /&gt;
=== 8-bit vs. 16-bit Operations ===&lt;br /&gt;
&lt;br /&gt;
The z80 processor is faster at operations on 8-bit values.&lt;br /&gt;
Code dealing with 16-bit registers tends to be bigger and slower, because the equivalent 16-bit instruction is slower, or does not exist and has to be replaced with several instructions. Sometimes the equivalent 16-bit instruction is also 1 byte bigger.&lt;br /&gt;
Operations on the ix or iy registers are even slower and are always 1 byte bigger per instruction. So try to convert your code to use hl and de instead of ix and iy.&lt;br /&gt;
&lt;br /&gt;
In a practical example, imagine:&lt;br /&gt;
- you pass through the accumulator a value to a routine&lt;br /&gt;
- if the only valid values of the accumulator range from 0 to 63 and the routine needs to multiply the accumulator by, say, 12, the result has to be stored in a 16-bit register pair (63*12 = 756).&lt;br /&gt;
- but you can multiply a by 4 before overflowing (63*4 = 252, which is smaller than 255) and take advantage of this to optimize&lt;br /&gt;
&lt;br /&gt;
Now the code:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; The most usual way is to move A (the accumulator) into HL right at the start&lt;br /&gt;
	ld h,0&lt;br /&gt;
	ld l,a&lt;br /&gt;
	add a,a&lt;br /&gt;
	ld d,h&lt;br /&gt;
	ld e,a&lt;br /&gt;
	add hl,de&lt;br /&gt;
	add hl,hl&lt;br /&gt;
	add hl,hl	; hl=a*12&lt;br /&gt;
; 9 bytes, 56 clocks&lt;br /&gt;
&lt;br /&gt;
; But given a is between 0 and 63 you can multiply by 4 without overflowing the 8-bit limit (255)&lt;br /&gt;
	add a,a&lt;br /&gt;
	add a,a		; a*4&lt;br /&gt;
	ld l,a&lt;br /&gt;
	ld e,a&lt;br /&gt;
	ld h,0&lt;br /&gt;
	ld d,h		; hl=a*4 and de=a*4&lt;br /&gt;
	add hl,hl	; hl=a*8&lt;br /&gt;
	add hl,de	; hl=a*12&lt;br /&gt;
; 9 bytes, 49 clocks&lt;br /&gt;
&lt;br /&gt;
; hey, minus 7 clock cycles&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In this example you only shaved a few clock cycles, but sometimes you can save some bytes, too.&lt;br /&gt;
You can do this with registers other than the accumulator.&lt;br /&gt;
&lt;br /&gt;
For example, if the value is passed in l and is always lower than 64, you can do &amp;quot; sla l \ sla l \ ld h,0 &amp;quot; to multiply l by four and then use hl for 16-bit operations. In this case you are trading size for speed: each sla instruction is 2 bytes, while add hl,hl is only 1 byte.&lt;br /&gt;
&lt;br /&gt;
Mind that these optimizations can introduce bugs and make the code somewhat hard to follow, so comment them.&lt;br /&gt;
I recommend applying this kind of optimization only when you really need the speed and the code is bug free.&lt;br /&gt;
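&lt;br /&gt;
The same reasoning can keep a whole computation in 8 bits. A minimal sketch, assuming a is known to be at most 25 so that a*10 (at most 250) still fits in the accumulator:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; a = 0..25 (assumption), result: a = a*10, destroys b&lt;br /&gt;
	add a,a		; a*2&lt;br /&gt;
	ld b,a&lt;br /&gt;
	add a,a		; a*4&lt;br /&gt;
	add a,a		; a*8&lt;br /&gt;
	add a,b		; a*8 + a*2 = a*10&lt;br /&gt;
; 5 bytes, 20 clocks&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;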
&lt;br /&gt;
&lt;br /&gt;
One common trick for multiplying by 256 is to simply load the low byte register into the high byte register. This works because in binary a multiplication by 256 is a shift left by 8 bits, with zeros entering from the right. Examples:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; multiply a by 256 and store in hl&lt;br /&gt;
	ld h,a&lt;br /&gt;
	ld l,0&lt;br /&gt;
; multiply hl by 256 and store in ade (pseudo 24-bit pair register)&lt;br /&gt;
	ld a,h&lt;br /&gt;
	ld d,l&lt;br /&gt;
	ld e,0&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
If you are out of registers, try using ixh/ixl/iyh/iyl and even the i register for loop counters instead of maintaining a counter in memory or pushing/popping an already used register inside a loop. Note that ixh/ixl/iyh/iyl are undocumented and that using them breaks compatibility with the TI-84+SE emulated by the Nspire. You can only use the i register for other purposes if you disable interrupts first (di).&lt;br /&gt;
&lt;br /&gt;
=== Shadow registers ===&lt;br /&gt;
&lt;br /&gt;
In some rare cases, when you run out of registers and cannot either refactor your algorithm(s) or rely on RAM storage, you may want to use the shadow registers: af', bc', de' and hl'.&lt;br /&gt;
&lt;br /&gt;
These registers behave like their &amp;quot;standard&amp;quot; counterparts (af, bc, de, hl) and you can swap the two register sets using the following instructions:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ex af, af'  ; swaps af and af' as the mnemonic indicates&lt;br /&gt;
&lt;br /&gt;
 exx         ; swaps bc, de, hl and bc', de', hl'&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Shadow registers are fairly common for arithmetic on big integers (16-bit to 32-bit) or BCD operations, without relying on RAM storage or pushing and popping to the stack. Example:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
MUL32:&lt;br /&gt;
        DI&lt;br /&gt;
        AND     A               ; RESET CARRY FLAG&lt;br /&gt;
        SBC     HL,HL           ; LOWER RESULT = 0&lt;br /&gt;
        EXX&lt;br /&gt;
        SBC     HL,HL           ; HIGHER RESULT = 0&lt;br /&gt;
        LD      A,B             ; MPR IS AC'BC&lt;br /&gt;
        LD      B,32            ; INITIALIZE LOOP COUNTER&lt;br /&gt;
MUL32LOOP:&lt;br /&gt;
        SRA     A               ; RIGHT SHIFT MPR&lt;br /&gt;
        RR      C&lt;br /&gt;
        EXX&lt;br /&gt;
        RR      B&lt;br /&gt;
        RR      C               ; LOWEST BIT INTO CARRY&lt;br /&gt;
        JR      NC,MUL32NOADD&lt;br /&gt;
        ADD     HL,DE           ; RESULT += MPD&lt;br /&gt;
        EXX&lt;br /&gt;
        ADC     HL,DE&lt;br /&gt;
        EXX&lt;br /&gt;
MUL32NOADD:&lt;br /&gt;
        SLA     E               ; LEFT SHIFT MPD&lt;br /&gt;
        RL      D&lt;br /&gt;
        EXX&lt;br /&gt;
        RL      E&lt;br /&gt;
        RL      D&lt;br /&gt;
        DJNZ    MUL32LOOP&lt;br /&gt;
        EXX&lt;br /&gt;
       &lt;br /&gt;
; RESULT IN H'L'HL&lt;br /&gt;
        RET&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Shadow registers can be of great help but they come with two drawbacks:&lt;br /&gt;
&lt;br /&gt;
* they cannot coexist with the &amp;quot;standard&amp;quot; registers: you cannot use ld to copy from a standard register to a shadow one or vice versa. Instead you must use nasty constructs such as:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; loads hl' with the contents of hl&lt;br /&gt;
 push hl&lt;br /&gt;
 exx&lt;br /&gt;
 pop hl&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* they require interrupts to be disabled, since they were originally intended for use in interrupt service routines. There are situations where this is affordable and others where it isn't. Regardless, it is generally good policy to restore the previous interrupt status (enabled/disabled) upon return instead of leaving it up to the caller. Fortunately this is relatively easy to do (though it does add 4 bytes and 29/33 T-states to the routine):&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  ld a, i  ; this is the core of the trick, it sets P/V to the value of IFF so P/V is set iff interrupts were enabled at that point&lt;br /&gt;
  push af  ; save flags&lt;br /&gt;
  di       ; disable interrupts&lt;br /&gt;
  &lt;br /&gt;
  ; do something with shadow registers here&lt;br /&gt;
&lt;br /&gt;
  pop af   ; get back flags&lt;br /&gt;
  ret po   ; po = P/V reset so in this case it means interrupts were disabled before the routine was called&lt;br /&gt;
  ei       ; re-enable interrupts&lt;br /&gt;
  ret&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
: Note that this produces ugly code that is very hard to follow, so comment it well for later understanding and debugging.&lt;br /&gt;
&lt;br /&gt;
=== SP register ===&lt;br /&gt;
&lt;br /&gt;
This register is used in desperate situations, generally in an interrupt loop that demands as much speed as possible while all the normal registers are already in use (notably in James Montelongo's 4-level grayscale interlace in graylib2.inc).&lt;br /&gt;
You need to know these valid but not widely known instructions:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld sp,6&lt;br /&gt;
 add hl,sp&lt;br /&gt;
 sbc hl,sp&lt;br /&gt;
 inc sp&lt;br /&gt;
 dec sp&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now an example of such a situation:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld (saveSP),sp&lt;br /&gt;
;init hl,de,bc,a&lt;br /&gt;
 ld sp,6&lt;br /&gt;
loop:&lt;br /&gt;
;code&lt;br /&gt;
 add hl,sp  ;get next row of a table for example&lt;br /&gt;
;code using bc,de,ix,a&lt;br /&gt;
 ld a,b&lt;br /&gt;
 or c&lt;br /&gt;
 jp nz,loop&lt;br /&gt;
;code&lt;br /&gt;
 ld sp,(saveSP)&lt;br /&gt;
 ret    ;finish interrupt&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt; &lt;br /&gt;
&lt;br /&gt;
When you use sp in this way, you cannot push/pop registers and no calls are allowed.&lt;br /&gt;
Mind again that this is only used as a last resort. Don't forget to save and restore sp as the example shows.&lt;br /&gt;
&lt;br /&gt;
=== Stack ===&lt;br /&gt;
&lt;br /&gt;
When you run out of registers, the stack may offer an interesting alternative to fixed RAM locations for temporary storage.&lt;br /&gt;
&lt;br /&gt;
==== Allocation ====&lt;br /&gt;
&lt;br /&gt;
You can allocate stack space with repeated push instructions, which lets you initialize the data but restricts the allocated space to multiples of 2.&lt;br /&gt;
An alternative is to allocate uninitialized stack space (hl may be replaced with an index register):&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; allocates 7 bytes of stack space : 5 bytes, 27 T-states instead of 4 bytes, 44 T-states with 4 push which would have forced the alloc of 8 bytes&lt;br /&gt;
 ld hl, -7&lt;br /&gt;
 add hl, sp&lt;br /&gt;
 ld sp, hl&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Access ====&lt;br /&gt;
&lt;br /&gt;
The most common way of accessing data allocated on the stack is through an index register, since all allocated &amp;quot;variables&amp;quot; can then be accessed without inc/dec, but this is not a strict requirement. Beware though: using stack space is not always optimal in terms of speed, depending (among other things) on your register allocation strategy:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; 3 bytes, 19 T-states&lt;br /&gt;
 ld c, (ix + n)   ; n is an immediate value in -128..127&lt;br /&gt;
 &lt;br /&gt;
 ; 4 bytes, 17 T-states, destroys a&lt;br /&gt;
 ld a, (somelocation)&lt;br /&gt;
 ld c, a&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If your needs go beyond simple load/store, however, this method starts to show its real power, since it vastly simplifies some operations that are complicated to do with fixed storage locations (and that generally clobber registers in the process).&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; 3 bytes, 19 T-states&lt;br /&gt;
 cp (ix + n)&lt;br /&gt;
&lt;br /&gt;
 sub (ix + n)&lt;br /&gt;
 sbc a, (ix + n)&lt;br /&gt;
 add a, (ix + n)&lt;br /&gt;
 adc a, (ix + n)&lt;br /&gt;
&lt;br /&gt;
 ; inc/dec are 3 bytes as well, but take 23 T-states&lt;br /&gt;
 inc (ix + n)&lt;br /&gt;
 dec (ix + n)&lt;br /&gt;
&lt;br /&gt;
 and (ix + n)&lt;br /&gt;
 or (ix + n)&lt;br /&gt;
 xor (ix + n)&lt;br /&gt;
&lt;br /&gt;
 ; 4 bytes, 23 T-states&lt;br /&gt;
 rl (ix + n)&lt;br /&gt;
 rr (ix + n)&lt;br /&gt;
 rlc (ix + n)&lt;br /&gt;
 rrc (ix + n)&lt;br /&gt;
 sla (ix + n)&lt;br /&gt;
 sra (ix + n)&lt;br /&gt;
 sll (ix + n)      ; undocumented instruction&lt;br /&gt;
 srl (ix + n)&lt;br /&gt;
 bit k, (ix + n)   ; k is an immediate value in 0..7; bit only takes 20 T-states&lt;br /&gt;
 set k, (ix + n)&lt;br /&gt;
 res k, (ix + n)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Again, choose wisely between hl and an index register: depending on the structure of your data, the smallest/fastest solution may vary (hl equivalent instructions are generally 2 bytes smaller and 12 T-states faster, but do not allow indexing and so may require intermediate inc/dec).&lt;br /&gt;
&lt;br /&gt;
==== Deallocation ====&lt;br /&gt;
&lt;br /&gt;
If you need to pop an entry from the stack but must preserve all registers, remember that sp can be incremented/decremented like any 16-bit register:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; drops the top stack entry: wastes 1 byte and 2 T-states compared to a pop, but may enable better register allocation...&lt;br /&gt;
 inc sp&lt;br /&gt;
 inc sp&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you have a large amount of stack space to drop and a spare 16 bit register (hl, index, or de that you can easily swap with hl) :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; drop 16 bytes of stack space : 5 bytes, 27 T-states instead of 8 bytes, 80 T-states for 8 pop&lt;br /&gt;
 ld hl, 16&lt;br /&gt;
 add hl, sp&lt;br /&gt;
 ld sp, hl&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt; &lt;br /&gt;
The larger the space to drop, the more T-states you will save, and at some point you'll start saving space as well (beyond 10 bytes, where the 5-byte sequence becomes shorter than the equivalent pops).&lt;br /&gt;
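&lt;br /&gt;
Putting allocation, access and deallocation together, a routine with a small stack frame could look like the following sketch. The frame size of 6 bytes and the use made of the locals are arbitrary choices for the example:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
routine:&lt;br /&gt;
 push ix          ; preserve the caller's ix&lt;br /&gt;
 ld ix, -6&lt;br /&gt;
 add ix, sp       ; ix = sp - 6&lt;br /&gt;
 ld sp, ix        ; allocate 6 bytes, ix points to the first one&lt;br /&gt;
&lt;br /&gt;
 ld (ix + 0), 10  ; initialize a local counter&lt;br /&gt;
 ld (ix + 1), a   ; keep a copy of a in another local&lt;br /&gt;
 ; ... body: push/pop and call still work, the frame lies above sp ...&lt;br /&gt;
 dec (ix + 0)&lt;br /&gt;
&lt;br /&gt;
 ld hl, 6&lt;br /&gt;
 add hl, sp&lt;br /&gt;
 ld sp, hl        ; drop the 6 bytes (destroys hl)&lt;br /&gt;
 pop ix&lt;br /&gt;
 ret&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;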
&lt;br /&gt;
== General Algorithms ==&lt;br /&gt;
&lt;br /&gt;
Register and memory use is very important for writing concise and fast z80 code. Then comes general optimization.&lt;br /&gt;
&lt;br /&gt;
First, try to optimize the most frequently executed code: subroutines and large loops. Finding the bottleneck and fixing it is enough for many programs.&lt;br /&gt;
&lt;br /&gt;
Do not forget that in z80 assembly vector tables (or lookup tables) often give smaller and faster code than blocks of comparisons and jumps. At other times using a chunk of data for a task is better than a more usual programming method (notably in graphics screen effects).&lt;br /&gt;
See [[Z80 Good Programming Practices]] for examples.&lt;br /&gt;
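&lt;br /&gt;
As a minimal sketch of the idea, a jump table can replace a chain of cp/jp pairs. The labels cmd0 to cmd3 are hypothetical handlers, and a is assumed to be already checked to be in the range 0..3:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
dispatch:&lt;br /&gt;
 add a, a       ; each table entry is 2 bytes&lt;br /&gt;
 ld e, a&lt;br /&gt;
 ld d, 0&lt;br /&gt;
 ld hl, table&lt;br /&gt;
 add hl, de     ; hl points to the entry&lt;br /&gt;
 ld a, (hl)&lt;br /&gt;
 inc hl&lt;br /&gt;
 ld h, (hl)&lt;br /&gt;
 ld l, a        ; hl = address of the handler&lt;br /&gt;
 jp (hl)&lt;br /&gt;
table:&lt;br /&gt;
 .dw cmd0, cmd1, cmd2, cmd3&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
The dispatch cost stays the same however many entries the table has, whereas a chain of comparisons grows with each case.&lt;br /&gt;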
&lt;br /&gt;
Look through a complete instruction set for instructions that could optimize some part of the code.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
A list of things to keep in mind:&lt;br /&gt;
* Rework conditionals to be more efficient.&lt;br /&gt;
* Make sure the most common checks come first. In other words, check the special and rare cases last.&lt;br /&gt;
* Move special-case checks out of the main loop if they aren't needed there.&lt;br /&gt;
* Rearrange program flow.&lt;br /&gt;
* If you can afford a bigger setup overhead in exchange for moving code out of the main loop, do it.&lt;br /&gt;
* When it seems that your code won't be efficient enough even with optimization, try another approach or algorithm. Search for other algorithms on Wikipedia, for instance.&lt;br /&gt;
* Rewriting code from scratch can bring new ideas (use in desperate situations, because of all the work needed to write it)&lt;br /&gt;
* Remember that it is almost always better to leave optimization to the end. Optimizing too early can bring headaches with crashes and debugging. And because ASM is very fast and often smaller than higher-level languages, further optimization may not be needed.&lt;br /&gt;
* Document wacky optimizations to understand the code later (z80 optimization leads to very hard code to understand)&lt;br /&gt;
&lt;br /&gt;
== Self Modifying Code ==&lt;br /&gt;
&lt;br /&gt;
If your code runs from RAM, it can write to itself to change its own instructions. Having an instruction set reference that lists the opcodes is useful.&lt;br /&gt;
Although self-modifying code can be applied to any instruction, it is most common with instructions that load constants into registers.&lt;br /&gt;
&lt;br /&gt;
Generally it is used to save a value for later use (often seen with masks). Examples:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld (savemask),a&lt;br /&gt;
;...code...&lt;br /&gt;
savemask = $+1&lt;br /&gt;
 ld a,$00   ; $00 is just a placeholder&lt;br /&gt;
&lt;br /&gt;
 ld (something),hl&lt;br /&gt;
;... code&lt;br /&gt;
something = $+1&lt;br /&gt;
 ld de,$0000&lt;br /&gt;
&lt;br /&gt;
 ld (saveSP),sp&lt;br /&gt;
;... code ...&lt;br /&gt;
saveSP = $+1&lt;br /&gt;
 ld sp,$0000  ; restore sp&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SMC (Self Modifying Code) is often used with unrolling and relative jumps. Example:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld (jpmodify),a&lt;br /&gt;
;...&lt;br /&gt;
jpmodify = $+1&lt;br /&gt;
 jr $00&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Another SMC technique is to modify load instructions that use (ix+0), changing the 0 to another value, to quickly read or write the nth element of a list without using any extra registers.&lt;br /&gt;
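&lt;br /&gt;
A sketch of that last idea, assuming the code runs from RAM, ix points to the start of the list and a holds an index from 0 to 127 (the displacement byte is signed). The label getelem is hypothetical:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld (getelem+2),a   ; patch the displacement, the third byte of the instruction below&lt;br /&gt;
getelem:&lt;br /&gt;
 ld b,(ix+0)        ; encoded as $DD $46 dd, the 0 is only a placeholder&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;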
&lt;br /&gt;
== Small Tricks ==&lt;br /&gt;
&lt;br /&gt;
Note that the following tricks act much like a peephole optimizer and are the last optimization step: remember to first optimize your algorithm and register allocation before applying any of them if you really want the fastest and smallest code.&lt;br /&gt;
&lt;br /&gt;
Also note that nearly every trick makes the code less understandable, so documenting them is a good idea. You can easily forget how a part of the code works after not reading it for a while.&lt;br /&gt;
&lt;br /&gt;
Be warned that some tricks are not exactly equivalent to the normal way and may have exceptions to their use; the comments warn about them. Some tricks apply to other cases, but again you have to be careful.&lt;br /&gt;
&lt;br /&gt;
Some tricks are nothing more than the correct use of the instructions available on the z80. Keeping an instruction set summary at hand helps you see what you can do while coding.&lt;br /&gt;
&lt;br /&gt;
=== Optimize size and speed ===&lt;br /&gt;
&lt;br /&gt;
==== Loading stuff ====&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of:&lt;br /&gt;
 ld a,0&lt;br /&gt;
;Try this:&lt;br /&gt;
 xor a    ;disadvantages: changes flags&lt;br /&gt;
;or&lt;br /&gt;
 sub a    ;disadvantages: changes flags&lt;br /&gt;
; -&amp;gt; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	ld b,$20&lt;br /&gt;
	ld c,$30&lt;br /&gt;
;try this&lt;br /&gt;
	ld bc,$2030&lt;br /&gt;
;or this&lt;br /&gt;
	ld bc,(b_num * 256) + c_num		;where b_num goes to b register and c_num to c register&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
  ld a,$42&lt;br /&gt;
  ld (hl),a&lt;br /&gt;
;try this&lt;br /&gt;
  ld (hl),$42&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	xor a&lt;br /&gt;
	ld (data1),a&lt;br /&gt;
	ld (data2),a&lt;br /&gt;
	ld (data3),a&lt;br /&gt;
	ld (data4),a&lt;br /&gt;
	ld (data5),a	;if data1 to data5 are one after the other&lt;br /&gt;
;try this&lt;br /&gt;
	ld hl,data1&lt;br /&gt;
	ld de,data1+1&lt;br /&gt;
	xor a&lt;br /&gt;
	ld (hl),a&lt;br /&gt;
	ld bc,4&lt;br /&gt;
	ldir&lt;br /&gt;
; -&amp;gt; save 3 bytes for every ld (dataX),a once past the initial overhead (size only: ldir is slower)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	ld a,(var)&lt;br /&gt;
	inc a&lt;br /&gt;
	ld (var),a&lt;br /&gt;
;try this	;Note: if hl is not tied up, use indirection:&lt;br /&gt;
	ld hl,var&lt;br /&gt;
	inc (hl)&lt;br /&gt;
	ld a,(hl) ;if you don't need (hl) in a, delete this line&lt;br /&gt;
; -&amp;gt; save 2 bytes and 2 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; Instead of :&lt;br /&gt;
 ld a, (hl)&lt;br /&gt;
 ld (de), a&lt;br /&gt;
 inc hl&lt;br /&gt;
 inc de&lt;br /&gt;
; Use :&lt;br /&gt;
 ldi&lt;br /&gt;
 inc bc&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Never use:&lt;br /&gt;
    push BC&lt;br /&gt;
;    ...&lt;br /&gt;
    pop BC&lt;br /&gt;
    ld D,B&lt;br /&gt;
    ld E,C&lt;br /&gt;
;Use instead:&lt;br /&gt;
    push BC&lt;br /&gt;
;    ...&lt;br /&gt;
    pop DE      ;we only want DE to hold the pushed BC (no need for a copy in BC)&lt;br /&gt;
; -&amp;gt; save 2 bytes and 8 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Math and Logic tricks ====&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of:&lt;br /&gt;
 cp 0&lt;br /&gt;
;Use&lt;br /&gt;
 or a&lt;br /&gt;
; -&amp;gt; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  cp 1&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  dec a   ;changes a!&lt;br /&gt;
; -&amp;gt; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  xor %11111111&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  cpl&lt;br /&gt;
; -&amp;gt; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
    ld de,767&lt;br /&gt;
    or a       ;reset carry so sbc works as a sub&lt;br /&gt;
    sbc hl,de&lt;br /&gt;
;try this&lt;br /&gt;
    ld de,-767 ;negated constant&lt;br /&gt;
    add hl,de&lt;br /&gt;
; -&amp;gt; save 2 bytes and 8 T-states (the resulting flags differ from sbc)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
    ld de,-767&lt;br /&gt;
    add hl,de&lt;br /&gt;
;try this&lt;br /&gt;
    dec h  ; -256&lt;br /&gt;
    dec h  ; -512&lt;br /&gt;
    dec h  ; -768&lt;br /&gt;
    inc hl  ; -767&lt;br /&gt;
;Note that this works in many other cases&lt;br /&gt;
; -&amp;gt; save 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	srl a&lt;br /&gt;
	srl a&lt;br /&gt;
	srl a&lt;br /&gt;
;try this&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	and %00011111&lt;br /&gt;
; -&amp;gt; save 1 byte and 5 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	neg&lt;br /&gt;
	add a,N   ;you want to calculate N-A&lt;br /&gt;
;Do it this way:&lt;br /&gt;
	cpl&lt;br /&gt;
	add a,N+1    ;neg is practically equivalent to cpl \ inc a&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Never use:&lt;br /&gt;
    ld A,B&lt;br /&gt;
    neg&lt;br /&gt;
;Instead use:&lt;br /&gt;
    xor A&lt;br /&gt;
    sub B&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Never use:&lt;br /&gt;
    ld A,D&lt;br /&gt;
    sub $D3&lt;br /&gt;
    neg&lt;br /&gt;
;Instead use:&lt;br /&gt;
    ld A,$D3&lt;br /&gt;
    sub D&lt;br /&gt;
; -&amp;gt; save 2 bytes and 8 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  sla l&lt;br /&gt;
  rl h         ; I've actually seen this!&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  add hl,hl&lt;br /&gt;
; -&amp;gt; save 3 bytes and 5 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Conditionals ====&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  and 1&lt;br /&gt;
  cp 1&lt;br /&gt;
  jr z,foo&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  and 1         ;and sets zero flag, no need for cp&lt;br /&gt;
  jr nz,foo&lt;br /&gt;
; -&amp;gt; save 2 bytes and 7 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  and 1&lt;br /&gt;
  cp 1         ;a not needed after this&lt;br /&gt;
  jr z,foo&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  rra&lt;br /&gt;
  jr c,foo&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  bit 0,a&lt;br /&gt;
  call z,foo&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  rra&lt;br /&gt;
  call nc,foo&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  bit 7,a&lt;br /&gt;
  jr z,foo&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  rla&lt;br /&gt;
  jr nc,foo&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  bit 2,a&lt;br /&gt;
  ret nz&lt;br /&gt;
  xor a&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  and %100&lt;br /&gt;
  ret nz&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  cp 9        ;if a&amp;lt;=9 then goto label&lt;br /&gt;
  jp c,label&lt;br /&gt;
  jp z,label&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  cp 9+1      ;if a&amp;lt;10 then goto label (the same test as a&amp;lt;=9)&lt;br /&gt;
  jp c,label&lt;br /&gt;
; -&amp;gt; save 3 bytes and 10 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Others ====&lt;br /&gt;
&lt;br /&gt;
Calling and returning...&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
 call xxxx&lt;br /&gt;
 ret&lt;br /&gt;
;try this&lt;br /&gt;
 jp xxxx&lt;br /&gt;
;only do this if the called routine does not use the return address pushed on the stack. Example of a routine that does: some kind of inline vputs.&lt;br /&gt;
; -&amp;gt; save 1 byte and 17 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Never use:&lt;br /&gt;
    dec B&lt;br /&gt;
    jr NZ,loop    ;I have seen this...&lt;br /&gt;
;Use:&lt;br /&gt;
    djnz loop&lt;br /&gt;
; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
loop:&lt;br /&gt;
 ld a,2&lt;br /&gt;
;code1&lt;br /&gt;
 ld a,0&lt;br /&gt;
;code2&lt;br /&gt;
 djnz loop&lt;br /&gt;
&lt;br /&gt;
;try this&lt;br /&gt;
 ld a,2&lt;br /&gt;
loop:&lt;br /&gt;
;code1&lt;br /&gt;
 xor 2     ; the trick: xor with (2 xor 0) = 2 makes a alternate between 2 and 0 each time this line runs (code1 and code2 must preserve a)&lt;br /&gt;
;code2&lt;br /&gt;
 djnz loop&lt;br /&gt;
; -&amp;gt; save size and time depending on its use&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; Instead of&lt;br /&gt;
 ld a,(Number)&lt;br /&gt;
 cp 0&lt;br /&gt;
 jp z,A_is_0&lt;br /&gt;
 cp 1&lt;br /&gt;
 jp z,A_is_1&lt;br /&gt;
 cp 2&lt;br /&gt;
 jp z,A_is_2&lt;br /&gt;
 cp 3&lt;br /&gt;
 jp z,A_is_3&lt;br /&gt;
 cp 4&lt;br /&gt;
 jp z,A_is_4&lt;br /&gt;
 cp 5&lt;br /&gt;
 jp z,A_is_5&lt;br /&gt;
&lt;br /&gt;
; This is a little better&lt;br /&gt;
 ld a,(Number)&lt;br /&gt;
 or a&lt;br /&gt;
 jp z,A_is_0&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_1&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_2&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_3&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_4&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_5&lt;br /&gt;
&lt;br /&gt;
; Even better&lt;br /&gt;
 ld a,(Number)&lt;br /&gt;
 add a,a   ; a*2 (Number must be below 128 and below the number of table entries)&lt;br /&gt;
 ld h,0 &lt;br /&gt;
 ld l,a &lt;br /&gt;
 ld de,VectorTable&lt;br /&gt;
 add hl,de&lt;br /&gt;
 ld a,(hl)&lt;br /&gt;
 inc hl&lt;br /&gt;
 ld h,(hl)&lt;br /&gt;
 ld l,a&lt;br /&gt;
 jp (hl)&lt;br /&gt;
&lt;br /&gt;
VectorTable:&lt;br /&gt;
 .dw A_is_0&lt;br /&gt;
 .dw A_is_1&lt;br /&gt;
 .dw A_is_2&lt;br /&gt;
 .dw A_is_3&lt;br /&gt;
 .dw A_is_4&lt;br /&gt;
 .dw A_is_5&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
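&lt;br /&gt;
The table lookup above trusts Number completely. A minimal sketch of the same dispatch with a range check follows; it assumes exactly six handlers (A_is_0 to A_is_5) and that the code is reached with a call, and the labels DispatchNumber and HandlerTable are made up for this example:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
DispatchNumber:&lt;br /&gt;
 ld a,(Number)&lt;br /&gt;
 cp 6              ; 6 = number of entries in HandlerTable (assumption)&lt;br /&gt;
 ret nc            ; out of range: return to the caller without jumping&lt;br /&gt;
 add a,a           ; each table entry is 2 bytes&lt;br /&gt;
 ld e,a&lt;br /&gt;
 ld d,0&lt;br /&gt;
 ld hl,HandlerTable&lt;br /&gt;
 add hl,de         ; hl points to the selected entry&lt;br /&gt;
 ld a,(hl)&lt;br /&gt;
 inc hl&lt;br /&gt;
 ld h,(hl)&lt;br /&gt;
 ld l,a            ; hl = handler address&lt;br /&gt;
 jp (hl)           ; the handler's ret returns to the caller of DispatchNumber&lt;br /&gt;
&lt;br /&gt;
HandlerTable:&lt;br /&gt;
 .dw A_is_0&lt;br /&gt;
 .dw A_is_1&lt;br /&gt;
 .dw A_is_2&lt;br /&gt;
 .dw A_is_3&lt;br /&gt;
 .dw A_is_4&lt;br /&gt;
 .dw A_is_5&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;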
Also see [[Z80 Good Programming Practices]]&lt;br /&gt;
&lt;br /&gt;
=== Size vs. Speed ===&lt;br /&gt;
&lt;br /&gt;
This is the classical trade-off of optimization in computer programming, and Z80 is no exception.&lt;br /&gt;
In ASM, size is most often what matters: ASM is generally fast enough, and it is nice to give users a smaller program that doesn't use up most of the RAM.&lt;br /&gt;
&lt;br /&gt;
==== For the sake of size ====&lt;br /&gt;
&lt;br /&gt;
* Use relative jumps (jr label) whenever possible. When the target is out of reach of a relative jump (outside -128 to 127 bytes) and there is a jp to the same target nearby, do a relative jump to that absolute jump. Example:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;lots of code (more than 128 bytes worth of code)&lt;br /&gt;
somelabel2:&lt;br /&gt;
 jp somelabel&lt;br /&gt;
;less than 128 bytes&lt;br /&gt;
 jr somelabel2   ;instead of an absolute jump directly to somelabel, jump to a jump to somelabel.&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Relative jumps are 2 bytes and absolute jumps are 3. In terms of speed, jp always takes 10 T-states, while jr takes 12 T-states when the jump occurs and 7 when it doesn't. So jp is faster when the jump occurs and jr is faster when it doesn't.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
 dec bc&lt;br /&gt;
 ld a,b&lt;br /&gt;
 or c&lt;br /&gt;
 ret z&lt;br /&gt;
;try this&lt;br /&gt;
 cpi              ;increments HL&lt;br /&gt;
 ret po&lt;br /&gt;
; save 1 byte at the cost of 2 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Passing inline data'''&lt;br /&gt;
&lt;br /&gt;
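When you call a routine, the address of the instruction after the call (pc + 3) is pushed on the stack. The routine can pop it and use it as a pointer to data placed right after the call. A very nifty use is with strings. To return, move the pointer past the data and jp (hl).&lt;br /&gt;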
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of:&lt;br /&gt;
 ld hl,string&lt;br /&gt;
 bcall(_vputs)&lt;br /&gt;
 ret&lt;br /&gt;
;Try this:&lt;br /&gt;
  call Disp&lt;br /&gt;
  .db &amp;quot;This is some text&amp;quot;,0&lt;br /&gt;
  ret&lt;br /&gt;
;Not a speed optimization, but it eliminates 2-byte pointers, since it just uses the call's return address.&lt;br /&gt;
;It also heavily disturbs disassembly.&lt;br /&gt;
Disp:&lt;br /&gt;
  pop hl&lt;br /&gt;
  bcall(_vputs)&lt;br /&gt;
  jp (hl)&lt;br /&gt;
; -&amp;gt; save 3 bytes for each use (ld hl,string is gone), but 5 bytes of overhead (Disp routine)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This routine can be expanded to pass the coordinates where the text should appear.&lt;br /&gt;
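&lt;br /&gt;
A possible sketch of that expansion, assuming the penRow and penCol equates from ti83plus.inc and relying, like Disp above, on _vputs leaving hl just past the zero terminator. The DispAt label and the coordinate values are made up for illustration:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  call DispAt&lt;br /&gt;
  .db 10,20         ; penRow, penCol (example values)&lt;br /&gt;
  .db &amp;quot;Some text&amp;quot;,0&lt;br /&gt;
  ret&lt;br /&gt;
&lt;br /&gt;
DispAt:&lt;br /&gt;
  pop hl            ; hl = address of the inline data&lt;br /&gt;
  ld a,(hl)&lt;br /&gt;
  ld (penRow),a&lt;br /&gt;
  inc hl&lt;br /&gt;
  ld a,(hl)&lt;br /&gt;
  ld (penCol),a&lt;br /&gt;
  inc hl            ; hl = start of the string&lt;br /&gt;
  bcall(_vputs)&lt;br /&gt;
  jp (hl)           ; resume right after the string&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;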
&lt;br /&gt;
'''Wasting time to delay'''&lt;br /&gt;
&lt;br /&gt;
Sometimes you need a short delay between operations, such as reads/writes to ports, '''''and there is nothing useful to do'''''. Because nop's are not very size friendly, think of other slower but smaller instructions. Example:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
 ld a,KEY_GROUP&lt;br /&gt;
 out (1),a&lt;br /&gt;
 nop&lt;br /&gt;
 nop&lt;br /&gt;
 in a,(1)&lt;br /&gt;
;Try this:&lt;br /&gt;
 ld a,KEY_GROUP&lt;br /&gt;
 out (1),a&lt;br /&gt;
 ld a,(de)    ;a doesn't need to be preserved because it will hold what the port has.&lt;br /&gt;
 in a,(1)&lt;br /&gt;
; -&amp;gt; save 1 byte and 1 T-state (well 1 T-state less is almost the same time)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When you need to delay and cannot afford to alter registers or flags there are still ways to delay that waste less size than nop's :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; 2 bytes, 8 T-states&lt;br /&gt;
 nop&lt;br /&gt;
 nop&lt;br /&gt;
&lt;br /&gt;
; 2 bytes, 12 T-states&lt;br /&gt;
 inc hl&lt;br /&gt;
 dec hl&lt;br /&gt;
&lt;br /&gt;
; 2 bytes, 12 T-states&lt;br /&gt;
 jr $+2&lt;br /&gt;
&lt;br /&gt;
; 2 bytes, 21 T-states&lt;br /&gt;
 push af&lt;br /&gt;
 pop af&lt;br /&gt;
&lt;br /&gt;
; 2 bytes, 38 T-states&lt;br /&gt;
 ex (sp), hl&lt;br /&gt;
 ex (sp), hl&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you need a small adjustable delay:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;4 bytes, b*13+2 T-states (variable): 7 for ld b, 13 for each djnz that loops, 8 for the last one&lt;br /&gt;
	ld b,255	; initial delay&lt;br /&gt;
	djnz $		; do it&lt;br /&gt;
;b=0 on exit&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* There are many other instructions that you can use&lt;br /&gt;
* Beware that not all instructions preserve registers or flags&lt;br /&gt;
* For delays between frames of games, or other longer delays, you can use the 'halt' instruction if interrupts are enabled. It makes the calculator enter low power mode until an interrupt is triggered. To fine-tune this delay mechanism you can alter the interrupt mask and the interrupt timer speed beforehand (and possibly restore their values afterwards).&lt;br /&gt;
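&lt;br /&gt;
A minimal sketch of a halt-based delay; the count of 10 and the label name are arbitrary, and how long one interrupt period lasts depends on the interrupt settings:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;wait for b interrupts&lt;br /&gt;
 ei                ; halt never wakes up if interrupts are disabled&lt;br /&gt;
 ld b,10           ; number of interrupts to wait for (example value)&lt;br /&gt;
waitloop:&lt;br /&gt;
 halt              ; sleep until the next interrupt&lt;br /&gt;
 djnz waitloop&lt;br /&gt;
;b=0 on exit&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;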
&lt;br /&gt;
==== Unrolling code ====&lt;br /&gt;
&lt;br /&gt;
'''General Unrolling'''&lt;br /&gt;
You can repeat the body of a loop several times instead of looping. This is frequently used in math routines such as multiplication.&lt;br /&gt;
This means you are spending memory to gain speed, so only do it where speed really matters; most of the time you will prefer size to speed.&lt;br /&gt;
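&lt;br /&gt;
As an illustration, here is the well-known 8-bit by 8-bit shift-and-add multiplication, first looped and then unrolled. It is a sketch rather than a tuned routine, and the label names are made up:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; in: h = multiplier, e = multiplicand; out: hl = h * e; destroys b and d&lt;br /&gt;
MulLooped:&lt;br /&gt;
 ld d,0&lt;br /&gt;
 ld l,d&lt;br /&gt;
 ld b,8&lt;br /&gt;
MulLoopedNext:&lt;br /&gt;
 add hl,hl         ; shift left, the top bit of the multiplier goes to carry&lt;br /&gt;
 jr nc,MulLoopedSkip&lt;br /&gt;
 add hl,de         ; that bit was set: add the multiplicand&lt;br /&gt;
MulLoopedSkip:&lt;br /&gt;
 djnz MulLoopedNext&lt;br /&gt;
 ret&lt;br /&gt;
&lt;br /&gt;
; same inputs and output, unrolled: no counter and no djnz, b is preserved, but the code is bigger&lt;br /&gt;
MulUnrolled:&lt;br /&gt;
 ld d,0&lt;br /&gt;
 ld l,d&lt;br /&gt;
 add hl,hl         ; step 1&lt;br /&gt;
 jr nc,$+3         ; skip the 1-byte add hl,de&lt;br /&gt;
 add hl,de&lt;br /&gt;
 add hl,hl         ; step 2&lt;br /&gt;
 jr nc,$+3&lt;br /&gt;
 add hl,de&lt;br /&gt;
 add hl,hl         ; step 3&lt;br /&gt;
 jr nc,$+3&lt;br /&gt;
 add hl,de&lt;br /&gt;
 add hl,hl         ; step 4&lt;br /&gt;
 jr nc,$+3&lt;br /&gt;
 add hl,de&lt;br /&gt;
 add hl,hl         ; step 5&lt;br /&gt;
 jr nc,$+3&lt;br /&gt;
 add hl,de&lt;br /&gt;
 add hl,hl         ; step 6&lt;br /&gt;
 jr nc,$+3&lt;br /&gt;
 add hl,de&lt;br /&gt;
 add hl,hl         ; step 7&lt;br /&gt;
 jr nc,$+3&lt;br /&gt;
 add hl,de&lt;br /&gt;
 add hl,hl         ; step 8&lt;br /&gt;
 jr nc,$+3&lt;br /&gt;
 add hl,de&lt;br /&gt;
 ret&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
The unrolled version drops the ld b,8 and every djnz (13 T-states each time it loops), at the price of repeating the three-instruction step eight times.&lt;br /&gt;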
&lt;br /&gt;
'''Unroll commands'''&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; &amp;quot;Classic&amp;quot; way : ~21 T-states per byte copied&lt;br /&gt;
 ld hl,src&lt;br /&gt;
 ld de,dest&lt;br /&gt;
 ld bc,size&lt;br /&gt;
 ldir&lt;br /&gt;
&lt;br /&gt;
; Unrolled : 16 + 10 / n T-states per byte copied (n = number of ldi per pass) -&amp;gt; ~17 T-states per byte copied when unrolling 8 times&lt;br /&gt;
 ld hl,src&lt;br /&gt;
 ld de,dest&lt;br /&gt;
 ld bc,size  ; if the size is not a multiple of the number of unrolled ldi then a small trick must be used to jump appropriately inside the loop for the first iteration&lt;br /&gt;
loopldi:    ;you can use this entry for a call&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 jp pe, loopldi    ; jp used as it is faster and in the case of a loop unrolling we assume speed matters more than size&lt;br /&gt;
; ret if this is a subroutine and use the unrolled ldi's with a call.&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
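This unrolling of ldi also works with ldd, and with outi (which counts with b and sets the zero flag when b reaches 0, so loop with jp nz instead of jp pe).&lt;br /&gt;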
&lt;br /&gt;
==== Looping with 16 bit counter ====&lt;br /&gt;
There are two ways to make loops with a 16bit counter :&lt;br /&gt;
* the naive one, which results in smaller code but increased loop overhead (24 * n T-states) and destroys a&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  ld bc, ...&lt;br /&gt;
loop:&lt;br /&gt;
  ; loop body here&lt;br /&gt;
 &lt;br /&gt;
  dec bc&lt;br /&gt;
  ld  a, b&lt;br /&gt;
  or  c&lt;br /&gt;
  jp  nz,loop&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
* the slightly trickier one, which takes a couple more bytes but has a much lower overhead (about 13 * n + 9 * (n / 256) T-states); the 16-bit count is passed in de&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  dec  de&lt;br /&gt;
  ld  b, e&lt;br /&gt;
  inc  b&lt;br /&gt;
  inc  d&lt;br /&gt;
loop2:&lt;br /&gt;
  ; loop body here&lt;br /&gt;
  &lt;br /&gt;
  djnz loop2&lt;br /&gt;
  dec  d&lt;br /&gt;
  jp  nz,loop2&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
The rationale behind the second method is to reduce the overhead of the &amp;quot;inner&amp;quot; loop as much as possible and to use the fact that when b gets down to zero it will be treated as 256 by djnz. &lt;br /&gt;
&lt;br /&gt;
You can therefore use the following macros for setting proper values of 8bit loop counters given a 16bit counter in case you want to do the conversion at compile time :&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  #define inner_counter8(counter16) (((counter16) - 1) &amp;amp; 0xff) + 1&lt;br /&gt;
  #define outer_counter8(counter16) (((counter16) - 1) &amp;gt;&amp;gt; 8) + 1&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
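&lt;br /&gt;
As a worked example (the count of 1000 is arbitrary): inner_counter8(1000) = (999 &amp;amp; 0xff) + 1 = 232 and outer_counter8(1000) = (999 &amp;gt;&amp;gt; 8) + 1 = 4, so the loop below runs 232 + 3 * 256 = 1000 times. The label name is made up:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  ld b,232          ; inner_counter8(1000)&lt;br /&gt;
  ld d,4            ; outer_counter8(1000)&lt;br /&gt;
loop1000:&lt;br /&gt;
  ; loop body here (must preserve b and d)&lt;br /&gt;
&lt;br /&gt;
  djnz loop1000     ; first pass: 232 times, later passes: 256 times (b wraps from 0)&lt;br /&gt;
  dec d&lt;br /&gt;
  jp nz,loop1000&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;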
&lt;br /&gt;
=== Preserve Registers ===&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; description: both routines compare b to 0, same size and speed but the second preserves accumulator&lt;br /&gt;
; remarks: - inc/dec doesn't affect carry flag&lt;br /&gt;
;          - inc/dec doesn't affect any flags on 16-bit registers, so do not extrapolate to 16-bit registers.&lt;br /&gt;
	ld a,b&lt;br /&gt;
	or b&lt;br /&gt;
	jr z,label&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
	inc b&lt;br /&gt;
	dec b&lt;br /&gt;
	jr z,label&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; description: add a to hl without using a 16-bit register&lt;br /&gt;
;normal way:&lt;br /&gt;
	ld d,$00&lt;br /&gt;
	ld e,a&lt;br /&gt;
	add hl,de&lt;br /&gt;
;4 bytes and 22 clock cycles&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
	add a,l&lt;br /&gt;
	ld l,a&lt;br /&gt;
	jr nc, $+3&lt;br /&gt;
	inc h&lt;br /&gt;
;5 bytes, 19/20 clock cycles&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Setting flags ==&lt;br /&gt;
On some occasions you might want to selectively set/reset a flag.&lt;br /&gt;
&lt;br /&gt;
Here are the most common uses :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; set Carry flag&lt;br /&gt;
 scf&lt;br /&gt;
&lt;br /&gt;
; reset Carry flag (alters Sign and Zero flags as defined)&lt;br /&gt;
 or a&lt;br /&gt;
&lt;br /&gt;
; alternate reset Carry flag (alters Sign and Zero flags as defined)&lt;br /&gt;
 and a&lt;br /&gt;
&lt;br /&gt;
; set Zero flag (resets Carry flag, alters Sign flag as defined)&lt;br /&gt;
 cp a&lt;br /&gt;
&lt;br /&gt;
; reset Zero flag (alters a, reset Carry flag, alters Sign flag as defined)&lt;br /&gt;
 or 1&lt;br /&gt;
&lt;br /&gt;
; set Sign flag (negative) (alters a, reset Zero and Carry flags)&lt;br /&gt;
 or $80&lt;br /&gt;
&lt;br /&gt;
; reset Sign flag (positive) (set a to zero, set Zero flag, reset Carry flag)&lt;br /&gt;
 xor a&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Other possible uses (much rarer) :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Set parity/overflow (even):&lt;br /&gt;
 xor a&lt;br /&gt;
&lt;br /&gt;
;Reset parity/overflow (odd):&lt;br /&gt;
 sub a&lt;br /&gt;
&lt;br /&gt;
;Set half carry (hardly ever useful but still...)&lt;br /&gt;
 and a&lt;br /&gt;
&lt;br /&gt;
;Reset half carry (hardly ever useful but still...)&lt;br /&gt;
 or a&lt;br /&gt;
&lt;br /&gt;
;Set bit 5 of f:&lt;br /&gt;
 or %00100000&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
As you can see these are extremely simple, small and fast ways to alter flags&lt;br /&gt;
which make them interesting as output of routines to indicate error/success or&lt;br /&gt;
other status bits that do not require a full register.&lt;br /&gt;
&lt;br /&gt;
If you use this, remember that these flag (re)setting tricks frequently&lt;br /&gt;
overlap, so if you need a special combination of flags it might require slightly&lt;br /&gt;
more elaborate tricks. As a rule of thumb, always alter the carry last in&lt;br /&gt;
such cases, because the scf and ccf instructions change neither the registers nor the Sign, Zero and Parity/Overflow flags.&lt;br /&gt;
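&lt;br /&gt;
A minimal sketch of a routine that reports success or failure through the carry flag. GetDigit, NotADigit and the calling convention are invented for this example:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; in:  a = ASCII character&lt;br /&gt;
; out: carry reset and a = 0..9 if it was a digit, carry set otherwise (a is destroyed)&lt;br /&gt;
GetDigit:&lt;br /&gt;
 sub $30           ; $30 is ASCII 0&lt;br /&gt;
 ret c             ; below ASCII 0: sub already set the carry&lt;br /&gt;
 cp 10             ; carry set if a is 0..9&lt;br /&gt;
 ccf               ; invert it: carry set only if a is above 9&lt;br /&gt;
 ret&lt;br /&gt;
&lt;br /&gt;
; usage&lt;br /&gt;
 ld a,(hl)&lt;br /&gt;
 call GetDigit&lt;br /&gt;
 jr c,NotADigit    ; error path, no extra register needed for the status&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;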
&lt;br /&gt;
More advanced ways of manipulating flags follow:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;get the zero flag in carry &lt;br /&gt;
	scf&lt;br /&gt;
	jr z,$+3&lt;br /&gt;
	ccf&lt;br /&gt;
&lt;br /&gt;
;Put carry flag into zero flag.&lt;br /&gt;
	ccf&lt;br /&gt;
	sbc a, a&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Tools of the job ==&lt;br /&gt;
&lt;br /&gt;
Want to test your optimizations or try new ones? Then check these:&lt;br /&gt;
* Keep a Z80 instruction set reference at hand so you don't forget a useful instruction or which flags it affects. (see [[Z80_Instruction_Set|Z80_Instruction_Set]])&lt;br /&gt;
* Use an assembler that has &amp;quot;.echo&amp;quot; and use this in the source to count size: (see [[Assemblers|Assemblers]])&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;SomeCodeorData:&lt;br /&gt;
;code or data goes here&lt;br /&gt;
End:&lt;br /&gt;
 .echo &amp;quot;size of the code/data:&amp;quot;&lt;br /&gt;
 .echo End-SomeCodeorData&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
* Get a nice Z80 IDE that counts code size ([[IDEs|IDE's]])&lt;br /&gt;
* Make use of the counting capabilities of an emulator ([[:Category:Emulators|Emulators]])&lt;br /&gt;
&lt;br /&gt;
== Crazy, &amp;quot;magick&amp;quot;, hacks and obscure optimization's tricks ==&lt;br /&gt;
&lt;br /&gt;
These are not normally recommended, because some of them disturb disassembly and even make the code harder for coders to understand.&lt;br /&gt;
&lt;br /&gt;
=== Better else ===&lt;br /&gt;
So you normally have an if-else-endif block like this:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
jr nz,else    ;the IF&lt;br /&gt;
;some code&lt;br /&gt;
jr endif&lt;br /&gt;
else:&lt;br /&gt;
;some code&lt;br /&gt;
endif:&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
But here's a crazy trick for when the Else code is a single 2-byte instruction:&lt;br /&gt;
You use the first byte of a 3 byte instruction with no side effects instead of the &amp;quot;jr endif&amp;quot; line!&lt;br /&gt;
So if you had code like this:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
cp 7&lt;br /&gt;
jr nz,else&lt;br /&gt;
ld a,3&lt;br /&gt;
jr endif&lt;br /&gt;
else:&lt;br /&gt;
ld a,4&lt;br /&gt;
endif:&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You could replace it with this:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
cp 7&lt;br /&gt;
jr nz,else&lt;br /&gt;
ld a,3&lt;br /&gt;
.db $C2  ;jp nz,xxxx&lt;br /&gt;
else:&lt;br /&gt;
ld a,4&lt;br /&gt;
endif:&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Instead of branching over the ld a,4 instruction, it now executes a jp nz,XXXX instruction where the XXXX is the two bytes of the next instruction. You already know what the flags will be here, so you can make the jump never taken. You can use this to skip the next two bytes of execution! Who needs to branch over it?&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
This takes 31 T-states for the if branch instead of 33, because the never-taken jp nz costs 10 T-states where jr endif costs 12. A small saving, but it could be useful in tight loops, and it saves 1 byte!&lt;br /&gt;
The only reasons not to use this trick are code readability and bug safety. Watch those flags!&lt;br /&gt;
&lt;br /&gt;
=== Conditional rst ===&lt;br /&gt;
&lt;br /&gt;
For a smaller conditional rst $38, use jr cc, -1. This will cause a conditional jump to the displacement byte ($FF) which is the rst $38 opcode. &lt;br /&gt;
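&lt;br /&gt;
For instance, with the zero condition, and writing the two bytes out by hand so the result does not depend on how a given assembler reads the jr operand (a sketch; $28 is the opcode of jr z):&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 or a&lt;br /&gt;
 .db $28,$FF       ; jr z,-1 : if zero, jump onto the $FF displacement byte, which executes as rst $38&lt;br /&gt;
 ; execution continues here in both cases (once the rst $38 handler has returned)&lt;br /&gt;
; -&amp;gt; 2 bytes instead of the 3 bytes of call z,$0038&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;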
&lt;br /&gt;
=== DAA trick ===&lt;br /&gt;
&lt;br /&gt;
Normally the DAA instruction is used for BCD math, but it can also be used to convert a value from 0 to 15 in a into its ASCII hexadecimal digit.&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
	cp 10&lt;br /&gt;
	ccf&lt;br /&gt;
	adc a, 30h&lt;br /&gt;
	daa&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
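&lt;br /&gt;
A sketch of how the trick could be wrapped to convert a whole byte. The routine names and the choice of d and e for the result are assumptions made for this example:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; in:  a = byte&lt;br /&gt;
; out: d = ASCII digit of the high nibble, e = ASCII digit of the low nibble&lt;br /&gt;
ByteToHex:&lt;br /&gt;
	ld e,a           ; keep a copy for the low nibble&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca             ; high nibble is now in bits 0..3&lt;br /&gt;
	call NibbleToHex&lt;br /&gt;
	ld d,a&lt;br /&gt;
	ld a,e&lt;br /&gt;
	call NibbleToHex&lt;br /&gt;
	ld e,a&lt;br /&gt;
	ret&lt;br /&gt;
&lt;br /&gt;
; in:  a = any value, only bits 0..3 are used&lt;br /&gt;
; out: a = ASCII 0..9 or A..F&lt;br /&gt;
NibbleToHex:&lt;br /&gt;
	and $0F&lt;br /&gt;
	cp 10            ; carry set for 0..9&lt;br /&gt;
	ccf              ; carry set for 10..15&lt;br /&gt;
	adc a, 30h       ; $30..$39 for digits, $3B..$40 for 10..15&lt;br /&gt;
	daa              ; adds 6 when needed: $3B..$40 become $41..$46&lt;br /&gt;
	ret&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;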
&lt;br /&gt;
== Related topics ==&lt;br /&gt;
* [http://www.junemann.nl/maxcoderz/viewtopic.php?f=5&amp;amp;t=675 MaxCoderz TI-ASM optimization]&lt;br /&gt;
* ticalc archives: [http://www.ticalc.org/archives/files/fileinfo/108/10821.html 1] [http://www.ticalc.org/archives/files/fileinfo/285/28502.html 2]&lt;br /&gt;
* [http://www.ballyalley.com/ml/z80_docs/z80_docs.html Bally Alley Z80 Machine Language Documentation]&lt;br /&gt;
* [http://map.grauw.nl/articles/fast_loops.php Fast loops in MSX Assembly Page]&lt;br /&gt;
* [http://shiar.nl/calc/z80/optimize Shiar z80 optimization page]&lt;br /&gt;
* [http://www.smspower.org/dev/docs/wiki/?n=Z80.ProgrammingTechniques SMS Power! dev wiki z80 Techniques]&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
* fullmetalcoder&lt;br /&gt;
* Galandros&lt;br /&gt;
* Dwedit for sharing in MaxCoderz the &amp;quot;Better else&amp;quot;&lt;br /&gt;
* MaxCoderz participants in assembly optimizing topic (Jim e,CoBB,...)&lt;br /&gt;
* SMS Power wiki&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Category:Z80_Routines</id>
		<title>Category:Z80 Routines</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Category:Z80_Routines"/>
				<updated>2010-06-17T21:49:16Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: one more link&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Calculator Documentation|General Z80 Calculator Routines]]&lt;br /&gt;
This Category contains routines that can be used on many Z80-based calculators. Please consult each example for which calculators they apply to.&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
When you paste code, there is a simple step you can do to format the code properly. Start the first line of code with &amp;amp;lt;nowiki&amp;amp;gt;, and put a space in front of the &amp;amp;lt;. On the last line of your code, end it with &amp;amp;lt;/nowiki&amp;amp;gt;. This allows for code to be formatted like this (click edit on this page to see how it's done):&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;cp 10&lt;br /&gt;
ccf&lt;br /&gt;
adc a, 30h&lt;br /&gt;
daa&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Also, there are a bunch of routines on:&lt;br /&gt;
* [http://www.detachedsolutions.com/forum/viewtopic.php?t=1154 DS forums]&lt;br /&gt;
* [http://baze.au.com/misc/z80bits.html Z80 Bits]&lt;br /&gt;
* [http://www.smspower.org/dev/docs/wiki/CodeSnippets/CodeSnippets SMS Power z80 code snippets]&lt;br /&gt;
* [http://www.unitedti.org/index.php?showtopic=1279 UTI forums]&lt;br /&gt;
* [http://www.revsoft.org/phpBB2/viewtopic.php?t=354 RS forums]&lt;br /&gt;
* [http://www.junemann.nl/maxcoderz/viewtopic.php?f=5&amp;amp;t=1940 MC forums]&lt;br /&gt;
* [http://www.cemetech.net/forum/viewtopic.php?t=1449 Cemetech forums]&lt;br /&gt;
* [http://www.ticalc.org/pub/83plus/asm/source/ ticalc archive]&lt;br /&gt;
* [http://www.ticalc.org/archives/files/fileinfo/130/13059.html Ion shell source code]&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Z80_Optimization</id>
		<title>Z80 Optimization</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Z80_Optimization"/>
				<updated>2010-06-17T21:49:10Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: /* Related topics */ sms link&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
Sometimes you need some extra speed in ASM, or you need to make your game smaller to fit on the calculator. Examples: programs with a lot of graphics/data, and graphics code for mapping, grayscale and 3D graphics.&lt;br /&gt;
&lt;br /&gt;
If you are just looking to cut some bytes, go straight to the small tricks in this topic.&lt;br /&gt;
&lt;br /&gt;
== Registers and Memory ==&lt;br /&gt;
Generally, good algorithms on the Z80 use the registers appropriately.&lt;br /&gt;
It is also good practice to keep a convention and plan how you are going to use the registers.&lt;br /&gt;
&lt;br /&gt;
General use of registers:&lt;br /&gt;
* a - 8-bit accumulator&lt;br /&gt;
* b - counter&lt;br /&gt;
* c,d,e,h,l auxiliary to accumulator and copy of b or a&lt;br /&gt;
&lt;br /&gt;
* hl - 16-bit accumulator/pointer to a memory address&lt;br /&gt;
* de - pointer to a destination memory address&lt;br /&gt;
* bc - 16-bit counter&lt;br /&gt;
* ix - index register/pointer to a table in memory/saved copy of hl/pointer to memory when hl and de are being used&lt;br /&gt;
* iy - index register/pointer to a table in memory (use it when there is no other option or when you need optimal execution; disable interrupts first and restore the original value on exit, because TI-OS uses it)&lt;br /&gt;
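&lt;br /&gt;
A minimal sketch of borrowing iy safely, following the advice in the list above. MyTable and the offset are hypothetical:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
	di               ; the TI-OS interrupt routine must not run while iy holds our pointer&lt;br /&gt;
	push iy          ; save the value TI-OS expects&lt;br /&gt;
	ld iy,MyTable    ; MyTable is a hypothetical data table&lt;br /&gt;
	ld a,(iy+2)      ; use iy as a second index register&lt;br /&gt;
	; more code using iy, without calling TI-OS routines&lt;br /&gt;
	pop iy           ; restore the original value&lt;br /&gt;
	ei&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;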
&lt;br /&gt;
=== 8-bit vs. 16-bit Operations ===&lt;br /&gt;
&lt;br /&gt;
The Z80 processor is faster at operations on 8-bit values.&lt;br /&gt;
Code dealing with 16-bit registers tends to be bigger and slower, because the equivalent 16-bit instruction is slower, or it does not exist and needs to be replaced with several instructions. Sometimes the equivalent 16-bit instruction is also 1 byte bigger.&lt;br /&gt;
If you use the ix or iy registers, operations are even slower and each instruction is always 1 byte bigger. So try to convert your code to use hl and de instead of ix and iy.&lt;br /&gt;
&lt;br /&gt;
In a practical example, imagine:&lt;br /&gt;
- you pass through the accumulator a value to a routine&lt;br /&gt;
- if the only valid values of the accumulator range from 0 to 63 and if in that routine you need to multiply the accumulator by, say 12, it has to be stored in a 16-bit pair register.&lt;br /&gt;
- but you can multiply a by 4 before overflowing (63*4 = 252 which is smaller than 255) and take advantage of this to optimize&lt;br /&gt;
&lt;br /&gt;
Now on the code:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; The most usual way is pass A (the accumulator) right in the start to HL&lt;br /&gt;
	ld h,0&lt;br /&gt;
	ld l,a&lt;br /&gt;
	add a,a&lt;br /&gt;
	ld d,h&lt;br /&gt;
	ld e,a&lt;br /&gt;
	add hl,de&lt;br /&gt;
	add hl,hl&lt;br /&gt;
	add hl,hl	; hl=a*12&lt;br /&gt;
; 9 bytes, 56 clocks&lt;br /&gt;
&lt;br /&gt;
; But given a is between 0 and 63 you can multiply by 4 without overflowing the 8-bit limit (255)&lt;br /&gt;
	add a,a&lt;br /&gt;
	add a,a		; a*4&lt;br /&gt;
	ld l,a&lt;br /&gt;
	ld e,a&lt;br /&gt;
	ld h,0&lt;br /&gt;
	ld d,h		; hl=a*4 and de=a*4&lt;br /&gt;
	add hl,hl	; hl=a*8&lt;br /&gt;
	add hl,de	; hl=a*12&lt;br /&gt;
; 9 bytes, 49 clocks&lt;br /&gt;
&lt;br /&gt;
; hey, minus 7 clock cycles&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In this example you only shaved a few clock cycles, but sometimes you can save some bytes, too.&lt;br /&gt;
You can do this with registers other than the accumulator.&lt;br /&gt;
&lt;br /&gt;
For example, if the value is passed in l and l is always lower than 64, you can do &amp;quot; sla l \ sla l \ ld h,0	&amp;quot; to multiply l by four and then use hl for 16-bit operations. In this case you are trading size for speed: each sla instruction is 2 bytes and 8 T-states, while add hl,hl is only 1 byte but 11 T-states.&lt;br /&gt;
&lt;br /&gt;
Mind that these optimizations can introduce bugs and make the code somewhat hard to follow, so comment them.&lt;br /&gt;
I recommend applying this kind of optimization only when you really need the speed and the code is bug free.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
One common trick for multiplying by 256 is simply to load the low byte register into the high byte register. This works because in binary a multiplication by 256 is a shift left by 8 bits, with zeros entering from the right. Examples:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; multiply a by 256 and store in hl&lt;br /&gt;
	ld h,a&lt;br /&gt;
	ld l,0&lt;br /&gt;
; multiply hl by 256 and store in ade (pseudo 24-bit pair register)&lt;br /&gt;
	ld a,h&lt;br /&gt;
	ld d,l&lt;br /&gt;
	ld e,0&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Stack ===&lt;br /&gt;
&lt;br /&gt;
When you run out of registers, the stack may offer an interesting alternative to a fixed RAM location for temporary storage.&lt;br /&gt;
&lt;br /&gt;
==== Allocation ====&lt;br /&gt;
&lt;br /&gt;
You can allocate stack space with repeated push, which lets you initialize the data but restricts the allocated space to multiples of 2 bytes.&lt;br /&gt;
An alternative is to allocate uninitialized stack space (hl may be replaced with an index register) :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; allocates 7 bytes of stack space : 5 bytes, 27 T-states instead of 4 bytes, 44 T-states with 4 push which would have forced the alloc of 8 bytes&lt;br /&gt;
 ld hl, -7&lt;br /&gt;
 add hl, sp&lt;br /&gt;
 ld sp, hl&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Access ====&lt;br /&gt;
&lt;br /&gt;
The most common way of accessing data allocated on stack is to use an index register since all allocated &amp;quot;variables&amp;quot; can be accessed without having to use inc/dec but this is obviously not a strict requirement. Beware though, using stack space is not always optimal in terms of speed, depending (among other things) on your register allocation strategy :&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; 3 bytes, 19 T-states&lt;br /&gt;
 ld c, (ix + n)   ; n is an immediate value in -128..127&lt;br /&gt;
 &lt;br /&gt;
 ; 4 bytes, 17 T-states, destroys a&lt;br /&gt;
 ld a, (somelocation)&lt;br /&gt;
 ld c, a&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If your needs go beyond simple load/store, however, this method starts to show its real power, since it vastly simplifies some operations that are complicated to do with a fixed storage location (and that generally clobber registers in the process).&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; 3 bytes, 19 T-states&lt;br /&gt;
 cp (ix + n)&lt;br /&gt;
&lt;br /&gt;
 sub (ix + n)&lt;br /&gt;
 sbc a, (ix + n)&lt;br /&gt;
 add a, (ix + n)&lt;br /&gt;
 adc a, (ix + n)&lt;br /&gt;
&lt;br /&gt;
 ; 3 bytes, 23 T-states&lt;br /&gt;
 inc (ix + n)&lt;br /&gt;
 dec (ix + n)&lt;br /&gt;
&lt;br /&gt;
 ; 3 bytes, 19 T-states&lt;br /&gt;
 and (ix + n)&lt;br /&gt;
 or (ix + n)&lt;br /&gt;
 xor (ix + n)&lt;br /&gt;
&lt;br /&gt;
 ; 4 bytes, 23 T-states&lt;br /&gt;
 rl (ix + n)&lt;br /&gt;
 rr (ix + n)&lt;br /&gt;
 rlc (ix + n)&lt;br /&gt;
 rrc (ix + n)&lt;br /&gt;
 sla (ix + n)&lt;br /&gt;
 sra (ix + n)&lt;br /&gt;
 sll (ix + n)      ; undocumented instruction&lt;br /&gt;
 srl (ix + n)&lt;br /&gt;
 bit k, (ix + n)   ; k is an immediate value in 0..7; bit only takes 20 T-states&lt;br /&gt;
 set k, (ix + n)&lt;br /&gt;
 res k, (ix + n)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Again, choose wisely between hl and an index register: depending on the structure of your data, the smallest/fastest solution may vary (hl equivalent instructions are generally 2 bytes smaller and 12 T-states faster, but they do not allow indexing and so may require intermediate inc/dec).&lt;br /&gt;
&lt;br /&gt;
==== Deallocation ====&lt;br /&gt;
&lt;br /&gt;
If you need to pop an entry from the stack but must preserve all registers, remember that sp can be incremented/decremented like any 16-bit register :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; drops the top stack entry : wastes 1 byte and 2 T-states compared to a pop, but may enable better register allocation...&lt;br /&gt;
 inc sp&lt;br /&gt;
 inc sp&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you have a large amount of stack space to drop and a spare 16 bit register (hl, index, or de that you can easily swap with hl) :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; drop 16 bytes of stack space : 5 bytes, 27 T-states instead of 8 bytes, 80 T-states for 8 pop&lt;br /&gt;
 ld hl, 16&lt;br /&gt;
 add hl, sp&lt;br /&gt;
 ld sp, hl&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt; &lt;br /&gt;
The larger the space to drop, the more T-states you will save, and at some point you'll start saving space as well (beyond 10 bytes, i.e. when more than 5 pop would be needed)&lt;br /&gt;
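&lt;br /&gt;
Putting allocation, access and deallocation together, a routine using four bytes of stack storage could look like the following sketch. LocalsDemo, the number of bytes and what is stored in them are arbitrary choices for illustration:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; in: a = some value; out: a = that value + 6; destroys hl&lt;br /&gt;
LocalsDemo:&lt;br /&gt;
 push ix           ; preserve the caller's ix&lt;br /&gt;
 ld ix, -4&lt;br /&gt;
 add ix, sp&lt;br /&gt;
 ld sp, ix         ; allocate 4 bytes, ix points to the first one&lt;br /&gt;
 ld (ix + 0), 5    ; initialize two of the four bytes&lt;br /&gt;
 ld (ix + 1), a&lt;br /&gt;
 ; push, pop and call are still safe here because sp is below the allocated bytes&lt;br /&gt;
 inc (ix + 0)      ; the byte now holds 6&lt;br /&gt;
 ld a, (ix + 1)&lt;br /&gt;
 add a, (ix + 0)   ; a = original a + 6&lt;br /&gt;
 ld hl, 4&lt;br /&gt;
 add hl, sp&lt;br /&gt;
 ld sp, hl         ; drop the 4 bytes&lt;br /&gt;
 pop ix&lt;br /&gt;
 ret&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;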
&lt;br /&gt;
=== Shadow registers ===&lt;br /&gt;
&lt;br /&gt;
In some rare cases, when you run out of registers and can neither refactor your algorithm(s) nor rely on RAM storage, you may want to use the shadow registers : af', bc', de' and hl'&lt;br /&gt;
&lt;br /&gt;
These registers behave like their &amp;quot;standard&amp;quot; counterparts (af, bc, de, hl) and you can swap the two register sets using the following instructions :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ex af, af'  ; swaps af and af' as the mnemonic indicates&lt;br /&gt;
&lt;br /&gt;
 exx         ; swaps bc, de, hl and bc', de', hl'&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Shadow registers are fairly common for arithmetic on big integers (16-bit to 32-bit) or for BCD operations, without relying on RAM storage or on pushing and popping to the stack. Example:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
MUL32:&lt;br /&gt;
        DI&lt;br /&gt;
        AND     A               ; RESET CARRY FLAG&lt;br /&gt;
        SBC     HL,HL           ; LOWER RESULT = 0&lt;br /&gt;
        EXX&lt;br /&gt;
        SBC     HL,HL           ; HIGHER RESULT = 0&lt;br /&gt;
        LD      A,B             ; MPR IS AC'BC&lt;br /&gt;
        LD      B,32            ; INITIALIZE LOOP COUNTER&lt;br /&gt;
MUL32LOOP:&lt;br /&gt;
        SRA     A               ; RIGHT SHIFT MPR&lt;br /&gt;
        RR      C&lt;br /&gt;
        EXX&lt;br /&gt;
        RR      B&lt;br /&gt;
        RR      C               ; LOWEST BIT INTO CARRY&lt;br /&gt;
        JR      NC,MUL32NOADD&lt;br /&gt;
        ADD     HL,DE           ; RESULT += MPD&lt;br /&gt;
        EXX&lt;br /&gt;
        ADC     HL,DE&lt;br /&gt;
        EXX&lt;br /&gt;
MUL32NOADD:&lt;br /&gt;
        SLA     E               ; LEFT SHIFT MPD&lt;br /&gt;
        RL      D&lt;br /&gt;
        EXX&lt;br /&gt;
        RL      E&lt;br /&gt;
        RL      D&lt;br /&gt;
        DJNZ    MUL32LOOP&lt;br /&gt;
        EXX&lt;br /&gt;
       &lt;br /&gt;
; RESULT IN H'L'HL&lt;br /&gt;
        RET&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Shadow registers can be of a great help but they come with two drawbacks :&lt;br /&gt;
&lt;br /&gt;
* they cannot coexist with the &amp;quot;standard&amp;quot; registers : you cannot use ld to assign from a standard to a shadow or vice-versa. Instead you must use nasty constructs such as :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; loads hl' with the contents of hl&lt;br /&gt;
 push hl&lt;br /&gt;
 exx&lt;br /&gt;
 pop hl&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* they require interrupts to be disabled, since they were originally intended for use in Interrupt Service Routines. There are situations where this is affordable and others where it isn't. Regardless, it is generally good policy to restore the previous interrupt status (enabled/disabled) upon return instead of leaving it up to the caller. Fortunately this is relatively easy to do (though it does add 4 bytes and 29/33 T-states to the routine):&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  ld a, i  ; this is the core of the trick, it sets P/V to the value of IFF so P/V is set iff interrupts were enabled at that point&lt;br /&gt;
  push af  ; save flags&lt;br /&gt;
  di       ; disable interrupts&lt;br /&gt;
  &lt;br /&gt;
  ; do something with shadow registers here&lt;br /&gt;
&lt;br /&gt;
  pop af   ; get back flags&lt;br /&gt;
  ret po   ; po = P/V reset so in this case it means interrupts were disabled before the routine was called&lt;br /&gt;
  ei       ; re-enable interrupts&lt;br /&gt;
  ret&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
: Note that this produces code that is ugly and very hard to follow, so comment it thoroughly to make later understanding and debugging easier.&lt;br /&gt;
&lt;br /&gt;
=== SP register ===&lt;br /&gt;
&lt;br /&gt;
This register is used in desperate situations, generally inside an interrupt routine that demands as much speed as possible when all the normal registers are already in use (a notable example is James Montelongo's 4-level grayscale interlace in graylib2.inc).&lt;br /&gt;
You need to know these valid but not widely known instructions:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld sp,6&lt;br /&gt;
 add hl,sp&lt;br /&gt;
 sbc hl,sp&lt;br /&gt;
 inc sp&lt;br /&gt;
 dec sp&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now an example of such a situation:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld (saveSP),sp&lt;br /&gt;
;init hl,de,bc,a&lt;br /&gt;
 ld sp,6&lt;br /&gt;
loop:&lt;br /&gt;
;code&lt;br /&gt;
 add hl,sp  ;get next row of a table for example&lt;br /&gt;
;code using bc,de,ix,a&lt;br /&gt;
 ld a,b&lt;br /&gt;
 or c&lt;br /&gt;
 jp nz,loop&lt;br /&gt;
;code&lt;br /&gt;
 ld sp,(saveSP)&lt;br /&gt;
 ret    ;finish interrupt&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt; &lt;br /&gt;
&lt;br /&gt;
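When you use sp in this way, you cannot push or pop registers and no calls are allowed. Interrupts must also be disabled (unless you are already inside the interrupt routine), since an interrupt pushes its return address through sp.&lt;br /&gt;
Mind again that this is only a last resort. Don't forget to save and restore sp as the example shows.&lt;br /&gt;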
&lt;br /&gt;
== General Algorithms ==&lt;br /&gt;
&lt;br /&gt;
Register and memory use is very important in writing concise and fast z80 code. Then comes general optimization.&lt;br /&gt;
&lt;br /&gt;
First, try to optimize the most frequently executed code: subroutines and large loops. Finding the bottleneck and fixing it is enough for many programs.&lt;br /&gt;
&lt;br /&gt;
Do not forget that in z80 assembly, vector tables (or lookup tables) usually give smaller and faster code than blocks of comparisons and jumps. At other times, using a chunk of data for a task is better than a more conventional programming method (notably in graphics screen effects).&lt;br /&gt;
See [[Z80 Good Programming Practices]] for examples.&lt;br /&gt;
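&lt;br /&gt;
As a small illustration of the lookup table idea, the sketch below turns a bit number into a bit mask with one table read instead of a chain of comparisons or a shift loop. The names GetMask and MaskTable are made up for this example:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Input:  a = bit number (0 to 7, not checked)&lt;br /&gt;
;Output: a = mask with only that bit set, hl destroyed&lt;br /&gt;
GetMask:&lt;br /&gt;
 ld hl,MaskTable&lt;br /&gt;
 add a,l        ;hl = MaskTable + a&lt;br /&gt;
 ld l,a&lt;br /&gt;
 jr nc,$+3      ;skip the inc h when the low byte did not overflow&lt;br /&gt;
 inc h&lt;br /&gt;
 ld a,(hl)      ;fetch the mask&lt;br /&gt;
 ret&lt;br /&gt;
&lt;br /&gt;
MaskTable:&lt;br /&gt;
 .db %00000001,%00000010,%00000100,%00001000&lt;br /&gt;
 .db %00010000,%00100000,%01000000,%10000000&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;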
&lt;br /&gt;
Look through a complete instruction set for instructions that could optimize some part of the code.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
A list of things to keep in mind:&lt;br /&gt;
* Rework conditionals to be more efficient.&lt;br /&gt;
* Make sure the most common checks come first. In other words, check for the rarer special cases last.&lt;br /&gt;
* Move special-case checks out of the main loop if they aren't needed there.&lt;br /&gt;
* Rearrange program flow.&lt;br /&gt;
* When you can afford a bigger setup overhead in order to move code out of the main loop, do it.&lt;br /&gt;
* When your code seems unlikely to be efficient enough even after optimization, try another approach or algorithm. Search for other algorithms on Wikipedia, for instance.&lt;br /&gt;
* Rewriting code from scratch can bring new ideas (use this in desperate situations, because of all the work needed to rewrite it).&lt;br /&gt;
* Remember that it is almost always better to leave optimization until the end. Optimizing too early can bring headaches with crashes and debugging. And because ASM is very fast, and sometimes even smaller than higher level languages, further optimization may not be needed at all.&lt;br /&gt;
* Document wacky optimizations so you can understand the code later (z80 optimization leads to code that is very hard to understand).&lt;br /&gt;
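&lt;br /&gt;
As an example of trading a little setup overhead for a cheaper main loop, the sketch below moves a memory read that never changes out of the loop. The variable name mask and the loop labels are made up; the second version costs 2 more bytes and the c register, but saves 9 T-states on every pass:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of reloading the same value on every pass:&lt;br /&gt;
 ld b,8&lt;br /&gt;
slowloop:&lt;br /&gt;
 ld a,(mask)    ;13 T-states each time&lt;br /&gt;
 and (hl)&lt;br /&gt;
 ld (hl),a&lt;br /&gt;
 inc hl&lt;br /&gt;
 djnz slowloop&lt;br /&gt;
&lt;br /&gt;
;load it once before the loop:&lt;br /&gt;
 ld a,(mask)&lt;br /&gt;
 ld c,a         ;keep the mask in a spare register&lt;br /&gt;
 ld b,8&lt;br /&gt;
fastloop:&lt;br /&gt;
 ld a,c         ;4 T-states each time&lt;br /&gt;
 and (hl)&lt;br /&gt;
 ld (hl),a&lt;br /&gt;
 inc hl&lt;br /&gt;
 djnz fastloop&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;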
&lt;br /&gt;
== Self Modifying Code ==&lt;br /&gt;
&lt;br /&gt;
If your code runs from RAM, it can write to itself to change its own instructions. Having an instruction set reference that lists the opcodes is useful.&lt;br /&gt;
Although self modifying code can be applied to any instruction, it is most common with instructions that load constants into registers.&lt;br /&gt;
&lt;br /&gt;
Generally it is used to save a value for later use (often seen with masks). Examples:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld (savemask),a&lt;br /&gt;
;...code...&lt;br /&gt;
savemask = $+1&lt;br /&gt;
 ld a,$00   ; $00 is just a placeholder&lt;br /&gt;
&lt;br /&gt;
 ld (something),hl&lt;br /&gt;
;... code&lt;br /&gt;
something = $+1&lt;br /&gt;
 ld de,$0000&lt;br /&gt;
&lt;br /&gt;
 ld (saveSP),sp&lt;br /&gt;
;... code ...&lt;br /&gt;
saveSP = $+1&lt;br /&gt;
 ld sp,$0000  ; restore sp&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SMC (Self Modifying Code) is often used with unrolling and relative jumps. Example:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld (jpmodify),a&lt;br /&gt;
;...&lt;br /&gt;
jpmodify = $+1&lt;br /&gt;
 jr $00&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Another use of SMC is to modify load instructions that use (ix+0), changing the 0 to other values to very quickly read and write the nth element of a list without using any extra registers.&lt;br /&gt;
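&lt;br /&gt;
A minimal sketch of patching an (ix+0) displacement, assuming the code runs from RAM and ix points to the start of a list of bytes; the names index and elemoffset are made up. The displacement is the third byte of ld a,(ix+d), hence the $+2:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld a,(index)       ;index of the element wanted (0 to 127)&lt;br /&gt;
 ld (elemoffset),a  ;patch the displacement of the load below&lt;br /&gt;
;...code...&lt;br /&gt;
elemoffset = $+2&lt;br /&gt;
 ld a,(ix+0)        ;0 is just a placeholder, this now reads the element at ix+index&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;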
&lt;br /&gt;
== Small Tricks ==&lt;br /&gt;
&lt;br /&gt;
Note that the following tricks act much like a peephole optimizer and are the last optimization step: remember to first optimize your algorithm and register allocation before applying any of them if you really want the fastest speed and the smallest code.&lt;br /&gt;
&lt;br /&gt;
Also note that nearly every trick makes the code less understandable, so documenting them is a good idea. You can easily forget how they work after a while away from that part of the code.&lt;br /&gt;
&lt;br /&gt;
Be warned that some tricks are not exactly equivalent to the normal way and have exceptions to their use; comments warn about them. Some tricks apply to other cases too, but again you have to be careful.&lt;br /&gt;
&lt;br /&gt;
Some tricks are nothing more than the correct use of the instructions available on the z80. Keeping an instruction set summary at hand helps you see what you can do while coding.&lt;br /&gt;
&lt;br /&gt;
=== Optimize size and speed ===&lt;br /&gt;
&lt;br /&gt;
==== Loading stuff ====&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of:&lt;br /&gt;
 ld a,0&lt;br /&gt;
;Try this:&lt;br /&gt;
 xor a    ;disadvantages: changes flags&lt;br /&gt;
;or&lt;br /&gt;
 sub a    ;disadvantages: changes flags&lt;br /&gt;
; -&amp;gt; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	ld b,$20&lt;br /&gt;
	ld c,$30&lt;br /&gt;
;try this&lt;br /&gt;
	ld bc,$2030&lt;br /&gt;
;or this&lt;br /&gt;
	ld bc,(b_num * 256) + c_num		;where b_num goes to b register and c_num to c register&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
  ld a,$42&lt;br /&gt;
  ld (hl),a&lt;br /&gt;
;try this&lt;br /&gt;
  ld (hl),$42&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	xor a&lt;br /&gt;
	ld (data1),a&lt;br /&gt;
	ld (data2),a&lt;br /&gt;
	ld (data3),a&lt;br /&gt;
	ld (data4),a&lt;br /&gt;
	ld (data5),a	;if data1 to data5 are one after the other&lt;br /&gt;
;try this&lt;br /&gt;
	ld hl,data1&lt;br /&gt;
	ld de,data1+1&lt;br /&gt;
	xor a&lt;br /&gt;
	ld (hl),a&lt;br /&gt;
	ld bc,4&lt;br /&gt;
	ldir&lt;br /&gt;
; -&amp;gt; save 3 bytes for every ld (dataX),a once past the initial overhead (size only: ldir takes 21 T-states per byte, so this is slower)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	ld a,(var)&lt;br /&gt;
	inc a&lt;br /&gt;
	ld (var),a&lt;br /&gt;
;try this	;Note: if hl is not tied up, use indirection:&lt;br /&gt;
	ld hl,var&lt;br /&gt;
	inc (hl)&lt;br /&gt;
	ld a,(hl) ;if you don't need (hl) in a, delete this line&lt;br /&gt;
; -&amp;gt; save 2 bytes and 2 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; Instead of :&lt;br /&gt;
 ld a, (hl)&lt;br /&gt;
 ld (de), a&lt;br /&gt;
 inc hl&lt;br /&gt;
 inc de&lt;br /&gt;
; Use :&lt;br /&gt;
 ldi&lt;br /&gt;
 inc bc&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Never use:&lt;br /&gt;
    push BC&lt;br /&gt;
;    ...&lt;br /&gt;
    pop BC&lt;br /&gt;
    ld D,B&lt;br /&gt;
    ld E,C&lt;br /&gt;
;Use instead:&lt;br /&gt;
    push BC&lt;br /&gt;
;    ...&lt;br /&gt;
    pop DE      ;we only want DE to hold the pushed BC (no need to keep a copy in BC)&lt;br /&gt;
; -&amp;gt; save 2 bytes and 8 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Math and Logic tricks ====&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of:&lt;br /&gt;
 cp 0&lt;br /&gt;
;Use&lt;br /&gt;
 or a&lt;br /&gt;
; -&amp;gt; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  cp 1&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  dec a   ;changes a!&lt;br /&gt;
; -&amp;gt; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  xor %11111111&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  cpl&lt;br /&gt;
; -&amp;gt; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
    ld de,767&lt;br /&gt;
    or a       ;reset carry so sbc works as a sub&lt;br /&gt;
    sbc hl,de&lt;br /&gt;
;try this&lt;br /&gt;
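    ld de,-767 ;add the negated constant instead&lt;br /&gt;
    add hl,de  ;note: unlike sbc, this leaves Z untouched and the resulting carry is inverted&lt;br /&gt;
; -&amp;gt; save 2 bytes and 8 T-states !&lt;br /&gt;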
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
    ld de,-767&lt;br /&gt;
    add hl,de&lt;br /&gt;
;try this&lt;br /&gt;
    dec h  ; -256&lt;br /&gt;
    dec h  ; -512&lt;br /&gt;
    dec h  ; -768&lt;br /&gt;
    inc hl  ; -767&lt;br /&gt;
;Note that this works in many other cases and leaves de free&lt;br /&gt;
; -&amp;gt; save 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	srl a&lt;br /&gt;
	srl a&lt;br /&gt;
	srl a&lt;br /&gt;
;try this&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	and %00011111&lt;br /&gt;
; -&amp;gt; save 1 byte and 5 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	neg&lt;br /&gt;
	add a,N   ;you want to calculate N-A&lt;br /&gt;
;Do it this way:&lt;br /&gt;
	cpl&lt;br /&gt;
	add a,N+1    ;neg is practically equivalent to cpl \ inc a&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Never use:&lt;br /&gt;
    ld A,B&lt;br /&gt;
    neg&lt;br /&gt;
;Instead use:&lt;br /&gt;
    xor A&lt;br /&gt;
    sub B&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Never use:&lt;br /&gt;
    ld A,D&lt;br /&gt;
    sub $D3&lt;br /&gt;
    neg&lt;br /&gt;
;Instead use:&lt;br /&gt;
    ld A,$D3&lt;br /&gt;
    sub D&lt;br /&gt;
; -&amp;gt; save 2 bytes and 8 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  sla l&lt;br /&gt;
  rl h         ; I've actually seen this!&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  add hl,hl&lt;br /&gt;
; -&amp;gt; save 3 bytes and 5 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Conditionals ====&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  and 1&lt;br /&gt;
  cp 1&lt;br /&gt;
  jr z,foo&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  and 1         ;and sets zero flag, no need for cp&lt;br /&gt;
  jr nz,foo&lt;br /&gt;
; -&amp;gt; save 2 bytes and 7 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  and 1&lt;br /&gt;
  cp 1         ;a not needed after this&lt;br /&gt;
  jr z,foo&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  rra&lt;br /&gt;
  jr c,foo&lt;br /&gt;
; -&amp;gt; save 3 bytes and 10 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  bit 0,a&lt;br /&gt;
  call z,foo&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  rra&lt;br /&gt;
  call nc,foo&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states, but changes a&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  bit 7,a&lt;br /&gt;
  jr z,foo&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  rla&lt;br /&gt;
  jr nc,foo&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states, but changes a&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  bit 2,a&lt;br /&gt;
  ret nz&lt;br /&gt;
  xor a&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  and %100&lt;br /&gt;
  ret nz&lt;br /&gt;
; -&amp;gt; save 1 byte and 5 T-states, but a is changed to %100 when returning&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  cp 9        ;if a&amp;lt;=9 then goto label&lt;br /&gt;
  jp c,label&lt;br /&gt;
  jp z,label&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  cp 9+1      ;if a&amp;lt;10 then goto label&lt;br /&gt;
  jp c,label&lt;br /&gt;
; -&amp;gt; save 3 bytes and 10 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Others ====&lt;br /&gt;
&lt;br /&gt;
Calling and returning...&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
 call xxxx&lt;br /&gt;
 ret&lt;br /&gt;
;try this&lt;br /&gt;
 jp xxxx&lt;br /&gt;
;only do this if the routine does not use the return address pushed by the call (for example, some kind of inline vputs that reads its data from there).&lt;br /&gt;
; -&amp;gt; save 1 byte and 17 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Never use:&lt;br /&gt;
    dec B&lt;br /&gt;
    jr NZ,loop    ;I have seen this...&lt;br /&gt;
;Use:&lt;br /&gt;
    djnz loop&lt;br /&gt;
; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
loop:&lt;br /&gt;
 ld a,2&lt;br /&gt;
;code1&lt;br /&gt;
 ld a,0&lt;br /&gt;
;code2&lt;br /&gt;
 djnz loop&lt;br /&gt;
&lt;br /&gt;
;try this&lt;br /&gt;
 ld a,2&lt;br /&gt;
loop:&lt;br /&gt;
;code1&lt;br /&gt;
 xor 2     ; the trick: xor with (2 xor 0) = 2 makes a alternate between the two values (a must be preserved by code1 and code2)&lt;br /&gt;
;code2&lt;br /&gt;
 djnz loop&lt;br /&gt;
; -&amp;gt; save size and time depending on its use&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; Instead of&lt;br /&gt;
 ld a,(Number)&lt;br /&gt;
 cp 0&lt;br /&gt;
 jp z,A_is_0&lt;br /&gt;
 cp 1&lt;br /&gt;
 jp z,A_is_1&lt;br /&gt;
 cp 2&lt;br /&gt;
 jp z,A_is_2&lt;br /&gt;
 cp 3&lt;br /&gt;
 jp z,A_is_3&lt;br /&gt;
 cp 4&lt;br /&gt;
 jp z,A_is_4&lt;br /&gt;
 cp 5&lt;br /&gt;
 jp z,A_is_5&lt;br /&gt;
&lt;br /&gt;
; This is a little better&lt;br /&gt;
 ld a,(Number)&lt;br /&gt;
 or a&lt;br /&gt;
 jp z,A_is_0&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_1&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_2&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_3&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_4&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_5&lt;br /&gt;
&lt;br /&gt;
; Even better&lt;br /&gt;
 ld a,(Number)&lt;br /&gt;
 add a,a   ; a*2 (limits Number to 0..127)&lt;br /&gt;
 ld h,0 &lt;br /&gt;
 ld l,a &lt;br /&gt;
 ld de,VectorTable&lt;br /&gt;
 add hl,de&lt;br /&gt;
 ld a,(hl)&lt;br /&gt;
 inc hl&lt;br /&gt;
 ld h,(hl)&lt;br /&gt;
 ld l,a&lt;br /&gt;
 jp (hl)&lt;br /&gt;
&lt;br /&gt;
VectorTable:&lt;br /&gt;
 .dw A_is_0&lt;br /&gt;
 .dw A_is_1&lt;br /&gt;
 .dw A_is_2&lt;br /&gt;
 .dw A_is_3&lt;br /&gt;
 .dw A_is_4&lt;br /&gt;
 .dw A_is_5&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
Also see [[Z80 Good Programming Practices]]&lt;br /&gt;
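&lt;br /&gt;
The vector table version jumps to garbage if Number is larger than the table, so a range check in front of it is cheap insurance (4 bytes). A sketch, assuming a table of 6 entries and a hypothetical OutOfRange label:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld a,(Number)&lt;br /&gt;
 cp 6               ;number of entries in VectorTable&lt;br /&gt;
 jr nc,OutOfRange   ;a is 6 or more : no entry for this value&lt;br /&gt;
 add a,a            ;continue with the table lookup as before&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;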
&lt;br /&gt;
=== Size vs. Speed ===&lt;br /&gt;
&lt;br /&gt;
Size versus speed is the classical problem of optimization in computer programming, and the Z80 is no exception.&lt;br /&gt;
In ASM, size is most often what matters, because ASM is generally fast enough and it is nice to give the user a smaller program that doesn't use up most of the RAM.&lt;br /&gt;
&lt;br /&gt;
==== For the sake of size ====&lt;br /&gt;
&lt;br /&gt;
* Use relative jumps (jr label) whenever possible. When the target of a relative jump is out of reach (outside -128 to 127 bytes) and there is a jp to it nearby, do a relative jump to that absolute jump. Example:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;lots of code (more than 128 bytes worth of code)&lt;br /&gt;
somelabel2:&lt;br /&gt;
 jp somelabel&lt;br /&gt;
;less than 128 bytes&lt;br /&gt;
 jr somelabel2   ;instead of an absolute jump directly to somelabel, jump to a jump to somelabel.&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Relative jumps are 2 bytes and absolute jumps are 3. In terms of speed, jp always takes 10 T-states, while jr takes 12 T-states when the jump occurs and 7 when it doesn't, so jp is faster when the jump is taken and jr is faster when it isn't.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
 dec bc&lt;br /&gt;
 ld a,b&lt;br /&gt;
 or c&lt;br /&gt;
 ret z&lt;br /&gt;
;try this&lt;br /&gt;
 cpi              ;decrements BC and resets P/V when BC reaches 0; also increments HL and alters S, Z and H&lt;br /&gt;
 ret po&lt;br /&gt;
; save 1 byte at the cost of 2 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Passing inline data'''&lt;br /&gt;
&lt;br /&gt;
When you call, pc + 3 (the address right after the call) is pushed. You can pop it and use it as a pointer to data. A very nifty use is with strings. To return, advance the pointer past the data and jp (hl).&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of:&lt;br /&gt;
 ld hl,string&lt;br /&gt;
 bcall(_vputs)&lt;br /&gt;
 ret&lt;br /&gt;
;Try this:&lt;br /&gt;
  call Disp&lt;br /&gt;
  .db &amp;quot;This is some text&amp;quot;,0&lt;br /&gt;
  ret&lt;br /&gt;
;Not a speed optimization, but it eliminates 2-byte pointers, since it just uses the call's return address.&lt;br /&gt;
;It also heavily disturbs disassembly.&lt;br /&gt;
Disp:&lt;br /&gt;
  pop hl&lt;br /&gt;
  bcall(_vputs)&lt;br /&gt;
  jp (hl)&lt;br /&gt;
; -&amp;gt; save 3 bytes for each use (ld hl,string is gone), but 5 bytes of overhead (Disp routine)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This routine can be expanded to pass the coordinates where the text should appear.&lt;br /&gt;
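&lt;br /&gt;
One possible way to do that is sketched below. It assumes the penCol and penRow equates of ti83plus.inc, with penRow stored right after penCol, and it relies on _vputs leaving hl just past the string, exactly as the Disp routine above does. DispAt is a made-up name:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  call DispAt&lt;br /&gt;
  .db 10,20                 ;column, row&lt;br /&gt;
  .db &amp;quot;This is some text&amp;quot;,0&lt;br /&gt;
  ret&lt;br /&gt;
&lt;br /&gt;
DispAt:&lt;br /&gt;
  pop hl            ;hl = address of the inline data&lt;br /&gt;
  ld e,(hl)         ;e = column&lt;br /&gt;
  inc hl&lt;br /&gt;
  ld d,(hl)         ;d = row&lt;br /&gt;
  inc hl            ;hl = start of the string&lt;br /&gt;
  ld (penCol),de    ;sets penCol and penRow in one go&lt;br /&gt;
  bcall(_vputs)&lt;br /&gt;
  jp (hl)           ;return to the code after the string&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;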
&lt;br /&gt;
'''Wasting time to delay'''&lt;br /&gt;
&lt;br /&gt;
There are those funny times when you need some delay between operations like reads/writes to ports '''''and there is nothing useful to do'''''. Because nop's are not very size friendly, think of other slower but smaller instructions. Example:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
 ld a,KEY_GROUP&lt;br /&gt;
 out (1),a&lt;br /&gt;
 nop&lt;br /&gt;
 nop&lt;br /&gt;
 in a,(1)&lt;br /&gt;
;Try this:&lt;br /&gt;
 ld a,KEY_GROUP&lt;br /&gt;
 out (1),a&lt;br /&gt;
 ld a,(de)    ;a doesn't need to be preserved because it will hold what the port has.&lt;br /&gt;
 in a,(1)&lt;br /&gt;
; -&amp;gt; save 1 byte and 1 T-state (well 1 T-state less is almost the same time)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When you need to delay and cannot afford to alter registers or flags, there are still ways to delay that take less space than nop's:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; 2 bytes, 8 T-states&lt;br /&gt;
 nop&lt;br /&gt;
 nop&lt;br /&gt;
&lt;br /&gt;
; 2 bytes, 12 T-states&lt;br /&gt;
 inc hl&lt;br /&gt;
 dec hl&lt;br /&gt;
&lt;br /&gt;
; 2 bytes, 12 T-states&lt;br /&gt;
 jr $+2&lt;br /&gt;
&lt;br /&gt;
; 2 bytes, 21 T-states&lt;br /&gt;
 push af&lt;br /&gt;
 pop af&lt;br /&gt;
&lt;br /&gt;
; 2 bytes, 38 T-states&lt;br /&gt;
 ex (sp), hl&lt;br /&gt;
 ex (sp), hl&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you need a small adjustable delay:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;4 bytes, b*13+2 T-states (variable): 7 for ld b, 13 per djnz taken, 8 for the last djnz&lt;br /&gt;
	ld b,255	; initial delay&lt;br /&gt;
	djnz $		; do it&lt;br /&gt;
;b=0 on exit&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* There are many other instructions that you can use&lt;br /&gt;
* Beware that not all instructions preserve registers or flags&lt;br /&gt;
* For delays between frames of a game or other longer delays, you can use the 'halt' instruction if interrupts are enabled. It makes the calculator enter low power mode until an interrupt is triggered. To fine-tune this delay mechanism you can alter the interrupt mask and interrupt timer speed beforehand (and possibly restore their values afterwards).&lt;br /&gt;
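&lt;br /&gt;
A sketch of such a halt based delay, assuming interrupts fire at their usual rate and that the interrupt routine preserves b (the count of 10 and the label are arbitrary):&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ei             ;halt never wakes up if interrupts are disabled&lt;br /&gt;
 ld b,10        ;number of interrupts to wait for&lt;br /&gt;
waitloop:&lt;br /&gt;
 halt           ;sleep until the next interrupt&lt;br /&gt;
 djnz waitloop&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;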
&lt;br /&gt;
==== Unrolling code ====&lt;br /&gt;
&lt;br /&gt;
'''General Unrolling'''&lt;br /&gt;
You can unroll a loop several times instead of looping; this is used frequently in multiplication routines.&lt;br /&gt;
This means you are spending memory to gain speed, whereas most of the time you will prefer size to speed.&lt;br /&gt;
&lt;br /&gt;
'''Unroll commands'''&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; &amp;quot;Classic&amp;quot; way : ~21 T-states per byte copied&lt;br /&gt;
 ld hl,src&lt;br /&gt;
 ld de,dest&lt;br /&gt;
 ld bc,size&lt;br /&gt;
 ldir&lt;br /&gt;
&lt;br /&gt;
; Unrolled : 16 + 10 / n T-states per byte copied (n = number of unrolled ldi) -&amp;gt; ~17 T-states per byte when unrolling 8 times&lt;br /&gt;
 ld hl,src&lt;br /&gt;
 ld de,dest&lt;br /&gt;
 ld bc,size  ; if the size is not a multiple of the number of unrolled ldi then a small trick must be used to jump appropriately inside the loop for the first iteration&lt;br /&gt;
loopldi:    ;you can use this entry for a call&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 jp pe, loopldi    ; jp used as it is faster and in the case of a loop unrolling we assume speed matters more than size&lt;br /&gt;
; ret if this is a subroutine and use the unrolled ldi's with a call.&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
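This unrolling of ldi also works with ldd, and with outi if the loop is closed with jp nz instead (outi counts down b and sets Z when it reaches zero).&lt;br /&gt;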
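&lt;br /&gt;
When size is known at assembly time and is not a multiple of 8, the entry point for the first pass can be computed by the assembler. A sketch, assuming the assembler accepts C-like operators in expressions and that the 8 ldi at loopldi stay contiguous (each ldi is 2 bytes):&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld hl,src&lt;br /&gt;
 ld de,dest&lt;br /&gt;
 ld bc,size&lt;br /&gt;
 jp loopldi + 2 * ((8 - (size &amp;amp; 7)) &amp;amp; 7)   ;skip the ldi that the first pass must not execute&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
With size = 20, for instance, the jump lands on the fifth ldi, so the first pass copies 4 bytes and two full passes copy the remaining 16.&lt;br /&gt;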
&lt;br /&gt;
==== Looping with 16 bit counter ====&lt;br /&gt;
There are two ways to make loops with a 16bit counter :&lt;br /&gt;
* the naive one, which results in smaller code but increased loop overhead (24 * n T-states) and destroys a&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  ld bc, ...&lt;br /&gt;
loop:&lt;br /&gt;
  ; loop body here&lt;br /&gt;
 &lt;br /&gt;
  dec bc&lt;br /&gt;
  ld  a, b&lt;br /&gt;
  or  c&lt;br /&gt;
  jp  nz,loop&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
* the slightly trickier one, which takes a couple more bytes but has a much lower overhead (about 13 * n + 9 * (n / 256) T-states)&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  dec  de&lt;br /&gt;
  ld  b, e&lt;br /&gt;
  inc  b&lt;br /&gt;
  inc  d&lt;br /&gt;
loop2:&lt;br /&gt;
  ; loop body here&lt;br /&gt;
  &lt;br /&gt;
  djnz loop2&lt;br /&gt;
  dec  d&lt;br /&gt;
  jp  nz,loop2&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
The rationale behind the second method is to reduce the overhead of the &amp;quot;inner&amp;quot; loop as much as possible and to use the fact that when b gets down to zero it will be treated as 256 by djnz. &lt;br /&gt;
&lt;br /&gt;
You can therefore use the following macros for setting proper values of 8bit loop counters given a 16bit counter in case you want to do the conversion at compile time :&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  #define inner_counter8(counter16) (((counter16) - 1) &amp;amp; 0xff) + 1&lt;br /&gt;
  #define outer_counter8(counter16) (((counter16) - 1) &amp;gt;&amp;gt; 8) + 1&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
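&lt;br /&gt;
As a worked example of these formulas, here is a loop of 600 iterations with the two counters computed by hand (the label name is arbitrary):&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; 600 iterations: 600 - 1 = 599 = 2 * 256 + 87&lt;br /&gt;
;   inner counter = 87 + 1 = 88&lt;br /&gt;
;   outer counter = 2 + 1 = 3&lt;br /&gt;
;   total = 88 + 2 * 256 = 600 passes&lt;br /&gt;
  ld  b, 88&lt;br /&gt;
  ld  d, 3&lt;br /&gt;
loop600:&lt;br /&gt;
  ; loop body here&lt;br /&gt;
&lt;br /&gt;
  djnz loop600&lt;br /&gt;
  dec  d&lt;br /&gt;
  jp  nz,loop600&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;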
&lt;br /&gt;
=== Preserve Registers ===&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; description: both routines compare b to 0, same size and speed but the second preserves accumulator&lt;br /&gt;
; remarks: - inc/dec doesn't affect carry flag&lt;br /&gt;
;          - inc/dec doesn't affect any flags on 16-bit registers, so do not extrapolate to 16-bit registers.&lt;br /&gt;
	ld a,b&lt;br /&gt;
	or b&lt;br /&gt;
	jr z,label&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
	inc b&lt;br /&gt;
	dec b&lt;br /&gt;
	jr z,label&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; description: add a to hl without using a 16-bit register&lt;br /&gt;
;normal way:&lt;br /&gt;
	ld d,$00&lt;br /&gt;
	ld e,a&lt;br /&gt;
	add hl,de&lt;br /&gt;
;4 bytes and 22 clock cycles&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
	add a,l&lt;br /&gt;
	ld l,a&lt;br /&gt;
	jr nc, $+3&lt;br /&gt;
	inc h&lt;br /&gt;
;5 bytes, 19/20 clock cycles&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Setting flags ==&lt;br /&gt;
On some occasions you might want to selectively set or reset a flag.&lt;br /&gt;
&lt;br /&gt;
Here are the most common uses :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; set Carry flag&lt;br /&gt;
 scf&lt;br /&gt;
&lt;br /&gt;
; reset Carry flag (alters Sign and Zero flags as defined)&lt;br /&gt;
 or a&lt;br /&gt;
&lt;br /&gt;
; alternate reset Carry flag (alters Sign and Zero flags as defined)&lt;br /&gt;
 and a&lt;br /&gt;
&lt;br /&gt;
; set Zero flag (resets Carry flag, alters Sign flag as defined)&lt;br /&gt;
 cp a&lt;br /&gt;
&lt;br /&gt;
; reset Zero flag (alters a, reset Carry flag, alters Sign flag as defined)&lt;br /&gt;
 or 1&lt;br /&gt;
&lt;br /&gt;
; set Sign flag (negative) (alters a, reset Zero and Carry flags)&lt;br /&gt;
 or $80&lt;br /&gt;
&lt;br /&gt;
; reset Sign flag (positive) (set a to zero, set Zero flag, reset Carry flag)&lt;br /&gt;
 xor a&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Other possible uses (much rarer) :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Set parity/overflow (even):&lt;br /&gt;
 xor a&lt;br /&gt;
&lt;br /&gt;
;Reset parity/overflow (odd):&lt;br /&gt;
 sub a&lt;br /&gt;
&lt;br /&gt;
;Set half carry (hardly ever useful but still...)&lt;br /&gt;
 and a&lt;br /&gt;
&lt;br /&gt;
;Reset half carry (hardly ever useful but still...)&lt;br /&gt;
 or a&lt;br /&gt;
&lt;br /&gt;
;Set bit 5 of f:&lt;br /&gt;
 or %00100000&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
As you can see these are extremely simple, small and fast ways to alter flags&lt;br /&gt;
which make them interesting as output of routines to indicate error/success or&lt;br /&gt;
other status bits that do not require a full register.&lt;br /&gt;
&lt;br /&gt;
If you use this, remember that these flag (re)setting tricks frequently&lt;br /&gt;
overlap, so if you need a special combination of flags it might require slightly&lt;br /&gt;
more elaborate tricks. As a rule of thumb, always alter the carry last in&lt;br /&gt;
such cases because the scf and ccf instructions leave the Sign, Zero and Parity/Overflow flags untouched.&lt;br /&gt;
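&lt;br /&gt;
As an illustration of returning a status in a flag, the made-up routine below converts an ASCII digit and reports failure through the carry flag. It is only a sketch of the convention, not a routine from any library; CharToDigit and NotADigit are hypothetical names:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Input:  a = ASCII character&lt;br /&gt;
;Output: carry reset and a = 0 to 9 if the character was a digit, carry set otherwise&lt;br /&gt;
CharToDigit:&lt;br /&gt;
 sub '0'        ;carry set if the character is below '0'&lt;br /&gt;
 ret c          ;error, carry is already set&lt;br /&gt;
 cp 10          ;carry set if a is 0 to 9&lt;br /&gt;
 ccf            ;invert it: carry set only if a is 10 or more&lt;br /&gt;
 ret&lt;br /&gt;
&lt;br /&gt;
;the caller tests the flag right after the call&lt;br /&gt;
 call CharToDigit&lt;br /&gt;
 jr c,NotADigit&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;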
&lt;br /&gt;
More advanced ways of manipulating flags follow:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;get the zero flag in carry &lt;br /&gt;
	scf&lt;br /&gt;
	jr z,$+3&lt;br /&gt;
	ccf&lt;br /&gt;
&lt;br /&gt;
;Put carry flag into zero flag (destroys a and leaves the carry inverted).&lt;br /&gt;
	ccf&lt;br /&gt;
	sbc a, a&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Tools of the job ==&lt;br /&gt;
&lt;br /&gt;
Want to test your optimizations or try out new ones? Then check these:&lt;br /&gt;
* Keep a z80 instruction set at hand so you do not forget a useful instruction or the flags it affects. (see [[Z80_Instruction_Set|Z80_Instruction_Set]])&lt;br /&gt;
* Use an assembler that has &amp;quot;.echo&amp;quot; and use this in the source to count size: (see [[Assemblers|Assemblers]])&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;SomeCodeorData:&lt;br /&gt;
;code or data goes here&lt;br /&gt;
End:&lt;br /&gt;
 .echo &amp;quot;size of the code/data:&amp;quot;&lt;br /&gt;
 .echo End-SomeCodeorData&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
* Get a nice z80 IDE that counts code ([[IDEs|IDE's]])&lt;br /&gt;
* Make use of the counting capabilities of an emulator ([[:Category:Emulators|Emulators]])&lt;br /&gt;
&lt;br /&gt;
== Related topics ==&lt;br /&gt;
* [http://www.junemann.nl/maxcoderz/viewtopic.php?f=5&amp;amp;t=675 MaxCoderz TI-ASM optimization]&lt;br /&gt;
* ticalc archives: [http://www.ticalc.org/archives/files/fileinfo/108/10821.html 1] [http://www.ticalc.org/archives/files/fileinfo/285/28502.html 2]&lt;br /&gt;
* [http://www.ballyalley.com/ml/z80_docs/z80_docs.html Bally Alley Z80 Machine Language Documentation]&lt;br /&gt;
* [http://map.grauw.nl/articles/fast_loops.php Fast loops in MSX Assembly Page]&lt;br /&gt;
* [http://shiar.nl/calc/z80/optimize Shiar z80 optimization page]&lt;br /&gt;
* [http://www.smspower.org/dev/docs/wiki/?n=Z80.ProgrammingTechniques SMS Power! dev wiki z80 Techniques]&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
* fullmetalcoder&lt;br /&gt;
* Galandros&lt;br /&gt;
* MaxCoderz participants in assembly optimizing topic (Jim e,CoBB,...)&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=User_talk:Galandros</id>
		<title>User talk:Galandros</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=User_talk:Galandros"/>
				<updated>2010-06-15T09:24:09Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: notes&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Anything, leave a word.&lt;br /&gt;
&lt;br /&gt;
== Spam ==&lt;br /&gt;
&lt;br /&gt;
You want a thorough list of the spam bots? Check out [[Special:ListUsers]]. There's more spam there than we have time to remove. We're currently running on a policy of &amp;quot;just don't try&amp;quot;. (But hey, if you've got time, you could ask [[User:Brandonw]] for Sysops.) At any rate, the Wiki software can't delete accounts, only block them, and most of those bots seem to be dead. We seem to be more interested in blocking any new bots that appear than purging them altogether. [[User:Dr. D&amp;amp;#39;nar|Dr. D&amp;amp;#39;nar]] 16:56, 25 October 2009 (UTC)&lt;br /&gt;
&lt;br /&gt;
-----------------------Answer-----------------------&lt;br /&gt;
lol Alright I agree with &amp;quot;just don't try&amp;quot; policy O_O&lt;br /&gt;
&lt;br /&gt;
We can always block users whose names contain certain keywords:&lt;br /&gt;
buy&lt;br /&gt;
mp3&lt;br /&gt;
download&lt;br /&gt;
(products name)&lt;br /&gt;
etc..&lt;br /&gt;
&lt;br /&gt;
As for pages with strange names, the same policy holds...&lt;br /&gt;
&lt;br /&gt;
[[User:Galandros|Galandros]] 19:03, 25 October 2009 (UTC)&lt;br /&gt;
&lt;br /&gt;
= Notes =&lt;br /&gt;
(personal use)&lt;br /&gt;
&lt;br /&gt;
history:&lt;br /&gt;
MaxCoderz&lt;br /&gt;
Greenlights&lt;br /&gt;
&lt;br /&gt;
http://www.ticalc.org/pub/text/z80/z80instrset.txt&lt;br /&gt;
&lt;br /&gt;
Work on:&lt;br /&gt;
http://wikiti.brandonw.net/index.php?title=Special:SpecialPages&lt;br /&gt;
http://wikiti.brandonw.net/index.php?title=Special:DeadendPages&lt;br /&gt;
http://wikiti.brandonw.net/index.php?title=Special:DoubleRedirects&lt;br /&gt;
http://wikiti.brandonw.net/index.php?title=Special:LonelyPages&lt;br /&gt;
http://wikiti.brandonw.net/index.php?title=Special:ShortPages&lt;br /&gt;
http://wikiti.brandonw.net/index.php?title=Special:UncategorizedCategories&lt;br /&gt;
http://wikiti.brandonw.net/index.php?title=Special:UncategorizedPages&lt;br /&gt;
http://wikiti.brandonw.net/index.php?title=Special:UncategorizedTemplates&lt;br /&gt;
http://wikiti.brandonw.net/index.php?title=Special:UnusedCategories&lt;br /&gt;
http://wikiti.brandonw.net/index.php?title=Special:WantedCategories&lt;br /&gt;
http://wikiti.brandonw.net/index.php?title=Special:WantedPages&lt;br /&gt;
&lt;br /&gt;
WikiTI update:&lt;br /&gt;
http://wikiti.brandonw.net/index.php?title=Meta-tutorial&lt;br /&gt;
&lt;br /&gt;
delete:&lt;br /&gt;
http://wikiti.brandonw.net/index.php?title=Category:Z80_Routines:Input:DetectKeyPress&lt;br /&gt;
&lt;br /&gt;
http://wikiti.brandonw.net/index.php?title=Special:WhatLinksHere&amp;amp;target=155&lt;br /&gt;
http://wikiti.brandonw.net/index.php?title=Special:WhatLinksHere&amp;amp;target=241&lt;br /&gt;
http://wikiti.brandonw.net/index.php?title=%C3%89%C6%92%C5%BD%C3%A5%C2%B9%C2%B3%C3%A6%C2%BC%E2%80%9D%C3%A7%C2%BB%C5%BD%C3%A7%C2%BE%C5%BD%C3%A5%E2%80%BA%C2%BD%C3%A7%C2%A5%C5%BE%C3%A8%C2%AF%C2%9D&lt;br /&gt;
http://wikiti.brandonw.net/index.php?title=%C3%87%C5%93%E2%80%B9%C3%A8%C2%BF%E2%80%A1%C3%A7%C5%A1%E2%80%9E%C3%A6%C5%93%E2%82%AC%C3%A5%C2%A5%C2%BD%C3%A7%C5%A1%E2%80%9E%C3%A4%C2%B8%E2%82%AC%C3%A5%C2%B1%C5%A0%C3%A5%C2%A5%C2%A5%C3%A8%C2%BF%C2%90%C3%A4%C2%BC%C5%A1&lt;br /&gt;
http://wikiti.brandonw.net/index.php?title=%C3%85%C5%92%E2%80%94%C3%A4%C2%BA%C2%AC%C3%A5%C2%A5%C2%A5%C3%A8%C2%BF%C2%90%C3%A7%E2%80%9D%C2%B7%C3%A5%C2%AD%C2%90110%C3%A7%C2%B1%C2%B3%C3%A6%C2%A0%C2%8F%C3%A9%C2%A6%E2%80%93%C3%A8%C2%BD%C2%AE&lt;br /&gt;
http://wikiti.brandonw.net/index.php?title=%C3%85%C2%A4%E2%80%93%C3%A8%C2%B4%C2%B8%C3%A6%C5%93%C2%8D%C3%A8%C2%A3%E2%80%A6%C3%A6%E2%80%B0%C2%B9%C3%A5%C2%8F%E2%80%98---%C3%A6%C5%93%C2%8D%C3%A8%C2%A3%E2%80%A6%C3%A6%E2%80%B0%C2%B9%C3%A5%C2%8F%E2%80%98&lt;br /&gt;
http://wikiti.brandonw.net/index.php?title=%C3%87%E2%80%93%C2%B1%C3%A7%E2%80%93%C2%B9--%C3%A6%E2%82%AC%C2%A7%C3%A7%E2%80%94%E2%80%A6--%C3%A7%E2%80%9D%C5%B8%C3%A6%C2%AE%E2%80%93%C3%A5%E2%84%A2%C2%A8%C3%A7%E2%80%93%C2%B1%C3%A7%E2%80%93%C2%B9&lt;br /&gt;
http://wikiti.brandonw.net/index.php?title=%C3%84%C2%B8%C2%AD%C3%A5%E2%80%BA%C2%BD%C3%A5%C2%A5%C2%B3%C3%A5%C2%AD%C2%90%C3%A6%C2%B2%E2%84%A2%C3%A6%C2%BB%C2%A9%C3%A6%C5%BD%E2%80%99%C3%A7%C2%90%C6%92%C3%A9%E2%80%9D%C2%81%C3%A5%C2%AE%C5%A1%C3%A5%C2%A5%C2%A5%C3%A8%C2%BF%C2%90%C3%A4%C2%BC%C5%A1%C3%A9%E2%80%9C%C2%B6%C3%A7%E2%80%B0%C5%92&lt;br /&gt;
&lt;br /&gt;
;bcalls or ram&lt;br /&gt;
;-------------&lt;br /&gt;
usermemoff	= $89EC&lt;br /&gt;
_JForceCmd	= $402A&lt;br /&gt;
_homeup		= $4558&lt;br /&gt;
LCD_BUSY_QUICK	= $000B&lt;br /&gt;
_lcd_busy	= $4051 ;wait till bit 1 of port 2 is set&lt;br /&gt;
_EraseEOL		equ 4552h	;missing from WikiTI ; erases from the cursor to the end of the line&lt;br /&gt;
freeRAM		equ 9815h ;pretty sure this is the amount of RAM free, valid in Mem Mgmt/Del anyway&lt;br /&gt;
&lt;br /&gt;
;Display tokens:&lt;br /&gt;
;Get_Tok_Strng, and _PutTokString presumably calls that and then _VPutS&lt;br /&gt;
&lt;br /&gt;
(localLanguage): two bytes. (localLanguage) contains the language number as follows:&lt;br /&gt;
0Ah - Spanish&lt;br /&gt;
0Ch - French&lt;br /&gt;
07h - German&lt;br /&gt;
16h - Portuguese&lt;br /&gt;
09h - English&lt;br /&gt;
(localLanguage+1) is some sort of version number...1 has been seen with all but 16h, which has 2. 3 is apparently possible. Evidently not all that important.&lt;br /&gt;
&lt;br /&gt;
_ErrNotEnoughMem 	equ 448Ch ;only if not HL bytes free&lt;br /&gt;
_GetDispRowOffset	equ 4D59h ;HL=A*12 (intended for A to be row and HL becomes offset into plotSScreen)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
;Official Name: A2PointHLind&lt;br /&gt;
;BCALL Address: 4036&lt;br /&gt;
;This routine adds two times A to HL and then jumps to LdHLind. It can be used to get an address from a pointer table.&lt;br /&gt;
;&lt;br /&gt;
;[edit] Inputs&lt;br /&gt;
;    * hl = pointer table base&lt;br /&gt;
;    * a = entry in table to grab &lt;br /&gt;
;&lt;br /&gt;
;[edit] Outputs&lt;br /&gt;
;    * hl = (hl+2a)&lt;br /&gt;
;    * a = (hl+2a)&lt;br /&gt;
;    * de = 2a&lt;br /&gt;
;    * bc preserved&lt;br /&gt;
;    * f destroyed&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
;This will get the free archive in OP3:&lt;br /&gt;
Label101:&lt;br /&gt;
	bcall(5014h)     &lt;br /&gt;
	ld bc,(839Fh) &lt;br /&gt;
	ld (OP1),bc &lt;br /&gt;
	ld bc,(83A1h) &lt;br /&gt;
	ld (OP1M),bc &lt;br /&gt;
	ld hl,8493h&lt;br /&gt;
	ld b,06h&lt;br /&gt;
Label730:&lt;br /&gt;
	ld de,000Ah&lt;br /&gt;
	push hl      &lt;br /&gt;
	push bc      &lt;br /&gt;
	bcall(80B1h)     &lt;br /&gt;
	pop bc      &lt;br /&gt;
	pop hl      &lt;br /&gt;
	ld a,(8486h)  &lt;br /&gt;
	add a,30h&lt;br /&gt;
	ld (hl),a &lt;br /&gt;
	dec hl      &lt;br /&gt;
	djnz Label730&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
;If your program runs with Asm(), then the amount of free RAM will be smaller because a copy of your program was made.&lt;br /&gt;
;Or use the VAT if your program needs to expand and shrink itself on the fly. Find yourself in the VAT, locate your data, read your two-byte size prefix, and add that to the value returned from MemCheck.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
;You can get the full calc ID:&lt;br /&gt;
&lt;br /&gt;
;&amp;gt;&amp;gt;&amp;gt;&amp;gt;&amp;gt;&amp;gt;&amp;gt;&amp;gt;&amp;gt;&amp;gt;&amp;gt;&amp;gt;&amp;gt;&amp;gt;&amp;gt;&amp;gt;&amp;gt;&amp;gt;&amp;gt;&amp;gt;&amp;gt;&amp;gt;&amp;gt;&lt;br /&gt;
;Undocumented rom-calls&lt;br /&gt;
;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&lt;br /&gt;
;bcall 807E&lt;br /&gt;
;Stores the first five bytes of the ID to OP4. You will have to convert them to ascii hex&lt;br /&gt;
;if you want to display them. 1 byte = 2 hex ascii chars, you know.&lt;br /&gt;
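&lt;br /&gt;
;A minimal sketch of that conversion, not taken from any TI routine: byteToHex, nibbleToHex and nibbleIsDigit are hypothetical names.&lt;br /&gt;
;Assumes A = byte to convert and HL = pointer to a 2-byte output buffer. Destroys A and C, advances HL by 2.&lt;br /&gt;
byteToHex:&lt;br /&gt;
	ld c,a		;keep a copy of the byte&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca		;high nibble is now in bits 0-3&lt;br /&gt;
	call nibbleToHex&lt;br /&gt;
	ld a,c		;now the low nibble, falling through&lt;br /&gt;
nibbleToHex:&lt;br /&gt;
	and 0Fh&lt;br /&gt;
	cp 10&lt;br /&gt;
	jr c,nibbleIsDigit&lt;br /&gt;
	add a,7		;jump the gap between '9' and 'A'&lt;br /&gt;
nibbleIsDigit:&lt;br /&gt;
	add a,30h	;'0'&lt;br /&gt;
	ld (hl),a&lt;br /&gt;
	inc hl&lt;br /&gt;
	ret&lt;br /&gt;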
&lt;br /&gt;
;call 3c85h&lt;br /&gt;
;This returns HL as a pointer to the remaining two bytes of the ID.&lt;br /&gt;
;WILL ONLY WORK ON ROM VERSION 1.14 (don't know about higher versions, but who uses them anyway)&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Z80_Routines:Input:GetCSC</id>
		<title>Z80 Routines:Input:GetCSC</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Z80_Routines:Input:GetCSC"/>
				<updated>2010-06-15T09:21:15Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: added another routine&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Z80 Routines:Input|GetCSC]]&lt;br /&gt;
[[Category:Z80 Routines|GetCSC]]&lt;br /&gt;
&lt;br /&gt;
This is a replacement for the GetCSC routine. Its return values are exactly the same. You need a one-byte variable called lastKey so that held keys do not repeat; leave that check out if you do want repeats. Feel free to add versions with extras such as controlled repeating.&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;;====== GetCSC clone ===========================================================&lt;br /&gt;
; This routine is a replacement for the GetCSC bcall.  Its returns are the same.&lt;br /&gt;
; Inputs:&lt;br /&gt;
;  - None&lt;br /&gt;
; Outputs:&lt;br /&gt;
;  - A: Keycode&lt;br /&gt;
; Destroys:&lt;br /&gt;
;  - AF, BC&lt;br /&gt;
&lt;br /&gt;
; To do: Add debouncing&lt;br /&gt;
&lt;br /&gt;
GetCloneSC:&lt;br /&gt;
;	push bc			; uncomment for preserving bc&lt;br /&gt;
	ld	c, 0BFh&lt;br /&gt;
	ld	b, 7&lt;br /&gt;
getCSCloop:&lt;br /&gt;
	ld	a, c&lt;br /&gt;
	out	(1), a&lt;br /&gt;
	nop&lt;br /&gt;
	nop&lt;br /&gt;
	nop&lt;br /&gt;
	rrca&lt;br /&gt;
	ld	c, a&lt;br /&gt;
	in	a, (1)&lt;br /&gt;
	cp	0ffh&lt;br /&gt;
	jr	nz, getCSCgotCSC&lt;br /&gt;
	djnz	getCSCloop&lt;br /&gt;
	xor	a&lt;br /&gt;
	ld	(lastKey), a&lt;br /&gt;
	ret&lt;br /&gt;
getCSCgotCSC:&lt;br /&gt;
	dec	b&lt;br /&gt;
	ld	c, b&lt;br /&gt;
	call	getResetBit&lt;br /&gt;
	ld	a, b&lt;br /&gt;
	sla	c&lt;br /&gt;
	sla	c&lt;br /&gt;
	sla	c&lt;br /&gt;
	add	a, c&lt;br /&gt;
	ld	b, a		; This dance ensures that&lt;br /&gt;
	ld	a, (lastKey)	; the keycode is returned in A&lt;br /&gt;
	ld	c, a&lt;br /&gt;
	ld	a, b&lt;br /&gt;
	cp	c&lt;br /&gt;
	ld	(lastKey), a&lt;br /&gt;
	jr	nz, getCSCgoodCSC&lt;br /&gt;
	xor	a&lt;br /&gt;
getCSCgoodCSC:&lt;br /&gt;
;	pop bc&lt;br /&gt;
	ret&lt;br /&gt;
			&lt;br /&gt;
getResetBit:&lt;br /&gt;
	cp	$FF&lt;br /&gt;
	ret	z&lt;br /&gt;
	ld	b, 0&lt;br /&gt;
getResetBitLoop:&lt;br /&gt;
	rrca&lt;br /&gt;
	inc	b&lt;br /&gt;
	jr	c, getResetBitLoop&lt;br /&gt;
	ret&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
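&lt;br /&gt;
A minimal usage sketch, assuming lastKey is a single byte of your own (here reserved inside the program with a TASM-style .db) and waitKey is a hypothetical label; it is not part of the routine above:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;waitKey:&lt;br /&gt;
	call	GetCloneSC&lt;br /&gt;
	or	a		; 0 = no key, or the same key still held&lt;br /&gt;
	jr	z, waitKey&lt;br /&gt;
	; A now holds the keycode of a newly pressed key&lt;br /&gt;
	ret&lt;br /&gt;
&lt;br /&gt;
lastKey:&lt;br /&gt;
	.db	0		; last keycode seen by GetCloneSC&amp;lt;/nowiki&amp;gt;&lt;br /&gt;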
&lt;br /&gt;
==== Alternative Version ====&lt;br /&gt;
&lt;br /&gt;
This has debouncing?&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;;Getcsc replacement by James Montelongo &lt;br /&gt;
; Outputs:&lt;br /&gt;
;  - A: Keycode&lt;br /&gt;
; Destroys:&lt;br /&gt;
;  - AF&lt;br /&gt;
gsGetK:&lt;br /&gt;
gsGetCSC:&lt;br /&gt;
	push hl&lt;br /&gt;
	push de&lt;br /&gt;
	push bc&lt;br /&gt;
	ld e,$fe		;first group&lt;br /&gt;
	ld c,$01		;key port&lt;br /&gt;
	ld l,0		;l holds key pressed&lt;br /&gt;
cscloop:&lt;br /&gt;
	ld a,$ff		;For some reason emulator really wants it in the loop&lt;br /&gt;
	out (1),a		;reset keyport&lt;br /&gt;
	ld h,$fe&lt;br /&gt;
	out (c),e		;set keygroup&lt;br /&gt;
	ld b,8		;loop count; a delay is needed when working with the key driver&lt;br /&gt;
	in a,(c)		;read key&lt;br /&gt;
cscbit:&lt;br /&gt;
	inc l			;inc to get key pressed&lt;br /&gt;
	rra 			; if key pressed done&lt;br /&gt;
	jp nc,donecsc&lt;br /&gt;
	rlc h&lt;br /&gt;
	djnz cscbit 	;loop 8&lt;br /&gt;
	rlc e			;next key group&lt;br /&gt;
	jp m,cscloop	;if bit 7 set loop&lt;br /&gt;
	ld l,0		;if no key pressed 0&lt;br /&gt;
donecsc:&lt;br /&gt;
	ld a,$ff&lt;br /&gt;
	out (1),a&lt;br /&gt;
	ld a,e&lt;br /&gt;
	cpl&lt;br /&gt;
	out (1),a&lt;br /&gt;
	nop&lt;br /&gt;
	nop&lt;br /&gt;
	in a,(1)&lt;br /&gt;
	inc a&lt;br /&gt;
	jp z,nootherkeypressed&lt;br /&gt;
	ld l,0&lt;br /&gt;
nootherkeypressed:&lt;br /&gt;
	ld a,$ff&lt;br /&gt;
	out (1),a&lt;br /&gt;
	nop&lt;br /&gt;
	ld a,e&lt;br /&gt;
	out (1),a&lt;br /&gt;
	nop&lt;br /&gt;
	nop&lt;br /&gt;
	in a,(1)&lt;br /&gt;
	cp h&lt;br /&gt;
	jr z,only1key&lt;br /&gt;
	ld l,0&lt;br /&gt;
only1key:&lt;br /&gt;
	ld a,l		;&lt;br /&gt;
	or a&lt;br /&gt;
	ld (gs_keymem),a&lt;br /&gt;
	pop bc&lt;br /&gt;
	pop de&lt;br /&gt;
	pop hl&lt;br /&gt;
	ret&amp;lt;/nowiki&amp;gt;&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Z80_Optimization</id>
		<title>Z80 Optimization</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Z80_Optimization"/>
				<updated>2010-06-13T14:25:41Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: /* Self Modifying Code */ minor edits and added another great example of SMC by Quigibo&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
Sometimes you need some extra speed in ASM, or need to make your game smaller so that it fits on the calculator. Typical cases are programs that consume a lot of graphics or data, and graphics code for mapping, grayscale and 3D.&lt;br /&gt;
&lt;br /&gt;
If you are just looking to cut a few bytes, go straight to the Small Tricks section of this page.&lt;br /&gt;
&lt;br /&gt;
== Registers and Memory ==&lt;br /&gt;
Good z80 algorithms generally use the registers appropriately.&lt;br /&gt;
It is also good practice to keep a convention and plan how you are going to use the registers.&lt;br /&gt;
&lt;br /&gt;
General use of registers:&lt;br /&gt;
* a - 8-bit accumulator&lt;br /&gt;
* b - counter&lt;br /&gt;
* c,d,e,h,l - auxiliary to the accumulator, or copies of a or b&lt;br /&gt;
&lt;br /&gt;
* hl - 16-bit accumulator/pointer to a memory address&lt;br /&gt;
* de - pointer to a destination memory address&lt;br /&gt;
* bc - 16-bit counter&lt;br /&gt;
* ix - index register/pointer to a table in memory/saved copy of hl/pointer to memory when hl and de are in use&lt;br /&gt;
* iy - index register/pointer to a table in memory (use it when there is no other option or you need optimal execution; disable interrupts and restore the original value on exit, because TI-OS uses iy)&lt;br /&gt;
&lt;br /&gt;
=== 8-bit vs. 16-bit Operations ===&lt;br /&gt;
&lt;br /&gt;
The z80 processor is faster at operations on 8-bit values.&lt;br /&gt;
Code dealing with 16-bit registers tends to be bigger and slower, because the equivalent 16-bit instruction is slower, or does not exist and has to be replaced with several instructions. Sometimes the equivalent 16-bit instruction is also 1 byte bigger.&lt;br /&gt;
If you use the ix or iy registers, operations are slower still and each instruction is at least 1 byte bigger. So try to convert your code to use hl and de instead of ix and iy.&lt;br /&gt;
&lt;br /&gt;
In a practical example, imagine:&lt;br /&gt;
- you pass a value to a routine through the accumulator&lt;br /&gt;
- the only valid values of the accumulator range from 0 to 63, and the routine needs to multiply it by, say, 12, so the result has to be stored in a 16-bit register pair&lt;br /&gt;
- but you can multiply a by 4 before it overflows (63*4 = 252, which is smaller than 255) and take advantage of this to optimize&lt;br /&gt;
&lt;br /&gt;
Now the code:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; The most usual way is to copy A (the accumulator) to HL right at the start&lt;br /&gt;
	ld h,0&lt;br /&gt;
	ld l,a&lt;br /&gt;
	add a,a&lt;br /&gt;
	ld d,h&lt;br /&gt;
	ld e,a&lt;br /&gt;
	add hl,de&lt;br /&gt;
	add hl,hl&lt;br /&gt;
	add hl,hl	; hl=a*12&lt;br /&gt;
; 9 bytes, 56 clocks&lt;br /&gt;
&lt;br /&gt;
; But given a is between 0 and 63 you can multiply by 4 without overflowing the 8-bit limit (255)&lt;br /&gt;
	add a,a&lt;br /&gt;
	add a,a		; a*4&lt;br /&gt;
	ld l,a&lt;br /&gt;
	ld e,a&lt;br /&gt;
	ld h,0&lt;br /&gt;
	ld d,h		; hl=a*4 and de=a*4&lt;br /&gt;
	add hl,hl	; hl=a*8&lt;br /&gt;
	add hl,de	; hl=a*12&lt;br /&gt;
; 9 bytes, 49 clocks&lt;br /&gt;
&lt;br /&gt;
; hey, minus 7 clock cycles&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In this example you only shaved off a few clock cycles, but sometimes you can save some bytes, too.&lt;br /&gt;
You can do this with registers other than the accumulator.&lt;br /&gt;
&lt;br /&gt;
For example, if the value is passed in l and l is always lower than 64, you can do &amp;quot; sla l \ sla l \ ld h,0 &amp;quot; to multiply l by four and then use hl for 16-bit operations. In this case you are trading size for speed: each sla instruction is 2 bytes and add hl,hl is only 1 byte.&lt;br /&gt;
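&lt;br /&gt;
As a rough side-by-side of the two ways to get hl = l*4 (assuming l is below 64; sizes and timings are counted from the standard z80 instruction tables):&lt;br /&gt;
 &lt;nowiki&gt;&lt;br /&gt;
; 16-bit way&lt;br /&gt;
	ld h,0&lt;br /&gt;
	add hl,hl&lt;br /&gt;
	add hl,hl	; hl=l*4&lt;br /&gt;
; 4 bytes, 29 clocks&lt;br /&gt;
&lt;br /&gt;
; 8-bit way, valid only while l*4 fits in 8 bits&lt;br /&gt;
	sla l&lt;br /&gt;
	sla l&lt;br /&gt;
	ld h,0		; hl=l*4&lt;br /&gt;
; 6 bytes, 23 clocks&lt;br /&gt;
 &lt;/nowiki&gt;&lt;br /&gt;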
&lt;br /&gt;
Mind that these optimizations can introduce bugs and make the code somewhat hard to follow, so comment them.&lt;br /&gt;
I recommend applying this optimization only when you really need the speed and the code is bug free.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
One common trick for multiplying by 256 is simply to load the low byte register into the high byte register and zero the low byte. This works because in binary a multiplication by 256 is a shift left by 8 bits, with zeros entering from the right. Examples:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; multiply a by 256 and store in hl&lt;br /&gt;
	ld h,a&lt;br /&gt;
	ld l,0&lt;br /&gt;
; multiply hl by 256 and store in ade (pseudo 24-bit pair register)&lt;br /&gt;
	ld a,h&lt;br /&gt;
	ld d,l&lt;br /&gt;
	ld e,0&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Stack ===&lt;br /&gt;
&lt;br /&gt;
When you run out of registers, the stack may offer an interesting alternative to fixed RAM locations for temporary storage.&lt;br /&gt;
&lt;br /&gt;
==== Allocation ====&lt;br /&gt;
&lt;br /&gt;
You can allocate stack space with repeated push instructions, which lets you initialize the data but restricts the allocated space to multiples of 2.&lt;br /&gt;
An alternative is to allocate uninitialized stack space (hl may be replaced with an index register):&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; allocates 7 bytes of stack space: 5 bytes, 27 T-states, instead of 4 bytes, 44 T-states with 4 pushes, which would have forced allocating 8 bytes&lt;br /&gt;
 ld hl, -7&lt;br /&gt;
 add hl, sp&lt;br /&gt;
 ld sp, hl&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Access ====&lt;br /&gt;
&lt;br /&gt;
The most common way of accessing data allocated on the stack is through an index register, since every allocated &amp;quot;variable&amp;quot; can then be reached without inc/dec, but this is obviously not a strict requirement. Beware, though: using stack space is not always optimal in terms of speed, depending (among other things) on your register allocation strategy:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; 3 bytes, 19 T-states&lt;br /&gt;
 ld c, (ix + n)   ; n is an immediate value in -128..127&lt;br /&gt;
 &lt;br /&gt;
 ; 4 bytes, 17 T-states, destroys a&lt;br /&gt;
 ld a, (somelocation)&lt;br /&gt;
 ld c, a&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If your needs go beyond simple load/store, however, this method starts to show its real power, since it vastly simplifies some operations that are complicated to do with a fixed storage location (and that generally clobber registers in the process).&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; 3 bytes, 19 T-states&lt;br /&gt;
 cp (ix + n)&lt;br /&gt;
&lt;br /&gt;
 sub (ix + n)&lt;br /&gt;
 sbc a, (ix + n)&lt;br /&gt;
 add a, (ix + n)&lt;br /&gt;
 adc a, (ix + n)&lt;br /&gt;
&lt;br /&gt;
 and (ix + n)&lt;br /&gt;
 or (ix + n)&lt;br /&gt;
 xor (ix + n)&lt;br /&gt;
&lt;br /&gt;
 ; 3 bytes, 23 T-states&lt;br /&gt;
 inc (ix + n)&lt;br /&gt;
 dec (ix + n)&lt;br /&gt;
&lt;br /&gt;
 ; 4 bytes, 23 T-states&lt;br /&gt;
 rl (ix + n)&lt;br /&gt;
 rr (ix + n)&lt;br /&gt;
 rlc (ix + n)&lt;br /&gt;
 rrc (ix + n)&lt;br /&gt;
 sla (ix + n)&lt;br /&gt;
 sra (ix + n)&lt;br /&gt;
 sll (ix + n)      ; undocumented instruction&lt;br /&gt;
 srl (ix + n)&lt;br /&gt;
 set k, (ix + n)   ; k is an immediate value in 0..7&lt;br /&gt;
 res k, (ix + n)&lt;br /&gt;
&lt;br /&gt;
 ; 4 bytes, 20 T-states&lt;br /&gt;
 bit k, (ix + n)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Again, choose wisely between hl and an index register: depending on the structure of your data, the smallest/fastest solution may vary (the hl equivalents are generally 2 bytes smaller and 12 T-states faster, but do not allow indexing and so may require intermediate inc/dec).&lt;br /&gt;
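&lt;br /&gt;
A small sketch of that trade-off, assuming three consecutive one-byte variables that are read in order (the task itself is only an illustration):&lt;br /&gt;
 &lt;nowiki&gt;&lt;br /&gt;
 ; sum three bytes through ix : 9 bytes, 57 T-states, ix unchanged&lt;br /&gt;
 ld a, (ix + 0)&lt;br /&gt;
 add a, (ix + 1)&lt;br /&gt;
 add a, (ix + 2)&lt;br /&gt;
&lt;br /&gt;
 ; same through hl : 5 bytes, 33 T-states, but hl is left pointing at the last byte&lt;br /&gt;
 ld a, (hl)&lt;br /&gt;
 inc hl&lt;br /&gt;
 add a, (hl)&lt;br /&gt;
 inc hl&lt;br /&gt;
 add a, (hl)&lt;br /&gt;
 &lt;/nowiki&gt;&lt;br /&gt;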
&lt;br /&gt;
==== Deallocation ====&lt;br /&gt;
&lt;br /&gt;
If you need to pop an entry from the stack but must preserve all registers, remember that sp can be incremented/decremented like any 16-bit register:&lt;br /&gt;
 &lt;nowiki&gt;&lt;br /&gt;
 ; drops the top stack entry: wastes 1 byte and 2 T-states compared to a pop, but may enable better register allocation...&lt;br /&gt;
 inc sp&lt;br /&gt;
 inc sp&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you have a large amount of stack space to drop and a spare 16-bit register (hl, an index register, or de, which you can easily swap with hl):&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; drop 16 bytes of stack space : 5 bytes, 27 T-states instead of 8 bytes, 80 T-states for 8 pop&lt;br /&gt;
 ld hl, 16&lt;br /&gt;
 add hl, sp&lt;br /&gt;
 ld sp, hl&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt; &lt;br /&gt;
The larger the space to drop, the more T-states you will save, and at some point you will start saving space as well (beyond 10 bytes, i.e. more than 5 pops).&lt;br /&gt;
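&lt;br /&gt;
Putting allocation, access and deallocation together, a minimal sketch (the 4-byte size and the use made of the bytes are only illustrative; any push in between must be matched by a pop before sp is read again):&lt;br /&gt;
 &lt;nowiki&gt;&lt;br /&gt;
 ld hl, -4&lt;br /&gt;
 add hl, sp&lt;br /&gt;
 ld sp, hl     ; allocate 4 bytes, hl points to the first one&lt;br /&gt;
 ld (hl), a    ; store a in the first temporary byte&lt;br /&gt;
 ; ... code that may freely use a and hl ...&lt;br /&gt;
 ld hl, 0&lt;br /&gt;
 add hl, sp    ; hl points to the first temporary byte again&lt;br /&gt;
 ld a, (hl)    ; read the saved value back&lt;br /&gt;
 ld hl, 4&lt;br /&gt;
 add hl, sp&lt;br /&gt;
 ld sp, hl     ; drop the 4 bytes, sp is back at the return address&lt;br /&gt;
 ret&lt;br /&gt;
 &lt;/nowiki&gt;&lt;br /&gt;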
&lt;br /&gt;
=== Shadow registers ===&lt;br /&gt;
&lt;br /&gt;
In some rare cases, when you run out of registers and can neither refactor your algorithm(s) nor rely on RAM storage, you may want to use the shadow registers: af', bc', de' and hl'&lt;br /&gt;
&lt;br /&gt;
These registers behave like their &amp;quot;standard&amp;quot; counterparts (af, bc, de, hl), and you can swap the two register sets using the following instructions:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ex af, af'  ; swaps af and af' as the mnemonic indicates&lt;br /&gt;
&lt;br /&gt;
 exx         ; swaps bc, de, hl and bc', de', hl'&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Shadow registers are fairly commonly used for arithmetic on big integers (16-bit to 32-bit) or BCD operations without relying on RAM storage or on pushing and popping the stack. Example:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
MUL32:&lt;br /&gt;
        DI&lt;br /&gt;
        AND     A               ; RESET CARRY FLAG&lt;br /&gt;
        SBC     HL,HL           ; LOWER RESULT = 0&lt;br /&gt;
        EXX&lt;br /&gt;
        SBC     HL,HL           ; HIGHER RESULT = 0&lt;br /&gt;
        LD      A,B             ; MPR IS AC'BC&lt;br /&gt;
        LD      B,32            ; INITIALIZE LOOP COUNTER&lt;br /&gt;
MUL32LOOP:&lt;br /&gt;
        SRA     A               ; RIGHT SHIFT MPR&lt;br /&gt;
        RR      C&lt;br /&gt;
        EXX&lt;br /&gt;
        RR      B&lt;br /&gt;
        RR      C               ; LOWEST BIT INTO CARRY&lt;br /&gt;
        JR      NC,MUL32NOADD&lt;br /&gt;
        ADD     HL,DE           ; RESULT += MPD&lt;br /&gt;
        EXX&lt;br /&gt;
        ADC     HL,DE&lt;br /&gt;
        EXX&lt;br /&gt;
MUL32NOADD:&lt;br /&gt;
        SLA     E               ; LEFT SHIFT MPD&lt;br /&gt;
        RL      D&lt;br /&gt;
        EXX&lt;br /&gt;
        RL      E&lt;br /&gt;
        RL      D&lt;br /&gt;
        DJNZ    MUL32LOOP&lt;br /&gt;
        EXX&lt;br /&gt;
       &lt;br /&gt;
; RESULT IN H'L'HL&lt;br /&gt;
        RET&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Shadow registers can be a great help, but they come with two drawbacks:&lt;br /&gt;
&lt;br /&gt;
* they cannot coexist with the &amp;quot;standard&amp;quot; registers: you cannot use ld to assign from a standard register to a shadow one or vice versa. Instead you must use nasty constructs such as:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; loads hl' with the contents of hl&lt;br /&gt;
 push hl&lt;br /&gt;
 exx&lt;br /&gt;
 pop hl&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* they require interrupts to be disabled, since they were originally intended for use in Interrupt Service Routines. There are situations where that is affordable and others where it is not. Regardless, it is generally good policy to restore the previous interrupt status (enabled/disabled) upon return instead of leaving it up to the caller. Fortunately this is relatively easy to do (though it does add 4 bytes and 29/33 T-states to the routine):&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  ld a, i  ; this is the core of the trick: it copies the interrupt flip-flop (IFF2) into P/V, so P/V is set if and only if interrupts were enabled at that point&lt;br /&gt;
  push af  ; save flags&lt;br /&gt;
  di       ; disable interrupts&lt;br /&gt;
  &lt;br /&gt;
  ; do something with shadow registers here&lt;br /&gt;
&lt;br /&gt;
  pop af   ; get back flags&lt;br /&gt;
  ret po   ; po = P/V reset so in this case it means interrupts were disabled before the routine was called&lt;br /&gt;
  ei       ; re-enable interrupts&lt;br /&gt;
  ret&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
: Note that this produces ugly code that is very hard to follow, so comment it very well for later understanding and debugging.&lt;br /&gt;
&lt;br /&gt;
=== SP register ===&lt;br /&gt;
&lt;br /&gt;
This register is used in desperate situations, generally in an interrupt loop that demands as much speed as possible while all the normal registers are already in use (notably in James Montelongo's 4-level grayscale interlace in graylib2.inc).&lt;br /&gt;
You need to know these valid but not widely known instructions:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld sp,6&lt;br /&gt;
 add hl,sp&lt;br /&gt;
 sbc hl,sp&lt;br /&gt;
 inc sp&lt;br /&gt;
 dec sp&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now an example of such a situation:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld (saveSP),sp&lt;br /&gt;
;init hl,de,bc,a&lt;br /&gt;
 ld sp,6&lt;br /&gt;
loop:&lt;br /&gt;
;code&lt;br /&gt;
 add hl,sp  ;get next row of a table for example&lt;br /&gt;
;code using bc,de,ix,a&lt;br /&gt;
 ld a,b&lt;br /&gt;
 or c&lt;br /&gt;
 jp nz,loop&lt;br /&gt;
;code&lt;br /&gt;
 ld sp,(saveSP)&lt;br /&gt;
 ret    ;finish interrupt&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt; &lt;br /&gt;
&lt;br /&gt;
When you use sp in this way, you cannot push/pop registers and no calls are allowed. Interrupts must also stay disabled, because an interrupt would push its return address wherever sp happens to point.&lt;br /&gt;
Mind again that this is only a last resort. Do not forget to save and restore sp as the example shows.&lt;br /&gt;
&lt;br /&gt;
== General Algorithms ==&lt;br /&gt;
&lt;br /&gt;
Register and memory use is very important for writing concise and fast z80 code. Then comes general optimization.&lt;br /&gt;
&lt;br /&gt;
First, try to optimize the most frequently executed code: subroutines and large loops. Finding the bottleneck and fixing it is enough for many programs.&lt;br /&gt;
&lt;br /&gt;
Do not forget that in z80 assembly, vector tables (or look-up tables) give smaller and faster code than blocks of comparisons and jumps. At other times, using a chunk of data for a task is better than a more conventional programming method (notably in graphics screen effects).&lt;br /&gt;
See [[Z80 Good Programming Practices]] for examples.&lt;br /&gt;
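&lt;br /&gt;
A minimal sketch of such a vector table, assuming a holds an index from 0 to 2, handler0 to handler2 are hypothetical routines of yours, and the assembler uses .dw for 16-bit data:&lt;br /&gt;
 &lt;nowiki&gt;&lt;br /&gt;
 add a,a        ; 2 bytes per entry (index must be below 128)&lt;br /&gt;
 ld e,a&lt;br /&gt;
 ld d,0&lt;br /&gt;
 ld hl,vectors&lt;br /&gt;
 add hl,de      ; hl points to the table entry&lt;br /&gt;
 ld a,(hl)&lt;br /&gt;
 inc hl&lt;br /&gt;
 ld h,(hl)&lt;br /&gt;
 ld l,a         ; hl = address stored in the entry&lt;br /&gt;
 jp (hl)        ; same cost whatever the index&lt;br /&gt;
vectors:&lt;br /&gt;
 .dw handler0&lt;br /&gt;
 .dw handler1&lt;br /&gt;
 .dw handler2&lt;br /&gt;
 &lt;/nowiki&gt;&lt;br /&gt;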
&lt;br /&gt;
Look through a complete instruction set reference for instructions that could optimize some part of the code.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
A list of things to keep in mind:&lt;br /&gt;
* Rework conditionals to be more efficient.&lt;br /&gt;
* Make sure the most common checks come first. Put another way, check the rarer special cases last.&lt;br /&gt;
* Move special-case checks out of the main loop if they are not needed there.&lt;br /&gt;
* Rearrange program flow&lt;br /&gt;
* When you can afford a bigger setup overhead in exchange for moving code out of the main loop, do it.&lt;br /&gt;
* When your code looks like it will not be efficient enough even after optimization, try another approach or algorithm. Search for other algorithms on Wikipedia, for instance.&lt;br /&gt;
* Rewriting code from scratch can bring new ideas (use only in desperate situations, because of all the work needed to rewrite it)&lt;br /&gt;
* Remember that it is almost always better to leave optimization to the end. Optimizing too early can bring headaches with crashes and debugging. And because ASM is very fast and sometimes even smaller than higher level languages, further optimization may not be needed.&lt;br /&gt;
* Document wacky optimizations so you can understand the code later (z80 optimization leads to code that is very hard to understand)&lt;br /&gt;
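&lt;br /&gt;
As an illustration of moving work out of the main loop, a minimal sketch that assumes mask is a hypothetical one-byte variable, hl points to the data, b holds the count and c is free:&lt;br /&gt;
 &lt;nowiki&gt;&lt;br /&gt;
; before: the mask is fetched from memory on every pass&lt;br /&gt;
maskloop:&lt;br /&gt;
 ld a,(mask)    ; 13 T-states per pass&lt;br /&gt;
 and (hl)&lt;br /&gt;
 ld (hl),a&lt;br /&gt;
 inc hl&lt;br /&gt;
 djnz maskloop&lt;br /&gt;
&lt;br /&gt;
; after: the mask is fetched once and kept in c&lt;br /&gt;
 ld a,(mask)&lt;br /&gt;
 ld c,a&lt;br /&gt;
maskloop2:&lt;br /&gt;
 ld a,c         ; 4 T-states per pass&lt;br /&gt;
 and (hl)&lt;br /&gt;
 ld (hl),a&lt;br /&gt;
 inc hl&lt;br /&gt;
 djnz maskloop2&lt;br /&gt;
; 2 bytes bigger, but 9 T-states faster per pass&lt;br /&gt;
 &lt;/nowiki&gt;&lt;br /&gt;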
&lt;br /&gt;
== Self Modifying Code ==&lt;br /&gt;
&lt;br /&gt;
If your code is in RAM, it can write to itself to change the code. Having an instruction set reference that explains the opcodes is useful.&lt;br /&gt;
Although self-modifying code can be applied to any instruction, it is most common with instructions that load constants into registers.&lt;br /&gt;
&lt;br /&gt;
Generally it is used to save a value to be used later (often seen with masks). Examples:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld (savemask),a&lt;br /&gt;
;...code...&lt;br /&gt;
savemask = $+1&lt;br /&gt;
 ld a,$00   ; $00 is just a placeholder&lt;br /&gt;
&lt;br /&gt;
 ld (something),hl&lt;br /&gt;
;... code&lt;br /&gt;
something = $+1&lt;br /&gt;
 ld de,$0000&lt;br /&gt;
&lt;br /&gt;
 ld (saveSP),sp&lt;br /&gt;
;... code ...&lt;br /&gt;
saveSP = $+1&lt;br /&gt;
 ld sp,$0000  ; restore sp&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
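SMC (Self Modifying Code) is often used with unrolled code and relative jumps. Example:&lt;br /&gt;
 &lt;nowiki&gt;&lt;br /&gt;
 ld (jpmodify),a   ; a = displacement = number of rrca to skip&lt;br /&gt;
;...&lt;br /&gt;
jpmodify = $+1&lt;br /&gt;
 jr $00            ; $00 is a placeholder displacement; if your assembler treats the operand as a target address, write jr $+2 instead&lt;br /&gt;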
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Another use of SMC is to modify load instructions that use (ix+0), changing the 0 to other values to read and write the nth element of a list really quickly without using any extra registers.&lt;br /&gt;
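&lt;br /&gt;
A minimal sketch of this, assuming ix points to the start of a list of bytes, e holds an element index from 0 to 127 and elemofs is a hypothetical label:&lt;br /&gt;
 &lt;nowiki&gt;&lt;br /&gt;
 ld a,e&lt;br /&gt;
 ld (elemofs),a   ; patch the displacement byte&lt;br /&gt;
;...&lt;br /&gt;
elemofs = $+2     ; ld a,(ix+d) is encoded as $DD,$7E,d&lt;br /&gt;
 ld a,(ix+0)      ; 0 is just a placeholder&lt;br /&gt;
 &lt;/nowiki&gt;&lt;br /&gt;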
&lt;br /&gt;
== Small Tricks ==&lt;br /&gt;
&lt;br /&gt;
Note that the following tricks act much like a peephole optimizer and are the last optimization step: remember to optimize your algorithm and register allocation first, before applying any of the following, if you really want the fastest speed and the smallest code.&lt;br /&gt;
&lt;br /&gt;
Also note that nearly every trick makes the code less understandable, so documenting them is a good idea. You can easily forget how they work after a while away from that part of the code.&lt;br /&gt;
&lt;br /&gt;
Be warned that some tricks are not exactly equivalent to the normal way and may have exceptions to their use; the comments warn about them. Some tricks apply to other cases too, but again you have to be careful.&lt;br /&gt;
&lt;br /&gt;
Some tricks are nothing more than correct use of the instructions available on the z80. Keeping an instruction set summary at hand helps you see what you can do while coding.&lt;br /&gt;
&lt;br /&gt;
=== Optimize size and speed ===&lt;br /&gt;
&lt;br /&gt;
==== Loading stuff ====&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of:&lt;br /&gt;
 ld a,0&lt;br /&gt;
;Try this:&lt;br /&gt;
 xor a    ;disadvantages: changes flags&lt;br /&gt;
;or&lt;br /&gt;
 sub a    ;disadvantages: changes flags&lt;br /&gt;
; -&amp;gt; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	ld b,$20&lt;br /&gt;
	ld c,$30&lt;br /&gt;
;try this&lt;br /&gt;
	ld bc,$2030&lt;br /&gt;
;or this&lt;br /&gt;
	ld bc,(b_num * 256) + c_num		;where b_num goes to b register and c_num to c register&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
  ld a,$42&lt;br /&gt;
  ld (hl),a&lt;br /&gt;
;try this&lt;br /&gt;
  ld (hl),$42&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	xor a&lt;br /&gt;
	ld (data1),a&lt;br /&gt;
	ld (data2),a&lt;br /&gt;
	ld (data3),a&lt;br /&gt;
	ld (data4),a&lt;br /&gt;
	ld (data5),a	;if data1 to data5 are one after the other&lt;br /&gt;
;try this&lt;br /&gt;
	ld hl,data1&lt;br /&gt;
	ld de,data1+1&lt;br /&gt;
	xor a&lt;br /&gt;
	ld (hl),a&lt;br /&gt;
	ld bc,4&lt;br /&gt;
	ldir&lt;br /&gt;
; -&amp;gt; save 3 bytes for every further ld (dataX),a once the initial overhead is paid (note that ldir is slower than the direct stores)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	ld a,(var)&lt;br /&gt;
	inc a&lt;br /&gt;
	ld (var),a&lt;br /&gt;
;try this	;Note: if hl is not tied up, use indirection:&lt;br /&gt;
	ld hl,var&lt;br /&gt;
	inc (hl)&lt;br /&gt;
	ld a,(hl) ;if you don't need (hl) in a, delete this line&lt;br /&gt;
; -&amp;gt; save 2 bytes and 2 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; Instead of :&lt;br /&gt;
 ld a, (hl)&lt;br /&gt;
 ld (de), a&lt;br /&gt;
 inc hl&lt;br /&gt;
 inc de&lt;br /&gt;
; Use :&lt;br /&gt;
 ldi&lt;br /&gt;
 inc bc&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Never use:&lt;br /&gt;
    push BC&lt;br /&gt;
;    ...&lt;br /&gt;
    pop BC&lt;br /&gt;
    ld D,B&lt;br /&gt;
    ld E,C&lt;br /&gt;
;Use instead:&lt;br /&gt;
    push BC&lt;br /&gt;
;    ...&lt;br /&gt;
    pop DE      ;we only want DE to hold the pushed BC (BC does not need a copy of it)&lt;br /&gt;
; -&amp;gt; save 2 bytes and 8 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Math and Logic tricks ====&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of:&lt;br /&gt;
 cp 0&lt;br /&gt;
;Use&lt;br /&gt;
 or a&lt;br /&gt;
; -&amp;gt; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  cp 1&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  dec a   ;changes a!&lt;br /&gt;
; -&amp;gt; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  xor %11111111&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  cpl&lt;br /&gt;
; -&amp;gt; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
    ld de,767&lt;br /&gt;
    or a       ;reset carry so sbc works as a sub&lt;br /&gt;
    sbc hl,de&lt;br /&gt;
;try this&lt;br /&gt;
    ld de,-767 ;the negated constant&lt;br /&gt;
    add hl,de  ;note: carry comes out inverted compared to sbc, and zero is not affected&lt;br /&gt;
; -&amp;gt; save 2 bytes and 8 T-states !&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
    ld de,-767&lt;br /&gt;
    add hl,de&lt;br /&gt;
;try this&lt;br /&gt;
    dec h  ; -256&lt;br /&gt;
    dec h  ; -512&lt;br /&gt;
    dec h  ; -768&lt;br /&gt;
    inc hl  ; -767&lt;br /&gt;
;Note that this works in many other cases (and de is preserved)&lt;br /&gt;
; -&amp;gt; save 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	srl a&lt;br /&gt;
	srl a&lt;br /&gt;
	srl a&lt;br /&gt;
;try this&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	and %00011111&lt;br /&gt;
; -&amp;gt; save 1 byte and 5 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	neg&lt;br /&gt;
	add a,N   ;you want to calculate N-A&lt;br /&gt;
;Do it this way:&lt;br /&gt;
	cpl&lt;br /&gt;
	add a,N+1    ;neg is practically equivalent to cpl \ inc a&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Never use:&lt;br /&gt;
    ld A,B&lt;br /&gt;
    neg&lt;br /&gt;
;Instead use:&lt;br /&gt;
    xor A&lt;br /&gt;
    sub B&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Never use:&lt;br /&gt;
    ld A,D&lt;br /&gt;
    sub $D3&lt;br /&gt;
    neg&lt;br /&gt;
;Instead use:&lt;br /&gt;
    ld A,$D3&lt;br /&gt;
    sub D&lt;br /&gt;
; -&amp;gt; save 2 bytes and 8 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  sla l&lt;br /&gt;
  rl h         ; I've actually seen this!&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  add hl,hl&lt;br /&gt;
; -&amp;gt; save 3 bytes and 5 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Conditionals ====&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  and 1&lt;br /&gt;
  cp 1&lt;br /&gt;
  jr z,foo&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  and 1         ;and sets zero flag, no need for cp&lt;br /&gt;
  jr nz,foo&lt;br /&gt;
; -&amp;gt; save 2 bytes and 7 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  and 1&lt;br /&gt;
  cp 1         ;a not needed after this&lt;br /&gt;
  jr z,foo&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  rra          ;bit 0 goes to carry (a is changed)&lt;br /&gt;
  jr c,foo&lt;br /&gt;
; -&amp;gt; save 3 bytes and 10 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  bit 0,a&lt;br /&gt;
  call z,foo&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  rra          ;changes a!&lt;br /&gt;
  call nc,foo&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  bit 7,a&lt;br /&gt;
  jr z,foo&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  rla          ;changes a!&lt;br /&gt;
  jr nc,foo&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  bit 2,a&lt;br /&gt;
  ret nz&lt;br /&gt;
  xor a&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
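  and %100     ;a is already 0 when we do not return (but a is masked when we do)&lt;br /&gt;
  ret nz&lt;br /&gt;
; -&amp;gt; save 1 byte and up to 5 T-states&lt;br /&gt;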
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  cp 9        ;if a&amp;lt;=9 then goto label&lt;br /&gt;
  jp c,label&lt;br /&gt;
  jp z,label&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  cp 9+1      ;if a&amp;lt;10 then goto label&lt;br /&gt;
  jp c,label&lt;br /&gt;
; -&amp;gt; save 3 bytes and 10 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Others ====&lt;br /&gt;
&lt;br /&gt;
Calling and returning...&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
 call xxxx&lt;br /&gt;
 ret&lt;br /&gt;
;try this&lt;br /&gt;
 jp xxxx&lt;br /&gt;
;only do this if the routine does not make use of the return address pushed on the stack. Example: some kind of inline vputs.&lt;br /&gt;
; -&amp;gt; save 1 byte and 17 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Never use:&lt;br /&gt;
    dec B&lt;br /&gt;
    jr NZ,loop    ;I have seen this...&lt;br /&gt;
;Use:&lt;br /&gt;
    djnz loop&lt;br /&gt;
; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
loop:&lt;br /&gt;
 ld a,2&lt;br /&gt;
;code1&lt;br /&gt;
 ld a,0&lt;br /&gt;
;code2&lt;br /&gt;
 djnz loop&lt;br /&gt;
&lt;br /&gt;
;try this&lt;br /&gt;
 ld a,2&lt;br /&gt;
loop:&lt;br /&gt;
;code1&lt;br /&gt;
 xor $02   ; the trick: xor with (value1 xor value2), here 2 xor 0, makes a register alternate between two values&lt;br /&gt;
;code2&lt;br /&gt;
 djnz loop&lt;br /&gt;
; -&amp;gt; save size and time depending on its use&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; Instead of&lt;br /&gt;
 ld a,(Number)&lt;br /&gt;
 cp 0&lt;br /&gt;
 jp z,A_is_0&lt;br /&gt;
 cp 1&lt;br /&gt;
 jp z,A_is_1&lt;br /&gt;
 cp 2&lt;br /&gt;
 jp z,A_is_2&lt;br /&gt;
 cp 3&lt;br /&gt;
 jp z,A_is_3&lt;br /&gt;
 cp 4&lt;br /&gt;
 jp z,A_is_4&lt;br /&gt;
 cp 5&lt;br /&gt;
 jp z,A_is_5&lt;br /&gt;
&lt;br /&gt;
; This is a little better&lt;br /&gt;
 ld a,(Number)&lt;br /&gt;
 or a&lt;br /&gt;
 jp z,A_is_0&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_1&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_2&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_3&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_4&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_5&lt;br /&gt;
&lt;br /&gt;
; Even better&lt;br /&gt;
 ld a,(Number)&lt;br /&gt;
 add a,a   ; a*2 (Number must be below 128 and must be a valid table index) &lt;br /&gt;
 ld h,0 &lt;br /&gt;
 ld l,a &lt;br /&gt;
 ld de,VectorTable&lt;br /&gt;
 add hl,de&lt;br /&gt;
 ld a,(hl)&lt;br /&gt;
 inc hl&lt;br /&gt;
 ld h,(hl)&lt;br /&gt;
 ld l,a&lt;br /&gt;
 jp (hl)&lt;br /&gt;
&lt;br /&gt;
VectorTable:&lt;br /&gt;
 .dw A_is_0      ; entry for Number = 0 comes first&lt;br /&gt;
 .dw A_is_1&lt;br /&gt;
 .dw A_is_2&lt;br /&gt;
 .dw A_is_3&lt;br /&gt;
 .dw A_is_4&lt;br /&gt;
 .dw A_is_5&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
Also see [[Z80 Good Programming Practices]]&lt;br /&gt;
&lt;br /&gt;
=== Size vs. Speed ===&lt;br /&gt;
&lt;br /&gt;
This is the classical problem of optimization in computer programming, and the Z80 is no exception.&lt;br /&gt;
In ASM, size is most often what matters, because ASM is generally fast enough and it is nice to give the user a smaller program that doesn't use up most of the RAM.&lt;br /&gt;
&lt;br /&gt;
==== For the sake of size ====&lt;br /&gt;
&lt;br /&gt;
* Use relative jumps (jr label) whenever possible. When the target of a relative jump is out of reach (outside -128 to 127 bytes) and there is a jp to the same target nearby, do a relative jump to the absolute one. Example:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;lots of code (more than 128 bytes worth of code)&lt;br /&gt;
somelabel2:&lt;br /&gt;
 jp somelabel&lt;br /&gt;
;less than 128 bytes&lt;br /&gt;
 jr somelabel2   ;instead of an absolute jump directly to somelabel, jump to a jump to somelabel.&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Relative jumps are 2 bytes and absolute jumps are 3. In terms of speed, jp always takes 10 T-states, while jr takes 12 T-states when the jump occurs and 7 when it doesn't; so jp is faster when the jump is taken and jr is faster when it is not.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
 dec bc&lt;br /&gt;
 ld a,b&lt;br /&gt;
 or c&lt;br /&gt;
 ret z&lt;br /&gt;
;try this&lt;br /&gt;
 cpi              ;increments HL&lt;br /&gt;
 ret po&lt;br /&gt;
; save 1 byte at the cost of 2 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Passing inline data'''&lt;br /&gt;
&lt;br /&gt;
When you call a routine, the address of the instruction after the call (pc + 3) is pushed. The routine can pop it and use it as a pointer to data placed right after the call; a very nifty use is with strings. To return, skip past the data and jp (hl).&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of:&lt;br /&gt;
 ld hl,string&lt;br /&gt;
 bcall(_vputs)&lt;br /&gt;
 ret&lt;br /&gt;
;Try this:&lt;br /&gt;
  call Disp&lt;br /&gt;
  .db &amp;quot;This is some text&amp;quot;,0&lt;br /&gt;
  ret&lt;br /&gt;
;Not a speed optimization, but it eliminates 2-byte pointers, since it just uses the call's return address.&lt;br /&gt;
;It also heavily disturbs disassembly.&lt;br /&gt;
Disp:&lt;br /&gt;
  pop hl&lt;br /&gt;
  bcall(_vputs)&lt;br /&gt;
  jp (hl)&lt;br /&gt;
; -&amp;gt; save 3 bytes for each use (the ld hl,string is gone), but 5 bytes of overhead (Disp routine)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This routine can be expanded to pass the coordinates where the text should appear.&lt;br /&gt;
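&lt;br /&gt;
As a rough sketch of that expansion, the variant below reads two coordinate bytes before the string. The name DispXY and the byte order (column, then row) are made up for this example; penCol and penRow are assumed to be the usual TI-83 Plus equates, and _vputs is assumed to leave hl just past the terminating 0, exactly as the Disp routine above relies on.&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Usage (hypothetical):&lt;br /&gt;
  call DispXY&lt;br /&gt;
  .db 10,20                  ;pen column, pen row&lt;br /&gt;
  .db &amp;quot;This is some text&amp;quot;,0&lt;br /&gt;
  ret&lt;br /&gt;
&lt;br /&gt;
DispXY:&lt;br /&gt;
  pop hl          ;hl -&amp;gt; inline data after the call&lt;br /&gt;
  ld a,(hl)       ;first byte: pen column&lt;br /&gt;
  ld (penCol),a&lt;br /&gt;
  inc hl&lt;br /&gt;
  ld a,(hl)       ;second byte: pen row&lt;br /&gt;
  ld (penRow),a&lt;br /&gt;
  inc hl          ;hl -&amp;gt; start of the string&lt;br /&gt;
  bcall(_vputs)   ;display it; hl is assumed to end up after the 0&lt;br /&gt;
  jp (hl)         ;resume execution after the inline data&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;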
&lt;br /&gt;
'''Wasting time to delay'''&lt;br /&gt;
&lt;br /&gt;
There are those funny times when you need some delay between operations, like reads/writes to ports, '''''and there is nothing useful to do'''''. Because nop's are not very size friendly, think of other, slower but smaller instructions. Example:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
 ld a,KEY_GROUP&lt;br /&gt;
 out (1),a&lt;br /&gt;
 nop&lt;br /&gt;
 nop&lt;br /&gt;
 in a,(1)&lt;br /&gt;
;Try this:&lt;br /&gt;
 ld a,KEY_GROUP&lt;br /&gt;
 out (1),a&lt;br /&gt;
 ld a,(de)    ;a doesn't need to be preserved because it will hold what the port has.&lt;br /&gt;
 in a,(1)&lt;br /&gt;
; -&amp;gt; save 1 byte and 1 T-state (well 1 T-state less is almost the same time)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When you need to delay and cannot afford to alter registers or flags, there are still ways to get more delay per byte than with nop's :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; 2 bytes, 8 T-states&lt;br /&gt;
 nop&lt;br /&gt;
 nop&lt;br /&gt;
&lt;br /&gt;
; 2 bytes, 12 T-states&lt;br /&gt;
 inc hl&lt;br /&gt;
 dec hl&lt;br /&gt;
&lt;br /&gt;
; 2 bytes, 12 T-states&lt;br /&gt;
 jr $+2&lt;br /&gt;
&lt;br /&gt;
; 2 bytes, 21 T-states&lt;br /&gt;
 push af&lt;br /&gt;
 pop af&lt;br /&gt;
&lt;br /&gt;
; 2 bytes, 38 T-states&lt;br /&gt;
 ex (sp), hl&lt;br /&gt;
 ex (sp), hl&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you need a small adjustable delay:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;4 bytes, b*13+2 T-states (variable, counting the ld b)&lt;br /&gt;
	ld b,255	; initial delay&lt;br /&gt;
	djnz $		; do it&lt;br /&gt;
;b=0 on exit&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* There are many other instructions that you can use&lt;br /&gt;
* Beware that not all instructions preserve registers or flags&lt;br /&gt;
* For delays between frames of a game or other longer delays, you can use the 'halt' instruction if interrupts are enabled. It makes the calculator enter low power mode until an interrupt is triggered. To fine-tune this delay mechanism you can alter the interrupt mask and the interrupt timer speed beforehand (and possibly restore their values afterwards).&lt;br /&gt;
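&lt;br /&gt;
A minimal sketch of a halt-based delay follows. The label name and the count of 10 are arbitrary, and the real duration depends on the interrupt rate of the calculator, which is not specified here.&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;waits for 10 interrupts; destroys b&lt;br /&gt;
 ei             ;halt never wakes up if interrupts are disabled&lt;br /&gt;
 ld b,10        ;number of interrupts to wait for (arbitrary)&lt;br /&gt;
waitframe:&lt;br /&gt;
 halt           ;sleep until the next interrupt&lt;br /&gt;
 djnz waitframe&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;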
&lt;br /&gt;
==== Unrolling code ====&lt;br /&gt;
&lt;br /&gt;
'''General Unrolling'''&lt;br /&gt;
You can unroll a loop, repeating its body several times instead of looping; this is frequently used in multiplication routines.&lt;br /&gt;
This means you are spending memory to gain speed. Most of the time you will prefer size to speed.&lt;br /&gt;
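&lt;br /&gt;
As an illustration, here is a common shift-and-add 8-bit multiplication written both ways. The label names are arbitrary, and the size and timing figures in the comments are estimates, not measurements.&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;hl = h * e, looped: about 12 bytes&lt;br /&gt;
mul_h_e:&lt;br /&gt;
 ld d,0&lt;br /&gt;
 ld l,d         ;hl = h*256, de = e&lt;br /&gt;
 ld b,8&lt;br /&gt;
mul_loop:&lt;br /&gt;
 add hl,hl      ;next multiplier bit goes to carry&lt;br /&gt;
 jr nc,mul_skip&lt;br /&gt;
 add hl,de      ;bit was set: add the multiplicand&lt;br /&gt;
mul_skip:&lt;br /&gt;
 djnz mul_loop&lt;br /&gt;
 ret&lt;br /&gt;
&lt;br /&gt;
;hl = h * e, unrolled: about 36 bytes, roughly 100 T-states faster&lt;br /&gt;
;(no ld b,8 and no djnz), and b is preserved&lt;br /&gt;
mul_h_e_unrolled:&lt;br /&gt;
 ld d,0&lt;br /&gt;
 ld l,d&lt;br /&gt;
 add hl,hl      ;bit 7&lt;br /&gt;
 jr nc,$+3      ;skip the 1-byte add hl,de&lt;br /&gt;
 add hl,de&lt;br /&gt;
 add hl,hl      ;bit 6&lt;br /&gt;
 jr nc,$+3&lt;br /&gt;
 add hl,de&lt;br /&gt;
 add hl,hl      ;bit 5&lt;br /&gt;
 jr nc,$+3&lt;br /&gt;
 add hl,de&lt;br /&gt;
 add hl,hl      ;bit 4&lt;br /&gt;
 jr nc,$+3&lt;br /&gt;
 add hl,de&lt;br /&gt;
 add hl,hl      ;bit 3&lt;br /&gt;
 jr nc,$+3&lt;br /&gt;
 add hl,de&lt;br /&gt;
 add hl,hl      ;bit 2&lt;br /&gt;
 jr nc,$+3&lt;br /&gt;
 add hl,de&lt;br /&gt;
 add hl,hl      ;bit 1&lt;br /&gt;
 jr nc,$+3&lt;br /&gt;
 add hl,de&lt;br /&gt;
 add hl,hl      ;bit 0&lt;br /&gt;
 jr nc,$+3&lt;br /&gt;
 add hl,de&lt;br /&gt;
 ret&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;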
&lt;br /&gt;
'''Unroll commands'''&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; &amp;quot;Classic&amp;quot; way : ~21 T-states per byte copied&lt;br /&gt;
 ld hl,src&lt;br /&gt;
 ld de,dest&lt;br /&gt;
 ld bc,size&lt;br /&gt;
 ldir&lt;br /&gt;
&lt;br /&gt;
; Unrolled : (16 * n + 10) / n T-states per byte copied when unrolling n times -&amp;gt; ~17 T-states per byte with n = 8&lt;br /&gt;
 ld hl,src&lt;br /&gt;
 ld de,dest&lt;br /&gt;
 ld bc,size  ; if the size is not a multiple of the number of unrolled ldi then a small trick must be used to jump appropriately inside the loop for the first iteration&lt;br /&gt;
loopldi:    ;you can use this entry for a call&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 jp pe, loopldi    ; jp used as it is faster and in the case of a loop unrolling we assume speed matters more than size&lt;br /&gt;
; ret if this is a subroutine and use the unrolled ldi's with a call.&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
This unrolling of ldi also works with ldd and, with the loop condition adapted (outi counts down in b and sets the Zero flag), with outi.&lt;br /&gt;
&lt;br /&gt;
==== Looping with 16 bit counter ====&lt;br /&gt;
There are two ways to make loops with a 16bit counter :&lt;br /&gt;
* the naive one, which results in smaller code but increased loop overhead (24 * n T-states) and destroys a&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  ld bc, ...&lt;br /&gt;
loop:&lt;br /&gt;
  ; loop body here&lt;br /&gt;
 &lt;br /&gt;
  dec bc&lt;br /&gt;
  ld  a, b&lt;br /&gt;
  or  c&lt;br /&gt;
  jp  nz,loop&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
* the slightly trickier one, which takes a couple more bytes but has a much lower overhead (about 13 * n + 9 * (n / 256) T-states)&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  dec  de&lt;br /&gt;
  ld  b, e&lt;br /&gt;
  inc  b&lt;br /&gt;
  inc  d&lt;br /&gt;
loop2:&lt;br /&gt;
  ; loop body here&lt;br /&gt;
  &lt;br /&gt;
  djnz loop2&lt;br /&gt;
  dec  d&lt;br /&gt;
  jp  nz,loop2&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
The rationale behind the second method is to reduce the overhead of the &amp;quot;inner&amp;quot; loop as much as possible and to use the fact that when b gets down to zero it will be treated as 256 by djnz. &lt;br /&gt;
&lt;br /&gt;
You can therefore use the following macros for setting proper values of 8bit loop counters given a 16bit counter in case you want to do the conversion at compile time :&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  #define inner_counter8(counter16) (((counter16) - 1) &amp;amp; 0xff) + 1&lt;br /&gt;
  #define outer_counter8(counter16) (((counter16) - 1) &amp;gt;&amp;gt; 8) + 1&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
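&lt;br /&gt;
For instance, assuming an assembler that expands these macros as written, a loop of 1000 iterations could be set up as sketched below (the count of 1000 and the label loop3 are only examples). Note that when the 16-bit count is a multiple of 256 the inner macro evaluates to 256, which has to be loaded into b as 0.&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  ld  b, inner_counter8(1000)   ; ((1000 - 1) &amp;amp; 0xff) + 1 = 232&lt;br /&gt;
  ld  d, outer_counter8(1000)   ; ((1000 - 1) &amp;gt;&amp;gt; 8) + 1 = 4&lt;br /&gt;
loop3:&lt;br /&gt;
  ; loop body here, runs 232 + 3 * 256 = 1000 times&lt;br /&gt;
&lt;br /&gt;
  djnz loop3&lt;br /&gt;
  dec  d&lt;br /&gt;
  jp  nz,loop3&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;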
&lt;br /&gt;
=== Preserve Registers ===&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; description: both routines compare b to 0, same size and speed but the second preserves accumulator&lt;br /&gt;
; remarks: - inc/dec doesn't affect carry flag&lt;br /&gt;
;          - inc/dec doesn't affect any flags on 16-bit registers, so do not extrapolate to 16-bit registers.&lt;br /&gt;
	ld a,b&lt;br /&gt;
	or b&lt;br /&gt;
	jr z,label&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
	inc b&lt;br /&gt;
	dec b&lt;br /&gt;
	jr z,label&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; description: add a to hl without using another 16-bit register pair (a is destroyed)&lt;br /&gt;
;normal way:&lt;br /&gt;
	ld d,$00&lt;br /&gt;
	ld e,a&lt;br /&gt;
	add hl,de&lt;br /&gt;
;4 bytes and 22 clock cycles&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
	add a,l&lt;br /&gt;
	ld l,a&lt;br /&gt;
	jr nc, $+3&lt;br /&gt;
	inc h&lt;br /&gt;
;5 bytes, 19/20 clock cycles&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Setting flags ==&lt;br /&gt;
On some occasions you might want to selectively set/reset a flag.&lt;br /&gt;
&lt;br /&gt;
Here are the most common uses :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; set Carry flag&lt;br /&gt;
 scf&lt;br /&gt;
&lt;br /&gt;
; reset Carry flag (alters Sign and Zero flags as defined)&lt;br /&gt;
 or a&lt;br /&gt;
&lt;br /&gt;
; alternate reset Carry flag (alters Sign and Zero flags as defined)&lt;br /&gt;
 and a&lt;br /&gt;
&lt;br /&gt;
; set Zero flag (resets Carry flag, alters Sign flag as defined)&lt;br /&gt;
 cp a&lt;br /&gt;
&lt;br /&gt;
; reset Zero flag (alters a, reset Carry flag, alters Sign flag as defined)&lt;br /&gt;
 or 1&lt;br /&gt;
&lt;br /&gt;
; set Sign flag (negative) (alters a, reset Zero and Carry flags)&lt;br /&gt;
 or $80&lt;br /&gt;
&lt;br /&gt;
; reset Sign flag (positive) (set a to zero, set Zero flag, reset Carry flag)&lt;br /&gt;
 xor a&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Other possible uses (much rarer) :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Set parity/overflow (even):&lt;br /&gt;
 xor a&lt;br /&gt;
&lt;br /&gt;
;Reset parity/overflow (odd):&lt;br /&gt;
 sub a&lt;br /&gt;
&lt;br /&gt;
;Set half carry (hardly ever useful but still...)&lt;br /&gt;
 and a&lt;br /&gt;
&lt;br /&gt;
;Reset half carry (hardly ever useful but still...)&lt;br /&gt;
 or a&lt;br /&gt;
&lt;br /&gt;
;Set bit 5 of f:&lt;br /&gt;
 or %00100000&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
As you can see these are extremely simple, small and fast ways to alter flags&lt;br /&gt;
which make them interesting as output of routines to indicate error/success or&lt;br /&gt;
other status bits that do not require a full register.&lt;br /&gt;
&lt;br /&gt;
If you use this, remember that these flag (re)setting tricks frequently&lt;br /&gt;
overlap, so if you need a special combination of flags it might require slightly&lt;br /&gt;
more elaborate tricks. As a rule of thumb, always alter the carry last in&lt;br /&gt;
such cases, because the scf and ccf instructions leave the Sign, Zero and Parity/Overflow flags untouched.&lt;br /&gt;
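&lt;br /&gt;
For example, a routine that wants to return with both the Zero and the Carry flag set could end as sketched below; this particular combination is only an illustration.&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 cp a     ;sets Zero, resets Carry, a is unchanged&lt;br /&gt;
 scf      ;sets Carry last; Zero stays set&lt;br /&gt;
 ret&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;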
&lt;br /&gt;
More advanced ways of manipulating flags follow:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;get the zero flag in carry &lt;br /&gt;
	scf&lt;br /&gt;
	jr z,$+3&lt;br /&gt;
	ccf&lt;br /&gt;
&lt;br /&gt;
;Put carry flag into zero flag (destroys a and the carry).&lt;br /&gt;
	ccf&lt;br /&gt;
	sbc a, a&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Tools of the job ==&lt;br /&gt;
&lt;br /&gt;
Want to test your optimizations or try new ones? Then check this:&lt;br /&gt;
* Keep a z80 instruction set to not forget a useful instruction and flags affected. (see [[Z80_Instruction_Set|Z80_Instruction_Set]])&lt;br /&gt;
* Use an assembler that has &amp;quot;.echo&amp;quot; and use this in the source to count size: (see [[Assemblers|Assemblers]])&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;SomeCodeorData:&lt;br /&gt;
;code or data goes here&lt;br /&gt;
End:&lt;br /&gt;
 .echo &amp;quot;size of the code/data:&amp;quot;&lt;br /&gt;
 .echo End-SomeCodeorData&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
* Get a nice IDE of z80 that counts code ([[IDEs|IDE's]])&lt;br /&gt;
* Make use of the counting capabilities of an emulator ([[:Category:Emulators|Emulators]])&lt;br /&gt;
&lt;br /&gt;
== Related topics ==&lt;br /&gt;
* [http://www.junemann.nl/maxcoderz/viewtopic.php?f=5&amp;amp;t=675 MaxCoderz TI-ASM optimization]&lt;br /&gt;
* ticalc archives: [http://www.ticalc.org/archives/files/fileinfo/108/10821.html 1] [http://www.ticalc.org/archives/files/fileinfo/285/28502.html 2]&lt;br /&gt;
* [http://www.ballyalley.com/ml/z80_docs/z80_docs.html Bally Alley Z80 Machine Language Documentation]&lt;br /&gt;
* [http://map.grauw.nl/articles/fast_loops.php Fast loops in MSX Assembly Page]&lt;br /&gt;
* [http://shiar.nl/calc/z80/optimize Shiar z80 optimization page]&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
* fullmetalcoder&lt;br /&gt;
* Galandros&lt;br /&gt;
* MaxCoderz participants in assembly optimizing topic (Jim e,CoBB,...)&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=User_talk:Bzimmerly</id>
		<title>User talk:Bzimmerly</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=User_talk:Bzimmerly"/>
				<updated>2010-05-30T21:09:45Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: welcoming&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;:Welcome!&lt;br /&gt;
:It is not the habit to welcome people on this particular wiki (I don't know about others), but you have an impressive story.&lt;br /&gt;
:Have fun coding, it is nice to see more coders!&lt;br /&gt;
:For sharing TI calculator projects, almost everyone releases on ticalc.org; for forums there is unitedti.org, and you can also try omnimaga.org. These have been the most active forums recently. If you like experiments, there is a lot to see for the TI-84+SE around the Internet.&lt;br /&gt;
[[User:Galandros|Galandros]] 21:09, 30 May 2010 (UTC)&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Z80_Optimization</id>
		<title>Z80 Optimization</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Z80_Optimization"/>
				<updated>2010-05-29T11:15:57Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: /* Shadow registers */ error&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
Sometimes you need some extra speed in ASM, or need to make your game smaller to fit on the calculator. Examples: programs heavy on graphics or data, and graphics code for mapping, grayscale and 3D graphics.&lt;br /&gt;
&lt;br /&gt;
If you are just looking to cut a few bytes, go straight to the small tricks in this topic.&lt;br /&gt;
&lt;br /&gt;
== Registers and Memory ==&lt;br /&gt;
Generally, good algorithms on the z80 use registers appropriately.&lt;br /&gt;
It is also good practice to keep a convention and plan how you are going to use the registers.&lt;br /&gt;
&lt;br /&gt;
General use of registers:&lt;br /&gt;
* a - 8-bit accumulator&lt;br /&gt;
* b - counter&lt;br /&gt;
* c,d,e,h,l - auxiliaries to the accumulator and copies of b or a&lt;br /&gt;
&lt;br /&gt;
* hl - 16-bit accumulator/pointer to a memory address&lt;br /&gt;
* de - pointer to a destination memory address&lt;br /&gt;
* bc - 16-bit counter&lt;br /&gt;
* ix - index register/pointer to a table in memory/saved copy of hl/pointer to memory when hl and de are in use&lt;br /&gt;
* iy - index register/pointer to a table in memory (use it when there is no other option or you need optimal execution; disable interrupts and restore the original value on exit, because TI-OS uses it)&lt;br /&gt;
&lt;br /&gt;
=== 8-bit vs. 16-bit Operations ===&lt;br /&gt;
&lt;br /&gt;
The z80 processor operates faster on 8-bit values.&lt;br /&gt;
Code dealing with 16-bit registers tends to be bigger and slower, because the equivalent 16-bit instruction is slower, or does not exist and has to be replaced with several instructions. Sometimes the equivalent 16-bit instruction is also 1 byte bigger.&lt;br /&gt;
If you use the ix or iy registers, operations are even slower and each instruction is at least 1 byte bigger (the prefix byte, plus a displacement byte for indexed memory accesses). So try to convert your code to use hl and de instead of ix and iy.&lt;br /&gt;
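&lt;br /&gt;
To make the cost of the index registers concrete, here is a small sketch that sums three consecutive bytes both ways. The task itself is made up for the illustration, and the figures in the comments are estimates from the standard instruction timings.&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;with an index register: 9 bytes, 57 T-states, ix is unchanged&lt;br /&gt;
 ld a,(ix+0)&lt;br /&gt;
 add a,(ix+1)&lt;br /&gt;
 add a,(ix+2)&lt;br /&gt;
&lt;br /&gt;
;with hl: 5 bytes, 33 T-states, but hl ends up advanced by 2&lt;br /&gt;
 ld a,(hl)&lt;br /&gt;
 inc hl&lt;br /&gt;
 add a,(hl)&lt;br /&gt;
 inc hl&lt;br /&gt;
 add a,(hl)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;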
&lt;br /&gt;
In a practical example, imagine:&lt;br /&gt;
- you pass through the accumulator a value to a routine&lt;br /&gt;
- if the only valid values of the accumulator range from 0 to 63 and if in that routine you need to multiply the accumulator by, say 12, it has to be stored in a 16-bit pair register.&lt;br /&gt;
- but you can multiply a by 4 before overflowing (63*4 = 252 which is smaller than 255) and take advantage of this to optimize&lt;br /&gt;
&lt;br /&gt;
Now on the code:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; The most usual way is to move A (the accumulator) into HL right at the start&lt;br /&gt;
	ld h,0&lt;br /&gt;
	ld l,a&lt;br /&gt;
	add a,a&lt;br /&gt;
	ld d,h&lt;br /&gt;
	ld e,a&lt;br /&gt;
	add hl,de&lt;br /&gt;
	add hl,hl&lt;br /&gt;
	add hl,hl	; hl=a*12&lt;br /&gt;
; 9 bytes, 56 clocks&lt;br /&gt;
&lt;br /&gt;
; But given a is between 0 and 63 you can multiply by 4 without overflowing the 8-bit limit (255)&lt;br /&gt;
	add a,a&lt;br /&gt;
	add a,a		; a*4&lt;br /&gt;
	ld l,a&lt;br /&gt;
	ld e,a&lt;br /&gt;
	ld h,0&lt;br /&gt;
	ld d,h		; hl=a*4 and de=a*4&lt;br /&gt;
	add hl,hl	; hl=a*8&lt;br /&gt;
	add hl,de	; hl=a*12&lt;br /&gt;
; 9 bytes, 49 clocks&lt;br /&gt;
&lt;br /&gt;
; hey, minus 7 clock cycles&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In this example you only shaved off a few clock cycles, but sometimes you can save some bytes, too.&lt;br /&gt;
You can do this with registers other than the accumulator.&lt;br /&gt;
&lt;br /&gt;
For example, if the value is passed in l and l is always lower than 64, you can do &amp;quot; sla l \ sla l \ ld h,0 &amp;quot; to multiply l by four and then use hl for 16-bit operations. In this case you are trading size for speed: each sla instruction is 2 bytes, while add hl,hl is only 1 byte.&lt;br /&gt;
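&lt;br /&gt;
Written out, the two alternatives compare as sketched below; the size and timing figures are estimates from the standard instruction timings.&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;l is below 64 on entry, result: hl = l*4&lt;br /&gt;
;8-bit shifts: 6 bytes, 23 T-states&lt;br /&gt;
 sla l&lt;br /&gt;
 sla l          ;l*4 still fits in 8 bits&lt;br /&gt;
 ld h,0&lt;br /&gt;
&lt;br /&gt;
;16-bit shifts: 4 bytes, 29 T-states&lt;br /&gt;
 ld h,0&lt;br /&gt;
 add hl,hl&lt;br /&gt;
 add hl,hl&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;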
&lt;br /&gt;
Mind that these optimizations can produce bugs and code that is somewhat hard to follow, so comment them.&lt;br /&gt;
I recommend applying this optimization only when you really need the speed and the code is bug free.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
One common trick for multiplying by 256 is to simply load the low byte register into the high byte register. This works because, in binary, a multiplication by 256 is a shift left by 8 bits, with zeros shifted in. Examples:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; multiply a by 256 and store in hl&lt;br /&gt;
	ld h,a&lt;br /&gt;
	ld l,0&lt;br /&gt;
; multiply hl by 256 and store in ade (pseudo 24-bit pair register)&lt;br /&gt;
	ld a,h&lt;br /&gt;
	ld d,l&lt;br /&gt;
	ld e,0&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Stack ===&lt;br /&gt;
&lt;br /&gt;
When you run out of registers, the stack may offer an interesting alternative to fixed RAM locations for temporary storage.&lt;br /&gt;
&lt;br /&gt;
==== Allocation ====&lt;br /&gt;
&lt;br /&gt;
You can allocate stack space with repeated push, which lets you initialize the data but restricts the allocated space to multiples of 2.&lt;br /&gt;
An alternative is to allocate uninitialized stack space (hl may be replaced with an index register) :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; allocates 7 bytes of stack space : 5 bytes, 27 T-states instead of 4 bytes, 44 T-states with 4 push which would have forced the alloc of 8 bytes&lt;br /&gt;
 ld hl, -7&lt;br /&gt;
 add hl, sp&lt;br /&gt;
 ld sp, hl&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Access ====&lt;br /&gt;
&lt;br /&gt;
The most common way of accessing data allocated on stack is to use an index register since all allocated &amp;quot;variables&amp;quot; can be accessed without having to use inc/dec but this is obviously not a strict requirement. Beware though, using stack space is not always optimal in terms of speed, depending (among other things) on your register allocation strategy :&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; 3 bytes, 19 T-states&lt;br /&gt;
 ld c, (ix + n)   ; n is an immediate value in -128..127&lt;br /&gt;
 &lt;br /&gt;
 ; 4 bytes, 17 T-states, destroys a&lt;br /&gt;
 ld a, (somelocation)&lt;br /&gt;
 ld c, a&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If your needs go beyond simple load/store, however, this method starts to show its real power, since it vastly simplifies some operations that are complicated to do with fixed storage locations (and that generally clobber registers in the process).&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; 3 bytes, 19 T-states&lt;br /&gt;
 cp (ix + n)&lt;br /&gt;
&lt;br /&gt;
 sub (ix + n)&lt;br /&gt;
 sbc a, (ix + n)&lt;br /&gt;
 add a, (ix + n)&lt;br /&gt;
 adc a, (ix + n)&lt;br /&gt;
&lt;br /&gt;
 ; 3 bytes, 23 T-states (read-modify-write)&lt;br /&gt;
 inc (ix + n)&lt;br /&gt;
 dec (ix + n)&lt;br /&gt;
&lt;br /&gt;
 and (ix + n)&lt;br /&gt;
 or (ix + n)&lt;br /&gt;
 xor (ix + n)&lt;br /&gt;
&lt;br /&gt;
 ; 4 bytes, 23 T-states&lt;br /&gt;
 rl (ix + n)&lt;br /&gt;
 rr (ix + n)&lt;br /&gt;
 rlc (ix + n)&lt;br /&gt;
 rrc (ix + n)&lt;br /&gt;
 sla (ix + n)&lt;br /&gt;
 sra (ix + n)&lt;br /&gt;
 sll (ix + n)      ; undocumented instruction&lt;br /&gt;
 srl (ix + n)&lt;br /&gt;
 bit k, (ix + n)   ; k is an immediate value in 0..7 (bit only takes 20 T-states)&lt;br /&gt;
 set k, (ix + n)&lt;br /&gt;
 res k, (ix + n)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Again, choose wisely between hl and an index register: depending on the structure of your data, the smallest/fastest solution may vary (the hl equivalent instructions are generally 2 bytes smaller and 12 T-states faster, but do not allow indexing, so they may require intermediate inc/dec).&lt;br /&gt;
&lt;br /&gt;
==== Deallocation ====&lt;br /&gt;
&lt;br /&gt;
If you need to pop an entry from the stack but must preserve all registers, remember that sp can be incremented/decremented like any 16-bit register :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; drops the top stack entry : waste 1 byte and 2 T-states but may enable better register allocation...&lt;br /&gt;
 inc sp&lt;br /&gt;
 inc sp&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you have a large amount of stack space to drop and a spare 16 bit register (hl, index, or de that you can easily swap with hl) :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; drop 16 bytes of stack space : 5 bytes, 27 T-states instead of 8 bytes, 80 T-states for 8 pop&lt;br /&gt;
 ld hl, 16&lt;br /&gt;
 add hl, sp&lt;br /&gt;
 ld sp, hl&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt; &lt;br /&gt;
The larger the space to drop, the more T-states you will save, and at some point you'll start saving space as well (beyond 10 bytes dropped, that is, more than 5 pop).&lt;br /&gt;
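&lt;br /&gt;
Putting allocation, access and deallocation together, a routine with local storage on the stack might look like the sketch below. The routine name and the 4 bytes of locals are hypothetical; the sketch assumes that every push in the body is matched by a pop before the cleanup.&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
myroutine:&lt;br /&gt;
 push ix          ; preserve the caller's ix&lt;br /&gt;
 ld ix, -4&lt;br /&gt;
 add ix, sp       ; ix = sp - 4&lt;br /&gt;
 ld sp, ix        ; allocate: the locals are (ix + 0) to (ix + 3)&lt;br /&gt;
 ld (ix + 0), a   ; store a local&lt;br /&gt;
 ; ... body of the routine ...&lt;br /&gt;
 ld a, (ix + 0)   ; read the local back&lt;br /&gt;
 ld hl, 4&lt;br /&gt;
 add hl, sp&lt;br /&gt;
 ld sp, hl        ; drop the 4 bytes of locals (destroys hl)&lt;br /&gt;
 pop ix           ; restore the caller's ix&lt;br /&gt;
 ret&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;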
&lt;br /&gt;
=== Shadow registers ===&lt;br /&gt;
&lt;br /&gt;
In some rare cases, when you run out of registers and can neither refactor your algorithm(s) nor rely on RAM storage, you may want to use the shadow registers : af', bc', de' and hl'&lt;br /&gt;
&lt;br /&gt;
These registers behave like their &amp;quot;standard&amp;quot; counterparts (af, bc, de, hl) and you can swap the two register sets using the following instructions :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ex af, af'  ; swaps af and af' as the mnemonic indicates&lt;br /&gt;
&lt;br /&gt;
 exx         ; swaps bc, de, hl and bc', de', hl'&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Shadow registers are somewhat common for doing arithmetic on big integers (16-bit to 32-bit) or BCD operations without relying on RAM storage or pushing and popping to the stack. Example:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
MUL32:&lt;br /&gt;
        DI&lt;br /&gt;
        AND     A               ; RESET CARRY FLAG&lt;br /&gt;
        SBC     HL,HL           ; LOWER RESULT = 0&lt;br /&gt;
        EXX&lt;br /&gt;
        SBC     HL,HL           ; HIGHER RESULT = 0&lt;br /&gt;
        LD      A,B             ; MPR IS AC'BC&lt;br /&gt;
        LD      B,32            ; INITIALIZE LOOP COUNTER&lt;br /&gt;
MUL32LOOP:&lt;br /&gt;
        SRA     A               ; RIGHT SHIFT MPR&lt;br /&gt;
        RR      C&lt;br /&gt;
        EXX&lt;br /&gt;
        RR      B&lt;br /&gt;
        RR      C               ; LOWEST BIT INTO CARRY&lt;br /&gt;
        JR      NC,MUL32NOADD&lt;br /&gt;
        ADD     HL,DE           ; RESULT += MPD&lt;br /&gt;
        EXX&lt;br /&gt;
        ADC     HL,DE&lt;br /&gt;
        EXX&lt;br /&gt;
MUL32NOADD:&lt;br /&gt;
        SLA     E               ; LEFT SHIFT MPD&lt;br /&gt;
        RL      D&lt;br /&gt;
        EXX&lt;br /&gt;
        RL      E&lt;br /&gt;
        RL      D&lt;br /&gt;
        DJNZ    MUL32LOOP&lt;br /&gt;
        EXX&lt;br /&gt;
       &lt;br /&gt;
; RESULT IN H'L'HL&lt;br /&gt;
        RET&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Shadow registers can be of a great help but they come with two drawbacks :&lt;br /&gt;
&lt;br /&gt;
* they cannot coexist with the &amp;quot;standard&amp;quot; registers : you cannot use ld to assign from a standard to a shadow or vice-versa. Instead you must use nasty constructs such as :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; loads hl' with the contents of hl&lt;br /&gt;
 push hl&lt;br /&gt;
 exx&lt;br /&gt;
 pop hl&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* they require interrupts to be disabled, since they were originally intended for use in interrupt service routines. There are situations where that is affordable and others where it isn't. Regardless, it is generally good policy to restore the previous interrupt status (enabled/disabled) upon return instead of leaving it up to the caller. Fortunately this is relatively easy to do (though it adds 5 bytes and 27 or 35 T-states compared with a plain di ... ei / ret) :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  ld a, i  ; this is the core of the trick, it sets P/V to the value of IFF so P/V is set iff interrupts were enabled at that point&lt;br /&gt;
  push af  ; save flags&lt;br /&gt;
  di       ; disable interrupts&lt;br /&gt;
  &lt;br /&gt;
  ; do something with shadow registers here&lt;br /&gt;
&lt;br /&gt;
  pop af   ; get back flags&lt;br /&gt;
  ret po   ; po = P/V reset so in this case it means interrupts were disabled before the routine was called&lt;br /&gt;
  ei       ; re-enable interrupts&lt;br /&gt;
  ret&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
: Note that this produces ugly code that is very hard to follow, so comment it well to ease understanding and debugging later.&lt;br /&gt;
&lt;br /&gt;
=== SP register ===&lt;br /&gt;
&lt;br /&gt;
This register is repurposed only in desperate situations, generally inside an interrupt loop that demands as much speed as possible when all the normal registers are already in use (a notable example is James Montelongo's 4-level grayscale interlace in graylib2.inc).&lt;br /&gt;
You need to know these valid but not widely known instructions:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld sp,6&lt;br /&gt;
 add hl,sp&lt;br /&gt;
 sbc hl,sp&lt;br /&gt;
 inc sp&lt;br /&gt;
 dec sp&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now an example of such a situation:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld (saveSP),sp&lt;br /&gt;
;init hl,de,bc,a&lt;br /&gt;
 ld sp,6&lt;br /&gt;
loop:&lt;br /&gt;
;code&lt;br /&gt;
 add hl,sp  ;get next row of a table for example&lt;br /&gt;
;code using bc,de,ix,a&lt;br /&gt;
 ld a,b&lt;br /&gt;
 or c&lt;br /&gt;
 jp nz,loop&lt;br /&gt;
;code&lt;br /&gt;
 ld sp,(saveSP)&lt;br /&gt;
 ret    ;finish interrupt&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt; &lt;br /&gt;
&lt;br /&gt;
When you use sp this way you cannot push or pop registers, no calls are allowed, and interrupts must stay disabled (an interrupt would push its return address wherever sp points).&lt;br /&gt;
Mind again that this is a last resort. Don't forget to save and restore sp as the example shows.&lt;br /&gt;
&lt;br /&gt;
== General Algorithms ==&lt;br /&gt;
&lt;br /&gt;
Register and memory use is the most important factor in writing concise and fast z80 code. General optimization comes next.&lt;br /&gt;
&lt;br /&gt;
First, optimize the most frequently executed code: subroutines that are called often and the bodies of large loops. Finding the bottleneck and fixing it is enough for many programs.&lt;br /&gt;
&lt;br /&gt;
Do not forget that in z80 assembly vector tables (or lookup tables) usually give smaller and faster code than blocks of comparisons and jumps. At other times, driving a task from a chunk of data is better than a more conventional programming method (notably in graphics screen effects).&lt;br /&gt;
See [[Z80 Good Programming Practices]] for examples.&lt;br /&gt;
&lt;br /&gt;
Browse a complete instruction set reference for instructions that could optimize some part of the code.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
A list of things to keep in mind:&lt;br /&gt;
* Rework conditionals to be more efficient.&lt;br /&gt;
* Make sure the most common checks come first. In other words, check the special and rare cases last.&lt;br /&gt;
* Move special-case checks out of the main loop if they aren't needed there.&lt;br /&gt;
* Rearrange program flow.&lt;br /&gt;
* When you can afford a bigger setup overhead in exchange for moving code out of the main loop, do it.&lt;br /&gt;
* When your code looks like it won't be efficient enough even after optimization, try another approach or algorithm. Search for other algorithms on Wikipedia, for instance.&lt;br /&gt;
* Rewriting code from scratch can bring new ideas (reserve this for desperate situations because of all the work it takes).&lt;br /&gt;
* Remember that it is almost always better to leave optimization to the end. Optimizing too early brings headaches with crashes and debugging. And because ASM is very fast, and sometimes even smaller than higher level languages, further optimization may not be needed.&lt;br /&gt;
* Document wacky optimizations so you can understand the code later (z80 optimization leads to code that is very hard to understand).&lt;br /&gt;
&lt;br /&gt;
== Self Modifying Code ==&lt;br /&gt;
&lt;br /&gt;
If your code runs from RAM, it can write to its own instructions to change them. Having an instruction set reference that lists the opcodes is useful.&lt;br /&gt;
Although self-modifying code can be applied to any instruction, it is most common with instructions that load constants into registers.&lt;br /&gt;
&lt;br /&gt;
Generally it is used to save a value for later use (often seen with masks). Examples:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld (savemask),a&lt;br /&gt;
;...code...&lt;br /&gt;
savemask = $+1&lt;br /&gt;
 ld a,$00   ; $00 is just a placeholder&lt;br /&gt;
&lt;br /&gt;
 ld (something),hl&lt;br /&gt;
;... code&lt;br /&gt;
something = $+1&lt;br /&gt;
 ld de,$0000  ; $0000 is just a placeholder&lt;br /&gt;
&lt;br /&gt;
 ld (saveSP),sp&lt;br /&gt;
;... code ...&lt;br /&gt;
saveSP = $+1&lt;br /&gt;
 ld sp,$0000  ; restore sp&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SMC (Self Modifying Code) is often combined with unrolling and relative jumps. Example:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld (jpmodify),a   ; a = number of rrca to skip (0 to 8)&lt;br /&gt;
;...&lt;br /&gt;
jpmodify = $+1&lt;br /&gt;
 jr $+2            ; the offset byte is the placeholder (0 = skip nothing)&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
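&lt;br /&gt;
As a hedged sketch of how such a patched jump might be driven: assuming b holds the number of rotations wanted (0 to 8) and c the value to rotate (both register choices and the label rotmodify are hypothetical), the caller computes how many of the eight rrca to skip. Each rrca is one byte, so the jr offset equals the number of instructions skipped.&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; assumed inputs: b = rotation count (0 to 8), c = value to rotate&lt;br /&gt;
 ld a,8&lt;br /&gt;
 sub b             ; a = number of rrca to skip&lt;br /&gt;
 ld (rotmodify),a  ; patch the offset byte of the jr below&lt;br /&gt;
 ld a,c            ; value to rotate&lt;br /&gt;
rotmodify = $+1&lt;br /&gt;
 jr $+2            ; offset byte overwritten at run time&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
; a = c rotated right b times&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
A count above 8 would produce a negative offset and jump backwards, so the caller has to guarantee the range.&lt;br /&gt;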
&lt;br /&gt;
== Small Tricks ==&lt;br /&gt;
&lt;br /&gt;
Note that the following tricks act much like a peephole optimizer and are the last optimization step: remember to first optimize your algorithm and register allocation before applying any of them if you really want the fastest and smallest code.&lt;br /&gt;
&lt;br /&gt;
Also note that nearly every trick makes the code harder to understand, so documenting them is a good idea. It is easy to forget how a piece of code works after some time away from it.&lt;br /&gt;
&lt;br /&gt;
Be warned that some tricks are not exactly equivalent to the normal way and have exceptions to their use; the comments warn about them. Some tricks apply to other cases as well, but again you have to be careful.&lt;br /&gt;
&lt;br /&gt;
Some tricks are nothing more than correct use of the instructions available on the z80. Keeping an instruction set summary at hand helps you see what is possible while coding.&lt;br /&gt;
&lt;br /&gt;
=== Optimize size and speed ===&lt;br /&gt;
&lt;br /&gt;
==== Loading stuff ====&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of:&lt;br /&gt;
 ld a,0&lt;br /&gt;
;Try this:&lt;br /&gt;
 xor a    ;disadvantages: changes flags&lt;br /&gt;
;or&lt;br /&gt;
 sub a    ;disadvantages: changes flags&lt;br /&gt;
; -&amp;gt; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	ld b,$20&lt;br /&gt;
	ld c,$30&lt;br /&gt;
;try this&lt;br /&gt;
	ld bc,$2030&lt;br /&gt;
;or this&lt;br /&gt;
	ld bc,(b_num * 256) + c_num		;where b_num goes to b register and c_num to c register&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
  ld a,$42&lt;br /&gt;
  ld (hl),a&lt;br /&gt;
;try this&lt;br /&gt;
  ld (hl),$42&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	xor a&lt;br /&gt;
	ld (data1),a&lt;br /&gt;
	ld (data2),a&lt;br /&gt;
	ld (data3),a&lt;br /&gt;
	ld (data4),a&lt;br /&gt;
	ld (data5),a	;if data1 to data5 are one after the other&lt;br /&gt;
;try this&lt;br /&gt;
	ld hl,data1&lt;br /&gt;
	ld de,data1+1&lt;br /&gt;
	xor a&lt;br /&gt;
	ld (hl),a&lt;br /&gt;
	ld bc,4&lt;br /&gt;
	ldir&lt;br /&gt;
; -&amp;gt; save 3 bytes for every ld (dataX), after passing the initial overhead&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	ld a,(var)&lt;br /&gt;
	inc a&lt;br /&gt;
	ld (var),a&lt;br /&gt;
;try this	;Note: if hl is not tied up, use indirection:&lt;br /&gt;
	ld hl,var&lt;br /&gt;
	inc (hl)&lt;br /&gt;
	ld a,(hl) ;if you don't need (hl) in a, delete this line&lt;br /&gt;
; -&amp;gt; save 2 bytes and 2 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; Instead of :&lt;br /&gt;
 ld a, (hl)&lt;br /&gt;
 ld (de), a&lt;br /&gt;
 inc hl&lt;br /&gt;
 inc de&lt;br /&gt;
; Use :&lt;br /&gt;
 ldi&lt;br /&gt;
 inc bc&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Never use:&lt;br /&gt;
    push BC&lt;br /&gt;
;    ...&lt;br /&gt;
    pop BC&lt;br /&gt;
    ld D,B&lt;br /&gt;
    ld E,C&lt;br /&gt;
;Use instead:&lt;br /&gt;
    push BC&lt;br /&gt;
;    ...&lt;br /&gt;
    pop DE      ;we only want DE to hold the pushed BC (no need for a copy in BC)&lt;br /&gt;
; -&amp;gt; save 2 bytes and 8 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Math and Logic tricks ====&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of:&lt;br /&gt;
 cp 0&lt;br /&gt;
;Use&lt;br /&gt;
 or a&lt;br /&gt;
; -&amp;gt; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  cp 1&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  dec a   ;changes a! (the zero flag matches cp 1, the carry flag does not)&lt;br /&gt;
; -&amp;gt; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  xor %11111111&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  cpl&lt;br /&gt;
; -&amp;gt; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
    ld de,767&lt;br /&gt;
    or a       ;reset carry so sbc works as a sub&lt;br /&gt;
    sbc hl,de&lt;br /&gt;
;try this&lt;br /&gt;
    ld de,-767 ;add the negated constant instead of subtracting&lt;br /&gt;
    add hl,de&lt;br /&gt;
; -&amp;gt; save 2 bytes and 8 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
    ld de,-767&lt;br /&gt;
    add hl,de&lt;br /&gt;
;try this&lt;br /&gt;
    dec h  ; -256&lt;br /&gt;
    dec h  ; -512&lt;br /&gt;
    dec h  ; -768&lt;br /&gt;
    inc hl  ; -767&lt;br /&gt;
;Note that this works in many other cases&lt;br /&gt;
; -&amp;gt; save 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	srl a&lt;br /&gt;
	srl a&lt;br /&gt;
	srl a&lt;br /&gt;
;try this&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	and %00011111&lt;br /&gt;
; -&amp;gt; save 1 byte and 5 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	neg&lt;br /&gt;
	add a,N   ;you want to calculate N-A&lt;br /&gt;
;Do it this way:&lt;br /&gt;
	cpl&lt;br /&gt;
	add a,N+1    ;neg is practically equivalent to cpl \ inc a&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Never use:&lt;br /&gt;
    ld A,B&lt;br /&gt;
    neg&lt;br /&gt;
;Instead use:&lt;br /&gt;
    xor A&lt;br /&gt;
    sub B&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Never use:&lt;br /&gt;
    ld A,D&lt;br /&gt;
    sub $D3&lt;br /&gt;
    neg&lt;br /&gt;
;Instead use:&lt;br /&gt;
    ld A,$D3&lt;br /&gt;
    sub D&lt;br /&gt;
; -&amp;gt; save 2 bytes and 8 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  sla l&lt;br /&gt;
  rl h         ; I've actually seen this!&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  add hl,hl&lt;br /&gt;
; -&amp;gt; save 3 bytes and 5 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Conditionals ====&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  and 1&lt;br /&gt;
  cp 1&lt;br /&gt;
  jr z,foo&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  and 1         ;and sets zero flag, no need for cp&lt;br /&gt;
  jr nz,foo&lt;br /&gt;
; -&amp;gt; save 2 bytes and 7 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  and 1&lt;br /&gt;
  cp 1         ;a not needed after this&lt;br /&gt;
  jr z,foo&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  rra&lt;br /&gt;
  jr c,foo&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  bit 0,a&lt;br /&gt;
  call z,foo&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  rra&lt;br /&gt;
  call nc,foo&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  bit 7,a&lt;br /&gt;
  jr z,foo&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  rla&lt;br /&gt;
  jr nc,foo&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  bit 2,a&lt;br /&gt;
  ret nz&lt;br /&gt;
  xor a&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  and %100&lt;br /&gt;
  ret nz&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  cp 9        ;if a&amp;lt;=9 then goto label&lt;br /&gt;
  jp c,label&lt;br /&gt;
  jp z,label&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  cp 9+1      ;if a&amp;lt;10 then goto label&lt;br /&gt;
  jp c,label&lt;br /&gt;
; -&amp;gt; save 3 bytes and 10 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Others ====&lt;br /&gt;
&lt;br /&gt;
Calling and returning...&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
 call xxxx&lt;br /&gt;
 ret&lt;br /&gt;
;try this&lt;br /&gt;
 jp xxxx&lt;br /&gt;
;only do this if the called routine does not use its return address as data (for example some kind of inline vputs that pops it).&lt;br /&gt;
; -&amp;gt; save 1 byte and 17 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Never use:&lt;br /&gt;
    dec B&lt;br /&gt;
    jr NZ,loop    ;I have seen this...&lt;br /&gt;
;Use:&lt;br /&gt;
    djnz loop&lt;br /&gt;
; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
loop:&lt;br /&gt;
 ld a,2&lt;br /&gt;
;code1&lt;br /&gt;
 ld a,0&lt;br /&gt;
;code2&lt;br /&gt;
 djnz loop&lt;br /&gt;
&lt;br /&gt;
;try this&lt;br /&gt;
 ld a,2&lt;br /&gt;
loop:&lt;br /&gt;
;code1&lt;br /&gt;
 xor 2     ; the trick: xor with (value1 xor value2) flips a between the two values, here 2 and 0, each time it runs&lt;br /&gt;
;code2&lt;br /&gt;
 djnz loop&lt;br /&gt;
; -&amp;gt; save size and time depending on its use (code1 and code2 must preserve a, which flips once per pass)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; Instead of&lt;br /&gt;
 ld a,(Number)&lt;br /&gt;
 cp 0&lt;br /&gt;
 jp z,A_is_0&lt;br /&gt;
 cp 1&lt;br /&gt;
 jp z,A_is_1&lt;br /&gt;
 cp 2&lt;br /&gt;
 jp z,A_is_2&lt;br /&gt;
 cp 3&lt;br /&gt;
 jp z,A_is_3&lt;br /&gt;
 cp 4&lt;br /&gt;
 jp z,A_is_4&lt;br /&gt;
 cp 5&lt;br /&gt;
 jp z,A_is_5&lt;br /&gt;
&lt;br /&gt;
; This is a little better&lt;br /&gt;
 ld a,(Number)&lt;br /&gt;
 or a&lt;br /&gt;
 jp z,A_is_0&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_1&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_2&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_3&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_4&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_5&lt;br /&gt;
&lt;br /&gt;
; Even better&lt;br /&gt;
 ld a,(Number)&lt;br /&gt;
 add a,a   ; a*2 (Number must be below 128) &lt;br /&gt;
 ld h,0 &lt;br /&gt;
 ld l,a &lt;br /&gt;
 ld de,VectorTable&lt;br /&gt;
 add hl,de&lt;br /&gt;
 ld a,(hl)&lt;br /&gt;
 inc hl&lt;br /&gt;
 ld h,(hl)&lt;br /&gt;
 ld l,a&lt;br /&gt;
 jp (hl)&lt;br /&gt;
&lt;br /&gt;
VectorTable:&lt;br /&gt;
 .dw A_is_0&lt;br /&gt;
 .dw A_is_1&lt;br /&gt;
 .dw A_is_2&lt;br /&gt;
 .dw A_is_3&lt;br /&gt;
 .dw A_is_4&lt;br /&gt;
 .dw A_is_5&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
Also see [[Z80 Good Programming Practices]]&lt;br /&gt;
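&lt;br /&gt;
The vector table jump trusts Number to be a valid index. A minimal sketch of a guarded version, assuming the table has six entries (0 to 5), that the code was reached through a call, and that simply returning is an acceptable way to ignore bad values:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld a,(Number)&lt;br /&gt;
 cp 6              ; carry is set only when a is below 6&lt;br /&gt;
 ret nc            ; out of range: return without jumping&lt;br /&gt;
 add a,a           ; 2 bytes per table entry&lt;br /&gt;
 ld e,a&lt;br /&gt;
 ld d,0&lt;br /&gt;
 ld hl,VectorTable&lt;br /&gt;
 add hl,de         ; hl points to the table entry&lt;br /&gt;
 ld a,(hl)         ; low byte of the target address&lt;br /&gt;
 inc hl&lt;br /&gt;
 ld h,(hl)         ; high byte of the target address&lt;br /&gt;
 ld l,a&lt;br /&gt;
 jp (hl)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
The check costs 3 bytes, which is usually cheap compared with jumping to a garbage address.&lt;br /&gt;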
&lt;br /&gt;
=== Size vs. Speed ===&lt;br /&gt;
&lt;br /&gt;
Size versus speed is the classic trade-off of optimization in computer programming, and the Z80 is no exception.&lt;br /&gt;
In ASM, size is most often what matters, because ASM is generally fast enough and it is nice to give users a smaller program that doesn't use up most of the RAM.&lt;br /&gt;
&lt;br /&gt;
==== For the sake of size ====&lt;br /&gt;
&lt;br /&gt;
* Use relative jumps (jr label) whenever possible. When the target is out of reach of a relative jump (outside -128 to +127 bytes) but there is a jp to that same target nearby, do a relative jump to the absolute one. Example:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;lots of code (more than 128 bytes worth of code)&lt;br /&gt;
somelabel2:&lt;br /&gt;
 jp somelabel&lt;br /&gt;
;less than 128 bytes&lt;br /&gt;
 jr somelabel2   ;instead of an absolute jump directly to somelabel, jump to a jump to somelabel.&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Relative jumps take 2 bytes and absolute jumps 3. In terms of speed, jp always takes 10 T-states, while jr takes 12 T-states when the jump is taken and 7 when it is not, so jp is faster when the jump occurs and jr is faster when it doesn't.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
 dec bc&lt;br /&gt;
 ld a,b&lt;br /&gt;
 or c&lt;br /&gt;
 ret z&lt;br /&gt;
;try this&lt;br /&gt;
 cpi              ;increments HL&lt;br /&gt;
 ret po&lt;br /&gt;
; save 1 byte at the cost of 2 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Passing inline data'''&lt;br /&gt;
&lt;br /&gt;
When you call a routine, pc + 3 (the address right after the call) is pushed on the stack. The routine can pop it and use it as a pointer to data placed after the call. A very nifty use is with strings. To return, advance the pointer past the data and jp (hl).&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of:&lt;br /&gt;
 ld hl,string&lt;br /&gt;
 bcall(_vputs)&lt;br /&gt;
 ret&lt;br /&gt;
;Try this:&lt;br /&gt;
  call Disp&lt;br /&gt;
  .db &amp;quot;This is some text&amp;quot;,0&lt;br /&gt;
  ret&lt;br /&gt;
;Not a speed optimization, but it eliminates 2-byte pointers, since it just uses the call's return address.&lt;br /&gt;
;It also heavily disturbs disassembly.&lt;br /&gt;
Disp:&lt;br /&gt;
  pop hl&lt;br /&gt;
  bcall(_vputs)&lt;br /&gt;
  jp (hl)&lt;br /&gt;
; -&amp;gt; save 3 bytes for each use (ld hl,string is no longer needed), but 5 bytes of overhead (Disp routine)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This routine can be expanded to pass the coordinates where the text should appear.&lt;br /&gt;
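&lt;br /&gt;
A hedged sketch of that expansion, assuming the ti83plus.inc names penCol and _vputs and assuming penRow is stored in the byte right after penCol (check your include file). DispAt and the coordinates are hypothetical.&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  call DispAt&lt;br /&gt;
  .db 10,20                  ; column 10, row 20 (illustrative values)&lt;br /&gt;
  .db &amp;quot;Some text&amp;quot;,0&lt;br /&gt;
  ret&lt;br /&gt;
&lt;br /&gt;
DispAt:&lt;br /&gt;
  pop hl            ; hl points to the inline data&lt;br /&gt;
  ld e,(hl)         ; e = column&lt;br /&gt;
  inc hl&lt;br /&gt;
  ld d,(hl)         ; d = row&lt;br /&gt;
  inc hl            ; hl now points to the string&lt;br /&gt;
  ld (penCol),de    ; stores e in penCol and d in the next byte (assumed to be penRow)&lt;br /&gt;
  bcall(_vputs)     ; expected to leave hl just past the terminating zero&lt;br /&gt;
  jp (hl)           ; resume right after the inline data&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
The two coordinate bytes cost less than a separate ld into penCol before every call, at the price of destroying de inside the routine.&lt;br /&gt;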
&lt;br /&gt;
'''Wasting time to delay'''&lt;br /&gt;
&lt;br /&gt;
There are times when you need a short delay between operations such as port reads and writes '''''and there is nothing useful to do'''''. Because nops are not very size friendly, consider slower but smaller instructions. Example:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
 ld a,KEY_GROUP&lt;br /&gt;
 out (1),a&lt;br /&gt;
 nop&lt;br /&gt;
 nop&lt;br /&gt;
 in a,(1)&lt;br /&gt;
;Try this:&lt;br /&gt;
 ld a,KEY_GROUP&lt;br /&gt;
 out (1),a&lt;br /&gt;
 ld a,(de)    ;a doesn't need to be preserved because it will hold what the port has.&lt;br /&gt;
 in a,(1)&lt;br /&gt;
; -&amp;gt; save 1 byte; the delay is 1 T-state shorter, which is almost the same time&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When you need to delay and cannot afford to alter registers or flags there are still ways to delay that waste less size than nop's :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; 2 bytes, 8 T-states&lt;br /&gt;
 nop&lt;br /&gt;
 nop&lt;br /&gt;
&lt;br /&gt;
; 2 bytes, 12 T-states&lt;br /&gt;
 inc hl&lt;br /&gt;
 dec hl&lt;br /&gt;
&lt;br /&gt;
; 2 bytes, 12 T-states&lt;br /&gt;
 jr $+2&lt;br /&gt;
&lt;br /&gt;
; 2 bytes, 21 T-states&lt;br /&gt;
 push af&lt;br /&gt;
 pop af&lt;br /&gt;
&lt;br /&gt;
; 2 bytes, 38 T-states&lt;br /&gt;
 ex (sp), hl&lt;br /&gt;
 ex (sp), hl&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you need a small adjustable delay:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;4 bytes, 13*b+2 T-states including the ld b (variable)&lt;br /&gt;
	ld b,255	; initial delay&lt;br /&gt;
	djnz $		; do it&lt;br /&gt;
;b=0 on exit&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* There are many other instructions that you can use&lt;br /&gt;
* Beware that not all instructions preserve registers or flags&lt;br /&gt;
* For delays between frames of a game, or other longer delays, you can use the 'halt' instruction if interrupts are enabled. It makes the calculator enter low power mode until an interrupt is triggered. To fine-tune this delay mechanism you can alter the interrupt mask and interrupt timer speed beforehand (and possibly restore their values afterwards).&lt;br /&gt;
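&lt;br /&gt;
A minimal sketch of the halt-based delay from the last note, assuming interrupts may be enabled at this point; the count of 10 interrupts and the label waitloop are arbitrary illustrative choices:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ei                ; halt never wakes up if interrupts are disabled&lt;br /&gt;
 ld b,10           ; number of interrupts to wait for (arbitrary)&lt;br /&gt;
waitloop:&lt;br /&gt;
 halt              ; sleep until the next interrupt&lt;br /&gt;
 djnz waitloop&lt;br /&gt;
;b=0 on exit&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;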
&lt;br /&gt;
==== Unrolling code ====&lt;br /&gt;
&lt;br /&gt;
'''General Unrolling'''&lt;br /&gt;
You can unroll a loop several times instead of looping; this is frequently done in multiplication routines.&lt;br /&gt;
This trades memory for speed, so keep in mind that most of the time you will prefer size to speed.&lt;br /&gt;
&lt;br /&gt;
'''Unroll commands'''&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; &amp;quot;Classic&amp;quot; way : ~21 T-states per byte copied&lt;br /&gt;
 ld hl,src&lt;br /&gt;
 ld de,dest&lt;br /&gt;
 ld bc,size&lt;br /&gt;
 ldir&lt;br /&gt;
&lt;br /&gt;
; Unrolled : 16 + 10 / n T-states per byte copied -&amp;gt; ~17 T-states per byte copied when unrolling 8 times (n = 8)&lt;br /&gt;
 ld hl,src&lt;br /&gt;
 ld de,dest&lt;br /&gt;
 ld bc,size  ; if the size is not a multiple of the number of unrolled ldi then a small trick must be used to jump appropriately inside the loop for the first iteration&lt;br /&gt;
loopldi:    ;you can use this entry for a call&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 jp pe, loopldi    ; jp used as it is faster and in the case of a loop unrolling we assume speed matters more than size&lt;br /&gt;
; ret if this is a subroutine and use the unrolled ldi's with a call.&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
This way of unrolling ldi also works with ldd and outi (outi counts down in b and sets the zero flag when b reaches 0, so loop with jp nz).&lt;br /&gt;
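&lt;br /&gt;
For a size that is not a multiple of the unroll factor, one possible form of the entry trick mentioned in the code comments combines it with self-modifying code, so the code must run from RAM. This is a sketch under the assumption that size is not zero; skipldi is a hypothetical label.&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld hl,src&lt;br /&gt;
 ld de,dest&lt;br /&gt;
 ld bc,size        ; must not be 0&lt;br /&gt;
 ld a,c&lt;br /&gt;
 neg&lt;br /&gt;
 and 7             ; a = (8 - size mod 8) mod 8 = number of ldi to skip on the first pass&lt;br /&gt;
 add a,a           ; each ldi is 2 bytes&lt;br /&gt;
 ld (skipldi),a    ; patch the offset byte of the jr below&lt;br /&gt;
skipldi = $+1&lt;br /&gt;
 jr $+2            ; offset byte overwritten at run time&lt;br /&gt;
loopldi:&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 jp pe,loopldi     ; P/V stays set while bc is not 0&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
With size = 10, for instance, 6 ldi are skipped, so the first pass copies 2 bytes and the second pass copies the remaining 8.&lt;br /&gt;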
&lt;br /&gt;
==== Looping with 16 bit counter ====&lt;br /&gt;
There are two ways to make loops with a 16bit counter :&lt;br /&gt;
* the naive one, which results in smaller code but increased loop overhead (24 * n T-states) and destroys a&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  ld bc, ...&lt;br /&gt;
loop:&lt;br /&gt;
  ; loop body here&lt;br /&gt;
 &lt;br /&gt;
  dec bc&lt;br /&gt;
  ld  a, b&lt;br /&gt;
  or  c&lt;br /&gt;
  jp  nz,loop&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
* the slightly trickier one, which takes a couple more bytes but has a much lower overhead (about 13 * n + 9 * (n / 256) T-states)&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  dec  de&lt;br /&gt;
  ld  b, e&lt;br /&gt;
  inc  b&lt;br /&gt;
  inc  d&lt;br /&gt;
loop2:&lt;br /&gt;
  ; loop body here&lt;br /&gt;
  &lt;br /&gt;
  djnz loop2&lt;br /&gt;
  dec  d&lt;br /&gt;
  jp  nz,loop2&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
The rationale behind the second method is to reduce the overhead of the &amp;quot;inner&amp;quot; loop as much as possible and to use the fact that when b gets down to zero it will be treated as 256 by djnz. &lt;br /&gt;
&lt;br /&gt;
You can therefore use the following macros for setting proper values of 8bit loop counters given a 16bit counter in case you want to do the conversion at compile time :&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  #define inner_counter8(counter16) (((counter16) - 1) &amp;amp; 0xff) + 1&lt;br /&gt;
  #define outer_counter8(counter16) (((counter16) - 1) &amp;gt;&amp;gt; 8) + 1&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
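&lt;br /&gt;
As a worked example of the conversion (the count of 1000 is an arbitrary value chosen for illustration): 1000 - 1 = 999 = $03E7, so the inner counter is $E7 + 1 = 232 and the outer counter is $03 + 1 = 4. The inner loop runs 232 times on the first pass and 256 times on each of the three remaining passes, for 232 + 3 * 256 = 1000 iterations. The label loop3 is hypothetical.&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  ld b,232          ; inner counter for 1000 iterations&lt;br /&gt;
  ld d,4            ; outer counter for 1000 iterations&lt;br /&gt;
loop3:&lt;br /&gt;
  ; loop body here&lt;br /&gt;
&lt;br /&gt;
  djnz loop3&lt;br /&gt;
  dec d&lt;br /&gt;
  jp nz,loop3&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;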
&lt;br /&gt;
=== Preserve Registers ===&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; description: both routines compare b to 0, same size and speed but the second preserves accumulator&lt;br /&gt;
; remarks: - inc/dec doesn't affect carry flag&lt;br /&gt;
;          - inc/dec doesn't affect any flags on 16-bit registers, so do not extrapolate to 16-bit registers.&lt;br /&gt;
	ld a,b&lt;br /&gt;
	or b&lt;br /&gt;
	jr z,label&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
	inc b&lt;br /&gt;
	dec b&lt;br /&gt;
	jr z,label&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; description: add a to hl without using another 16-bit register pair (the second way destroys a)&lt;br /&gt;
;normal way:&lt;br /&gt;
	ld d,$00&lt;br /&gt;
	ld e,a&lt;br /&gt;
	add hl,de&lt;br /&gt;
;4 bytes and 22 clock cycles&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
	add a,l&lt;br /&gt;
	ld l,a&lt;br /&gt;
	jr nc, $+3&lt;br /&gt;
	inc h&lt;br /&gt;
;5 bytes, 19/20 clock cycles&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Setting flags ==&lt;br /&gt;
On some occasions you might want to selectively set or reset a flag.&lt;br /&gt;
&lt;br /&gt;
Here are the most common uses :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; set Carry flag&lt;br /&gt;
 scf&lt;br /&gt;
&lt;br /&gt;
; reset Carry flag (alters Sign and Zero flags as defined)&lt;br /&gt;
 or a&lt;br /&gt;
&lt;br /&gt;
; alternate reset Carry flag (alters Sign and Zero flags as defined)&lt;br /&gt;
 and a&lt;br /&gt;
&lt;br /&gt;
; set Zero flag (resets Carry flag, alters Sign flag as defined)&lt;br /&gt;
 cp a&lt;br /&gt;
&lt;br /&gt;
; reset Zero flag (alters a, reset Carry flag, alters Sign flag as defined)&lt;br /&gt;
 or 1&lt;br /&gt;
&lt;br /&gt;
; set Sign flag (negative) (alters a, reset Zero and Carry flags)&lt;br /&gt;
 or $80&lt;br /&gt;
&lt;br /&gt;
; reset Sign flag (positive) (set a to zero, set Zero flag, reset Carry flag)&lt;br /&gt;
 xor a&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Other possible uses (much rarer) :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Set parity/overflow (even):&lt;br /&gt;
 xor a&lt;br /&gt;
&lt;br /&gt;
;Reset parity/overflow (odd):&lt;br /&gt;
 sub a&lt;br /&gt;
&lt;br /&gt;
;Set half carry (hardly ever useful but still...)&lt;br /&gt;
 and a&lt;br /&gt;
&lt;br /&gt;
;Reset half carry (hardly ever useful but still...)&lt;br /&gt;
 or a&lt;br /&gt;
&lt;br /&gt;
;Set bit 5 of f:&lt;br /&gt;
 or %00100000&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
As you can see these are extremely simple, small and fast ways to alter flags&lt;br /&gt;
which make them interesting as output of routines to indicate error/success or&lt;br /&gt;
other status bits that do not require a full register.&lt;br /&gt;
&lt;br /&gt;
If you use these, remember that the flag (re)setting tricks frequently&lt;br /&gt;
overlap, so a special combination of flags might require slightly&lt;br /&gt;
more elaborate tricks. As a rule of thumb, always alter the carry last in&lt;br /&gt;
such cases because the scf and ccf instructions leave the zero, sign and parity/overflow flags untouched.&lt;br /&gt;
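&lt;br /&gt;
As an illustration of returning a status in the carry flag, here is a hedged sketch of a hypothetical routine (ParseDigit, notadigit and the register convention are invented for this example) that converts an ASCII digit in a to its value and reports failure through carry:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; input:  a = ASCII character&lt;br /&gt;
; output: carry clear and a = 0 to 9 if it was a digit, carry set otherwise (a is destroyed)&lt;br /&gt;
ParseDigit:&lt;br /&gt;
 sub '0'           ; below '0': the borrow sets carry&lt;br /&gt;
 ret c&lt;br /&gt;
 cp 10             ; carry set when a is below 10 (valid digit)&lt;br /&gt;
 ccf               ; flip it: carry clear means success&lt;br /&gt;
 ret&lt;br /&gt;
&lt;br /&gt;
; usage&lt;br /&gt;
 call ParseDigit&lt;br /&gt;
 jr c,notadigit&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
The caller tests the result with a single conditional jump and no register is spent on the status.&lt;br /&gt;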
&lt;br /&gt;
More advanced ways of manipulating flags follow:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;get the zero flag in carry &lt;br /&gt;
	scf&lt;br /&gt;
	jr z,$+3&lt;br /&gt;
	ccf&lt;br /&gt;
&lt;br /&gt;
;Put carry flag into zero flag (destroys a: a = 0 if carry was set, $FF otherwise).&lt;br /&gt;
	ccf&lt;br /&gt;
	sbc a, a&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Tools of the job ==&lt;br /&gt;
&lt;br /&gt;
Want to test your optimizations or try new ones? Then check these:&lt;br /&gt;
* Keep a z80 instruction set reference at hand so you don't forget a useful instruction or which flags it affects. (see [[Z80_Instruction_Set|Z80_Instruction_Set]])&lt;br /&gt;
* Use an assembler that supports &amp;quot;.echo&amp;quot; and use it in the source to report sizes: (see [[Assemblers|Assemblers]])&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;SomeCodeorData:&lt;br /&gt;
;code or data goes here&lt;br /&gt;
End:&lt;br /&gt;
 .echo &amp;quot;size of the code/data:&amp;quot;&lt;br /&gt;
 .echo End-SomeCodeorData&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
* Get a nice IDE of z80 that counts code ([[IDEs|IDE's]])&lt;br /&gt;
* Make use of the counting capabilities of an emulator ([[:Category:Emulators|Emulators]])&lt;br /&gt;
&lt;br /&gt;
== Related topics ==&lt;br /&gt;
* [http://www.junemann.nl/maxcoderz/viewtopic.php?f=5&amp;amp;t=675 MaxCoderz TI-ASM optimization]&lt;br /&gt;
* ticalc archives: [http://www.ticalc.org/archives/files/fileinfo/108/10821.html 1] [http://www.ticalc.org/archives/files/fileinfo/285/28502.html 2]&lt;br /&gt;
* [http://www.ballyalley.com/ml/z80_docs/z80_docs.html Bally Alley Z80 Machine Language Documentation]&lt;br /&gt;
* [http://map.grauw.nl/articles/fast_loops.php Fast loops in MSX Assembly Page]&lt;br /&gt;
* [http://shiar.nl/calc/z80/optimize Shiar z80 optimization page]&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
* fullmetalcoder&lt;br /&gt;
* Galandros&lt;br /&gt;
* MaxCoderz participants in assembly optimizing topic (Jim e,CoBB,...)&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Z80_Optimization</id>
		<title>Z80 Optimization</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Z80_Optimization"/>
				<updated>2010-05-29T11:13:50Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: /* Shadow registers */ ,more tips&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Introduction ==&lt;br /&gt;
Sometimes you need some extra speed in ASM, or you need to make your game smaller to fit on the calculator. Examples: programs heavy on graphics or data, and graphics code for mapping, grayscale and 3D graphics.&lt;br /&gt;
&lt;br /&gt;
If you are just looking to cut a few bytes, go straight to the small tricks section of this page.&lt;br /&gt;
&lt;br /&gt;
== Registers and Memory ==&lt;br /&gt;
Good z80 algorithms generally use each register in the role it suits best.&lt;br /&gt;
It is also good practice to keep a convention and plan how you are going to use the registers.&lt;br /&gt;
&lt;br /&gt;
General use of registers:&lt;br /&gt;
* a - 8-bit accumulator&lt;br /&gt;
* b - counter&lt;br /&gt;
* c,d,e,h,l - auxiliary to the accumulator, or copies of b or a&lt;br /&gt;
&lt;br /&gt;
* hl - 16-bit accumulator/pointer to a memory address&lt;br /&gt;
* de - pointer to a destination memory address&lt;br /&gt;
* bc - 16-bit counter&lt;br /&gt;
* ix - index register/pointer to a table in memory/saved copy of hl/pointer to memory when hl and de are in use&lt;br /&gt;
* iy - index register/pointer to a table in memory (use it only when there is no other option or when you need optimal execution; disable interrupts and restore the original value on exit, because TI-OS uses it)&lt;br /&gt;
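&lt;br /&gt;
A minimal sketch of borrowing iy as described in the last item, assuming interrupts were enabled on entry; mytable is a hypothetical label, not something defined by TI-OS:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 di              ; the TI-OS interrupt handler relies on iy&lt;br /&gt;
 push iy         ; save the value TI-OS expects&lt;br /&gt;
 ld iy,mytable   ; hypothetical table label&lt;br /&gt;
 ld a,(iy+2)     ; indexed access while iy is ours&lt;br /&gt;
 pop iy          ; restore it before any bcall or ei&lt;br /&gt;
 ei&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;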
&lt;br /&gt;
=== 8-bit vs. 16-bit Operations ===&lt;br /&gt;
&lt;br /&gt;
The z80 processor is faster at operations on 8-bit values.&lt;br /&gt;
Code dealing with 16-bit registers tends to be bigger and slower, because the equivalent 16-bit instruction is slower, or does not exist and has to be replaced with several instructions. Sometimes the equivalent 16-bit instruction is also 1 byte bigger.&lt;br /&gt;
Operations on the ix or iy registers are slower still, and each instruction is always 1 byte bigger. So try to convert your code to use hl and de instead of ix and iy.&lt;br /&gt;
&lt;br /&gt;
In a practical example, imagine:&lt;br /&gt;
- you pass through the accumulator a value to a routine&lt;br /&gt;
- if the only valid values of the accumulator range from 0 to 63 and if in that routine you need to multiply the accumulator by, say 12, it has to be stored in a 16-bit pair register.&lt;br /&gt;
- but you can multiply a by 4 before overflowing (63*4 = 252 which is smaller than 255) and take advantage of this to optimize&lt;br /&gt;
&lt;br /&gt;
Now on the code:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; The usual way is to copy A (the accumulator) into HL right at the start&lt;br /&gt;
	ld h,0&lt;br /&gt;
	ld l,a&lt;br /&gt;
	add a,a&lt;br /&gt;
	ld d,h&lt;br /&gt;
	ld e,a&lt;br /&gt;
	add hl,de&lt;br /&gt;
	add hl,hl&lt;br /&gt;
	add hl,hl	; hl=a*12&lt;br /&gt;
; 9 bytes, 56 clocks&lt;br /&gt;
&lt;br /&gt;
; But given a is between 0 and 63 you can multiply by 4 without overflowing the 8-bit limit (255)&lt;br /&gt;
	add a,a&lt;br /&gt;
	add a,a		; a*4&lt;br /&gt;
	ld l,a&lt;br /&gt;
	ld e,a&lt;br /&gt;
	ld h,0&lt;br /&gt;
	ld d,h		; hl=a*4 and de=a*4&lt;br /&gt;
	add hl,hl	; hl=a*8&lt;br /&gt;
	add hl,de	; hl=a*12&lt;br /&gt;
; 9 bytes, 49 clocks&lt;br /&gt;
&lt;br /&gt;
; hey, minus 7 clock cycles&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In this example you only shaved off a few clock cycles, but sometimes you can save some bytes, too.&lt;br /&gt;
You can do this with registers other than the accumulator.&lt;br /&gt;
&lt;br /&gt;
For example, if the value is passed in l and l is always lower than 64, you can do &amp;quot; sla l \ sla l \ ld h,0 &amp;quot; to multiply l by four and then use hl for 16-bit operations. In this case you are trading size for speed: each sla l is 2 bytes and 8 T-states, while add hl,hl is only 1 byte but 11 T-states.&lt;br /&gt;
&lt;br /&gt;
Mind that these optimizations can introduce bugs and make the code somewhat hard to follow, so comment them.&lt;br /&gt;
I recommend applying this optimization only when you really need the speed and the code is bug free.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
One common trick for multiplication by 256 is to simply load the low byte register into the high byte register. This works because in binary a multiplication by 256 is a shift left by 8 bits, with zeros shifted in. Examples:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; multiply a by 256 and store in hl&lt;br /&gt;
	ld h,a&lt;br /&gt;
	ld l,0&lt;br /&gt;
; multiply hl by 256 and store in ade (pseudo 24-bit pair register)&lt;br /&gt;
	ld a,h&lt;br /&gt;
	ld d,l&lt;br /&gt;
	ld e,0&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Stack ===&lt;br /&gt;
&lt;br /&gt;
When you run out of registers, the stack may offer an interesting alternative to fixed RAM locations for temporary storage.&lt;br /&gt;
&lt;br /&gt;
==== Allocation ====&lt;br /&gt;
&lt;br /&gt;
You can allocate stack space with repeated push, which lets you initialize the data but restricts the allocated space to multiples of 2.&lt;br /&gt;
An alternative is to allocate uninitialized stack space (hl may be replaced with an index register) :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; allocates 7 bytes of stack space : 5 bytes, 27 T-states instead of 4 bytes, 44 T-states with 4 push which would have forced the alloc of 8 bytes&lt;br /&gt;
 ld hl, -7&lt;br /&gt;
 add hl, sp&lt;br /&gt;
 ld sp, hl&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Access ====&lt;br /&gt;
&lt;br /&gt;
The most common way of accessing data allocated on stack is to use an index register since all allocated &amp;quot;variables&amp;quot; can be accessed without having to use inc/dec but this is obviously not a strict requirement. Beware though, using stack space is not always optimal in terms of speed, depending (among other things) on your register allocation strategy :&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; 3 bytes, 19 T-states&lt;br /&gt;
 ld c, (ix + n)   ; n is an immediate value in -128..127&lt;br /&gt;
 &lt;br /&gt;
 ; 4 bytes, 17 T-states, destroys a&lt;br /&gt;
 ld a, (somelocation)&lt;br /&gt;
 ld c, a&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If your needs go beyond simple load/store, however, this method starts to show its real power, since it vastly simplifies some operations that are complicated to do with fixed storage locations (and that generally clobber registers in the process).&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; 3 bytes, 19 T-states&lt;br /&gt;
 cp (ix + n)&lt;br /&gt;
&lt;br /&gt;
 sub (ix + n)&lt;br /&gt;
 sbc a, (ix + n)&lt;br /&gt;
 add a, (ix + n)&lt;br /&gt;
 adc a, (ix + n)&lt;br /&gt;
&lt;br /&gt;
 ; 3 bytes, 23 T-states&lt;br /&gt;
 inc (ix + n)&lt;br /&gt;
 dec (ix + n)&lt;br /&gt;
&lt;br /&gt;
 ; 3 bytes, 19 T-states&lt;br /&gt;
 and (ix + n)&lt;br /&gt;
 or (ix + n)&lt;br /&gt;
 xor (ix + n)&lt;br /&gt;
&lt;br /&gt;
 ; 4 bytes, 23 T-states&lt;br /&gt;
 rl (ix + n)&lt;br /&gt;
 rr (ix + n)&lt;br /&gt;
 rlc (ix + n)&lt;br /&gt;
 rrc (ix + n)&lt;br /&gt;
 sla (ix + n)&lt;br /&gt;
 sra (ix + n)&lt;br /&gt;
 sll (ix + n)      ; undocumented instruction&lt;br /&gt;
 srl (ix + n)&lt;br /&gt;
 bit k, (ix + n)   ; k is an immediate value in 0..7; bit takes only 20 T-states&lt;br /&gt;
 set k, (ix + n)&lt;br /&gt;
 res k, (ix + n)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Again, choose wisely between hl and an index register: depending on the structure of your data, the smallest/fastest solution may vary (the hl equivalents are generally 2 bytes smaller and 12 T-states faster, but they do not allow indexing and so may require intermediate inc/dec).&lt;br /&gt;
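&lt;br /&gt;
As a rough illustration of that trade-off, here is a sketch that sums three consecutive bytes both ways (the byte and T-state counts are worked out from the standard instruction timings; the data layout is only an assumption for the example):&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; indexed, ix keeps pointing at the first byte : 9 bytes, 57 T-states&lt;br /&gt;
 ld a, (ix + 0)&lt;br /&gt;
 add a, (ix + 1)&lt;br /&gt;
 add a, (ix + 2)&lt;br /&gt;
&lt;br /&gt;
 ; hl as a moving pointer, hl ends up 2 higher : 5 bytes, 33 T-states&lt;br /&gt;
 ld a, (hl)&lt;br /&gt;
 inc hl&lt;br /&gt;
 add a, (hl)&lt;br /&gt;
 inc hl&lt;br /&gt;
 add a, (hl)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;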
&lt;br /&gt;
==== Deallocation ====&lt;br /&gt;
&lt;br /&gt;
If you need to pop an entry from the stack but must preserve all registers, remember that sp can be incremented/decremented like any 16-bit register :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; drops the top stack entry : wastes 1 byte and 2 T-states compared to a pop, but may enable better register allocation...&lt;br /&gt;
 inc sp&lt;br /&gt;
 inc sp&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you have a large amount of stack space to drop and a spare 16-bit register (hl, an index register, or de, which you can easily swap with hl) :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; drop 16 bytes of stack space : 5 bytes, 27 T-states instead of 8 bytes, 80 T-states for 8 pop&lt;br /&gt;
 ld hl, 16&lt;br /&gt;
 add hl, sp&lt;br /&gt;
 ld sp, hl&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt; &lt;br /&gt;
The larger the space to drop, the more T-states you will save, and at some point you start saving space as well (once more than 10 bytes are dropped, since each pop costs 1 byte).&lt;br /&gt;
&lt;br /&gt;
=== Shadow registers ===&lt;br /&gt;
&lt;br /&gt;
In some rare cases, when you run out of registers and can neither refactor your algorithm(s) nor rely on RAM storage, you may want to use the shadow registers : af', bc', de' and hl'&lt;br /&gt;
&lt;br /&gt;
These registers behave like their &amp;quot;standard&amp;quot; counterparts (af, bc, de, hl), and you can swap the two register sets using the following instructions :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ex af, af'  ; swaps af and af' as the mnemonic indicates&lt;br /&gt;
&lt;br /&gt;
 exx         ; swaps bc, de, hl and bc', de', hl'&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Shadow registers are fairly common in arithmetic on big integers (16-bit to 32-bit) or in BCD operations, where they avoid relying on RAM storage or on pushing and popping the stack. Example:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
MUL32:&lt;br /&gt;
        DI&lt;br /&gt;
        AND     A               ; RESET CARRY FLAG&lt;br /&gt;
        SBC     HL,HL           ; LOWER RESULT = 0&lt;br /&gt;
        EXX&lt;br /&gt;
        SBC     HL,HL           ; HIGHER RESULT = 0&lt;br /&gt;
        LD      A,B             ; MPR IS AC'BC&lt;br /&gt;
        LD      B,32            ; INITIALIZE LOOP COUNTER&lt;br /&gt;
MUL32LOOP:&lt;br /&gt;
        SRA     A               ; RIGHT SHIFT MPR&lt;br /&gt;
        RR      C&lt;br /&gt;
        EXX&lt;br /&gt;
        RR      B&lt;br /&gt;
        RR      C               ; LOWEST BIT INTO CARRY&lt;br /&gt;
        JR      NC,MUL32NOADD&lt;br /&gt;
        ADD     HL,DE           ; RESULT += MPD&lt;br /&gt;
        EXX&lt;br /&gt;
        ADC     HL,DE&lt;br /&gt;
        EXX&lt;br /&gt;
MUL32NOADD:&lt;br /&gt;
        SLA     E               ; LEFT SHIFT MPD&lt;br /&gt;
        RL      D&lt;br /&gt;
        EXX&lt;br /&gt;
        RL      E&lt;br /&gt;
        RL      D&lt;br /&gt;
        DJNZ    MUL32LOOP&lt;br /&gt;
        EXX&lt;br /&gt;
       &lt;br /&gt;
; RESULT IN H'L'HL&lt;br /&gt;
        RET&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Shadow registers can be of great help but they come with some drawbacks :&lt;br /&gt;
&lt;br /&gt;
* they cannot coexist with the &amp;quot;standard&amp;quot; registers : you cannot use ld to assign from a standard to a shadow or vice-versa. Instead you must use nasty constructs such as :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ; loads hl' with the contents of hl&lt;br /&gt;
 push hl&lt;br /&gt;
 exx&lt;br /&gt;
 pop hl&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* they require interrupts to be disabled, since they were originally intended for use in Interrupt Service Routines. There are situations where this is affordable and others where it isn't. Regardless, it is generally good policy to restore the previous interrupt status (enabled/disabled) upon return instead of leaving it up to the caller. Fortunately this is relatively easy to do (though it adds about 5 bytes and 30 T-states to the routine) :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  ld a, i  ; this is the core of the trick, it sets P/V to the value of IFF so P/V is set iff interrupts were enabled at that point&lt;br /&gt;
  push af  ; save flags&lt;br /&gt;
  di       ; disable interrupts&lt;br /&gt;
  &lt;br /&gt;
  ; do something with shadow registers here&lt;br /&gt;
&lt;br /&gt;
  pop af   ; get back flags&lt;br /&gt;
  ret po   ; po = P/V reset so in this case it means interrupts were disabled before the routine was called&lt;br /&gt;
  ei       ; re-enable interrupts&lt;br /&gt;
  ret&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* finally, they make code ugly and very hard to follow, so comment it well&lt;br /&gt;
&lt;br /&gt;
=== SP register ===&lt;br /&gt;
&lt;br /&gt;
This register is used in desperate situations, generally in an interrupt loop that demands as much speed as possible when all the normal registers are already in use (notably in James Montelongo's 4-level grayscale interlace in graylib2.inc).&lt;br /&gt;
You need to know these valid but not widely known instructions:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld sp,6&lt;br /&gt;
 add hl,sp&lt;br /&gt;
 sbc hl,sp&lt;br /&gt;
 inc sp&lt;br /&gt;
 dec sp&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now an example of such a situation:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld (saveSP),sp&lt;br /&gt;
;init hl,de,bc,a&lt;br /&gt;
 ld sp,6&lt;br /&gt;
loop:&lt;br /&gt;
;code&lt;br /&gt;
 add hl,sp  ;get next row of a table for example&lt;br /&gt;
;code using bc,de,ix,a&lt;br /&gt;
 ld a,b&lt;br /&gt;
 or c&lt;br /&gt;
 jp nz,loop&lt;br /&gt;
;code&lt;br /&gt;
 ld sp,(saveSP)&lt;br /&gt;
 ret    ;finish interrupt&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt; &lt;br /&gt;
&lt;br /&gt;
Using sp this way means that you cannot push/pop registers, that no calls are allowed, and that interrupts must stay disabled (an interrupt would push the return address wherever sp points).&lt;br /&gt;
Mind again that this is only a last resort. Don't forget to save and restore sp as the example shows.&lt;br /&gt;
&lt;br /&gt;
== General Algorithms ==&lt;br /&gt;
&lt;br /&gt;
Register and memory use is very important when writing concise and fast z80 code. After that comes general optimization.&lt;br /&gt;
&lt;br /&gt;
First, try to optimize the most frequently executed code, in subroutines and large loops. Finding the bottleneck and solving it is enough for many programs.&lt;br /&gt;
&lt;br /&gt;
Do not forget that in z80 assembly vector tables (or look up tables) give smaller and faster code than blocks of comparisons and jumps. At other times, using a chunk of data for a task is better than a more conventional programming method (notably in graphics screen effects).&lt;br /&gt;
See [[Z80 Good Programming Practices]] for examples.&lt;br /&gt;
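&lt;br /&gt;
A small sketch of the look up table idea, assuming a holds a bit number from 0 to 7; getmask and masktable are hypothetical names:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; returns in a the mask with bit number a set, destroys de and hl&lt;br /&gt;
getmask:&lt;br /&gt;
 ld e,a&lt;br /&gt;
 ld d,0&lt;br /&gt;
 ld hl,masktable&lt;br /&gt;
 add hl,de      ; hl = masktable + a&lt;br /&gt;
 ld a,(hl)&lt;br /&gt;
 ret&lt;br /&gt;
masktable:&lt;br /&gt;
 .db 1,2,4,8,16,32,64,128&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
This replaces a chain of comparisons, or a loop of shifts, with a single indexed read.&lt;br /&gt;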
&lt;br /&gt;
Look through a complete instruction set reference for instructions that could optimize some part of the code.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
A list of things to keep in mind:&lt;br /&gt;
* Rework conditionals to be more efficient.&lt;br /&gt;
* Make sure the most common checks come first. In other words, check the special and rare cases last.&lt;br /&gt;
* Move special-case checks out of the main loop if they aren't needed there.&lt;br /&gt;
* Rearrange program flow&lt;br /&gt;
* Whenever you can afford a bigger overhead in exchange for getting code out of the main loop, do it.&lt;br /&gt;
* When it seems that your code won't be efficient enough even with optimization, try another approach or algorithm. Search for other algorithms on Wikipedia, for instance.&lt;br /&gt;
* Rewriting code from scratch can bring new ideas (use this in desperate situations only, because of all the work needed)&lt;br /&gt;
* Remember that it is almost always better to leave optimization to the end. Optimizing too early can bring headaches with crashes and debugging. And because ASM is very fast, and sometimes even smaller than higher level languages, further optimization may not be needed at all.&lt;br /&gt;
* Document wacky optimizations so that you can understand the code later (z80 optimization leads to code that is very hard to understand)&lt;br /&gt;
&lt;br /&gt;
== Self Modifying Code ==&lt;br /&gt;
&lt;br /&gt;
If your code is in RAM, it can write to itself to change its own instructions. An instruction set reference that lists the opcodes is useful here.&lt;br /&gt;
Although self modifying code can be applied to any instruction, it is most common with instructions that load constants into registers.&lt;br /&gt;
&lt;br /&gt;
Generally it is used to save a value for later use (often seen with masks). Examples:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld (savemask),a&lt;br /&gt;
;...code...&lt;br /&gt;
savemask = $+1&lt;br /&gt;
 ld a,$00   ; $00 is just a placeholder&lt;br /&gt;
&lt;br /&gt;
 ld (something),hl&lt;br /&gt;
;... code&lt;br /&gt;
something = $+1&lt;br /&gt;
 ld de,$0000   ; $0000 is just a placeholder&lt;br /&gt;
&lt;br /&gt;
 ld (saveSP),sp&lt;br /&gt;
;... code ...&lt;br /&gt;
saveSP = $+1&lt;br /&gt;
 ld sp,$0000  ; restore sp&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SMC (Self Modifying Code) is often combined with unrolling and relative jumps. Example:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ld (jpmodify),a&lt;br /&gt;
;...&lt;br /&gt;
jpmodify = $+1&lt;br /&gt;
 jr $+2      ; offset byte is a placeholder; a (0..8) written above = number of rrca to skip&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 rrca&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Small Tricks ==&lt;br /&gt;
&lt;br /&gt;
Note that the following tricks act much like a peep-hole optimizer and are the last optimization step : remember to optimize your algorithm and register allocation first, before applying any of the following, if you really want the fastest speed and the smallest code.&lt;br /&gt;
&lt;br /&gt;
Also note that nearly every trick makes the code less understandable, so documenting them is a good idea. You can easily forget how they work after a while away from the code.&lt;br /&gt;
&lt;br /&gt;
Be warned that some tricks are not exactly equivalent to the normal way and may have exceptions to their use; the comments warn about them. Some tricks apply to other cases as well, but again you have to be careful.&lt;br /&gt;
&lt;br /&gt;
Some tricks are nothing more than the correct use of the instructions available on the z80. Keeping an instruction set summary at hand helps you see what you can do while coding.&lt;br /&gt;
&lt;br /&gt;
=== Optimize size and speed ===&lt;br /&gt;
&lt;br /&gt;
==== Loading stuff ====&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of:&lt;br /&gt;
 ld a,0&lt;br /&gt;
;Try this:&lt;br /&gt;
 xor a    ;disadvantages: changes flags&lt;br /&gt;
;or&lt;br /&gt;
 sub a    ;disadvantages: changes flags&lt;br /&gt;
; -&amp;gt; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	ld b,$20&lt;br /&gt;
	ld c,$30&lt;br /&gt;
;try this&lt;br /&gt;
	ld bc,$2030&lt;br /&gt;
;or this&lt;br /&gt;
	ld bc,(b_num * 256) + c_num		;where b_num goes to b register and c_num to c register&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
  ld a,$42&lt;br /&gt;
  ld (hl),a&lt;br /&gt;
;try this&lt;br /&gt;
  ld (hl),$42&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	xor a&lt;br /&gt;
	ld (data1),a&lt;br /&gt;
	ld (data2),a&lt;br /&gt;
	ld (data3),a&lt;br /&gt;
	ld (data4),a&lt;br /&gt;
	ld (data5),a	;if data1 to data5 are one after the other&lt;br /&gt;
;try this&lt;br /&gt;
	ld hl,data1&lt;br /&gt;
	ld de,data1+1&lt;br /&gt;
	xor a&lt;br /&gt;
	ld (hl),a&lt;br /&gt;
	ld bc,4&lt;br /&gt;
	ldir&lt;br /&gt;
; -&amp;gt; save 3 bytes for every ld (dataX),a once past the initial overhead, but slower (ldir takes 21 T-states per byte)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	ld a,(var)&lt;br /&gt;
	inc a&lt;br /&gt;
	ld (var),a&lt;br /&gt;
;try this	;Note: if hl is not tied up, use indirection:&lt;br /&gt;
	ld hl,var&lt;br /&gt;
	inc (hl)&lt;br /&gt;
	ld a,(hl) ;if you don't need (hl) in a, delete this line&lt;br /&gt;
; -&amp;gt; save 2 bytes and 2 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; Instead of :&lt;br /&gt;
 ld a, (hl)&lt;br /&gt;
 ld (de), a&lt;br /&gt;
 inc hl&lt;br /&gt;
 inc de&lt;br /&gt;
; Use :&lt;br /&gt;
 ldi&lt;br /&gt;
 inc bc&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Never use:&lt;br /&gt;
    push BC&lt;br /&gt;
;    ...&lt;br /&gt;
    pop BC&lt;br /&gt;
    ld D,B&lt;br /&gt;
    ld E,C&lt;br /&gt;
;Use instead:&lt;br /&gt;
    push BC&lt;br /&gt;
;    ...&lt;br /&gt;
    pop DE      ;we only want DE to hold the pushed BC (no need for a copy in BC)&lt;br /&gt;
; -&amp;gt; save 2 bytes and 8 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Math and Logic tricks ====&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of:&lt;br /&gt;
 cp 0&lt;br /&gt;
;Use&lt;br /&gt;
 or a&lt;br /&gt;
; -&amp;gt; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  cp 1&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  dec a   ;changes a!&lt;br /&gt;
; -&amp;gt; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  xor %11111111&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  cpl&lt;br /&gt;
; -&amp;gt; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
    ld de,767&lt;br /&gt;
    or a       ;reset carry so sbc works as a sub&lt;br /&gt;
    sbc hl,de&lt;br /&gt;
;try this&lt;br /&gt;
    ld de,-767 ;negation of de&lt;br /&gt;
    add hl,de&lt;br /&gt;
; -&amp;gt; save 2 bytes and 8 T-states !&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
    ld de,-767&lt;br /&gt;
    add hl,de&lt;br /&gt;
;try this&lt;br /&gt;
    dec h  ; -256&lt;br /&gt;
    dec h  ; -512&lt;br /&gt;
    dec h  ; -768&lt;br /&gt;
    inc hl  ; -767&lt;br /&gt;
;Note that this works in many other cases&lt;br /&gt;
; -&amp;gt; save 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	srl a&lt;br /&gt;
	srl a&lt;br /&gt;
	srl a&lt;br /&gt;
;try this&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	rrca&lt;br /&gt;
	and %00011111&lt;br /&gt;
; -&amp;gt; save 1 byte and 5 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
	neg&lt;br /&gt;
	add a,N   ;you want to calculate N-A&lt;br /&gt;
;Do it this way:&lt;br /&gt;
	cpl&lt;br /&gt;
	add a,N+1    ;neg is practically equivalent to cpl \ inc a&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Never use:&lt;br /&gt;
    ld A,B&lt;br /&gt;
    neg&lt;br /&gt;
;Instead use:&lt;br /&gt;
    xor A&lt;br /&gt;
    sub B&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Never use:&lt;br /&gt;
    ld A,D&lt;br /&gt;
    sub $D3&lt;br /&gt;
    neg&lt;br /&gt;
;Instead use:&lt;br /&gt;
    ld A,$D3&lt;br /&gt;
    sub D&lt;br /&gt;
; -&amp;gt; save 2 bytes and 8 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  sla l&lt;br /&gt;
  rl h         ; I've actually seen this!&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  add hl,hl&lt;br /&gt;
; -&amp;gt; save 3 bytes and 5 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Conditionals ====&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  and 1&lt;br /&gt;
  cp 1&lt;br /&gt;
  jr z,foo&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  and 1         ;and sets zero flag, no need for cp&lt;br /&gt;
  jr nz,foo&lt;br /&gt;
; -&amp;gt; save 2 bytes and 7 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  and 1&lt;br /&gt;
  cp 1         ;a not needed after this&lt;br /&gt;
  jr z,foo&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  rra&lt;br /&gt;
  jr c,foo&lt;br /&gt;
; -&amp;gt; save 3 bytes and 10 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  bit 0,a&lt;br /&gt;
  call z,foo&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  rra          ;changes a!&lt;br /&gt;
  call nc,foo&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  bit 7,a&lt;br /&gt;
  jr z,foo&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  rla          ;changes a!&lt;br /&gt;
  jr nc,foo&lt;br /&gt;
; -&amp;gt; save 1 byte and 4 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  bit 2,a&lt;br /&gt;
  ret nz&lt;br /&gt;
  xor a&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
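  and %100     ;a is 0 when falling through, but a is changed when returning&lt;br /&gt;
  ret nz&lt;br /&gt;
; -&amp;gt; save 1 byte and 5 T-states&lt;br /&gt;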
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  cp 9        ;if a&amp;lt;=9 then goto label&lt;br /&gt;
  jp c,label&lt;br /&gt;
  jp z,label&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
  cp 9+1      ;if a&amp;lt;10 then goto label&lt;br /&gt;
  jp c,label&lt;br /&gt;
; -&amp;gt; save 3 bytes and 10 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Others ====&lt;br /&gt;
&lt;br /&gt;
Calling and returning...&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
 call xxxx&lt;br /&gt;
 ret&lt;br /&gt;
;try this&lt;br /&gt;
 jp xxxx&lt;br /&gt;
;only do this if the routine does not use the return address pushed by the call. Example: some kind of inline vputs.&lt;br /&gt;
; -&amp;gt; save 1 byte and 17 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Never use:&lt;br /&gt;
    dec B&lt;br /&gt;
    jr NZ,loop    ;I have seen this...&lt;br /&gt;
;Use:&lt;br /&gt;
    djnz loop&lt;br /&gt;
; save 1 byte and 3 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
loop:&lt;br /&gt;
 ld a,2&lt;br /&gt;
;code1&lt;br /&gt;
 ld a,0&lt;br /&gt;
;code2&lt;br /&gt;
 djnz loop&lt;br /&gt;
&lt;br /&gt;
;try this&lt;br /&gt;
 ld a,2&lt;br /&gt;
loop:&lt;br /&gt;
;code1&lt;br /&gt;
 xor 2     ; the trick: xor with (2 xor 0) makes a alternate between the two values each time it runs; code1/code2 must preserve a&lt;br /&gt;
;code2&lt;br /&gt;
 djnz loop&lt;br /&gt;
; -&amp;gt; save size and time depending on its use&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; Instead of&lt;br /&gt;
 ld a,(Number)&lt;br /&gt;
 cp 0&lt;br /&gt;
 jp z,A_is_0&lt;br /&gt;
 cp 1&lt;br /&gt;
 jp z,A_is_1&lt;br /&gt;
 cp 2&lt;br /&gt;
 jp z,A_is_2&lt;br /&gt;
 cp 3&lt;br /&gt;
 jp z,A_is_3&lt;br /&gt;
 cp 4&lt;br /&gt;
 jp z,A_is_4&lt;br /&gt;
 cp 5&lt;br /&gt;
 jp z,A_is_5&lt;br /&gt;
&lt;br /&gt;
; This is a little better&lt;br /&gt;
 ld a,(Number)&lt;br /&gt;
 or a&lt;br /&gt;
 jp z,A_is_0&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_1&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_2&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_3&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_4&lt;br /&gt;
 dec a&lt;br /&gt;
 jp z,A_is_5&lt;br /&gt;
&lt;br /&gt;
; Even better&lt;br /&gt;
 ld a,(Number)&lt;br /&gt;
 add a,a   ; a*2 (Number must be below 128)&lt;br /&gt;
 ld h,0 &lt;br /&gt;
 ld l,a &lt;br /&gt;
 ld de,VectorTable&lt;br /&gt;
 add hl,de&lt;br /&gt;
 ld a,(hl)&lt;br /&gt;
 inc hl&lt;br /&gt;
 ld h,(hl)&lt;br /&gt;
 ld l,a&lt;br /&gt;
 jp (hl)&lt;br /&gt;
&lt;br /&gt;
VectorTable:&lt;br /&gt;
 .dw A_is_0&lt;br /&gt;
 .dw A_is_1&lt;br /&gt;
 .dw A_is_2&lt;br /&gt;
 .dw A_is_3&lt;br /&gt;
 .dw A_is_4&lt;br /&gt;
 .dw A_is_5&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
Also see [[Z80 Good Programming Practices]]&lt;br /&gt;
&lt;br /&gt;
=== Size vs. Speed ===&lt;br /&gt;
&lt;br /&gt;
Size versus speed is the classic optimization trade-off in computer programming, and the Z80 is no exception.&lt;br /&gt;
In ASM, size is most often what matters, because ASM is generally fast enough and it is nice to give the user a smaller program that doesn't use up most of the RAM.&lt;br /&gt;
&lt;br /&gt;
==== For the sake of size ====&lt;br /&gt;
&lt;br /&gt;
* Use relative jumps (jr label) whenever possible. When a relative jump is out of range (outside -128 to 127 bytes) and there is a jp to the same label nearby, do a relative jump to that absolute jump. Example:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;lots of code (more than 128 bytes worth of code)&lt;br /&gt;
somelabel2:&lt;br /&gt;
 jp somelabel&lt;br /&gt;
;less than 128 bytes&lt;br /&gt;
 jr somelabel2   ;instead of an absolute jump directly to somelabel, jump to a jump to somelabel.&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
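* Relative jumps are 2 bytes and absolute jumps are 3. In terms of speed, jp always takes 10 T-states, while jr takes 12 T-states when the jump is taken and 7 when a conditional jr is not taken. So jp is faster when the jump occurs and jr is faster when it doesn't.&lt;br /&gt;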
&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
 dec bc&lt;br /&gt;
 ld a,b&lt;br /&gt;
 or c&lt;br /&gt;
 ret z&lt;br /&gt;
;try this&lt;br /&gt;
 cpi              ;decrements bc and increments hl; a is preserved, flags change&lt;br /&gt;
 ret po&lt;br /&gt;
; save 1 byte at the cost of 2 T-states&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''Passing inline data'''&lt;br /&gt;
&lt;br /&gt;
When you call a routine, pc + 3 (the address right after the call) is pushed. You can pop it and use it as a pointer to data. A very nifty use is with strings. To return, advance the pointer past the data and jp (hl).&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of:&lt;br /&gt;
 ld hl,string&lt;br /&gt;
 bcall(_vputs)&lt;br /&gt;
 ret&lt;br /&gt;
;Try this:&lt;br /&gt;
  call Disp&lt;br /&gt;
  .db &amp;quot;This is some text&amp;quot;,0&lt;br /&gt;
  ret&lt;br /&gt;
;Not a speed optimization, but it eliminates 2-byte pointers, since it just uses the call's return address.&lt;br /&gt;
;It also heavily disturbs disassembly.&lt;br /&gt;
Disp:&lt;br /&gt;
  pop hl&lt;br /&gt;
  bcall(_vputs)&lt;br /&gt;
  jp (hl)&lt;br /&gt;
; -&amp;gt; save 3 bytes for each use, after 5 bytes of overhead (Disp routine)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This routine can be expanded to pass the coordinates where the text should appear.&lt;br /&gt;
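&lt;br /&gt;
One possible sketch of that expansion, assuming the penCol and penRow equates of ti83plus.inc with penRow stored right after penCol; DispAt is a hypothetical name and the coordinates are arbitrary:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  call DispAt&lt;br /&gt;
  .db 10,20                ; column, row&lt;br /&gt;
  .db &amp;quot;This is some text&amp;quot;,0&lt;br /&gt;
  ret&lt;br /&gt;
&lt;br /&gt;
DispAt:&lt;br /&gt;
  pop hl                   ; hl points to the inline data&lt;br /&gt;
  ld e,(hl)                ; column&lt;br /&gt;
  inc hl&lt;br /&gt;
  ld d,(hl)                ; row&lt;br /&gt;
  inc hl&lt;br /&gt;
  ld (penCol),de           ; writes penCol = e and penRow = d&lt;br /&gt;
  bcall(_vputs)            ; as in Disp, hl is left just after the terminating 0&lt;br /&gt;
  jp (hl)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;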
&lt;br /&gt;
'''Wasting time to delay'''&lt;br /&gt;
&lt;br /&gt;
There are those funny times when you need some delay between operations, such as reads/writes to ports, '''''and there is nothing useful to do'''''. Because nop's are not very size friendly, think of other slower but smaller instructions. Example:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Instead of&lt;br /&gt;
 ld a,KEY_GROUP&lt;br /&gt;
 out (1),a&lt;br /&gt;
 nop&lt;br /&gt;
 nop&lt;br /&gt;
 in a,(1)&lt;br /&gt;
;Try this:&lt;br /&gt;
 ld a,KEY_GROUP&lt;br /&gt;
 out (1),a&lt;br /&gt;
 ld a,(de)    ;a doesn't need to be preserved because it will hold what the port has.&lt;br /&gt;
 in a,(1)&lt;br /&gt;
; -&amp;gt; save 1 byte and 1 T-state (well 1 T-state less is almost the same time)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When you need to delay and cannot afford to alter registers or flags there are still ways to delay that waste less size than nop's :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; 2 bytes, 8 T-states&lt;br /&gt;
 nop&lt;br /&gt;
 nop&lt;br /&gt;
&lt;br /&gt;
; 2 bytes, 12 T-states&lt;br /&gt;
 inc hl&lt;br /&gt;
 dec hl&lt;br /&gt;
&lt;br /&gt;
; 2 bytes, 12 T-states&lt;br /&gt;
 jr $+2&lt;br /&gt;
&lt;br /&gt;
; 2 bytes, 21 T-states&lt;br /&gt;
 push af&lt;br /&gt;
 pop af&lt;br /&gt;
&lt;br /&gt;
; 2 bytes, 38 T-states&lt;br /&gt;
 ex (sp), hl&lt;br /&gt;
 ex (sp), hl&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you need a small adjustable delay:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;4 bytes, b*13+2 T-states (variable)&lt;br /&gt;
	ld b,255	; initial delay&lt;br /&gt;
	djnz $		; do it&lt;br /&gt;
;b=0 on exit&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* There are many other instructions that you can use&lt;br /&gt;
* Beware that not all instructions preserve registers or flags&lt;br /&gt;
* For the delay between frames of a game, or other longer delays, you can use the 'halt' instruction if interrupts are enabled. It makes the calculator enter low power mode until an interrupt is triggered. To fine-tune this delay mechanism you can alter the interrupt mask and the interrupt timer speed beforehand (and possibly restore their values afterwards).&lt;br /&gt;
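&lt;br /&gt;
A minimal sketch of such a halt-based delay; the count of 20 is an arbitrary value for illustration, and the real duration depends on the interrupt settings:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
 ei            ; with interrupts disabled halt would never wake up&lt;br /&gt;
 ld b,20       ; number of interrupts to wait for&lt;br /&gt;
waitloop:&lt;br /&gt;
 halt          ; sleep until the next interrupt&lt;br /&gt;
 djnz waitloop&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;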
&lt;br /&gt;
==== Unrolling code ====&lt;br /&gt;
&lt;br /&gt;
'''General Unrolling'''&lt;br /&gt;
You can unroll a loop several times instead of looping; this is frequently used in multiplication routines.&lt;br /&gt;
This means you are spending memory to gain speed. Most of the time you will prefer size to speed.&lt;br /&gt;
&lt;br /&gt;
'''Unroll commands'''&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; &amp;quot;Classic&amp;quot; way : ~21 T-states per byte copied&lt;br /&gt;
 ld hl,src&lt;br /&gt;
 ld de,dest&lt;br /&gt;
 ld bc,size&lt;br /&gt;
 ldir&lt;br /&gt;
&lt;br /&gt;
; Unrolled n times : (16 * n + 10) / n T-states per byte copied -&amp;gt; ~17 T-states when unrolling 8 times&lt;br /&gt;
 ld hl,src&lt;br /&gt;
 ld de,dest&lt;br /&gt;
 ld bc,size  ; if the size is not a multiple of the number of unrolled ldi then a small trick must be used to jump appropriately inside the loop for the first iteration&lt;br /&gt;
loopldi:    ;you can use this entry for a call&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 jp pe, loopldi    ; jp used as it is faster and in the case of a loop unrolling we assume speed matters more than size&lt;br /&gt;
; ret if this is a subroutine and use the unrolled ldi's with a call.&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
This unrolling of ldi also works with ldd (replacing lddr) and with outi (where b is the counter, so the loop ends with jp nz).&lt;br /&gt;
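&lt;br /&gt;
When the size is not a multiple of the unroll count, one possible form of the entry trick mentioned in the comments above is to jump into the middle of the unrolled block. This is only a sketch: it assumes size is an assemble-time constant of at least 1 and that the assembler accepts a bitwise AND operator in expressions (the exact syntax may differ between assemblers); src, dest and size are the same placeholders as above:&lt;br /&gt;
 &lt;nowiki&gt;&lt;br /&gt;
 ld hl,src&lt;br /&gt;
 ld de,dest&lt;br /&gt;
 ld bc,size&lt;br /&gt;
; skip the ldi's that are not needed on the first pass (each ldi is 2 bytes)&lt;br /&gt;
 jp loopldi + (((8 - (size &amp;amp; 7)) &amp;amp; 7) * 2)&lt;br /&gt;
loopldi:&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 ldi&lt;br /&gt;
 jp pe, loopldi    ; bc is a multiple of 8 after the first pass&lt;br /&gt;
 &lt;/nowiki&gt;&lt;br /&gt;
For example, with size = 10 the jump skips 6 ldi's, so the first pass copies 2 bytes and the second pass copies the remaining 8.&lt;br /&gt;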
&lt;br /&gt;
==== Looping with 16 bit counter ====&lt;br /&gt;
There are two ways to make loops with a 16bit counter :&lt;br /&gt;
* the naive one, which results in smaller code but increased loop overhead (24 * n T-states) and destroys a&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  ld bc, ...&lt;br /&gt;
loop:&lt;br /&gt;
  ; loop body here&lt;br /&gt;
 &lt;br /&gt;
  dec bc&lt;br /&gt;
  ld  a, b&lt;br /&gt;
  or  c&lt;br /&gt;
  jp  nz,loop&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
* the slightly trickier one, which takes the 16bit count in de, needs a couple more bytes but has a much lower overhead (roughly 13 * n + 9 * (n / 256) T-states, plus the setup)&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  dec  de&lt;br /&gt;
  ld  b, e&lt;br /&gt;
  inc  b&lt;br /&gt;
  inc  d&lt;br /&gt;
loop2:&lt;br /&gt;
  ; loop body here&lt;br /&gt;
  &lt;br /&gt;
  djnz loop2&lt;br /&gt;
  dec  d&lt;br /&gt;
  jp  nz,loop2&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
The rationale behind the second method is to reduce the overhead of the &amp;quot;inner&amp;quot; loop as much as possible and to use the fact that when b gets down to zero it will be treated as 256 by djnz. &lt;br /&gt;
&lt;br /&gt;
You can therefore use the following macros for setting proper values of 8bit loop counters given a 16bit counter in case you want to do the conversion at compile time :&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  #define inner_counter8(counter16) ((((counter16) - 1) &amp;amp; 0xff) + 1)&lt;br /&gt;
  #define outer_counter8(counter16) ((((counter16) - 1) &amp;gt;&amp;gt; 8) + 1)&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Preserve Registers ===&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; description: both routines test whether b is 0; same size and speed, but the second preserves the accumulator&lt;br /&gt;
; remarks: - 8-bit inc/dec doesn't affect the carry flag&lt;br /&gt;
;          - 16-bit inc/dec doesn't affect any flags, so this trick does not extend to 16-bit registers.&lt;br /&gt;
	ld a,b&lt;br /&gt;
	or b&lt;br /&gt;
	jr z,label&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
	inc b&lt;br /&gt;
	dec b&lt;br /&gt;
	jr z,label&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; description: add a to hl without using another 16-bit register pair (a is destroyed instead of de)&lt;br /&gt;
;normal way:&lt;br /&gt;
	ld d,$00&lt;br /&gt;
	ld e,a&lt;br /&gt;
	add hl,de&lt;br /&gt;
;4 bytes and 22 clock cycles&lt;br /&gt;
; &amp;gt;&lt;br /&gt;
	add a,l&lt;br /&gt;
	ld l,a&lt;br /&gt;
	jr nc, $+3&lt;br /&gt;
	inc h&lt;br /&gt;
;5 bytes, 19/20 clock cycles&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Setting flags ==&lt;br /&gt;
On some occasions you might want to selectively set or reset a flag.&lt;br /&gt;
&lt;br /&gt;
Here are the most common uses :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; set Carry flag&lt;br /&gt;
 scf&lt;br /&gt;
&lt;br /&gt;
; reset Carry flag (alters Sign and Zero flags as defined)&lt;br /&gt;
 or a&lt;br /&gt;
&lt;br /&gt;
; alternate reset Carry flag (alters Sign and Zero flags as defined)&lt;br /&gt;
 and a&lt;br /&gt;
&lt;br /&gt;
; set Zero flag (resets Carry flag, alters Sign flag as defined)&lt;br /&gt;
 cp a&lt;br /&gt;
&lt;br /&gt;
; reset Zero flag (alters a, reset Carry flag, alters Sign flag as defined)&lt;br /&gt;
 or 1&lt;br /&gt;
&lt;br /&gt;
; set Sign flag (negative) (alters a, reset Zero and Carry flags)&lt;br /&gt;
 or $80&lt;br /&gt;
&lt;br /&gt;
; reset Sign flag (positive) (set a to zero, set Zero flag, reset Carry flag)&lt;br /&gt;
 xor a&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Other possible uses (much rarer) :&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;Set parity/overflow (even):&lt;br /&gt;
 xor a&lt;br /&gt;
&lt;br /&gt;
;Reset parity/overflow (sub reports overflow here, and a - a never overflows):&lt;br /&gt;
 sub a&lt;br /&gt;
&lt;br /&gt;
;Set half carry (hardly ever useful but still...)&lt;br /&gt;
 and a&lt;br /&gt;
&lt;br /&gt;
;Reset half carry (hardly ever useful but still...)&lt;br /&gt;
 or a&lt;br /&gt;
&lt;br /&gt;
;Set bit 5 of f:&lt;br /&gt;
 or %00100000&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
As you can see, these are extremely simple, small and fast ways to alter flags,&lt;br /&gt;
which makes them interesting as outputs of routines to indicate error/success or&lt;br /&gt;
other status bits that do not require a full register.&lt;br /&gt;
&lt;br /&gt;
If you use this, remember that these flag (re)setting tricks frequently&lt;br /&gt;
overlap, so a special combination of flags might require slightly&lt;br /&gt;
more elaborate tricks. As a rule of thumb, always alter the carry last in&lt;br /&gt;
such cases, because the scf and ccf instructions leave the Sign, Zero and Parity/Overflow flags untouched.&lt;br /&gt;
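&lt;br /&gt;
For instance, a routine can report success or failure in the carry flag. The following is a minimal sketch; the routine name isdigit and the convention that a set carry means failure are assumptions made for illustration:&lt;br /&gt;
 &lt;nowiki&gt;&lt;br /&gt;
; input: a = character&lt;br /&gt;
; output: carry reset if a is in '0'..'9', carry set otherwise (a is preserved)&lt;br /&gt;
isdigit:&lt;br /&gt;
 cp '0'&lt;br /&gt;
 ret c        ; a is below '0': carry is already set&lt;br /&gt;
 cp '9'+1     ; carry set if a is '9' or below&lt;br /&gt;
 ccf          ; invert it so that carry set means failure&lt;br /&gt;
 ret&lt;br /&gt;
 &lt;/nowiki&gt;&lt;br /&gt;
The caller can then branch directly on the result, for example with jr c, without spending a register on the status.&lt;br /&gt;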
&lt;br /&gt;
More advanced ways of manipulating flags follow:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;get the zero flag in carry &lt;br /&gt;
	scf&lt;br /&gt;
	jr z,$+3&lt;br /&gt;
	ccf&lt;br /&gt;
&lt;br /&gt;
;put the carry flag into the zero flag (destroys a: a = 0 if carry was set, $ff otherwise)&lt;br /&gt;
	ccf&lt;br /&gt;
	sbc a, a&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Tools of the job ==&lt;br /&gt;
&lt;br /&gt;
Want to test your optimizations or try new ones? Then check the following:&lt;br /&gt;
* Keep a Z80 instruction set reference at hand so you do not forget a useful instruction or which flags it affects. (see [[Z80_Instruction_Set|Z80_Instruction_Set]])&lt;br /&gt;
* Use an assembler that supports &amp;quot;.echo&amp;quot; and use it in the source to report sizes: (see [[Assemblers|Assemblers]])&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;SomeCodeorData:&lt;br /&gt;
;code or data goes here&lt;br /&gt;
End:&lt;br /&gt;
 .echo &amp;quot;size of the code/data:&amp;quot;&lt;br /&gt;
 .echo End-SomeCodeorData&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
* Get a nice Z80 IDE that counts code ([[IDEs|IDE's]])&lt;br /&gt;
* Make use of the counting capabilities of an emulator ([[:Category:Emulators|Emulators]])&lt;br /&gt;
&lt;br /&gt;
== Related topics ==&lt;br /&gt;
* [http://www.junemann.nl/maxcoderz/viewtopic.php?f=5&amp;amp;t=675 MaxCoderz TI-ASM optimization]&lt;br /&gt;
* ticalc archives: [http://www.ticalc.org/archives/files/fileinfo/108/10821.html 1] [http://www.ticalc.org/archives/files/fileinfo/285/28502.html 2]&lt;br /&gt;
* [http://www.ballyalley.com/ml/z80_docs/z80_docs.html Bally Alley Z80 Machine Language Documentation]&lt;br /&gt;
* [http://map.grauw.nl/articles/fast_loops.php Fast loops in MSX Assembly Page]&lt;br /&gt;
* [http://shiar.nl/calc/z80/optimize Shiar z80 optimization page]&lt;br /&gt;
&lt;br /&gt;
== Acknowledgements ==&lt;br /&gt;
* fullmetalcoder&lt;br /&gt;
* Galandros&lt;br /&gt;
* MaxCoderz participants in assembly optimizing topic (Jim e,CoBB,...)&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	<entry>
		<id>https://wikiti.brandonw.net/index.php?title=Z80_Routines:Math:Logarithm</id>
		<title>Z80 Routines:Math:Logarithm</title>
		<link rel="alternate" type="text/html" href="https://wikiti.brandonw.net/index.php?title=Z80_Routines:Math:Logarithm"/>
				<updated>2010-05-29T10:26:46Z</updated>
		
		<summary type="html">&lt;p&gt;Galandros: fix formatting&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Z80 Routines:Math|Logarithm]]&lt;br /&gt;
[[Category:Z80 Routines|Logarithm]]&lt;br /&gt;
&lt;br /&gt;
= Introduction =&lt;br /&gt;
&lt;br /&gt;
== Integer Log of base 2 ==&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
; input: hl (16-bit integer unsigned)&lt;br /&gt;
; output: a = log2(hl) rounded down (0 to 15), or -1 if hl = 0 (8-bit signed integer)&lt;br /&gt;
; destroys: b, hl&lt;br /&gt;
log2:&lt;br /&gt;
	ld a,h&lt;br /&gt;
	or l&lt;br /&gt;
	ld a,-1&lt;br /&gt;
	ret z     ; return -1 if hl=0&lt;br /&gt;
	ld b,15   ; the rounded-down logarithm in base 2 of an integer is the index of its highest set bit, i.e. its number of significant bits minus 1&lt;br /&gt;
log2loop:&lt;br /&gt;
	add hl,hl&lt;br /&gt;
	jr c,log2end&lt;br /&gt;
	djnz log2loop&lt;br /&gt;
log2end:&lt;br /&gt;
	ld a,b&lt;br /&gt;
	ret&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Integer Log of base 10 ==&lt;br /&gt;
&lt;br /&gt;
Since log10(hl) = log2(hl)/log2(10),&lt;br /&gt;
we can multiply log2(hl) by 1/log2(10), which is about 0.30103.&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Integer Log of base B ==&lt;br /&gt;
&lt;br /&gt;
The same trick as above.&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
;unfinished&lt;br /&gt;
logB:&lt;br /&gt;
; input: hl = number&lt;br /&gt;
;	 b = base&lt;br /&gt;
; output: a = log hl base b&lt;br /&gt;
	&lt;br /&gt;
	ret&lt;br /&gt;
 &amp;lt;/nowiki&amp;gt;&lt;/div&gt;</summary>
		<author><name>Galandros</name></author>	</entry>

	</feed>