VR4100 Series™
64-/32-bit Microprocessor
Architecture

User's Manual

Target Devices
µPD30121 (VR4121™)
µPD30122 (VR4122™)
µPD30131 (VR4131™)
µPD30181 (VR4181™)
µPD30181A, 30181AY (VR4181A™)

Document No. U15509EJ2V0UM00 (2nd edition)
Date Published June 2002 NS CP(K)
© NEC Corporation 2002
© MIPS Technologies, Inc. 1997, 2001
Printed in Japan
User's Manual U15509EJ2V0UM
Notes for CMOS Devices

1  Precaution against ESD for semiconductors

Note: Strong electric field, when exposed to a MOS device, can cause destruction of the gate oxide and ultimately degrade the device operation. Steps must be taken to stop generation of static electricity as much as possible, and quickly dissipate it once, when it has occurred. Environmental control must be adequate. When it is dry, humidifier should be used. It is recommended to avoid using insulators that easily build static electricity. Semiconductor devices must be stored and transported in an anti-static container, static shielding bag or conductive material. All test and measurement tools including work bench and floor should be grounded. The operator should be grounded using wrist strap. Semiconductor devices must not be touched with bare hands. Similar precautions need to be taken for PW boards with semiconductor devices on it.

2  Handling of unused input pins for CMOS

Note: No connection for CMOS device inputs can be cause of malfunction. If no connection is provided to the input pins, it is possible that an internal input level may be generated due to noise, etc., hence causing malfunction. CMOS devices behave differently than bipolar or NMOS devices. Input levels of CMOS devices must be fixed high or low by using a pull-up or pull-down circuitry. Each unused pin should be connected to VDD or GND with a resistor, if it is considered to have a possibility of being an output pin. All handling related to the unused pins must be judged device by device and related specifications governing the devices.

3  Status before initialization of MOS devices

Note: Power-on does not necessarily define initial status of MOS device. Production process of MOS does not define the initial operation status of the device. Immediately after the power source is turned on, the devices with reset function have not yet been initialized. Hence, power-on does not guarantee out-pin levels, I/O settings or contents of registers. Device is not initialized until the reset signal is received. Reset operation must be executed immediately after power-on for devices having reset function.

VR10000, VR12000, VR4000, VR4000 Series, VR4100, VR4100 Series, VR4110, VR4120, VR4121, VR4122, VR4130, VR4131, VR4181, VR4181A, VR4300, VR4305, VR4310, VR4400, VR5000, VR5000A, VR5432, VR5500, and VR Series are trademarks of NEC Corporation.
MIPS is a registered trademark of MIPS Technologies, Inc. in the United States.
MC68000 is a trademark of Motorola Inc.
IBM370 is a trademark of IBM Corp.
Pentium is a trademark of Intel Corp.
DEC VAX is a trademark of Digital Equipment Corp.
UNIX is a registered trademark in the United States and other countries, licensed exclusively through X/Open Company, Ltd.
Purchase of NEC I2C components conveys a license under the Philips I2C Patent Rights to use these components in an I2C system, provided that the system conforms to the I2C Standard Specification as defined by Philips.

Exporting this product or equipment that includes this product may require a governmental license from the U.S.A. for some countries because this product utilizes technologies limited by the export control regulations of the U.S.A.

The information in this document is current as of April, 2002. The information is subject to change without notice. For actual design-in, refer to the latest publications of NEC's data sheets or data books, etc., for the most up-to-date specifications of NEC semiconductor products. Not all products and/or types are available in every country. Please check with an NEC sales representative for availability and additional information.

No part of this document may be copied or reproduced in any form or by any means without prior written consent of NEC. NEC assumes no responsibility for any errors that may appear in this document.

NEC does not assume any liability for infringement of patents, copyrights or other intellectual property rights of third parties by or arising from the use of NEC semiconductor products listed in this document or any other liability arising from the use of such products. No license, express, implied or otherwise, is granted under any patents, copyrights or other intellectual property rights of NEC or others.

Descriptions of circuits, software and other related information in this document are provided for illustrative purposes in semiconductor product operation and application examples. The incorporation of these circuits, software and information in the design of customer's equipment shall be done under the full responsibility of customer. NEC assumes no responsibility for any losses incurred by customers or third parties arising from the use of these circuits, software and information.

While NEC endeavours to enhance the quality, reliability and safety of NEC semiconductor products, customers agree and acknowledge that the possibility of defects thereof cannot be eliminated entirely. To minimize risks of damage to property or injury (including death) to persons arising from defects in NEC semiconductor products, customers must incorporate sufficient safety measures in their design, such as redundancy, fire-containment, and anti-failure features.

NEC semiconductor products are classified into the following three quality grades: "Standard", "Special" and "Specific". The "Specific" quality grade applies only to semiconductor products developed based on a customer-designated "quality assurance program" for a specific application. The recommended applications of a semiconductor product depend on its quality grade, as indicated below. Customers must check the quality grade of each semiconductor product before using it in a particular application.

"Standard": Computers, office equipment, communications equipment, test and measurement equipment, audio and visual equipment, home electronic appliances, machine tools, personal electronic equipment and industrial robots
"Special": Transportation equipment (automobiles, trains, ships, etc.), traffic control systems, anti-disaster systems, anti-crime systems, safety equipment and medical equipment (not specifically designed for life support)
"Specific": Aircraft, aerospace equipment, submersible repeaters, nuclear reactor control systems, life support systems and medical equipment for life support, etc.

The quality grade of NEC semiconductor products is "Standard" unless otherwise expressly specified in NEC's data sheets or data books, etc. If customers wish to use NEC semiconductor products in applications not intended by NEC, they must contact an NEC sales representative in advance to determine NEC's willingness to support a given application.

(Note)
(1) "NEC" as used in this statement means NEC Corporation and also includes its majority-owned subsidiaries.
(2) "NEC semiconductor products" means any semiconductor product developed or manufactured by or for NEC (as defined above).

M8E 00. 4
Regional Information

Some information contained in this document may vary from country to country. Before using any NEC product in your application, please contact the NEC office in your country to obtain a list of authorized representatives and distributors. They will verify:

• Device availability
• Ordering information
• Product release schedule
• Availability of related technical literature
• Development environment specifications (for example, specifications for third-party tools and components, host computers, power plugs, AC supply voltages, and so forth)
• Network requirements

In addition, trademarks, registered trademarks, export restrictions, and other legal issues may also vary from country to country.

NEC Electronics Inc. (U.S.)
Santa Clara, California
Tel: 408-588-6000, 800-366-9782
Fax: 408-588-6130, 800-729-9288

NEC Electronics Hong Kong Ltd.
Hong Kong
Tel: 2886-9318
Fax: 2886-9022/9044

NEC Electronics Hong Kong Ltd., Seoul Branch
Seoul, Korea
Tel: 02-528-0303
Fax: 02-528-4411

NEC Electronics Shanghai, Ltd.
Shanghai, P.R. China
Tel: 021-6841-1138
Fax: 021-6841-1137

NEC Electronics Taiwan Ltd.
Taipei, Taiwan
Tel: 02-2719-2377
Fax: 02-2719-5951

NEC Electronics Singapore Pte. Ltd.
Novena Square, Singapore
Tel: 253-8311
Fax: 250-3583

NEC do Brasil S.A., Electron Devices Division
Guarulhos-SP, Brasil
Tel: 11-6462-6810
Fax: 11-6462-6829

NEC Electronics (Europe) GmbH
Duesseldorf, Germany
Tel: 0211-65 03 01
Fax: 0211-65 03 327

• Sucursal en España
  Madrid, Spain
  Tel: 091-504 27 87
  Fax: 091-504 28 60

• Succursale Française
  Vélizy-Villacoublay, France
  Tel: 01-30-67 58 00
  Fax: 01-30-67 58 99

• Filiale Italiana
  Milano, Italy
  Tel: 02-66 75 41
  Fax: 02-66 75 42 99

• Branch The Netherlands
  Eindhoven, The Netherlands
  Tel: 040-244 58 45
  Fax: 040-244 45 80

• Branch Sweden
  Taeby, Sweden
  Tel: 08-63 80 820
  Fax: 08-63 80 388

• United Kingdom Branch
  Milton Keynes, UK
  Tel: 01908-691-133
  Fax: 01908-670-290

J02.4
Preface

Readers
This manual targets users who intend to understand the functions of the VR4100 Series RISC microprocessors and to design application systems using them.

Purpose
This manual introduces the architecture of the VR4100 Series to users, following the organization described below.

Organization
Two manuals are available for the VR4100 Series: the Architecture User's Manual (this manual) and the Hardware User's Manual of each product.

Architecture User's Manual                               Hardware User's Manual
• Pipeline operation                                     • Pin functions
• Cache organization and memory management system        • Physical address space
• Exception processing                                   • Function of Coprocessor 0
• Interrupts                                             • Initialization interface
• Instruction set                                        • Peripheral units

How to read this manual
It is assumed that the reader of this manual has general knowledge in the fields of electrical engineering, logic circuits, and microcomputers.

In this manual, the following products are referred to as the VR4100 Series. Descriptions that differ between these products are explained individually, and common parts are explained as for the VR4100 Series.

VR4121 (µPD30121)
VR4122 (µPD30122)
VR4131 (µPD30131)
VR4181 (µPD30181)
VR4181A (µPD30181A, 30181AY)

To learn in detail about the function of a specific instruction, read Chapter 2 CPU Instruction Set Summary, Chapter 3 MIPS16 Instruction Set, Chapter 9 CPU Instruction Set Details, and Chapter 10 MIPS16 Instruction Set Format.

To learn about the overall functions of the VR4100 Series, read this manual in sequential order.

To learn about hardware functions, refer to the Hardware User's Manual, which is separately available.

To learn about electrical specifications, refer to the Data Sheet, which is separately available.
Conventions

Data significance:        Higher on left and lower on right
Active low:               XXX# (trailing # after pin and signal names)
Note:                     Description of item marked with Note in the text
Caution:                  Information requiring particular attention
Remark:                   Supplementary information
Numeric representation:   Binary/decimal ... XXXX
                          Hexadecimal ... 0xXXXX

Prefixes representing an exponent of 2 (for address space or memory capacity):
                          K (kilo) ... 2^10 = 1024
                          M (mega) ... 2^20 = 1024^2
                          G (giga) ... 2^30 = 1024^3
                          T (tera) ... 2^40 = 1024^4
                          P (peta) ... 2^50 = 1024^5
                          E (exa)  ... 2^60 = 1024^6

Related Documents

The related documents indicated here may include preliminary versions. However, preliminary versions are not marked as such.

Document name                                       Document number
VR4100 Series Architecture User's Manual            This manual
VR4121 User's Manual                                U13569E
µPD30121 (VR4121) Data Sheet                        U14691E
VR4122 User's Manual                                U14327E
µPD30122 (VR4122) Data Sheet                        U16219E
VR4131 Hardware User's Manual                       U15350E
µPD30131 (VR4131) Data Sheet                        To be prepared
VR4181 Hardware User's Manual                       U14272E
µPD30181 (VR4181) Data Sheet                        U14273E
VR4181A Hardware User's Manual                      To be prepared
µPD30181A, 30181AY (VR4181A) Data Sheet             To be prepared
VR Series™ Programming Guide Application Note       U10710E
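As an illustration of the power-of-two prefixes and the 0x hexadecimal notation defined under Conventions, the following is a minimal sketch in Python. The names PREFIX_EXPONENT, to_bytes, and to_hex are hypothetical helpers introduced only for this example and do not appear in the manual.

```python
# Hypothetical helpers (not part of the manual) illustrating the
# power-of-two prefixes K, M, G, T, P, E from the Conventions section.
PREFIX_EXPONENT = {"K": 10, "M": 20, "G": 30, "T": 40, "P": 50, "E": 60}


def to_bytes(count: int, prefix: str) -> int:
    """Return count * 2**exponent for one of the prefixes K, M, G, T, P, E."""
    return count << PREFIX_EXPONENT[prefix]


def to_hex(value: int) -> str:
    """Format a non-negative value with the leading 0x used for hexadecimal."""
    return f"0x{value:X}"


if __name__ == "__main__":
    print(to_bytes(1, "K"))            # one K of address space, in bytes
    print(to_hex(to_bytes(4, "G")))    # 4 G written in hexadecimal notation
```

Shifting left by the exponent is equivalent to multiplying by the corresponding power of 1024, so for instance to_bytes(1, "M") equals 1024 squared, in line with the prefix list above.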
Contents

CHAPTER 1 INTRODUCTION
  1.1 Features
  1.2 CPU core
    1.2.1 CPU registers
    1.2.2 Coprocessors
    1.2.3 System control coprocessor (CP0)
    1.2.4 Floating-point unit (FPU)
    1.2.5 Cache memory
  1.3 CPU instruction set overview
  1.4 Data formats and addressing
  1.5 Memory management system
    1.5.1 Translation lookaside buffer (TLB)
    1.5.2 Processor modes
  1.6 Instruction pipeline
    1.6.1 Branch prediction
  1.7 Code compatibility

CHAPTER 2 CPU INSTRUCTION SET SUMMARY
  2.1 Instruction set architecture
  2.2 CPU instruction formats
  2.3 Instructions added in the VR4100 Series
    2.3.1 Product-sum operation instructions
    2.3.2 Power mode instructions
  2.4 Instruction overview
    2.4.1 Load and store instructions
    2.4.2 Computational instructions
    2.4.3 Jump and branch instructions
    2.4.4 Special instructions
    2.4.5 System control coprocessor (CP0) instructions

CHAPTER 3 MIPS16 INSTRUCTION SET
  3.1 Outline
  3.2 Features
  3.3 Register set
  3.4 ISA mode
    3.4.1 Changing ISA mode bit by software
    3.4.2 Changing ISA mode bit by exception
    3.4.3 Enabling change ISA mode bit
  3.5 Types of instructions
  3.6 Instruction format
  3.7 MIPS16 operation code bit encoding
  3.8 Outline of instructions
    3.8.1 PC-relative instructions
    3.8.2 Extend instruction
    3.8.3 Delay slots
    3.8.4 Instruction details

CHAPTER 4 PIPELINE
  4.1 Pipeline stages
    4.1.1 VR4121, VR4122, VR4181A
    4.1.2 VR4131
    4.1.3 VR4181
  4.2 Branch delay
    4.2.1 VR4121, VR4122, VR4181A
    4.2.2 VR4131
    4.2.3 VR4181
  4.3 Branch prediction
    4.3.1 VR4122, VR4181A
    4.3.2 VR4131
  4.4 Load delay
  4.5 Instruction streaming
  4.6 Pipeline activities
  4.7 Interlock and exception
    4.7.1 Exception conditions
    4.7.2 Stall conditions
    4.7.3 Slip conditions
    4.7.4 Bypassing

CHAPTER 5 MEMORY MANAGEMENT SYSTEM
  5.1 Processor modes
    5.1.1 Operating mode
    5.1.2 Addressing mode
  5.2 Translation lookaside buffer (TLB)
    5.2.1 Format of a TLB entry
    5.2.2 Manipulation of TLB
    5.2.3 TLB instructions
    5.2.4 TLB exceptions
  5.3 Virtual-to-physical address translation
    5.3.1 32-bit mode address translation
    5.3.2 64-bit mode address translation
  5.4 Address space
    5.4.1 User mode virtual address space
    5.4.2 Supervisor mode virtual address space
    5.4.3 Kernel mode virtual address space
  5.5 Memory management registers
    5.5.1 Index register (0)
    5.5.2 Random register (1)
    5.5.3 EntryLo0 (2) and EntryLo1 (3) registers
    5.5.4 PageMask register (5)
    5.5.5 Wired register (6)
    5.5.6 EntryHi register (10)
    5.5.7 Processor revision identifier (PRId) register (15)
    5.5.8 Config register (16)
    5.5.9 Load linked address (LLAddr) register (17)
    5.5.10 TagLo (28) and TagHi (29) registers

CHAPTER 6 EXCEPTION PROCESSING
  6.1 Exception processing overview
    6.1.1 Precision of exceptions
  6.2 Exception processing registers
    6.2.1 Context register (4)
    6.2.2 BadVAddr register (8)
    6.2.3 Count register (9)
    6.2.4 Compare register (11)
    6.2.5 Status register (12)
    6.2.6 Cause register (13)
    6.2.7 Exception program counter (EPC) register (14)
    6.2.8 WatchLo (18) and WatchHi (19) registers
    6.2.9 XContext register (20)
    6.2.10 Parity error register (26)
    6.2.11 Cache error register (27)
    6.2.12 ErrorEPC register (30)
  6.3 Overview of exceptions
    6.3.1 Exception types
    6.3.2 Exception vector locations
    6.3.3 Priority of exceptions
  6.4 Details of exceptions
    6.4.1 Cold Reset exception
    6.4.2 Soft Reset exception
    6.4.3 NMI exception
    6.4.4 Address error exception
    6.4.5 TLB exceptions
    6.4.6 Bus error exception
    6.4.7 System call exception
    6.4.8 Breakpoint exception
    6.4.9 Coprocessor unusable exception
    6.4.10 Reserved instruction exception
    6.4.11 Trap exception
    6.4.12 Integer overflow exception
    6.4.13 Watch exception
    6.4.14 Interrupt exception
  6.5 Exception processing and servicing flowcharts

CHAPTER 7 CACHE MEMORY
  7.1 Memory organization
    7.1.1 On-chip caches
  7.2 Cache organization
    7.2.1 Instruction cache line
    7.2.2 Data cache line
    7.2.3 Placement of cache data
  7.3 Cache operations
    7.3.1 Cache data coherency
    7.3.2 Replacement of cache line
    7.3.3 Accessing the caches
  7.4 Cache states
    7.4.1 Cache state transition diagrams
  7.5 Cache access flow
  7.6 Manipulation of the caches by an external agent
  7.7 Initialization of the caches

CHAPTER 8 CPU CORE INTERRUPTS
  8.1 Types of interrupt request
    8.1.1 Non-maskable interrupt (NMI)
    8.1.2 Ordinary interrupts
    8.1.3 Software interrupts generated in CPU core
    8.1.4 Timer interrupt
  8.2 Acknowledging interrupts
    8.2.1 Detecting hardware interrupts
    8.2.2 Masking interrupt signals

CHAPTER 9 CPU INSTRUCTION SET DETAILS
  9.1 Instruction notation conventions
  9.2 Notes on using CPU instructions
    9.2.1 Load and store instructions
    9.2.2 Jump and branch instructions
    9.2.3 System control coprocessor (CP0) instructions
  9.3 CPU instructions
  9.4 CPU instruction opcode bit encoding

CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT

CHAPTER 11 COPROCESSOR 0 HAZARDS

APPENDIX INDEX
LIST OF FIGURES (1/3)

Fig. No.  Title  Page
1-1. CPU core internal block diagram ..... 19
1-2. CPU registers ..... 21
1-3. CPU instruction formats (32-bit length instruction) ..... 23
1-4. CPU instruction formats (16-bit length instruction) ..... 25
1-5. Byte address in big-endian byte order ..... 27
1-6. Byte address in little-endian byte order ..... 28
1-7. Misaligned word accessing (little-endian) ..... 29
2-1. CPU instruction formats ..... 34
2-2. Byte specification related to load and store instructions ..... 37
4-1. Pipeline stages (VR4121, VR4122, VR4181A) ..... 85
4-2. Instruction execution in the pipeline (VR4121, VR4122, VR4181A) ..... 86
4-3. Pipeline stages (VR4131) ..... 87
4-4. Instruction execution in the pipeline (VR4131) ..... 88
4-5. Pipeline stages (VR4181) ..... 89
4-6. Instruction execution in the pipeline (VR4181) ..... 89
4-7. Branch delay (VR4121, VR4122, VR4181A) ..... 90
4-8. Branch delay (VR4131, MIPS III instruction mode) ..... 91
4-9. Branch delay (VR4131, MIPS16 instruction mode) ..... 92
4-10. Branch delay (VR4181) ..... 93
4-11. Pipeline on branch prediction (VR4122, VR4181A) ..... 95
4-12. Pipeline on branch prediction (VR4131, when the branch is in the lower address) ..... 97
4-13. Pipeline on branch prediction (VR4131, when the branch is in the higher address) ..... 99
4-14. Pipeline activities ..... 102
4-15. ADD instruction pipeline activities (VR4121, VR4122, VR4181A) ..... 104
4-16. ADD instruction pipeline activities (VR4131) ..... 105
4-17. ADD instruction pipeline activities (VR4181) ..... 105
4-18. JALR instruction pipeline activities (VR4121, VR4122, VR4181A) ..... 106
4-19. JALR instruction pipeline activities (VR4131) ..... 107
4-20. JALR instruction pipeline activities (VR4181) ..... 107
4-21. BEQ instruction pipeline activities (VR4121, VR4122, VR4181A) ..... 108
4-22. BEQ instruction pipeline activities (VR4131) ..... 109
4-23. BEQ instruction pipeline activities (VR4181) ..... 109
4-24. TLT instruction pipeline activities (VR4121, VR4122, VR4181A) ..... 110
4-25. TLT instruction pipeline activities (VR4131) ..... 111
4-26. TLT instruction pipeline activities (VR4181) ..... 111
4-27. LW instruction pipeline activities (VR4121, VR4122, VR4181A) ..... 112
4-28. LW instruction pipeline activities (VR4131) ..... 113
4-29. LW instruction pipeline activities (VR4181) ..... 113
4-30. SW instruction pipeline activities (VR4121, VR4122, VR4181A) ..... 114
4-31. SW instruction pipeline activities (VR4131) ..... 115
4-32. SW instruction pipeline activities (VR4181) ..... 115
4-33. Interlocks, exceptions, and faults ..... 116
LIST OF FIGURES (2/3)

Fig. No.  Title  Page
4-34. Exception detection ..... 119
4-35. Data cache miss stall ..... 120
4-36. CACHE instruction stall ..... 120
4-37. Load data interlock ..... 121
4-38. MD busy interlock ..... 122
5-1. Format of a TLB entry ..... 126
5-2. TLB manipulation overview ..... 127
5-3. Virtual-to-physical address translation ..... 129
5-4. Address translation in TLB ..... 130
5-5. 32-bit mode virtual address translation ..... 131
5-6. 64-bit mode virtual address translation ..... 132
5-7. User mode address space ..... 134
5-8. Supervisor mode address space ..... 136
5-9. Kernel mode address space ..... 139
5-10. xkphys area address space ..... 140
5-11. Index register ..... 147
5-12. Random register ..... 147
5-13. EntryLo0 and EntryLo1 registers ..... 148
5-14. PageMask register ..... 149
5-15. Positions indicated by the Wired register ..... 150
5-16. Wired register ..... 150
5-17. EntryHi register ..... 151
5-18. PRId register ..... 152
5-19. Config register ..... 153
5-20. LLAddr register ..... 155
5-21. TagLo register ..... 156
5-22. TagHi register ..... 156
6-1. Context register ..... 159
6-2. BadVAddr register ..... 160
6-3. Count register ..... 160
6-4. Compare register ..... 161
6-5. Status register ..... 161
6-6. Status register diagnostic status field ..... 163
6-7. Cause register ..... 165
6-8. EPC register (when MIPS16 ISA is disabled) ..... 167
6-9. EPC register (when MIPS16 ISA is enabled) ..... 168
6-10. WatchLo register ..... 168
6-11. WatchHi register ..... 168
6-12. XContext register ..... 169
6-13. Parity Error register ..... 170
6-14. Cache Error register ..... 170
6-15. ErrorEPC register (when MIPS16 ISA is disabled) ..... 172
LIST OF FIGURES (3/3)

Fig. No.  Title  Page
6-16. ErrorEPC register (when MIPS16 ISA is enabled) ..... 172
6-17. Common exception handling ..... 192
6-18. TLB/XTLB refill exception handling ..... 194
6-19. Cold reset exception handling ..... 196
6-20. Soft reset and NMI exception handling ..... 197
7-1. Logical hierarchy of memory ..... 198
7-2. On-chip caches and main memory ..... 199
7-3. Instruction cache line format ..... 200
7-4. Data cache line format ..... 201
7-5. Cache index and data output ..... 204
7-6. Instruction cache state diagram ..... 206
7-7. Data cache state diagram ..... 206
7-8. Flow on instruction fetch ..... 207
7-9. Flow on load operations ..... 208
7-10. Flow on store operations ..... 209
7-11. Flow on Index_Invalidate operations ..... 210
7-12. Flow on Index_Writeback_Invalidate operations ..... 211
7-13. Flow on Index_Load_Tag operations ..... 211
7-14. Flow on Index_Store_Tag operations ..... 212
7-15. Flow on Create_Dirty operations ..... 212
7-16. Flow on Hit_Invalidate operations ..... 213
7-17. Flow on Hit_Writeback_Invalidate operations ..... 214
7-18. Flow on Fill operations ..... 215
7-19. Flow on Hit_Writeback operations ..... 216
7-20. Flow on Fetch_and_Lock operations (VR4131 only) ..... 217
7-21. Writeback flow ..... 218
7-22. Refill flow ..... 218
7-23. Writeback & refill flow ..... 219
8-1. Non-maskable interrupt signal ..... 221
8-2. Hardware interrupt signals ..... 222
8-3. Masking of the interrupt request signals ..... 223
9-1. CPU instruction opcode bit encoding ..... 383
LIST OF TABLES (1/2)

Table No.  Title  Page
1-1. Comparison of functions of VR4100 Series ..... 18
1-2. CP0 registers ..... 22
1-3. List of instructions supported by VR Series processors ..... 32
2-1. MACC instructions (for VR4121, VR4122, VR4131, and VR4181A) ..... 35
2-2. Product-sum operation instructions (for VR4181) ..... 35
2-3. Power mode instructions ..... 35
2-4. Number of delay slot cycles necessary for load and store instructions ..... 36
2-5. Load/store instruction ..... 38
2-6. Load/store instruction (extended ISA) ..... 39
2-7. ALU immediate instruction ..... 40
2-8. ALU immediate instruction (extended ISA) ..... 41
2-9. Three-operand type instruction ..... 41
2-10. Three-operand type instruction (extended ISA) ..... 42
2-11. Shift instruction ..... 42
2-12. Shift instruction (extended ISA) ..... 43
2-13. Multiply/divide instructions ..... 44
2-14. Multiply/divide instructions (extended ISA) ..... 44
2-15. Product-sum operation instructions (for VR4121, VR4122, VR4131, and VR4181A) ..... 45
2-16. Product-sum operation instructions (for VR4181) ..... 45
2-17. Number of stall cycles in multiply and divide instructions ..... 46
2-18. Jump instructions ..... 47
2-19. Branch instructions ..... 48
2-20. Branch instructions (extended ISA) ..... 49
2-21. Special instructions ..... 51
2-22. Special instructions (extended ISA) ..... 51
2-23. System control coprocessor (CP0) instructions ..... 52
3-1. General-purpose registers ..... 55
3-2. Special registers ..... 56
3-3. MIPS16 instruction set outline ..... 57
3-4. Field definition ..... 59
3-5. Bit encoding of major operation code (op) ..... 64
3-6. RR minor operation code (RR-type instruction) ..... 64
3-7. RRR minor operation code (RRR-type instruction) ..... 65
3-8. RRI-A minor operation code (RRI-type ADD instruction) ..... 65
3-9. SHIFT minor operation code (SHIFT-type instruction) ..... 65
3-10. I8 minor operation code (I8-type instruction) ..... 65
3-11. I64 minor operation code (64-bit only, I64-type instruction) ..... 66
3-12. Base PC address setting ..... 67
3-13. Extendable MIPS16 instructions ..... 69
3-14. Load and store instructions ..... 71
3-15. ALU immediate instructions ..... 74
3-16. Two-/three-operand register type ..... 76
3-17. Shift instructions ..... 78
LIST OF TABLES (2/2)

Table No.  Title  Page
3-18. Multiply/divide instructions ..... 80
3-19. Jump and branch instructions ..... 82
3-20. Special instructions ..... 83
4-1. Description of pipeline activities during each stage ..... 103
4-2. Correspondence of pipeline stage to interlock and exception conditions ..... 117
4-3. Pipeline interlock ..... 118
4-4. Description of pipeline exception ..... 118
5-1. User mode segments ..... 134
5-2. 32-bit and 64-bit supervisor mode segments ..... 137
5-3. 32-bit kernel mode segments ..... 141
5-4. 64-bit kernel mode segments ..... 143
5-5. Cacheability and the xkphys address space ..... 144
5-6. CP0 registers ..... 146
5-7. Cache algorithm ..... 149
5-8. Mask values and page sizes ..... 149
5-9. System interface clock ratio (to PClock) ..... 154
5-10. Instruction cache sizes ..... 155
5-11. Data cache sizes ..... 155
6-1. CP0 registers ..... 158
6-2. Cause register exception code field ..... 166
6-3. 32-bit mode exception vector base addresses ..... 174
6-4. 64-bit mode exception vector base addresses ..... 174
6-5. Exception priority order ..... 175
7-1. Cache size, line size, and index ..... 204
9-1. CPU instruction operation notations ..... 225
9-2. Load and store common functions ..... 226
9-3. Access type specifications for loads/stores ..... 227
11-1. Coprocessor 0 hazards ..... 422
11-2. Calculation example of CP0 hazard and number of instructions inserted ..... 426
CHAPTER 1 INTRODUCTION

This chapter gives an outline of the VR4121 (μPD30121), the VR4122 (μPD30122), the VR4131 (μPD30131), the VR4181 (μPD30181), and the VR4181A (μPD30181A, 30181AY), which are 64-/32-bit RISC microprocessors. In this manual, these products are referred to as the VR4100 Series.

1.1 Features

The VR4100 Series, which is part of the VR Series of RISC microprocessors, is a group of products developed for PDAs. The VR Series consists of high-performance 64-/32-bit microprocessors manufactured by NEC that employ the RISC (reduced instruction set computer) architecture developed by MIPS™.

The VR4100 Series incorporates an ultra-low-power CPU core provided with cache memory, a high-speed product-sum operation unit, and an address management unit. The VR4100 Series also has interface units for the peripheral circuits required for battery-driven portable information equipment (refer to the Hardware User's Manual of each product for details about on-chip peripheral functions).

The features of the VR4100 Series are described below.

- Employs a 64-bit RISC core as the CPU
  Can also operate in 32-bit mode
- Optimized instruction pipeline
- On-chip cache memory
- Employs write-back cache
  Reduces store operations that use the system bus
- Physical address space: 32 bits
  Virtual address space: 40 bits
- Translation lookaside buffer (TLB) with 32 double entries
- Instruction set: MIPS III (however, the FPU, LL, LLD, SC, and SCD instructions are removed), MIPS16
- Supports high-speed product-sum operation instructions
- Effective power management features, which include the four modes of Fullspeed, Standby, Suspend, and Hibernate
- On-chip PLL and clock generator
- Various on-chip peripheral functions ideal for portable information equipment

The functions of the VR4100 Series are listed as follows.
Table 1-1. Comparison of Functions of VR4100 Series

Part number
  VR4121:  μPD30121
  VR4122:  μPD30122
  VR4131:  μPD30131
  VR4181:  μPD30181
  VR4181A: μPD30181A, 30181AY

CPU core
  VR4121, VR4122: VR4120 core
  VR4131:         VR4130 core
  VR4181:         VR4110 core
  VR4181A:        VR4120 core

Instruction set
  VR4121, VR4122, VR4131: MIPS I, II, III + high-speed product-sum (32-bit) + MIPS16
  VR4181:                 MIPS I, II, III + high-speed product-sum (16-bit) + MIPS16
  VR4181A:                MIPS I, II, III + high-speed product-sum (32-bit) + MIPS16

Pipeline
  VR4121, VR4122: 5-/6-stage pipeline
  VR4131:         2-way superscalar 6-/7-stage pipeline
  VR4181:         5-stage pipeline
  VR4181A:        5-/6-stage pipeline

On-chip cache memory
  VR4121:  instruction 16 KB, data 8 KB, direct map
  VR4122:  instruction 32 KB, data 16 KB, direct map
  VR4131:  instruction 16 KB, data 16 KB, 2-way set-associative, with line lock function
  VR4181:  instruction 4 KB, data 4 KB, direct map
  VR4181A: instruction 8 KB, data 8 KB, direct map

On-chip peripheral functions
  VR4121:
    memory controller; extension bus interface (ISA); LCD interface; touch panel interface; keyboard interface; communication interface (UART, CSI, IrDA (SIR, MIR, FIR)); modem interface; audio interface; LED controller; DMA controller; timer, counter; watchdog timer; general-purpose port; clock generator; power management unit; A/D converter; D/A converter
  VR4122, VR4131:
    memory controller; extension bus interface (ISA, PCI); communication interface (UART, CSI, IrDA (SIR, MIR, FIR)); LED controller; timer, counter; general-purpose port; clock generator; power management unit
  VR4181:
    memory controller; extension bus interface (ISA); LCD interface; touch panel interface; keyboard interface; communication interface (UART, CSI, IrDA (SIR)); CompactFlash interface; audio interface; LED controller; DMA controller; timer, counter; watchdog timer; general-purpose port; clock generator; power management unit; A/D converter; D/A converter
  VR4181A:
    memory controller; extension bus interface (ISA); LCD interface; touch panel interface; keyboard interface; communication interface (UART, CSI, I2C, IrDA (SIR)); CompactFlash interface; AC97/I2S audio interface; DMA controller; USB host/function controller; PWM generator; timer, counter; watchdog timer; general-purpose port; clock generator; power management unit; A/D converter; D/A converter

Other functions
  VR4121:         -
  VR4122, VR4131: on-chip branch prediction function; on-chip hardware debug function
  VR4181:         -
  VR4181A:        on-chip branch prediction function; on-chip hardware debug function
1.2 CPU Core

Figure 1-1 shows the internal block diagram of the CPU core.
In addition to the conventional high-performance integer operation units, this CPU core has a fully associative translation lookaside buffer (TLB) with 32 entries, each of which maps a pair of two pages. It also has instruction and data caches, and a bus interface.

Figure 1-1. CPU Core Internal Block Diagram
[Block diagram: CPU, CP0, TLB, instruction cache, data cache, bus interface, and clock generator, connected by the virtual address bus and the internal data bus. External signals: control (I), control (O), address/data (I), address/data (O), internal clock.]

(1) CPU
The CPU is the block that performs integer calculations. This block includes a 64-bit integer data path and a product-sum operator.

(2) Coprocessor 0 (CP0)
CP0 incorporates a memory management unit (MMU) and an exception handling function. The MMU performs address translation and checks whether an access crosses between different memory segments (user, supervisor, and kernel). The translation lookaside buffer (TLB) converts virtual addresses to physical addresses.

(3) Instruction cache
The instruction cache employs virtual index and physical tag formats. It is direct mapped in the VR4121, VR4122, VR4181, and VR4181A, and 2-way set-associative in the VR4131.

(4) Data cache
The data cache employs virtual index, physical tag, and write-back formats. It is direct mapped in the VR4121, VR4122, VR4181, and VR4181A, and 2-way set-associative in the VR4131.
(5) CPU bus interface
The bus interface controls data transmission/reception between the CPU core and peripheral units. The bus interface consists of two 32-bit multiplexed address/data buses (one for input, and the other for output), clock signals, interrupt request signals, and various other control signals.

(6) Clock generator
The clock generator processes clock inputs and supplies them to internal units.

1.2.1 CPU registers

The CPU core has thirty-two 64-bit general-purpose registers (GPR).
In addition, it provides the following special registers:

- PC: Program counter (64 bits)
- HI register: Contains the integer multiply and divide higher doubleword result (64 bits)
- LO register: Contains the integer multiply and divide lower doubleword result (64 bits)

Two of the general-purpose registers are assigned the following functions:

- r0 is fixed to 0, and can be used as the target register for any instruction whose result is to be discarded. r0 can also be used as a source register when a zero value is needed.
- r31 is the link register used by link instructions such as JAL (jump and link). This register can also be used by other instructions. However, take care that its use by a link instruction does not conflict with its use for other operations.

A separate register group is provided within CP0 (the system control coprocessor) to process exceptions and to manage addresses.
CPU registers can operate as either 32-bit or 64-bit registers, depending on the processor operation mode.
The operation of the CPU registers differs depending on whether 32-bit instructions or MIPS16 instructions are executed. For details, refer to CHAPTER 3 MIPS16 INSTRUCTION SET.
The VR4100 Series processors have no program status word (PSW) register as such; its role is covered by the Status and Cause registers incorporated within the system control coprocessor (CP0). For details of CP0 registers, refer to Table 1-2 CP0 Registers.
Figure 1-2 shows the CPU registers.
Figure 1-2. CPU Registers
[Register diagram: general-purpose registers r0 (= 0), r1, r2, ..., r29, r30, r31 (= link address), each with bits 63 to 0; multiply and divide registers HI and LO, each with bits 63 to 0; program counter PC, bits 63 to 0.]

1.2.2 Coprocessors

The MIPS ISA defines 4 types of coprocessors (CP0 to CP3).

- CP0 translates virtual addresses to physical addresses, switches the operating mode (kernel, supervisor, or user mode), and manages exceptions. It also controls the cache subsystem, and is used to analyze the cause of an error and to return from the error state.
- CP1 is reserved for floating-point instructions.
- CP2 is reserved for future definition by MIPS.
- CP3 is no longer defined. CP3 instructions are reserved for future extensions.

The VR4100 Series implements CP0 only.

1.2.3 System control coprocessor (CP0)

CP0 translates virtual addresses to physical addresses, switches the operating mode, controls the cache memory, and manages exceptions. For detailed descriptions of these functions, refer to CHAPTER 5 MEMORY MANAGEMENT SYSTEM and CHAPTER 6 EXCEPTION PROCESSING.
CP0 has thirty-two registers, each with a corresponding register number. The register number is used as an instruction operand to specify the CP0 register to be accessed. Table 1-2 gives a brief description of each register.
Table 1-2. CP0 Registers

Number    Register name          Usage                  Description
0         Index                  Memory management      Programmable pointer to TLB array
1         Random                 Memory management      Pseudo-random pointer to TLB array (read only)
2         EntryLo0               Memory management      Lower half of TLB entry for even VPN
3         EntryLo1               Memory management      Lower half of TLB entry for odd VPN
4         Context                Exception processing   Pointer to virtual PTE table in 32-bit mode
5         PageMask               Memory management      Page size specification
6         Wired                  Memory management      Number of wired TLB entries
7         -                      -                      Reserved for future use
8         BadVAddr               Exception processing   Virtual address where the most recent error occurred
9         Count                  Exception processing   Timer count
10        EntryHi                Memory management      Upper half of TLB entry (including ASID)
11        Compare                Exception processing   Timer compare value
12        Status                 Exception processing   Operation status
13        Cause                  Exception processing   Cause of last exception
14        EPC                    Exception processing   Exception program counter
15        PRId                   Memory management      Processor revision identifier
16        Config                 Memory management      Memory mode system specification
17        LLAddr (Note 1)        Memory management      Physical address for diagnostic purpose
18        WatchLo                Exception processing   Memory reference trap address lower bits
19        WatchHi                Exception processing   Memory reference trap address higher bits
20        XContext               Exception processing   Pointer to virtual PTE table in 64-bit mode
21 to 25  -                      -                      Reserved for future use
26        Parity Error (Note 2)  Exception processing   Cache parity bits
27        Cache Error (Note 2)   Exception processing   Index and status of cache error
28        TagLo                  Memory management      Cache tag register (low)
29        TagHi                  Memory management      Cache tag register (high)
30        ErrorEPC               Exception processing   Error exception program counter
31        -                      -                      Reserved for future use

Notes 1. This register is defined to maintain compatibility with the VR4000 and VR4400. The contents of this register are meaningless in normal operation.
      2. This register is defined to maintain compatibility with the VR4100. This register is not used in normal operation.

Caution  When accessing the CP0 registers, some instructions require consideration of the interval until the next instruction is executed, because there is a delay between the time the contents of a CP0 register change and the time this change is reflected in the CPU operation. This time lag is called a CP0 hazard. For details, refer to CHAPTER 11 COPROCESSOR 0 HAZARDS.
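The register numbers in Table 1-2 are what software passes as the operand of a CP0 access instruction, so they are often collected into a set of named constants. The following is a minimal sketch of that idea in C. The enum and function names are hypothetical and chosen for illustration only; they are not identifiers defined by NEC or by any toolchain.

```c
#include <stdbool.h>

/* CP0 register numbers taken from Table 1-2.
 * The identifier names are illustrative assumptions. */
enum cp0_reg {
    CP0_INDEX = 0,    CP0_RANDOM = 1,   CP0_ENTRYLO0 = 2, CP0_ENTRYLO1 = 3,
    CP0_CONTEXT = 4,  CP0_PAGEMASK = 5, CP0_WIRED = 6,
    CP0_BADVADDR = 8, CP0_COUNT = 9,    CP0_ENTRYHI = 10, CP0_COMPARE = 11,
    CP0_STATUS = 12,  CP0_CAUSE = 13,   CP0_EPC = 14,     CP0_PRID = 15,
    CP0_CONFIG = 16,  CP0_LLADDR = 17,  CP0_WATCHLO = 18, CP0_WATCHHI = 19,
    CP0_XCONTEXT = 20,
    CP0_PARITY_ERROR = 26, CP0_CACHE_ERROR = 27,
    CP0_TAGLO = 28,   CP0_TAGHI = 29,   CP0_ERROREPC = 30
};

/* True for the numbers Table 1-2 marks as reserved: 7, 21 to 25, and 31. */
bool cp0_reg_is_reserved(unsigned n)
{
    return n == 7u || (n >= 21u && n <= 25u) || n == 31u;
}

/* True for the registers that Notes 1 and 2 describe as present only
 * for compatibility (LLAddr, Parity Error, Cache Error). */
bool cp0_reg_is_compat_only(unsigned n)
{
    return n == (unsigned)CP0_LLADDR ||
           n == (unsigned)CP0_PARITY_ERROR ||
           n == (unsigned)CP0_CACHE_ERROR;
}
```

A checker like this only encodes what the table states; it says nothing about the CP0 hazards mentioned in the caution, which depend on instruction timing.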
1.2.4 Floating-point unit (FPU)

The VR4100 Series does not support the floating-point unit (FPU). A coprocessor unusable exception occurs if any FPU instruction is executed. If necessary, FPU instructions should be emulated by software in an exception handler.

1.2.5 Cache memory

The VR4100 Series incorporates instruction and data caches, which are independent of each other. This configuration enables high-performance pipeline operations. Both caches have a 64-bit data bus, enabling a one-clock access. These buses can be accessed in parallel.
The caches are direct mapped in the VR4121, VR4122, VR4181, and VR4181A, and 2-way set-associative in the VR4131. The data cache of the VR4131 also has a line lock function.
A detailed description of caches is given in CHAPTER 7 CACHE MEMORY.

1.3 CPU Instruction Set Overview

There are two types of CPU instructions: 32-bit length instructions (MIPS III) and 16-bit length instructions (MIPS16). Use of the MIPS16 instructions is enabled or disabled by the setting of the MIPS16EN pin during a reset.

(1) MIPS III instructions
All CPU instructions are 32 bits long when executing MIPS III instructions, and they are classified into three instruction formats as shown in Figure 1-3: immediate (I-type), jump (J-type), and register (R-type). The fields of each instruction format are described in CHAPTER 2 CPU INSTRUCTION SET SUMMARY.

Figure 1-3. CPU Instruction Formats (32-Bit Length Instruction)

  I-type (immediate):  op [31:26] | rs [25:21] | rt [20:16] | immediate [15:0]
  J-type (jump):       op [31:26] | target [25:0]
  R-type (register):   op [31:26] | rs [25:21] | rt [20:16] | rd [15:11] | sa [10:6] | funct [5:0]
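The bit positions in Figure 1-3 translate directly into shift-and-mask operations. The sketch below extracts every field from a 32-bit instruction word in C; the structure and function names are assumptions made for this illustration. Which fields are meaningful depends on the format selected by op (and funct), which this sketch does not interpret.

```c
#include <stdint.h>

/* All fields of Figure 1-3, extracted unconditionally.
 * The struct and function names are hypothetical. */
struct mips_insn_fields {
    unsigned op;         /* bits 31-26 */
    unsigned rs;         /* bits 25-21 */
    unsigned rt;         /* bits 20-16 */
    unsigned rd;         /* bits 15-11 (R-type) */
    unsigned sa;         /* bits 10-6  (R-type) */
    unsigned funct;      /* bits 5-0   (R-type) */
    uint16_t immediate;  /* bits 15-0  (I-type) */
    uint32_t target;     /* bits 25-0  (J-type) */
};

struct mips_insn_fields mips_decode(uint32_t insn)
{
    struct mips_insn_fields f;
    f.op        = (insn >> 26) & 0x3Fu;
    f.rs        = (insn >> 21) & 0x1Fu;
    f.rt        = (insn >> 16) & 0x1Fu;
    f.rd        = (insn >> 11) & 0x1Fu;
    f.sa        = (insn >> 6)  & 0x1Fu;
    f.funct     = insn & 0x3Fu;
    f.immediate = (uint16_t)(insn & 0xFFFFu);
    f.target    = insn & 0x03FFFFFFu;
    return f;
}
```

For example, decoding the word 0x8FA20008 yields op = 0x23, rs = 29, rt = 2, and immediate = 8, and decoding 0x00221821 yields op = 0, rs = 1, rt = 2, rd = 3, sa = 0, and funct = 0x21.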
The instruction set can be further divided into the following five groupings:

(a) Load and store instructions move data between the memory and the general-purpose registers. They are all immediate (I-type) instructions, since the only addressing mode supported is base register plus 16-bit signed immediate offset.

(b) Computational instructions perform arithmetic, logical, shift, and multiply and divide operations on values in registers. They include R-type (in which both the operands and the result are stored in registers) and I-type (in which one operand is a 16-bit signed immediate value) formats.

(c) Jump and branch instructions change the control flow of a program. Jumps are made either to an absolute address formed by combining a 26-bit target address with the higher bits of the program counter (J-type format) or to a register-specified address (R-type format). The format of the branch instructions is I-type. Branches have 16-bit offsets relative to the program counter. JAL instructions save their return address in register 31.

(d) System control coprocessor (CP0) instructions perform operations on CP0 registers to control the memory-management and exception-handling facilities of the processor.

(e) Special instructions perform system calls and breakpoint exceptions, or cause a branch to the general exception-handling vector based upon the result of a comparison. These instructions occur in both R-type and I-type formats.

For the operation of each instruction, refer to CHAPTER 2 CPU INSTRUCTION SET SUMMARY and CHAPTER 9 CPU INSTRUCTION SET DETAILS.

(2) Additional instructions
All the sum-of-products instructions and power mode instructions are 32 bits long.

(3) MIPS16 instructions
All CPU instructions except for JAL and JALX are 16 bits long when executing MIPS16 instructions, and they are classified into thirteen instruction formats as shown in Figure 1-4.
The fields of each instruction format are described in CHAPTER 3 MIPS16 INSTRUCTION SET.
Figure 1-4. CPU Instruction Formats (16-Bit Length Instruction)

  I-type:          op [15:11] | immediate [10:0]
  RI-type:         op [15:11] | rx [10:8] | immediate [7:0]
  RR-type:         op [15:11] | rx [10:8] | ry [7:5] | funct [4:0]
  RRI-type:        RRI [15:11] | rx [10:8] | ry [7:5] | immediate [4:0]
  RRR-type:        RRR [15:11] | rx [10:8] | ry [7:5] | rz [4:2] | F [1:0]
  RRI-A-type:      RRI-A [15:11] | rx [10:8] | ry [7:5] | F [4] | immediate [3:0]
  SHIFT-type:      SHIFT [15:11] | rx [10:8] | ry [7:5] | shamt [4:2] | F [1:0]
  I8-type:         I8 [15:11] | funct [10:8] | immediate [7:0]
  I8_MOVR32-type:  I8 [15:11] | funct [10:8] | ry [7:5] | r32(4:0) [4:0]
  I8_MOV32R-type:  I8 [15:11] | funct [10:8] | r32(2:0) [7:5] | r32(4:3) [4:3] | rz [2:0]
  I64-type:        I64 [15:11] | funct [10:8] | immediate [7:0]
  RI64-type:       I64 [15:11] | funct [10:8] | ry [7:5] | immediate [4:0]
  JAL/JALX-type:   JAL (5 bits) | X (1 bit) | immediate(20:16) (5 bits) | immediate(25:21) (5 bits) | immediate(15:0) (16 bits); 32 bits in total
The instruction set can be further divided into the following four groupings:

(a) Load and store instructions move data between memory and general-purpose registers. They include RRI, RI, I8, and RI64 types.

(b) Computational instructions perform arithmetic, logical, shift, and multiply and divide operations on values in registers. They include RI, RRI-A, I8, RI64, I64, RR, RRR, I8_MOVR32, and I8_MOV32R types.

(c) Jump and branch instructions change the control flow of a program. They include JAL/JALX, RR, RI, I8, and I types.

(d) Special instructions are the SYSCALL, BREAK, and EXTEND instructions. The SYSCALL and BREAK instructions transfer control to an exception handler. The EXTEND instruction extends the immediate field of the next instruction. They are RR and I types. When the immediate field of the next instruction is extended by using the EXTEND instruction, one cycle is needed to execute the EXTEND instruction, and another cycle is needed to execute the next instruction.

For more details of each instruction's operation, refer to CHAPTER 3 MIPS16 INSTRUCTION SET and CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT.

1.4 Data Formats and Addressing

The VR4100 Series uses the following four data formats:

- Doubleword (64 bits)
- Word (32 bits)
- Halfword (16 bits)
- Byte (8 bits)

In the CPU core, if the data format is halfword, word, or doubleword, the byte ordering can be set to either big endian or little endian. In the VR4131, the setting of the BIGENDIAN pin during a reset decides which byte order is used. The VR4121, VR4122, VR4181, and VR4181A support the little-endian order only.
Endianness refers to the location of byte 0 within a multi-byte data structure. Figures 1-5 and 1-6 show the configuration.
When configured as a big-endian system, byte 0 is always the most-significant (leftmost) byte, which is compatible with MC68000 and IBM370 conventions.
When configured as a little-endian system, byte 0 is always the least-significant (rightmost) byte, which is compatible with Pentium and DEC VAX conventions.
In this manual, bit designations are always little endian.
Figure 1-5. Byte Address in Big-Endian Byte Order

(a) Word data

                        Bit 31-24   23-16   15-8   7-0     Word address
  High-order address        12       13      14     15         12
                             8        9      10     11          8
                             4        5       6      7          4
  Low-order address          0        1       2      3          0

(b) Doubleword data

                        Bit 63-56  55-48  47-40  39-32  31-24  23-16  15-8  7-0    Doubleword address
  High-order address        16      17     18     19     20     21     22    23         16
                             8       9     10     11     12     13     14    15          8
  Low-order address          0       1      2      3      4      5      6     7          0

  (Within a doubleword, a word occupies bits 31-0, a halfword bits 15-0, and a byte bits 7-0.)

Remarks 1. The most-significant byte is at the lowest address.
        2. The address of word data is specified by the address of the most-significant byte.
Figure 1-6. Byte Address in Little-Endian Byte Order

(a) Word data

                        Bit 31-24   23-16   15-8   7-0     Word address
  High-order address        15       14      13     12         12
                            11       10       9      8          8
                             7        6       5      4          4
  Low-order address          3        2       1      0          0

(b) Doubleword data

                        Bit 63-56  55-48  47-40  39-32  31-24  23-16  15-8  7-0    Doubleword address
  High-order address        23      22     21     20     19     18     17    16         16
                            15      14     13     12     11     10      9     8          8
  Low-order address          7       6      5      4      3      2      1     0          0

  (Within a doubleword, a word occupies bits 31-0, a halfword bits 15-0, and a byte bits 7-0.)

Remarks 1. The least-significant byte is at the lowest address.
        2. The address of word data is specified by the address of the least-significant byte.
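The word-data panels of Figures 1-5 and 1-6 can be summarized as a rule that maps a byte address to a bit position within its aligned 32-bit word. The following C sketch expresses that rule; the function names are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Position of the least-significant bit (0, 8, 16, or 24) of the byte at
 * byte address addr within its aligned 32-bit word.
 * Big endian: byte offset 0 sits in bits 31-24 (Figure 1-5).
 * Little endian: byte offset 0 sits in bits 7-0 (Figure 1-6). */
unsigned byte_shift_in_word(uint32_t addr, bool big_endian)
{
    unsigned offset = addr & 3u;
    return big_endian ? (3u - offset) * 8u : offset * 8u;
}

/* Extract the byte at byte address addr from the aligned word that
 * contains it, given the word's 32-bit register value. */
uint8_t byte_from_word(uint32_t word, uint32_t addr, bool big_endian)
{
    return (uint8_t)(word >> byte_shift_in_word(addr, big_endian));
}
```

For byte address 13, the function returns a shift of 16 in big-endian order (bits 23 to 16) and 8 in little-endian order (bits 15 to 8), matching the position of 13 in the two figures.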
The CPU core uses the following byte boundaries for halfword, word, and doubleword accesses:

- Halfword: An even byte boundary (0, 2, 4...)
- Word: A byte boundary divisible by four (0, 4, 8...)
- Doubleword: A byte boundary divisible by eight (0, 8, 16...)

The following special instructions are used to load and store data that are not aligned on 4-byte (word) or 8-byte (doubleword) boundaries:

- Word access: LWL, LWR, SWL, SWR
- Doubleword access: LDL, LDR, SDL, SDR

These instructions are used in pairs of L and R.
Accessing misaligned data requires one additional instruction cycle (1 PCycle) over that required for accessing aligned data.
Figure 1-7 shows the access of a misaligned word that has byte address 3.

Figure 1-7. Misaligned Word Accessing (Little-Endian)

                        Bit 31-24   23-16   15-8   7-0
  High-order address         -        6       5      4
  Low-order address          3        -       -      -

Caution  In the VR4131, data transfer to the internal I/O (register) space or to the PCI bus is performed with data converted to little endian even during operation in big-endian mode. Therefore, the following restrictions apply to access to these address spaces.
  - Do not perform 3-byte access. When 3-byte access is executed, data is undefined.
  - When 8-byte access is executed, the order of the higher word and lower word is reversed.
  - Do not use the LWR, LWL, LDR, and LDL instructions. Access by the LWR, LWL, LDR, or LDL instruction causes erroneous data to be loaded.
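The boundary rules and the misaligned word of Figure 1-7 can be illustrated in portable C. The sketch below checks the alignment rules and assembles a little-endian word from four consecutive bytes at an arbitrary address, which is the result that a load-word-left/load-word-right pair is intended to deliver in little-endian order. It is a software model operating on a byte array, not a description of how the hardware executes those instructions, and the function names are assumptions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if addr lies on a boundary divisible by size.
 * size must be a power of two: 2 (halfword), 4 (word), 8 (doubleword). */
bool is_aligned(uint64_t addr, unsigned size)
{
    return (addr & (uint64_t)(size - 1u)) == 0;
}

/* Assemble a 32-bit word from the four bytes starting at byte address
 * addr, little-endian order: the byte at addr becomes bits 7-0. */
uint32_t load_word_unaligned_le(const uint8_t *mem, size_t addr)
{
    return (uint32_t)mem[addr]
         | ((uint32_t)mem[addr + 1] << 8)
         | ((uint32_t)mem[addr + 2] << 16)
         | ((uint32_t)mem[addr + 3] << 24);
}
```

With memory bytes 0x00, 0x11, ..., 0x77 at addresses 0 to 7, the word at byte address 3 spans bytes 3, 4, 5, and 6, as in Figure 1-7, and reads back as 0x66554433.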
1.5 Memory Management System

The VR4100 Series has a 32-bit physical addressing range of 4 GB. However, since it is rare for systems to implement a physical memory space that large, the CPU provides a logical expansion of memory space by translating addresses composed in the large virtual address space into available physical memory addresses.
A detailed description of these address spaces is given in CHAPTER 5 MEMORY MANAGEMENT SYSTEM.

1.5.1 Translation lookaside buffer (TLB)

Virtual memory mapping is performed using the translation lookaside buffer (TLB). The TLB converts virtual addresses to physical addresses. It is fully associative and has 32 entries, each mapping a pair of two consecutive pages. The page size is variable between 1 KB and 256 KB, in powers of 4.

(1) Joint TLB (JTLB)
The JTLB holds both instruction and data addresses.
For fast virtual-to-physical address decoding, the VR4100 Series uses a large, fully associative TLB (joint TLB) that translates 64 virtual pages to their corresponding physical addresses. The TLB is organized as 32 pairs of even-odd entries, and maps a virtual address and address space identifier (ASID) into the 4 GB physical address space.
The page size can be configured, on a per-entry basis, from 1 KB to 256 KB. A CP0 register stores the size of the page to be mapped, and that size is entered into the TLB when a new entry is written. Thus, operating systems can provide special-purpose maps; for example, a typical frame buffer can be memory-mapped using only one TLB entry.
Translating a virtual address to a physical address begins by comparing the virtual address from the processor with the virtual addresses in the TLB. There is a match when the virtual page number (VPN) of the address is the same as the VPN field of the entry, and either the global (G) bit of the TLB entry is set, or the ASID field of the virtual address is the same as the ASID field of the TLB entry.
This match is referred to as a TLB hit. If there is no match, the processor takes a TLB miss exception, and software is allowed to refill the TLB from a page table of virtual/physical addresses in memory.

1.5.2 Processor modes

(1) Operating modes
The VR4100 Series has three operating modes: user, supervisor, and kernel. The manner in which memory addresses are mapped depends on these operating modes. Refer to CHAPTER 5 MEMORY MANAGEMENT SYSTEM for details.

(2) Addressing modes
The VR4100 Series has two addressing modes: 64-bit and 32-bit. The manner in which memory addresses are translated or mapped depends on these addressing modes. Refer to CHAPTER 5 MEMORY MANAGEMENT SYSTEM for details.
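The TLB match rule of 1.5.1 (VPN equal, and either the G bit set or the ASID equal) and the even/odd page pairing can be sketched as a small software model. The C code below is a simplified illustration under stated assumptions: the structure layout, field names, and the `pfn` representation are hypothetical and do not reproduce the actual EntryHi/EntryLo register formats, and valid bits, address-region checks, and the exception mechanism are not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified, hypothetical model of one TLB entry (one even/odd page pair). */
struct tlb_entry {
    uint64_t vpn2;        /* virtual page number of the pair */
    uint8_t  asid;        /* address space identifier */
    bool     global;      /* G bit: match regardless of ASID */
    unsigned page_shift;  /* log2(page size): 10 (1 KB) to 18 (256 KB) */
    uint64_t pfn[2];      /* physical page number: [0] even, [1] odd page */
};

/* Page sizes run from 1 KB to 256 KB in powers of 4. */
bool page_size_is_valid(uint32_t bytes)
{
    return bytes == 1024u || bytes == 4096u || bytes == 16384u ||
           bytes == 65536u || bytes == 262144u;
}

/* Match rule from 1.5.1: VPN equal, and (G set or ASID equal). */
bool tlb_entry_matches(const struct tlb_entry *e, uint64_t vaddr, uint8_t asid)
{
    uint64_t vpn2 = vaddr >> (e->page_shift + 1u);
    return vpn2 == e->vpn2 && (e->global || e->asid == asid);
}

/* Search n entries. On a hit, store the physical address in *paddr and
 * return true; a false return corresponds to a TLB miss. */
bool tlb_translate(const struct tlb_entry *tlb, int n,
                   uint64_t vaddr, uint8_t asid, uint64_t *paddr)
{
    for (int i = 0; i < n; i++) {
        const struct tlb_entry *e = &tlb[i];
        if (tlb_entry_matches(e, vaddr, asid)) {
            unsigned odd = (unsigned)((vaddr >> e->page_shift) & 1u);
            uint64_t offset = vaddr & (((uint64_t)1 << e->page_shift) - 1u);
            *paddr = (e->pfn[odd] << e->page_shift) | offset;
            return true;
        }
    }
    return false;
}
```

In this model, the bit just above the page offset selects the even or odd page of the pair, which is why a single entry covers two consecutive pages.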
1.6 Instruction Pipeline

The VR4100 Series has a 5- to 7-stage instruction pipeline. In the VR4121, VR4122, VR4181, and VR4181A, one instruction is issued each cycle under normal circumstances. The VR4131 employs a 2-way superscalar mechanism so that two instructions can be executed simultaneously.
A detailed description of the pipeline is provided in CHAPTER 4 PIPELINE.

1.6.1 Branch prediction

The VR4122, VR4131, and VR4181A have a branch prediction mechanism to speed up branch operations. These processors have a branch prediction table that holds branch instructions whose conditions were satisfied in the past, together with the target addresses of those instructions.
If the fetched instruction is found in this table (hit), execution branches without delay. If the corresponding branch instruction is not in the branch prediction table (miss), the address of that instruction is loaded into the branch prediction table and then execution branches. For the operations when a hit or a miss occurs, refer to CHAPTER 4 PIPELINE.
Branch prediction is performed when the BP bit of the Config register of CP0 is cleared (0). It is not performed when the BP bit is set (1) or in the MIPS16 instruction mode.
1.7 Code Compatibility

The CPU cores of the VR4100 Series are designed with program compatibility with other VR-Series processors in mind. However, because their architecture differs in some respects from that of other processors, they cannot necessarily execute all programs that run on other VR-Series processors, and other VR-Series processors cannot necessarily execute all programs that run on the VR4100 Series. Points to note when porting programs between the VR4100 Series and other VR-Series processors are listed below.

- A 16-bit length MIPS16 instruction set is added in the VR4100 Series.
- Multiply-add instructions are added in the VR4100 Series.
- Instructions for power modes (HIBERNATE, STANDBY, SUSPEND) are added in the VR4100 Series to support power modes.
- Operations to lock a cache are added to the CACHE instruction in the VR4131.
- The VR4100 Series does not support floating-point instructions since it has no floating-point unit (FPU).
- The VR4100 Series does not have the LL bit to perform synchronization of multiprocessing. Therefore, it does not support instructions that manipulate the LL bit (LL, LLD, SC, SCD).
- The CP0 hazards of the VR4100 Series are equally or less stringent than those of the VR4000 (see CHAPTER 11 for details).

For more information about each instruction, refer to Chapters 9 and 3, and to the user's manuals of each product other than the VR4100 Series.
Instructions supported by each of the VR Series processors are listed below.

Table 1-3. List of Instructions Supported by VR Series Processors

Product groups:
  (1) VR4121, VR4122, VR4181, VR4181A
  (2) VR4131
  (3) VR4300, VR4305, VR4310
  (4) VR5000, VR5000A
  (5) VR5432, VR5500
  (6) VR10000, VR12000

Supported instructions       (1)   (2)   (3)   (4)   (5)          (6)
MIPS I                       A     A     A     A     A            A
MIPS II                      A     A     A     A     A            A
MIPS III                     A     A     A     A     A            A
LL bit manipulation          N/A   N/A   A     A     A            A
MIPS IV                      N/A   N/A   N/A   A     A            A
MIPS16                       A     A     N/A   N/A   N/A          N/A
Multiply-add                 A     A     N/A   N/A   A            N/A
Floating-point operation     N/A   N/A   A     A     A            A
Power mode transition        A     A     N/A   A     A (VR5500)   N/A
CHAPTER 2 CPU INSTRUCTION SET SUMMARY

This chapter is an overview of the CPU instruction set; refer to CHAPTER 9 CPU INSTRUCTION SET DETAILS for detailed descriptions of individual CPU instructions.

2.1 Instruction Set Architecture

In the MIPS instruction set architecture (ISA), five levels of instruction sets, from MIPS I through MIPS V, are currently defined. An instruction set with a higher level number includes those with lower level numbers. In other words, a processor implementing the MIPS IV instruction set is able to run MIPS I, MIPS II, or MIPS III binary programs without change.
There are also instruction sets called application-specific extensions (ASE), which extend functions for specific applications; MIPS16 is the one currently defined (refer to CHAPTER 3 MIPS16 INSTRUCTION SET for details).
The VR4100 Series implements the MIPS III and MIPS16 instruction sets except for the following instructions:

(1) Synchronization support instructions
The VR4100 Series does not support a multiprocessor operating environment. Thus the instructions that support synchronization of memory updates defined in the MIPS II and MIPS III ISA (the load linked and store conditional instructions) cause a reserved instruction exception. The load link (LL) bit is eliminated.

Remark  The SYNC instruction is handled as a NOP instruction since all load/store instructions in this processor are executed in program order.

(2) Floating-point operation instructions
The VR4100 Series does not incorporate a floating-point unit (FPU). Thus the FPU instructions cause a coprocessor unusable exception. FPU instructions should be emulated by software in an exception handler if necessary.
2.2 CPU Instruction Formats

Each MIPS III ISA CPU instruction consists of a single 32-bit word, aligned on a word boundary. There are three instruction formats, immediate (I-type), jump (J-type), and register (R-type), as shown in Figure 2-1. The use of a small number of instruction formats simplifies instruction decoding, allowing the compiler to synthesize more complicated and less frequently used instructions and addressing modes from these three formats as needed.

Figure 2-1. CPU Instruction Formats

  I-type (immediate):  op [31:26] | rs [25:21] | rt [20:16] | immediate [15:0]
  J-type (jump):       op [31:26] | target [25:0]
  R-type (register):   op [31:26] | rs [25:21] | rt [20:16] | rd [15:11] | sa [10:6] | funct [5:0]

  op:         6-bit operation code
  rs:         5-bit source register specifier
  rt:         5-bit target (source/destination) register specifier or branch condition
  immediate:  16-bit immediate value, branch displacement, or address displacement
  target:     26-bit unconditional branch target address
  rd:         5-bit destination register specifier
  sa:         5-bit shift amount
  funct:      6-bit function field
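The immediate and target fields of Figure 2-1 serve as an address displacement, a branch displacement, or a jump target. The C sketch below shows how each is turned into an address. The function names are assumptions, and two details are taken from the standard MIPS convention rather than from this section: branch and jump fields count 4-byte instructions (so they are shifted left by 2), and both are applied relative to the address of the instruction that follows the branch or jump.

```c
#include <stdint.h>

/* Sign-extend a 16-bit immediate field to 64 bits. */
int64_t sign_extend16(uint16_t imm)
{
    return (int64_t)(imm ^ 0x8000) - 0x8000;
}

/* Load/store: base register plus sign-extended 16-bit offset. */
uint64_t load_store_address(uint64_t base, uint16_t offset)
{
    return base + (uint64_t)sign_extend16(offset);
}

/* Branch: sign-extended offset, in units of instructions, relative to
 * the instruction following the branch (standard MIPS convention). */
uint64_t branch_target(uint64_t branch_pc, uint16_t offset)
{
    return branch_pc + 4u + (uint64_t)(sign_extend16(offset) * 4);
}

/* Jump: 26-bit target shifted left by 2, combined with the high-order
 * bits of the address of the instruction following the jump. */
uint64_t jump_target(uint64_t jump_pc, uint32_t target)
{
    uint64_t high = (jump_pc + 4u) & ~(uint64_t)0x0FFFFFFF;
    return high | ((uint64_t)(target & 0x03FFFFFFu) << 2);
}
```

For instance, a branch at address 0x1000 with an offset field of 0xFFFF (that is, -1) branches back to 0x1000 itself, and an offset field of 0xFFFC applied to a base of 0x1000 addresses 0xFFC.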
2.3 Instructions Added in the VR4100 Series

In the VR4100 Series, instructions such as power mode instructions and product-sum operation instructions, which are suitable for portable information equipment and the multimedia field, are added. These instructions are not included in the standard MIPS III instruction set.

2.3.1 Product-sum operation instructions

These instructions add the value in an accumulator to the result of a multiplication and store it into a destination register, using the HI register and LO register as the accumulator. The 64-bit accumulator consists of the low-order 32 bits of the HI register as its high-order bits and the low-order 32 bits of the LO register as its low-order bits. Neither overflow nor underflow occurs when these instructions are executed, and therefore no exception occurs.
Of the product-sum operation instructions, those that perform saturation processing or store data into a general-purpose register by specifying options are called MACC instructions.

Table 2-1. MACC Instructions (for VR4121, VR4122, VR4131, and VR4181A)

  Instruction   Definition
  MACC          Multiply and Add Accumulate
  DMACC         Doubleword Multiply and Add Accumulate

Table 2-2. Product-Sum Operation Instructions (for VR4181)

  Instruction   Definition
  MADD16        Multiply and Add 16-bit Integer
  DMADD16       Doubleword Multiply and Add 16-bit Integer

2.3.2 Power mode instructions

These instructions stop the internal clock of the processor and place the processor in a low power consumption mode. Three low power consumption modes are available, each of which can be set by a dedicated instruction.

Table 2-3. Power Mode Instructions

  Instruction   Definition
  STANDBY       Standby
  SUSPEND       Suspend
  HIBERNATE     Hibernate
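The accumulator arrangement described in 2.3.1 (low-order 32 bits of HI as the upper half, low-order 32 bits of LO as the lower half) can be modelled in a few lines of C. This is a generic multiply-and-accumulate sketch, not an implementation of any specific instruction: the structure and function names are assumptions, only the low-order 32 bits of HI and LO are modelled, and the saturation and register-store options of the MACC instructions are left out. Wraparound modulo 2^64 stands in for the statement that no overflow exception occurs.

```c
#include <stdint.h>

/* Hypothetical model: only the low-order 32 bits of HI and LO. */
struct accumulator {
    uint32_t hi;  /* low-order 32 bits of HI: accumulator bits 63-32 */
    uint32_t lo;  /* low-order 32 bits of LO: accumulator bits 31-0  */
};

/* Combine HI and LO into the 64-bit accumulator value. */
uint64_t acc_get(const struct accumulator *a)
{
    return ((uint64_t)a->hi << 32) | a->lo;
}

/* Split a 64-bit value back into HI and LO. */
void acc_set(struct accumulator *a, uint64_t value)
{
    a->hi = (uint32_t)(value >> 32);
    a->lo = (uint32_t)value;
}

/* accumulator += rs * rt, signed operands, no saturation.
 * Unsigned arithmetic makes the wraparound well defined in C. */
void multiply_add(struct accumulator *a, int32_t rs, int32_t rt)
{
    int64_t product = (int64_t)rs * (int64_t)rt;
    acc_set(a, acc_get(a) + (uint64_t)product);
}
```

Starting from zero, accumulating 3 x 4 and then -1 x 13 leaves the accumulator at -1, so both halves read 0xFFFFFFFF; the borrow propagates from the LO half into the HI half, which is the point of treating the two registers as one 64-bit value.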
2.4 Instruction Overview

The CPU instructions are classified into five classes. The product-sum operation instructions and power mode instructions added in the VR4100 Series each belong to one of these five classes.

2.4.1 Load and store instructions

Loads and stores are immediate (I-type) instructions that move data between memory and the general-purpose registers. The only addressing mode that load and store instructions directly support is base register plus 16-bit signed immediate offset.
Tables 2-5 and 2-6 list the ISA-defined load/store instructions and extended-ISA instructions, respectively.

(1) Scheduling a load delay slot
A load instruction that does not allow its result to be used by the instruction immediately following is called a delayed load instruction. The instruction slot immediately following this delayed load instruction is referred to as the load delay slot.
In the VR4100 Series, a load instruction can be followed directly by an instruction that accesses the register loaded by the load instruction. In this case, however, an interlock occurs for the necessary number of cycles. Any instruction can follow a load instruction, but the load delay slot should be scheduled appropriately for both performance and compatibility with the VR Series microprocessors. For details, see CHAPTER 4 PIPELINE.

(2) Store delay slot
When a store instruction is writing data to a cache, the data cache is kept busy at the DC and WB stages. If an instruction (such as a load) that directly follows the store instruction accesses the data cache in the DC stage, a hardware-driven interlock occurs. To avoid this, the store delay slot should be scheduled.

Table 2-4. Number of Delay Slot Cycles Necessary for Load and Store Instructions

  Instruction   Necessary number of PCycles
  Load          1
  Store         1

(3) Defining access types
The access type indicates the size of the processor data item to be loaded or stored, and is set by the load or store instruction opcode. Access types and accessed bytes are shown in Figure 2-2.
Regardless of access type or byte ordering (endianness), the address given specifies the byte with the lowest address in the addressed field. For a big-endian configuration, this is the most-significant byte; for a little-endian configuration, it is the least-significant byte.
The access type, together with the three low-order bits of the address, defines the bytes accessed within the addressed doubleword (shown in Figure 2-2). Only the combinations shown in Figure 2-2 are permissible; other combinations cause address error exceptions.
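figure 2-2. byte specification related to load and store instructions

byte numbers are listed from bit 63 (left) to bit 0 (right) of the doubleword. a hyphen marks a byte that is not accessed.

access type       low-order address   accessed byte      accessed byte
(value)           bits (2 1 0)        (big-endian)       (little-endian)
doubleword (7)    0 0 0               0 1 2 3 4 5 6 7    7 6 5 4 3 2 1 0
7-byte (6)        0 0 0               0 1 2 3 4 5 6 -    - 6 5 4 3 2 1 0
                  0 0 1               - 1 2 3 4 5 6 7    7 6 5 4 3 2 1 -
6-byte (5)        0 0 0               0 1 2 3 4 5 - -    - - 5 4 3 2 1 0
                  0 1 0               - - 2 3 4 5 6 7    7 6 5 4 3 2 - -
5-byte (4)        0 0 0               0 1 2 3 4 - - -    - - - 4 3 2 1 0
                  0 1 1               - - - 3 4 5 6 7    7 6 5 4 3 - - -
word (3)          0 0 0               0 1 2 3 - - - -    - - - - 3 2 1 0
                  1 0 0               - - - - 4 5 6 7    7 6 5 4 - - - -
triple byte (2)   0 0 0               0 1 2 - - - - -    - - - - - 2 1 0
                  0 0 1               - 1 2 3 - - - -    - - - - 3 2 1 -
                  1 0 0               - - - - 4 5 6 -    - 6 5 4 - - - -
                  1 0 1               - - - - - 5 6 7    7 6 5 - - - - -
halfword (1)      0 0 0               0 1 - - - - - -    - - - - - - 1 0
                  0 1 0               - - 2 3 - - - -    - - - - 3 2 - -
                  1 0 0               - - - - 4 5 - -    - - 5 4 - - - -
                  1 1 0               - - - - - - 6 7    7 6 - - - - - -
byte (0)          0 0 0               0 - - - - - - -    - - - - - - - 0
                  0 0 1               - 1 - - - - - -    - - - - - - 1 -
                  0 1 0               - - 2 - - - - -    - - - - - 2 - -
                  0 1 1               - - - 3 - - - -    - - - - 3 - - -
                  1 0 0               - - - - 4 - - -    - - - 4 - - - -
                  1 0 1               - - - - - 5 - -    - - 5 - - - - -
                  1 1 0               - - - - - - 6 -    - 6 - - - - - -
                  1 1 1               - - - - - - - 7    7 - - - - - - -

remark the big-endian order is supported by the vr4131 only.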
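the permissible combinations in figure 2-2 can be captured in a small lookup. the following python sketch is illustrative only: the names PERMITTED and accessed_bytes are hypothetical, and the address error exception is modelled as a ValueError.

```python
# Permissible low-order address bits for each access type (figure 2-2).
PERMITTED = {
    7: (0,),             # doubleword
    6: (0, 1),           # 7-byte
    5: (0, 2),           # 6-byte
    4: (0, 3),           # 5-byte
    3: (0, 4),           # word
    2: (0, 1, 4, 5),     # triple byte
    1: (0, 2, 4, 6),     # halfword
    0: tuple(range(8)),  # byte
}


def accessed_bytes(access_type, low_bits, big_endian=False):
    """List the accessed byte numbers, ordered from bit 63 down to bit 0."""
    if low_bits not in PERMITTED.get(access_type, ()):
        # Stand-in for the address error exception.
        raise ValueError("combination not listed in figure 2-2")
    # An access of type t covers t + 1 consecutive bytes.
    offsets = list(range(low_bits, low_bits + access_type + 1))
    return offsets if big_endian else offsets[::-1]
```

for a triple-byte access with low-order address bits 101, the sketch returns bytes 7, 6, 5 in little-endian order and 5, 6, 7 in big-endian order, matching the corresponding row of the figure.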
table 2-5. load/store instructions

instruction format fields: op, base, rt, offset

for every instruction in this table, the offset is sign extended and then added to the contents of register base to form the virtual address.

load byte: lb rt, offset(base)
    the byte at the memory location specified by the address is sign extended and loaded into register rt.

load byte unsigned: lbu rt, offset(base)
    the byte at the memory location specified by the address is zero extended and loaded into register rt.

load halfword: lh rt, offset(base)
    the halfword at the memory location specified by the address is sign extended and loaded into register rt.

load halfword unsigned: lhu rt, offset(base)
    the halfword at the memory location specified by the address is zero extended and loaded into register rt.

load word: lw rt, offset(base)
    the word at the memory location specified by the address is loaded into register rt. in the 64-bit mode, it is sign extended to 64 bits.

load word left: lwl rt, offset(base)
    the word whose address is specified is shifted to the left so that the address-specified byte is at the left-most position of the word. the result of the shift operation is merged with the contents of register rt and loaded into register rt. in the 64-bit mode, it is further sign extended to 64 bits.

load word right: lwr rt, offset(base)
    the word whose address is specified is shifted to the right so that the address-specified byte is at the right-most position of the word. the result of the shift operation is merged with the contents of register rt and loaded into register rt. in the 64-bit mode, it is further sign extended to 64 bits.

store byte: sb rt, offset(base)
    the least significant byte of register rt is stored to the memory location specified by the address.

store halfword: sh rt, offset(base)
    the least significant halfword of register rt is stored to the memory location specified by the address.

store word: sw rt, offset(base)
    the lower word of register rt is stored to the memory location specified by the address.

store word left: swl rt, offset(base)
    the contents of register rt are shifted to the right so that the left-most byte of the word is in the position of the address-specified byte. the result is stored to the lower word in memory.

store word right: swr rt, offset(base)
    the contents of register rt are shifted to the left so that the right-most byte of the word is in the position of the address-specified byte. the result is stored to the upper word in memory.
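the shift-and-merge behaviour of lwl and lwr is easiest to see when the two are paired to read a word that is not word aligned. the python sketch below is a simplified model under stated assumptions: little-endian byte order, a 32-bit register (the sign extension performed in the 64-bit mode is omitted), and memory represented as a bytes object. all function names are hypothetical. in little-endian mips code the pair is conventionally issued as lwr at the address followed by lwl at the address plus 3.

```python
MASK32 = 0xFFFFFFFF


def aligned_word(mem, addr):
    """Read the naturally aligned word containing addr (little-endian)."""
    base = addr & ~3
    return int.from_bytes(mem[base:base + 4], "little")


def lwl(rt, mem, addr):
    """Move the addressed byte to the left-most position and merge with rt."""
    shift = 8 * (3 - (addr & 3))
    keep = (1 << shift) - 1           # low-order bits of rt left untouched
    return ((aligned_word(mem, addr) << shift) & MASK32) | (rt & keep)


def lwr(rt, mem, addr):
    """Move the addressed byte to the right-most position and merge with rt."""
    shift = 8 * (addr & 3)
    keep = MASK32 & ~(MASK32 >> shift)  # high-order bits of rt left untouched
    return (aligned_word(mem, addr) >> shift) | (rt & keep)


def load_unaligned_word(mem, addr):
    """Combine lwr and lwl to read four bytes starting at any address."""
    rt = lwr(0, mem, addr)
    return lwl(rt, mem, addr + 3)
```

with memory holding the bytes 0x10 to 0x17, load_unaligned_word(mem, 1) assembles the bytes at addresses 1 to 4 into 0x14131211.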
table 2-6. load/store instructions (extended isa)

instruction format fields: op, base, rt, offset

for every instruction in this table, the offset is sign extended and then added to the contents of register base to form the virtual address.

load doubleword: ld rt, offset(base)
    the doubleword at the memory location specified by the address is loaded into register rt.

load doubleword left: ldl rt, offset(base)
    the doubleword whose address is specified is shifted to the left so that the address-specified byte is at the left-most position of the doubleword. the result of the shift operation is merged with the contents of register rt and loaded into register rt.

load doubleword right: ldr rt, offset(base)
    the doubleword whose address is specified is shifted to the right so that the address-specified byte is at the right-most position of the doubleword. the result of the shift operation is merged with the contents of register rt and loaded into register rt.

load word unsigned: lwu rt, offset(base)
    the word at the memory location specified by the address is zero extended and loaded into register rt.

store doubleword: sd rt, offset(base)
    the contents of register rt are stored to the memory location specified by the address.

store doubleword left: sdl rt, offset(base)
    the contents of register rt are shifted to the right so that the left-most byte of the doubleword is in the position of the address-specified byte. the result is stored to the lower doubleword in memory.

store doubleword right: sdr rt, offset(base)
    the contents of register rt are shifted to the left so that the right-most byte of the doubleword is in the position of the address-specified byte. the result is stored to the upper doubleword in memory.
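the difference between the sign-extending loads (lb, lh, lw) and the zero-extending loads (lbu, lhu, lwu) matters once the value sits in a 64-bit register. a minimal python sketch, with hypothetical helper names, that models what ends up in the register:

```python
MASK64 = (1 << 64) - 1


def sign_extend(value, bits):
    """Sign extend a bits-wide value to a 64-bit register image."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value & MASK64


def zero_extend(value, bits):
    """Zero extend a bits-wide value to a 64-bit register image."""
    return value & ((1 << bits) - 1)


def lw64(word):
    """Register contents after lw in the 64-bit mode."""
    return sign_extend(word, 32)


def lwu64(word):
    """Register contents after lwu."""
    return zero_extend(word, 32)
```

loading the word 0x80000000 therefore yields 0xffffffff80000000 with lw but 0x0000000080000000 with lwu.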
2.4.2 computational instructions

computational instructions perform arithmetic, logical, and shift operations on values in registers. computational instructions can be either in register (r-type) format, in which both operands are registers, or in immediate (i-type) format, in which one operand is a 16-bit immediate.

computational instructions are classified as:
(1) alu immediate instructions
(2) three-operand type instructions
(3) shift instructions
(4) multiply/divide instructions

in addition, product-sum operation instructions are added in the vr4100 series.

to maintain data compatibility between the 64- and 32-bit modes, it is necessary to sign-extend 32-bit operands correctly. if the sign extension is not correct, the 32-bit operation result is meaningless.

table 2-7. alu immediate instructions

instruction format fields: op, rs, rt, immediate

add immediate: addi rt, rs, immediate
    the 16-bit immediate is sign extended and then added to the contents of register rs to form a 32-bit result. the result is stored into register rt. in the 64-bit mode, the operand must be sign extended. an exception occurs on the generation of 2's complement overflow.

add immediate unsigned: addiu rt, rs, immediate
    the 16-bit immediate is sign extended and then added to the contents of register rs to form a 32-bit result. the result is stored into register rt. in the 64-bit mode, the operand must be sign extended. no exception occurs on the generation of integer overflow.

set on less than immediate: slti rt, rs, immediate
    the 16-bit immediate is sign extended and then compared to the contents of register rs, treating both operands as signed integers. if rs is less than the immediate, the result is set to 1; otherwise, the result is set to 0. the result is stored to register rt.

set on less than immediate unsigned: sltiu rt, rs, immediate
    the 16-bit immediate is sign extended and then compared to the contents of register rs, treating both operands as unsigned integers. if rs is less than the immediate, the result is set to 1; otherwise, the result is set to 0. the result is stored to register rt.

and immediate: andi rt, rs, immediate
    the 16-bit immediate is zero extended and then anded with the contents of register rs. the result is stored into register rt.

or immediate: ori rt, rs, immediate
    the 16-bit immediate is zero extended and then ored with the contents of register rs. the result is stored into register rt.

exclusive or immediate: xori rt, rs, immediate
    the 16-bit immediate is zero extended and then ex-ored with the contents of register rs. the result is stored into register rt.

load upper immediate: lui rt, immediate
    the 16-bit immediate is shifted left by 16 bits, and the lower 16 bits of the word are set to 0. the result is stored into register rt. in the 64-bit mode, the operand must be sign extended.
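the contrast between addi and addiu (overflow exception or not) and between slti and sltiu (signed or unsigned comparison of the same sign-extended immediate) can be sketched in python. this is a 32-bit model only; the exception class IntegerOverflow and all function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


class IntegerOverflow(Exception):
    """Stand-in for the integer overflow exception."""


def sext16(imm):
    """Sign extend a 16-bit immediate to a Python integer."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def to_signed32(value):
    """Interpret a 32-bit register image as a signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def addiu(rs, imm):
    """Add without any overflow exception; the result wraps to 32 bits."""
    return (rs + sext16(imm)) & MASK32


def addi(rs, imm):
    """Add, raising on 2's complement overflow."""
    total = to_signed32(rs) + sext16(imm)
    if not -(1 << 31) <= total <= (1 << 31) - 1:
        raise IntegerOverflow("addi result does not fit in 32 bits")
    return total & MASK32


def slti(rs, imm):
    """Signed comparison against the sign-extended immediate."""
    return int(to_signed32(rs) < sext16(imm))


def sltiu(rs, imm):
    """Unsigned comparison against the same sign-extended immediate."""
    return int((rs & MASK32) < (sext16(imm) & MASK32))
```

with rs = 0 and immediate 0xffff, slti gives 0 (0 is not less than -1) while sltiu gives 1 (0 is less than 0xffffffff), because the immediate is sign extended before the unsigned comparison.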
table 2-8. alu immediate instructions (extended isa)

instruction format fields: op, rs, rt, immediate

doubleword add immediate: daddi rt, rs, immediate
    the 16-bit immediate is sign extended to 64 bits and then added to the contents of register rs to form a 64-bit result. the result is stored into register rt. an exception occurs on the generation of integer overflow.

doubleword add immediate unsigned: daddiu rt, rs, immediate
    the 16-bit immediate is sign extended to 64 bits and then added to the contents of register rs to form a 64-bit result. the result is stored into register rt. no exception occurs on the generation of overflow.

table 2-9. three-operand type instructions

instruction format fields: op, rs, rt, rd, sa, funct

add: add rd, rs, rt
    the contents of registers rs and rt are added together to form a 32-bit result. the result is stored into register rd. in the 64-bit mode, the operand must be sign extended. an exception occurs on the generation of integer overflow.

add unsigned: addu rd, rs, rt
    the contents of registers rs and rt are added together to form a 32-bit result. the result is stored into register rd. in the 64-bit mode, the operand must be sign extended. no exception occurs on the generation of integer overflow.

subtract: sub rd, rs, rt
    the contents of register rt are subtracted from the contents of register rs. the 32-bit result is stored into register rd. in the 64-bit mode, the operand must be sign extended. an exception occurs on the generation of integer overflow.

subtract unsigned: subu rd, rs, rt
    the contents of register rt are subtracted from the contents of register rs. the 32-bit result is stored into register rd. in the 64-bit mode, the operand must be sign extended. no exception occurs on the generation of integer overflow.

set on less than: slt rd, rs, rt
    the contents of registers rs and rt are compared, treating both operands as signed integers. if the contents of register rs are less than those of register rt, the result is set to 1; otherwise, the result is set to 0. the result is stored to register rd.

set on less than unsigned: sltu rd, rs, rt
    the contents of registers rs and rt are compared, treating both operands as unsigned integers. if the contents of register rs are less than those of register rt, the result is set to 1; otherwise, the result is set to 0. the result is stored to register rd.

and: and rd, rs, rt
    the contents of register rs are bit-wise logically anded with those of register rt. the result is stored to register rd.

or: or rd, rs, rt
    the contents of register rs are bit-wise logically ored with those of register rt. the result is stored to register rd.

exclusive or: xor rd, rs, rt
    the contents of register rs are bit-wise logically ex-ored with those of register rt. the result is stored to register rd.

nor: nor rd, rs, rt
    the contents of register rs are bit-wise logically nored with those of register rt. the result is stored to register rd.
table 2-10. three-operand type instructions (extended isa)

instruction format fields: op, rs, rt, rd, sa, funct

doubleword add: dadd rd, rs, rt
    the contents of register rs are added to those of register rt. the 64-bit result is stored into register rd. an exception occurs on the generation of integer overflow.

doubleword add unsigned: daddu rd, rs, rt
    the contents of register rs are added to those of register rt. the 64-bit result is stored into register rd. no exception occurs on the generation of integer overflow.

doubleword subtract: dsub rd, rs, rt
    the contents of register rt are subtracted from those of register rs. the 64-bit result is stored into register rd. an exception occurs on the generation of integer overflow.

doubleword subtract unsigned: dsubu rd, rs, rt
    the contents of register rt are subtracted from those of register rs. the 64-bit result is stored into register rd. no exception occurs on the generation of integer overflow.

table 2-11. shift instructions

instruction format fields: op, rs, rt, rd, sa, funct

shift left logical: sll rd, rt, sa
    the contents of register rt are shifted left by sa bits and zeros are inserted into the emptied lower bits. the 32-bit result is stored into register rd. in the 64-bit mode, the operand must be sign extended.

shift right logical: srl rd, rt, sa
    the contents of register rt are shifted right by sa bits and zeros are inserted into the emptied higher bits. the 32-bit result is stored into register rd. in the 64-bit mode, the operand must be sign extended.

shift right arithmetic: sra rd, rt, sa
    the contents of register rt are shifted right by sa bits and the emptied higher bits are sign extended. the 32-bit result is stored into register rd. in the 64-bit mode, the operand must be sign extended.

shift left logical variable: sllv rd, rt, rs
    the contents of register rt are shifted left and zeros are inserted into the emptied lower bits. the lower five bits of register rs specify the shift count. the 32-bit result is stored into register rd. in the 64-bit mode, the operand must be sign extended.

shift right logical variable: srlv rd, rt, rs
    the contents of register rt are shifted right and zeros are inserted into the emptied higher bits. the lower five bits of register rs specify the shift count. the 32-bit result is stored into register rd. in the 64-bit mode, the operand must be sign extended.

shift right arithmetic variable: srav rd, rt, rs
    the contents of register rt are shifted right and the emptied higher bits are sign extended. the lower five bits of register rs specify the shift count. the 32-bit result is stored into register rd. in the 64-bit mode, the operand must be sign extended.
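the three 32-bit shift flavours differ only in what fills the emptied bits, and the variable forms use only the lower five bits of rs as the shift count. a short python sketch of the 32-bit results (function names are hypothetical; the sign extension of the result in the 64-bit mode is not modelled):

```python
MASK32 = 0xFFFFFFFF


def sll(rt, sa):
    """Shift left, zeros enter from the right."""
    return (rt << (sa & 31)) & MASK32


def srl(rt, sa):
    """Shift right, zeros enter from the left."""
    return (rt & MASK32) >> (sa & 31)


def sra(rt, sa):
    """Shift right, copies of the sign bit enter from the left."""
    rt &= MASK32
    signed = rt - (1 << 32) if rt & 0x80000000 else rt
    return (signed >> (sa & 31)) & MASK32


def srlv(rt, rs):
    """Variable form: only the lower five bits of rs give the shift count."""
    return srl(rt, rs & 0x1F)
```

shifting 0x80000000 right by 4 gives 0x08000000 with srl and 0xf8000000 with sra. a count of 36 held in rs behaves like a count of 4 in srlv, since only the lower five bits are used.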
table 2-12. shift instructions (extended isa)

instruction format fields: op, rs, rt, rd, sa, funct

doubleword shift left logical: dsll rd, rt, sa
    the contents of register rt are shifted left by sa bits and zeros are inserted into the emptied lower bits. the 64-bit result is stored into register rd.

doubleword shift right logical: dsrl rd, rt, sa
    the contents of register rt are shifted right by sa bits and zeros are inserted into the emptied higher bits. the 64-bit result is stored into register rd.

doubleword shift right arithmetic: dsra rd, rt, sa
    the contents of register rt are shifted right by sa bits and the emptied higher bits are sign extended. the 64-bit result is stored into register rd.

doubleword shift left logical variable: dsllv rd, rt, rs
    the contents of register rt are shifted left and zeros are inserted into the emptied lower bits. the lower six bits of register rs specify the shift count. the 64-bit result is stored into register rd.

doubleword shift right logical variable: dsrlv rd, rt, rs
    the contents of register rt are shifted right and zeros are inserted into the emptied higher bits. the lower six bits of register rs specify the shift count. the 64-bit result is stored into register rd.

doubleword shift right arithmetic variable: dsrav rd, rt, rs
    the contents of register rt are shifted right and the emptied higher bits are sign extended. the lower six bits of register rs specify the shift count. the 64-bit result is stored into register rd.

doubleword shift left logical + 32: dsll32 rd, rt, sa
    the contents of register rt are shifted left by 32 + sa bits and zeros are inserted into the emptied lower bits. the 64-bit result is stored into register rd.

doubleword shift right logical + 32: dsrl32 rd, rt, sa
    the contents of register rt are shifted right by 32 + sa bits and zeros are inserted into the emptied higher bits. the 64-bit result is stored into register rd.

doubleword shift right arithmetic + 32: dsra32 rd, rt, sa
    the contents of register rt are shifted right by 32 + sa bits and the emptied higher bits are sign extended. the 64-bit result is stored into register rd.
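because the sa field is five bits wide, an immediate shift can encode counts from 0 to 31 only; the "+ 32" variants cover counts from 32 to 63. the python sketch below models the three "+ 32" instructions and shows how a shift count might be mapped to a mnemonic. select_left_shift is a hypothetical helper, not something an assembler is known to expose.

```python
MASK64 = (1 << 64) - 1


def dsll32(rt, sa):
    """Shift left by 32 + sa bits, zeros enter from the right."""
    return (rt << (32 + (sa & 31))) & MASK64


def dsrl32(rt, sa):
    """Shift right by 32 + sa bits, zeros enter from the left."""
    return (rt & MASK64) >> (32 + (sa & 31))


def dsra32(rt, sa):
    """Shift right by 32 + sa bits, sign bit copies enter from the left."""
    rt &= MASK64
    signed = rt - (1 << 64) if rt >> 63 else rt
    return (signed >> (32 + (sa & 31))) & MASK64


def select_left_shift(count):
    """Pick the mnemonic and sa value for a left shift by count bits."""
    if not 0 <= count <= 63:
        raise ValueError("shift count must be in the range 0 to 63")
    return ("dsll", count) if count < 32 else ("dsll32", count - 32)
```

a left shift by 40 bits is thus expressed as dsll32 with sa = 8.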
table 2-13. multiply/divide instructions

instruction format fields: op, rs, rt, rd, sa, funct

multiply: mult rs, rt
    the contents of registers rt and rs are multiplied, treating both operands as 32-bit signed integers. the 64-bit result is stored into special registers hi and lo. in the 64-bit mode, the operand must be sign extended.

multiply unsigned: multu rs, rt
    the contents of registers rt and rs are multiplied, treating both operands as 32-bit unsigned integers. the 64-bit result is stored into special registers hi and lo. in the 64-bit mode, the operand must be sign extended.

divide: div rs, rt
    the contents of register rs are divided by those of register rt, treating both operands as 32-bit signed integers. the 32-bit quotient is stored into special register lo, and the 32-bit remainder is stored into special register hi. in the 64-bit mode, the operand must be sign extended.

divide unsigned: divu rs, rt
    the contents of register rs are divided by those of register rt, treating both operands as 32-bit unsigned integers. the 32-bit quotient is stored into special register lo, and the 32-bit remainder is stored into special register hi. in the 64-bit mode, the operand must be sign extended.

move from hi: mfhi rd
    the contents of special register hi are loaded into register rd.

move from lo: mflo rd
    the contents of special register lo are loaded into register rd.

move to hi: mthi rs
    the contents of register rs are loaded into special register hi.

move to lo: mtlo rs
    the contents of register rs are loaded into special register lo.

table 2-14. multiply/divide instructions (extended isa)

instruction format fields: op, rs, rt, rd, sa, funct

doubleword multiply: dmult rs, rt
    the contents of registers rt and rs are multiplied, treating both operands as signed integers. the 128-bit result is stored into special registers hi and lo.

doubleword multiply unsigned: dmultu rs, rt
    the contents of registers rt and rs are multiplied, treating both operands as unsigned integers. the 128-bit result is stored into special registers hi and lo.

doubleword divide: ddiv rs, rt
    the contents of register rs are divided by those of register rt, treating both operands as signed integers. the 64-bit quotient is stored into special register lo, and the 64-bit remainder is stored into special register hi.

doubleword divide unsigned: ddivu rs, rt
    the contents of register rs are divided by those of register rt, treating both operands as unsigned integers. the 64-bit quotient is stored into special register lo, and the 64-bit remainder is stored into special register hi.
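how the results are split between hi and lo can be sketched in python for the 32-bit mult, multu, and divu instructions. the function names are hypothetical, each function returns the pair (hi, lo), and division by zero is not modelled.

```python
MASK32 = 0xFFFFFFFF
MASK64 = (1 << 64) - 1


def to_signed32(value):
    """Interpret a 32-bit register image as a signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def mult(rs, rt):
    """Signed 32 x 32 multiply; returns (hi, lo) of the 64-bit product."""
    product = (to_signed32(rs) * to_signed32(rt)) & MASK64
    return product >> 32, product & MASK32


def multu(rs, rt):
    """Unsigned 32 x 32 multiply; returns (hi, lo) of the 64-bit product."""
    product = (rs & MASK32) * (rt & MASK32)
    return product >> 32, product & MASK32


def divu(rs, rt):
    """Unsigned divide; hi receives the remainder, lo the quotient."""
    rs &= MASK32
    rt &= MASK32
    return rs % rt, rs // rt
```

the same bit pattern gives different hi values depending on signedness: multiplying 0xffffffff by 2 leaves hi = 0xffffffff with mult (the product is -2) and hi = 1 with multu.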
table 2-15. product-sum operation instructions (for vr4121, vr4122, vr4131, and vr4181a)

multiply and add accumulate: macc{h}{u}{s} rd, rs, rt
    the contents of registers rt and rs are multiplied, treating both operands as 32-bit signed integers. the result is added to the combined value of special registers hi and lo. the 64-bit result is stored into special registers hi and lo.
    if h=0, the same data as that stored in register lo is also stored in register rd; if h=1, the same data as that stored in register hi is also stored in register rd.
    if u is specified, the operand is treated as unsigned data.
    if s is specified, registers rs and rt are treated as 16-bit values (32 bits sign- or zero-extended), and the value obtained by combining registers hi and lo is treated as a 32-bit value (64 bits sign- or zero-extended). moreover, saturation processing is performed for the operation result in the format specified with u.

doubleword multiply and add accumulate: dmacc{h}{u}{s} rd, rs, rt
    the contents of registers rt and rs are multiplied, treating both operands as 32-bit signed integers. the result is added to the value of special register lo. the 64-bit result is stored into special register lo.
    if h=0, the same data as that stored in register lo is also stored in register rd; if h=1, undefined data is stored in register rd.
    if u is specified, the operand is treated as unsigned data.
    if s is specified, registers rs and rt are treated as 16-bit values (32 bits sign- or zero-extended), and register lo is treated as a 32-bit value (64 bits sign- or zero-extended). moreover, saturation processing is performed for the operation result in the format specified with u.

table 2-16. product-sum operation instructions (for vr4181)

multiply and add 16-bit integer: madd16 rs, rt
    the contents of registers rt and rs are multiplied, treating both operands as 16-bit signed integers (by sign extending to 64 bits). the result is added to the combined value of special registers hi and lo. the 64-bit result is stored into special registers hi and lo.

doubleword multiply and add 16-bit integer: dmadd16 rs, rt
    the contents of registers rt and rs are multiplied, treating both operands as 16-bit signed integers (by sign extending to 64 bits). the result is added to the value of special register lo. the 64-bit result is stored into special register lo.

instruction format fields shown in the format diagrams for tables 2-15 and 2-16: op, rs, rt, rd, sa, funct and op, rs, rt, rd, funct
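a product-sum instruction folds the multiply and the accumulate into one step, which is what makes it useful for dot products. the python sketch below models only the plain macc form (h=0, no u, no s) as described in table 2-15; saturation and the unsigned variants are not modelled, and the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF
MASK64 = (1 << 64) - 1


def to_signed32(value):
    """Interpret a 32-bit register image as a signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def macc(hi, lo, rs, rt):
    """Add rs * rt (signed) to hi:lo; returns (hi, lo, rd) with rd = lo."""
    acc = ((hi << 32) | lo) + to_signed32(rs) * to_signed32(rt)
    acc &= MASK64
    hi, lo = acc >> 32, acc & MASK32
    return hi, lo, lo


def dot_product(xs, ys):
    """Accumulate a dot product the way a macc loop would."""
    hi = lo = 0
    for x, y in zip(xs, ys):
        hi, lo, _rd = macc(hi, lo, x, y)
    acc = (hi << 32) | lo
    # Read the 64-bit accumulator back as a signed value.
    return acc - (1 << 64) if acc >> 63 else acc
```

dot_product([1, 2, 3], [4, 5, 6]) accumulates 4 + 10 + 18 and returns 32, with one macc per element pair and no separate add.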
mfhi and mflo instructions that follow a multiply or divide instruction generate interlocks to delay execution of the next instruction, preventing the result from being read until the multiply or divide instruction completes.

table 2-17 gives the number of processor cycles (pcycles) required to resolve the interlock or stall between each multiply or divide instruction and a subsequent mfhi or mflo instruction.

table 2-17. number of stall cycles in multiply and divide instructions

instruction    number of stall cycles (pcycles)
mult           1
multu          1
div            35
divu           35
dmult          4
dmultu         4
ddiv           67
ddivu          67
macc           0
dmacc          0
madd16         1
dmadd16        1
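the values in table 2-17 can be used to judge how much independent work is worth placing between a multiply or divide instruction and the mfhi or mflo that reads its result. the python sketch below rests on a simplifying assumption that is not stated in this manual: each independent instruction inserted in between is taken to occupy one pcycle and to hide one stall cycle. the names STALL_CYCLES and remaining_stall are hypothetical.

```python
# Stall cycles from table 2-17, keyed by mnemonic.
STALL_CYCLES = {
    "mult": 1, "multu": 1,
    "div": 35, "divu": 35,
    "dmult": 4, "dmultu": 4,
    "ddiv": 67, "ddivu": 67,
    "macc": 0, "dmacc": 0,
    "madd16": 1, "dmadd16": 1,
}


def remaining_stall(mnemonic, independent_instructions=0):
    """Estimate the stall left at mfhi/mflo after scheduling filler work.

    Assumes one pcycle per independent instruction (a simplification).
    """
    stall = STALL_CYCLES[mnemonic]
    return max(0, stall - independent_instructions)
```

under that assumption, a div followed immediately by mflo stalls for 35 pcycles, while a dmult with four independent instructions scheduled before the mflo would not stall at all.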
2.4.3 jump and branch instructions

jump and branch instructions change the control flow of a program. all jump and branch instructions occur with a delay of one instruction: that is, the instruction immediately following the jump or branch instruction (known as the instruction in the delay slot) always executes while the target instruction is being fetched from memory.

for instructions involving a link (such as jal and bltzal), the return address is saved in register r31.

(1) overview of jump instructions

subroutine calls in high-level languages are usually implemented with j or jal instructions, both of which are j-type instructions. in j-type format, the 26-bit target address is shifted left by 2 bits and combined with the high-order 4 bits of the current program counter to form a 32-bit or 64-bit absolute address.

returns, dispatches, and cross-page jumps are usually implemented with the jr or jalr instructions. both are r-type instructions that take the 32-bit or 64-bit byte address contained in one of the general-purpose registers.

table 2-18. jump instructions

instruction format fields (j, jal, jalx): op, target

jump: j target
    the 26-bit target address is shifted left by two bits and combined with the high-order four bits of the pc. the program jumps to this calculated address with a delay of one instruction.

jump and link: jal target
    the 26-bit target address is shifted left by two bits and combined with the high-order four bits of the pc. the program jumps to this calculated address with a delay of one instruction. the address of the instruction following the delay slot is stored into r31 (link register).

jump and link exchange: jalx target
    the 26-bit target address is shifted left by two bits and combined with the high-order four bits of the pc. the program jumps to this calculated address with a delay of one instruction, and then the isa mode bit is reversed. the address of the instruction following the delay slot is stored into r31 (link register).

instruction format fields (jr, jalr): op, rs, rt, rd, sa, funct

jump register: jr rs
    the program jumps to the address specified in register rs with a delay of one instruction.

jump and link register: jalr rs, rd
    the program jumps to the address specified in register rs with a delay of one instruction. the address of the instruction following the delay slot is stored into rd.
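the j-type target calculation and the link address can be written out as a short python sketch for the 32-bit case. it assumes, as in the mips architecture generally, that the high-order four bits are taken from the address of the delay slot instruction; pc here denotes the address of the jump instruction itself, and the function names are hypothetical.

```python
def jump_target(pc, target26):
    """Absolute address reached by j/jal located at address pc (32-bit)."""
    delay_slot = pc + 4
    # Keep the high-order 4 bits, replace the low 28 with target << 2.
    return (delay_slot & 0xF0000000) | ((target26 & 0x03FFFFFF) << 2)


def link_address(pc):
    """Return address saved by jal: the instruction after the delay slot."""
    return pc + 8
```

a jal at 0x80001000 with a target field of 0x123456 jumps to 0x8048d158 and stores 0x80001008 in r31. because only 28 bits come from the instruction, the destination always lies in the same 256-mbyte region as the delay slot.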
(2) overview of branch instructions

a branch instruction has a pc-relative signed 16-bit offset. all branch instruction target addresses are computed by adding the address of the instruction in the delay slot to the 16-bit offset (shifted left by 2 bits and sign-extended to 64 bits). all branches occur with a delay of one instruction.

calculation of the target address is performed at the rf stage and the ex stage of the instruction. the target instruction of the branch is fetched at the ex stage of the branch instruction.

if the branch condition is not met when a branch likely instruction is executed, the instruction in its delay slot is nullified. for all other branch instructions, the instruction in the delay slot is unconditionally executed.

table 2-19. branch instructions (1/2)

instruction format fields: op, rs, rt, offset

branch on equal: beq rs, rt, offset
    if the contents of register rs are equal to those of register rt, the program branches to the target address.

branch on not equal: bne rs, rt, offset
    if the contents of register rs are not equal to those of register rt, the program branches to the target address.

branch on less than or equal to zero: blez rs, offset
    if the contents of register rs are less than or equal to zero, the program branches to the target address.

branch on greater than zero: bgtz rs, offset
    if the contents of register rs are greater than zero, the program branches to the target address.

instruction format fields: regimm, rs, sub, offset

branch on less than zero: bltz rs, offset
    if the contents of register rs are less than zero, the program branches to the target address.

branch on greater than or equal to zero: bgez rs, offset
    if the contents of register rs are greater than or equal to zero, the program branches to the target address.

branch on less than zero and link: bltzal rs, offset
    the address of the instruction that follows the delay slot is stored to register r31 (link register). if the contents of register rs are less than zero, the program branches to the target address.

branch on greater than or equal to zero and link: bgezal rs, offset
    the address of the instruction that follows the delay slot is stored to register r31 (link register). if the contents of register rs are greater than or equal to zero, the program branches to the target address.

remark sub: sub-operation code
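the branch target calculation described in the overview (delay slot address plus the sign-extended offset shifted left by two bits) can be sketched in python, together with the inverse step an assembler would perform. pc denotes the address of the branch instruction itself; the function names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def sext16(offset):
    """Sign extend the 16-bit offset field."""
    offset &= 0xFFFF
    return offset - 0x10000 if offset & 0x8000 else offset


def branch_target(pc, offset16):
    """Target address of a branch at pc with the given offset field."""
    delay_slot = pc + 4
    return (delay_slot + (sext16(offset16) << 2)) & MASK64


def encode_offset(pc, target):
    """Offset field that makes a branch at pc reach target."""
    delta = target - (pc + 4)
    if delta % 4:
        raise ValueError("target is not word aligned relative to the branch")
    words = delta >> 2
    if not -0x8000 <= words <= 0x7FFF:
        raise ValueError("target is out of range for a 16-bit offset")
    return words & 0xFFFF
```

an offset field of 0xffff (that is, -1) makes a branch at 0x1000 target itself, because the offset is applied to the delay slot address 0x1004. the reachable range is -32768 to +32767 instructions relative to the delay slot.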
table 2-19. branch instructions (2/2)

instruction format fields: cop0, bc, br, offset

for both instructions below, the 16-bit offset (shifted left by two bits and sign extended to 32 bits) is added to the address of the instruction in the delay slot to calculate the branch target address.

branch on coprocessor 0 true: bc0t offset
    if the conditional signal of coprocessor 0 is true, the program branches to the target address with a one-instruction delay.

branch on coprocessor 0 false: bc0f offset
    if the conditional signal of coprocessor 0 is false, the program branches to the target address with a one-instruction delay.

remark bc: bc sub-operation code
       br: branch condition identifier

table 2-20. branch instructions (extended isa) (1/2)

instruction format fields: op, rs, rt, offset

for every instruction in this table, if the branch condition is not met, the instruction in the delay slot is discarded.

branch on equal likely: beql rs, rt, offset
    if the contents of register rs are equal to those of register rt, the program branches to the target address.

branch on not equal likely: bnel rs, rt, offset
    if the contents of register rs are not equal to those of register rt, the program branches to the target address.

branch on less than or equal to zero likely: blezl rs, offset
    if the contents of register rs are less than or equal to zero, the program branches to the target address.

branch on greater than zero likely: bgtzl rs, offset
    if the contents of register rs are greater than zero, the program branches to the target address.
chapter 2 cpu instruction set summary user?s manual u15509ej2v0um 50 table 2-20. branch instructions (extended isa) (2/2) instruction format and description branch on less than zero likely bltzl rs, offset if the contents of register rs are less than zero, the program branches to the target address. if the branch condition is not met, the instruction in the delay slot is discarded. branch on greater than or equal to zero likely bgezl rs, offset if the contents of register rs are greater than or equal to zero, the program branches to the target address. if the branch condition is not met, the instruction in the delay slot is discarded. branch on less than zero and link likely bltzall rs, offset the address of the instruction that follows delay slot is stored to register r31 (link register). if the contents of register rs are less than zero, the program branches to the target address. if the branch condition is not met, the instruction in the delay slot is discarded. branch on greater than or equal to zero and link likely bgezall rs, offset the address of the instruction that follows delay slot is stored to register r31 (link register). if the contents of register rs are greater than or equal to zero, the program branches to the target address. if the branch condition is not met, the instruction in the delay slot is discarded. remark sub: sub-operation code instruction format and description branch on coprocessor 0 true likely bc0tl offset adds the 16-bit offset (shifted left by two bits and sign extended to 32 bits) to the address of the instruction in the delay slot to calculate out the branch target address. if the conditional signal of the coprocessor 0 is true, the program branches to the target address with one-instruction delay. if the branch condition is not met, the instruction in the delay slot is discarded. 
branch on coprocessor 0 false likely
  bc0fl offset
  adds the 16-bit offset (shifted left by two bits and sign extended to 32 bits) to the address of the instruction in the delay slot to calculate the branch target address. if the conditional signal of coprocessor 0 is false, the program branches to the target address with a one-instruction delay. if the branch condition is not met, the instruction in the delay slot is discarded.

remark bc: bc sub-operation code
       br: branch condition identifier

(instruction format diagrams; fields: regimm, rs, sub, offset and cop0, bc, br, offset)
chapter 2 cpu instruction set summary user?s manual u15509ej2v0um 51 2.4.4 special instructions special instructions generate software exceptions. their formats are r-type (syscall, break). the trap instruction is available only for the products that support the mips iii instruction set or later. all the other instructions are available for all v r series. table 2-21. special instructions instruction format and description synchronize sync completes the load/store instruction executing in the current pipeline before the next load/store instruction starts execution. system call syscall generates a system call exception, and then transits control to the exception handling program. breakpoint break generates a break point exception, and then transits control to the exception handling program. remark sync instruction is handled as a nop instruction in the v r 4100 series. table 2-22. special instructions (extended isa) (1/2) instruction format and description trap if greater than or equal tge rs, rt the contents of register rs are compared with that of register rt, treating both operands as signed integers. if the contents of register rs are greater than or equal to that of register rt, an exception occurs. trap if greater than or equal unsigned tgeu rs, rt the contents of register rs are compared with that of register rt, treating both operands as unsigned integers. if the contents of register rs are greater than or equal to that of register rt, an exception occurs. trap if less than tlt rs, rt the contents of register rs are compared with that of register rt, treating both operands as signed integers. if the contents of register rs are less than that of register rt, an exception occurs. trap if less than unsigned tltu rs, rt the contents of register rs are compared with that of register rt, treating both operands as unsigned integers. if the contents of register rs are less than that of register rt, an exception occurs. 
trap if equal
  teq rs, rt
  if the contents of registers rs and rt are equal, an exception occurs.

trap if not equal
  tne rs, rt
  if the contents of registers rs and rt are not equal, an exception occurs.

(instruction format diagram; fields: special, rs, rt, rd, sa, funct)
table 2-22. special instructions (extended isa) (2/2)
instruction / format and description

trap if greater than or equal immediate
  tgei rs, immediate
  the contents of register rs are compared with the 16-bit sign-extended immediate data, treating both operands as signed integers. if the contents of register rs are greater than or equal to the sign-extended immediate data, an exception occurs.

trap if greater than or equal immediate unsigned
  tgeiu rs, immediate
  the contents of register rs are compared with the 16-bit sign-extended immediate data, treating both operands as unsigned integers. if the contents of register rs are greater than or equal to the sign-extended immediate data, an exception occurs.

trap if less than immediate
  tlti rs, immediate
  the contents of register rs are compared with the 16-bit sign-extended immediate data, treating both operands as signed integers. if the contents of register rs are less than the sign-extended immediate data, an exception occurs.

trap if less than immediate unsigned
  tltiu rs, immediate
  the contents of register rs are compared with the 16-bit sign-extended immediate data, treating both operands as unsigned integers. if the contents of register rs are less than the sign-extended immediate data, an exception occurs.

trap if equal immediate
  teqi rs, immediate
  if the contents of register rs and the immediate data are equal, an exception occurs.

trap if not equal immediate
  tnei rs, immediate
  if the contents of register rs and the immediate data are not equal, an exception occurs.

remark sub: sub-operation code

2.4.5 system control coprocessor (cp0) instructions
system control coprocessor (cp0) instructions operate on the cp0 registers to manipulate the memory management and exception handling facilities of the processor. the power mode instructions added in the v r 4100 series are included in this instruction group.

table 2-23.
system control coprocessor (cp0) instructions (1/2) instruction format and description move to system control coprocessor mtc0 rt, rd the word data of general-purpose register rt in the cpu are loaded into general-purpose register rd in the cp0. move from system control coprocessor mfc0 rt, rd the word data of general-purpose register rd in the cp0 are loaded into general-purpose register rt in the cpu. doubleword move to system control coprocessor 0 dmtc0 rt, rd the doubleword data of general-purpose register rt in the cpu are loaded into general-purpose register rd in the cp0. doubleword move from system control coprocessor 0 dmfc0 rt, rd the doubleword data of general-purpose register rd in the cp0 are loaded into general-purpose register rt in the cpu. remark sub: sub-operation code regimm immediate rs sub cop0 su b rt 0 r d
chapter 2 cpu instruction set summary user?s manual u15509ej2v0um 53 table 2-23. system control coprocessor (cp0) instructions (2/2) instruction format and description read indexed tlb entry tlbr the tlb entry indexed by the index register is loaded into the entryhi, entrylo0, entrylo1, or pagemask register. write indexed tlb entry tlbwi the contents of the entryhi, entrylo0, entrylo1, or pagemask register are loaded into the tlb entry indexed by the index register. write random tlb entry tlbwr the contents of the entryhi, entrylo0, entrylo1, or pagemask register are loaded into the tlb entry indexed by the random register. probe tlb for matching entry tlbp the address of the tlb entry that matches with the contents of entryhi register is loaded into the index register. return from exception eret the program returns from exception, interrupt, or error trap. remark co: sub-operation identifier instruction format and description standby standby the processor?s operating mode is transited from fullspeed mode to standby mode. suspend suspend the processor?s operating mode is transited from fullspeed mode to suspend mode. hibernate hibernate the processor?s operating mode is transited from fullspeed mode to hibernate mode. remark co: sub-operation identifier instruction format and description cache operation cache op, offset (base) the 16-bit offset is sign extended to 32 bits and added to the contents of the register base, to form virtual address. this virtual address is translated to physical address with tlb. for this physical address, cache operation that is indicated by 5-bit sub-opcode is performed. cop0 funct co cache offset base op cop0 funct co
chapter 3 mips16 instruction set

3.1 outline
if the mips16 ase (application-specific extension), an extension to the mips isa (instruction set architecture), is used, system costs can be considerably reduced by lowering the memory capacity requirement of embedded hardware. mips16 is an instruction set that uses a 16-bit instruction length, and is compatible with the mips i, ii, iii, iv, and v (see note) instruction sets in any combination. moreover, existing 32-bit instruction length binary data can be executed with mips16 without change.

note: the v r 4100 series currently supports the mips i, ii, and iii instruction sets.

the mips16 instruction set is enabled or disabled in the v r 4100 series according to the state of the mips16en pin during a reset.

3.2 features
- 16-bit length instruction format
- reduces memory capacity requirements to lower overall system cost
- mips16 instructions can be used together with mips instruction binaries
- compatibility with the mips i, ii, iii, iv, and v instruction sets
- execution switches between the mips16 instruction length mode and the 32-bit mips instruction length mode
- supports 8-bit, 16-bit, 32-bit, and 64-bit data formats
- provides 8 general-purpose registers and special registers
- improved code generation efficiency using special 16-bit dedicated instructions
3.3 register set
tables 3-1 and 3-2 show the mips16 register sets. these register sets form part of the register sets that can be accessed in 32-bit instruction length mode. mips16 instructions can directly access 8 of the 32 registers that can be used in the 32-bit instruction length mode. in addition to these 8 general-purpose registers, the special instructions of mips16 reference the stack pointer register (sp), return address register (ra), condition code register (t8), and program counter (pc). sp and ra are mapped to fixed general-purpose registers of the 32-bit instruction length mode. mips16 has 2 move instructions that can address all 32 general-purpose registers.

table 3-1. general-purpose registers

mips16 register   32-bit mips register   symbol   comment
encoding          encoding
0                 16                     s0       general-purpose register
1                 17                     s1       general-purpose register
2                 2                      v0       general-purpose register
3                 3                      v1       general-purpose register
4                 4                      a0       general-purpose register
5                 5                      a1       general-purpose register
6                 6                      a2       general-purpose register
7                 7                      a3       general-purpose register
n/a               24                     t8       mips16 condition code register. implicitly referenced by the bteqz, btnez, cmp, cmpi, slt, sltu, slti, and sltiu instructions.
n/a               29                     sp       stack pointer register
n/a               31                     ra       return address register

remarks
1. the symbols are the general assembler symbols.
2. the mips16 register encoding numbers 0 to 7 correspond to the mips16 binary encoding of the registers, and are used to show the relationship between this encoding and the mips registers. the numbers 0 to 7 are not used to reference registers, except within binary mips16 instructions. registers are referenced from the assembler using the mips name ($16, $17, $2, etc.) or the symbol name (s0, s1, v0, etc.). for example, to access register number 17 in the register file, the programmer references either $17 or s1, even though the mips16 encoding of this register is 001.
3. the general-purpose registers not shown in this table cannot be accessed by mips16 instructions other than the move instructions. the move instructions of mips16 can access all 32 general-purpose registers.
4. in this manual, the mips16 condition code register is referred to as t, t8, or $24, depending on the case. these three names reference the same physical register.
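The mapping in table 3-1 can be sketched as a small lookup. This is only an illustration of the table; the function and constant names are hypothetical and do not come from the manual.

```python
# Sketch of the table 3-1 mapping. All names here are illustrative only.
# Index = 3-bit MIPS16 register encoding, value = 32-bit MIPS register number.
MIPS16_TO_MIPS32 = (16, 17, 2, 3, 4, 5, 6, 7)

# Assembler symbols for the eight directly accessible registers (table 3-1).
SYMBOLS = {16: "s0", 17: "s1", 2: "v0", 3: "v1",
           4: "a0", 5: "a1", 6: "a2", 7: "a3"}


def mips16_reg_to_mips32(encoding: int) -> int:
    """Translate a 3-bit MIPS16 register field to the 32-bit MIPS register number."""
    if not 0 <= encoding <= 7:
        raise ValueError("MIPS16 register encoding must be in the range 0 to 7")
    return MIPS16_TO_MIPS32[encoding]


def mips16_reg_symbol(encoding: int) -> str:
    """Return the assembler symbol (s0, s1, v0, ...) for a MIPS16 register field."""
    return SYMBOLS[mips16_reg_to_mips32(encoding)]
```

For example, the encoding 001 from remark 2 resolves to register $17, symbol s1.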
table 3-2. special registers

symbol   description
pc       program counter. the pc-relative add instruction and load instruction can access this register.
hi       holds the upper word of the multiply or divide result
lo       holds the lower word of the multiply or divide result

3.4 isa mode
the mips16 instruction set supports procedure calls and returns between any combination of the mips16 instruction mode and the 32-bit instruction length mode.
- the jal instruction supports calling to the same isa.
- the jalx instruction supports calling that inverts the isa.
- the jalr instruction supports calling to either isa.
- the jr instruction supports returning to either isa.

the mips16 instruction set also supports a return operation from exception processing.
- the eret instruction, which is defined only in 32-bit instruction length mode, supports returning to the isa that was in effect before the exception occurred.

the isa mode bit defines the instruction length mode to be executed. if the isa mode bit is 0, the processor executes only 32-bit instructions. if the isa mode bit is 1, the processor executes only mips16 instructions.

3.4.1 changing isa mode bit by software
only the jalx, jr, and jalr instructions change the isa mode bit between the mips16 instruction mode and the 32-bit instruction length mode. the isa mode bit cannot be directly overwritten by software. the jalx instruction changes the isa mode bit to select the other isa mode. the jr and jalr instructions load the isa mode bit from bit 0 of the general-purpose register that holds the target address. bit 0 is not a part of the target address. bit 0 of the target address is always 0, and no address exception is generated. moreover, the jal, jalr, and jalx instructions save the isa mode bit to bit 0 of the general-purpose register that receives the return address. the contents of this general-purpose register are later used by the jr and jalr instructions to return and to restore the isa mode.

3.4.2 changing isa mode bit by exception
when an exception occurs, the isa mode bit is cleared to 0 so that the exception is serviced with 32-bit code, regardless of the isa mode in effect at that time. the isa mode status before the exception occurred is saved to the least significant bit of the epc register or the errorepc register. on return from an exception, the isa mode before the exception occurred is restored by executing the jr or eret instruction with the contents of this register. moreover, the isa mode bit is cleared to 0 after a cold reset or soft reset of the cpu core, so that the processor returns to the 32-bit instruction length mode, its initial state.
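The way bit 0 carries the ISA mode through jr/jalr and through the link register can be sketched as follows. This is a minimal model, assuming 32-bit addresses and an enabled MIPS16 mode; the function names are hypothetical.

```python
# Minimal model of the ISA mode bit handling described in 3.4.1.
# Assumptions (not from the manual): 32-bit addresses, MIPS16 mode enabled.

def jump_register(reg_value: int):
    """Model jr/jalr: bit 0 selects the ISA mode, the rest is the target address."""
    isa_mode = reg_value & 1                  # 1 = MIPS16, 0 = 32-bit instructions
    target = reg_value & 0xFFFFFFFE           # bit 0 of the target address is always 0
    return target, isa_mode


def link_value(return_address: int, isa_mode: int) -> int:
    """Model jal/jalr/jalx: save the current ISA mode in bit 0 of the return address."""
    return (return_address & 0xFFFFFFFE) | (isa_mode & 1)
```

A return value produced by `link_value` while in MIPS16 mode restores MIPS16 mode when it is later passed to `jump_register`.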
chapter 3 mips16 instruction set user?s manual u15509ej2v0um 57 3.4.3 enabling change isa mode bit changing the isa mode bit is valid only when mips16en pin is set to active during the rtc reset, and the mips16 instruction mode is enabled. the operation of the jalx, jalr, jr, and eret instructions in the 32-bit instruction mode, differs depending on whether the mips16 instruction mode is enabled or prohibited. if the mips16 instruction mode is prohibited, the jalx instruction generates a reserved instruction exception. the jr and jalr instructions generate an address exception when bit 0 of the source register is 1. the eret instruction generates an address exception when bit 0 of the epc or errorepc register is 1. if the mips16 instruction mode is enabled, the jalx instruction executes jal, and the isa mode bit is inverted. the jr and jalr instructions load the isa mode from bit 0 of the source register. the eret instruction loads the isa mode from bit 0 of the epc or errorepc register. bit 0 of the target address is always 0, and no address exception is generated even when bit 0 of the source register is 1. 3.5 types of instructions this section describes the different types of instructions, and indicates the mips16 instructions included in each group. instructions are divided into the following types. load and store instructions : move data between memory and the general-purpose registers. computational instructions : perform arithmetic operations, logical operations, and shift operations on values in registers. jump and branch instructions: change the control flow of a program. special instructions : syscall, break, and extend instructions. syscall and break transfer control to an exception handler. extend enlarges the immediate field of the next instruction. instructions that can be extended with extend are indicated as note 1 in table 3-3 mips16 instruction set outline . table 3-3. 
mips16 instruction set outline (1/2)

load and store instructions
op       notes   description
lb       1       load byte
lbu      1       load byte unsigned
lh       1       load halfword
lhu      1       load halfword unsigned
lw       1       load word
lwu      1, 2    load word unsigned
ld       1, 2    load doubleword
sb       1       store byte
sh       1       store halfword
sw       1       store word
sd       1, 2    store doubleword

multiply/divide instructions
op       notes   description
mult             multiply
multu            multiply unsigned
div              divide
divu             divide unsigned
mfhi             move from hi
mflo             move from lo
dmult    2       doubleword multiply
dmultu   2       doubleword multiply unsigned
ddiv     2       doubleword divide
ddivu    2       doubleword divide unsigned

notes
1. extendable instruction. for details, see 3.8.2 extend instruction.
2. can be used in 64-bit mode and 32-bit kernel mode.
table 3-3. mips16 instruction set outline (2/2)

arithmetic instructions: alu immediate instructions
op       notes   description
li       1       load immediate
addiu    1       add immediate unsigned
daddiu   1, 2    doubleword add immediate unsigned
slti     1       set on less than immediate
sltiu    1       set on less than immediate unsigned
cmpi     1       compare immediate

arithmetic instructions: 2/3 operand register instructions
op       notes   description
addu             add unsigned
subu             subtract unsigned
daddu    2       doubleword add unsigned
dsubu    2       doubleword subtract unsigned
slt              set on less than
sltu             set on less than unsigned
cmp              compare
neg              negate
and              and
or               or
xor              exclusive or
not              not
move             move

jump/branch instructions
op       notes   description
jal              jump and link
jalx             jump and link exchange
jr               jump register
jalr             jump and link register
beqz     1       branch on equal to zero
bnez     1       branch on not equal to zero
bteqz    1       branch on t equal to zero
btnez    1       branch on t not equal to zero
b        1       branch unconditional

shift instructions
op       notes   description
sll      1       shift left logical
srl      1       shift right logical
sra      1       shift right arithmetic
sllv             shift left logical variable
srlv             shift right logical variable
srav             shift right arithmetic variable
dsll     1, 2    doubleword shift left logical
dsrl     1, 2    doubleword shift right logical
dsra     1, 2    doubleword shift right arithmetic
dsllv    2       doubleword shift left logical variable
dsrlv    2       doubleword shift right logical variable
dsrav    2       doubleword shift right arithmetic variable

special instructions
op       notes   description
extend           extend
break            breakpoint
syscall          system call

notes
1. extendable instruction. for details, see 3.8.2 extend instruction.
2. can be used in 64-bit mode and 32-bit kernel mode.
chapter 3 mips16 instruction set user?s manual u15509ej2v0um 59 3.6 instruction format the mips16 instruction set has a length of 16 bits and is located at the half-word boundary. one part of jump instructions and instructions for which the extend instruction extends immediate become 32 bits in length, but crossing the word boundary does not represent a problem. the instruction format is shown below. variable subfields are indicated with lower case letters (rx, ry, rz, immediate, etc.). in the case of special functions, constants are input to the two instruction subfields op and funct. these values are indicated by upper case mnemonics. for example, in the case of the load byte instruction, op is lb, and in the case of the add instruction, op is special, and function is add. the constants of the fields used in the instruction formats are shown below. table 3-4. field definition field definition op 5-bit major operation code rx 3-bit source/destination register specification ry 3-bit source/destination register specification immediate or imm 4-bit, 5-bit, 8-bit, or 11-bit immediate value, branch displacement, or address displacement rz 3-bit source/destination register specification funct or f function field i-type (immediate) instruction format 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 op immediate ri-type instruction format 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 op rx immediate rr-type instruction format 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 op rx ry funct
chapter 3 mips16 instruction set user?s manual u15509ej2v0um 60 rri-type instruction format 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 op rx ry immediate rrr-type instruction format 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 rrr rx ry rz f rri-a type instruction format 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 rri-a rx ry f immediate shift instruction format 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 shift rx ry shamt note f note the 3-bit shamt field can encode shift count numbers from 0 to 7. 0-bit shift (nop) cannot be executed. 0 is regarded as shift count 8. i8-type instruction format 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 i8 funct immediate i8_movr32 instruction format (used only with movr32 instruction) 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 i8 funct ry r32[4:0]
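The note on the shift instruction format (a shamt field value of 0 is regarded as shift count 8) can be illustrated with a short helper. This is a sketch only; the function name is hypothetical and it covers just the non-extended 3-bit field.

```python
# Sketch of the shamt interpretation for the non-extended shift format.
# The function name is illustrative; only the 0 -> 8 mapping comes from the note.

def decode_shamt3(field: int) -> int:
    """Return the effective shift count for a 3-bit shamt field."""
    if not 0 <= field <= 7:
        raise ValueError("shamt is a 3-bit field (0 to 7)")
    # A 0-bit shift cannot be encoded; field value 0 means a shift by 8.
    return 8 if field == 0 else field
```

The reachable shift counts are therefore 1 through 8, never 0.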
chapter 3 mips16 instruction set user?s manual u15509ej2v0um 61 i8_mov32r instruction format (used only with mov32r instruction) 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 i8 funct r32[2:0, 4:3] note rz note the r32 field uses special bit encoding. for example, encoding of $7 (00111) is 11100 in the r32 field. i64-type instruction format 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 i64 funct immediate ri64-type instruction format 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 i64 funct ry immediate jal and jalx instruction format 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 jal x immediate(20:16) immediate(25:21) immediate(15:0) jal in case of x = 0 instruction jalx in case of x = 1 instruction
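The special bit ordering of the r32 field in the i8_mov32r format (r32[2:0, 4:3]) can be checked with a small sketch. The helper names are hypothetical; the bit ordering and the $7 example are taken from the note above.

```python
# Sketch of the r32[2:0, 4:3] field ordering used by the mov32r format.
# Helper names are illustrative only.

def encode_mov32r_r32(reg: int) -> int:
    """Place register bits 2:0 in the upper three field bits and bits 4:3 in the lower two."""
    if not 0 <= reg <= 31:
        raise ValueError("register number must be in the range 0 to 31")
    return ((reg & 0b111) << 2) | (reg >> 3)


def decode_mov32r_r32(field: int) -> int:
    """Inverse of encode_mov32r_r32: recover the register number from the 5-bit field."""
    if not 0 <= field <= 31:
        raise ValueError("field must be a 5-bit value")
    return ((field & 0b11) << 3) | (field >> 2)
```

With this ordering, register $7 (00111) is stored as 11100, matching the example in the note.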
chapter 3 mips16 instruction set user?s manual u15509ej2v0um 62 ext-i instruction format 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 extend immediate(10:5) immediate(15:11) major 000000 immediate(4:0) ext-ri instruction format 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 extend immediate(10:5) immediate(15:11) major rx 000 immediate(4:0) ext-rri instruction format 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 extend immediate(10:5) immediate(15:11) major rx ry immediate(4:0) ext-rri-a instruction format 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 extend immediate(10:4) immediate(14:11) rri-a rx ry immediate(3:0) f
chapter 3 mips16 instruction set user?s manual u15509ej2v0um 63 ext-shift instruction format 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 extend shamt(4:0) s5 note 00000 shift rx ry 00 f 0 note only in the case of dsll, the s5 bit is the most significant bit of the 6-bit shift count field (shamt). in the case of all 32-bit extended shifts, s5 must be 0. for a normal shift instruction, the display of shift count 0 is considered as shift count 8, but the extended shift instruction does not perform such mapping changes. therefore, 0-bit shift using the extended format is possible. ext-i8 instruction format 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 extend immediate(10:5) immediate(15:11) i8 funct 0 0 0 immediate(4:0) ext-i64 instruction format 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 extend immediate(10:5) immediate(15:11) i64 funct 0 0 0 immediate(4:0) ext-ri64 instruction format 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 extend immediate(10:5) immediate(15:11) i64 funct ry immediate(4:0) ext-shift64 instruction format 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 extend shamt(4:0) s5 note 00000 rr 0 0 0 ry function note the s5 bit is the most significant bit of the 6-bit shift count field (shamt). in the case of a normal shift instruction, the display of shift count 0 is considered as shift count 8, but the extended shift instruction does not perform such mapping changes. therefore, 0-bit shift using the extended format is possible.
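The scattered immediate fields of the extended formats can be reassembled as shown in the following sketch for the ext-ri layout (extend, immediate(10:5), immediate(15:11), major, rx, 000, immediate(4:0)). The bit positions are derived from the field order and widths in the format diagram, and the extend opcode value 11110 from table 3-5. The function names are hypothetical, and whether the resulting 16-bit value is treated as signed or unsigned depends on the instruction.

```python
# Sketch: packing and unpacking the 16-bit immediate of the ext-ri format.
# Names are illustrative. Bit positions follow the ext-ri field order:
# [31:27] extend, [26:21] imm(10:5), [20:16] imm(15:11),
# [15:11] major, [10:8] rx, [7:5] 000, [4:0] imm(4:0).

EXTEND_OPCODE = 0b11110  # major opcode of extend (table 3-5)


def pack_ext_ri(major: int, rx: int, imm: int) -> int:
    """Build a 32-bit ext-ri word from a major opcode, rx, and 16-bit immediate."""
    imm &= 0xFFFF
    return ((EXTEND_OPCODE << 27)
            | (((imm >> 5) & 0x3F) << 21)
            | (((imm >> 11) & 0x1F) << 16)
            | ((major & 0x1F) << 11)
            | ((rx & 0x7) << 8)
            | (imm & 0x1F))


def ext_immediate(word: int) -> int:
    """Reassemble the 16-bit immediate from its three fields."""
    imm_10_5 = (word >> 21) & 0x3F
    imm_15_11 = (word >> 16) & 0x1F
    imm_4_0 = word & 0x1F
    return (imm_15_11 << 11) | (imm_10_5 << 5) | imm_4_0


def sign_extend16(value: int) -> int:
    """Interpret a 16-bit value as a signed integer."""
    return value - 0x10000 if value & 0x8000 else value
```

The same three-field split appears in the ext-i, ext-rri, ext-i8, ext-i64, and ext-ri64 formats; ext-rri-a uses a different split (10:4, 14:11, 3:0) and is not covered by this sketch.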
3.7 mips16 operation code bit encoding
this section describes the encoding of the major and minor operation codes. table 3-5 shows the bit encoding of the mips16 major operation code. tables 3-6 to 3-11 show the bit encoding of the minor operation codes. the italic operation codes in the tables are instructions for the extended isa.

table 3-5. bit encoding of major operation code (op)

instruction      instruction bits [13:11]
bits [15:14]     000       001       010     011      100     101     110      111
00               addiusp   addiupc   b       jal(x)   beqz    bnez    shift    ld
                 (note 1)  (note 2)          (note 3)
01               rri-a     addiu8    slti    sltiu    i8      li      cmpi     sd
                           (note 4)
10               lb        lh        lwsp    lw       lbu     lhu     lwpc     lwu
11               sb        sh        swsp    sw       rrr     rr      extend   i64

notes
1. addiusp: addiu rx, sp, immediate
2. addiupc: addiu rx, pc, immediate
3. jal(x): jal instruction and jalx instruction
4. addiu8: addiu rx, immediate

table 3-6. rr minor operation code (rr-type instruction)

instruction    instruction bits [2:0]
bits [4:3]     000        001       010    011     100     101      110     111
00             j(al)r     *         slt    sltu    sllv    break    srlv    srav
               (note 1)
01             dsrl       syscall   cmp    neg     and     or       xor     not
               (note 2)
10             mfhi       *         mflo   dsra    dsllv   *        dsrlv   dsrav
                                           (note 2)
11             mult       multu     div    divu    dmult   dmultu   ddiv    ddivu

notes
1. j(al)r: jr rx instruction (ry = 000)
           jr ra instruction (ry = 001, rx = 000)
           jalr ra, rx instruction (ry = 010)
2. dsrl and dsra use the rx register field to encode the shift count (a value of 0 means an 8-bit shift). in the case of the extended version of these two instructions, the ext-shift64 format is used. only these two rr instructions can be extended.

remark the symbol in the tables has the following meaning.
* : execution of an operation code marked with an asterisk on the current v r 4100 series causes a reserved instruction exception to be generated. this code is reserved for future extension.
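Table 3-5 amounts to a lookup on bits 15:11 of the instruction halfword, with bits 15:14 selecting the row and bits 13:11 the column. A minimal sketch, with hypothetical names and with the i8 and i64 entries spelled as in the instruction format names of 3.6:

```python
# Sketch of a table 3-5 lookup. Names are illustrative only.
# Entries are listed row by row (bits 15:14), column by column (bits 13:11),
# so the tuple index equals the 5-bit value of bits 15:11.
MAJOR_OPCODES = (
    "addiusp", "addiupc", "b", "jal(x)", "beqz", "bnez", "shift", "ld",
    "rri-a", "addiu8", "slti", "sltiu", "i8", "li", "cmpi", "sd",
    "lb", "lh", "lwsp", "lw", "lbu", "lhu", "lwpc", "lwu",
    "sb", "sh", "swsp", "sw", "rrr", "rr", "extend", "i64",
)


def major_opcode(halfword: int) -> str:
    """Return the table 3-5 entry for a 16-bit MIPS16 instruction halfword."""
    return MAJOR_OPCODES[(halfword >> 11) & 0x1F]
```

For instance, any halfword whose top five bits are 11110 decodes as extend, and the halfword that follows it is the instruction being extended.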
table 3-7. rrr minor operation code (rrr-type instruction)

instruction bits [1:0]
00       01      10      11
daddu    addu    dsubu   subu

table 3-8. rri-a minor operation code (rri-type add instruction)

instruction bit [4]
0                  1
addiu (note 1)     daddiu (note 2)

notes
1. addiu: addiu ry, rx, immediate
2. daddiu: daddiu ry, rx, immediate

table 3-9. shift minor operation code (shift-type instruction)

instruction bits [1:0]
00     01     10     11
sll    dsll   srl    sra

table 3-10. i8 minor operation code (i8-type instruction)

instruction bits [10:8]
000     001     010        011        100   101        110   111
bteqz   btnez   swrasp     adjsp      *     mov32r     *     movr32
                (note 1)   (note 2)         (note 3)         (note 4)

notes
1. swrasp: sw ra, immediate(sp)
2. adjsp: addiu sp, immediate
3. mov32r: move r32, rz
4. movr32: move ry, r32

remark the symbol in the tables has the following meaning.
* : execution of an operation code marked with an asterisk on the current v r 4100 series causes a reserved instruction exception to be generated. this code is reserved for future extension.
table 3-11. i64 minor operation code (64-bit only, i64-type instruction)

instruction bits [10:8]
000        001        010        011        100        101        110        111
ldsp       sdsp       sdrasp     dadjsp     ldpc       daddiu5    dadiupc    dadiusp
(note 1)   (note 2)   (note 3)   (note 4)   (note 5)   (note 6)   (note 7)   (note 8)

notes
1. ldsp: ld ry, immediate
2. sdsp: sd ry, immediate
3. sdrasp: sd ra, immediate
4. dadjsp: daddiu sp, immediate
5. ldpc: ld ry, immediate
6. daddiu5: daddiu ry, immediate
7. dadiupc: daddiu ry, pc, immediate
8. dadiusp: daddiu ry, sp, immediate
3.8 outline of instructions
this section describes the assembler syntax and defines each instruction. instructions can be divided into the following four types.
- load and store instructions
- computational instructions
- jump and branch instructions
- special instructions

3.8.1 pc-relative instructions
pc-relative instructions are an instruction format first introduced in the mips16 instruction set. mips16 supports four pc-relative instructions, each in both extended (through the extend instruction) and non-extended forms.

load word                                  lw rx, offset(pc)
load doubleword                            ld ry, offset(pc)
add immediate unsigned                     addiu rx, pc, immediate
doubleword add immediate unsigned          daddiu ry, pc, immediate

all these instructions use the pc value of the pc-relative instruction itself, or the pc value of the immediately preceding instruction, as the base for address calculation. the base used in each situation is shown in table 3-12.

table 3-12. base pc address setting

instruction                                                                  base pc value
non-extension pc-relative instruction not located in jump delay slot        pc of instruction
extension pc-relative instruction                                            pc of extend instruction
non-extension pc-relative instruction in jump delay slot of jr or jalr      pc of jr instruction or jalr instruction
non-extension pc-relative instruction in jump delay slot of jal or jalx     pc of initial halfword of jal or jalx (note)

note: because the jal and jalx instruction length is 32 bits.

the pc value used as the base for address calculation in the pc-relative instruction outlines shown in tables 3-14 and 3-15 is called the base pc value. the base pc value is defined so as to be equivalent to the exception program counter (epc) value related to the pc-relative instruction.
3.8.2 extend instruction
the extend instruction extends the immediate fields of mips16 instructions, which have smaller immediate fields than the equivalent 32-bit mips instructions. the extend instruction must always immediately precede the instruction whose immediate field is to be extended. every extended instruction consumes four bytes in program memory instead of two bytes (two bytes for extend and two bytes for the instruction being extended), and it can cross a word boundary.

for example, the mips16 instruction lw ry, offset(rx) contains a five-bit immediate. the immediate expands to 16 bits (000000000 || offset || 00) before execution in the pipeline. this allows 32 different offset values: 0, 4, 8, and so on up through 124. once extended, this instruction can hold any of the normal 65,536 values in the range -32768 through 32767.

shift instructions are extended to 5-bit unsigned immediate values. all other immediate instructions expand to either signed or unsigned 16-bit immediate values. the only exceptions are
addiu ry, rx, immediate
daddiu ry, rx, immediate
which can be extended only to a 15-bit signed immediate.

there is only one restriction. extended instructions must not be placed in jump delay slots. otherwise, the results are unpredictable because the pipeline would attempt to execute only one half of the instruction.

table 3-13 lists the mips16 extendable instructions, the size of their immediate, and how much each immediate can be extended when preceded by the extend instruction. for the instruction format of the extend instruction, see 3.6 instruction format.
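The lw example above (a 5-bit field expanded as 000000000 || offset || 00) can be worked through in a few lines. This sketch uses hypothetical helper names and only models the offset arithmetic, not the instruction encoding.

```python
# Sketch of the lw ry, offset(rx) immediate arithmetic from 3.8.2.
# Helper names are illustrative only.

def lw_offset_unextended(field: int) -> int:
    """Expand the 5-bit offset field: zero-extend and append two 0 bits."""
    if not 0 <= field <= 31:
        raise ValueError("the non-extended lw offset field is 5 bits wide")
    return field << 2


def lw_needs_extend(offset: int) -> bool:
    """True if a byte offset cannot be encoded without an extend instruction."""
    if not -32768 <= offset <= 32767:
        raise ValueError("offset does not fit even in the extended 16-bit form")
    return not (0 <= offset <= 124 and offset % 4 == 0)
```

Offsets such as 4 or 124 fit the two-byte form, while 128, 2, or any negative offset require the four-byte extended form.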
table 3-13. extendable mips16 instructions

mips16 instruction                     mips16 immediate             instruction   extended    instruction
                                                                    format        immediate   format
load byte                              5                            rri           16          ext-rri
load byte unsigned                     5                            rri           16          ext-rri
load halfword                          5                            rri           16          ext-rri
load halfword unsigned                 5                            rri           16          ext-rri
load word                              5                            rri           16          ext-rri
                                       8                            ri            16          ext-ri
load word unsigned                     5                            rri           16          ext-rri
load doubleword                        5                            rri           16          ext-rri
store byte                             5                            rri           16          ext-rri
store halfword                         5                            rri           16          ext-rri
store word                             5 (other)                    rri           16          ext-rri
                                       8 (sw rx, offset(sp))        ri            16          ext-ri
                                       8 (sw ra, offset(sp))        i8            16          ext-i8
store doubleword                       5 (sd ry, offset(rx))        rri           16          ext-rri
                                       8 (other)                    i64           16          ext-i64
load immediate                         8                            ri            16          ext-ri
add immediate unsigned                 4 (addiu ry, rx, imm)        rri-a         15          ext-rri-a
                                       8 (addiu sp, imm)            i8            16          ext-i8
                                       8 (other)                    ri            16          ext-ri
doubleword add immediate unsigned      4 (daddiu ry, rx, imm)       rri-a         15          ext-rri-a
                                       5 (daddiu ry, pc, imm)       ri64          16          ext-ri64
                                       8 (other)                    i64           16          ext-i64
set on less than immediate             8                            ri            16          ext-ri
set on less than immediate unsigned    8                            ri            16          ext-ri
compare immediate                      8                            ri            16          ext-ri
shift left logical                     3                            shift         5           ext-shift
shift right logical                    3                            shift         5           ext-shift
shift right arithmetic                 3                            shift         5           ext-shift
doubleword shift left logical          3                            shift         6           ext-shift
doubleword shift right logical         3                            rr            6           ext-shift64
doubleword shift right arithmetic      3                            rr            6           ext-shift64
branch on equal to zero                8                            ri            16          ext-ri
branch on not equal to zero            8                            ri            16          ext-ri
branch on t equal to zero              8                            i8            16          ext-i8
branch on t not equal to zero          8                            i8            16          ext-i8
branch unconditional                   11                           i             16          ext-i
3.8.3 delay slots

mips16 instructions normally execute in one cycle. however, some instructions have special requirements that must be met to assure optimum instruction flow. these instructions include all load, branch, and multiply/divide instructions.

(1) load delay slots

mips16 operates with delayed loads. this is similar to the method used by 32-bit length instruction sets. if another instruction references the load destination register before the load operation is completed, a one-cycle stall occurs automatically. to assure the best performance, the compiler should always schedule load delay slots as early as possible.

(2) branch delay slots not supported

unlike 32-bit length instructions, mips16 branch instructions have no branch delay slots. if a branch is taken, the instruction that immediately follows the branch (the instruction corresponding to the delay slot of a 32-bit length instruction) is cancelled. there are no restrictions on the instruction that follows a branch instruction, and that instruction is executed only when the branch is not taken. branches, jumps, and extended instructions are permitted in the instruction slot after a branch.

(3) jump delay slots

with mips16, there is a delay of one cycle after each jump instruction. the processor executes the instruction in the jump delay slot before it executes the jump target instruction. two restrictions apply to any instruction placed in the jump delay slot:

1. do not specify a branch or jump in the delay slot.
2. do not specify an extended instruction (32 bits) in the delay slot. doing so will make the results unpredictable.

(4) multiply and divide scheduling

multiply and divide latency depends on the hardware implementation. if an mflo or mfhi instruction references the multiply or divide result registers before the result is ready, the pipeline stalls until the operation is complete and the result is available. however, to assure the best performance, the compiler should always schedule multiply and divide instructions as early as possible.

mips16 requires that all mfhi and mflo instructions be followed by two instructions that do not write to the hi or lo registers. otherwise, the data read by mflo or mfhi will be undefined. an extend instruction counts as one instruction by itself.
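The rule that mfhi and mflo must be followed by two instructions that do not write hi or lo can be checked mechanically. The sketch below is illustrative only: the list-of-strings program representation and the function name are assumptions, and the set of hi/lo-writing mnemonics is taken from the multiply/divide instruction descriptions in table 3-18. An extend prefix is represented as its own list element, so it counts as one instruction.

```python
# Mnemonics that write the hi/lo registers (as listed in table 3-18).
HI_LO_WRITERS = {"mult", "multu", "div", "divu", "dmult", "dmultu", "ddiv", "ddivu"}


def hilo_hazards(program):
    """Return the indices of mfhi/mflo instructions that are followed,
    within the next two instructions, by an instruction that writes hi/lo."""
    hazards = []
    for index, instruction in enumerate(program):
        mnemonic = instruction.split()[0]
        if mnemonic in ("mfhi", "mflo"):
            following = program[index + 1:index + 3]
            if any(f.split()[0] in HI_LO_WRITERS for f in following):
                hazards.append(index)
    return hazards
```

For example, a sequence in which mult appears as the second instruction after mflo is reported, and inserting one more unrelated instruction between them clears the report.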
3.8.4 instruction details

(1) load and store instructions

load and store instructions move data between memory and the general-purpose registers. the only addressing mode supported is base register plus immediate offset.

table 3-14. load and store instructions (1/3)

instruction
    format and description

load byte
    lb ry, offset (rx)
    the 5-bit immediate is zero extended and then added to the contents of general-purpose register rx to form the virtual address. the byte at the memory location specified by the address is sign extended and loaded into general-purpose register ry.

load byte unsigned
    lbu ry, offset (rx)
    the 5-bit immediate is zero extended and then added to the contents of general-purpose register rx to form the virtual address. the byte at the memory location specified by the address is zero extended and loaded into general-purpose register ry.

load halfword
    lh ry, offset (rx)
    the 5-bit immediate is shifted left one bit, zero extended, and then added to the contents of general-purpose register rx to form the virtual address. the halfword at the memory location specified by the address is sign extended and loaded into general-purpose register ry. if the least significant bit of the address is not 0, an address error exception is generated.

load halfword unsigned
    lhu ry, offset (rx)
    the 5-bit immediate is shifted left one bit, zero extended, and then added to the contents of general-purpose register rx to form the virtual address. the halfword at the memory location specified by the address is zero extended and loaded into general-purpose register ry. if the least significant bit of the address is not 0, an address error exception is generated.

load word
    lw ry, offset (rx)
    the 5-bit immediate is shifted left two bits, zero extended, and then added to the contents of general-purpose register rx to form the virtual address. the word at the memory location specified by the address is loaded into general-purpose register ry. in the 64-bit mode, it is further sign extended to 64 bits. if either of the two lower bits of the address is not 0, an address error exception is generated.

    lw rx, offset (pc)
    the two lower bits of the basepc value associated with the instruction are cleared to form the masked basepc value. the 8-bit immediate is shifted left two bits, zero extended, and then added to the masked basepc to form the virtual address. the word at the memory location specified by the address is loaded into general-purpose register rx. in the 64-bit mode, it is further sign extended to 64 bits.

    lw rx, offset (sp)
    the 8-bit immediate is shifted left two bits, zero extended, and then added to the contents of general-purpose register sp to form the virtual address. the word at the memory location specified by the address is loaded into general-purpose register rx. in the 64-bit mode, it is further sign extended to 64 bits. if either of the two lower bits of the address is not 0, an address error exception is generated.
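The address formation for the load word variants can be sketched as follows. This is a simplified model under stated assumptions: the function and exception names are hypothetical, addresses are plain Python integers with no wrap-around at the register width, and only the alignment rule described above is checked.

```python
class AddressError(Exception):
    """Stands in for the address error exception."""


def lw_rx_address(rx, imm5):
    """lw ry, offset(rx): 5-bit immediate, shifted left two bits, added to rx."""
    address = rx + ((imm5 & 0x1F) << 2)
    if address & 0x3:
        raise AddressError(hex(address))
    return address


def lw_pc_address(base_pc, imm8):
    """lw rx, offset(pc): clear the two lower bits of basepc, then add the
    8-bit immediate shifted left two bits."""
    masked_base_pc = base_pc & ~0x3
    return masked_base_pc + ((imm8 & 0xFF) << 2)
```

Because the immediate is always a multiple of 4, an address error in the register-based form can only come from a misaligned base register; the pc-relative form cannot be misaligned because the base is masked first.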
table 3-14. load and store instructions (2/3)

instruction
    format and description

load word unsigned
    lwu ry, offset (rx)
    the 5-bit immediate is shifted left two bits, zero extended to 64 bits, and then added to the contents of general-purpose register rx to form the virtual address. the word at the memory location specified by the address is zero extended and loaded into general-purpose register ry. if either of the two lower bits of the address is not 0, an address error exception is generated. this operation is defined in the 64-bit mode and the 32-bit kernel mode. when this instruction is executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.

load doubleword
    ld ry, offset (rx)
    the 5-bit immediate is shifted left three bits, zero extended to 64 bits, and then added to the contents of general-purpose register rx to form the virtual address. the 64-bit doubleword at the memory location specified by the address is loaded into general-purpose register ry. if any of the three lower bits of the address is not 0, an address error exception is generated. this operation is defined in the 64-bit mode and the 32-bit kernel mode. when this instruction is executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.

    ld ry, offset (pc)
    the three lower bits of the basepc value associated with the instruction are cleared to form the masked basepc value. the 5-bit immediate is shifted left three bits, zero extended to 64 bits, and then added to the masked basepc to form the virtual address. the 64-bit doubleword at the memory location specified by the address is loaded into general-purpose register ry. this operation is defined in the 64-bit mode and the 32-bit kernel mode. when this instruction is executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.

    ld ry, offset (sp)
    the 5-bit immediate is shifted left three bits, zero extended to 64 bits, and then added to the contents of general-purpose register sp to form the virtual address. the 64-bit doubleword at the memory location specified by the address is loaded into general-purpose register ry. if any of the three lower bits of the address is not 0, an address error exception is generated. this operation is defined in the 64-bit mode and the 32-bit kernel mode. when this instruction is executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.
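The mode rule repeated for lwu and ld above can be summarized as a small predicate. This is only a sketch: the function name, the string labels for the operating modes, and the boolean flag for 64-bit mode are assumptions made for illustration, and the instruction set shown covers only the two mnemonics on this page of the table.

```python
# Instructions from this page that are restricted by operating mode.
MODE_RESTRICTED = {"lwu", "ld"}


def raises_reserved_instruction(mnemonic, is_64bit_mode, operating_mode):
    """True if executing the instruction would generate a reserved
    instruction exception: a restricted instruction executed in the
    32-bit user or supervisor mode."""
    if mnemonic not in MODE_RESTRICTED:
        return False
    return (not is_64bit_mode) and operating_mode in ("user", "supervisor")
```

The same predicate applies unchanged to the other instructions in this section that carry the identical mode sentence, by adding their mnemonics to the set.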
table 3-14. load and store instructions (3/3)

instruction
    format and description

store byte
    sb ry, offset (rx)
    the 5-bit immediate is zero extended and then added to the contents of general-purpose register rx to form the virtual address. the least significant byte of general-purpose register ry is stored to the memory location specified by the address.

store halfword
    sh ry, offset (rx)
    the 5-bit immediate is shifted left one bit, zero extended, and then added to the contents of general-purpose register rx to form the virtual address. the lower halfword of general-purpose register ry is stored to the memory location specified by the address. if the least significant bit of the address is not 0, an address error exception is generated.

store word
    sw ry, offset (rx)
    the 5-bit immediate is shifted left two bits, zero extended, and then added to the contents of general-purpose register rx to form the virtual address. the contents of general-purpose register ry are stored to the memory location specified by the address. if either of the two lower bits of the address is not 0, an address error exception is generated.

    sw rx, offset (sp)
    the 8-bit immediate is shifted left two bits, zero extended, and then added to the contents of general-purpose register sp to form the virtual address. the contents of general-purpose register rx are stored to the memory location specified by the address. if either of the two lower bits of the address is not 0, an address error exception is generated.

    sw ra, offset (sp)
    the 8-bit immediate is shifted left two bits, zero extended, and then added to the contents of general-purpose register sp to form the virtual address. the contents of general-purpose register ra are stored to the memory location specified by the address. if either of the two lower bits of the address is not 0, an address error exception is generated.

store doubleword
    sd ry, offset (rx)
    the 5-bit immediate is shifted left three bits, zero extended to 64 bits, and then added to the contents of general-purpose register rx to form the virtual address. the 64 bits of general-purpose register ry are stored to the memory location specified by the address. if any of the three lower bits of the address is not 0, an address error exception is generated. this operation is defined in the 64-bit mode and the 32-bit kernel mode. when this instruction is executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.

    sd ry, offset (sp)
    the 5-bit immediate is shifted left three bits, zero extended to 64 bits, and then added to the contents of general-purpose register sp to form the virtual address. the 64 bits of general-purpose register ry are stored to the memory location specified by the address. if any of the three lower bits of the address is not 0, an address error exception is generated. this operation is defined in the 64-bit mode and the 32-bit kernel mode. when this instruction is executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.

    sd ra, offset (sp)
    the 8-bit immediate is shifted left three bits, zero extended to 64 bits, and then added to the contents of general-purpose register sp to form the virtual address. the 64 bits of general-purpose register ra are stored to the memory location specified by the address. if any of the three lower bits of the address is not 0, an address error exception is generated. this operation is defined in the 64-bit mode and the 32-bit kernel mode. when this instruction is executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.
(2) computational instructions

computational instructions perform arithmetic, logical, and shift operations on values in registers. there are four categories of computational instructions: alu immediate, two-/three-operand register-type, shift, and multiply/divide.

table 3-15. alu immediate instructions (1/2)

instruction
    format and description

load immediate
    li rx, immediate
    the 8-bit immediate is zero extended and loaded into general-purpose register rx.

add immediate unsigned
    addiu ry, rx, immediate
    the 4-bit immediate is sign extended and then added to the contents of general-purpose register rx to form a 32-bit result. the result is placed into general-purpose register ry. no integer overflow exception occurs under any circumstances. in the 64-bit mode, the operand must be a 64-bit value formed by sign-extending a 32-bit value.

    addiu rx, immediate
    the 8-bit immediate is sign extended and then added to the contents of general-purpose register rx to form a 32-bit result. the result is placed into general-purpose register rx. no integer overflow exception occurs under any circumstances. in the 64-bit mode, the operand must be a 64-bit value formed by sign-extending a 32-bit value.

    addiu sp, immediate
    the 8-bit immediate is shifted left three bits, sign extended, and then added to the contents of general-purpose register sp to form a 32-bit result. the result is placed into general-purpose register sp. no integer overflow exception occurs under any circumstances. in the 64-bit mode, the operand must be a 64-bit value formed by sign-extending a 32-bit value.

    addiu rx, pc, immediate
    the two lower bits of the basepc value associated with the instruction are cleared to form the masked basepc value. the 8-bit immediate is shifted left two bits, zero extended, and then added to the masked basepc value to form the virtual address. this address is placed into general-purpose register rx. no integer overflow exception occurs under any circumstances.

    addiu rx, sp, immediate
    the 8-bit immediate is shifted left two bits, zero extended, and then added to the contents of general-purpose register sp to form a 32-bit result. the result is placed into general-purpose register rx. no integer overflow exception occurs under any circumstances. in the 64-bit mode, the operand must be a 64-bit value formed by sign-extending a 32-bit value.
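The difference between the sign-extended and scaled immediates of the addiu forms can be illustrated with a short model. This is a sketch under assumptions: registers are modeled as 32-bit unsigned Python integers, the function names are hypothetical, and the silent wrap-around stands in for "no integer overflow exception occurs".

```python
MASK32 = 0xFFFFFFFF


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


def addiu_ry_rx(rx, imm4):
    """addiu ry, rx, immediate: sign-extended 4-bit immediate added to rx."""
    return (rx + sign_extend(imm4, 4)) & MASK32


def addiu_sp(sp, imm8):
    """addiu sp, immediate: 8-bit immediate, shifted left three bits and
    sign extended, added to sp."""
    return (sp + (sign_extend(imm8, 8) << 3)) & MASK32
```

With this model, the unextended `addiu ry, rx, immediate` covers -8 through 7, and `addiu sp, immediate` adjusts the stack pointer in steps of 8 bytes from -1024 through 1016.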
table 3-15. alu immediate instructions (2/2)

instruction
    format and description

doubleword add immediate unsigned
    daddiu ry, rx, immediate
    the 4-bit immediate is sign extended to 64 bits, and then added to the contents of general-purpose register rx to form a 64-bit result. the result is placed into general-purpose register ry. no integer overflow exception occurs under any circumstances. this operation is defined in the 64-bit mode and the 32-bit kernel mode. when this instruction is executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.

    daddiu ry, immediate
    the 5-bit immediate is sign extended to 64 bits, and then added to the contents of general-purpose register ry to form a 64-bit result. the result is placed into general-purpose register ry. no integer overflow exception occurs under any circumstances. this operation is defined in the 64-bit mode and the 32-bit kernel mode. when this instruction is executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.

    daddiu sp, immediate
    the 8-bit immediate is shifted left three bits, sign extended to 64 bits, and then added to the contents of general-purpose register sp to form a 64-bit result. the result is placed into general-purpose register sp. no integer overflow exception occurs under any circumstances. this operation is defined in the 64-bit mode and the 32-bit kernel mode. when this instruction is executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.

    daddiu ry, pc, immediate
    the two lower bits of the basepc value associated with the instruction are cleared to form the masked basepc value. the 5-bit immediate is shifted left two bits, zero extended, and then added to the masked basepc value to form the virtual address. this address is placed into general-purpose register ry. no integer overflow exception occurs under any circumstances. this operation is defined in the 64-bit mode and the 32-bit kernel mode. when this instruction is executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.

    daddiu ry, sp, immediate
    the 5-bit immediate is shifted left two bits, zero extended to 64 bits, and then added to the contents of general-purpose register sp to form a 64-bit result. this result is placed into general-purpose register ry. no integer overflow exception occurs under any circumstances. this operation is defined in the 64-bit mode and the 32-bit kernel mode. when this instruction is executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.

set on less than immediate
    slti rx, immediate
    the 8-bit immediate is zero extended and subtracted from the contents of general-purpose register rx. considering both quantities as signed integers, if rx is less than the zero-extended immediate, the result is set to 1; otherwise, the result is set to 0. the result is placed into register t ($24).

set on less than immediate unsigned
    sltiu rx, immediate
    the 8-bit immediate is zero extended and subtracted from the contents of general-purpose register rx. considering both quantities as unsigned integers, if rx is less than the zero-extended immediate, the result is set to 1; otherwise, the result is set to 0. the result is placed into register t ($24).

compare immediate
    cmpi rx, immediate
    the 8-bit immediate is zero extended and exclusive-ored in 1-bit units with the contents of general-purpose register rx. the result is placed into register t ($24).
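The three instructions that write register t ($24) can be modeled in a few lines. This is a minimal sketch assuming a 32-bit register model; the function names mirror the mnemonics but are otherwise hypothetical, and each function returns the value that would be placed into t. sltiu is modeled with an unsigned comparison, which is what the instruction name implies.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(value):
    """Reinterpret a 32-bit pattern as a two's complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def slti(rx, imm8):
    """Signed comparison of rx against the zero-extended 8-bit immediate."""
    return 1 if to_signed32(rx) < (imm8 & 0xFF) else 0


def sltiu(rx, imm8):
    """Unsigned comparison of rx against the zero-extended 8-bit immediate."""
    return 1 if (rx & MASK32) < (imm8 & 0xFF) else 0


def cmpi(rx, imm8):
    """Bitwise exclusive or; the result is 0 exactly when rx equals the immediate."""
    return (rx ^ (imm8 & 0xFF)) & MASK32
```

The register pattern 0xFFFFFFFF shows the difference: as a signed value it is -1 and therefore less than 1, while as an unsigned value it is the largest 32-bit number. Because cmpi yields 0 only on equality, its result in t can be tested by the branch on t equal to zero / not equal to zero instructions listed in table 3-13.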
table 3-16. two-/three-operand register type (1/2)

instruction
    format and description

add unsigned
    addu rz, rx, ry
    the contents of general-purpose registers rx and ry are added together to form a 32-bit result. the result is placed into general-purpose register rz. no integer overflow exception occurs under any circumstances. in the 64-bit mode, the operand must be a 64-bit value formed by sign-extending a 32-bit value.

subtract unsigned
    subu rz, rx, ry
    the contents of general-purpose register ry are subtracted from the contents of general-purpose register rx. the 32-bit result is placed into general-purpose register rz. no integer overflow exception occurs under any circumstances. in the 64-bit mode, the operand must be a 64-bit value formed by sign-extending a 32-bit value.

doubleword add unsigned
    daddu rz, rx, ry
    the contents of general-purpose register ry are added to the contents of general-purpose register rx. the 64-bit result is placed into general-purpose register rz. no integer overflow exception occurs under any circumstances. this operation is defined in the 64-bit mode and the 32-bit kernel mode. when this instruction is executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.

doubleword subtract unsigned
    dsubu rz, rx, ry
    the contents of general-purpose register ry are subtracted from the contents of general-purpose register rx. the 64-bit result is placed into general-purpose register rz. no integer overflow exception occurs under any circumstances. this operation is defined in the 64-bit mode and the 32-bit kernel mode. when this instruction is executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.

set on less than
    slt rx, ry
    the contents of general-purpose register ry are subtracted from the contents of general-purpose register rx. considering both quantities as signed integers, if the contents of rx are less than the contents of ry, the result is set to 1; otherwise, the result is set to 0. the result is placed into register t ($24). no integer overflow exception occurs. the comparison is valid even if the subtraction overflows.

set on less than unsigned
    sltu rx, ry
    the contents of general-purpose register ry are subtracted from the contents of general-purpose register rx. considering both quantities as unsigned integers, if the contents of rx are less than the contents of ry, the result is set to 1; otherwise, the result is set to 0. the result is placed into register t ($24). no integer overflow exception occurs. the comparison is valid even if the subtraction overflows.
table 3-16. two-/three-operand register type (2/2)

instruction
    format and description

compare
    cmp rx, ry
    the contents of general-purpose register ry are exclusive-ored with the contents of general-purpose register rx. the result is placed into register t ($24).

negate
    neg rx, ry
    the contents of general-purpose register ry are subtracted from zero to form a 32-bit result. the result is placed in general-purpose register rx.

and
    and rx, ry
    the contents of general-purpose register ry are logically anded with the contents of general-purpose register rx in 1-bit units. the result is placed in general-purpose register rx.

or
    or rx, ry
    the contents of general-purpose register ry are logically ored with the contents of general-purpose register rx. the result is placed in general-purpose register rx.

exclusive or
    xor rx, ry
    the contents of general-purpose register ry are exclusive-ored with the contents of general-purpose register rx in 1-bit units. the result is placed in general-purpose register rx.

not
    not rx, ry
    the contents of general-purpose register ry are inverted in 1-bit units and placed in general-purpose register rx.

move
    move ry, r32
    the contents of general-purpose register r32 are moved to general-purpose register ry. r32 can specify any one of the 32 general-purpose registers.

    move r32, rz
    the contents of general-purpose register rz are moved to general-purpose register r32. r32 can specify any one of the 32 general-purpose registers.
table 3-17. shift instructions (1/2)

instruction
    format and description

shift left logical
    sll rx, ry, immediate
    the 32-bit contents of general-purpose register ry are shifted left, and zeros are inserted into the emptied low-order bits. the 3-bit immediate specifies the shift count. a shift count of 0 is interpreted as a shift count of 8. the result is placed in general-purpose register rx. in the 64-bit mode, the value formed by sign-extending the shifted 32-bit value is stored as the result.

shift right logical
    srl rx, ry, immediate
    the 32-bit contents of general-purpose register ry are shifted right, and zeros are inserted into the emptied high-order bits. the 3-bit immediate specifies the shift count. a shift count of 0 is interpreted as a shift count of 8. the result is placed in general-purpose register rx. in the 64-bit mode, the value formed by sign-extending the shifted 32-bit value is stored as the result.

shift right arithmetic
    sra rx, ry, immediate
    the 32-bit contents of general-purpose register ry are shifted right, and the emptied high-order bits are sign extended. the 3-bit immediate specifies the shift count. a shift count of 0 is interpreted as a shift count of 8. the result is placed in general-purpose register rx. in the 64-bit mode, the value formed by sign-extending the shifted 32-bit value is stored as the result.

shift left logical variable
    sllv ry, rx
    the 32-bit contents of general-purpose register ry are shifted left, and zeros are inserted into the emptied low-order bits. the five low-order bits of general-purpose register rx specify the shift count. the result is placed in general-purpose register ry. in the 64-bit mode, the value formed by sign-extending the shifted 32-bit value is stored as the result.

shift right logical variable
    srlv ry, rx
    the 32-bit contents of general-purpose register ry are shifted right, and zeros are inserted into the emptied high-order bits. the five low-order bits of general-purpose register rx specify the shift count. the result is placed in general-purpose register ry. in the 64-bit mode, the value formed by sign-extending the shifted 32-bit value is stored as the result.

shift right arithmetic variable
    srav ry, rx
    the 32-bit contents of general-purpose register ry are shifted right, and the emptied high-order bits are sign extended. the five low-order bits of general-purpose register rx specify the shift count. the result is placed in general-purpose register ry. in the 64-bit mode, the value formed by sign-extending the shifted 32-bit value is stored as the result.
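The shift-count rule for the immediate shifts in table 3-17, where a 3-bit count of 0 means a shift by 8, is easy to get wrong, so a small model may help. This sketch assumes a 32-bit register model and does not model the extra sign extension performed in the 64-bit mode; the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def shift_count(imm3):
    """Decode the 3-bit shift field: values 1 to 7 are literal, 0 means 8."""
    imm3 &= 0x7
    return 8 if imm3 == 0 else imm3


def shift_left_logical(ry, imm3):
    return (ry << shift_count(imm3)) & MASK32


def shift_right_logical(ry, imm3):
    return (ry & MASK32) >> shift_count(imm3)


def shift_right_arithmetic(ry, imm3):
    value = ry & MASK32
    if value & 0x80000000:          # reinterpret as a negative number
        value -= 1 << 32
    return (value >> shift_count(imm3)) & MASK32
```

An unextended immediate shift therefore covers counts 1 through 8; other counts need the extend prefix, which widens the field to 5 bits according to table 3-13.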
table 3-17. shift instructions (2/2)

instruction
    format and description

doubleword shift left logical
    dsll rx, ry, immediate
    the 64-bit doubleword contents of general-purpose register ry are shifted left, and zeros are inserted into the emptied low-order bits. the 3-bit immediate specifies the shift count. a shift count of 0 is interpreted as a shift count of 8. the 64-bit result is placed in general-purpose register rx. this operation is defined in the 64-bit mode and the 32-bit kernel mode. when this instruction is executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.

doubleword shift right logical
    dsrl ry, immediate
    the 64-bit doubleword contents of general-purpose register ry are shifted right, and zeros are inserted into the emptied high-order bits. the 3-bit immediate specifies the shift count. a shift count of 0 is interpreted as a shift count of 8. this operation is defined in the 64-bit mode and the 32-bit kernel mode. when this instruction is executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.

doubleword shift right arithmetic
    dsra ry, immediate
    the 64-bit doubleword contents of general-purpose register ry are shifted right, and the emptied high-order bits are sign extended. the 3-bit immediate specifies the shift count. a shift count of 0 is interpreted as a shift count of 8. this operation is defined in the 64-bit mode and the 32-bit kernel mode. when this instruction is executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.

doubleword shift left logical variable
    dsllv ry, rx
    the 64-bit doubleword contents of general-purpose register ry are shifted left, and zeros are inserted into the emptied low-order bits. the six low-order bits of general-purpose register rx specify the shift count. the result is placed in general-purpose register ry. this operation is defined in the 64-bit mode and the 32-bit kernel mode. when this instruction is executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.

doubleword shift right logical variable
    dsrlv ry, rx
    the 64-bit doubleword contents of general-purpose register ry are shifted right, and zeros are inserted into the emptied high-order bits. the six low-order bits of general-purpose register rx specify the shift count. the result is placed in general-purpose register ry. this operation is defined in the 64-bit mode and the 32-bit kernel mode. when this instruction is executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.

doubleword shift right arithmetic variable
    dsrav ry, rx
    the 64-bit doubleword contents of general-purpose register ry are shifted right, and the emptied high-order bits are sign extended. the six low-order bits of general-purpose register rx specify the shift count. the result is placed in general-purpose register ry. this operation is defined in the 64-bit mode and the 32-bit kernel mode. when this instruction is executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.
table 3-18. multiply/divide instructions (1/2)

instruction
    format and description

multiply
    mult rx, ry
    the contents of general-purpose registers rx and ry are multiplied, treating both operands as 32-bit two's complement values. no integer overflow exception occurs. in the 64-bit mode, the operand must be a 64-bit value formed by sign-extending a 32-bit value. the low-order 32-bit word of the result is placed in special register lo, and the high-order 32-bit word is placed in special register hi. in the 64-bit mode, each result is sign extended and then stored. if either of the two immediately preceding instructions is mfhi or mflo, the result of execution of these transfer instructions is undefined. to obtain the correct result, insert two or more other instructions between the mfhi, mflo instructions and the mult instruction.

multiply unsigned
    multu rx, ry
    the contents of general-purpose registers rx and ry are multiplied, treating both operands as 32-bit unsigned values. no integer overflow exception occurs. in the 64-bit mode, the operand must be a 64-bit value formed by sign-extending a 32-bit value. the low-order 32-bit word of the result is placed in special register lo, and the high-order 32-bit word is placed in special register hi. in the 64-bit mode, each result is sign extended and then stored. if either of the two immediately preceding instructions is mfhi or mflo, the result of execution of these transfer instructions is undefined. to obtain the correct result, insert two or more other instructions between the mfhi, mflo instructions and the multu instruction.

divide
    div rx, ry
    the contents of general-purpose register rx are divided by the contents of general-purpose register ry, treating both operands as 32-bit two's complement values. no integer overflow exception occurs. the result when the divisor is 0 is undefined. the 32-bit quotient is placed in special register lo, and the 32-bit remainder is placed in special register hi. in the 64-bit mode, the result is sign extended. normally, this instruction is executed after instructions checking for division by zero and overflow. if either of the two immediately preceding instructions is mfhi or mflo, the result of execution of these transfer instructions is undefined. to obtain the correct result, insert two or more other instructions between the mfhi, mflo instructions and the div instruction.

divide unsigned
    divu rx, ry
    the contents of general-purpose register rx are divided by the contents of general-purpose register ry, treating both operands as 32-bit unsigned values. no integer overflow exception occurs. the result when the divisor is 0 is undefined. the 32-bit quotient is placed in special register lo, and the 32-bit remainder is placed in special register hi. in the 64-bit mode, the result is sign extended. normally, this instruction is executed after instructions checking for division by zero. if either of the two immediately preceding instructions is mfhi or mflo, the result of execution of these transfer instructions is undefined. to obtain the correct result, insert two or more other instructions between the mfhi, mflo instructions and the divu instruction.

move from hi
    mfhi rx
    the contents of special register hi are loaded into general-purpose register rx. to ensure correct operation when an interrupt occurs, do not use an instruction that changes the hi register (mult, multu, div, divu, dmult, dmultu, ddiv, ddivu) for the two instructions after the mfhi instruction.
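How mult and multu split a product between hi and lo can be shown with a short model. This is a sketch assuming 32-bit operands held in Python integers; the function names are hypothetical, the return value is the pair (hi, lo), and the sign extension of each half in the 64-bit mode is not modeled.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed32(value):
    """Reinterpret a 32-bit pattern as a two's complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def mult(rx, ry):
    """Signed 32 x 32 multiply; returns (hi, lo) as 32-bit patterns."""
    product = (to_signed32(rx) * to_signed32(ry)) & MASK64
    return product >> 32, product & MASK32


def multu(rx, ry):
    """Unsigned 32 x 32 multiply; returns (hi, lo) as 32-bit patterns."""
    product = (rx & MASK32) * (ry & MASK32)
    return product >> 32, product & MASK32
```

The low word is the same for both instructions; only the high word depends on whether the operands are treated as signed or unsigned, which is why reading only lo after either instruction gives the same 32-bit product.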
chapter 3 mips16 instruction set user?s manual u15509ej2v0um 81 table 3-18. multiply/divide instructions (2/2) instruction format and description move from lo mflo rx the contents of special register lo are loaded into general-purpose register rx. to ensure correct operation when an interrupt occurs, do not use an instruction that changes the hi register (mult, multu, div, divu, dmult, dmultu, ddiv, ddivu) for the two instructions after the mflo instruction. doubleword multiply dmult rx, ry the 64-bit contents of general-purpose register rx and ry are multiplied, treating both operands as two's complement values. no integer overflow exception occurs. the low-order 64 bits of the result are placed in special register lo, and the high-order 64 bits are placed in special register hi. if either of the two immediately preceding instructions is mfhi or mflo, the result of execution of these transfer instructions is undefined. to obtain the correct result, insert two or more other instructions between the mfhi, mflo instructions and the dmult instruction. this operation is defined in the 64-bit mode and the 32-bit kernel mode. when this instruction is executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated. doubleword multiply unsigned dmultu rx, ry the 64-bit contents of general-purpose registers rx and ry are multiplied, treating both operands as unsigned values. no integer overflow exception occurs. the low-order 64 bits of the result are placed in special register lo, and the high-order 64 bits of the result are placed in special register hi. if either of the two immediately preceding instructions is mfhi or mflo, the result of execution of these transfer instructions is undefined. to obtain the correct result, insert two or more other instructions between the mfhi, mflo instructions and the dmultu instruction. this operation is defined in the 64-bit mode and the 32-bit kernel mode. 
When this instruction is executed in the 32-bit User/Supervisor mode, a reserved instruction exception is generated.

Doubleword Divide
DDIV rx, ry
The 64-bit contents of general-purpose register rx are divided by the contents of general-purpose register ry, treating both operands as two's complement values. No integer overflow exception occurs. The result when the divisor is 0 is undefined. The 64-bit quotient is placed in special register LO, and the 64-bit remainder is placed in special register HI. Normally, this instruction is executed after instructions checking for division by zero and overflow. If either of the two immediately preceding instructions is MFHI or MFLO, the result of execution of these transfer instructions is undefined. To obtain the correct result, insert two or more other instructions between the MFHI, MFLO instructions and the DDIV instruction. This operation is defined in the 64-bit mode and the 32-bit Kernel mode. When this instruction is executed in the 32-bit User/Supervisor mode, a reserved instruction exception is generated.

Doubleword Divide Unsigned
DDIVU rx, ry
The 64-bit contents of general-purpose register rx are divided by the contents of general-purpose register ry, treating both operands as unsigned values. No integer overflow exception occurs. The result when the divisor is 0 is undefined. The 64-bit quotient is placed in special register LO, and the 64-bit remainder is placed in special register HI. Normally, this instruction is executed after an instruction checking for division by zero. If either of the two immediately preceding instructions is MFHI or MFLO, the result of execution of these transfer instructions is undefined. To obtain the correct result, insert two or more other instructions between the MFHI, MFLO instructions and the DDIVU instruction. This operation is defined in the 64-bit mode and the 32-bit Kernel mode.
When this instruction is executed in the 32-bit User/Supervisor mode, a reserved instruction exception is generated.
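The spacing rule repeated throughout Table 3-18 (no HI/LO-modifying instruction within the two instructions after an MFHI or MFLO) lends itself to a simple static check. The following is a minimal sketch, not a vendor tool: the function name `find_hilo_hazards` and the representation of a program as a plain list of mnemonics are assumptions made for illustration, and only straight-line code is considered (jumps and branches are ignored).

```python
# Instructions that change HI/LO, as listed in Table 3-18.
HILO_WRITERS = {"mult", "multu", "div", "divu", "dmult", "dmultu", "ddiv", "ddivu"}
# Transfer instructions whose result becomes undefined if the rule is broken.
HILO_READERS = {"mfhi", "mflo"}


def find_hilo_hazards(program):
    """Return (reader_index, writer_index) pairs that break the spacing rule.

    `program` is a list of mnemonic strings in execution order. This is a
    hypothetical, simplified representation: operands are ignored.
    """
    hazards = []
    for i, op in enumerate(program):
        if op.lower() not in HILO_READERS:
            continue
        # Inspect the two instructions immediately after the MFHI/MFLO.
        for j in (i + 1, i + 2):
            if j < len(program) and program[j].lower() in HILO_WRITERS:
                hazards.append((i, j))
    return hazards


# Only one instruction separates MFLO from DIV: flagged.
too_close = ["mult", "mflo", "addu", "div"]
# Two instructions separate them: accepted.
spaced = ["mult", "mflo", "addu", "addu", "div"]

print(find_hilo_hazards(too_close))
print(find_hilo_hazards(spaced))
```

A check of this kind would typically run over each basic block after instruction scheduling; an empty result means no MFHI/MFLO is followed too closely by a multiply or divide.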
(3) Jump and branch instructions
Jump and branch instructions change the control flow of a program. All jump instructions occur with a one-instruction delay. That is, the instruction immediately following the jump is always executed. Branch instructions do not have a delay slot. If a branch is taken, the instruction immediately following the branch is never executed. If the branch is not taken, the instruction immediately following the branch is always executed. Table 3-19 shows the MIPS16 jump and branch instructions.

Table 3-19. Jump and Branch Instructions (1/2)
Instruction / Format and Description

Jump and Link
JAL target
The 26-bit target address is shifted left two bits and combined with the high-order four bits of the address of the delay slot. The program unconditionally jumps to this calculated address with a delay of one instruction. The address of the instruction immediately following the delay slot is placed in register ra. The ISA mode bit is left unchanged. The value stored in ra bit 0 will reflect the current ISA mode bit.

Jump and Link Exchange
JALX target
The 26-bit target address is shifted left two bits and combined with the high-order four bits of the address of the delay slot. The program unconditionally jumps to this calculated address with a delay of one instruction. The address of the instruction immediately following the delay slot is placed in register ra. The ISA mode bit is inverted with a delay of one instruction. The value stored in ra bit 0 will reflect the ISA mode bit before execution of the jump.

Jump Register
JR rx
The program unconditionally jumps to the address specified in general-purpose register rx, with a delay of one instruction. The instruction sets the ISA mode bit to the value in rx bit 0.
If the jump target address is in the MIPS16 instruction length mode, no address exception occurs when bit 0 of the source register is 1, because bit 0 of the target address is 0 and the instruction is therefore located on a halfword boundary. If the mode is changed to the 32-bit length instruction mode, an address exception occurs when the jump target address is fetched if the two low-order bits of the target address are not 0.

Jump Register
JR ra
The program unconditionally jumps to the address specified in register ra, with a delay of one instruction. The instruction sets the ISA mode bit to the value in ra bit 0. If the jump target address is in the MIPS16 instruction length mode, no address exception occurs when bit 0 of the source register is 1, because bit 0 of the target address is 0 and the instruction is therefore located on a halfword boundary. If the mode is changed to the 32-bit length instruction mode, an address exception occurs when the jump target address is fetched if the two low-order bits of the target address are not 0.

Jump and Link Register
JALR ra, rx
The program unconditionally jumps to the address contained in register rx, with a delay of one instruction. This instruction sets the ISA mode bit to the value in rx bit 0. The address of the instruction immediately following the delay slot is placed in register ra. The value stored in ra bit 0 will reflect the ISA mode bit before the jump is executed. If the jump target address is in the MIPS16 instruction length mode, no address exception occurs when bit 0 of the source register is 1, because bit 0 of the target address is 0 and the instruction is therefore located on a halfword boundary. If the mode is changed to the 32-bit length instruction mode, an address exception occurs when the jump target address is fetched if the two low-order bits of the target address are not 0.
Table 3-19. Jump and Branch Instructions (2/2)
Instruction / Format and Description

Branch on Equal to Zero
BEQZ rx, immediate
The 8-bit immediate is shifted left one bit, sign extended, and then added to the address of the instruction after the branch to form the target address. If the contents of general-purpose register rx are equal to zero, the program branches to the target address. No delay slot is generated.

Branch on Not Equal to Zero
BNEZ rx, immediate
The 8-bit immediate is shifted left one bit, sign extended, and then added to the address of the instruction after the branch to form the target address. If the contents of general-purpose register rx are not equal to zero, the program branches to the target address. No delay slot is generated.

Branch on T Equal to Zero
BTEQZ immediate
The 8-bit immediate is shifted left one bit, sign extended, and then added to the address of the instruction after the branch to form the target address. If the contents of special register T ($24) are equal to zero, the program branches to the target address. No delay slot is generated.

Branch on T Not Equal to Zero
BTNEZ immediate
The 8-bit immediate is shifted left one bit, sign extended, and then added to the address of the instruction after the branch to form the target address. If the contents of special register T ($24) are not equal to zero, the program branches to the target address. No delay slot is generated.

Branch Unconditional
B immediate
The 11-bit immediate is shifted left one bit, sign extended, and then added to the address of the instruction after the branch to form the target address. The program branches to the target address unconditionally.

(4) Special instructions
Special instructions unconditionally perform branching to general exception vectors. Special instructions are of the R type. Table 3-20 shows the three special instructions.

Table 3-20. Special Instructions
Instruction / Format and Description

Breakpoint
BREAK immediate
A breakpoint trap occurs, immediately and unconditionally transferring control to the exception handler. By using a 6-bit code area, parameters can be sent to the exception handler. If the exception handler uses this parameter, the contents of memory including the instruction must be loaded as data.

Extend
EXTEND immediate
The 11-bit immediate is combined with the immediate in the next instruction to form a larger immediate equivalent to that of 32-bit MIPS. The EXTEND instruction must always immediately precede the instruction whose immediate field is to be extended. Every extended instruction consumes four bytes in program memory instead of two bytes (two bytes for EXTEND and two bytes for the instruction being extended), and it can cross a word boundary. (For details, see 3.8.2 Extend instruction.)

System Call
SYSCALL
A system call trap occurs, immediately and unconditionally transferring control to the exception handler.
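The target-address rules given in Table 3-19 for JAL/JALX and for the MIPS16 branches can be sketched as plain arithmetic. This is a minimal illustration under stated assumptions: addresses are treated as 32-bit values, and the helper names `sign_extend`, `jal_target`, and `branch_target` are hypothetical, chosen only for this example.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of `value` as a two's complement number."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    value &= mask
    return (value ^ sign) - sign


def jal_target(delay_slot_addr, target26):
    """JAL/JALX: the 26-bit target shifted left two bits, combined with the
    high-order four bits of the delay slot address (32-bit address assumed)."""
    return (delay_slot_addr & 0xF0000000) | ((target26 & 0x3FFFFFF) << 2)


def branch_target(next_instr_addr, immediate, imm_bits=8):
    """BEQZ/BNEZ/BTEQZ/BTNEZ (8-bit immediate) or B (11-bit immediate):
    the immediate is sign extended, scaled by two (one-bit left shift), and
    added to the address of the instruction after the branch."""
    offset = sign_extend(immediate, imm_bits) << 1
    return (next_instr_addr + offset) & 0xFFFFFFFF


# JAL whose delay slot is at 0x80001004, with target field 0x400.
print(hex(jal_target(0x80001004, 0x400)))

# BEQZ with immediate 0xFE (-2): branches back four bytes from the next instruction.
print(hex(branch_target(0x1004, 0xFE)))

# B with the largest positive 11-bit immediate.
print(hex(branch_target(0x1004, 0x3FF, imm_bits=11)))
```

Sign-extending before the shift gives the same offset as shifting first and then sign-extending the widened field, which is the order the table describes.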
Chapter 4 Pipeline

This chapter describes the basic operation of the VR4100 Series processor pipeline, including the delay slots (instructions that follow a branch or load instruction in the pipeline) and the interruptions to the pipeline flow caused by interlocks and exceptions.

4.1 Pipeline Stages
In the VR Series, an instruction execution system called a pipeline is adopted. In the pipeline, instruction execution processing is divided into several stages. Instruction execution is complete when every stage has been passed. When processing of one instruction in one stage of the pipeline is complete, the next instruction enters that stage. When the pipeline is full, as many instructions as there are pipeline stages are being executed simultaneously.
The pipeline clock is called the PClock. Each cycle of the PClock is called a PCycle. Instructions are read in synchronization with the PClock. Each stage of the pipeline is executed in one PCycle. Therefore, executing an instruction requires as many PCycles as the number of pipeline stages. When the required data has not been cached and must instead be fetched from the main memory, the execution requires more cycles than the number of pipeline stages.

4.1.1 VR4121, VR4122, VR4181A
The pipeline of the VR4121, VR4122, or VR4181A has five stages in the MIPS III (32-bit length) instruction mode, or six stages in the MIPS16 (16-bit length) instruction mode. The names and meanings of the stages are as follows.
- IF - Instruction cache fetch
- IT - Instruction translation (in MIPS16 instruction mode only)
- RF - Register fetch
- EX - Execution
- DC - Data cache fetch
- WB - Writeback
Figure 4-1. Pipeline Stages (VR4121, VR4122, VR4181A)
(a) MIPS III instruction mode: IF, RF, EX, DC, WB
(b) MIPS16 instruction mode: IF, IT, RF, EX, DC, WB

Figure 4-2 shows instruction execution in the pipeline. In this figure, a row indicates the execution process of each instruction, and a column indicates the processes executed simultaneously.
Figure 4-2. Instruction Execution in the Pipeline (VR4121, VR4122, VR4181A)
(a) MIPS III instruction mode (5-deep)
(b) MIPS16 instruction mode (6-deep)
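The staircase overlap that Figure 4-2 depicts can be regenerated with a few lines of code. The sketch below assumes an ideal flow with one instruction entering per PCycle and no stalls; the function names `pipeline_diagram` and `format_diagram` are hypothetical.

```python
MIPS3_STAGES = ["IF", "RF", "EX", "DC", "WB"]          # 5-deep
MIPS16_STAGES = ["IF", "IT", "RF", "EX", "DC", "WB"]   # 6-deep


def pipeline_diagram(stages, n_instructions):
    """Return one row per instruction; row[c] is the stage occupied in
    PCycle c, or an empty string when the instruction is not in the pipeline."""
    total_cycles = n_instructions + len(stages) - 1
    rows = []
    for i in range(n_instructions):
        row = [""] * total_cycles
        for s, name in enumerate(stages):
            row[i + s] = name   # instruction i enters IF in PCycle i
        rows.append(row)
    return rows


def format_diagram(rows):
    """Render the rows as fixed-width text, one instruction per line."""
    return "\n".join(
        " ".join(f"{cell:>2}" for cell in row).rstrip() for row in rows
    )


rows = pipeline_diagram(MIPS3_STAGES, 5)
print(format_diagram(rows))

# In the fifth PCycle (index 4) every stage is occupied: the pipeline is full.
print([row[4] for row in rows])
```

Passing `MIPS16_STAGES` instead produces the 6-deep case, where the pipeline becomes full one PCycle later.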
4.1.2 VR4131
The pipeline of the VR4131 employs a 2-way superscalar mechanism that can execute two instructions in the same stage simultaneously. Each pipeline has six stages in the MIPS III (32-bit length) instruction mode, or seven stages in the MIPS16 (16-bit length) instruction mode. The names and meanings of the stages are as follows.
- IF - Instruction cache fetch
- IT - Instruction translation (in MIPS16 instruction mode only)
- RF - Register fetch
- EX - Execution
- DC1 - Data cache fetch
- DC2 - Data read
- WB - Writeback

Figure 4-3. Pipeline Stages (VR4131)
(a) MIPS III instruction mode: IF, RF, EX, DC1, DC2, WB
(b) MIPS16 instruction mode: IF, IT, RF, EX, DC1, DC2, WB

Figure 4-4 shows instruction execution in the pipeline. In this figure, a row indicates the execution process of each instruction, and a column indicates the processes executed simultaneously.
Figure 4-4. Instruction Execution in the Pipeline (VR4131)
(a) MIPS III instruction mode (6-deep)
(b) MIPS16 instruction mode (7-deep)
4.1.3 VR4181
The pipeline of the VR4181 has five stages regardless of the instruction mode. Each stage has two phases: 1 and 2. The names and meanings of the stages are as follows.
- IF - Instruction cache fetch
- RF - Register fetch
- EX - Execution
- DC - Data cache fetch
- WB - Writeback

Figure 4-5. Pipeline Stages (VR4181)
Stages: IF, RF, EX, DC, WB (each divided into phases 1 and 2)

Figure 4-6 shows instruction execution in the pipeline. In this figure, a row indicates the execution process of each instruction, and a column indicates the processes executed simultaneously.

Figure 4-6. Instruction Execution in the Pipeline (VR4181)
(5-deep; each instruction passes through IF1, IF2, RF1, RF2, EX1, EX2, DC1, DC2, WB1, WB2)
4.2 Branch Delay
During pipeline operation in the VR4100 Series, a branch delay occurs when:
- The target address is calculated by a jump instruction
- The branch condition of a branch instruction is met and the logical operation for the branch-destination comparison starts
The instruction location immediately following a jump/branch instruction is referred to as the branch delay slot.

4.2.1 VR4121, VR4122, VR4181A
The instruction address generated at the EX stage of the jump/branch instruction is available in the IF stage two instructions later. In the VR4121, VR4122, and VR4181A, two cycles of branch delay occur in the MIPS III (32-bit length) instruction mode, or three cycles in the MIPS16 (16-bit length) instruction mode, when a branch condition is met. An instruction in the branch delay slot is executed in the MIPS III instruction mode (except for branch likely instructions), but it is discarded in the MIPS16 instruction mode.
Figure 4-7 illustrates the branch delay and the location of the branch delay slot.

Figure 4-7. Branch Delay (VR4121, VR4122, VR4181A)
(a) MIPS III instruction mode
(b) MIPS16 instruction mode
Rows: jump/branch, (branch delay slot), target. The interval between them is marked as the branch delay.
4.2.2 VR4131
The instruction address prefetched at the RF stage of the jump/branch instruction is available in the IF stage two instructions later. Because the VR4131 employs the 2-way superscalar mechanism, the handling of succeeding instructions differs depending on whether the address of the jump/branch instruction is higher or lower than that of the instruction fetched with it in the other way.

(1) MIPS III instruction mode
In the VR4131, two cycles of branch delay occur when a branch condition is met. An instruction in the branch delay slot is executed (except for branch likely instructions).
Figure 4-8 illustrates the branch delay and the location of the branch delay slot.

Figure 4-8. Branch Delay (VR4131, MIPS III Instruction Mode)
(a) When jump/branch has lower address
(b) When jump/branch has higher address
Rows: jump/branch, (branch delay slot), target.
(2) MIPS16 instruction mode
In the VR4131, three cycles of branch delay occur when a branch condition is met. An instruction in the branch delay slot is discarded.
Figure 4-9 illustrates the branch delay and the location of the branch delay slot.

Figure 4-9. Branch Delay (VR4131, MIPS16 Instruction Mode)
(a) When jump/branch has lower address
(b) When jump/branch has higher address
Rows: jump/branch, (branch delay slot), target.
4.2.3 VR4181
The instruction address generated at the RF stage of the jump/branch instruction is available in the IF stage two instructions later. In the VR4181, one cycle of branch delay occurs when a branch condition is met in the MIPS III instruction mode. An instruction in the branch delay slot is executed (except for branch likely instructions). No branch delay due to a branch instruction occurs in the MIPS16 instruction mode. When a branch condition is met, the instruction in the delay slot position is discarded.
Figure 4-10 illustrates the branch delay and the location of the branch delay slot.

Figure 4-10. Branch Delay (VR4181)
Rows: jump/branch, (branch delay slot), target. The interval between them is marked as the branch delay.
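The branch delay figures quoted in 4.2.1 to 4.2.3 can be collected into a small lookup for rough cycle budgeting. This is only a sketch: the table name `BRANCH_DELAY` and the function `branch_delay_cycles` are hypothetical, the numbers are the "when a branch condition is met" values from the text above, and the effect of branch prediction is deliberately ignored.

```python
# Branch delay in PCycles when a branch condition is met (4.2.1 to 4.2.3).
BRANCH_DELAY = {
    "VR4121":  {"MIPS III": 2, "MIPS16": 3},
    "VR4122":  {"MIPS III": 2, "MIPS16": 3},
    "VR4181A": {"MIPS III": 2, "MIPS16": 3},
    "VR4131":  {"MIPS III": 2, "MIPS16": 3},
    # The text states that no branch delay due to a branch instruction
    # occurs on the VR4181 in MIPS16 instruction mode.
    "VR4181":  {"MIPS III": 1, "MIPS16": 0},
}


def branch_delay_cycles(model, mode, taken_branches):
    """Total branch delay cycles for `taken_branches` taken branches.

    In MIPS III mode the delay slot instruction is executed, so not every
    delay cycle is lost work; this function only counts delay cycles.
    """
    return BRANCH_DELAY[model][mode] * taken_branches


print(branch_delay_cycles("VR4181A", "MIPS16", 1000))
print(branch_delay_cycles("VR4181", "MIPS III", 1000))
```

Such an estimate is an upper bound on the delay attributable to taken branches, since it does not model a filled delay slot or a prediction hit.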
4.3 Branch Prediction
The VR4122, VR4131, and VR4181A have a branch prediction mechanism to speed up branch instruction processing.
The VR4122, VR4131, and VR4181A have a fully associative virtual address cache called a branch prediction table. This table holds the history of the branches that have been satisfied recently, using the address of the branch instruction as a tag and the branch destination address as data.
The VR4122, VR4131, and VR4181A reference the branch prediction table when they fetch a branch instruction. If the same branch instruction is in the table (hit), they branch to the branch destination address in the table rather than calculating the branch destination address. If the corresponding branch instruction is not in the table (miss), they recalculate the branch destination address. If the condition of a missed branch instruction is satisfied, that branch instruction and the address of the branch destination are stored in the branch prediction table. New history is written over the entry stored earliest (LRU (least recently used) algorithm).
The branch prediction table of the VR4122 and VR4181A can hold four entries, and that of the VR4131 can hold eight entries.
Whether the branch prediction mechanism is used can be specified with the BP bit of the Config register of CP0. Branch prediction is executed when the BP bit is cleared to 0; it is not executed when the bit is set to 1. The BP bit is cleared to 0 by default.
Branch prediction is not executed in the MIPS16 instruction mode or debug mode. The BP bit is automatically set to 1.
Because the branch prediction table is a virtual address cache, its contents become invalid if the contents of the physical address corresponding to a virtual address change.
Therefore, when performing an operation that rewrites the text area (such as changing the bank or downloading), either disable branch prediction (by setting the BP bit to 1) or clear the history of the branch prediction table immediately beforehand. Clear the history regardless of whether the VR4122, VR4131, or VR4181A operates in the virtual address mode.
The VR4122, VR4131, and VR4181A clear the history of the branch prediction table in the following cases.
- Writing to the EntryHi register
- Writing to the Config register (VR4131 only)
- Execution of the TLBWI instruction
- Execution of the TLBWR instruction
- Execution of the TLBR instruction
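The behavior described in 4.3 (a small fully associative table tagged by branch address, filled when a missed branch is taken, and cleared on certain events) can be modeled in software for experimentation. The following is a behavioral sketch only, not a description of the hardware implementation. The class name `BranchPredictionTable` and its methods are hypothetical, and the replacement policy is one reading of the text: the entry least recently used (by a hit or a store) is overwritten.

```python
from collections import OrderedDict


class BranchPredictionTable:
    """Behavioral model: tag = branch instruction address, data = destination."""

    def __init__(self, entries):
        self.entries = entries        # 4 for VR4122/VR4181A, 8 for VR4131
        self.table = OrderedDict()    # oldest (least recently used) entry first

    def lookup(self, branch_addr):
        """Return the stored destination on a hit, or None on a miss."""
        target = self.table.get(branch_addr)
        if target is not None:
            self.table.move_to_end(branch_addr)   # assumed: a hit refreshes the entry
        return target

    def record_taken(self, branch_addr, target_addr):
        """Store a branch that missed in the table and whose condition was satisfied."""
        if branch_addr not in self.table and len(self.table) >= self.entries:
            self.table.popitem(last=False)        # overwrite the oldest entry
        self.table[branch_addr] = target_addr
        self.table.move_to_end(branch_addr)

    def clear(self):
        """Model of the history clear (for example, on a write to EntryHi)."""
        self.table.clear()


bpt = BranchPredictionTable(entries=4)
for addr in (0x100, 0x104, 0x108, 0x10C):
    bpt.record_taken(addr, addr + 0x40)

bpt.lookup(0x100)                 # hit: 0x100 becomes the most recently used entry
bpt.record_taken(0x110, 0x200)    # table full: the oldest entry (0x104) is replaced

print(bpt.lookup(0x104))          # miss
print(hex(bpt.lookup(0x100)))     # still present
```

Calling `clear()` before rewriting the text area mirrors the precaution stated above; in a strict "stored earliest" reading, the `move_to_end` call in `lookup` would be dropped.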
4.3.1 VR4122, VR4181A
The VR4122 and VR4181A reference the branch prediction table in the IF stage of a branch instruction. If a hit occurs, the instruction at the branch destination address output from the branch prediction table is fetched when the branch condition is decoded in the RF stage. When the branch condition is checked in the EX stage and it has been ascertained that a branch is to occur, the pipeline processing of the instruction at the branch destination continues. If it has been found that a branch is not to occur, the processing of the instruction at the branch destination is stopped, and the instruction following the branch delay slot is fetched in the DC stage.
If it is found that the condition of a branch instruction that missed in the branch prediction table is satisfied and that a branch is to occur, the branch prediction table is updated in the DC stage.
The figure below illustrates the pipeline operation when branch prediction is performed.

Figure 4-11. Pipeline on Branch Prediction (VR4122, VR4181A) (1/2)
(a) When branch prediction misses and no branch is to occur. Rows: branch, (branch delay slot), instruction following branch delay slot.
(b) When branch prediction misses and branch is to occur. Rows: branch, (branch delay slot), target.
Figure 4-11. Pipeline on Branch Prediction (VR4122, VR4181A) (2/2)
(c) When branch prediction hits and no branch is to occur. Rows: branch, (branch delay slot), target, instruction following branch delay slot.
(d) When branch prediction hits and branch is to occur. Rows: branch, (branch delay slot), target.
4.3.2 VR4131
The VR4131 references the branch prediction table in the IF stage of a branch instruction. If a hit occurs, the instruction at the branch destination address output from the branch prediction table is fetched. When the branch condition is checked in the EX stage and it has been ascertained that a branch is to occur, the pipeline processing of the instruction at the branch destination continues. If it has been found that a branch is not to occur, the processing of the instruction at the branch destination is stopped, and the instruction following the branch delay slot is fetched in the DC stage.
If it is found that the condition of a branch instruction that missed in the branch prediction table is satisfied and that a branch is to occur, the branch prediction table is updated in the DC stage.
The figure below illustrates the pipeline operation when branch prediction is performed.

Figure 4-12. Pipeline on Branch Prediction (VR4131, When the Branch Is in the Lower Address) (1/2)
(a) When branch prediction misses and no branch is to occur. Rows: branch, (branch delay slot).
(b) When branch prediction misses and branch is to occur. Rows: branch, (branch delay slot), target.
Figure 4-12. Pipeline on Branch Prediction (VR4131, When the Branch Is in the Lower Address) (2/2)
(c) When branch prediction hits and no branch is to occur. Rows: branch, (branch delay slot), target, instruction following branch delay slot.
(d) When branch prediction hits and branch is to occur. Rows: branch, (branch delay slot), target.
Figure 4-13. Pipeline on Branch Prediction (VR4131, When the Branch Is in the Higher Address) (1/2)
(a) When branch prediction misses and no branch is to occur. Rows: branch, (branch delay slot).
(b) When branch prediction misses and branch is to occur. Rows: branch, (branch delay slot), target.
Figure 4-13. Pipeline on Branch Prediction (VR4131, When the Branch Is in the Higher Address) (2/2)
(c) When branch prediction hits and no branch is to occur. Rows: branch, (branch delay slot), instruction following branch delay slot.
(d) When branch prediction hits and branch is to occur. Rows: branch, (branch delay slot), target.
4.4 Load Delay
The instruction location immediately following a load instruction is referred to as the load delay slot. The instruction in a load delay slot can use the contents of the loaded register; in such cases, however, hardware interlocks insert additional delay cycles. Consequently, scheduling load delay slots can be desirable, both for performance and for VR-Series processor compatibility.
In the VR4121, VR4122, and VR4181A, two cycles of the DC stage are necessary during execution of a load instruction, for reading data from the data cache and for data alignment, and the hardware therefore automatically causes an interlock.

4.5 Instruction Streaming
If a miss occurs in the instruction cache, a cycle to refill instructions from the main memory to the instruction cache is started. At this time, the VR4122, VR4131, and VR4181A continue pipeline processing while writing data (instructions) to the instruction cache and bypassing the data (instructions) to the instruction decoder of the CPU. Therefore, processing can resume earlier from the stall that takes place when a miss occurs in the instruction cache. This instruction data bypassing function is called streaming.
The instruction streaming function is enabled or disabled by the IS bit of the Config register of CP0. Instruction streaming is executed when the IS bit is cleared to 0; it is not executed when the bit is set to 1. The IS bit is cleared to 0 by default. If instruction streaming is not executed, the pipeline is stalled until refilling of the instruction cache has been completed.
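Returning to the load delay slot of 4.4: a scheduler or a review script can flag the places where the instruction right after a load reads the loaded register, since those are the places where an interlock would be inserted. The sketch below is an illustration under assumptions: the instruction representation (mnemonic, destination register, source registers), the `LOADS` set, and the function name `load_use_interlocks` are all hypothetical, and no attempt is made to count the exact number of interlock cycles.

```python
# Load mnemonics considered by this sketch (MIPS III integer loads).
LOADS = {"lb", "lbu", "lh", "lhu", "lw", "lwu", "ld"}


def load_use_interlocks(program):
    """Return the indices of load delay slot instructions that read the
    register written by the immediately preceding load.

    `program` is a list of (mnemonic, dest_reg, source_regs) tuples; this
    representation is an assumption made for the example.
    """
    flagged = []
    for i in range(len(program) - 1):
        op, dest, _ = program[i]
        _, _, next_sources = program[i + 1]
        if op in LOADS and dest in next_sources:
            flagged.append(i + 1)
    return flagged


# The ADDU in the load delay slot uses r2, which the LW is still loading.
unscheduled = [
    ("lw",   "r2", ("r4",)),
    ("addu", "r3", ("r2", "r5")),
    ("addu", "r6", ("r7", "r8")),
]

# Moving the independent ADDU into the load delay slot removes the dependency.
scheduled = [
    ("lw",   "r2", ("r4",)),
    ("addu", "r6", ("r7", "r8")),
    ("addu", "r3", ("r2", "r5")),
]

print(load_use_interlocks(unscheduled))
print(load_use_interlocks(scheduled))
```

Both sequences compute the same values; the second simply fills the load delay slot with an instruction that does not depend on the load.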
4.6 Pipeline Activities
Figure 4-14 shows the activities that can occur during each pipeline stage; Table 4-1 describes these pipeline activities.

Figure 4-14. Pipeline Activities (1/2)
(a) VR4121, VR4122, and VR4181A
(b) VR4131
Rows: instruction fetch, instruction translation and decode, ALU, load/store, branch.
Note (on ITC): In MIPS III instruction mode.
Figure 4-14. Pipeline Activities (2/2)
(c) VR4181
Rows: instruction fetch and decode, ALU, load/store, branch. Each stage is divided into phases 1 and 2.

Table 4-1. Description of Pipeline Activities During Each Stage

Mnemonic   Description
IDC        Instruction cache address decode
ITLB       Instruction address translation
ICA        Instruction cache array access
ITR        Instruction translation
ITC        Instruction tag check
IDEC       Instruction decode
RF         Register operand fetch
BAC        Branch address calculation
EX         Execution stage
DVA        Data virtual address calculation
SA/DSA     Store align
DCA        Data cache address decode/array access
DTLB       Data address translation
DLA        Data cache load align
DTC        Data tag check
DTD        Data transfer to data cache
DCW        Data cache write
WB         Write back to register file

The operation of the pipeline is illustrated by the following examples, which describe how typical instructions are executed. Each instruction is taken through the pipeline, and the operations that occur in each relevant stage are described.
(1) Add instruction (ADD rd, rs, rt)
IF stage: The eleven least-significant bits of the virtual address are used to access the instruction cache. Then the cache index is compared with the page frame number and the cache data is read out. The virtual PC is incremented by 4 so that the next instruction can be fetched.
IT stage: A MIPS16 instruction is translated into a 32-bit length instruction (VR4121, VR4122, VR4131, and VR4181A only).
RF stage: The 2-port register file is addressed with the rs and rt fields, and the register data is valid at the register file output. At the same time, bypass multiplexers select inputs from either the EX- or DC-stage output in addition to the register file output, depending on the need for an operand bypass.
EX stage: The operands flow into the ALU inputs, and the ALU operation is started. The result of the ALU operation is latched into the ALU output latch.
DC stage: This stage is a NOP for this instruction. The data from the output of the EX stage (the ALU) is moved into the output latch of the DC stage.
WB stage: The WB latch feeds the data to the inputs of the register file, which is accessed by the rd field. The data is written into the file.

Figure 4-15. ADD Instruction Pipeline Activities (VR4121, VR4122, VR4181A)
(a) MIPS III instruction mode
(b) MIPS16 instruction mode
Figure 4-16. ADD Instruction Pipeline Activities (VR4131)
(a) MIPS III instruction mode
(b) MIPS16 instruction mode

Figure 4-17. ADD Instruction Pipeline Activities (VR4181)
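The operand bypass mentioned in the RF stage of the ADD example (a multiplexer choosing between the register file output and the EX- or DC-stage output) can be sketched as a selection function. This is a conceptual model only. The name `select_operand` and the tuple representation of stage outputs are hypothetical, and the priority order (EX-stage output first, since it belongs to the more recent instruction) is an assumption that the text does not spell out.

```python
def select_operand(reg, regfile, ex_out=None, dc_out=None):
    """Choose the value supplied for source register `reg`.

    ex_out / dc_out are (dest_reg, value) pairs for the instructions whose
    results sit at the EX-stage and DC-stage outputs, or None if that stage
    holds no register result.
    """
    if ex_out is not None and ex_out[0] == reg:
        return ex_out[1]        # bypass from the EX-stage output
    if dc_out is not None and dc_out[0] == reg:
        return dc_out[1]        # bypass from the DC-stage output
    return regfile[reg]         # no bypass needed: use the register file


regfile = {"r1": 1, "r2": 2, "r3": 3}

# r1 is being produced by the two preceding instructions; the newer result wins.
print(select_operand("r1", regfile, ex_out=("r1", 10), dc_out=("r1", 5)))

# Only the older instruction writes r2, so the DC-stage output is used.
print(select_operand("r2", regfile, ex_out=("r1", 10), dc_out=("r2", 7)))

# r3 has no pending writer, so the register file value is used.
print(select_operand("r3", regfile, ex_out=("r1", 10), dc_out=("r2", 7)))
```

Without such a bypass, a dependent instruction would have to wait until the producer's WB stage had written the register file.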
(2) Jump and link register instruction (JALR rd, rs)
IF stage: Same as the IF stage for the ADD instruction.
IT stage: Same as the IT stage for the ADD instruction (VR4121, VR4122, VR4131, and VR4181A only).
RF stage: The register specified in the rs field is read from the file, and the value read from the rs register is input to the virtual PC latch synchronously. This value is used to fetch an instruction at the jump destination. The value of the virtual PC incremented during the IF stage is incremented again to produce the link address PC + 8 (PC + 4 in MIPS16 instruction mode), where PC is the address of the JALR instruction. The resulting value is the PC to which the program will eventually return. This value is placed in the link output latch of the instruction address unit.
EX stage: The PC + 8 (PC + 4 in MIPS16 instruction mode) value is moved from the link output latch to the output latch of the EX stage.
DC stage: The PC + 8 (PC + 4 in MIPS16 instruction mode) value is moved from the output latch of the EX stage to the output latch of the DC stage.
WB stage: Refer to the ADD instruction. Note that if no value is explicitly provided for rd, register 31 is used as the default. If rd is explicitly specified, it cannot be the same register as that addressed by rs; if it is, the result of executing such an instruction is undefined.

Figure 4-18. JALR Instruction Pipeline Activities (VR4121, VR4122, VR4181A)
(a) MIPS III instruction mode
(b) MIPS16 instruction mode
Figure 4-19. JALR instruction pipeline activities (VR4131)
(a) MIPS III instruction mode
(b) MIPS16 instruction mode

Figure 4-20. JALR instruction pipeline activities (VR4181)
(3) Branch on equal instruction (BEQ rs, rt, offset)

IF stage: Same as the IF stage for the ADD instruction.

IT stage: Same as the IT stage for the ADD instruction (VR4121, VR4122, VR4131, and VR4181A only).

RF stage: The register file is addressed with the rs and rt fields. The two operands are compared bit by bit to determine whether they are equal. If they are equal, the PC is set to PC + target, where target is the sign-extended offset field. If they are not equal, the PC is set to PC + 4.

EX stage: The next PC resulting from the branch comparison is valid at the beginning of instruction fetch.

DC stage: This stage is a NOP for this instruction.

WB stage: This stage is a NOP for this instruction.

Figure 4-21. BEQ instruction pipeline activities (VR4121, VR4122, VR4181A)
(a) MIPS III instruction mode
(b) MIPS16 instruction mode
Figure 4-22. BEQ instruction pipeline activities (VR4131)
(a) MIPS III instruction mode
(b) MIPS16 instruction mode

Figure 4-23. BEQ instruction pipeline activities (VR4181)
(4) Trap if less than instruction (TLT rs, rt)

Remark: The TLT instruction is not included in the MIPS16 instruction set.

IF stage: Same as the IF stage for the ADD instruction.

RF stage: Same as the RF stage for the ADD instruction.

EX stage: The ALU controls are set to perform an A - B operation. The operands flow into the ALU inputs, and the ALU operation is started. The result of the ALU operation is latched into the ALU output latch. The sign bits of the operands and of the ALU output latch are checked to determine whether a less-than condition is true. If this condition is true, a trap exception occurs. The value in the PC register is used as an exception vector value, and from this point on every instruction in the pipeline is invalid.

DC stage: This stage is a NOP for this instruction.

WB stage: If the less-than condition was met in the EX stage, the EPC register is loaded with the value of the PC. The ExcCode field and BD bit of the Cause register are updated appropriately, as is the EXL bit of the Status register. If the less-than condition was not met in the EX stage, no activity occurs in the WB stage.

Figure 4-24. TLT instruction pipeline activities (VR4121, VR4122, VR4181A)
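The EX stage description above says the less-than condition is derived from the sign bits of the two operands and of the A - B result. The manual does not spell out the exact logic, so the following is only a sketch of one common way to do it; the function name and the default 64-bit width are assumptions for the example.

```python
def signed_less_than(a: int, b: int, bits: int = 64) -> bool:
    """Decide a < b (two's complement) from sign bits, as a comparator might.

    If the operand signs differ, the negative operand is the smaller one.
    If they agree, A - B cannot overflow, so the sign of the difference decides.
    """
    mask = (1 << bits) - 1
    a &= mask
    b &= mask
    sign_a = a >> (bits - 1)
    sign_b = b >> (bits - 1)
    sign_diff = ((a - b) & mask) >> (bits - 1)
    if sign_a != sign_b:
        return bool(sign_a)
    return bool(sign_diff)
```

Checking the operand signs first avoids a wrong answer when A - B overflows, which is presumably why the operand sign bits are examined as well as the result.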
Figure 4-25. TLT instruction pipeline activities (VR4131)

Figure 4-26. TLT instruction pipeline activities (VR4181)
(5) Load word instruction (LW rt, offset (base))

IF stage: Same as the IF stage for the ADD instruction.

IT stage: Same as the IT stage for the ADD instruction (VR4121, VR4122, VR4131, and VR4181A only).

RF stage: Same as the RF stage for the ADD instruction. Note that the base field is in the same position as the rs field.

EX stage: Refer to the EX stage for the ADD instruction. For LW, the inputs to the ALU come from GPR[base] through the bypass multiplexer and from the sign-extended offset field. The result of the ALU operation that is latched into the ALU output latch represents the effective virtual address of the operand (DVA).

DC stage: The cache tag field is compared with the page frame number (PFN) field of the TLB entry. After passing through the load aligner, aligned data is placed in the DC output latch.

DC2 stage: After passing through the load aligner, aligned data is placed in the DC2 output latch (VR4121, VR4122, VR4131, and VR4181A only).

WB stage: The cache read data is written into the register file addressed by the rt field.

Figure 4-27. LW instruction pipeline activities (VR4121, VR4122, VR4181A)
(a) MIPS III instruction mode
(b) MIPS16 instruction mode
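The effective address calculation performed in the EX stage for LW (GPR[base] plus the sign-extended offset) can be illustrated as follows. This is a sketch under stated assumptions: it assumes the 16-bit offset field of the MIPS III instruction format, wraps the result to a configurable width, and uses helper names invented for the example.

```python
def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of `value` as a two's complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(base_value: int, offset: int, bits: int = 32) -> int:
    """Model of DVA = GPR[base] + sign_extend(offset), wrapped to `bits` bits."""
    return (base_value + sign_extend16(offset)) & ((1 << bits) - 1)
```

With `base_value = 0x1000`, an offset field of `0x0004` addresses the word above the base, while `0xFFFC` (that is, -4) addresses the word below it.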
Figure 4-28. LW instruction pipeline activities (VR4131)
(a) MIPS III instruction mode
(b) MIPS16 instruction mode

Figure 4-29. LW instruction pipeline activities (VR4181)
(6) Store word instruction (SW rt, offset (base))

IF stage: Same as the IF stage for the ADD instruction.

IT stage: Same as the IT stage for the ADD instruction (VR4121, VR4122, VR4131, and VR4181A only).

RF stage: Same as the RF stage for the LW instruction.

EX stage: Refer to the LW instruction for the calculation of the effective address. From the RF output latch, GPR[rt] is sent through the bypass multiplexer and into the main shifter. The results of the ALU are latched in the output latches.

DC stage: Refer to the LW instruction for a description of the cache access. The store data is aligned.

DC2 stage: Refer to the LW instruction for a description of the cache access (VR4121, VR4122, VR4131, and VR4181A only).

WB stage: If there was a cache hit, the content of the store data output latch is written into the data cache at the appropriate word location. Note that all store instructions use the data cache for two consecutive PCycles. If the following instruction requires use of the data cache, the pipeline is slipped for one PCycle to complete the writing of the aligned store data.

Figure 4-30. SW instruction pipeline activities (VR4121, VR4122, VR4181A)
(a) MIPS III instruction mode
(b) MIPS16 instruction mode
Figure 4-31. SW instruction pipeline activities (VR4131)
(a) MIPS III instruction mode
(b) MIPS16 instruction mode

Figure 4-32. SW instruction pipeline activities (VR4181)
4.7 Interlock and exception

Smooth pipeline flow is interrupted when cache misses or exceptions occur, or when data dependencies are detected. Interruptions handled by hardware, such as cache misses, are referred to as interlocks, while those handled by software are called exceptions. As shown in Figure 4-33, all interlock and exception conditions are collectively referred to as faults.

Figure 4-33. Interlocks, exceptions, and faults
  Faults
    Exceptions (software): abort
    Interlocks (hardware): stall, slip

At each cycle, exception and interlock conditions are checked for all active instructions. Because each exception or interlock condition corresponds to a particular pipeline stage, a condition can be traced back to the particular instruction in the exception/interlock stage, as shown in Table 4-2. For instance, an LDI interlock is raised in the register fetch (RF) stage. Tables 4-3 and 4-4 describe the pipeline interlocks and exceptions listed in Table 4-2.
Table 4-2. Correspondence of pipeline stage to interlock and exception conditions

Stage     Interlock: stall   Interlock: slip      Exception
IF (IT)   -                  -                    IAErr
RF        ITM, ICM           LDI, MDI, SLI, CP0   NMI, ITLB, INTR, IBE, SYSC, BP, CUn, RSVD
EX        -                  -                    Trap, OVF, DAErr
DC        DTM, DCM, DCB      -                    Reset, DTLB, DTMod, WAT, DBE, NMI (VR4131), INTR (VR4131)
WB        -                  -                    -

Remark: In the above table, exception conditions are listed in descending order of priority.
Table 4-3. Pipeline interlock

Interlock   Description
ITM         Instruction TLB miss
ICM         Instruction cache miss
LDI         Load data interlock
MDI         MD busy interlock
SLI         Store-load interlock
CP0         Coprocessor 0 interlock
DTM         Data TLB miss
DCM         Data cache miss
DCB         Data cache busy

Table 4-4. Description of pipeline exception

Exception   Description
IAErr       Instruction address error exception
NMI         Non-maskable interrupt exception
ITLB        ITLB exception
INTR        Interrupt exception
IBE         Instruction bus error exception
SYSC        System call exception
BP          Breakpoint exception
CUn         Coprocessor unusable exception
RSVD        Reserved instruction exception
Trap        Trap exception
OVF         Overflow exception
DAErr       Data address error exception
Reset       Reset exception
DTLB        DTLB exception
DTMod       DTLB modified exception
WAT         Watch exception
DBE         Data bus error exception
4.7.1 Exception conditions

When an exception condition occurs, the relevant instruction and all those that follow it in the pipeline are cancelled. Accordingly, any stall conditions and any later exception conditions that may have referenced this instruction are inhibited; there is no benefit in servicing stalls for a cancelled instruction.

When an exception condition is detected for an instruction, the VR4100 Series kills it and all following instructions. When this instruction reaches the WB stage, the exception flag and various information items are written to the CP0 registers. The current PC is changed to the appropriate exception vector address, and the exception bits of earlier pipeline stages are cleared.

This implementation allows all preceding instructions to complete execution and prevents all subsequent instructions from completing. Thus the value in the EPC is sufficient to restart execution. It also ensures that exceptions are taken in the order of execution; an instruction taking an exception may itself be killed by an instruction further down the pipeline that takes an exception in a later cycle.

Figure 4-34. Exception detection
4.7.2 Stall conditions

Stalls are used to stop the pipeline for conditions detected after the RF stage. When a stall occurs, the processor resolves the condition and then the pipeline continues. Figure 4-35 shows a data cache miss stall, and Figure 4-36 shows a cache instruction stall.

Figure 4-35. Data cache miss stall
<1> Data cache miss
<2> Start moving data cache line to write buffer
<3> Get last word into cache and restart pipeline

If the cache line to be replaced is dirty (the W bit is set), the data is moved to the internal write buffer in the next cycle. The write-back data is returned to memory. The last word of the data is returned to the cache at <3>, and pipelining restarts.

Figure 4-36. Cache instruction stall
<1> Cache instruction start
<2> Cache instruction complete

When the cache instruction enters the DC pipe stage, the pipeline stalls while the cache instruction is executed. The pipeline begins running again when the cache instruction is completed, allowing the instruction fetch to proceed.
4.7.3 Slip conditions

During the RF stage and the EX stage, internal logic determines whether it is possible to start the current instruction in this cycle. If all of the source operands are available (either from the register file or via the internal bypass logic) and all the hardware resources necessary to complete the instruction will be available whenever required, the instruction will "run"; otherwise, the instruction will "slip". Slipped instructions are retried on subsequent cycles until they issue. The back end of the pipeline (stages DC and WB) advances normally during slips in an attempt to resolve the conflict. NOPs are inserted into the bubble in the pipeline. Instructions killed by branch likely instructions, ERET, or exceptions do not cause slips.

Figure 4-37. Load data interlock
(a) VR4121, VR4122, VR4131, VR4181A
(b) VR4181
Instruction sequence shown: load A, load B, add A,B
<1> Detect load data interlock
<2> Get target data

A load data interlock is detected in the RF stage, and the pipeline also slips in that stage. A load data interlock occurs when data fetched by a load instruction, or data moved from the HI, LO, or a CP0 register, is required by the immediately following instruction. The pipeline begins running again at the clock after the target of the load is read from the data cache or from the HI, LO, or CP0 register. The data returned at the end of the DC stage is input into the end of the RF stage, using the bypass multiplexers.
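The detection condition for a load data interlock can be modelled as a simple dependency check between two adjacent instructions. This is a toy sketch, not a description of the actual interlock logic: the `Instr` record, its field names, and the `late_result` flag (standing for a load or a move from HI, LO, or CP0) are all hypothetical.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    name: str
    dest: Optional[int]          # destination register number, if any
    sources: Tuple[int, ...]     # source register numbers
    late_result: bool = False    # True for loads and moves from HI/LO/CP0


def load_data_interlock(prev: Instr, cur: Instr) -> bool:
    """True when `cur` needs the late result of the instruction just before it."""
    if not prev.late_result or prev.dest is None:
        return False
    return prev.dest in cur.sources
```

For the sequence in Figure 4-37, the add that follows "load B" reads B and therefore slips, whereas an instruction that does not read the loaded register would run without slipping.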
Figure 4-38. MD busy interlock
(a) VR4121, VR4122, VR4131, VR4181A
(b) VR4181
Instruction sequence shown: multiply/divide, MFHI/MFLO
<1> Detect MD busy interlock
<2> Get target data

An MD busy interlock occurs when the HI/LO register is required by an MFHI/MFLO instruction before multiply/divide execution has finished. The pipeline begins running again at the clock after multiply/divide execution finishes.

In the VR4121, VR4122, VR4131, and VR4181A, an MD busy interlock is detected in the EX stage, and the pipeline also slips in that stage. The data returned from the HI/LO register at the end of the DC stage is input into the end of the EX stage, using the bypass multiplexer.

In the VR4181, an MD busy interlock is detected in the RF stage, and the pipeline also slips in that stage. The data returned from the HI/LO register at the end of the DC stage is input into the end of the RF stage, using the bypass multiplexer.

A store-load interlock is detected in the EX stage, and the pipeline slips in the RF stage. A store-load interlock occurs when a store instruction followed by a load instruction is detected. The pipeline begins running again one clock later.

A coprocessor 0 interlock is detected in the EX stage, and the pipeline slips in the RF stage. A coprocessor 0 interlock occurs when an MTC0 instruction for the Config or Status register is detected. The pipeline begins running again one clock later.
4.7.4 Bypassing

In some cases, data and conditions produced in the EX, DC, and WB stages of the pipeline are made available to the EX stage (only) through the bypass data path. Operand bypass allows an instruction in the EX stage to continue without having to wait for data or conditions to be written to the register file at the end of the WB stage. Instead, the bypass control unit is responsible for ensuring that data and conditions from later pipeline stages are available at the appropriate time for instructions earlier in the pipeline. The bypass control unit is also responsible for controlling the source and destination register addresses supplied to the register file.
Chapter 5 Memory Management System

The VR4100 Series provides a memory management unit (MMU) which uses a translation lookaside buffer (TLB) to translate virtual addresses into physical addresses. This chapter describes the virtual and physical address spaces, the virtual-to-physical address translation, the operation of the TLB in making these translations, and the CP0 registers that provide the software interface to the TLB.

5.1 Processor modes

5.1.1 Operating mode

The processor has three operating modes, and the accessible address spaces are determined by these modes.
- User mode
- Supervisor mode
- Kernel mode

User and kernel modes are common to all VR-Series processors. Generally, kernel mode is used for executing the operating system, while user mode is used to run application programs. The VR4000 Series and later processors have a third mode, called supervisor mode, which is positioned between user and kernel modes. This mode is used to configure a high-security system.

When an exception occurs, the CPU enters kernel mode and remains in this mode until an exception return instruction (ERET) is executed. The ERET instruction returns the processor to the mode it was in just before the exception occurred.

Access to the kernel address space is allowed when the processor is in kernel mode. Access to the supervisor address space is allowed when the processor is in kernel or supervisor mode. Access to the user address space is allowed in any of the three operating modes.

5.1.2 Addressing mode

In the VR4100 Series, 32- or 64-bit mode is independently selectable for the user, supervisor, and kernel operating modes. A processor in 64-bit mode translates 64-bit addresses and processes data in 64-bit units.
5.2 Translation lookaside buffer (TLB)

Virtual addresses are translated into physical addresses using an on-chip TLB. The on-chip TLB is a fully associative memory that holds 32 entries, each of which maps one odd/even page pair, for 32 page pairs in total. The pages can have five different sizes, 1 K, 4 K, 16 K, 64 K, and 256 K, and the size can be specified for each entry.

When a virtual address is supplied, all 32 TLB entries are checked simultaneously for a match against the virtual address extended with the ASID field stored in the EntryHi register. If there is a virtual address match, or "hit," in the TLB, the physical page number is extracted from the TLB and concatenated with the offset to form the physical address.

If no match occurs (TLB "miss"), an exception is taken and software refills the TLB from the page table resident in memory. The software writes to an entry selected using the Index register or to a random entry indicated in the Random register.

If more than one entry in the TLB matches the virtual address being translated, TLB operations are not performed correctly. In the VR4181, the TLB-shutdown (TS) bit of the Status register is set to 1, and the TLB becomes unusable (an attempt to access the TLB results in a TLB refill exception regardless of whether there is an entry that hits). The TS bit can be cleared only by a reset. The VR4121, VR4122, VR4131, and VR4181A have no TS bit, and their operation is undefined if more than one entry in the TLB matches.

Note that virtual addresses may be converted to physical addresses without using the TLB, depending on the address space that is subject to address translation. For example, address translation for the kseg0 or kseg1 address space does not use mapping. The physical addresses of these address spaces are determined by subtracting the base address of the address space from the virtual addresses.
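The unmapped translation for kseg0 and kseg1 mentioned above amounts to a single subtraction. The sketch below assumes the conventional 32-bit MIPS kernel-mode segment boundaries (kseg0 starting at 0x8000 0000 and kseg1 at 0xA000 0000, each 512 MB); these base addresses are not given in this section, and the function name is invented for the example.

```python
KSEG0_BASE = 0x8000_0000  # assumption: conventional 32-bit MIPS kseg0 base
KSEG1_BASE = 0xA000_0000  # assumption: conventional 32-bit MIPS kseg1 base
KSEG1_END = 0xC000_0000   # assumption: first address past kseg1


def unmapped_physical_address(vaddr: int):
    """Return the physical address for a kseg0/kseg1 virtual address.

    Returns None for addresses outside these two segments, which would
    need the TLB (or raise an address error) instead.
    """
    if KSEG0_BASE <= vaddr < KSEG1_BASE:
        return vaddr - KSEG0_BASE
    if KSEG1_BASE <= vaddr < KSEG1_END:
        return vaddr - KSEG1_BASE
    return None
```

Under these assumptions, a kseg0 address and a kseg1 address that differ only in their segment base refer to the same physical location.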
5.2.1 Format of a TLB entry

Each TLB entry has fields corresponding to the EntryHi, EntryLo0, EntryLo1, and PageMask registers. The formats of the EntryHi, EntryLo0, EntryLo1, and PageMask registers are nearly the same as that of the TLB entry. However, the bit in the EntryHi register that corresponds to the TLB G bit is a reserved bit (0), and the bit in the TLB entry that corresponds to the G bit of the EntryLo register is reserved to 0. For details about the other bits, refer to the descriptions of each register. Figure 5-1 shows the TLB entry formats for both 32- and 64-bit modes.
Figure 5-1. Format of a TLB entry
(a) 32-bit mode: bits 127 to 0; fields MASK, VPN2, G, ASID, and two sets of PFN, C, D, V
(b) 64-bit mode: bits 255 to 0; fields MASK, R, VPN2, G, ASID, and two sets of PFN, C, D, V

5.2.2 Manipulation of TLB

The contents of each TLB entry can be read or written through the EntryHi, EntryLo0, EntryLo1, and PageMask registers with TLB manipulation instructions, as shown in Figure 5-2. An entry specified through the Index register or indicated in the Random register is used as the target.

The TLB must also be initialized and set after reset. Refer to the VR Series Programming Guide Application Note for details about procedures and program examples of initialization.
Figure 5-2. TLB manipulation overview
(The PageMask, EntryHi, EntryLo1, and EntryLo0 registers are transferred to or from the TLB entry, numbered 0 to 31, that is specified by the Index register or the Random register.)

5.2.3 TLB instructions

The instructions used for TLB control are described below. Refer to Chapter 9 for details about each instruction.

(1) Translation lookaside buffer probe (TLBP)
The TLBP instruction loads the Index register with the number of the TLB entry that matches the content of the EntryHi register. If no TLB entry matches, the highest-order bit of the Index register is set.

(2) Translation lookaside buffer read (TLBR)
The TLBR instruction loads the EntryHi, EntryLo0, EntryLo1, and PageMask registers with the content of the TLB entry indicated by the content of the Index register.

(3) Translation lookaside buffer write index (TLBWI)
The TLBWI instruction writes the contents of the EntryHi, EntryLo0, EntryLo1, and PageMask registers to the TLB entry indicated by the content of the Index register.

(4) Translation lookaside buffer write random (TLBWR)
The TLBWR instruction writes the contents of the EntryHi, EntryLo0, EntryLo1, and PageMask registers to the TLB entry indicated by the content of the Random register.

5.2.4 TLB exceptions

If there is no TLB entry that matches the virtual address, a TLB refill exception occurs. If the access control bits (D and V) indicate that the access is not valid, a TLB modified or TLB invalid exception occurs. If the C bits are 010, the retrieved physical address directly accesses main memory, bypassing the cache. See Chapter 6 for details of the TLB miss exception.
5.3 Virtual-to-physical address translation

Converting a virtual address to a physical address begins by comparing the virtual address from the processor with the virtual addresses of all entries in the TLB. One of the following comparisons is performed for the virtual page number (VPN):
- In 32-bit mode, the high-order bits (see Note) of the 32-bit virtual address are compared with the contents of the VPN2 field (virtual page number divided by two) of each TLB entry.
- In 64-bit mode, the high-order bits (see Note) of the 64-bit virtual address are compared with the contents of the VPN2 field (virtual page number divided by two) and the R field of each TLB entry.

Note: The number of bits depends on the page size. The table below shows examples of the high-order bits of the virtual address for page sizes of 256 KB and 1 KB.

Mode          256 KB page             1 KB page
32-bit mode   Bits 31 to 19           Bits 31 to 11
64-bit mode   Bits 63, 62, 39 to 19   Bits 63, 62, 39 to 11

There is a match when an entry has the same VPN field as the virtual address, and either:
- the global (G) bit of the TLB entry is set to 1, or
- the ASID field of the virtual address is the same as the ASID field of the TLB entry.

This match is referred to as a TLB hit. If a TLB entry matches, the physical address and access control bits (C, D, and V) are retrieved from the matching TLB entry. While the V bit of the entry must be set to 1 for a valid address translation to take place, it is not involved in the determination of a matching TLB entry.

The offset is concatenated to the retrieved physical address. The offset, which indicates an address within the page frame, consists of the low-order bits of the virtual address and is output without passing through the TLB.

If there is no match, a TLB refill exception is taken by the processor, and software is allowed to refill the TLB from a page table of virtual/physical addresses in memory.
Figure 5-3 illustrates an outline of the address translation, and Figure 5-4 illustrates the TLB address translation flow.
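The match rule stated in 5.3 (same VPN, and either the G bit set or the same ASID, with the V bit checked only after a match) can be sketched as follows. This is a simplified model: the `TlbEntry` class and its field names are hypothetical, page masks are ignored, and a single V bit stands in for the per-page valid bits of the odd/even pair.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TlbEntry:
    vpn2: int    # virtual page number divided by two
    asid: int    # address space identifier
    g: bool      # global bit: match regardless of ASID
    v: bool      # valid bit (simplified to one bit per entry)


def tlb_hit(entry: TlbEntry, vpn2: int, asid: int) -> bool:
    """A hit needs the same VPN and either G = 1 or the same ASID."""
    return entry.vpn2 == vpn2 and (entry.g or entry.asid == asid)


def translate_outcome(tlb: List[TlbEntry], vpn2: int, asid: int) -> str:
    """Return 'hit', 'tlb invalid', or 'tlb refill' for a lookup."""
    for entry in tlb:
        if tlb_hit(entry, vpn2, asid):
            # The V bit does not take part in matching; it is checked afterwards.
            return "hit" if entry.v else "tlb invalid"
    return "tlb refill"
```

Keeping the V bit out of `tlb_hit` mirrors the statement that V is required for a valid translation but is not involved in determining the matching entry.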
Figure 5-3. Virtual-to-physical address translation
1. The VPN (virtual page number, the high-order bits of the virtual address) is compared with that in the TLB.
2. If there is a match, the PFN (page frame number, the high-order bits of the physical address) is output from the TLB.
3. The offset, which does not pass through the TLB, is concatenated to the PFN.
Figure 5-4. Address translation in TLB
(Flowchart from virtual address input to physical address output. Decisions: user mode?, supervisor mode?, address OK?, mapped address?, 32-bit address?, VPN match?, G bit = 1?, ASID match?, V bit = 1?, write?, D bit = 1?, uncached? Outcomes: address error exception, TLB refill exception, XTLB refill exception, TLB invalid exception, TLB modified exception, access main memory, access cache.)
5.3.1 32-bit mode address translation

Figure 5-5 shows the virtual-to-physical address translation of a 32-bit mode address. The pages can have five different sizes between 1 KB (10 bits) and 256 KB (18 bits), each being 4 times as large as the preceding one in ascending order, that is, 1 K, 4 K, 16 K, 64 K, and 256 K. The figure illustrates two of the possible page sizes: a 1 KB page (10 bits) and a 256 KB page (18 bits).
- Shown at the top of Figure 5-5 is the virtual address space in which the page size is 1 KB and the offset is 10 bits. The 22 bits excluding the ASID field represent the virtual page number (VPN), enabling selection from a page table of 4 M entries.
- Shown at the bottom of Figure 5-5 is the virtual address space in which the page size is 256 KB and the offset is 18 bits. The 14 bits excluding the ASID field represent the VPN, enabling selection from a page table of 16 K entries.

Figure 5-5. 32-bit mode virtual address translation
(Top: virtual address with 4 M (2^22) 1 KB pages; ASID, 22-bit VPN, 10-bit offset. Bottom: virtual address with 16 K (2^14) 256 KB pages; ASID, 14-bit VPN, 18-bit offset. In both cases the VPN is translated by the TLB into the PFN, and the offset is passed unchanged and used for the 32-bit physical address.)

Note: Bits 31 to 29 of the virtual address select the user, supervisor, or kernel address spaces.
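The relationship between page size, offset width, and VPN width described above is simple arithmetic, illustrated below. The helper names are made up for the example, and the `va_bits` parameter (32 by default) is an assumption used to express the width of the translated virtual address.

```python
PAGE_SIZES = {
    "1 KB": 1 << 10,
    "4 KB": 1 << 12,
    "16 KB": 1 << 14,
    "64 KB": 1 << 16,
    "256 KB": 1 << 18,
}


def offset_bits(page_size: int) -> int:
    """Number of low-order address bits that form the in-page offset."""
    if page_size not in PAGE_SIZES.values():
        raise ValueError("unsupported page size")
    return page_size.bit_length() - 1


def vpn_bits(page_size: int, va_bits: int = 32) -> int:
    """Number of virtual page number bits for the given address width."""
    return va_bits - offset_bits(page_size)


def split_address(vaddr: int, page_size: int):
    """Split a virtual address into (virtual page number, offset)."""
    return vaddr >> offset_bits(page_size), vaddr & (page_size - 1)
```

With 1 KB pages this gives a 22-bit VPN (4 M pages), and with 256 KB pages a 14-bit VPN (16 K pages); calling `vpn_bits` with `va_bits=40` reproduces the 30-bit and 22-bit figures quoted for 64-bit mode.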
5.3.2 64-bit mode address translation

Figure 5-6 shows the virtual-to-physical address translation of a 64-bit mode address. The pages can have five different sizes between 1 KB (10 bits) and 256 KB (18 bits), each being 4 times as large as the preceding one in ascending order, that is, 1 K, 4 K, 16 K, 64 K, and 256 K. The figure illustrates two of the possible page sizes: a 1 KB page (10 bits) and a 256 KB page (18 bits).
- Shown at the top of Figure 5-6 is the virtual address space in which the page size is 1 KB and the offset is 10 bits. The 30 bits excluding the ASID field represent the virtual page number (VPN), enabling selection from a page table of 1 G entries.
- Shown at the bottom of Figure 5-6 is the virtual address space in which the page size is 256 KB and the offset is 18 bits. The 22 bits excluding the ASID field represent the VPN, enabling selection from a page table of 4 M entries.

Figure 5-6. 64-bit mode virtual address translation
(Top: virtual address with 1 G (2^30) 1 KB pages; ASID, bits 63 and 62, a field of 0 or -1, 30-bit VPN, 10-bit offset. Bottom: virtual address with 4 M (2^22) 256 KB pages; ASID, bits 63 and 62, a field of 0 or -1, 22-bit VPN, 18-bit offset. In both cases the VPN is translated by the TLB into the PFN, and the offset is passed unchanged and used for the 32-bit physical address.)

Note: Bits 63 and 62 of the virtual address select the user, supervisor, or kernel address spaces.
5.4 Address spaces

The memory management system extends the address space of the CPU by translating large virtual memory addresses into physical addresses. The physical address space of the VR4100 Series is 4 GB, and 32-bit addresses are used. In 32-bit mode, up to 2 GB (2^31 bytes) of virtual address space is provided as the user's area, and 32-bit addresses are used. In 64-bit mode, up to 1 TB (2^40 bytes) is provided as the user's area, and 64-bit addresses are used. For the format of the TLB entry in each mode, refer to 5.2.1.

As shown in Figures 5-5 and 5-6, the virtual address is extended with an address space identifier (ASID), which reduces the frequency of TLB flushing when switching contexts. This 8-bit ASID is in the CP0 EntryHi register, and the global (G) bit is in the EntryLo0 and EntryLo1 registers, described later in this chapter.

5.4.1 User mode virtual address space

During user mode, a 2 GB (2^31 bytes) virtual address space (useg) can be used in 32-bit mode. In 64-bit mode, a 1 TB (2^40 bytes) virtual address space (xuseg) can be used. As shown in Figures 5-5 and 5-6, each virtual address is extended independently as another virtual address by setting an 8-bit address space ID (ASID), to support up to 256 user processes. The contents of the TLB can be retained after context switching by allocating an ASID to each process. useg and xuseg can be referenced via the TLB. Whether the cache is used is determined for each page by the C bit setting in the TLB entry.

The user segment starts at address 0, and the currently active user process resides in either useg (in 32-bit mode) or xuseg (in 64-bit mode). The TLB identically maps all references to useg/xuseg from all modes, and controls cache accessibility.

The processor operates in user mode when the Status register contains the following bit values:
- KSU = 10
- EXL = 0
- ERL = 0

In conjunction with these bits, the UX bit in the Status register selects 32- or 64-bit user mode addressing as follows:
- When UX = 0, 32-bit useg space is selected.
- When UX = 1, 64-bit xuseg space is selected.

Figure 5-7 shows the address mapping for user mode, and Table 5-1 lists the characteristics of each user segment (useg and xuseg).
figure 5-7. user mode address space
[figure: 32-bit mode (note): 0x0000 0000 to 0x7fff ffff is useg (2 gb, tlb mapped); 0x8000 0000 to 0xffff ffff causes an address error. 64-bit mode: 0x0000 0000 0000 0000 to 0x0000 00ff ffff ffff is xuseg (1 tb, tlb mapped); 0x0000 0100 0000 0000 to 0xffff ffff ffff ffff causes an address error.]

note the vr4100 series uses 64-bit addresses within it. when the processor is running in kernel mode, it saves the contents of each register or restores their previous contents to initialize them before switching the context. for 32-bit mode addressing, bit 31 is sign-extended to bits 32 to 63, and the resulting 32 bits are used for addressing. usually, it is impossible for 32-bit mode programs to generate invalid addresses. if context switching occurs and the processor enters kernel mode, however, an attempt may be made to save an address other than the sign-extended 32-bit address mentioned above to a 64-bit register. in this case, user-mode programs are likely to generate an invalid address.

table 5-1. user mode segments

mode    address bit value  ksu  exl  erl  ux  segment name  address range                                   size
32-bit  a31 = 0            10   0    0    0   useg          0x0000 0000 to 0x7fff ffff                      2 gb (2^31 bytes)
64-bit  a(63:40) = 0       10   0    0    1   xuseg         0x0000 0000 0000 0000 to 0x0000 00ff ffff ffff  1 tb (2^40 bytes)
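the address checks in table 5-1 can be expressed as a short python sketch. it only models the address-bit conditions of the table; the function name user_segment and its string return values are hypothetical, and the status register bits other than ux are assumed to already select user mode.

```python
def user_segment(vaddr, ux):
    """Classify a user-mode virtual address following table 5-1.

    ux is the value of the ux bit (0 = 32-bit mode, 1 = 64-bit mode).
    Returns the segment name, or "address error" when the address lies
    outside the user segment.
    """
    if ux == 0:
        vaddr &= 0xFFFFFFFF                    # 32-bit mode address
        # useg requires a31 = 0
        return "useg" if (vaddr >> 31) == 0 else "address error"
    vaddr &= (1 << 64) - 1                     # 64-bit mode address
    # xuseg requires a(63:40) = 0
    return "xuseg" if (vaddr >> 40) == 0 else "address error"


if __name__ == "__main__":
    print(user_segment(0x7FFFFFFF, ux=0))
    print(user_segment(0x80000000, ux=0))
```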
(1) useg (32-bit mode)
in user mode, when ux = 0 in the status register and the most-significant bit of the virtual address is 0, this virtual address space is labeled useg. any attempt to reference an address with the most-significant bit set while in user mode causes an address error exception (see chapter 6 exception processing). the tlb refill exception vector is used for tlb misses.

(2) xuseg (64-bit mode)
in user mode, when ux = 1 in the status register and bits 63 to 40 of the virtual address are all 0, this virtual address space is labeled xuseg. any attempt to reference an address in which any of bits 63 to 40 is 1 causes an address error exception (see chapter 6 exception processing). the xtlb refill exception vector is used for tlb misses.

5.4.2 supervisor mode virtual address space

supervisor mode is designed for layered operating systems in which a true kernel runs in kernel mode and the rest of the operating system runs in supervisor mode.

all of the suseg, sseg, xsuseg, xsseg, and csseg spaces are referenced via the tlb. whether the cache can be used is determined by the c bits of each page's tlb entry.

the processor operates in supervisor mode when the status register contains the following bit values:

• ksu = 01
• exl = 0
• erl = 0

in conjunction with these bits, the sx bit in the status register selects 32- or 64-bit supervisor mode addressing as follows:

• when sx = 0, 32-bit supervisor space is selected.
• when sx = 1, 64-bit supervisor space is selected.

figure 5-8 shows the supervisor mode address space, and table 5-2 lists the characteristics of the supervisor mode segments.
figure 5-8. supervisor mode address space
[figure: 32-bit mode (note):
  0x0000 0000 to 0x7fff ffff  suseg  2 gb, tlb mapped
  0x8000 0000 to 0xbfff ffff  address error
  0xc000 0000 to 0xdfff ffff  sseg   0.5 gb, tlb mapped
  0xe000 0000 to 0xffff ffff  address error
64-bit mode:
  0x0000 0000 0000 0000 to 0x0000 00ff ffff ffff  xsuseg  1 tb, tlb mapped
  0x0000 0100 0000 0000 to 0x3fff ffff ffff ffff  address error
  0x4000 0000 0000 0000 to 0x4000 00ff ffff ffff  xsseg   1 tb, tlb mapped
  0x4000 0100 0000 0000 to 0xffff ffff bfff ffff  address error
  0xffff ffff c000 0000 to 0xffff ffff dfff ffff  csseg   0.5 gb, tlb mapped
  0xffff ffff e000 0000 to 0xffff ffff ffff ffff  address error]

note the vr4100 series uses 64-bit addresses within it. for 32-bit mode addressing, bit 31 is sign-extended to bits 32 to 63, and the resulting 32 bits are used for addressing. usually, it is impossible for 32-bit mode programs to generate invalid addresses. in a base register + offset addressing operation, however, a two's complement overflow may occur, causing an invalid address. note that the result becomes undefined. the two conditions that cause a two's complement overflow are as follows:

• when offset bit 15 is 0, base register bit 31 is 0, and bit 31 of the result of "base register + offset" is 1
• when offset bit 15 is 1, base register bit 31 is 1, and bit 31 of the result of "base register + offset" is 0
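the two overflow conditions in the note can be checked mechanically. the following python sketch models only the 32-bit arithmetic on the base register and the sign-extended 16-bit offset; the function names are hypothetical, and the sketch says nothing about what the processor does with the undefined result.

```python
def sign_extend16(offset):
    """Interpret the low 16 bits of offset as a two's complement value."""
    offset &= 0xFFFF
    return offset - 0x10000 if offset & 0x8000 else offset


def base_plus_offset_overflows(base, offset):
    """Return True when "base register + offset" meets either overflow
    condition listed in the note (sign bits of both operands agree, but
    the sign bit of the 32-bit result differs)."""
    base &= 0xFFFFFFFF
    result = (base + sign_extend16(offset)) & 0xFFFFFFFF
    base31 = base >> 31                # bit 31 of the base register
    off15 = (offset >> 15) & 1         # bit 15 of the offset
    res31 = result >> 31               # bit 31 of the sum
    return base31 == off15 and res31 != base31


if __name__ == "__main__":
    print(base_plus_offset_overflows(0x7FFFFFFF, 0x0001))  # first condition
    print(base_plus_offset_overflows(0x80000000, 0xFFFF))  # second condition
```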
table 5-2. 32-bit and 64-bit supervisor mode segments

mode    address bit value  ksu  exl  erl  sx  segment name  address range                                   size
32-bit  a31 = 0            01   0    0    0   suseg         0x0000 0000 to 0x7fff ffff                      2 gb (2^31 bytes)
32-bit  a(31:29) = 110     01   0    0    0   sseg          0xc000 0000 to 0xdfff ffff                      512 mb (2^29 bytes)
64-bit  a(63:62) = 00      01   0    0    1   xsuseg        0x0000 0000 0000 0000 to 0x0000 00ff ffff ffff  1 tb (2^40 bytes)
64-bit  a(63:62) = 01      01   0    0    1   xsseg         0x4000 0000 0000 0000 to 0x4000 00ff ffff ffff  1 tb (2^40 bytes)
64-bit  a(63:62) = 11      01   0    0    1   csseg         0xffff ffff c000 0000 to 0xffff ffff dfff ffff  512 mb (2^29 bytes)

(1) suseg (32-bit supervisor mode, user space)
when sx = 0 in the status register and the most-significant bit of the virtual address is 0, the suseg virtual address space is selected; it covers 2 gb (2^31 bytes) of the current user address space. the virtual address is extended with the contents of the 8-bit asid field to form a unique virtual address. this mapped space starts at virtual address 0x0000 0000 and runs through 0x7fff ffff.

(2) sseg (32-bit supervisor mode, supervisor space)
when sx = 0 in the status register and the three most-significant bits of the virtual address are 110, the sseg virtual address space is selected; it covers 512 mb (2^29 bytes) of the current supervisor virtual address space. the virtual address is extended with the contents of the 8-bit asid field to form a unique virtual address. this mapped space begins at virtual address 0xc000 0000 and runs through 0xdfff ffff.

(3) xsuseg (64-bit supervisor mode, user space)
when sx = 1 in the status register and bits 63 and 62 of the virtual address are 00, the xsuseg virtual address space is selected; it covers 1 tb (2^40 bytes) of the current user address space. the virtual address is extended with the contents of the 8-bit asid field to form a unique virtual address. this mapped space starts at virtual address 0x0000 0000 0000 0000 and runs through 0x0000 00ff ffff ffff.
(4) xsseg (64-bit supervisor mode, current supervisor space)
when sx = 1 in the status register and bits 63 and 62 of the virtual address are 01, the xsseg virtual address space is selected; it covers 1 tb (2^40 bytes) of the current supervisor virtual address space. the virtual address is extended with the contents of the 8-bit asid field to form a unique virtual address. this mapped space begins at virtual address 0x4000 0000 0000 0000 and runs through 0x4000 00ff ffff ffff.

(5) csseg (64-bit supervisor mode, separate supervisor space)
when sx = 1 in the status register and bits 63 and 62 of the virtual address are 11, the csseg virtual address space is selected; it covers 512 mb (2^29 bytes) of the separate supervisor virtual address space. the virtual address is extended with the contents of the 8-bit asid field to form a unique virtual address. this mapped space begins at virtual address 0xffff ffff c000 0000 and runs through 0xffff ffff dfff ffff.

5.4.3 kernel mode virtual address space

if the status register satisfies any of the following conditions, the processor runs in kernel mode.

• ksu = 00
• exl = 1
• erl = 1

the addressing width in kernel mode varies according to the state of the kx bit of the status register, as follows:

• when kx = 0, 32-bit kernel space is selected.
• when kx = 1, 64-bit kernel space is selected.

the processor enters kernel mode whenever an exception is detected, and it remains in kernel mode until an exception return (eret) instruction is executed and results in erl and/or exl = 0. the eret instruction restores the processor to the mode existing prior to the exception.

kernel mode virtual address space is divided into regions differentiated by the high-order bits of the virtual address, as shown in figure 5-9. table 5-3 lists the characteristics of the 32-bit kernel mode segments, and table 5-4 lists the characteristics of the 64-bit kernel mode segments.
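the mode-selection rules of 5.4.1 to 5.4.3 can be summarized in one small python sketch. the bit positions used below (exl = bit 1, erl = bit 2, ksu = bits 4 to 3, ux = bit 5, sx = bit 6, kx = bit 7) are not given in this section; they are assumed here from the conventional mips iii status register layout and should be checked against the status register description in chapter 6. the function name operating_mode is hypothetical.

```python
# Assumed status register bit positions (not stated in this section).
EXL_BIT, ERL_BIT, KSU_SHIFT = 1, 2, 3
UX_BIT, SX_BIT, KX_BIT = 5, 6, 7


def operating_mode(status):
    """Return (mode, address_width) for a status register value."""
    ksu = (status >> KSU_SHIFT) & 0x3
    exl = (status >> EXL_BIT) & 1
    erl = (status >> ERL_BIT) & 1

    if exl or erl or ksu == 0b00:      # any one condition gives kernel mode
        mode, width_bit = "kernel", KX_BIT
    elif ksu == 0b01:
        mode, width_bit = "supervisor", SX_BIT
    elif ksu == 0b10:
        mode, width_bit = "user", UX_BIT
    else:
        raise ValueError("ksu = 11 is not described in this section")

    width = 64 if (status >> width_bit) & 1 else 32
    return mode, width


if __name__ == "__main__":
    print(operating_mode(0b0001_0000))  # ksu = 10, ux = 0
    print(operating_mode(0b0001_0010))  # exl = 1 overrides ksu
```

note that exl = 1 or erl = 1 forces kernel mode regardless of the ksu field, which is why the kernel test comes first.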
figure 5-9. kernel mode address space
[figure: 32-bit mode (note 1):
  0x0000 0000 to 0x7fff ffff  kuseg  2 gb, tlb mapped
  0x8000 0000 to 0x9fff ffff  kseg0  0.5 gb, tlb unmapped, cacheable (note 2)
  0xa000 0000 to 0xbfff ffff  kseg1  0.5 gb, tlb unmapped, uncached
  0xc000 0000 to 0xdfff ffff  ksseg  0.5 gb, tlb mapped
  0xe000 0000 to 0xffff ffff  kseg3  0.5 gb, tlb mapped
64-bit mode:
  0x0000 0000 0000 0000 to 0x0000 00ff ffff ffff  xkuseg  1 tb, tlb mapped
  0x0000 0100 0000 0000 to 0x3fff ffff ffff ffff  address error
  0x4000 0000 0000 0000 to 0x4000 00ff ffff ffff  xksseg  1 tb, tlb mapped
  0x4000 0100 0000 0000 to 0x7fff ffff ffff ffff  address error
  0x8000 0000 0000 0000 to 0xbfff ffff ffff ffff  xkphys  tlb unmapped (refer to figure 5-10)
  0xc000 0000 0000 0000 to 0xc000 00ff 7fff ffff  xkseg   tlb mapped
  0xc000 00ff 8000 0000 to 0xffff ffff 7fff ffff  address error
  0xffff ffff 8000 0000 to 0xffff ffff 9fff ffff  ckseg0  0.5 gb, tlb unmapped, cacheable (note 2)
  0xffff ffff a000 0000 to 0xffff ffff bfff ffff  ckseg1  0.5 gb, tlb unmapped, uncached
  0xffff ffff c000 0000 to 0xffff ffff dfff ffff  cksseg  0.5 gb, tlb mapped
  0xffff ffff e000 0000 to 0xffff ffff ffff ffff  ckseg3  0.5 gb, tlb mapped]

notes 1. the vr4100 series uses 64-bit addresses within it. for 32-bit mode addressing, bit 31 is sign-extended to bits 32 to 63, and the resulting 32 bits are used for addressing. usually, a 64-bit instruction is used for the program in 32-bit mode. in a base register + offset addressing operation, however, a two's complement overflow may occur, causing an invalid address. note that the result becomes undefined. the two conditions that cause a two's complement overflow are as follows:

• when offset bit 15 is 0, base register bit 31 is 0, and bit 31 of the result of "base register + offset" is 1
• when offset bit 15 is 1, base register bit 31 is 1, and bit 31 of the result of "base register + offset" is 0

2. the k0 field of the config register controls cacheability of kseg0 and ckseg0.
figure 5-10. xkphys area address space
[figure: xkphys is divided into eight blocks of 0x0800 0000 0000 0000 bytes, starting at 0x8000 0000 0000 0000, 0x8800 0000 0000 0000, 0x9000 0000 0000 0000, 0x9800 0000 0000 0000, 0xa000 0000 0000 0000, 0xa800 0000 0000 0000, 0xb000 0000 0000 0000, and 0xb800 0000 0000 0000. in each block, the first 4 gb (block start to block start + 0x0000 0000 ffff ffff) is tlb unmapped, and the remainder of the block (block start + 0x0000 0001 0000 0000 to the end of the block) causes an address error. the 4 gb area starting at 0x9000 0000 0000 0000 is uncached; the other seven 4 gb areas are cacheable.]
table 5-3. 32-bit kernel mode segments

in every row, the status register condition is ksu = 00 or exl = 1 or erl = 1, with kx = 0.

address bit value  segment name  virtual address            physical address           size
a31 = 0            kuseg         0x0000 0000 to 0x7fff ffff  tlb map                    2 gb (2^31 bytes)
a(31:29) = 100     kseg0         0x8000 0000 to 0x9fff ffff  0x0000 0000 to 0x1fff ffff  512 mb (2^29 bytes)
a(31:29) = 101     kseg1         0xa000 0000 to 0xbfff ffff  0x0000 0000 to 0x1fff ffff  512 mb (2^29 bytes)
a(31:29) = 110     ksseg         0xc000 0000 to 0xdfff ffff  tlb map                    512 mb (2^29 bytes)
a(31:29) = 111     kseg3         0xe000 0000 to 0xffff ffff  tlb map                    512 mb (2^29 bytes)

(1) kuseg (32-bit kernel mode, user space)
when kx = 0 in the status register and the most-significant bit of the virtual address is 0, the kuseg virtual address space is selected; it is the current 2 gb (2^31-byte) user address space. the virtual address is extended with the contents of the 8-bit asid field to form a unique virtual address.
references to kuseg are mapped through the tlb. whether the cache can be used is determined by the c bits of each page's tlb entry.
if the erl bit of the status register is 1, the user address space is assigned 2 gb (2^31 bytes) without tlb mapping and becomes unmapped (with virtual addresses being used as physical addresses) and uncached so that the cache error handler can use it. this allows the cache error exception code to operate uncached using r0 as a base register.

(2) kseg0 (32-bit kernel mode, kernel space 0)
when kx = 0 in the status register and the most-significant three bits of the virtual address are 100, the kseg0 virtual address space is selected; it is the current 512 mb (2^29-byte) physical space.
references to kseg0 are not mapped through the tlb; the physical address is obtained by subtracting 0x8000 0000 from the virtual address. the k0 field of the config register controls cacheability (refer to 5.5.8).
(3) kseg1 (32-bit kernel mode, kernel space 1)
when kx = 0 in the status register and the most-significant three bits of the virtual address are 101, the kseg1 virtual address space is selected; it is the current 512 mb (2^29-byte) physical space.
references to kseg1 are not mapped through the tlb; the physical address is obtained by subtracting 0xa000 0000 from the virtual address. caches are disabled for accesses to these addresses, and main memory (or memory-mapped i/o device registers) is accessed directly.

(4) ksseg (32-bit kernel mode, supervisor space)
when kx = 0 in the status register and the most-significant three bits of the virtual address are 110, the ksseg virtual address space is selected; it is the current 512 mb (2^29-byte) virtual address space. the virtual address is extended with the contents of the 8-bit asid field to form a unique virtual address.
references to ksseg are mapped through the tlb. whether the cache can be used is determined by the c bits of each page's tlb entry.

(5) kseg3 (32-bit kernel mode, kernel space 3)
when kx = 0 in the status register and the most-significant three bits of the virtual address are 111, the kseg3 virtual address space is selected; it is the current 512 mb (2^29-byte) kernel virtual space. the virtual address is extended with the contents of the 8-bit asid field to form a unique virtual address.
references to kseg3 are mapped through the tlb. whether the cache can be used is determined by the c bits of each page's tlb entry.
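the five 32-bit kernel mode segments above can be told apart from the top three address bits alone. the python sketch below follows table 5-3 and the subtraction rules given for kseg0 and kseg1; it deliberately ignores the erl = 1 special case for kuseg, and the function name kernel32_segment is hypothetical.

```python
def kernel32_segment(vaddr):
    """Return (segment, physical_address) for a 32-bit kernel mode address.

    physical_address is None for tlb-mapped segments, because the
    translation then depends on the tlb contents.
    """
    vaddr &= 0xFFFFFFFF
    top3 = vaddr >> 29                         # a(31:29)
    if top3 < 0b100:                           # a31 = 0
        return "kuseg", None
    if top3 == 0b100:                          # unmapped, cacheability per k0
        return "kseg0", vaddr - 0x80000000
    if top3 == 0b101:                          # unmapped, uncached
        return "kseg1", vaddr - 0xA0000000
    if top3 == 0b110:
        return "ksseg", None
    return "kseg3", None


if __name__ == "__main__":
    seg, phys = kernel32_segment(0xBFC00000)
    print(seg, hex(phys))
```

kseg0 and kseg1 both map onto physical addresses 0x0000 0000 to 0x1fff ffff, so the same physical location can be reached either cached (through kseg0, depending on k0) or uncached (through kseg1).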
table 5-4. 64-bit kernel mode segments

in every row, the status register condition is ksu = 00 or exl = 1 or erl = 1, with kx = 1.

address bit value               segment name  virtual address                                 physical address           size
a(63:62) = 00                   xkuseg        0x0000 0000 0000 0000 to 0x0000 00ff ffff ffff  tlb map                    1 tb (2^40 bytes)
a(63:62) = 01                   xksseg        0x4000 0000 0000 0000 to 0x4000 00ff ffff ffff  tlb map                    1 tb (2^40 bytes)
a(63:62) = 10                   xkphys        0x8000 0000 0000 0000 to 0xbfff ffff ffff ffff  0x0000 0000 to 0xffff ffff  4 gb (2^32 bytes)
a(63:62) = 11                   xkseg         0xc000 0000 0000 0000 to 0xc000 00ff 7fff ffff  tlb map                    2^40 - 2^31 bytes
a(63:62) = 11, a(63:31) = -1    ckseg0        0xffff ffff 8000 0000 to 0xffff ffff 9fff ffff  0x0000 0000 to 0x1fff ffff  512 mb (2^29 bytes)
a(63:62) = 11, a(63:31) = -1    ckseg1        0xffff ffff a000 0000 to 0xffff ffff bfff ffff  0x0000 0000 to 0x1fff ffff  512 mb (2^29 bytes)
a(63:62) = 11, a(63:31) = -1    cksseg        0xffff ffff c000 0000 to 0xffff ffff dfff ffff  tlb map                    512 mb (2^29 bytes)
a(63:62) = 11, a(63:31) = -1    ckseg3        0xffff ffff e000 0000 to 0xffff ffff ffff ffff  tlb map                    512 mb (2^29 bytes)

(6) xkuseg (64-bit kernel mode, user space)
when kx = 1 in the status register and bits 63 and 62 of the virtual address are 00, the xkuseg virtual address space is selected; it is the 1 tb (2^40-byte) current user address space. the virtual address is extended with the contents of the 8-bit asid field to form a unique virtual address.
references to xkuseg are mapped through the tlb. whether the cache can be used is determined by the c bits of each page's tlb entry.
if the erl bit of the status register is 1, the user address space is assigned 2 gb (2^31 bytes) without tlb mapping and becomes unmapped (with virtual addresses being used as physical addresses) and uncached so that the cache error handler can use it. this allows the cache error exception code to operate uncached using r0 as a base register.

(7) xksseg (64-bit kernel mode, current supervisor space)
when kx = 1 in the status register and bits 63 and 62 of the virtual address are 01, the xksseg address space is selected; it is the 1 tb (2^40-byte) current supervisor address space. the virtual address is extended with the contents of the 8-bit asid field to form a unique virtual address.
references to xksseg are mapped through the tlb. whether the cache can be used is determined by the c bits of each page's tlb entry.
(8) xkphys (64-bit kernel mode, physical spaces)
when kx = 1 in the status register and bits 63 and 62 of the virtual address are 10, the virtual address space is called xkphys and is selected as either cached or uncached. if any of bits 58 to 32 of the address is 1, an attempt to access that address results in an address error.
whether the cache can be used is determined by bits 61 to 59 of the virtual address. table 5-5 shows the cacheability of the eight address spaces.

table 5-5. cacheability and the xkphys address space

bits 61 to 59  cacheability  address range
0              cached        0x8000 0000 0000 0000 to 0x8000 0000 ffff ffff
1              cached        0x8800 0000 0000 0000 to 0x8800 0000 ffff ffff
2              uncached      0x9000 0000 0000 0000 to 0x9000 0000 ffff ffff
3              cached        0x9800 0000 0000 0000 to 0x9800 0000 ffff ffff
4              cached        0xa000 0000 0000 0000 to 0xa000 0000 ffff ffff
5              cached        0xa800 0000 0000 0000 to 0xa800 0000 ffff ffff
6              cached        0xb000 0000 0000 0000 to 0xb000 0000 ffff ffff
7              cached        0xb800 0000 0000 0000 to 0xb800 0000 ffff ffff

(9) xkseg (64-bit kernel mode, kernel spaces)
when kx = 1 in the status register and bits 63 and 62 of the virtual address are 11, the virtual address space is called xkseg and is selected as either of the following:

• kernel virtual space, xkseg, the current kernel virtual space; the virtual address is extended with the contents of the 8-bit asid field to form a unique virtual address. references to xkseg are mapped through the tlb. whether the cache can be used is determined by the c bits of each page's tlb entry.
• one of the four 32-bit kernel compatibility spaces, as described in the next section.
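the xkphys rules in (8) and table 5-5 reduce to three bit-field tests. the python sketch below is an illustration of those tests only; the function name decode_xkphys and the use of None to represent an address error are choices made for this example.

```python
def decode_xkphys(vaddr):
    """Decode an xkphys virtual address.

    Returns (physical_address, cached), or None when the access would
    cause an address error because one of bits 58 to 32 is 1.
    """
    vaddr &= (1 << 64) - 1
    if (vaddr >> 62) != 0b10:
        raise ValueError("not an xkphys address: bits 63 and 62 must be 10")
    if (vaddr >> 32) & 0x7FFFFFF:          # bits 58:32 (27 bits) must be 0
        return None
    attribute = (vaddr >> 59) & 0x7        # bits 61:59 select cacheability
    cached = attribute != 2                # only value 2 is uncached
    return vaddr & 0xFFFFFFFF, cached      # low 32 bits are the physical address


if __name__ == "__main__":
    print(decode_xkphys(0x9000_0000_0000_1000))
    print(decode_xkphys(0x9800_0000_1234_5678))
```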
(10) 64-bit kernel mode compatible spaces (ckseg0, ckseg1, cksseg, and ckseg3)
if the conditions listed below are satisfied in kernel mode, ckseg0, ckseg1, cksseg, or ckseg3 (each 512 mb) is selected as a compatible space according to the state of bits 30 and 29 of the address.

• the kx bit of the status register is 1.
• bits 63 and 62 of the 64-bit virtual address are 11.
• bits 61 to 31 of the virtual address are all 1.

(a) ckseg0
this space is an unmapped region, compatible with the 32-bit mode kseg0 space. the k0 field of the config register controls cacheability and coherency (refer to 5.5.8).

(b) ckseg1
this space is an unmapped and uncached region, compatible with the 32-bit mode kseg1 space.

(c) cksseg
this space is the current supervisor virtual space, compatible with the 32-bit mode ksseg space.
references to cksseg are mapped through the tlb. whether the cache can be used is determined by the c bits of each page's tlb entry.

(d) ckseg3
this space is the current kernel virtual space, compatible with the 32-bit mode kseg3 space.
references to ckseg3 are mapped through the tlb. whether the cache can be used is determined by the c bits of each page's tlb entry.
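the selection of a compatible space can be sketched as follows. the python function below checks the two address conditions listed in (10) (kx = 1 is assumed to hold already) and then uses bits 30 and 29 as an index; the name compat_segment is hypothetical.

```python
COMPAT_SPACES = ("ckseg0", "ckseg1", "cksseg", "ckseg3")


def compat_segment(vaddr):
    """Return the compatible space for a 64-bit kernel mode address,
    or None when the address is not in one of the four spaces."""
    vaddr &= (1 << 64) - 1
    # bits 63 and 62 are 11 and bits 61 to 31 are all 1,
    # i.e. all 33 bits from 63 down to 31 are 1
    if (vaddr >> 31) != (1 << 33) - 1:
        return None
    return COMPAT_SPACES[(vaddr >> 29) & 0x3]   # bits 30 and 29


if __name__ == "__main__":
    print(compat_segment(0xFFFF_FFFF_A000_0000))
```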
5.5 memory management registers

this section describes the cp0 registers that are accessed by the memory management system and software. table 5-6 lists the cp0 registers. for the exception processing registers, refer to chapter 6 exception processing.

table 5-6 cp0 registers

(a) memory management registers        (b) exception processing registers
register name           number         register name                   number
index register          0              context register                4
random register         1              badvaddr register               8
entrylo0 register       2              count register                  9
entrylo1 register       3              compare register                11
pagemask register       5              status register                 12
wired register          6              cause register                  13
entryhi register        10             epc register                    14
prid register           15             watchlo register                18
config register         16             watchhi register                19
lladdr register (note 1) 17            xcontext register               20
taglo register          28             parity error register (note 2)  26
taghi register          29             cache error register (note 2)   27
                                       errorepc register               30

notes 1. this register is defined to maintain compatibility with the vr4000 and vr4400. the content of this register is meaningless in normal operation.
2. this register is defined to maintain compatibility with the vr4100. this register is not used in normal operation.

details about each register are explained below. the parenthesized number in section titles is the register number (refer to 1.2.3).
5.5.1 index register (0)

the index register is a 32-bit, read/write register whose five low-order bits index an entry in the tlb. the most-significant bit of the register shows the success or failure of a tlb probe (tlbp) instruction.
the index register also specifies the tlb entry affected by tlb read (tlbr) or tlb write index (tlbwi) instructions.
the contents of the index register after reset are undefined, so it must be initialized by software.

figure 5-11. index register
[figure: bit 31: p; bits 30 to 5: 0; bits 4 to 0: index]

p : indicates whether probing is successful or not. it is set to 1 if the latest tlbp instruction fails. it is cleared to 0 when the tlbp instruction is successful.
index : specifies an index to a tlb entry that is a target of the tlbr or tlbwi instruction.
0 : reserved for future use. write 0 in a write operation. when this field is read, 0 is read.

5.5.2 random register (1)

the random register is a read-only register. the low-order 5 bits are used in referencing a tlb entry. this register is decremented each time an instruction is executed. the values that can be set in the register are as follows:

• the lower bound is the content of the wired register.
• the upper bound is 31.

the random register specifies the entry in the tlb that is affected by the tlbwr instruction. the register is readable to verify proper operation of the processor.
the random register is set to the value of the upper bound upon cold reset. this register is also set to the upper bound when the wired register is written. figure 5-12 shows the format of the random register.

figure 5-12. random register
[figure: bits 31 to 5: 0; bits 4 to 0: random]

random : tlb random index
0 : reserved for future use. write 0 in a write operation. when this field is read, 0 is read.
5.5.3 entrylo0 (2) and entrylo1 (3) registers

the entrylo register consists of two registers that have identical formats: entrylo0, used for even virtual pages, and entrylo1, used for odd virtual pages. the entrylo0 and entrylo1 registers are both read/write-accessible. they are used to access the built-in tlb. when a tlb read/write operation is carried out, the entrylo0 and entrylo1 registers hold the contents of the low-order 32 bits of tlb entries at even and odd addresses, respectively.
the contents of these registers after reset are undefined, so they must be initialized by software.

figure 5-13. entrylo0 and entrylo1 registers
[figure: entrylo0 and entrylo1 have the same layout.
(a) 32-bit mode: bits 31 to 28: 0; bits 27 to 6: pfn; bits 5 to 3: c; bit 2: d; bit 1: v; bit 0: g
(b) 64-bit mode: bits 63 to 28: 0; bits 27 to 6: pfn; bits 5 to 3: c; bit 2: d; bit 1: v; bit 0: g]

pfn : page frame number; high-order bits of the physical address.
c : specifies the tlb page attribute (see table 5-7).
d : dirty. if this bit is set to 1, the page is marked as dirty and, therefore, writable. this bit is actually a write-protect bit that software can use to prevent alteration of data.
v : valid. if this bit is set to 1, it indicates that the tlb entry is valid; otherwise, a tlb invalid exception (tlbl or tlbs) occurs.
g : global. if this bit is set in both entrylo0 and entrylo1, then the processor ignores the asid during tlb lookup.
0 : reserved for future use. write 0 in a write operation. when this field is read, 0 is read.

the coherency attribute (c) bits specify whether the cache is used when a page is referenced, that is, whether the page attribute is "cached" or "uncached". table 5-7 lists the page attributes selected according to the value in the c bits.
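the field layout in figure 5-13 can be unpacked with shifts and masks. the python sketch below uses the bit positions from the figure (pfn in bits 27 to 6, c in bits 5 to 3, d, v, and g in bits 2, 1, and 0); the function name decode_entrylo and the dictionary keys are hypothetical.

```python
def decode_entrylo(value):
    """Unpack the fields of an entrylo0/entrylo1 register value."""
    return {
        "pfn": (value >> 6) & 0x3FFFFF,  # bits 27:6, page frame number (22 bits)
        "c": (value >> 3) & 0x7,         # bits 5:3, page attribute
        "d": (value >> 2) & 0x1,         # bit 2, dirty (page is writable)
        "v": (value >> 1) & 0x1,         # bit 1, valid
        "g": value & 0x1,                # bit 0, global
    }


if __name__ == "__main__":
    # pfn = 0x1234, c = 3, d = 1, v = 1, g = 0
    print(decode_entrylo((0x1234 << 6) | (3 << 3) | 0b110))
```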
table 5-7. cache algorithm

c bit value  cache algorithm
0            cached
1            cached
2            uncached
3            cached
4            cached
5            cached
6            cached
7            cached

5.5.4 pagemask register (5)

the pagemask register is a read/write register used for reading from or writing to the tlb; it holds a comparison mask that sets the page size for each tlb entry, as shown in table 5-8. page sizes must be from 1 kb to 256 kb.
tlb read and write instructions use this register as either a source or a destination; bits 18 to 11 that are targets of comparison are masked during address translation.
the contents of the pagemask register after reset are undefined, so it must be initialized by software.

figure 5-14. pagemask register
[figure: bits 31 to 19: 0; bits 18 to 11: mask; bits 10 to 0: 0]

mask : page comparison mask, which determines the virtual page size for the corresponding entry.
0 : reserved for future use. write 0 in a write operation. when this field is read, 0 is read.

table 5-8 lists the mask pattern for each page size. if the mask pattern is one not listed below, the tlb behaves unexpectedly.

table 5-8. mask values and page sizes

page size  bit 18  17  16  15  14  13  12  11
1 kb       0       0   0   0   0   0   0   0
4 kb       0       0   0   0   0   0   1   1
16 kb      0       0   0   0   1   1   1   1
64 kb      0       0   1   1   1   1   1   1
256 kb     1       1   1   1   1   1   1   1
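table 5-8 can be turned into a lookup that produces the register value for a given page size and back. the python sketch below places the 8-bit mask in bits 18 to 11 as shown in figure 5-14; the names MASK_FOR_PAGE_SIZE, pagemask_value, and page_size_from_pagemask are hypothetical.

```python
# Mask patterns from table 5-8 (bits 18:11 of the pagemask register).
MASK_FOR_PAGE_SIZE = {
    1 << 10: 0b00000000,    # 1 kb
    1 << 12: 0b00000011,    # 4 kb
    1 << 14: 0b00001111,    # 16 kb
    1 << 16: 0b00111111,    # 64 kb
    1 << 18: 0b11111111,    # 256 kb
}


def pagemask_value(page_size):
    """Return the pagemask register value for a supported page size."""
    return MASK_FOR_PAGE_SIZE[page_size] << 11


def page_size_from_pagemask(value):
    """Return the page size in bytes encoded by a pagemask register value."""
    mask = (value >> 11) & 0xFF
    for size, pattern in MASK_FOR_PAGE_SIZE.items():
        if pattern == mask:
            return size
    raise ValueError("mask pattern not listed in table 5-8")


if __name__ == "__main__":
    print(hex(pagemask_value(1 << 12)))
    print(page_size_from_pagemask(0x7800))
```

rejecting unlisted patterns in software mirrors the warning above that the tlb behaves unexpectedly for any mask not in the table.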
5.5.5 wired register (6)

the wired register is a read/write register that specifies the lower boundary of the random entries of the tlb, as shown in figure 5-15. wired entries cannot be overwritten by a tlbwr instruction. they can, however, be overwritten by a tlbwi instruction. random entries can be overwritten by both instructions.

figure 5-15. positions indicated by the wired register
[figure: tlb entries 0 to 31. entries from 0 up to the wired register value form the range of wired entries; entries from the wired register value up to 31 form the range specified by the random register.]

the wired register is set to 0 upon cold reset. writing this register also sets the random register to the value of its upper bound (see 5.5.2 random register (1)). figure 5-16 shows the format of the wired register.

figure 5-16. wired register
[figure: bits 31 to 5: 0; bits 4 to 0: wired]

wired : tlb wired boundary
0 : reserved for future use. write 0 in a write operation. when this field is read, 0 is read.
5.5.6 entryhi register (10)

the entryhi register is write-accessible. it is used to access the built-in tlb. the entryhi register holds the high-order bits of a tlb entry for tlb read and write operations. if a tlb refill, tlb invalid, or tlb modified exception occurs, the entryhi register holds the high-order bits of the tlb entry. the entryhi register is also set with the virtual page number (vpn2) of the virtual address where the exception occurred and with the asid. see chapter 6 for details of the tlb exceptions.
the asid is used to read from or write to the asid field of the tlb entry. during address translation, it is also compared with the asid of the tlb entry as the asid of the virtual address.
the entryhi register is accessed by the tlbp, tlbwr, tlbwi, and tlbr instructions.
the contents of the entryhi register after reset are undefined, so it must be initialized by software.

figure 5-17. entryhi register
[figure: (a) 32-bit mode: bits 31 to 11: vpn2; bits 10 to 8: 0; bits 7 to 0: asid
(b) 64-bit mode: bits 63 and 62: r; bits 61 to 40: fill; bits 39 to 11: vpn2; bits 10 to 8: 0; bits 7 to 0: asid]

vpn2 : virtual page number divided by two (mapping to two pages)
asid : address space id. an 8-bit asid field that lets multiple processes share the tlb; each process has a distinct mapping of otherwise identical virtual page numbers.
r : space type (00 user, 01 supervisor, 11 kernel). matches bits 63 and 62 of the virtual address.
fill : reserved. ignored on write. when read, returns zero.
0 : reserved for future use. write 0 in a write operation. when this field is read, 0 is read.
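as an illustration of the 32-bit mode format in figure 5-17, the python sketch below composes and decomposes an entryhi value. it assumes that the vpn2 field is simply bits 31 to 11 of the virtual address, as drawn in the figure; the function names make_entryhi32 and decode_entryhi32 are hypothetical.

```python
def make_entryhi32(vaddr, asid):
    """Build a 32-bit mode entryhi value from a virtual address and an asid."""
    if not 0 <= asid <= 0xFF:
        raise ValueError("asid is an 8-bit field")
    vpn2_part = vaddr & 0xFFFFF800     # keep bits 31:11, clear bits 10:0
    return vpn2_part | asid            # bits 10:8 stay 0, bits 7:0 hold the asid


def decode_entryhi32(value):
    """Return (vpn2, asid) from a 32-bit mode entryhi value."""
    return (value >> 11) & 0x1FFFFF, value & 0xFF


if __name__ == "__main__":
    entryhi = make_entryhi32(0x12345678, 0x42)
    print(hex(entryhi), decode_entryhi32(entryhi))
```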
5.5.7 processor revision identifier (prid) register (15)

the 32-bit, read-only processor revision identifier (prid) register contains information identifying the implementation and revision level of the cpu and cp0. figure 5-18 shows the format of the prid register.

figure 5-18. prid register
[figure: bits 31 to 16: 0; bits 15 to 8: imp; bits 7 to 0: rev]

imp : cpu core processor id number (0x0c for the vr4100 series)
rev : cpu core processor revision number
0 : reserved for future use. write 0 in a write operation. when this field is read, 0 is read.

the processor revision number is stored as a value in the form y.x, where y is a major revision number in bits 7 to 4 and x is a minor revision number in bits 3 to 0.
the processor revision number identifies the revision of a cpu core. the major revision number (bits 7 to 4) identifies the vr4100 series processors as follows:

processor  rev field
vr4121     0110 xxxx
vr4122     0111 xxxx (xxxx may be 0010 or less)
vr4131     1000 xxxx
vr4181     0101 xxxx
vr4181a    0111 xxxx (xxxx may be 0011 or greater)

the minor revision number (bits 3 to 0) may differ between processors that have the same name.
there is no guarantee that changes to the cpu core will necessarily be reflected in the prid register, or that changes to the revision number necessarily reflect real cpu core changes. therefore, create programs that do not depend on the processor revision number field.
5.5.8 config register (16)

the config register specifies various configuration options selected on vr4100 series processors.
some configuration options, as defined by the ec, m16, and be fields, are set by the hardware during cold reset and are included in the config register as read-only status bits for the software to access. other configuration options are read/write (ad, ep, and k0 fields) and controlled by software; on cold reset these fields are undefined. since only a subset of the vr4000 series options are available in the vr4100 series, some bits that were variable in the vr4000 series are set to constants (e.g., bits 14 to 13). the config register should be initialized by software before caches are used. figure 5-19 shows the format of the config register.
the contents of writable fields in the config register, except for the is and bp bits, are undefined after reset, so they must be initialized by software.

figure 5-19. config register (1/2)
[figure: 32-bit register; bit labels 31, 30, 28, 27, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 9, 8, 6, 5, 3, 2, 0. fields from bit 31 downward:
(a) vr4121, vr4181: 0, ec, ep, ad, 0, m16, 0, 1, 0, be, 10, cs, ic, dc, 0, k0
(b) vr4122: is, ec, ep, ad, 0, m16, 0, 1, bp, be, 10, cs, ic, dc, 0, ib (bit 4), k0
(c) vr4131, vr4181a: is, ec, ep, ad, 0, m16, 0, 1, bp, be, 10, cs, ic, dc, 0, ib (bit 4), db, k0]

is : instruction streaming function (vr4122, vr4131, vr4181a only)
  0 : on (default value)
  1 : off
ec : system clock ratio (see table 5-9)
ep : transfer data pattern (cache write-back pattern) setting
  0 : dd: 1 word/1 cycle
  others : reserved
ad : accelerate data mode
  0 : vr4000 series compatible mode
  1 : reserved
m16 : mips16 isa mode enable/disable indication (read only)
  0 : mips16 instruction cannot be executed
  1 : mips16 instruction can be executed
be : endian mode of memory and a kernel
  0 : little endian
  1 : big endian (vr4131 only)
chapter 5 memory management system user?s manual u15509ej2v0um 154 figure 5-19. config register (2/2) cs : cache size mode indication (n = ic, dc). fixed to 1 in the v r 4100 series. 0 reserved 1 2 (n+10) bytes ic : instruction cache size indication. 2 (ic+10) bytes in the v r 4100 series (see table 5-10 ). dc : data cache size indication. 2 (dc+10) bytes in the v r 4100 series (see table 5-11 ). ib : instruction cache refill size setting (v r 4122, v r 4131, and v r 4181a only, and fixed to 1 in the v r 4181a). 0 4 words (16 bytes) 1 8 words (32 bytes) db : data cache refill size setting (v r 4131 and v r 4181a only, and fixed to 1 in the v r 4181a). 0 4 words (16 bytes) 1 8 words (32 bytes) k0 : kseg0 cache coherency algorithm 010 uncached others cached 1 : 1 is returned when read. 0 : 0 is returned when read. caution be sure to set the ep field and the ad bit to 0. if they are set with any other values, the processor may behave unexpectedly. (1) instruction streaming function (v r 4122, v r 4131, and v r 4181a only) instruction streaming can shorten the period during which the pipeline is stalled. usually, the pipeline is stalled until the cache line is refilled if an instruction cache miss occurs. with the v r 4122, v r 4131, and v r 4181a, however, the stalled pipeline is resumed, even if refilling is not completed, as soon as the instruction to be fetched has been read from the external memory. (2) indication of clock frequency ratio the ec area indicates the ratio of the internal peripheral function operating clock frequency to the pipeline clock (pclock) frequency. the frequency ratio to be indicated differs depending on the processor, as follows. table 5-9 system interface clock ratio (to pclock) ec field v r 4121 v r 4122 v r 4131 v r 4181 v r 4181a 0 1/1.5 reserved 1/2 reserved 1 1/2 1/3 1/2 2 1/2.5 reserved 1/4 reserved 3 1/3 reserved 1/3 4 1/4 reserved 1/4 5 1/5 reserved 1/5 6 1/6 reserved 1/6 7 1/1 reserved 1/1
chapter 5 memory management system user?s manual u15509ej2v0um 155 (3) branch prediction function (v r 4122, v r 4131, and v r 4181a only) usually, a branch delay of at least 1 clock occurs in order to check the branch condition and calculate the branch destination address when a branch instruction is fetched. the v r 4122, v r 4131, and v r 4181a can reduce the occurrence of this delay using branch prediction. the v r 4122, v r 4131, and v r 4181a have a branch prediction table to which branch instructions whose branch conditions have been satisfied and their branch destination addresses are registered. when the next branch instruction is fetched, this branch prediction table is referenced. if the same branch instruction is in the table (hit), an instruction is fetched from the branch destination address in the table. this branch prediction is performed and branch instructions can be executed without delay if the bp bit is cleared to 0. (4) indication of cache size the ic and dc fields indicate the respective capacities of the instruction cache and data cache. because the capacities of the caches differ depending on the processor, these fields are fixed to the value corresponding to the processor. table 5-10 instruction cache sizes processor size ic field v r 4121 16 kb 4 v r 4122 32 kb 5 v r 4131 16 kb 4 v r 4181 4 kb 2 v r 4181a 8 kb 3 table 5-11 data cache sizes processor size dc field v r 4121 8 kb 3 v r 4122 16 kb 4 v r 4131 16 kb 4 v r 4181 4 kb 2 v r 4181a 8 kb 3 5.5.9 load linked address (lladdr) register (17) the read/write load linked address (lladdr) register is not used with the v r 4100 series processor except for diagnostic purpose, and serves no function during normal operation. lladdr register is implemented just for compatibility between the v r 4100 series and v r 4000/v r 4400. the contents of the lladdr register after reset are undefined. figure 5-20. lladdr register 31 0 paddr paddr : 32-bit physical address
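The size encoding of the IC and DC fields in 5.5.8 (2^(n+10) bytes) and the K0 encoding can be checked with a short C sketch. The helper names are hypothetical; the bit positions follow the Config register field list (IC in bits 11 to 9, DC in bits 8 to 6, K0 in bits 2 to 0), and the Config value is assumed to have been read from CP0 register 16 by other code.

```c
#include <stdint.h>

/* Instruction cache size in bytes: 2^(IC+10), IC = Config bits 11..9. */
static uint32_t config_icache_bytes(uint32_t config)
{
    uint32_t ic = (config >> 9) & 0x7u;
    return UINT32_C(1) << (ic + 10u);
}

/* Data cache size in bytes: 2^(DC+10), DC = Config bits 8..6. */
static uint32_t config_dcache_bytes(uint32_t config)
{
    uint32_t dc = (config >> 6) & 0x7u;
    return UINT32_C(1) << (dc + 10u);
}

/* kseg0 is uncached only when K0 (bits 2..0) is 010; any other value is cached. */
static int config_kseg0_cached(uint32_t config)
{
    return (config & 0x7u) != 0x2u;
}
```

With IC = 4 and DC = 3 (the VR4121 row of Tables 5-10 and 5-11), these helpers yield 16 KB and 8 KB respectively.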
5.5.10 TagLo (28) and TagHi (29) registers

The TagLo and TagHi registers are 32-bit read/write registers that hold the primary cache tag during cache initialization, cache diagnostics, or cache error processing. The TagLo and TagHi registers are written by the CACHE and MTC0 instructions.

Figures 5-21 and 5-22 show the format of these registers.

The contents of these registers after reset are undefined.

Figure 5-21. TagLo register

(a) VR4121, VR4122, VR4181, VR4181A

  For data cache:
  Bits     Field
  31-10    PTagLo
  9        V
  8        D
  7        W
  6-0      0

  For instruction cache:
  Bits     Field
  31-10    PTagLo
  9        V
  8-0      0

(b) VR4131

  The layout is as in (a), except that the low-order area (bits 6 to 0 for the data cache, bits 8 to 0 for the instruction cache) also contains the L and R bits; the figure labels bit positions 5, 4, and 3 in this area.

PTagLo : Specifies physical address bits 31 to 10.
V : Valid bit
D : Dirty bit. However, this bit is defined only for compatibility with the VR4000 series processors; although it can be read and written, it does not indicate the status of cache memory, and it cannot change the status of cache memory. In the VR4131, a write to this bit is ignored, and a read returns the same value as the V bit.
W : Write-back bit (set if the cache line has been updated)
L : Lock bit. If this bit is set, the cache line is not refilled on cache misses.
R : LRU bit. Indicates the way to be refilled on cache misses.
0 : Reserved for future use. Write 0 in a write operation. When this field is read, 0 is read.

Figure 5-22. TagHi register

  Bits     Field
  31-0     0

0 : Reserved for future use. Write 0 in a write operation. When this field is read, 0 is read.
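As an illustration of the data cache tag layout for the VR4121, VR4122, VR4181, and VR4181A (PTagLo in bits 31 to 10, V in bit 9, D in bit 8, W in bit 7), the following C sketch splits a TagLo value into its fields. The struct and function names are hypothetical, and the L and R bits of the VR4131 are deliberately not decoded.

```c
#include <stdint.h>

/* Hypothetical view of a data cache tag read back through TagLo. */
typedef struct {
    uint32_t paddr_base; /* physical address bits 31..10, low 10 bits zero */
    int valid;           /* V, bit 9 */
    int dirty;           /* D, bit 8 (compatibility bit only) */
    int writeback;       /* W, bit 7 (line has been updated) */
} taglo_dcache;

static taglo_dcache taglo_decode_dcache(uint32_t taglo)
{
    taglo_dcache t;
    t.paddr_base = taglo & 0xfffffc00u;
    t.valid      = (int)((taglo >> 9) & 1u);
    t.dirty      = (int)((taglo >> 8) & 1u);
    t.writeback  = (int)((taglo >> 7) & 1u);
    return t;
}
```

Because PTagLo already occupies bits 31 to 10 of the register, masking is enough to recover the 1 KB aligned physical base; no shift is needed.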
Chapter 6 Exception Processing

This chapter describes CPU exception processing, including an explanation of the hardware that processes exceptions, followed by the format and use of each CPU exception register.

6.1 Exception processing overview

The processor receives exceptions from a number of sources, including translation lookaside buffer (TLB) misses, arithmetic overflows, I/O interrupts, and system calls. When the CPU detects an exception, the normal sequence of instruction execution is suspended and the processor enters kernel mode (see Chapter 5 for a description of system operating modes). If an exception occurs while executing a MIPS16 instruction, the processor stops the MIPS16 instruction execution and shifts to the 32-bit instruction execution mode.

The processor then disables interrupts and transfers control to the exception handler (an exception handling routine implemented by software and located at a specific address). The handler saves the context of the processor, including the contents of the program counter, the current operating mode (user or supervisor), statuses, and interrupt enabling. This context is saved so that it can be restored when the exception has been serviced.

When an exception occurs, the CPU loads the exception program counter (EPC) register with a location where execution can restart after the exception has been serviced. The restart location in the EPC register is the address of the instruction that caused the exception or, if the instruction was executing in a branch delay slot, the address of the branch instruction immediately preceding the delay slot. Note that no branch delay slot generated by executing a branch instruction exists when the processor operates in the MIPS16 mode.

When execution of MIPS16 instructions is enabled, bit 0 of the EPC register indicates the operating mode in which an exception occurred. It indicates 1 when in the MIPS16 instruction mode, and 0 when in the MIPS III instruction mode.

The VR4100 series processors have registers other than the above that retain address, cause, or status information during exception processing. Details about these registers are described in 6.2 Exception processing registers. For detailed descriptions of exception processing, refer to 6.4 Details of exceptions.

6.1.1 Precision of exceptions

VR4100 series exceptions are logically precise; the instruction that causes an exception and all those that follow it are aborted and can be re-executed after servicing the exception. When succeeding instructions are killed, exceptions associated with those instructions are also killed. Exceptions are not taken in the order detected, but in instruction fetch order.

The exception handler can still determine the cause of an exception and its origin. The program can be restarted by rewriting the destination register; this is not done automatically, however, unlike the other precise exceptions, where no status change occurs.
6.2 Exception processing registers

This section describes the CP0 registers that are used in exception processing. Table 6-1 lists the CP0 registers. For the memory management registers among the CP0 registers, refer to Chapter 5 Memory Management System.

Table 6-1. CP0 registers

(a) Exception processing registers

  Register name                  Register number
  Context register               4
  BadVAddr register              8
  Count register                 9
  Compare register               11
  Status register                12
  Cause register                 13
  EPC register                   14
  WatchLo register               18
  WatchHi register               19
  XContext register              20
  Parity Error register (Note 1) 26
  Cache Error register (Note 1)  27
  ErrorEPC register              30

(b) Memory management registers

  Register name                  Register number
  Index register                 0
  Random register                1
  EntryLo0 register              2
  EntryLo1 register              3
  PageMask register              5
  Wired register                 6
  EntryHi register               10
  PRId register                  15
  Config register                16
  LLAddr register (Note 2)       17
  TagLo register                 28
  TagHi register                 29

Notes 1. This register is defined to maintain compatibility with the VR4100. This register is not used in normal operation.
      2. This register is defined to maintain compatibility with the VR4000 and VR4400. The content of this register is meaningless in normal operation.

Software examines the CP0 registers during exception processing to determine the cause of the exception and the state of the CPU at the time the exception occurred.

Details about each register are explained below. The parenthesized number in section titles is the register number (refer to 1.2.3).
6.2.1 Context register (4)

The Context register is a read/write register containing the pointer to an entry in the page table entry (PTE) array in memory; this array is a table that stores virtual-to-physical address translations. When there is a TLB miss, the operating system loads the unsuccessfully translated entry from the PTE array into the TLB. The Context register is used by the TLB refill exception handler for loading TLB entries. The Context register duplicates some of the information provided in the BadVAddr register, but the information is arranged in a form that is more useful for a software TLB exception handler. Figure 6-1 shows the format of the Context register.

Figure 6-1. Context register

(a) 32-bit mode

  Bits     Field
  31-25    PTEBase
  24-4     BadVPN2
  3-0      0

(b) 64-bit mode

  Bits     Field
  63-25    PTEBase
  24-4     BadVPN2
  3-0      0

PTEBase : Base address of the PTE entry table.
BadVPN2 : Written by hardware if a TLB miss occurs. This field holds the value (VPN2) obtained by halving the virtual page number of the most recent virtual address for which translation failed.
0       : Reserved for future use. Write 0 in a write operation. When this field is read, 0 is read.

The PTEBase field is used by software as the pointer to the base address of the PTE table in the current user address space.

The 21-bit BadVPN2 field contains bits 31 to 11 of the virtual address that caused the TLB miss; bit 10 is excluded because a single TLB entry maps to an even-odd page pair. For a 1 KB page size, this format can directly address the pair-table of 8-byte PTEs. When the page size is 4 KB or more, shifting or masking this value produces the correct PTE reference address.
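The relationship between the faulting address and the Context value in 32-bit mode can be sketched as below. The function names are hypothetical. The first helper mirrors what the hardware does on a TLB miss (PTEBase in bits 31 to 25, virtual address bits 31 to 11 placed in bits 24 to 4). The second is only one possible reading of "shifting or masking" for 4 KB pages, and assumes a page table of 16-byte even-odd PTE pairs starting at PTEBase.

```c
#include <stdint.h>

/* Context value for a given PTEBase setting and faulting virtual address.
 * BadVPN2 = VA bits 31..11 (21 bits), stored in Context bits 24..4. */
static uint32_t context_value(uint32_t ptebase, uint32_t badvaddr)
{
    uint32_t badvpn2 = (badvaddr >> 11) & 0x1fffffu;
    return (ptebase & 0xfe000000u) | (badvpn2 << 4);
}

/* Assumed layout for 4 KB pages: one 16-byte PTE pair per 8 KB of
 * virtual space, so VA bits 12..11 are dropped from BadVPN2. */
static uint32_t context_pte_pair_addr_4k(uint32_t context)
{
    uint32_t badvpn2 = (context >> 4) & 0x1fffffu;
    uint32_t pair    = badvpn2 >> 2;      /* index of the even-odd pair */
    return (context & 0xfe000000u) + (pair << 4);
}
```

For 1 KB pages no adjustment is needed: the Context value is itself the address of the 16-byte pair, which is why the low four bits of the register read as 0.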
6.2.2 BadVAddr register (8)

The bad virtual address (BadVAddr) register is a read-only register that saves the most recent virtual address that failed to have a valid translation, or that had an addressing error. Figure 6-2 shows the format of the BadVAddr register.

Caution  This register saves no information after a bus error exception, because a bus error is not an address error exception.

Figure 6-2. BadVAddr register

(a) 32-bit mode

  Bits     Field
  31-0     BadVAddr

(b) 64-bit mode

  Bits     Field
  63-0     BadVAddr

BadVAddr : Most recent virtual address for which an addressing error occurred, or for which address translation failed.

6.2.3 Count register (9)

The read/write Count register acts as a timer. It is incremented in synchronization with the MasterOut clock (internal clock), regardless of whether instructions are being executed or retired, or whether any forward progress is actually made through the pipeline.

This register is a free-running counter. When the register reaches all ones, it rolls over to zero and continues counting. This register is used for self-diagnostic tests, system initialization, or the establishment of inter-process synchronization.

Figure 6-3 shows the format of the Count register.

Figure 6-3. Count register

  Bits     Field
  31-0     Count

Count : 32-bit up-to-date count value that is compared with the value of the Compare register.
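Because the Count register is free-running and rolls over from all ones to zero, interval measurements are best done with unsigned 32-bit subtraction, which stays correct across one rollover. The sketch below illustrates this; the function names are hypothetical, and the two samples are assumed to have been read from CP0 register 9 by other code.

```c
#include <stdint.h>

/* Ticks elapsed between two samples of Count. Unsigned arithmetic is
 * modulo 2^32, so the result is valid across a single rollover. */
static uint32_t count_elapsed(uint32_t start, uint32_t now)
{
    return (uint32_t)(now - start);
}

/* Nonzero once at least 'ticks' counts have passed since 'start'. */
static int count_reached(uint32_t start, uint32_t now, uint32_t ticks)
{
    return count_elapsed(start, now) >= ticks;
}
```

Comparing raw values (for example, now >= start + ticks) would fail near the rollover point, which is why the difference is computed first.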
6.2.4 Compare register (11)

The Compare register causes a timer interrupt; it maintains a stable value that does not change on its own.

When the value of the Count register (see 6.2.3) equals the value of the Compare register, the IP7 bit in the Cause register is set. This causes an interrupt as soon as the interrupt is enabled.

Writing a value to the Compare register, as a side effect, clears the timer interrupt request.

For diagnostic purposes, the Compare register is a read/write register. Normally, this register should be used only for writes.

Figure 6-4 shows the format of the Compare register.

Figure 6-4. Compare register

  Bits     Field
  31-0     Compare

Compare : Value that is compared with the count value of the Count register.

6.2.5 Status register (12)

The Status register is a read/write register that contains the operating mode, interrupt enabling, and the diagnostic states of the processor. Figure 6-5 shows the format of the Status register.

Figure 6-5. Status register

  Bits     (a) VR4121, VR4122, VR4181, VR4181A   (b) VR4131
  31       0                                     0
  30       0                                     XX
  29       0                                     0
  28       CU0                                   CU0
  27-26    0                                     0
  25       RE                                    RE
  24-16    DS                                    DS
  15-8     IM                                    IM
  7        KX                                    KX
  6        SX                                    SX
  5        UX                                    UX
  4-3      KSU                                   KSU
  2        ERL                                   ERL
  1        EXL                                   EXL
  0        IE                                    IE

XX  : Write 0 in a write operation. When this bit is read, 0 is read (VR4131 only).
CU0 : Enables/disables the use of the coprocessor (1: enabled, 0: disabled). CP0 can be used by the kernel at all times.
RE  : Enables/disables reversing of the endian setting in user mode (0: disabled, 1: enabled). This bit must be set to 0 in the VR4100 series.
DS  : Diagnostic status field (see Figure 6-6).
IM  : Interrupt mask field used to enable/disable interrupts (0: disabled, 1: enabled). This field consists of 8 bits that are used to control eight interrupts. The bits are assigned to interrupts as follows:
        IM7     : Masks a timer interrupt.
        IM(6:2) : Mask ordinary interrupts (Int(4:0) Note). However, Int3 occurs in the VR4121 and VR4181A only, and Int4 in the VR4181A only.
        IM(1:0) : Mask software interrupts.
      Note  Int(4:0) are internal signals of the CPU core. For details about connection to the on-chip peripheral units, refer to the hardware user's manual of each processor.
KX  : Enables 64-bit addressing in kernel mode (0: 32-bit, 1: 64-bit).
SX  : Enables 64-bit addressing and operation in supervisor mode (0: 32-bit, 1: 64-bit).
UX  : Enables 64-bit addressing and operation in user mode (0: 32-bit, 1: 64-bit).
KSU : Sets and indicates the operating mode (00: kernel, 01: supervisor, 10: user).
ERL : Sets and indicates the error level (0: normal, 1: error).
EXL : Sets and indicates the exception level (0: normal, 1: exception).
IE  : Sets and indicates interrupt enabling/disabling (0: disabled, 1: enabled).
0   : Reserved for future use. Write 0 in a write operation. When this bit is read, 0 is read.
Figure 6-6 shows the details of the diagnostic status (DS) field. All DS field bits other than the TS bit are readable and writable.

Figure 6-6. Status register diagnostic status field

  Bits     (a) VR4181   (b) VR4121, VR4122, VR4131, VR4181A
  24-23    0            0
  22       BEV          BEV
  21       TS           0
  20       SR           SR
  19       0            0
  18       CH           CH
  17       CE           CE
  16       DE           DE

BEV    : Specifies the base address of the TLB refill exception vector and common exception vector (0: normal, 1: bootstrap).
TS     : Indicates that TLB shutdown has occurred (VR4181 only) (0: not shut down, 1: shut down). This bit is read only and is used to avoid any problems that may occur when multiple TLB entries match the same virtual address. After the TLB has been shut down, reset the processor to enable restart. Note that the TLB is shut down even if a TLB entry matching a virtual address is marked as being invalid (with the V bit cleared).
SR     : Indicates that a soft reset or NMI exception has occurred (0: not occurred, 1: occurred).
CH     : CP0 condition bit (0: false, 1: true). This bit can be read and written by software only; it cannot be accessed by hardware.
CE, DE : These bits are provided to maintain compatibility with the VR4100, and are not used in the VR4100 series hardware.
0      : Reserved for future use. Write 0 in a write operation. When this field is read, 0 is read.
The Status register has the following fields where the modes and access status are set.

(1) Interrupt enable

Interrupts are enabled when all of the following conditions are true:

• IE bit is set to 1.
• EXL bit is cleared to 0.
• ERL bit is cleared to 0.
• The appropriate bit of the IM field is set to 1.

(2) Operating modes

The following Status register bit settings are required for user, kernel, and supervisor modes.

• The processor is in user mode when KSU field = 10, EXL bit = 0, and ERL bit = 0.
• The processor is in supervisor mode when KSU field = 01, EXL bit = 0, and ERL bit = 0.
• The processor is in kernel mode when KSU field = 00, EXL bit = 1, or ERL bit = 1.

Access to the kernel address space is allowed when the processor is in kernel mode.

Access to the supervisor address space is allowed when the processor is in supervisor or kernel mode.

Access to the user address space is allowed in any of the three operating modes.

(3) Addressing modes

The following Status register bit settings select 32- or 64-bit operation for user, kernel, and supervisor operating modes. Enabling 64-bit operation permits the execution of 64-bit opcodes and translation of 64-bit addresses. 64-bit operation for user, kernel, and supervisor modes can be set independently.

• 64-bit addressing for kernel mode is enabled when KX bit = 1. If this bit is set, an XTLB refill exception occurs if a TLB miss occurs in the kernel mode address space. 64-bit operations are always valid in kernel mode.
• 64-bit addressing and operations are enabled for supervisor mode when SX bit = 1. If this bit is set, an XTLB refill exception occurs if a TLB miss occurs in the supervisor mode address space.
• 64-bit addressing and operations are enabled for user mode when UX bit = 1. If this bit is set, an XTLB refill exception occurs if a TLB miss occurs in the user mode address space.

(4) Status after reset

The contents of the Status register are undefined after cold resets, except for the following bits in the diagnostic status field.

• TS bit is cleared to 0 (VR4181 only).
• SR bit is cleared to 0. The SR bit is 0 after cold reset, and is 1 after soft reset or NMI exception.
• ERL and BEV bits are both set to 1.

Remark  Cold reset and soft reset are CPU core resets. For details, refer to the hardware user's manual of each processor.
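The interrupt-enable conditions in (1) and the operating-mode rules in (2) can be expressed as a short C sketch. The macro, enum, and function names are hypothetical; the bit positions are those of the Status register (IE bit 0, EXL bit 1, ERL bit 2, KSU bits 4 to 3, IM bits 15 to 8). KSU = 11 is not defined in the text, so it is reported as reserved here.

```c
#include <stdint.h>

#define ST_IE   0x00000001u  /* bit 0 */
#define ST_EXL  0x00000002u  /* bit 1 */
#define ST_ERL  0x00000004u  /* bit 2 */

typedef enum { MODE_KERNEL, MODE_SUPERVISOR, MODE_USER, MODE_RESERVED } cpu_mode;

/* Operating mode: EXL = 1 or ERL = 1 forces kernel mode; otherwise KSU decides. */
static cpu_mode status_mode(uint32_t status)
{
    if (status & (ST_EXL | ST_ERL))
        return MODE_KERNEL;
    switch ((status >> 3) & 0x3u) {
    case 0x0: return MODE_KERNEL;
    case 0x1: return MODE_SUPERVISOR;
    case 0x2: return MODE_USER;
    default:  return MODE_RESERVED;
    }
}

/* Nonzero when interrupt 'line' (0..7, matching IM0..IM7) is enabled:
 * IE = 1, EXL = 0, ERL = 0, and the corresponding IM bit = 1. */
static int status_irq_enabled(uint32_t status, unsigned line)
{
    if (line > 7u)
        return 0;
    return (status & ST_IE) != 0u
        && (status & (ST_EXL | ST_ERL)) == 0u
        && ((status >> (8u + line)) & 1u) != 0u;
}
```

After a cold reset, ERL is set to 1, so both helpers report kernel mode with all interrupts disabled until software clears ERL.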
6.2.6 Cause register (13)

The 32-bit read/write Cause register holds the cause of the most recent exception. A 5-bit exception code indicates one of the causes (see Table 6-2). Other bits hold detailed information on the specific exception. All bits in the Cause register, with the exception of the IP1 and IP0 bits, are read-only; IP1 and IP0 are used for software interrupts. Figure 6-7 shows the fields of this register; Table 6-2 describes the Cause register codes.

Figure 6-7. Cause register

  Bits     Field
  31       BD
  30       0
  29-28    CE
  27-16    0
  15-8     IP
  7        0
  6-2      ExcCode
  1-0      0

BD      : Indicates whether the most recent exception occurred in the branch delay slot (1: in delay slot, 0: normal).
CE      : Indicates the coprocessor number in which a coprocessor unusable exception occurred. This field remains undefined for as long as no exception occurs.
IP      : Indicates whether an interrupt is pending (1: interrupt pending, 0: no interrupt pending). The bits are assigned to interrupts as follows:
            IP7     : A timer interrupt.
            IP(6:2) : Ordinary interrupts (Int(4:0) Note). However, Int3 occurs in the VR4121 and VR4181A only, and Int4 in the VR4181A only.
            IP(1:0) : Software interrupts. Only these bits cause an interrupt exception when they are set to 1 by means of software.
          Note  Int(4:0) are internal signals of the CPU core. For details about connection to the on-chip peripheral units, refer to the hardware user's manual of each processor.
ExcCode : Exception code field (see Table 6-2).
0       : Reserved for future use. Write 0 in a write operation. When this field is read, 0 is read.
Table 6-2. Cause register exception code field

  Exception code   Mnemonic   Description
  0                Int        Interrupt exception
  1                Mod        TLB modified exception
  2                TLBL       TLB refill exception (load or fetch)
  3                TLBS       TLB refill exception (store)
  4                AdEL       Address error exception (load or fetch)
  5                AdES       Address error exception (store)
  6                IBE        Bus error exception (instruction fetch)
  7                DBE        Bus error exception (data load or store)
  8                Sys        System call exception
  9                Bp         Breakpoint exception
  10               RI         Reserved instruction exception
  11               CpU        Coprocessor unusable exception
  12               Ov         Integer overflow exception
  13               Tr         Trap exception
  14 to 22         -          Reserved for future use
  23               WATCH      Watch exception
  24 to 31         -          Reserved for future use

The VR4100 series has eight interrupt request sources, IP7 to IP0, that are used for the following purposes. For a detailed description of interrupts, refer to Chapter 8.

(1) IP7

This bit indicates whether there is a timer interrupt request. It is set when the values of the Count register and Compare register match.

(2) IP6 to IP2

IP6 to IP2 reflect the state of the interrupt request signals of the CPU core.

(3) IP1 and IP0

These bits are used to set/clear a software interrupt request.
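A handler typically begins by splitting the Cause register into the fields described in 6.2.6. The C sketch below shows one way to do that; the function names are hypothetical and the Cause and Status values are assumed to have been read from CP0 registers 13 and 12. The last helper relies on the IP field (Cause bits 15 to 8) and the IM field (Status bits 15 to 8) occupying the same bit positions.

```c
#include <stdint.h>

static unsigned cause_exccode(uint32_t cause) { return (cause >> 2) & 0x1fu; } /* bits 6..2 */
static unsigned cause_ip(uint32_t cause)      { return (cause >> 8) & 0xffu; } /* bits 15..8 */
static unsigned cause_ce(uint32_t cause)      { return (cause >> 28) & 0x3u; } /* bits 29..28 */
static int      cause_bd(uint32_t cause)      { return (int)((cause >> 31) & 1u); }

/* Interrupt requests that are both pending (IP) and unmasked (IM).
 * Bit n of the result corresponds to IPn / IMn. */
static unsigned cause_pending_unmasked(uint32_t cause, uint32_t status)
{
    return ((cause & status) >> 8) & 0xffu;
}
```

Whether an unmasked request is actually taken still depends on the IE, EXL, and ERL bits of the Status register.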
6.2.7 Exception program counter (EPC) register (14)

The exception program counter (EPC) is a read/write register that contains the address at which processing resumes after an exception has been serviced. The contents of this register change depending on whether execution of MIPS16 instructions is enabled or disabled. Setting the MIPS16EN pin after RTC reset specifies whether execution of the MIPS16 instructions is enabled or disabled.

When MIPS16 instruction execution is disabled, the EPC register contains either:

• Virtual address of the instruction that caused the exception, or
• Virtual address of the immediately preceding branch or jump instruction (when the instruction associated with the exception is in a branch delay slot, and the BD bit in the Cause register is set to 1).

When MIPS16 instruction execution is enabled, the EPC register contains either:

• Virtual address of the instruction that caused the exception and the ISA mode in which the exception occurred, or
• Virtual address of the immediately preceding branch or jump instruction and the ISA mode in which the exception occurred (when the instruction associated with the exception is in a branch delay slot of the jump instruction, and the BD bit in the Cause register is set to 1).

When a 16-bit instruction is executed, the EPC register contains either:

• Virtual address of the instruction that caused the exception and the ISA mode in which the exception occurred, or
• Virtual address of the immediately preceding Extend or jump instruction and the ISA mode in which the exception occurred (when the instruction associated with the exception is in a branch delay slot of the jump instruction or is the instruction following the Extend instruction, and the BD bit in the Cause register is set to 1).

The EXL bit in the Status register is set to 1 to keep the processor from overwriting the address of the exception-causing instruction contained in the EPC register in the event of another exception.

The EPC register never indicates the address of an instruction in a branch delay slot.

Figure 6-8 shows the EPC register format when the MIPS16 ISA is disabled, and Figure 6-9 shows the EPC register format when the MIPS16 ISA is enabled.

Figure 6-8. EPC register (when MIPS16 ISA is disabled)

(a) 32-bit mode

  Bits     Field
  31-0     EPC

(b) 64-bit mode

  Bits     Field
  63-0     EPC

EPC : Restart address after exception processing.
Figure 6-9. EPC register (when MIPS16 ISA is enabled)

(a) 32-bit mode

  Bits     Field
  31-1     EPC
  0        EIM

EPC : Bits 31 to 1 of the restart address after exception processing.
EIM : ISA mode in which the exception occurred (1 when a MIPS16 ISA instruction is executed, 0 when a MIPS III ISA instruction is executed).

(b) 64-bit mode

  Bits     Field
  63-1     EPC
  0        EIM

EPC : Bits 63 to 1 of the restart address after exception processing.
EIM : ISA mode in which the exception occurred (1 when a MIPS16 ISA instruction is executed, 0 when a MIPS III ISA instruction is executed).

6.2.8 WatchLo (18) and WatchHi (19) registers

The VR4100 series processors provide a debugging feature to detect references to a selected physical address; load and store instructions to the location specified by the WatchLo and WatchHi registers cause a watch exception.

Figures 6-10 and 6-11 show the format of the WatchLo and WatchHi registers.

The contents of these registers after reset are undefined, so they must be initialized by software.

Figure 6-10. WatchLo register

  Bits     Field
  31-3     PAddr0
  2        0
  1        R
  0        W

PAddr0 : Specifies physical address bits 31 to 3.
R      : Specifies detection of watch address references when load instructions are executed (1: detect, 0: do not detect).
W      : Specifies detection of watch address references when store instructions are executed (1: detect, 0: do not detect).
0      : Reserved for future use. Write 0 in a write operation. When this field is read, 0 is read.

Figure 6-11. WatchHi register

  Bits     Field
  31-0     0

0 : Reserved for future use. Write 0 in a write operation. When this field is read, 0 is read.
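Composing a WatchLo value from a physical address and the two detection flags follows directly from Figure 6-10. The sketch below is an illustration with a hypothetical function name; writing the result to CP0 register 18 (and 0 to WatchHi) is assumed to be done by other code.

```c
#include <stdint.h>

/* Build a WatchLo value: PAddr0 = physical address bits 31..3,
 * bit 2 reserved (0), R = bit 1 (loads), W = bit 0 (stores). */
static uint32_t watchlo_value(uint32_t paddr, int on_load, int on_store)
{
    return (paddr & 0xfffffff8u)
         | (on_load  ? 0x2u : 0x0u)
         | (on_store ? 0x1u : 0x0u);
}
```

Since only address bits 31 to 3 are held, the watched location is a doubleword; the low three bits of the requested address are discarded. Passing 0 for both flags yields a value that disables detection.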
6.2.9 XContext register (20)

The read/write XContext register contains a pointer to an entry in the page table entry (PTE) array, an operating system data structure that stores virtual-to-physical address translations. If a TLB miss occurs, the operating system loads the untranslated data from the PTE into the TLB to handle the software error.

The XContext register is used by the XTLB refill exception handler to load TLB entries in 64-bit addressing mode.

The XContext register duplicates some of the information provided in the BadVAddr register, and puts it in a form useful for the XTLB exception handler.

This register is included solely for operating system use. The operating system sets the PTEBase field in the register, as needed. Figure 6-12 shows the format of the XContext register.

Figure 6-12. XContext register

  Bits     Field
  63-35    PTEBase
  34-33    R
  32-4     BadVPN2
  3-0      0

PTEBase : Base address of the PTE entry table.
R       : Space type (00: user, 01: supervisor, 11: kernel). The setting of this field matches virtual address bits 63 and 62.
BadVPN2 : This field holds the value (VPN2) obtained by halving the virtual page number of the most recent virtual address for which translation failed.
0       : Reserved for future use. Write 0 in a write operation. When this field is read, 0 is read.

The 29-bit BadVPN2 field holds bits 39 to 11 of the virtual address that caused the TLB miss; bit 10 is excluded because a single TLB entry maps to an even-odd page pair. For a 1 KB page size, this format may be used directly to address the pair-table of 8-byte PTEs. For page sizes of 4 KB or more, shifting or masking this value produces the appropriate address.
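The way the XContext fields are derived from a 64-bit faulting address can be sketched as follows. The function name is hypothetical; the code mirrors the field layout of Figure 6-12 (PTEBase in bits 63 to 35, R taken from virtual address bits 63 and 62, BadVPN2 taken from virtual address bits 39 to 11) and is meant for checking handler logic on a host, not as a description of the hardware implementation.

```c
#include <stdint.h>

/* XContext value for a given PTEBase setting and faulting virtual address. */
static uint64_t xcontext_value(uint64_t ptebase, uint64_t badvaddr)
{
    uint64_t r       = (badvaddr >> 62) & 0x3u;                   /* VA bits 63..62 */
    uint64_t badvpn2 = (badvaddr >> 11) & UINT64_C(0x1fffffff);   /* VA bits 39..11 */

    return (ptebase & UINT64_C(0xfffffff800000000))  /* bits 63..35 */
         | (r << 33)                                 /* bits 34..33 */
         | (badvpn2 << 4);                           /* bits 32..4  */
}
```

As with the Context register, the low four bits are always 0, so for 1 KB pages the value addresses a 16-byte pair of PTEs directly.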
6.2.10 Parity Error register (26)

The Parity Error (PErr) register is a readable/writable register. This register is defined to maintain software compatibility with the VR4100, and is not used in hardware because the VR4100 series has no parity.

Figure 6-13 shows the format of the PErr register.

Figure 6-13. Parity Error register

  Bits     Field
  31-8     0
  7-0      Diagnostic

Diagnostic : 8-bit self-diagnostic field.
0          : Reserved for future use. Write 0 in a write operation. When this field is read, 0 is read.

6.2.11 Cache Error register (27)

The Cache Error register is a readable/writable register. This register is defined to maintain software compatibility with the VR4100, and is not used in hardware because the VR4100 series has no parity.

Figure 6-14 shows the format of the Cache Error register.

Figure 6-14. Cache Error register

  Bits     Field
  31-0     0

0 : Reserved for future use. Write 0 in a write operation. When this field is read, 0 is read.
6.2.12 ErrorEPC register (30)

The error exception program counter (ErrorEPC) register is similar to the EPC register. It is used to store the program counter value at which the cold reset, soft reset, or NMI exception has been serviced.

The read/write ErrorEPC register contains the virtual address at which instruction processing can resume after servicing an error. The contents of this register change depending on whether execution of MIPS16 instructions is enabled or disabled. Setting the MIPS16EN pin after RTC reset specifies whether the execution of MIPS16 instructions is enabled or disabled.

When the MIPS16 ISA is disabled, this address can be:

• Virtual address of the instruction that caused the exception, or
• Virtual address of the immediately preceding branch or jump instruction, when the instruction associated with the error exception is in a branch delay slot.

When MIPS16 instruction execution is enabled, during 32-bit instruction execution this address can be:

• Virtual address of the instruction that caused the exception and the ISA mode in which the exception occurred, or
• Virtual address of the immediately preceding branch or jump instruction and the ISA mode in which the exception occurred, when the instruction associated with the exception is in a branch delay slot.

When MIPS16 instruction execution is enabled, during 16-bit instruction execution this address can be:

• Virtual address of the instruction that caused the exception and the ISA mode in which the exception occurred, or
• Virtual address of the immediately preceding jump instruction or Extend instruction and the ISA mode in which the exception occurred, when the instruction associated with the exception is in a branch delay slot of the jump instruction or is the instruction following the Extend instruction.

The contents of the ErrorEPC register do not change when the ERL bit of the Status register is set to 1. This prevents the processor from overwriting, when other exceptions occur, the address held in this register of the instruction that caused the error exception.

There is no branch delay slot indication for the ErrorEPC register.

Figure 6-15 shows the format of the ErrorEPC register when the MIPS16 ISA is disabled. Figure 6-16 shows the format of the ErrorEPC register when the MIPS16 ISA is enabled.
Figure 6-15. ErrorEPC register (when MIPS16 ISA is disabled)

(a) 32-bit mode

  Bits     Field
  31-0     ErrorEPC

(b) 64-bit mode

  Bits     Field
  63-0     ErrorEPC

ErrorEPC : Program counter that indicates the restart address after cold reset, soft reset, or NMI exception.

Figure 6-16. ErrorEPC register (when MIPS16 ISA is enabled)

(a) 32-bit mode

  Bits     Field
  31-1     ErrorEPC
  0        ErIM

ErrorEPC : Bits 31 to 1 of the virtual restart address after cold reset, soft reset, or NMI exception.
ErIM     : ISA mode in which the error exception occurred (1: MIPS16 ISA, 0: MIPS III ISA).

(b) 64-bit mode

  Bits     Field
  63-1     ErrorEPC
  0        ErIM

ErrorEPC : Bits 63 to 1 of the virtual restart address after cold reset, soft reset, or NMI exception.
ErIM     : ISA mode in which the error exception occurred (1: MIPS16 ISA, 0: MIPS III ISA).
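When the MIPS16 ISA is enabled, bit 0 of the EPC and ErrorEPC registers carries the ISA mode (EIM or ErIM) rather than an address bit. A handler that inspects the restart address therefore has to separate the two, as in the following 32-bit-mode sketch. The struct and function names are hypothetical, and the code assumes the MIPS16-enabled format of Figure 6-16 (a).

```c
#include <stdint.h>

/* Hypothetical split view of an EPC or ErrorEPC value (32-bit mode). */
typedef struct {
    uint32_t restart;  /* restart address with bit 0 cleared */
    int      mips16;   /* 1: MIPS16 ISA, 0: MIPS III ISA */
} epc_info;

static epc_info epc_split32(uint32_t epc)
{
    epc_info e;
    e.restart = epc & 0xfffffffeu;   /* bits 31..1 of the address */
    e.mips16  = (int)(epc & 1u);     /* EIM / ErIM, bit 0 */
    return e;
}
```

When the MIPS16 ISA is disabled, the register holds a plain address and no such masking applies.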
6.3 Overview of exceptions

When the processor takes an exception, the EXL bit is set to 1, meaning the system is in kernel mode. After saving the appropriate state, the exception handler typically resets the EXL bit to 0. While the saved state is being restored, the exception handler sets the EXL bit to 1 again so that the saved state is not lost if another exception occurs. Returning from an exception also resets the EXL bit to 0. For details, see Chapter 9 CPU instruction set details.

Remark  When the EXL and ERL bits in the Status register are 0, the user, supervisor, or kernel operating mode is specified by the KSU bits in the Status register. When either the EXL or ERL bit is set to 1, the processor is in kernel mode.

6.3.1 Exception types

Exceptions are classified as follows according to the internal status of the processor retained when the exception occurs.
• Cold reset
• Soft reset, NMI
• Remaining processor exceptions (common exceptions)

6.3.2 Exception vector locations

When an exception occurs, the exception vector address is set in the program counter and processing branches there from the main program. A program that processes exceptions, called an exception handler, must be placed at the exception vector address. A vector address is calculated by adding a vector offset to a base address. Each exception type has a different vector address. The exception vectors and their offsets in 32-bit and 64-bit modes are shown below.
Table 6-3. 32-bit mode exception vector base addresses

Exception                     Vector base address (virtual)                  Vector offset
Cold reset, soft reset, NMI   0xBFC0 0000 (BEV is automatically set to 1)    0x0000
TLB refill (EXL = 0)          0x8000 0000 (BEV = 0), 0xBFC0 0200 (BEV = 1)   0x0000
XTLB refill (EXL = 0)         0x8000 0000 (BEV = 0), 0xBFC0 0200 (BEV = 1)   0x0080
Others                        0x8000 0000 (BEV = 0), 0xBFC0 0200 (BEV = 1)   0x0180

Table 6-4. 64-bit mode exception vector base addresses

Exception                     Vector base address (virtual)                                      Vector offset
Cold reset, soft reset, NMI   0xFFFF FFFF BFC0 0000 (BEV is automatically set to 1)              0x0000
TLB refill (EXL = 0)          0xFFFF FFFF 8000 0000 (BEV = 0), 0xFFFF FFFF BFC0 0200 (BEV = 1)   0x0000
XTLB refill (EXL = 0)         0xFFFF FFFF 8000 0000 (BEV = 0), 0xFFFF FFFF BFC0 0200 (BEV = 1)   0x0080
Others                        0xFFFF FFFF 8000 0000 (BEV = 0), 0xFFFF FFFF BFC0 0200 (BEV = 1)   0x0180

(1) Vector of cold reset, soft reset, and NMI exceptions

The cold reset, soft reset, and NMI exceptions always branch to the following reset exception vector address (virtual). This address is in an uncached, unmapped space.
• 0xBFC0 0000 in 32-bit mode
• 0xFFFF FFFF BFC0 0000 in 64-bit mode

(2) TLB refill exception vector

When the BEV bit = 0, the vector base address (virtual) for the TLB refill exception is in kseg0 (unmapped) space.
• 0x8000 0000 in 32-bit mode
• 0xFFFF FFFF 8000 0000 in 64-bit mode

When the BEV bit = 1, the vector base address (virtual) for the TLB refill exception is in kseg1 (uncached, unmapped) space.
• 0xBFC0 0200 in 32-bit mode
• 0xFFFF FFFF BFC0 0200 in 64-bit mode

This is an uncached, non-TLB-mapped space, allowing the exception handler to bypass the cache and TLB.

(3) Common exception vector

Addresses for the remaining exceptions are a combination of a vector offset and a base address.
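The base-plus-offset calculation in Tables 6-3 and 6-4 can be modeled with a short Python sketch. The function name exception_vector, the string labels for exception kinds, and the mode64 flag are hypothetical names chosen for illustration. The rule that a refill taken with EXL = 1 uses the common vector follows the description of the TLB refill exception in 6.4.5.

```python
RESET_LIKE = {"cold_reset", "soft_reset", "nmi"}  # hypothetical labels

def exception_vector(kind, bev, exl, mode64=False):
    """Return the virtual vector address for an exception kind."""
    if kind in RESET_LIKE:
        base, offset = 0xBFC00000, 0x000
    else:
        base = 0xBFC00200 if bev else 0x80000000
        if kind == "tlb_refill" and not exl:
            offset = 0x000
        elif kind == "xtlb_refill" and not exl:
            offset = 0x080
        else:
            offset = 0x180  # common vector, also used by refills when EXL = 1
    address = base + offset
    if mode64:
        address |= 0xFFFFFFFF00000000  # 64-bit form with upper 32 bits all 1
    return address
```

For instance, an XTLB refill with BEV = 1 and EXL = 0 in 32-bit mode would resolve to 0xBFC0 0280 under this model.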
6.3.3 Priority of exceptions

While more than one exception can occur for a single instruction, only the exception with the highest priority is reported. Table 6-5 lists the priorities.

Table 6-5. Exception priority order

Priority   Exception
High       Cold reset
           Soft reset
           NMI
           Address error (instruction fetch)
           TLB/XTLB refill (instruction fetch)
           TLB invalid (instruction fetch)
           Bus error (instruction fetch)
           System call
           Breakpoint
           Coprocessor unusable
           Reserved instruction
           Trap
           Integer overflow
           Address error (data access)
           TLB/XTLB refill (data access)
           TLB invalid (data access)
           TLB modified (data write)
           Watch
           Bus error (data access)
Low        Interrupt (other than NMI)

Hereafter, handling of exceptions by hardware is referred to as "process", and handling of exceptions by software is referred to as "service".
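The rule that only the highest-priority exception is reported can be illustrated with the following Python sketch. The list reproduces the order of Table 6-5. The function name reported_exception and the use of plain strings for exception names are assumptions made for this example.

```python
# Order taken from Table 6-5, highest priority first.
PRIORITY = [
    "cold reset", "soft reset", "nmi",
    "address error (instruction fetch)",
    "tlb/xtlb refill (instruction fetch)",
    "tlb invalid (instruction fetch)",
    "bus error (instruction fetch)",
    "system call", "breakpoint", "coprocessor unusable",
    "reserved instruction", "trap", "integer overflow",
    "address error (data access)",
    "tlb/xtlb refill (data access)",
    "tlb invalid (data access)",
    "tlb modified (data write)",
    "watch", "bus error (data access)",
    "interrupt (other than nmi)",
]

def reported_exception(pending):
    """Return the highest-priority exception in the pending set, or None."""
    for name in PRIORITY:
        if name in pending:
            return name
    return None
```

If both a watch condition and an integer overflow were pending for the same instruction, this model would report the integer overflow.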
6.4 Details of exceptions

6.4.1 Cold reset exception

Cause
The cold reset exception occurs when the ColdReset# signal (internal) is asserted and then deasserted. This exception is not maskable. The Reset# signal (internal) must be asserted along with the ColdReset# signal (for details, see the hardware user's manual of each processor).

Processing
The CPU provides a special interrupt vector for this exception:
• 0xBFC0 0000 (virtual) in 32-bit mode
• 0xFFFF FFFF BFC0 0000 (virtual) in 64-bit mode

The cold reset vector resides in unmapped and uncached CPU address space, so the hardware need not initialize the TLB or the cache to process this exception. It also means the processor can fetch and execute instructions while the caches and virtual memory are in an undefined state.

The contents of all registers in the CPU are undefined when this exception occurs, except for the following register fields:
• When MIPS16 instruction execution is disabled and the ERL bit of the Status register is 0, the PC value at which the exception occurs is set in the ErrorEPC register. When MIPS16 instruction execution is enabled and the ERL bit of the Status register is 0, the PC value at which the exception occurs is set in the ErrorEPC register, and the ISA mode in which the exception occurs is set in the least significant bit of the ErrorEPC register.
• TS (VR4181 only) and SR of the Status register are cleared to 0.
• ERL and BEV of the Status register are set to 1.
• The Random register is initialized to the value of its upper bound (31).
• The Wired register and the Count register are initialized to 0.
• R and W of the WatchLo register are cleared to 0 (other than VR4181).
• IS and BP of the Config register are cleared to 0 (VR4122, VR4131, and VR4181A only).
• In the VR4121 and VR4181, bits 31 to 28 and bits 22 to 3 of the Config register are set to fixed values.
• In the VR4122, bits 30 to 28, bits 22 to 17, bits 15 to 6, bit 4, and bit 3 of the Config register are set to fixed values.
• In the VR4131 and VR4181A, bits 30 to 28, bits 22 to 17, bits 15 to 6, and bit 3 of the Config register are set to fixed values.
• All other bits are undefined.

Servicing
The cold reset exception is serviced by:
• Initializing all processor registers, coprocessor registers, the TLB, the caches, and the memory system
• Performing diagnostic tests
• Bootstrapping the operating system
6.4.2 Soft reset exception

Cause
A soft reset (sometimes called a warm reset) occurs when the ColdReset# signal remains deasserted while the Reset# signal goes from assertion to deassertion (for details, see the hardware user's manual of each processor).
A soft reset immediately resets all state machines and sets the SR bit of the Status register. Execution begins at the reset vector when the Reset# signal is deasserted. This exception is not maskable.

Caution  In the VR4100 series, a soft reset never occurs.

Processing
The CPU provides a special interrupt vector for this exception (same location as cold reset):
• 0xBFC0 0000 (virtual) in 32-bit mode
• 0xFFFF FFFF BFC0 0000 (virtual) in 64-bit mode

This vector is located within unmapped and uncached address space, so that the cache and TLB need not be initialized to process this exception. The SR bit of the Status register is set to 1 to distinguish this exception from a cold reset exception.

When this exception occurs, the contents of all registers are preserved except for the following registers:
• When MIPS16 instruction execution is disabled, the PC value at which the exception occurs is set in the ErrorEPC register. When MIPS16 instruction execution is enabled, the PC value at which the exception occurs is set in the ErrorEPC register, and the ISA mode in which the exception occurs is set in the least significant bit of the ErrorEPC register.
• The TS bit of the Status register is cleared to 0 (VR4181 only).
• The ERL, SR, and BEV bits of the Status register are set to 1.
• R and W of the WatchLo register are cleared to 0 (other than VR4181).

During a soft reset, access to the operating cache or system interface may be aborted. This means that the contents of the cache and memory will be undefined if a soft reset occurs.

Servicing
The soft reset exception is serviced by:
• Preserving the current processor states for diagnostic tests
• Reinitializing the system in the same way as for a cold reset exception
6.4.3 NMI exception

Cause
The nonmaskable interrupt (NMI) exception occurs when the NMI signal (internal) becomes active. This interrupt is not maskable; it occurs regardless of the settings of the EXL, ERL, and IE bits in the Status register (for details, see Chapter 8 CPU core interrupts).

Processing
The CPU provides a special interrupt vector for this exception:
• 0xBFC0 0000 (virtual) in 32-bit mode
• 0xFFFF FFFF BFC0 0000 (virtual) in 64-bit mode

This vector is located within unmapped and uncached address space so that the cache and TLB need not be initialized to process an NMI interrupt. The SR bit of the Status register is set to 1 to distinguish this exception from a cold reset exception.

Unlike cold reset and soft reset, but like other exceptions, NMI is taken only at instruction boundaries. The states of the caches and memory system are preserved by this exception.

When this exception occurs, the contents of all registers are preserved except for the following registers:
• When MIPS16 instruction execution is disabled, the PC value at which the exception occurs is set in the ErrorEPC register. When MIPS16 instruction execution is enabled, the PC value at which the exception occurs is set in the ErrorEPC register, and the ISA mode in which the exception occurs is set in the least significant bit of the ErrorEPC register.
• The TS bit of the Status register is cleared to 0 (VR4181 only).
• The ERL, SR, and BEV bits of the Status register are set to 1.

Servicing
The NMI exception is serviced by:
• Preserving the current processor states for diagnostic tests
• Reinitializing the system in the same way as for a cold reset exception
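Because cold reset, soft reset, and NMI all share one vector, a handler placed there has to inspect the SR bit to tell them apart. The Python sketch below models that test. The bit positions (ERL = bit 2, SR = bit 20, BEV = bit 22) are an assumption based on the usual MIPS III Status register layout and should be checked against the Status register description in this manual; the function name is hypothetical.

```python
# Assumed Status register bit positions (usual MIPS III layout).
STATUS_ERL = 1 << 2
STATUS_SR = 1 << 20
STATUS_BEV = 1 << 22

def reset_vector_cause(status):
    """Classify an entry to the reset vector from a Status register value."""
    if status & STATUS_SR:
        return "soft reset or NMI"  # SR = 1 after soft reset and NMI
    return "cold reset"             # SR is cleared to 0 by cold reset
```

Since a soft reset never occurs in the VR4100 series, SR = 1 at this vector would in practice indicate an NMI.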
6.4.4 Address error exception

Cause
The address error exception occurs when an attempt is made to execute one of the following. This exception is not maskable.
• Execution of the LW, LWU, SW, or CACHE instruction for word data that is not located on a word boundary
• Execution of the LH, LHU, or SH instruction for half-word data that is not located on a half-word boundary
• Execution of the LD or SD instruction for double-word data that is not located on a double-word boundary
• Referencing the kernel address space in user or supervisor mode
• Referencing the supervisor space in user mode
• Referencing an address that does not exist in the kernel, user, or supervisor address space in 64-bit kernel, user, or supervisor mode
• Branching to an address that is not located on a word boundary when MIPS16 instruction execution is disabled
• Branching to an address whose least significant 2 bits are 10 when MIPS16 instruction execution is enabled

Processing
The common exception vector is used for this exception. The AdEL or AdES code in the Cause register is set. If this exception has been caused by an instruction reference or load operation, AdEL is set. If it has been caused by a store operation, AdES is set.

When this exception occurs, the BadVAddr register stores the virtual address that was not properly aligned or that was referenced in protected address space. The contents of the VPN field of the Context and EntryHi registers are undefined, as are the contents of the EntryLo register.

When MIPS16 instruction execution is disabled, the EPC register contains the address of the instruction that caused the exception. However, if this instruction is in a branch delay slot, the EPC register contains the address of the preceding jump or branch instruction, and the BD bit of the Cause register is set to 1.

When MIPS16 instruction execution is enabled, the EPC register contains the address of the instruction that caused the exception, and the least significant bit stores the ISA mode in which the exception occurred. However, if this instruction is in a branch delay slot or is the instruction following the EXTEND instruction, the EPC register contains the address of the preceding jump or EXTEND instruction, and the BD bit of the Cause register is set to 1.

Servicing
The kernel reports the UNIX™ SIGSEGV (segmentation violation) signal to the current process, and this exception is usually fatal.
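The alignment conditions listed under Cause, together with the AdEL/AdES selection, can be sketched in Python as follows. The sketch covers only the load and store alignment cases, not the CACHE instruction, the privilege checks, or the branch target checks. The table and function names are hypothetical.

```python
# Access size in bytes for each load/store mnemonic named in the text.
ACCESS_SIZE = {"LW": 4, "LWU": 4, "SW": 4,
               "LH": 2, "LHU": 2, "SH": 2,
               "LD": 8, "SD": 8}
STORES = {"SW", "SH", "SD"}

def alignment_error(mnemonic, vaddr):
    """Return "AdEL", "AdES", or None for a load/store at vaddr."""
    size = ACCESS_SIZE[mnemonic]
    if vaddr % size == 0:
        return None                     # naturally aligned, no exception
    return "AdES" if mnemonic in STORES else "AdEL"
```

Under this model an LW from 0x1002 yields AdEL, and an SD to 0x1004 yields AdES because the address is not a multiple of 8.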
6.4.5 TLB exceptions

Three types of TLB exceptions can occur:
• A TLB refill exception occurs when there is no TLB entry that matches a referenced address.
• A TLB invalid exception occurs when a TLB entry that matches a referenced virtual address is marked as being invalid (with the V bit set to 0).
• A TLB modified exception occurs when a TLB entry that matches a virtual address referenced by a store instruction is marked as being valid (with the V bit set to 1) though a write to it is disabled (with the D bit set to 0).

The following three sections describe these TLB exceptions.

(1) TLB refill exception (32-bit space mode)/XTLB refill exception (64-bit space mode)

Cause
The TLB refill exception occurs when there is no TLB entry to match a reference to a mapped address space. This exception is not maskable.

Processing
There are two special exception vectors for this exception: one for references to 32-bit address spaces, and one for references to 64-bit address spaces. The UX, SX, and KX bits of the Status register determine whether the user, supervisor, or kernel address spaces referenced are 32-bit or 64-bit spaces. When the EXL bit of the Status register is set to 0, one of these two special vectors is referenced. When the EXL bit is set to 1, the common exception vector is referenced.

This exception sets the TLBL or TLBS code in the ExcCode field of the Cause register. If this exception has been caused by an instruction reference or load operation, TLBL is set. If it has been caused by a store operation, TLBS is set.

When this exception occurs, the BadVAddr, Context, XContext, and EntryHi registers hold the virtual address that failed address translation. The EntryHi register also contains the ASID from which the translation fault occurred. The Random register normally contains a valid location in which to place the replacement TLB entry. The contents of the EntryLo register are undefined.

When MIPS16 instruction execution is disabled, the EPC register contains the address of the instruction that caused the exception. However, if this instruction is in a branch delay slot, the EPC register contains the address of the preceding jump or branch instruction, and the BD bit of the Cause register is set to 1.

When MIPS16 instruction execution is enabled, the EPC register contains the address of the instruction that caused the exception, and the least significant bit stores the ISA mode in which the exception occurred. However, if this instruction is in a branch delay slot or is the instruction following the EXTEND instruction, the EPC register contains the address of the preceding jump or EXTEND instruction, and the BD bit of the Cause register is set to 1.
Servicing
To service this exception, the contents of the Context or XContext register are used as a virtual address to fetch memory words containing the physical page frame and access control bits for a pair of TLB entries. The memory words are written into the TLB entry by using the EntryLo0, EntryLo1, or EntryHi register.

It is possible that the physical page frame and access control bits are placed in a page whose virtual address is not resident in the TLB. This condition is processed by allowing a TLB refill exception in the TLB refill exception handler. In this case, the common exception vector is used because the EXL bit of the Status register is set to 1.

(2) TLB invalid exception

Cause
The TLB invalid exception occurs when the TLB entry that matches the virtual address to be referenced is invalid (the V bit is set to 0). This exception is not maskable.

Processing
The common exception vector is used for this exception. The TLBL or TLBS code in the ExcCode field of the Cause register is set. If this exception has been caused by an instruction reference or load operation, TLBL is set. If it has been caused by a store operation, TLBS is set.

When this exception occurs, the BadVAddr, Context, XContext, and EntryHi registers contain the virtual address that failed address translation. The EntryHi register also contains the ASID from which the translation fault occurred. The Random register normally stores a valid location in which to place the replacement TLB entry. The contents of the EntryLo register are undefined.

When MIPS16 instruction execution is disabled, the EPC register contains the address of the instruction that caused the exception. However, if this instruction is in a branch delay slot, the EPC register contains the address of the preceding jump or branch instruction, and the BD bit of the Cause register is set to 1.

When MIPS16 instruction execution is enabled, the EPC register contains the address of the instruction that caused the exception, and the least significant bit stores the ISA mode in which the exception occurred. However, if this instruction is in a branch delay slot or is the instruction following the EXTEND instruction, the EPC register contains the address of the preceding jump or EXTEND instruction, and the BD bit of the Cause register is set to 1.

Servicing
Usually, the V bit of a TLB entry is cleared in the following cases:
• When the virtual address does not exist
• When the virtual address exists, but is not in main memory (a page fault)
• When a trap is required on any reference to the page (for example, to maintain a reference bit)

After servicing the cause of a TLB invalid exception, the TLB entry is located with a TLBP (TLB probe) instruction, and replaced by an entry with its V bit set to 1.
(3) TLB modified exception

Cause
The TLB modified exception occurs when the TLB entry that matches the virtual address referenced by a store instruction is valid (the V bit is 1) but is not writable (the D bit is 0). This exception is not maskable.

Processing
The common exception vector is used for this exception, and the Mod code in the ExcCode field of the Cause register is set.

When this exception occurs, the BadVAddr, Context, XContext, and EntryHi registers contain the virtual address that failed address translation. The EntryHi register also contains the ASID from which the translation fault occurred. The contents of the EntryLo register are undefined.

When MIPS16 instruction execution is disabled, the EPC register contains the address of the instruction that caused the exception. However, if this instruction is in a branch delay slot, the EPC register contains the address of the preceding jump or branch instruction, and the BD bit of the Cause register is set to 1.

When MIPS16 instruction execution is enabled, the EPC register contains the address of the instruction that caused the exception, and the least significant bit stores the ISA mode in which the exception occurred. However, if this instruction is in a branch delay slot or is the instruction following the EXTEND instruction, the EPC register contains the address of the preceding jump or EXTEND instruction, and the BD bit of the Cause register is set to 1.

Servicing
The kernel uses the failed virtual address or virtual page number to identify the corresponding access control bits. The page identified may or may not permit write accesses; if writes are not permitted, a write protection violation occurs.

If write accesses are permitted, the page frame is marked dirty (i.e. writable) by the kernel in its own data structures. The TLBP instruction places the index of the TLB entry that must be altered into the Index register. The word data containing the physical page frame and access control bits (with the D bit set to 1) is loaded into the EntryLo register, and the contents of the EntryHi and EntryLo registers are written into the TLB.
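How the three TLB exceptions are distinguished can be summarized in a small Python model. It represents a matching TLB entry as a dictionary with "V" and "D" keys, or None when no entry matches; this representation and the function name tlb_exception are assumptions made for illustration only.

```python
def tlb_exception(entry, is_store):
    """Return the TLB exception raised by a mapped reference, or None.

    entry is None when no TLB entry matches the referenced address,
    otherwise a dict holding the entry's "V" (valid) and "D" (dirty) bits.
    """
    if entry is None:
        return "TLB refill"      # no matching entry
    if entry["V"] == 0:
        return "TLB invalid"     # match found, but marked invalid
    if is_store and entry["D"] == 0:
        return "TLB modified"    # valid entry, write not permitted
    return None                  # translation succeeds
```

A load through a valid entry with D = 0 raises nothing in this model, since the D bit is only checked for stores.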
6.4.6 Bus error exception

Cause
A bus error exception is raised by board-level circuitry for events such as bus time-out, local bus parity errors, and invalid physical memory addresses or access types. This exception is not maskable.

A bus error exception occurs only when a cache miss refill, uncached reference, or unbuffered write occurs synchronously.

Processing
The common exception vector is used for a bus error exception. The IBE or DBE code in the ExcCode field of the Cause register is set, signifying whether the exception was caused by an instruction reference, a load operation, or a store operation.

When MIPS16 instruction execution is disabled, the EPC register contains the address of the instruction that caused the exception. However, if this instruction is in a branch delay slot, the EPC register contains the address of the preceding jump or branch instruction, and the BD bit of the Cause register is set to 1.

When MIPS16 instruction execution is enabled, the EPC register contains the address of the instruction that caused the exception, and the least significant bit stores the ISA mode in which the exception occurred. However, if this instruction is in a branch delay slot or is the instruction following the EXTEND instruction, the EPC register contains the address of the preceding jump or EXTEND instruction, and the BD bit of the Cause register is set to 1.

Note that in the VR4122, VR4131, and VR4181A, the EPC register may indicate a succeeding instruction instead of the instruction that caused the exception if the instruction streaming function is on.

Servicing
The physical address at which the fault occurred can be computed from information available in the system control coprocessor (CP0) registers.
• If the IBE code in the Cause register is set (indicating an instruction fetch), the virtual address is contained in the EPC register.
• If the DBE code is set (indicating a load or store), the virtual address of the instruction that caused the exception is saved in the EPC register. The virtual address of the load or store reference can then be obtained by interpreting the instruction. The physical address can be obtained by using the TLBP instruction and reading the EntryLo register to compute the physical page number.

At the time of this exception, the kernel reports the UNIX SIGBUS (bus error) signal to the current process, and this exception is usually fatal.
6.4.7 System call exception

Cause
A system call exception occurs during an attempt to execute the SYSCALL instruction. This exception is not maskable.

Processing
The common exception vector is used for this exception, and the Sys code in the ExcCode field of the Cause register is set.

The EPC register contains the address of the SYSCALL instruction unless it is in a branch delay slot, in which case the EPC register contains the address of the preceding branch instruction.

If the SYSCALL instruction is in a branch delay slot, the BD bit of the Cause register is set to 1; otherwise this bit is cleared.

Servicing
When this exception occurs, control is transferred to the applicable system routine.

To resume execution, the EPC register must be altered so that the SYSCALL instruction does not re-execute; this is accomplished by adding a value of 4 to the EPC register before returning.

If a SYSCALL instruction is in a branch delay slot, interpretation of the branch instruction is required to resume execution.
6.4.8 Breakpoint exception

Cause
A breakpoint exception occurs when an attempt is made to execute the BREAK instruction. This exception is not maskable.

Processing
The common exception vector is used for this exception, and the Bp code in the ExcCode field of the Cause register is set.

When MIPS16 instruction execution is disabled, the EPC register contains the address of the instruction that caused the exception. However, if this instruction is in a branch delay slot, the EPC register contains the address of the preceding jump or branch instruction, and the BD bit of the Cause register is set to 1.

When MIPS16 instruction execution is enabled, the EPC register contains the address of the instruction that caused the exception, and the least significant bit stores the ISA mode in which the exception occurred. However, if this instruction is in a branch delay slot or is the instruction following the EXTEND instruction, the EPC register contains the address of the preceding jump or EXTEND instruction, and the BD bit of the Cause register is set to 1.

If the BREAK instruction is in a branch delay slot, the BD bit of the Cause register is set to 1; otherwise this bit is cleared.

Servicing
When the breakpoint exception occurs, control is transferred to the applicable system routine. Additional distinctions can be made by loading the instruction whose address the EPC register contains and analyzing the unused bits of the BREAK instruction (bits 25 to 6). A value of 4 must be added to the contents of the EPC register to locate the instruction if it resides in a branch delay slot.

To resume execution, the EPC register must be altered so that the BREAK instruction does not re-execute; this is accomplished by adding a value of 4 to the EPC register before returning.

When a breakpoint exception occurs while executing a MIPS16 instruction, a value of 2 should be added to the EPC register before returning.

If a BREAK instruction is in a branch delay slot, interpretation (decoding) of the branch instruction is required to resume execution.
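The two servicing steps described above, reading the unused field of the BREAK instruction and advancing the EPC, can be sketched in Python. The helper names are hypothetical, and the sketch assumes the BREAK instruction is not in a branch delay slot. The function field value 0x0D used in the usage note is the standard MIPS encoding of BREAK and is not stated in this section.

```python
def break_code(instruction):
    """Return the 20-bit field in bits 25 to 6 of a 32-bit BREAK instruction."""
    return (instruction >> 6) & 0xFFFFF

def resume_address(epc, mips16=False):
    """Return the EPC value to use when returning past a BREAK instruction.

    Adds 4 for a 32-bit instruction and 2 for a MIPS16 instruction.
    Bit 0 of EPC (the ISA mode bit) is left unchanged by the addition.
    """
    return epc + (2 if mips16 else 4)
```

For example, an instruction word built as (7 << 6) | 0x0D carries the code 7, and a 32-bit BREAK at 0x8000 1000 resumes at 0x8000 1004.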
6.4.9 Coprocessor unusable exception

Cause
The coprocessor unusable exception occurs when an attempt is made to execute a coprocessor instruction for either:
• A corresponding coprocessor unit that has not been marked usable (Status register bit, CU0 = 0), or
• CP0 instructions, when the unit has not been marked usable (Status register bit, CU0 = 0) and the process executes in user or supervisor mode.

This exception is not maskable.

Processing
The common exception vector is used for this exception, and the CpU code in the ExcCode field of the Cause register is set. The CE bit of the Cause register indicates which of the four coprocessors was referenced.

When MIPS16 instruction execution is disabled, the EPC register contains the address of the instruction that caused the exception. However, if this instruction is in a branch delay slot, the EPC register contains the address of the preceding jump or branch instruction, and the BD bit of the Cause register is set to 1.

When MIPS16 instruction execution is enabled, the EPC register contains the address of the instruction that caused the exception, and the least significant bit stores the ISA mode in which the exception occurred. However, if this instruction is in a branch delay slot or is the instruction following the EXTEND instruction, the EPC register contains the address of the preceding jump or EXTEND instruction, and the BD bit of the Cause register is set to 1.

Servicing
The coprocessor unit to which an attempted reference was made is identified by the CE bit of the Cause register. The handler performs one of the following:
• If the process is entitled to access the coprocessor, the coprocessor is marked usable and the corresponding state is restored to the coprocessor.
• If the process is entitled to access the coprocessor, but the coprocessor does not exist or has failed, interpretation of the coprocessor instruction is possible.
• If the BD bit in the Cause register is set to 1, the branch instruction must be interpreted; then the coprocessor instruction can be emulated and execution resumed with the EPC register advanced past the coprocessor instruction.
• If the process is not entitled to access the coprocessor, the kernel reports the UNIX SIGILL/ILL_PRIVIN_FAULT (illegal instruction/privileged instruction fault) signal to the current process, and this exception is fatal.
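A handler for the common exception vector typically starts by decoding the Cause register. The Python sketch below shows one way to extract the BD, CE, and ExcCode fields. The bit positions (BD = bit 31, CE = bits 29 to 28, ExcCode = bits 6 to 2) and the numeric ExcCode values are assumptions based on the usual MIPS III Cause register layout and should be checked against the Cause register description in this manual. The function name decode_cause is hypothetical.

```python
# Assumed ExcCode encodings (usual MIPS III values).
EXC_CODES = {
    0: "Int", 1: "Mod", 2: "TLBL", 3: "TLBS", 4: "AdEL", 5: "AdES",
    6: "IBE", 7: "DBE", 8: "Sys", 9: "Bp", 10: "RI", 11: "CpU",
    12: "Ov", 13: "Tr", 23: "WATCH",
}

def decode_cause(cause):
    """Split a 32-bit Cause register value into its BD, CE, and ExcCode fields."""
    return {
        "bd": (cause >> 31) & 0x1,   # exception taken in a branch delay slot
        "ce": (cause >> 28) & 0x3,   # coprocessor number for CpU exceptions
        "exccode": EXC_CODES.get((cause >> 2) & 0x1F, "reserved"),
    }
```

With these assumptions, a Cause value with bit 31 set, CE = 1, and ExcCode = 11 would describe a coprocessor unusable exception for coprocessor 1 taken in a branch delay slot.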
6.4.10 Reserved instruction exception

Cause
The reserved instruction exception occurs when an attempt is made to execute one of the following instructions:
• Instruction with an undefined major opcode (bits 31 to 26)
• SPECIAL instruction with an undefined minor opcode (bits 5 to 0)
• REGIMM instruction with an undefined minor opcode (bits 20 to 16)
• 64-bit instructions in 32-bit user or supervisor mode
• RR instruction with an undefined minor opcode (bits 4 to 0) when executing a MIPS16 instruction
• I8 instruction with an undefined minor opcode (bits 10 to 8) when executing a MIPS16 instruction

64-bit operations are always valid in kernel mode regardless of the value of the KX bit in the Status register. This exception is not maskable.

Processing
The common exception vector is used for this exception, and the RI code in the ExcCode field of the Cause register is set.

When MIPS16 instruction execution is disabled, the EPC register contains the address of the instruction that caused the exception. However, if this instruction is in a branch delay slot, the EPC register contains the address of the preceding jump or branch instruction, and the BD bit of the Cause register is set to 1.

When MIPS16 instruction execution is enabled, the EPC register contains the address of the instruction that caused the exception, and the least significant bit stores the ISA mode in which the exception occurred. However, if this instruction is in a branch delay slot or is the instruction following the EXTEND instruction, the EPC register contains the address of the preceding jump or EXTEND instruction, and the BD bit of the Cause register is set to 1.

Servicing
All currently defined MIPS ISA instructions can be executed. The process executing at the time of this exception is sent a UNIX SIGILL/ILL_RESOP_FAULT (illegal instruction/reserved operand fault) signal. This error is usually fatal.
6.4.11 Trap exception

Cause
The trap exception occurs when a TGE, TGEU, TLT, TLTU, TEQ, TNE, TGEI, TGEIU, TLTI, TLTIU, TEQI, or TNEI instruction results in a true condition. This exception is not maskable.

Processing
The common exception vector is used for this exception, and the Tr code in the ExcCode field of the Cause register is set.

The EPC register contains the address of the trap instruction causing the exception unless the instruction is in a branch delay slot, in which case the EPC register contains the address of the preceding branch instruction and the BD bit of the Cause register is set to 1.

Servicing
At the time of a trap exception, the kernel reports the UNIX SIGFPE/FPE_INTOVF_TRAP (floating-point exception/integer overflow) signal to the current process, and this exception is usually fatal.

6.4.12 Integer overflow exception

Cause
An integer overflow exception occurs when an ADD, ADDI, SUB, DADD, DADDI, or DSUB instruction results in a 2's complement overflow. This exception is not maskable.

Processing
The common exception vector is used for this exception, and the Ov code in the ExcCode field of the Cause register is set.

The EPC register contains the address of the instruction that caused the exception unless the instruction is in a branch delay slot, in which case the EPC register contains the address of the preceding branch instruction and the BD bit of the Cause register is set to 1.

Servicing
At the time of the exception, the kernel reports the UNIX SIGFPE/FPE_INTOVF_TRAP (floating-point exception/integer overflow) signal to the current process, and this exception is usually fatal.
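The 2's complement overflow condition for a 32-bit addition, as would apply to an ADD instruction, can be checked as in the following Python sketch. The function name add32_overflows is hypothetical, and the operands are assumed to be given as signed values already within the 32-bit range.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1

def add32_overflows(a, b):
    """Return True if a + b cannot be represented as a signed 32-bit value.

    a and b are Python ints assumed to lie in INT32_MIN..INT32_MAX.
    """
    total = a + b
    return total < INT32_MIN or total > INT32_MAX
```

Adding 1 to 0x7FFF FFFF overflows in this model, as does adding -1 to the most negative 32-bit value.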
6.4.13 Watch Exception

Cause
A Watch exception occurs when a load or store instruction references the physical address specified by the WatchLo/WatchHi registers. The WatchLo/WatchHi registers specify whether a load, a store, or both can initiate this exception.
- When the R bit of the WatchLo register is set to 1: load instruction
- When the W bit of the WatchLo register is set to 1: store instruction
- When both the R bit and W bit of the WatchLo register are set to 1: load instruction or store instruction
The CACHE instruction never causes a Watch exception.
The Watch exception is postponed while the EXL bit in the Status register is set to 1. It can be masked by setting the EXL bit in the Status register to 1 or by setting the R or W bit in the WatchLo register to 0.

Processing
The common exception vector is used for this exception, and the WATCH code in the ExcCode field of the Cause register is set.
When MIPS16 instructions are disabled, the EPC register contains the address of the instruction that caused the exception. However, if this instruction is in a branch delay slot, the EPC register contains the address of the preceding jump or branch instruction, and the BD bit of the Cause register is set to 1.
When MIPS16 instructions are enabled, the EPC register contains the address of the instruction that caused the exception, and the least significant bit stores the ISA mode in which the exception occurred. However, if this instruction is in a branch delay slot or is the instruction following an EXTEND instruction, the EPC register contains the address of the preceding jump or EXTEND instruction, and the BD bit of the Cause register is set to 1.

Servicing
The Watch exception is a debugging aid; typically the exception handler transfers control to a debugger, allowing the user to examine the situation.
To continue, the Watch exception must first be disabled so that the faulting instruction can be executed. The Watch exception must then be re-enabled. The faulting instruction can be executed either by the debugger or by setting breakpoints.
The contents of the WatchLo/WatchHi registers are undefined after reset. Software must therefore initialize them, especially the R and W bits; otherwise a Watch exception may occur after reset.
6.4.14 Interrupt Exception

Cause
The Interrupt exception occurs when one of the eight interrupt conditions (see Note) is asserted. In the VR4100 Series, interrupt requests from internal peripheral units first enter the ICU and are then reported to the CPU core via one of five interrupt sources (Int(4:0)) or NMI.
Each of the eight interrupts can be masked by clearing the corresponding bit in the IM field of the Status register, and all eight interrupts can be masked at once by clearing the IE bit of the Status register or by setting the EXL or ERL bit.

Note: The eight conditions are 1 timer interrupt, 5 ordinary interrupts, and 2 software interrupts. Of the five ordinary interrupts, Int3 becomes active in the VR4121 and VR4181A only, and Int4 in the VR4181A only. For details about the interrupt control unit (ICU), refer to the Hardware User's Manual of each processor.

Processing
The common exception vector is used for this exception, and the Int code in the ExcCode field of the Cause register is set.
The IP field of the Cause register indicates the current interrupt requests. More than one of these bits can be set (or cleared) simultaneously if an interrupt request signal is asserted (or deasserted) before this register is read.
When MIPS16 instructions are disabled, the EPC register contains the address of the instruction that caused the exception. However, if this instruction is in a branch delay slot, the EPC register contains the address of the preceding jump or branch instruction, and the BD bit of the Cause register is set to 1.
When MIPS16 instructions are enabled, the EPC register contains the address of the instruction that caused the exception, and the least significant bit stores the ISA mode in which the exception occurred. However, if this instruction is in a branch delay slot or is the instruction following an EXTEND instruction, the EPC register contains the address of the preceding jump or EXTEND instruction, and the BD bit of the Cause register is set to 1.

Servicing
If the interrupt is caused by one of the two software-generated exceptions, the interrupt condition is cleared by setting the corresponding Cause register bit to 0.
If the interrupt is caused by hardware, the interrupt condition is cleared by deactivating the corresponding interrupt request signal.
6.5 Exception Processing and Servicing Flowcharts

The remainder of this chapter contains flowcharts for the following exceptions and guidelines for their handlers:
- Common exceptions and a guideline to their exception handler
- TLB/XTLB Refill exception and a guideline to their exception handler
- Cold Reset, Soft Reset, and NMI exceptions, and a guideline to their handler
Figure 6-17. Common Exception Handling (1/2)
(a) Processing by hardware

[Flowchart. Elements recoverable from the source:]
- Set the ExcCode and CE fields. The EntryHi (VPN2, ASID) and XContext/Context (VPN2) registers are set when a TLB Refill, TLB Invalid, or TLB Modified exception occurs. The BadVAddr register is set only when a TLB Refill, TLB Invalid, or TLB Modified exception occurs (it is not set when a Bus Error exception occurs).
- Check for multiple exceptions.
- EXL bit = 0? If yes, EPC and the BD bit are updated:
  - M16 bit = 0: if the instruction is in a branch delay slot, BD bit ← 1 and EPC ← PC − 4; otherwise BD bit ← 0 and EPC ← PC.
  - M16 bit = 1: if the instruction is in a branch delay slot, BD bit ← 1, EPC ← PC − 4 (Note 1), EIM bit ← 0/1; otherwise BD bit ← 0, EPC ← PC (Note 2), EIM bit ← 0/1.
- EXL bit ← 1. Kernel mode is set and interrupts are disabled.
- BEV bit = 0? If yes (normal), PC ← 0xFFFF FFFF 8000 0000 + 0x180 (unmapped, cacheable). If no (bootstrap), PC ← 0xFFFF FFFF BFC0 0200 + 0x180 (unmapped, uncacheable).

Notes 1. PC − 2 for the JR or JALR instruction of the MIPS16 instructions
      2. PC − 2 for the EXTEND instruction of the MIPS16 instructions

Remark: Interrupts can be masked by setting the IE or IM bits. The Watch exception can be placed in a pending state by setting the EXL bit to 1.
Figure 6-17. Common Exception Handling (2/2)
(b) Servicing by software

[Flowchart. Elements recoverable from the source:]
- EXL bit = 1 on entry. The occurrence of TLB Refill, TLB Invalid, and TLB Modified exceptions is disabled by using an unmapped space. The occurrence of the Watch and Interrupt exceptions is disabled by setting EXL = 1. Other exceptions are avoided in the OS programs. The Cold Reset, Soft Reset, and NMI exceptions are enabled.
- Execute MFC0 instructions: XContext/Context register, EPC register, Status register, Cause register. The register files are saved.
- Execute MTC0 instruction (Status register setting): KSU bits ← 00, EXL bit ← 0, IE bit ← 1. In Kernel mode, interrupts are enabled. After EXL = 0 is set, all exceptions are enabled (although the Interrupt exception can be masked by the IE and IM bits).
- Check the Cause register, and jump to each routine.
- Servicing by each exception routine.
- Execute MTC0 instructions: EPC register, Status register.
- TS bit = 0? If no, the processor is reset (VR4181 only).
- Execute ERET instruction: PC ← EPC register, EXL bit ← 0. Executing the ERET instruction in the delay slot of another jump instruction is not allowed. The processor does not execute an instruction in the branch delay slot of the ERET instruction.
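The "check the Cause register, and jump to each routine" step can be sketched in C as follows. The field positions (ExcCode in Cause bits 6 to 2, BD in bit 31) and the numeric codes are assumptions taken from the conventional MIPS III Cause register layout and are not stated in this section; verify them against the Cause register description before relying on them. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed ExcCode values (conventional MIPS III encoding). */
enum { EXC_INT = 0, EXC_RI = 10, EXC_OV = 12, EXC_TR = 13, EXC_WATCH = 23 };

/* Extract the ExcCode field, assumed to be Cause bits 6 to 2. */
static unsigned cause_exccode(uint32_t cause) { return (cause >> 2) & 0x1Fu; }

/* Extract the BD bit, assumed to be Cause bit 31. */
static unsigned cause_bd(uint32_t cause) { return cause >> 31; }

/* Map a Cause value to the name of the routine that would service it. */
static const char *exception_name(uint32_t cause)
{
    switch (cause_exccode(cause)) {
    case EXC_INT:   return "Interrupt";
    case EXC_RI:    return "Reserved Instruction";
    case EXC_OV:    return "Integer Overflow";
    case EXC_TR:    return "Trap";
    case EXC_WATCH: return "Watch";
    default:        return "other";
    }
}
```

A real handler would jump to a service routine instead of returning a string, and would use the BD bit to decide whether EPC points at the faulting instruction or at the preceding branch.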
Figure 6-18. TLB/XTLB Refill Exception Handling (1/2)
(a) Processing by hardware

[Flowchart. Elements recoverable from the source:]
- Set the ExcCode and CE fields; EntryHi ← VPN2, ASID; XContext/Context ← VPN2.
- Check for multiple exceptions.
- EXL bit = 0? If yes, EPC and the BD bit are updated:
  - M16 bit = 0: if the instruction is in a branch delay slot, BD bit ← 1 and EPC ← PC − 4; otherwise BD bit ← 0 and EPC ← PC.
  - M16 bit = 1: if the instruction is in a branch delay slot, BD bit ← 1, EPC ← PC − 4 (Note 1), EIM bit ← 0/1; otherwise BD bit ← 0, EPC ← PC (Note 2), EIM bit ← 0/1.
- Vector offset selection: XTLB exception? If yes, XTLB Refill vector offset = 0x080; if no, TLB Refill vector offset = 0x000. When the EXL bit was not 0, vector offset = 0x180.
- EXL bit ← 1. Kernel mode is set and interrupts are disabled.
- BEV bit = 0? If yes (normal), PC ← 0xFFFF FFFF 8000 0000 + vector offset (unmapped, cacheable). If no (bootstrap), PC ← 0xFFFF FFFF BFC0 0200 + vector offset (unmapped, uncacheable).

Notes 1. PC − 2 for the JR or JALR instruction of the MIPS16 instructions
      2. PC − 2 for the EXTEND instruction of the MIPS16 instructions
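The vector selection shown in Figures 6-17 and 6-18 can be summarized in a small C function. This is a sketch based on a reading of the flowcharts; the enum and function names are hypothetical, and the rule that a refill taken with EXL already set uses offset 0x180 is inferred from the figure.

```c
#include <assert.h>
#include <stdint.h>

/* Kinds of exception that matter for vector selection (hypothetical names). */
enum exc_kind { EXC_COMMON, EXC_TLB_REFILL, EXC_XTLB_REFILL };

/* Return the 64-bit address the PC is loaded with when an exception is taken.
 * bev:          value of the BEV bit in the Status register
 * exl_was_set:  value of the EXL bit before the exception was taken */
static uint64_t exception_vector(int bev, int exl_was_set, enum exc_kind kind)
{
    uint64_t base = bev ? 0xFFFFFFFFBFC00200ull   /* bootstrap, uncacheable */
                        : 0xFFFFFFFF80000000ull;  /* normal, cacheable */
    uint64_t offset = 0x180;                      /* common exception vector */

    /* The dedicated refill vectors are used only when EXL was 0. */
    if (!exl_was_set && kind == EXC_TLB_REFILL)
        offset = 0x000;
    else if (!exl_was_set && kind == EXC_XTLB_REFILL)
        offset = 0x080;

    return base + offset;
}
```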
Figure 6-18. TLB/XTLB Refill Exception Handling (2/2)
(b) Servicing by software

[Flowchart. Elements recoverable from the source:]
- The occurrence of TLB Refill, TLB Invalid, and TLB Modified exceptions is disabled by using an unmapped space. The occurrence of the Watch and Interrupt exceptions is disabled by setting EXL = 1. Other exceptions are avoided in the OS programs. However, the Cold Reset, Soft Reset, and NMI exceptions are enabled.
- Execute MFC0 instruction: XContext/Context register.
- Servicing by each exception routine (see Note). The physical address for the virtual address that is loaded into the Context register is loaded into the EntryLo register and written to the TLB.
- Execute ERET instruction: PC ← EPC register, EXL bit ← 0. Executing the ERET instruction in the branch delay slot of another jump instruction is not allowed. The processor does not execute an instruction in the branch delay slot of the ERET instruction.

Note: As long as a data/instruction address exists in the mapped space, another TLB Refill exception may occur. In such a case, EXL = 1 is set, causing a jump to the common exception vector. The common exception handler then handles the TLB miss, the ERET instruction returns control to the user program, and a TLB Refill exception is generated again.
Figure 6-19. Cold Reset Exception Handling

[Flowchart. Elements recoverable from the source:]
Hardware:
- ERL bit = 0? M16 bit = 0? Instruction in branch delay slot?
  - M16 bit = 0: in a branch delay slot, BD bit ← 1 and ErrorEPC ← PC − 4; otherwise BD bit ← 0 and ErrorEPC ← PC.
  - M16 bit = 1: in a branch delay slot, BD bit ← 1, ErrorEPC ← PC − 4 (Note 1), ERIM bit ← 0/1; otherwise BD bit ← 0, ErrorEPC ← PC (Note 2), ERIM bit ← 0/1.
- Random register ← 31, Wired register ← 0, Count register ← 0.
- Update Config register bits (refer to 6.4.1 for the Config register bits to be updated).
- Set WatchLo register: R bit ← 0, W bit ← 0 (processors other than the VR4181).
- Set Status register: BEV bit ← 1, SR bit ← 0, TS bit ← 0 (VR4181 only), ERL bit ← 1.
- PC ← 0xFFFF FFFF BFC0 0000.
Software:
- SR bit = 1? If no, servicing by the Cold Reset exception routine.
- If yes, NMI? If yes, servicing by the NMI exception routine; if no, servicing by the Soft Reset exception routine. The processor provides no means of distinguishing between an NMI exception and a Soft Reset exception, so this must be determined at the system level.
- Execute ERET instruction.

Notes 1. PC − 2 for the JR or JALR instruction of the MIPS16 instructions
      2. PC − 2 for the EXTEND instruction of the MIPS16 instructions
Figure 6-20. Soft Reset and NMI Exception Handling

[Flowchart. Elements recoverable from the source:]
Hardware:
- ERL bit = 0? M16 bit = 0? Instruction in branch delay slot?
  - M16 bit = 0: in a branch delay slot, BD bit ← 1 and ErrorEPC ← PC − 4; otherwise BD bit ← 0 and ErrorEPC ← PC.
  - M16 bit = 1: in a branch delay slot, BD bit ← 1, ErrorEPC ← PC − 4 (Note 1), ERIM bit ← 0/1; otherwise BD bit ← 0, ErrorEPC ← PC (Note 2), ERIM bit ← 0/1.
- Set WatchLo register: R bit ← 0, W bit ← 0 (only for Soft Reset, and only for processors other than the VR4181).
- Set Status register: BEV bit ← 1, SR bit ← 1, TS bit ← 0 (VR4181 only), ERL bit ← 1.
- PC ← 0xFFFF FFFF BFC0 0000.
Software:
- SR bit = 1? If no, servicing by the Cold Reset exception routine.
- If yes, NMI? If yes, servicing by the NMI exception routine; if no, servicing by the Soft Reset exception routine. The processor provides no means of distinguishing between an NMI exception and a Soft Reset exception, so this must be determined at the system level.
- Execute ERET instruction.

Notes 1. PC − 2 for the JR or JALR instruction of the MIPS16 instructions
      2. PC − 2 for the EXTEND instruction of the MIPS16 instructions
CHAPTER 7 CACHE MEMORY

This chapter describes in detail the cache memory of the VR4100 Series: its place in the CPU core memory organization, and the individual organization of the caches.
This chapter uses the following terminology:
- The data cache may also be referred to as the D-cache.
- The instruction cache may also be referred to as the I-cache.
These terms are used interchangeably throughout this manual.

7.1 Memory Organization

Figure 7-1 shows the CPU core system memory hierarchy. In the logical memory hierarchy, the caches lie between the CPU and main memory. They are designed to make the speedup of memory accesses transparent to the user.
Each functional block in Figure 7-1 has the capacity to hold more data than the block above it. For instance, physical main memory has a larger capacity than the caches. At the same time, each functional block takes longer to access than any block above it. For instance, it takes longer to access data in main memory than in the CPU on-chip registers.

Figure 7-1. Logical Hierarchy of Memory
[Figure: from top to bottom, registers in the CPU core; caches (instruction cache and data cache); memory (main memory); media (disks, CD-ROMs, tapes, etc.). Access time is faster toward the top; data capacity increases toward the bottom.]
7.1.1 On-chip caches

The CPU core has two on-chip caches: one holds instructions (the instruction cache), the other holds data (the data cache). The instruction and data caches can be read in one PClock cycle.
Two PClock cycles are needed to write data. However, data writes are pipelined and can complete at a rate of one per PClock cycle. In the first stage, the store address is translated and the tag is checked; in the second stage, the data is written into the data RAM.
Figure 7-2 shows the relationship between the caches and memory.

Figure 7-2. On-chip Caches and Main Memory
[Figure: the CPU core contains the cache controller, the data cache, and the instruction cache, and is connected to main memory.]

The on-chip caches have the following characteristics:
- They are indexed with a virtual address.
- They hold the physical address as a tag.
- They maintain coherency with memory by writeback.
The caches of the VR4121, VR4122, VR4181, and VR4181A are direct mapped, whereas those of the VR4131 are 2-way set associative. In addition, the caches of the VR4131 have a line lock function.
7.2 Cache Organization

This section describes the organization of the on-chip data and instruction caches.
A cache consists of blocks called cache lines. A cache line is the smallest unit of information that can be fetched from main memory into a cache. Each cache line has a tag field and a data field.
Either of two line sizes can be selected by setting the Config register of CP0 for the instruction cache line of the VR4122 and for the instruction and data cache lines of the VR4131.

7.2.1 Instruction cache line

Figure 7-3 shows the format of a 4-word (16-byte) I-cache line.

Figure 7-3. Instruction Cache Line Format
[Figure. (a) VR4121, VR4122, VR4181: the tag consists of the PTag and V fields; the data field is 128 bits (four 32-bit words, bits 127 to 96, 95 to 64, 63 to 32, and 31 to 0).
(b) VR4131: the tag consists of the PTag, V, and L fields; the data field is 128 bits (four 32-bit words).]
V : Valid bit (line status)
L : Lock bit (line lock status)
PTag : Physical tag (bits 31 to 10 of physical address)
Data : Cache data

Remarks 1. In the VR4181A, the data field has 256 bits since the line size is 8 words (32 bytes), though the tag format is the same as that of the VR4121, VR4122, and VR4181.
        2. When the line size is specified as 8 words (32 bytes) in the VR4122 or VR4131, the data field becomes 256 bits wide.
7.2.2 Data cache line

Figure 7-4 shows the format of a 4-word (16-byte) D-cache line.

Figure 7-4. Data Cache Line Format
[Figure. (a) VR4121, VR4122, VR4181: the tag consists of the W, V, D, and PTag fields; the data field is 128 bits (two 64-bit doublewords, bits 127 to 64 and 63 to 0).
(b) VR4131: the tag consists of the W, V, L, and PTag fields; the data field is 128 bits (two 64-bit doublewords).]
W : Write-back bit (set if cache line has been written)
V : Valid bit (line status)
D : Dirty bit (write status)
L : Lock bit (line lock status)
PTag : Physical tag (bits 31 to 10 of physical address)
Data : D-cache data

Remarks 1. In the VR4181A, the data field has 256 bits since the line size is 8 words (32 bytes), though the tag format is the same as that of the VR4121, VR4122, and VR4181.
        2. When the line size is specified as 8 words (32 bytes) in the VR4131, the data field becomes 256 bits wide.
7.2.3 Placement of cache data

The caches of the VR4121, VR4122, VR4181, and VR4181A are direct mapped, whereas those of the VR4131 are 2-way set associative.

(1) Direct mapping
In this format, a cache is treated as one block of memory space, and cache lines are placed linearly.

(2) 2-way set associative
In this format, the memory space of a cache is divided into two blocks (ways), and two cache lines are placed at the same index (one in each way).

7.3 Cache Operations

As described earlier, caches provide fast temporary data storage, and they make the speedup of memory accesses transparent to the user. In general, the CPU core accesses cache-resident instructions or data through the following procedure:
1. The CPU core, through the on-chip cache controller, attempts to access the next instruction or data in the appropriate cache.
2. The cache controller checks whether this instruction or data is present in the cache.
   - If the instruction/data is present, the CPU core retrieves it. This is called a cache hit.
   - If the instruction/data is not present in the cache, the cache controller must retrieve it from memory. This is called a cache miss.
3. The CPU core retrieves the instruction/data from the cache and operation continues.
The same data can be in two places simultaneously: main memory and cache. This data is kept consistent through the use of a writeback methodology; that is, modified data is not written back to memory until the cache line is to be replaced.
7.3.1 Cache data coherency

The CPU core of the VR4100 Series manages its data cache by using a writeback policy; that is, it stores write data into the cache instead of writing it directly to memory. Some time later this data is independently written into memory. In the VR4100 Series implementation, a modified cache line is not written back to memory until the cache line is to be replaced.
When the CPU core writes a cache line back to memory, it does not ordinarily retain a copy of the cache line, and the state of the cache line is changed to invalid.

Remark: In contrast to writeback, the write-through cache policy stores write data into memory and the cache simultaneously.

(1) VR4121, VR4122, VR4181, and VR4181A
On a store miss writeback, the data tag is checked and the data is transferred to the write buffer. If an error is detected in the data field, the writeback is not terminated; the erroneous data is still written out to main memory. If an error is detected in the tag field, the writeback bus cycle is not issued.
The cache data may not be checked during a cache operation.

(2) VR4131
On a store miss writeback, the data tag is checked, a refill request is issued, and the data is transferred to the write buffer. The writeback is performed after the refill is completed.

7.3.2 Replacement of cache line

When a cache miss occurs, or when the Fill operation (instruction cache only) or the Fetch_and_Lock operation (VR4131 only) of the CACHE instruction is executed, one of the cache lines is overwritten with data read from main memory. This overwriting is called replacement of a cache line.
The on-chip caches of the VR4131 are 2-way set associative, with two cache lines placed at each index. When a cache miss occurs, the way to be replaced is determined by the LRU (least recently used) algorithm. The selected way is indicated in the TagLo register of CP0.
The on-chip caches of the VR4131 also have a line lock function. If a line is locked when it is placed, it is not replaced even when a cache miss occurs. Cache line locking is set or cancelled with the CACHE instruction, and the locking status is indicated in the TagLo register of CP0.
7.3.3 Accessing the caches

The CACHE instruction is used to change cache line states or to write back cache data (for details, refer to CHAPTER 9 CPU INSTRUCTION SET DETAILS).
Some bits of the virtual address (VA) are used to index into the caches. The number of virtual address bits used to index the instruction and data caches depends on the cache size. In addition, bit 13 of the virtual address specifies the way to be accessed in the VR4131.

Table 7-1. Cache Size, Line Size, and Index

Processor  Cache        Cache size  Line size           Index
VR4121     Instruction  16 KB       4 words             VA(13:4)
           Data         8 KB        4 words             VA(12:4)
VR4122     Instruction  32 KB       4 words or 8 words  VA(14:4)
           Data         16 KB       4 words             VA(13:4)
VR4131     Instruction  16 KB       4 words or 8 words  VA(12:4)
           Data         16 KB       4 words or 8 words  VA(12:4)
VR4181     Instruction  4 KB        4 words             VA(11:4)
           Data         4 KB        4 words             VA(11:4)
VR4181A    Instruction  8 KB        8 words             VA(12:5)
           Data         8 KB        8 words             VA(12:5)

Figure 7-5 shows the index into the caches and the data output.

Figure 7-5. Cache Index and Data Output
[Figure: the cache index from the internal address bus selects a tag line (W, V, D, L, PTag) and a data line in cache memory; data is output to the internal data bus, 64 bits wide for the data cache and 32 bits wide for the instruction cache.]
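The index ranges in Table 7-1 follow from the cache size, the line size, and the number of ways. The C sketch below derives the range and extracts the index from a virtual address; the type and function names are hypothetical, and it has been checked only against the 4-word rows and the VR4181A rows of the table.

```c
#include <assert.h>
#include <stdint.h>

/* Highest and lowest virtual address bit used as the cache index. */
struct index_range { unsigned hi, lo; };

/* Integer log2 of a power of two. */
static unsigned log2u(uint32_t x)
{
    unsigned n = 0;
    while (x >>= 1)
        n++;
    return n;
}

/* Derive VA(hi:lo) from the cache geometry. ways is 1 for a direct-mapped
 * cache and 2 for the 2-way set associative caches of the VR4131. */
static struct index_range cache_index_bits(uint32_t cache_bytes,
                                           uint32_t line_bytes,
                                           uint32_t ways)
{
    struct index_range r;
    assert(ways > 0 && line_bytes > 0 && cache_bytes >= ways * line_bytes);
    r.lo = log2u(line_bytes);               /* bits below lo select a byte in the line */
    r.hi = log2u(cache_bytes / ways) - 1;   /* one way spans bits hi..0 */
    return r;
}

/* Extract the line index from a virtual address. */
static uint32_t cache_index(uint32_t va, struct index_range r)
{
    uint32_t width = r.hi - r.lo + 1;
    return (va >> r.lo) & ((1u << width) - 1u);
}
```

For example, the VR4121 instruction cache (16 KB, 16-byte lines, direct mapped) yields VA(13:4), and the VR4181A caches (8 KB, 32-byte lines) yield VA(12:5), matching the table.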
7.4 Cache States

There are three cache line states, which indicate the validity of the line data and its consistency with main memory.

(1) Instruction cache
The instruction cache supports two cache states:
- Invalid: a cache line that does not contain valid information must be marked invalid, and cannot be used.
- Valid: a cache line that contains valid data.

(2) Data cache
The data cache supports three cache states:
- Invalid: a cache line that does not contain valid information must be marked invalid, and cannot be used.
- Valid clean: a cache line that contains data that has not changed since it was loaded from memory.
- Valid dirty: a cache line that contains data that has changed since it was loaded from memory.

The state of a valid cache line may be modified when the processor executes certain operations of the CACHE instruction. The CACHE instruction and its operations are described in CHAPTER 9 CPU INSTRUCTION SET DETAILS.
7.4.1 Cache state transition diagrams

The following section describes the cache state diagrams for the data and instruction cache lines. These state diagrams do not cover the initial state of the system, since the initial state is system-dependent.

(1) Instruction cache state transition
The following diagram illustrates the instruction cache state transition sequence.
- Read (1) indicates a read operation from main memory to cache, inducing a cache state transition.
- Read (2) indicates a read operation from cache to the CPU core, which induces no cache state transition.

Figure 7-6. Instruction Cache State Diagram
[Figure: states Invalid and Valid; transitions labeled Read (1), Read (2), and CACHE instruction.]

(2) Data cache state transition
The following diagram illustrates the data cache state transition sequence. A load or store operation may include one or more of the atomic read and/or write operations shown in the state diagram below, which may cause cache state transitions.
- Read (1) indicates a read operation from main memory to cache, inducing a cache state transition.
- Write (1) indicates a write operation from CPU core to cache, inducing a cache state transition.
- Read (2) indicates a read operation from cache to the CPU core, which induces no cache state transition.
- Write (2) indicates a write operation from CPU core to cache, which induces no cache state transition.

Figure 7-7. Data Cache State Diagram
[Figure: states Invalid, Valid Clean, and Valid Dirty; transitions labeled Read (1), Read (2), Write (1), Write (2), CACHE instruction, and CACHE instruction write-back.]
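One possible reading of the data cache state diagram can be written as a small state machine in C. The figure itself is not recoverable from the source, so the transition table below is an assumption pieced together from the state definitions in 7.4 and the operation descriptions in this section; the enum and function names are hypothetical.

```c
#include <assert.h>

enum dstate { D_INVALID, D_VALID_CLEAN, D_VALID_DIRTY };

enum devent {
    EV_READ_FILL,         /* Read (1): main memory to cache (refill) */
    EV_READ_HIT,          /* Read (2): cache to CPU core */
    EV_WRITE,             /* Write (1) or Write (2): CPU core to cache */
    EV_CACHE_WRITEBACK,   /* CACHE operation that writes the line back and keeps it */
    EV_CACHE_INVALIDATE   /* CACHE operation that invalidates the line */
};

/* Return the state of a data cache line after an event (assumed transitions). */
static enum dstate dcache_next(enum dstate s, enum devent e)
{
    switch (e) {
    case EV_READ_FILL:
        return D_VALID_CLEAN;       /* freshly loaded data matches memory */
    case EV_READ_HIT:
        return s;                   /* reads from the cache change nothing */
    case EV_WRITE:
        return D_VALID_DIRTY;       /* Write (1) changes state; Write (2) is already dirty */
    case EV_CACHE_WRITEBACK:
        return s == D_VALID_DIRTY ? D_VALID_CLEAN : s;
    case EV_CACHE_INVALIDATE:
        return D_INVALID;
    }
    return s;
}
```

In this model a store miss is treated as a refill followed by a write, which ends in the Valid Dirty state.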
7.5 Cache Access Flow

Figures 7-8 to 7-23 show operation flows for various cache accesses. Each flowchart is summarized below by the elements recoverable from the source. Unless noted otherwise, variant (a) applies to the VR4121, VR4122, VR4181, and VR4181A, and variant (b) applies to the VR4131.

Figure 7-8. Flow on Instruction Fetch
[Flowchart: tag check; on a hit, data fetch; on a miss, refill (see Figure 7-22). Variant (b) adds R bit check and R bit update.]

Figure 7-9. Flow on Load Operations
[Flowchart: tag check; on a hit, data write to register; on a miss or invalid line, the V bit and W bit are checked: if V = 1 (valid) and W = 1 (dirty), writeback and refill (see Figure 7-23); if V = 0 (invalid) or W = 0 (clean), refill (see Figure 7-22). Variant (b) adds R bit check and R bit update.]

Figure 7-10. Flow on Store Operations
[Flowchart: tag check; on a hit, data write to data cache; on a miss, the V bit and W bit are checked: if V = 1 (valid) and W = 1 (dirty), writeback and refill (see Figure 7-23); if V = 0 (invalid) or W = 0 (clean), refill (see Figure 7-22). Variant (b) adds R bit check and R bit update.]

Figure 7-11. Flow on Index_Invalidate Operations
[Flowchart: V bit clear. Variant (b) adds R bit update.]

Figure 7-12. Flow on Index_Writeback_Invalidate Operations
[Flowchart: V bit check; if valid, W bit check; if dirty, writeback (see Figure 7-21); V bit and W bit clear. Variant (b) adds R bit update.]

Figure 7-13. Flow on Index_Load_Tag Operations
[Flowchart: tag read to TagLo; for the data cache, W bit read to TagLo.]

Figure 7-14. Flow on Index_Store_Tag Operations
[Flowchart: tag write from TagLo.]

Figure 7-15. Flow on Create_Dirty Operations
[Flowchart: tag check (hit or miss); V bit and W bit check: if V = 1 (valid) and W = 1 (dirty), writeback (see Figure 7-21); tag write; V bit and W bit set. Variant (b) adds R bit check.]

Figure 7-16. Flow on Hit_Invalidate Operations
[Flowchart: tag check; on a hit, V bit clear; on a miss or invalid line, no operation. Variant (b) adds R bit update.]

Figure 7-17. Flow on Hit_Writeback_Invalidate Operations
[Flowchart: tag check; on a hit, W bit check; if dirty, writeback (see Figure 7-21); V bit clear; on a miss or invalid line, no operation. Variant (b) adds R bit update.]

Figure 7-18. Flow on Fill Operations
[Flowchart: variant (a): refill (see Figure 7-22). Variant (b): tag check (hit, or miss or invalid), R bit check, refill (see Figure 7-22), R bit update.]

Figure 7-19. Flow on Hit_Writeback Operations
[Flowchart: tag check; on a hit, for the data cache, W bit check; if dirty, writeback (see Figure 7-21) and W bit clear; on a miss or invalid line, no operation.]

Figure 7-20. Flow on Fetch_and_Lock Operations (VR4131 only)
[Flowchart: tag check (hit, or miss or invalid); R bit check; for the data cache, W bit check, writeback if dirty (see Figure 7-21), and W bit clear; refill (see Figure 7-22); L bit set; R bit update.]

Figure 7-21. Writeback Flow
[Flowchart: writeback to memory, repeated until EOD.]

Figure 7-22. Refill Flow
[Flowchart: erroneous bit check; if no error, data write to cache, repeated until EOD; on an error, cache line invalidate and Bus Error exception.]

Figure 7-23. Writeback & Refill Flow
[Flowchart: variant (a): writeback to memory until EOD; refill start; erroneous bit check; if no error, data write to cache until EOD; on an error, cache line invalidate and Bus Error exception. Variant (b): refill request; erroneous bit check; if no error, data write to cache until EOD; on an error, cache line invalidate and Bus Error exception; writeback to memory until EOD.]
7.6 Manipulation of the Caches by an External Agent

The VR4100 Series does not provide any mechanism for an external agent to examine or manipulate the state and contents of the caches.

7.7 Initialization of the Caches

The caches of the VR4100 Series must be initialized after reset and in similar cases. For initialization procedures and program examples, refer to the VR Series Programming Guide Application Note.
CHAPTER 8 CPU CORE INTERRUPTS

Four types of interrupt are available on the CPU core of the VR4100 Series. These are:
- One non-maskable interrupt, NMI
- Five ordinary interrupts
- Two software interrupts
- One timer interrupt
For the interrupt requests input to the CPU core from on-chip peripheral units, see the Hardware User's Manual of each product.

8.1 Types of Interrupt Request

8.1.1 Non-maskable interrupt (NMI)

The non-maskable interrupt is acknowledged by asserting the NMI signal (internal), forcing the processor to branch to the reset exception vector. This signal is latched into an internal register at the rising edge of MasterOut (internal), as shown in Figure 8-1. NMI only takes effect when the processor pipeline is running.
This interrupt cannot be masked.
Figure 8-1 shows the internal service of the NMI signal. The NMI signal is latched into an internal register at the rising edge of MasterOut. The latched signal is inverted and passed into the device as an NMI request.

Figure 8-1. Non-maskable Interrupt Signal
[Figure: the NMI signal is latched into an internal register clocked by MasterOut; the register output is the NMI request.]

8.1.2 Ordinary interrupts

Ordinary interrupts are acknowledged by asserting the Int(4:0) signals (internal). However, Int3 occurs in the VR4121 and VR4181A only, and Int4 in the VR4181A only.
This interrupt request can be masked with the IM(6:2), IE, EXL, and ERL fields of the Status register.
8.1.3 Software interrupts generated in CPU core

Software interrupts generated in the CPU core use bits 1 and 0 of the IP (interrupt pending) field in the Cause register. These bits may be written by software, but there is no hardware mechanism to set or clear them.
After a software interrupt exception has been processed, the corresponding bit of the IP field in the Cause register must be cleared before multiple interrupts are enabled or before operation returns to the normal routine.
This interrupt request is maskable through the IM(1:0), IE, EXL, and ERL fields of the Status register.

8.1.4 Timer interrupt

The timer interrupt uses bit 7 of the IP (interrupt pending) field of the Cause register. This bit is set automatically whenever the value of the Count register equals the value of the Compare register, and an interrupt request is acknowledged.
This interrupt is maskable through the IM7, IE, EXL, and ERL fields of the Status register.

8.2 Acknowledging Interrupts

8.2.1 Detecting hardware interrupts

Figure 8-2 shows how the hardware interrupts are readable through the Cause register.
- The timer interrupt signal of the CPU core is directly readable as bit 15 (IP7) of the Cause register.
- The Int(4:0) signals are directly readable as bits 14 to 10 (IP(6:2)) of the Cause register.
IP(1:0) of the Cause register are used for software interrupt requests. There is no hardware mechanism for setting or clearing the software interrupts.

Figure 8-2. Hardware Interrupt Signals
[Figure: Int0 to Int4 are latched into an internal register clocked by MasterOut and appear as IP2 to IP6 (Cause register bits 10 to 14); the timer interrupt appears as IP7 (Cause register bit 15). Cause register bits 15 to 10 feed the masking logic shown in Figure 8-3.]

Remark: Int3 occurs in the VR4121 and VR4181A only, and Int4 in the VR4181A only.
8.2.2 Masking interrupt signals

Figure 8-3 shows the masking of the CPU core interrupt signals.

- Cause register bits 15 to 8 (IP(7:0)) are AND-ORed with Status register interrupt mask bits 15 to 8 (IM(7:0)) to mask individual interrupts.
- Status register bit 0 is a global interrupt enable (IE) bit. It is ANDed with the output of the AND-OR logic to produce the CPU core interrupt signal. The EXL bit in the Status register also enables these interrupts.

Figure 8-3. Masking of the interrupt request signals

Interrupt source                     Cause register (request)   Status register (mask)
Timer interrupt                      IP7 (bit 15)               IM7 (bit 15)
Ordinary interrupts                  IP6 to IP2 (bits 14-10)    IM6 to IM2 (bits 14-10)
Software interrupts of CPU core      IP1, IP0 (bits 9-8)        IM1, IM0 (bits 9-8)

(Figure structure: the eight IP bits and the eight IM bits feed an AND-OR block. Its 1-bit output and IE, Status register bit 0, feed an AND block whose output is the CPU core interrupt.)

Bit       Function                  Setting
IE        Whole interrupts enable   1: Enable, 0: Disable
IM(7:0)   Interrupt mask            Each bit 1: Enable, 0: Disable
IP(7:0)   Interrupt request         Each bit 1: Pending, 0: Not pending
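The AND-OR and AND blocks of Figure 8-3 can be modelled by the following C sketch. The positions of the EXL and ERL bits (Status bits 1 and 2) and their polarity (interrupts are accepted only while both are 0) are not stated in this section; they are assumed here from the usual MIPS III Status register layout. The function name is hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

#define SR_IE   (1u << 0)  /* global interrupt enable, Status bit 0 */
#define SR_EXL  (1u << 1)  /* assumption: exception level, Status bit 1 */
#define SR_ERL  (1u << 2)  /* assumption: error level, Status bit 2 */

/* Returns true when the CPU core interrupt signal would be asserted. */
bool cpu_core_interrupt(uint32_t status, uint32_t cause)
{
    uint32_t ip = (cause  >> 8) & 0xFFu;   /* IP(7:0), Cause bits 15 to 8 */
    uint32_t im = (status >> 8) & 0xFFu;   /* IM(7:0), Status bits 15 to 8 */

    bool any_unmasked = (ip & im) != 0;    /* AND-OR block */
    bool ie  = (status & SR_IE)  != 0;     /* AND block input */
    bool exl = (status & SR_EXL) != 0;
    bool erl = (status & SR_ERL) != 0;

    return any_unmasked && ie && !exl && !erl;
}
```

For example, a pending timer request (Cause bit 15) is seen by the core only when IM7 and IE are both 1 and neither EXL nor ERL is set.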
Chapter 9 CPU instruction set details

This chapter describes in detail the operation of each VR4100 Series instruction in both 32-bit and 64-bit modes. The instructions are listed in alphabetical order.

9.1 Instruction notation conventions

In this chapter, all variable subfields in an instruction format (such as rs, rt, and immediate) are shown in lowercase. For clarity, an alias is sometimes used for a variable subfield in the format of a specific instruction; for example, rs = base in the format for load and store instructions. Such an alias is always lowercase, since it refers to a variable subfield.

Figures with the actual bit encoding for all the mnemonics are located at the end of this chapter (9.4 CPU instruction opcode bit encoding), and the bit encoding also accompanies each instruction.

In the instruction descriptions that follow, the Operation section describes the operation performed by each instruction using a high-level language notation. The VR4100 Series can operate as either a 32-bit or a 64-bit microprocessor, and the operation for both modes is included in each instruction description. Special symbols used in the notation are described in Table 9-1.
Table 9-1. CPU instruction operation notations

Symbol          Meaning
<-              Assignment.
||              Bit string concatenation.
x^y             Replication of bit value x into a y-bit string. x is always a single-bit value.
x_y..z          Selection of bits y through z of bit string x. Little-endian bit notation is always used. If y is less than z, this expression is an empty (zero length) bit string.
+               2's complement or floating-point addition.
-               2's complement or floating-point subtraction.
*               2's complement or floating-point multiplication.
div             2's complement integer division.
mod             2's complement modulo.
/               Floating-point division.
<               2's complement less than comparison.
and             Bit-wise logical AND.
or              Bit-wise logical OR.
xor             Bit-wise logical XOR.
nor             Bit-wise logical NOR.
GPR[x]          General register x. The content of GPR[0] is always zero. Attempts to alter the content of GPR[0] have no effect.
CPR[z, x]       Coprocessor unit z, general register x.
CCR[z, x]       Coprocessor unit z, control register x.
COC[z]          Coprocessor unit z condition signal.
BigEndianMem    Big-endian mode as configured at reset (0: little, 1: big). Specifies the endianness of the memory interface (see Table 9-2), and the endianness of Kernel and Supervisor mode execution. However, this value is always 0 in the VR4121, VR4122, VR4181, and VR4181A, since they support little-endian order only.
ReverseEndian   Signal to reverse the endianness of load and store instructions. This feature is available in User mode only, and is effected by setting the RE bit of the Status register. Thus, ReverseEndian may be computed as (SR_25 and User mode). However, this value is always 0, since the VR4100 Series does not support reversal of the endianness.
BigEndianCPU    The endianness for load and store instructions (0: little, 1: big). In User mode, this endianness may be reversed by setting SR_25. Thus, BigEndianCPU may be computed as BigEndianMem xor ReverseEndian.
However, this value is always 0 in the VR4121, VR4122, VR4181, and VR4181A, since they support little-endian order only.
T+i:            Indicates the time steps between operations. The statements within a time step are defined to be executed in sequential order (as modified by conditional and loop constructs). Operations marked T+i: are executed at instruction cycle i relative to the start of execution of the instruction. Thus, an instruction that starts at time j executes operations marked T+i: at time i + j. The order of execution between two instructions, or between two operations that execute at the same time, should be interpreted pessimistically; the order is not defined.
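The replication, selection, and concatenation operators of Table 9-1 can be modelled in C as shown below. This is only a sketch that represents every bit string as a right-aligned value in a 64-bit integer; the function names are hypothetical and are not part of the manual's notation.

```c
#include <stdint.h>

/* x^y: replicate the single-bit value x into a y-bit string (1 <= y <= 64). */
uint64_t replicate(unsigned x, unsigned y)
{
    if ((x & 1u) == 0) {
        return 0;
    }
    return (y >= 64) ? UINT64_MAX : ((UINT64_C(1) << y) - 1);
}

/* x_y..z: select bits y down to z (little-endian bit numbering).
   An empty selection (y < z) is returned as 0. */
uint64_t select_bits(uint64_t x, unsigned y, unsigned z)
{
    if (y < z || y > 63) {
        return 0;
    }
    unsigned width = y - z + 1;
    uint64_t mask = (width >= 64) ? UINT64_MAX : ((UINT64_C(1) << width) - 1);
    return (x >> z) & mask;
}

/* a || b: concatenate, where b is b_width bits wide (1 <= b_width <= 63). */
uint64_t concat(uint64_t a, uint64_t b, unsigned b_width)
{
    return (a << b_width) | select_bits(b, b_width - 1, 0);
}
```

With these helpers, immediate || 0^16 becomes concat(immediate, replicate(0, 16), 16), and (immediate_15)^16 || immediate_15..0 becomes concat(replicate((unsigned)select_bits(immediate, 15, 15), 16), select_bits(immediate, 15, 0), 16).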
The following examples illustrate the application of some of the instruction notation conventions:

Example #1:
GPR[rt] <- immediate || 0^16

Sixteen zero bits are concatenated with an immediate value (typically 16 bits), and the 32-bit string (with the lower 16 bits set to zero) is assigned to general-purpose register rt.

Example #2:
(immediate_15)^16 || immediate_15..0

Bit 15 (the sign bit) of an immediate value is extended for 16 bit positions, and the result is concatenated with bits 15 through 0 of the immediate value to form a 32-bit sign-extended value.

9.2 Notes on using CPU instructions

9.2.1 Load and store instructions

In the VR4100 Series implementation, the instruction immediately following a load may use the loaded contents of the register. In such cases the hardware interlocks, which costs additional real cycles, so scheduling load delay slots is still desirable, although it is not required for functional code.

In the load and store descriptions, the functions listed in Table 9-2 are used to summarize the handling of virtual addresses and physical memory.

Table 9-2. Load and store common functions

Function              Meaning
Address translation   Uses the TLB to find the physical address given the virtual address. The function fails and an exception is taken if the required translation is not present in the TLB.
Load memory           Uses the cache and main memory to find the contents of the word containing the specified physical address. The low-order three bits of the address and the access type field indicate which of the four bytes within the data word need to be returned. If the cache is enabled for this access, the entire word is returned and loaded into the cache. If the specified data is shorter than a word, the position at which it is placed is determined by the endian mode and reverse endian mode.
Store memory          Uses the cache, write buffer, and main memory to store the word or part of a word specified as data in the word containing the specified physical address. The low-order three bits of the address and the access type field indicate which of the four bytes within the data word should be stored. If the specified data is shorter than a word, the position at which it is stored is determined by the endian mode and reverse endian mode.
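As a rough illustration of how the access type and the three low-order address bits select bytes, the C sketch below computes which byte addresses of the aligned doubleword take part in an access. It assumes the access type encoding of Table 9-3 (value = number of bytes minus one). The function name is hypothetical, and returning 0 for an access that would run past the doubleword is a choice of this sketch, not documented hardware behavior.

```c
#include <stdint.h>

/* Returns an 8-bit mask in which bit n set means the byte at offset n
   (by byte address) within the aligned doubleword is accessed.
   The address names the smallest byte address of the accessed field,
   so the result does not depend on the endian mode. */
uint8_t byte_lanes(unsigned access_type, uint64_t paddr)
{
    unsigned nbytes = (access_type & 7u) + 1u;   /* Table 9-3: value + 1 bytes */
    unsigned first  = (unsigned)(paddr & 7u);    /* three low-order address bits */

    if (first + nbytes > 8u) {
        return 0;  /* sketch-only convention: field does not fit in the doubleword */
    }
    return (uint8_t)(((1u << nbytes) - 1u) << first);
}
```

For example, a word access (access type 3) at an address ending in 4 uses the bytes at offsets 4 to 7, giving the mask 0xF0.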
As shown in Table 9-3, the access type field indicates the size of the data item to be loaded or stored. Regardless of access type or byte-numbering order (endianness), the address specifies the byte that has the smallest byte address in the addressed field. For a big-endian machine, this is the leftmost byte and contains the sign for a 2's complement number; for a little-endian machine, this is the rightmost byte.

Table 9-3. Access type specifications for loads/stores

Access type mnemonic   Value in internal command   Meaning
Doubleword             7                           8 bytes (64 bits)
Septibyte              6                           7 bytes (56 bits)
Sextibyte              5                           6 bytes (48 bits)
Quintibyte             4                           5 bytes (40 bits)
Word                   3                           4 bytes (32 bits)
Triplebyte             2                           3 bytes (24 bits)
Halfword               1                           2 bytes (16 bits)
Byte                   0                           1 byte (8 bits)

The bytes within the addressed doubleword that are used can be determined directly from the access type and the three low-order bits of the address.

9.2.2 Jump and branch instructions

All jump and branch instructions have an architectural delay of exactly one instruction. That is, the instruction immediately following a jump or branch (that is, occupying the delay slot) is always executed while the target instruction is being fetched from storage. A delay slot may not itself be occupied by a jump or branch instruction; however, this error is not detected and the results of such an operation are undefined.

If an exception or interrupt prevents the completion of a legal instruction during a delay slot, the hardware sets the EPC register to point at the jump or branch instruction that precedes it. When the code is restarted, both the jump or branch instruction and the instruction in the delay slot are re-executed.

Because jump and branch instructions may be restarted after exceptions or interrupts, they must be restartable.
Therefore, when a jump or branch instruction stores a return link value, register r31 (the register in which the link is stored) may not be used as a source register.

Since instructions must be word-aligned, a Jump Register or Jump and Link Register instruction must use a register that contains an address whose two low-order bits (low-order one bit in the 16-bit mode) are zero. If these low-order bits are not zero, an address exception will occur when the jump target instruction is subsequently fetched.
9.2.3 System control coprocessor (CP0) instructions

Some special limitations are imposed on operations involving CP0, which is incorporated within the CPU. Although load and store instructions that transfer data to and from coprocessors, and instructions that move control to and from coprocessors, are generally permitted by the MIPS architecture, CP0 is given a somewhat protected status since it is responsible for exception handling and memory management. Therefore, the move to/from coprocessor instructions are the only valid mechanism for writing to and reading from the CP0 registers.

Several CP0 instructions are defined to directly read, write, and probe TLB entries and to modify the operating modes in preparation for returning to User mode or interrupt-enabled states.

9.3 CPU instructions

This section describes the functions of the CPU instructions in detail for both 32-bit address mode and 64-bit address mode.

The exceptions that may occur when an instruction is executed are listed at the end of that instruction's description. For details of exceptions and their processing, see Chapter 6 Exception processing.
ADD    Add

Bits    31-26     25-21   20-16   15-11   10-6    5-0
Field   SPECIAL   rs      rt      rd      0       ADD
Value   000000                            00000   100000

Format:
ADD rd, rs, rt

Description:
The contents of general register rs and the contents of general register rt are added to form the result. The result is placed into general register rd. In 64-bit mode, the operands must be valid sign-extended, 32-bit values.
An overflow exception occurs if the carries out of bits 30 and 31 differ (2's complement overflow). The destination register rd is not modified when an integer overflow exception occurs.

Restrictions:
If the value of either general register rt or general register rs is not a sign-extended 32-bit value (bits 63 to 31 have the same value), the result of this operation will be undefined.

Operation:
32  T:  GPR[rd] <- GPR[rs] + GPR[rt]
64  T:  temp <- GPR[rs] + GPR[rt]
        GPR[rd] <- (temp_31)^32 || temp_31..0

Exceptions:
Integer overflow exception
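The overflow rule for ADD (the carries out of bits 30 and 31 differ) can be checked with the following C sketch. It models only the low 32 bits of the operands and the 64-bit mode result; the function name and the convention of returning true on overflow are assumptions of this sketch.

```c
#include <stdint.h>
#include <stdbool.h>

/* Models ADD. Returns true when an integer overflow exception would occur;
   in that case *rd is left unmodified, as the description requires. */
bool add_overflows(uint32_t rs, uint32_t rt, uint64_t *rd)
{
    uint32_t sum = rs + rt;   /* temp_31..0, wraps modulo 2^32 */

    /* Carry out of bit 30: add the low 31 bits and look at bit 31 of the sum. */
    unsigned c30 = (unsigned)((((rs & 0x7FFFFFFFu) + (rt & 0x7FFFFFFFu)) >> 31) & 1u);
    /* Carry out of bit 31: widen to 64 bits and look at bit 32 of the sum. */
    unsigned c31 = (unsigned)((((uint64_t)rs + (uint64_t)rt) >> 32) & 1u);

    if (c30 != c31) {
        return true;          /* 2's complement overflow */
    }

    /* GPR[rd] <- (temp_31)^32 || temp_31..0 */
    *rd = (sum & 0x80000000u) ? (UINT64_C(0xFFFFFFFF00000000) | sum) : (uint64_t)sum;
    return false;
}
```

Adding 0x7FFFFFFF and 1 produces a carry out of bit 30 but none out of bit 31, so the sketch reports overflow; adding 0xFFFFFFFF (that is, -1) and 1 produces both carries and yields 0 without overflow.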
ADDI    Add Immediate

Bits    31-26    25-21   20-16   15-0
Field   ADDI     rs      rt      immediate
Value   001000

Format:
ADDI rt, rs, immediate

Description:
The 16-bit immediate is sign-extended and added to the contents of general register rs to form the result. The result is placed into general register rt. In 64-bit mode, the operand must be a valid sign-extended, 32-bit value.
An overflow exception occurs if the carries out of bits 30 and 31 differ (2's complement overflow). The destination register rt is not modified when an integer overflow exception occurs.

Restrictions:
If the value of general register rs is not a sign-extended 32-bit value (bits 63 to 31 have the same value), the result of this operation will be undefined.

Operation:
32  T:  GPR[rt] <- GPR[rs] + (immediate_15)^16 || immediate_15..0
64  T:  temp <- GPR[rs] + (immediate_15)^48 || immediate_15..0
        GPR[rt] <- (temp_31)^32 || temp_31..0

Exceptions:
Integer overflow exception
ADDIU    Add Immediate Unsigned

Bits    31-26    25-21   20-16   15-0
Field   ADDIU    rs      rt      immediate
Value   001001

Format:
ADDIU rt, rs, immediate

Description:
The 16-bit immediate is sign-extended and added to the contents of general register rs to form the result. The result is placed into general register rt. No integer overflow exception occurs under any circumstances. In 64-bit mode, the operand must be a valid sign-extended, 32-bit value.
The only difference between this instruction and the ADDI instruction is that ADDIU never causes an integer overflow exception.

Restrictions:
If the value of general register rs is not a sign-extended 32-bit value (bits 63 to 31 have the same value), the result of this operation will be undefined.

Operation:
32  T:  GPR[rt] <- GPR[rs] + (immediate_15)^16 || immediate_15..0
64  T:  temp <- GPR[rs] + (immediate_15)^48 || immediate_15..0
        GPR[rt] <- (temp_31)^32 || temp_31..0

Exceptions:
None
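The 64-bit mode operation of ADDIU can be followed step by step in the C sketch below. Note that, despite the word "unsigned" in the mnemonic, the immediate is sign-extended; only the overflow exception is absent. The function name is hypothetical.

```c
#include <stdint.h>

/* Models the 64-bit mode operation of ADDIU. */
uint64_t addiu64(uint64_t rs, uint16_t immediate)
{
    /* (immediate_15)^48 || immediate_15..0 */
    uint64_t imm = (immediate & 0x8000u)
                 ? (UINT64_C(0xFFFFFFFFFFFF0000) | immediate)
                 : (uint64_t)immediate;

    uint64_t temp = rs + imm;          /* wraps silently, never traps */
    uint32_t low  = (uint32_t)temp;    /* temp_31..0 */

    /* GPR[rt] <- (temp_31)^32 || temp_31..0 */
    return (low & 0x80000000u) ? (UINT64_C(0xFFFFFFFF00000000) | low) : (uint64_t)low;
}
```

For instance, adding 1 to 0x7FFFFFFF, which would raise an integer overflow exception with ADDI, here simply wraps to the sign-extended value 0xFFFFFFFF80000000.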
ADDU    Add Unsigned

Bits    31-26     25-21   20-16   15-11   10-6    5-0
Field   SPECIAL   rs      rt      rd      0       ADDU
Value   000000                            00000   100001

Format:
ADDU rd, rs, rt

Description:
The contents of general register rs and the contents of general register rt are added to form the result. The result is placed into general register rd. No integer overflow exception occurs under any circumstances. In 64-bit mode, the operands must be valid sign-extended, 32-bit values.
The only difference between this instruction and the ADD instruction is that ADDU never causes an integer overflow exception.

Restrictions:
If the value of either general register rt or general register rs is not a sign-extended 32-bit value (bits 63 to 31 have the same value), the result of this operation will be undefined.

Operation:
32  T:  GPR[rd] <- GPR[rs] + GPR[rt]
64  T:  temp <- GPR[rs] + GPR[rt]
        GPR[rd] <- (temp_31)^32 || temp_31..0

Exceptions:
None
AND    And

Bits    31-26     25-21   20-16   15-11   10-6    5-0
Field   SPECIAL   rs      rt      rd      0       AND
Value   000000                            00000   100100

Format:
AND rd, rs, rt

Description:
The contents of general register rs are combined with the contents of general register rt in a bit-wise logical AND operation. The result is placed into general register rd.

Operation:
32  T:  GPR[rd] <- GPR[rs] and GPR[rt]
64  T:  GPR[rd] <- GPR[rs] and GPR[rt]

Exceptions:
None
ANDI    And Immediate

Bits    31-26    25-21   20-16   15-0
Field   ANDI     rs      rt      immediate
Value   001100

Format:
ANDI rt, rs, immediate

Description:
The 16-bit immediate is zero-extended and combined with the contents of general register rs in a bit-wise logical AND operation. The result is placed into general register rt.

Operation:
32  T:  GPR[rt] <- 0^16 || (immediate and GPR[rs]_15..0)
64  T:  GPR[rt] <- 0^48 || (immediate and GPR[rs]_15..0)

Exceptions:
None
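Unlike the add-immediate instructions, ANDI zero-extends its immediate. A minimal C sketch of the 64-bit mode operation, with a hypothetical function name:

```c
#include <stdint.h>

/* GPR[rt] <- 0^48 || (immediate and GPR[rs]_15..0) */
uint64_t andi64(uint64_t rs, uint16_t immediate)
{
    /* The upper 48 bits of the result are always zero, whatever rs holds. */
    return (uint64_t)immediate & (rs & UINT64_C(0xFFFF));
}
```

An immediate of 0x8000 therefore masks exactly bit 15 and never touches the upper bits, which would not be the case if the immediate were sign-extended.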
BC0F    Branch On Coprocessor 0 False

Bits    31-26           25-21   20-16   15-0
Field   COPz            BC      BCF     offset
Value   0100xx (Note)   01000   00000

Format:
BC0F offset

Description:
A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. If coprocessor 0's condition signal (CpCond), as sampled during the previous instruction, is false, then the program branches to the target address with a delay of one instruction.
Because the condition signal is sampled during the previous instruction, there must be at least one instruction between this instruction and a coprocessor instruction that changes the condition signal.

Operation:
32  T-1:  condition <- not SR_18
    T:    target <- (offset_15)^14 || offset || 0^2
    T+1:  if condition then
              PC <- PC + target
          endif
64  T-1:  condition <- not SR_18
    T:    target <- (offset_15)^46 || offset || 0^2
    T+1:  if condition then
              PC <- PC + target
          endif

Exceptions:
Coprocessor unusable exception

Note See the opcode table below, or 9.4 CPU instruction opcode bit encoding.

Opcode table (BC0F):
Bits    31-28    27-26                25-21           20-16              15-0
Field   Opcode   Coprocessor number   BC sub-opcode   Branch condition   offset
Value   0100     00                   01000           00000
BC0FL    Branch On Coprocessor 0 False Likely

Bits    31-26           25-21   20-16   15-0
Field   COPz            BC      BCFL    offset
Value   0100xx (Note)   01000   00010

Format:
BC0FL offset

Description:
A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. If coprocessor 0's condition signal (CpCond), as sampled during the previous instruction, is false, then the program branches to the target address with a delay of one instruction.
If the conditional branch is not taken, the instruction in the branch delay slot is nullified.
Because the condition signal is sampled during the previous instruction, there must be at least one instruction between this instruction and a coprocessor instruction that changes the condition signal.

Operation:
32  T-1:  condition <- not SR_18
    T:    target <- (offset_15)^14 || offset || 0^2
    T+1:  if condition then
              PC <- PC + target
          else
              NullifyCurrentInstruction
          endif
64  T-1:  condition <- not SR_18
    T:    target <- (offset_15)^46 || offset || 0^2
    T+1:  if condition then
              PC <- PC + target
          else
              NullifyCurrentInstruction
          endif

Exceptions:
Coprocessor unusable exception

Note See the opcode table below, or 9.4 CPU instruction opcode bit encoding.

Opcode table (BC0FL):
Bits    31-28    27-26                25-21           20-16              15-0
Field   Opcode   Coprocessor number   BC sub-opcode   Branch condition   offset
Value   0100     00                   01000           00010
BC0T    Branch On Coprocessor 0 True

Bits    31-26           25-21   20-16   15-0
Field   COPz            BC      BCT     offset
Value   0100xx (Note)   01000   00001

Format:
BC0T offset

Description:
A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. If coprocessor 0's condition signal (CpCond), as sampled during the previous instruction, is true, then the program branches to the target address with a delay of one instruction.
Because the condition signal is sampled during the previous instruction, there must be at least one instruction between this instruction and a coprocessor instruction that changes the condition signal.

Operation:
32  T-1:  condition <- SR_18
    T:    target <- (offset_15)^14 || offset || 0^2
    T+1:  if condition then
              PC <- PC + target
          endif
64  T-1:  condition <- SR_18
    T:    target <- (offset_15)^46 || offset || 0^2
    T+1:  if condition then
              PC <- PC + target
          endif

Exceptions:
Coprocessor unusable exception

Note See the opcode table below, or 9.4 CPU instruction opcode bit encoding.

Opcode table (BC0T):
Bits    31-28    27-26                25-21           20-16              15-0
Field   Opcode   Coprocessor number   BC sub-opcode   Branch condition   offset
Value   0100     00                   01000           00001
BC0TL    Branch On Coprocessor 0 True Likely

Bits    31-26           25-21   20-16   15-0
Field   COPz            BC      BCTL    offset
Value   0100xx (Note)   01000   00011

Format:
BC0TL offset

Description:
A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. If coprocessor 0's condition signal (CpCond), as sampled during the previous instruction, is true, then the program branches to the target address with a delay of one instruction.
If the conditional branch is not taken, the instruction in the branch delay slot is nullified.
Because the condition signal is sampled during the previous instruction, there must be at least one instruction between this instruction and a coprocessor instruction that changes the condition signal.

Operation:
32  T-1:  condition <- SR_18
    T:    target <- (offset_15)^14 || offset || 0^2
    T+1:  if condition then
              PC <- PC + target
          else
              NullifyCurrentInstruction
          endif
64  T-1:  condition <- SR_18
    T:    target <- (offset_15)^46 || offset || 0^2
    T+1:  if condition then
              PC <- PC + target
          else
              NullifyCurrentInstruction
          endif

Exceptions:
Coprocessor unusable exception

Note See the opcode table below, or 9.4 CPU instruction opcode bit encoding.

Opcode table (BC0TL):
Bits    31-28    27-26                25-21           20-16              15-0
Field   Opcode   Coprocessor number   BC sub-opcode   Branch condition   offset
Value   0100     00                   01000           00011
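The four coprocessor 0 branch encodings differ only in the branch condition field (bits 20 to 16). The C sketch below assembles the 32-bit instruction word from the fields of the opcode tables; the enum and function names are hypothetical.

```c
#include <stdint.h>

/* Branch condition field values (bits 20 to 16) from the opcode tables. */
enum bc0_cond {
    BC0_F  = 0x00,  /* BC0F  : 00000 */
    BC0_T  = 0x01,  /* BC0T  : 00001 */
    BC0_FL = 0x02,  /* BC0FL : 00010 */
    BC0_TL = 0x03   /* BC0TL : 00011 */
};

/* Builds the instruction word for a coprocessor 0 branch. */
uint32_t encode_bc0(enum bc0_cond cond, uint16_t offset)
{
    const uint32_t opcode = 0x4u;   /* bits 31 to 28: 0100 */
    const uint32_t copnum = 0x0u;   /* bits 27 to 26: coprocessor number 0 */
    const uint32_t bc     = 0x08u;  /* bits 25 to 21: BC sub-opcode 01000 */

    return (opcode << 28) | (copnum << 26) | (bc << 21)
         | ((uint32_t)cond << 16) | (uint32_t)offset;
}
```

With a zero offset this gives 0x41000000 for BC0F, 0x41010000 for BC0T, 0x41020000 for BC0FL, and 0x41030000 for BC0TL.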
BEQ    Branch On Equal

Bits    31-26    25-21   20-16   15-0
Field   BEQ      rs      rt      offset
Value   000100

Format:
BEQ rs, rt, offset

Description:
A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. The contents of general register rs and the contents of general register rt are compared. If the two registers are equal, then the program branches to the target address, with a delay of one instruction.

Operation:
32  T:    target <- (offset_15)^14 || offset || 0^2
          condition <- (GPR[rs] = GPR[rt])
    T+1:  if condition then
              PC <- PC + target
          endif
64  T:    target <- (offset_15)^46 || offset || 0^2
          condition <- (GPR[rs] = GPR[rt])
    T+1:  if condition then
              PC <- PC + target
          endif

Exceptions:
None
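The branch target computation shared by all branch instructions in this chapter can be sketched in C as follows, for 64-bit mode. The sketch assumes 4-byte instructions, so that the delay slot lies at the branch address plus 4; the function name is hypothetical.

```c
#include <stdint.h>

/* Returns the address branched to when the branch at branch_pc is taken. */
uint64_t branch_target(uint64_t branch_pc, uint16_t offset)
{
    /* Sign-extend the 16-bit offset to 64 bits. */
    uint64_t ext = (offset & 0x8000u)
                 ? (UINT64_C(0xFFFFFFFFFFFF0000) | offset)
                 : (uint64_t)offset;

    /* target <- (offset_15)^46 || offset || 0^2 */
    uint64_t target = ext << 2;

    /* At T+1 the PC holds the delay-slot address (assumed branch_pc + 4). */
    return (branch_pc + 4u) + target;
}
```

An offset of 0xFFFF (that is, -1) therefore branches back to the branch instruction itself, and an offset of 0 branches to the delay-slot instruction.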
BEQL    Branch On Equal Likely

Bits    31-26    25-21   20-16   15-0
Field   BEQL     rs      rt      offset
Value   010100

Format:
BEQL rs, rt, offset

Description:
A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. The contents of general register rs and the contents of general register rt are compared. If the two registers are equal, then the program branches to the target address, with a delay of one instruction.
If the conditional branch is not taken, the instruction in the branch delay slot is nullified.

Operation:
32  T:    target <- (offset_15)^14 || offset || 0^2
          condition <- (GPR[rs] = GPR[rt])
    T+1:  if condition then
              PC <- PC + target
          else
              NullifyCurrentInstruction
          endif
64  T:    target <- (offset_15)^46 || offset || 0^2
          condition <- (GPR[rs] = GPR[rt])
    T+1:  if condition then
              PC <- PC + target
          else
              NullifyCurrentInstruction
          endif

Exceptions:
None
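The difference between BEQ and BEQL lies only in what happens to the delay-slot instruction when the branch is not taken. The C sketch below captures that difference for 64-bit mode. The type branch_outcome and the function name are hypothetical, and 4-byte instructions are assumed.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    uint64_t next_pc;         /* address executed after the delay-slot position */
    bool     run_delay_slot;  /* false when the delay-slot instruction is nullified */
} branch_outcome;

/* likely = false models BEQ, likely = true models BEQL. */
branch_outcome beq_outcome(bool likely, uint64_t branch_pc,
                           uint64_t rs, uint64_t rt, uint16_t offset)
{
    uint64_t ext = (offset & 0x8000u)
                 ? (UINT64_C(0xFFFFFFFFFFFF0000) | offset)
                 : (uint64_t)offset;
    uint64_t target = ext << 2;            /* (offset_15)^46 || offset || 0^2 */
    bool condition = (rs == rt);

    branch_outcome r;
    r.next_pc = condition ? (branch_pc + 4u + target)  /* taken */
                          : (branch_pc + 8u);          /* fall through */
    /* Only a not-taken "likely" branch nullifies the delay slot. */
    r.run_delay_slot = condition || !likely;
    return r;
}
```

This is why a compiler can fill the delay slot of a "likely" branch with an instruction from the branch target: it has no effect on the fall-through path.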
BGEZ    Branch On Greater Than Or Equal To Zero

Bits    31-26    25-21   20-16   15-0
Field   REGIMM   rs      BGEZ    offset
Value   000001           00001

Format:
BGEZ rs, offset

Description:
A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. If the contents of general register rs have the sign bit cleared, then the program branches to the target address, with a delay of one instruction.

Operation:
32  T:    target <- (offset_15)^14 || offset || 0^2
          condition <- (GPR[rs]_31 = 0)
    T+1:  if condition then
              PC <- PC + target
          endif
64  T:    target <- (offset_15)^46 || offset || 0^2
          condition <- (GPR[rs]_63 = 0)
    T+1:  if condition then
              PC <- PC + target
          endif

Exceptions:
None
BGEZAL    Branch On Greater Than Or Equal To Zero And Link

Bits    31-26    25-21   20-16    15-0
Field   REGIMM   rs      BGEZAL   offset
Value   000001           10001

Format:
BGEZAL rs, offset

Description:
A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. Unconditionally, the address of the instruction after the delay slot is placed in the link register, r31. If the contents of general register rs have the sign bit cleared, then the program branches to the target address, with a delay of one instruction.
General register rs may not be general register r31, because such an instruction is not restartable. An attempt to execute such an instruction is not trapped, however.

Operation:
32  T:    target <- (offset_15)^14 || offset || 0^2
          condition <- (GPR[rs]_31 = 0)
          GPR[31] <- PC + 8
    T+1:  if condition then
              PC <- PC + target
          endif
64  T:    target <- (offset_15)^46 || offset || 0^2
          condition <- (GPR[rs]_63 = 0)
          GPR[31] <- PC + 8
    T+1:  if condition then
              PC <- PC + target
          endif

Exceptions:
None
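The reason rs may not be r31 can be seen in a small C model of the T step of BGEZAL in 64-bit mode. Because the link value is written unconditionally, re-executing the instruction with rs = r31 tests the link address rather than the original register contents. The cpu_state type and the function name are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    uint64_t gpr[32];
    uint64_t pc;       /* address of the BGEZAL instruction */
} cpu_state;

/* Executes the T step of BGEZAL; returns whether the branch is taken. */
bool bgezal_step(cpu_state *s, unsigned rs)
{
    bool condition = (s->gpr[rs & 31u] >> 63) == 0;  /* GPR[rs]_63 = 0 */
    s->gpr[31] = s->pc + 8u;                         /* link, written unconditionally */
    return condition;
}
```

If r31 holds a negative value and rs = r31, the first execution does not branch but overwrites r31 with PC + 8. When the instruction is restarted after an exception in the delay slot, the same step now sees a non-negative r31 and branches, so the two executions disagree.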
BGEZALL    Branch On Greater Than Or Equal To Zero And Link Likely

Bits    31-26    25-21   20-16     15-0
Field   REGIMM   rs      BGEZALL   offset
Value   000001           10011

Format:
BGEZALL rs, offset

Description:
A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. Unconditionally, the address of the instruction after the delay slot is placed in the link register, r31. If the contents of general register rs have the sign bit cleared, then the program branches to the target address, with a delay of one instruction.
General register rs may not be general register r31, because such an instruction is not restartable. An attempt to execute such an instruction is not trapped, however.
If the conditional branch is not taken, the instruction in the branch delay slot is nullified.

Operation:
32  T:    target <- (offset_15)^14 || offset || 0^2
          condition <- (GPR[rs]_31 = 0)
          GPR[31] <- PC + 8
    T+1:  if condition then
              PC <- PC + target
          else
              NullifyCurrentInstruction
          endif
64  T:    target <- (offset_15)^46 || offset || 0^2
          condition <- (GPR[rs]_63 = 0)
          GPR[31] <- PC + 8
    T+1:  if condition then
              PC <- PC + target
          else
              NullifyCurrentInstruction
          endif

Exceptions:
None
BGEZL    Branch On Greater Than Or Equal To Zero Likely

Bits    31-26    25-21   20-16   15-0
Field   REGIMM   rs      BGEZL   offset
Value   000001           00011

Format:
BGEZL rs, offset

Description:
A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. If the contents of general register rs have the sign bit cleared, then the program branches to the target address, with a delay of one instruction.
If the conditional branch is not taken, the instruction in the branch delay slot is nullified.

Operation:
32  T:    target <- (offset_15)^14 || offset || 0^2
          condition <- (GPR[rs]_31 = 0)
    T+1:  if condition then
              PC <- PC + target
          else
              NullifyCurrentInstruction
          endif
64  T:    target <- (offset_15)^46 || offset || 0^2
          condition <- (GPR[rs]_63 = 0)
    T+1:  if condition then
              PC <- PC + target
          else
              NullifyCurrentInstruction
          endif

Exceptions:
None
BGTZ    Branch On Greater Than Zero

Bits    31-26    25-21   20-16   15-0
Field   BGTZ     rs      0       offset
Value   000111           00000

Format:
BGTZ rs, offset

Description:
A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. The contents of general register rs are compared to zero. If the contents of general register rs have the sign bit cleared and are not equal to zero, then the program branches to the target address, with a delay of one instruction.

Operation:
32  T:    target <- (offset_15)^14 || offset || 0^2
          condition <- (GPR[rs]_31 = 0) and (GPR[rs] ≠ 0^32)
    T+1:  if condition then
              PC <- PC + target
          endif
64  T:    target <- (offset_15)^46 || offset || 0^2
          condition <- (GPR[rs]_63 = 0) and (GPR[rs] ≠ 0^64)
    T+1:  if condition then
              PC <- PC + target
          endif

Exceptions:
None
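The BGTZ condition combines two tests, as the description states: the sign bit must be cleared and the register must not be zero. A C sketch for both modes follows; the function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* 32-bit mode: sign bit (bit 31) cleared and value not zero. */
bool bgtz_taken32(uint32_t rs)
{
    return ((rs >> 31) == 0) && (rs != 0);
}

/* 64-bit mode: sign bit (bit 63) cleared and value not zero. */
bool bgtz_taken64(uint64_t rs)
{
    return ((rs >> 63) == 0) && (rs != 0);
}
```

The two modes can disagree on the same low 32 bits: 0x80000000 is negative as a 32-bit value, but the 64-bit value 0x0000000080000000 is positive.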
BGTZL    Branch On Greater Than Zero Likely

Bits    31-26    25-21   20-16   15-0
Field   BGTZL    rs      0       offset
Value   010111           00000

Format:
BGTZL rs, offset

Description:
A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. The contents of general register rs are compared to zero. If the contents of general register rs have the sign bit cleared and are not equal to zero, then the program branches to the target address, with a delay of one instruction.
If the conditional branch is not taken, the instruction in the branch delay slot is nullified.

Operation:
32  T:    target <- (offset_15)^14 || offset || 0^2
          condition <- (GPR[rs]_31 = 0) and (GPR[rs] ≠ 0^32)
    T+1:  if condition then
              PC <- PC + target
          else
              NullifyCurrentInstruction
          endif
64  T:    target <- (offset_15)^46 || offset || 0^2
          condition <- (GPR[rs]_63 = 0) and (GPR[rs] ≠ 0^64)
    T+1:  if condition then
              PC <- PC + target
          else
              NullifyCurrentInstruction
          endif

Exceptions:
None
BLEZ    Branch On Less Than Or Equal To Zero

Bits    31-26    25-21   20-16   15-0
Field   BLEZ     rs      0       offset
Value   000110           00000

Format:
BLEZ rs, offset

Description:
A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. The contents of general register rs are compared to zero. If the contents of general register rs have the sign bit set or are equal to zero, then the program branches to the target address, with a delay of one instruction.

Operation:
32  T:    target <- (offset_15)^14 || offset || 0^2
          condition <- (GPR[rs]_31 = 1) or (GPR[rs] = 0^32)
    T+1:  if condition then
              PC <- PC + target
          endif
64  T:    target <- (offset_15)^46 || offset || 0^2
          condition <- (GPR[rs]_63 = 1) or (GPR[rs] = 0^64)
    T+1:  if condition then
              PC <- PC + target
          endif

Exceptions:
None
BLEZL    Branch On Less Than Or Equal To Zero Likely

Bits    31-26    25-21   20-16   15-0
Field   BLEZL    rs      0       offset
Value   010110           00000

Format:
BLEZL rs, offset

Description:
A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. The contents of general register rs are compared to zero. If the contents of general register rs have the sign bit set or are equal to zero, then the program branches to the target address, with a delay of one instruction.
If the conditional branch is not taken, the instruction in the branch delay slot is nullified.

Operation:
32  T:    target <- (offset_15)^14 || offset || 0^2
          condition <- (GPR[rs]_31 = 1) or (GPR[rs] = 0^32)
    T+1:  if condition then
              PC <- PC + target
          else
              NullifyCurrentInstruction
          endif
64  T:    target <- (offset_15)^46 || offset || 0^2
          condition <- (GPR[rs]_63 = 1) or (GPR[rs] = 0^64)
    T+1:  if condition then
              PC <- PC + target
          else
              NullifyCurrentInstruction
          endif

Exceptions:
None
BLTZ    Branch On Less Than Zero

Bits    31-26    25-21   20-16   15-0
Field   REGIMM   rs      BLTZ    offset
Value   000001           00000

Format:
BLTZ rs, offset

Description:
A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. If the contents of general register rs have the sign bit set, then the program branches to the target address, with a delay of one instruction.

Operation:
32  T:    target <- (offset_15)^14 || offset || 0^2
          condition <- (GPR[rs]_31 = 1)
    T+1:  if condition then
              PC <- PC + target
          endif
64  T:    target <- (offset_15)^46 || offset || 0^2
          condition <- (GPR[rs]_63 = 1)
    T+1:  if condition then
              PC <- PC + target
          endif

Exceptions:
None
BLTZAL    Branch on Less Than Zero and Link

Encoding:
    Bits     31..26   25..21   20..16   15..0
    Field    REGIMM   rs       BLTZAL   offset
    Value    000001            10000

Format:
    BLTZAL rs, offset

Description:
    A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. Unconditionally, the address of the instruction after the delay slot is placed in the link register, r31. If the contents of general register rs have the sign bit set, then the program branches to the target address, with a delay of one instruction.
    General register rs may not be general register r31, because such an instruction is not restartable. An attempt to execute such an instruction is not trapped, however.

Operation:
    32  T:    target ← (offset_15)^14 || offset || 0^2
              condition ← (GPR[rs]_31 = 1)
              GPR[31] ← PC + 8
        T+1:  if condition then
                  PC ← PC + target
              endif

    64  T:    target ← (offset_15)^46 || offset || 0^2
              condition ← (GPR[rs]_63 = 1)
              GPR[31] ← PC + 8
        T+1:  if condition then
                  PC ← PC + target
              endif

Exceptions:
    None
BLTZALL    Branch on Less Than Zero and Link Likely

Encoding:
    Bits     31..26   25..21   20..16    15..0
    Field    REGIMM   rs       BLTZALL   offset
    Value    000001            10010

Format:
    BLTZALL rs, offset

Description:
    A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. Unconditionally, the address of the instruction after the delay slot is placed in the link register, r31. If the contents of general register rs have the sign bit set, then the program branches to the target address, with a delay of one instruction.
    General register rs may not be general register r31, because such an instruction is not restartable. An attempt to execute such an instruction is not trapped, however.
    If the conditional branch is not taken, the instruction in the branch delay slot is nullified.

Operation:
    32  T:    target ← (offset_15)^14 || offset || 0^2
              condition ← (GPR[rs]_31 = 1)
              GPR[31] ← PC + 8
        T+1:  if condition then
                  PC ← PC + target
              else
                  NullifyCurrentInstruction
              endif

    64  T:    target ← (offset_15)^46 || offset || 0^2
              condition ← (GPR[rs]_63 = 1)
              GPR[31] ← PC + 8
        T+1:  if condition then
                  PC ← PC + target
              else
                  NullifyCurrentInstruction
              endif

Exceptions:
    None
BLTZL    Branch on Less Than Zero Likely

Encoding:
    Bits     31..26   25..21   20..16   15..0
    Field    REGIMM   rs       BLTZL    offset
    Value    000001            00010

Format:
    BLTZL rs, offset

Description:
    A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. If the contents of general register rs have the sign bit set, then the program branches to the target address, with a delay of one instruction. If the conditional branch is not taken, the instruction in the branch delay slot is nullified.

Operation:
    32  T:    target ← (offset_15)^14 || offset || 0^2
              condition ← (GPR[rs]_31 = 1)
        T+1:  if condition then
                  PC ← PC + target
              else
                  NullifyCurrentInstruction
              endif

    64  T:    target ← (offset_15)^46 || offset || 0^2
              condition ← (GPR[rs]_63 = 1)
        T+1:  if condition then
                  PC ← PC + target
              else
                  NullifyCurrentInstruction
              endif

Exceptions:
    None
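BLTZ, BLTZL, BLTZAL, and BLTZALL share the REGIMM opcode and differ only in bits 20 to 16. A small decoder sketch in C, using the bit patterns shown in the encodings above, makes that explicit. The enum and function names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

enum regimm_branch { RB_BLTZ, RB_BLTZL, RB_BLTZAL, RB_BLTZALL, RB_OTHER };

/* Classify a 32-bit instruction word as one of the four BLTZ variants.
 * Bits 31..26 must be REGIMM (000001); bits 20..16 select the variant. */
static enum regimm_branch decode_regimm_bltz(uint32_t insn)
{
    if ((insn >> 26) != 0x01u) {
        return RB_OTHER;
    }
    switch ((insn >> 16) & 0x1Fu) {
    case 0x00: return RB_BLTZ;     /* 00000 */
    case 0x02: return RB_BLTZL;    /* 00010 */
    case 0x10: return RB_BLTZAL;   /* 10000 */
    case 0x12: return RB_BLTZALL;  /* 10010 */
    default:   return RB_OTHER;    /* other REGIMM encodings, not covered here */
    }
}
```

In this numbering, bit 20 distinguishes the linking forms and bit 17 the "likely" forms.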
BNE    Branch on Not Equal

Encoding:
    Bits     31..26   25..21   20..16   15..0
    Field    BNE      rs       rt       offset
    Value    000101

Format:
    BNE rs, rt, offset

Description:
    A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. The contents of general register rs and the contents of general register rt are compared. If the two registers are not equal, then the program branches to the target address, with a delay of one instruction.

Operation:
    32  T:    target ← (offset_15)^14 || offset || 0^2
              condition ← (GPR[rs] ≠ GPR[rt])
        T+1:  if condition then
                  PC ← PC + target
              endif

    64  T:    target ← (offset_15)^46 || offset || 0^2
              condition ← (GPR[rs] ≠ GPR[rt])
        T+1:  if condition then
                  PC ← PC + target
              endif

Exceptions:
    None
BNEL    Branch on Not Equal Likely

Encoding:
    Bits     31..26   25..21   20..16   15..0
    Field    BNEL     rs       rt       offset
    Value    010101

Format:
    BNEL rs, rt, offset

Description:
    A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. The contents of general register rs and the contents of general register rt are compared. If the two registers are not equal, then the program branches to the target address, with a delay of one instruction.
    If the conditional branch is not taken, the instruction in the branch delay slot is nullified.

Operation:
    32  T:    target ← (offset_15)^14 || offset || 0^2
              condition ← (GPR[rs] ≠ GPR[rt])
        T+1:  if condition then
                  PC ← PC + target
              else
                  NullifyCurrentInstruction
              endif

    64  T:    target ← (offset_15)^46 || offset || 0^2
              condition ← (GPR[rs] ≠ GPR[rt])
        T+1:  if condition then
                  PC ← PC + target
              else
                  NullifyCurrentInstruction
              endif

Exceptions:
    None
BREAK    Breakpoint

Encoding:
    Bits     31..26    25..6   5..0
    Field    SPECIAL   code    BREAK
    Value    000000            001101

Format:
    BREAK

Description:
    A breakpoint trap occurs, immediately and unconditionally transferring control to the exception handler.
    The code field is available for use as software parameters, but is retrieved by the exception handler only by loading the contents of the memory word containing the instruction.

Operation:
    32, 64  T:  BreakpointException

Exceptions:
    Breakpoint exception
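Since the code field is only available by reading the instruction word back from memory, a handler has to mask it out itself. The following C sketch assumes the handler has already loaded the 32-bit word containing the BREAK instruction; the function names are hypothetical.

```c
#include <stdint.h>

/* True if the word is SPECIAL (bits 31..26 = 000000)
 * with function BREAK (bits 5..0 = 001101). */
static int is_break(uint32_t insn)
{
    return (insn >> 26) == 0 && (insn & 0x3Fu) == 0x0Du;
}

/* Extract the 20-bit code field from bits 25..6 of the instruction word. */
static uint32_t break_code(uint32_t insn)
{
    return (insn >> 6) & 0xFFFFFu;
}
```

For example, a word with code 7 is `(7 << 6) | 0x0D`, and `break_code` returns 7 for it.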
CACHE    Cache Operation

Encoding:
    Bits     31..26   25..21   20..16   15..0
    Field    CACHE    base     op       offset
    Value    101111

Format:
    CACHE op, offset (base)

Description:
    The 16-bit offset is sign-extended and added to the contents of general register base to form a virtual address. The 5-bit sub-opcode op specifies a cache operation for that address.
    If CP0 is not usable (user or supervisor mode) and the CP0 enable bit in the Status register is cleared, a coprocessor unusable exception is taken. The operation of this instruction on any operation/cache combination not listed below, or on a secondary cache, is undefined. The operation of this instruction on uncached addresses is also undefined.
    The index operation uses part of the virtual address to specify a cache block. For a cache of 2^CACHEBITS bytes with 2^LINEBITS bytes per tag, vAddr_CACHEBITS..LINEBITS in the VR4121, VR4122, VR4181, and VR4181A, or vAddr_(CACHEBITS-2)..LINEBITS in the VR4131, specifies the block. In the VR4131, bit 31 of the virtual address indicates the way of the cache to be used.
    The hit operation translates the virtual address to a physical address using the TLB, accesses the specified cache as normal data references, and performs the specified operation if the cache block contains valid data with the specified physical address (a hit). If the cache block is invalid or contains a different address (a miss), no operation is performed.
    Write back from a primary cache goes to memory. The address to be written is specified by the cache tag and not the translated physical address.
    TLB refill and TLB invalid exceptions can occur on any operation. For index operations (where the physical address is used to index the cache but need not match the cache tag), unmapped addresses may be used to avoid TLB exceptions. This operation never causes a TLB modified exception.
    Bits 17 and 16 (op_1..0) of the instruction code specify the cache as follows:

    op_1..0   Name   Cache
    0         I      Instruction cache
    1         D      Data cache
    2         -      Reserved
    3         -      Reserved
    Bits 20 to 18 (op_4..2) of the instruction specify the operation as follows:

    op_4..2 = 0, cache I: Index_Invalidate
        Set the cache state of the cache block to invalid. This operation can also be used to cancel lock of a cache block in the VR4131.

    op_4..2 = 0, cache D: Index_Write_Back_Invalidate
        Examine the cache state and W bit of the primary data cache block at the index specified by the virtual address. If the state is not invalid and the W bit is set, then write back the block to memory. The address to write is taken from the primary cache tag. Set the cache state of the primary cache block to invalid. This operation can also be used to cancel lock of a cache block in the VR4131.

    op_4..2 = 1, cache I, D: Index_Load_Tag
        Read the tag for the cache block at the specified index and place it into the TagLo register of the CP0.

    op_4..2 = 2, cache I, D: Index_Store_Tag
        Write the tag for the cache block at the specified index from the TagLo register of the CP0.

    op_4..2 = 3, cache D: Create_Dirty_Exclusive
        This operation is used to avoid loading data needlessly from memory when writing new contents into an entire cache block. If the cache block does not contain the specified address, and the block is dirty, write it back to the memory. In all cases, set the cache state to dirty.

    op_4..2 = 4, cache I, D: Hit_Invalidate
        If the cache block contains the specified address, mark the cache block invalid. This operation can also be used to cancel lock of a cache block in the VR4131.

    op_4..2 = 5, cache D: Hit_Write_Back_Invalidate
        If the cache block contains the specified address, write back the data if it is dirty, and mark the cache block invalid.

    op_4..2 = 5, cache I: Fill
        Fill the primary instruction cache block from memory. This operation can also be used to cancel lock of a cache block in the VR4131.

    op_4..2 = 6, cache D: Hit_Write_Back
        If the cache block contains the specified address, and the W bit is set, write back the data to memory and clear the W bit.

    op_4..2 = 6, cache I: Hit_Write_Back
        If the cache block contains the specified address, write back the data unconditionally.

    op_4..2 = 7, cache I, D: Fetch_and_Lock
        For the VR4131 only. If the cache block contains the specified address, fill the cache block from memory. Locks the cache line regardless of refilling the cache block.
Operation:
    32, 64  T:  vAddr ← ((offset_15)^48 || offset_15..0) + GPR[base]
                (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
                CacheOp (op, vAddr, pAddr)

Exceptions:
    Coprocessor unusable exception
    TLB refill exception
    TLB invalid exception
    Bus error exception
    Address error exception
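The 5-bit op field of CACHE combines the operation (op_4..2) and the cache selector (op_1..0). A minimal C sketch of how the field and the full instruction word could be assembled from the encoding given above follows. The enum and helper names are hypothetical, and no checking is done for reserved or undefined operation/cache combinations.

```c
#include <stdint.h>

/* Cache selector values for op_1..0. */
enum cache_sel { CACHE_I = 0, CACHE_D = 1 };

/* Build the 5-bit op field: operation number in op_4..2, cache in op_1..0. */
static uint32_t cache_op_field(unsigned operation, enum cache_sel cache)
{
    return ((operation & 0x7u) << 2) | ((unsigned)cache & 0x3u);
}

/* Assemble a CACHE instruction word:
 * opcode 101111 | base (5 bits) | op (5 bits) | offset (16 bits). */
static uint32_t encode_cache(unsigned base, unsigned op, uint16_t offset)
{
    return (0x2Fu << 26)
         | ((uint32_t)(base & 0x1Fu) << 21)
         | ((uint32_t)(op & 0x1Fu) << 16)
         | (uint32_t)offset;
}
```

For instance, Index_Write_Back_Invalidate on the data cache is operation 0 with cache D, giving an op field of 1; with base register 4 and offset 0 the assembled word is 0xBC810000.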
DADD    Doubleword Add

Encoding:
    Bits     31..26    25..21   20..16   15..11   10..6   5..0
    Field    SPECIAL   rs       rt       rd       0       DADD
    Value    000000                               00000   101100

Format:
    DADD rd, rs, rt

Description:
    The contents of general register rs and the contents of general register rt are added to form the result. The result is placed into general register rd.
    An integer overflow exception occurs if the carries out of bits 62 and 63 differ (2's complement overflow). The destination register rd is not modified when an integer overflow exception occurs.
    This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit kernel mode. Execution of this instruction in 32-bit user or supervisor mode causes a reserved instruction exception.

Operation:
    32, 64  T:  GPR[rd] ← GPR[rs] + GPR[rt]

Exceptions:
    Integer overflow exception
    Reserved instruction exception (VR4100 Series in 32-bit user mode, VR4100 Series in 32-bit supervisor mode)
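The overflow rule (carries out of bits 62 and 63 differ) is equivalent to saying that both operands have the same sign and the sum has the opposite sign. The C sketch below models that check and leaves the destination untouched on overflow, as the description states. It is an illustration under those assumptions; the function name and the boolean return convention are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Model of DADD: returns true if an integer overflow exception would be
 * raised, in which case *rd is left unmodified. Otherwise stores rs + rt. */
static bool dadd_model(int64_t rs, int64_t rt, int64_t *rd)
{
    uint64_t a = (uint64_t)rs;
    uint64_t b = (uint64_t)rt;
    uint64_t sum = a + b;                 /* wraps modulo 2^64, no UB */

    /* Overflow iff a and b agree in sign and sum disagrees with a. */
    bool overflow = ((~(a ^ b) & (a ^ sum)) >> 63) != 0;
    if (overflow) {
        return true;
    }
    *rd = rs + rt;                        /* safe: no signed overflow here */
    return false;
}
```

DADDU would correspond to the same addition with the overflow test removed.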
DADDI    Doubleword Add Immediate

Encoding:
    Bits     31..26   25..21   20..16   15..0
    Field    DADDI    rs       rt       immediate
    Value    011000

Format:
    DADDI rt, rs, immediate

Description:
    The 16-bit immediate is sign-extended and added to the contents of general register rs to form the result. The result is placed into general register rt.
    An integer overflow exception occurs if the carries out of bits 62 and 63 differ (2's complement overflow). The destination register rt is not modified when an integer overflow exception occurs.
    This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit kernel mode. Execution of this instruction in 32-bit user or supervisor mode causes a reserved instruction exception.

Operation:
    32, 64  T:  GPR[rt] ← GPR[rs] + (immediate_15)^48 || immediate_15..0

Exceptions:
    Integer overflow exception
    Reserved instruction exception (VR4100 Series in 32-bit user mode, VR4100 Series in 32-bit supervisor mode)
DADDIU    Doubleword Add Immediate Unsigned

Encoding:
    Bits     31..26   25..21   20..16   15..0
    Field    DADDIU   rs       rt       immediate
    Value    011001

Format:
    DADDIU rt, rs, immediate

Description:
    The 16-bit immediate is sign-extended and added to the contents of general register rs to form the result. The result is placed into general register rt.
    The only difference between this instruction and the DADDI instruction is that DADDIU never causes an overflow exception.
    This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit kernel mode. Execution of this instruction in 32-bit user or supervisor mode causes a reserved instruction exception.

Operation:
    64  T:  GPR[rt] ← GPR[rs] + (immediate_15)^48 || immediate_15..0

Exceptions:
    Reserved instruction exception (VR4100 Series in 32-bit user mode, VR4100 Series in 32-bit supervisor mode)
DADDU    Doubleword Add Unsigned

Encoding:
    Bits     31..26    25..21   20..16   15..11   10..6   5..0
    Field    SPECIAL   rs       rt       rd       0       DADDU
    Value    000000                               00000   101101

Format:
    DADDU rd, rs, rt

Description:
    The contents of general register rs and the contents of general register rt are added to form the result. The result is placed into general register rd.
    The only difference between this instruction and the DADD instruction is that DADDU never causes an overflow exception.
    This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit kernel mode. Execution of this instruction in 32-bit user or supervisor mode causes a reserved instruction exception.

Operation:
    64  T:  GPR[rd] ← GPR[rs] + GPR[rt]

Exceptions:
    Reserved instruction exception (VR4100 Series in 32-bit user mode, VR4100 Series in 32-bit supervisor mode)
DDIV    Doubleword Divide

Encoding:
    Bits     31..26    25..21   20..16   15..6          5..0
    Field    SPECIAL   rs       rt       0              DDIV
    Value    000000                      00 0000 0000   011110

Format:
    DDIV rs, rt

Description:
    The contents of general register rs are divided by the contents of general register rt, treating both operands as 2's complement values. No overflow exception occurs under any circumstances, and the result of this operation is undefined when the divisor is zero.
    This instruction is typically followed by additional instructions to check for a zero divisor and for overflow.
    When the operation completes, the doubleword quotient of the result is loaded into special register LO, and the doubleword remainder of the result is loaded into special register HI.
    If either of the two preceding instructions is MFHI or MFLO, the results of those instructions are undefined. Correct operation requires separating reads of HI or LO from writes by two or more instructions.
    This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit kernel mode. Execution of this instruction in 32-bit user or supervisor mode causes a reserved instruction exception.

Operation:
    32, 64  T-2:  LO ← undefined
                  HI ← undefined
            T-1:  LO ← undefined
                  HI ← undefined
            T:    LO ← GPR[rs] div GPR[rt]
                  HI ← GPR[rs] mod GPR[rt]

Exceptions:
    Reserved instruction exception (VR4100 Series in 32-bit user mode, VR4100 Series in 32-bit supervisor mode)
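Because DDIV raises no exception for a zero divisor or for overflow, software is expected to test for both. The following C sketch shows the two checks a caller might make around a signed 64-bit divide. It is a model, not the manual's code: the names are hypothetical, and the quotient and remainder follow C's truncating division, which is an assumption about rounding that this section does not spell out.

```c
#include <stdint.h>

enum ddiv_status { DDIV_OK, DDIV_ZERO_DIVISOR, DDIV_OVERFLOW };

/* Signed 64-bit divide with the software checks described above.
 * On DDIV_OK, *lo receives the quotient and *hi the remainder.
 * On any other status, *lo and *hi are left untouched. */
static enum ddiv_status ddiv_checked(int64_t rs, int64_t rt,
                                     int64_t *lo, int64_t *hi)
{
    if (rt == 0) {
        return DDIV_ZERO_DIVISOR;      /* hardware result is undefined */
    }
    if (rs == INT64_MIN && rt == -1) {
        return DDIV_OVERFLOW;          /* quotient 2^63 does not fit */
    }
    *lo = rs / rt;                     /* truncates toward zero in C99 */
    *hi = rs % rt;                     /* sign follows the dividend */
    return DDIV_OK;
}
```

The only signed overflow case is the most negative value divided by -1, which is why a single comparison pair is enough.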
DDIVU    Doubleword Divide Unsigned

Encoding:
    Bits     31..26    25..21   20..16   15..6          5..0
    Field    SPECIAL   rs       rt       0              DDIVU
    Value    000000                      00 0000 0000   011111

Format:
    DDIVU rs, rt

Description:
    The contents of general register rs are divided by the contents of general register rt, treating both operands as unsigned values. No integer overflow exception occurs under any circumstances, and the result of this operation is undefined when the divisor is zero.
    This instruction may be followed by additional instructions to check for a zero divisor.
    When the operation completes, the doubleword quotient of the result is loaded into special register LO, and the doubleword remainder of the result is loaded into special register HI.
    If either of the two preceding instructions is MFHI or MFLO, the results of those instructions are undefined. Correct operation requires separating reads of HI or LO from writes by two or more instructions.
    This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit kernel mode. Execution of this instruction in 32-bit user or supervisor mode causes a reserved instruction exception.

Operation:
    64  T-2:  LO ← undefined
              HI ← undefined
        T-1:  LO ← undefined
              HI ← undefined
        T:    LO ← (0 || GPR[rs]) div (0 || GPR[rt])
              HI ← (0 || GPR[rs]) mod (0 || GPR[rt])

Exceptions:
    Reserved instruction exception (VR4100 Series in 32-bit user mode, VR4100 Series in 32-bit supervisor mode)
DIV    Divide

Encoding:
    Bits     31..26    25..21   20..16   15..6          5..0
    Field    SPECIAL   rs       rt       0              DIV
    Value    000000                      00 0000 0000   011010

Format:
    DIV rs, rt

Description:
    The contents of general register rs are divided by the contents of general register rt, treating both operands as 2's complement values. No overflow exception occurs under any circumstances, and the result of this operation is undefined when the divisor is zero.
    In 64-bit mode, the operands must be valid sign-extended, 32-bit values.
    This instruction is typically followed by additional instructions to check for a zero divisor and for overflow.
    When the operation completes, the doubleword quotient of the result is loaded into special register LO, and the doubleword remainder of the result is loaded into special register HI.
    If either of the two preceding instructions is MFHI or MFLO, the results of those instructions are undefined. Correct operation requires separating reads of HI or LO from writes by two or more instructions.

Restrictions:
    If the value of either general register rt or general register rs is not a sign-extended 32-bit value (bits 63 to 31 have the same value), the result of this operation will be undefined.

Operation:
    32  T-2:  LO ← undefined
              HI ← undefined
        T-1:  LO ← undefined
              HI ← undefined
        T:    LO ← GPR[rs] div GPR[rt]
              HI ← GPR[rs] mod GPR[rt]

    64  T-2:  LO ← undefined
              HI ← undefined
        T-1:  LO ← undefined
              HI ← undefined
        T:    q  ← GPR[rs]_31..0 div GPR[rt]_31..0
              r  ← GPR[rs]_31..0 mod GPR[rt]_31..0
              LO ← (q_31)^32 || q_31..0
              HI ← (r_31)^32 || r_31..0

Exceptions:
    None
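The restriction that bits 63 to 31 must all have the same value can be tested directly. This C sketch is one way to express that check for a 64-bit register value; the function name is hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* True if v is a sign-extended 32-bit value, i.e. bits 63..31 are
 * either all zeros or all ones (33 bits in total). */
static bool is_sign_extended_32(uint64_t v)
{
    uint64_t top = v >> 31;            /* keeps bits 63..31 */
    return top == 0 || top == 0x1FFFFFFFFULL;
}
```

A value such as 0x0000000080000000 fails the test: bit 31 is set but bits 63 to 32 are clear, so it is not a valid operand for DIV in 64-bit mode under this restriction.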
DIVU    Divide Unsigned

Encoding:
    Bits     31..26    25..21   20..16   15..6          5..0
    Field    SPECIAL   rs       rt       0              DIVU
    Value    000000                      00 0000 0000   011011

Format:
    DIVU rs, rt

Description:
    The contents of general register rs are divided by the contents of general register rt, treating both operands as unsigned values. No integer overflow exception occurs under any circumstances, and the result of this operation is undefined when the divisor is zero.
    In 64-bit mode, the operands must be valid sign-extended, 32-bit values.
    This instruction is typically followed by additional instructions to check for a zero divisor.
    When the operation completes, the doubleword quotient of the result is loaded into special register LO, and the doubleword remainder of the result is loaded into special register HI.
    If either of the two preceding instructions is MFHI or MFLO, the results of those instructions are undefined. Correct operation requires separating reads of HI or LO from writes by two or more instructions.

Restrictions:
    If the value of either general register rt or general register rs is not a sign-extended 32-bit value (bits 63 to 31 have the same value), the result of this operation will be undefined.

Operation:
    32  T-2:  LO ← undefined
              HI ← undefined
        T-1:  LO ← undefined
              HI ← undefined
        T:    LO ← (0 || GPR[rs]) div (0 || GPR[rt])
              HI ← (0 || GPR[rs]) mod (0 || GPR[rt])

    64  T-2:  LO ← undefined
              HI ← undefined
        T-1:  LO ← undefined
              HI ← undefined
        T:    q  ← (0 || GPR[rs]_31..0) div (0 || GPR[rt]_31..0)
              r  ← (0 || GPR[rs]_31..0) mod (0 || GPR[rt]_31..0)
              LO ← (q_31)^32 || q_31..0
              HI ← (r_31)^32 || r_31..0

Exceptions:
    None
DMACC    Doubleword Multiply and Add Accumulate
(For VR4121, VR4122, VR4131, and VR4181A)

Encoding:
    Bits     31..26    25..21   20..16   15..11   10    9    8..7   6    5..0
    Field    SPECIAL   rs       rt       rd       sat   hi   0      us   DMACC
    Value    000000                                          00          101001

Format:
    DMACC rd, rs, rt
    DMACCU rd, rs, rt
    DMACCHI rd, rs, rt
    DMACCHIU rd, rs, rt
    DMACCS rd, rs, rt
    DMACCUS rd, rs, rt
    DMACCHIS rd, rs, rt
    DMACCHIUS rd, rs, rt

Description:
    The mnemonics of the DMACC instruction differ as shown in the table below by the setting of the sat, hi, or us bits.

    Mnemonic     sat   hi   us
    DMACC        0     0    0
    DMACCU       0     0    1
    DMACCHI      0     1    0
    DMACCHIU     0     1    1
    DMACCS       1     0    0
    DMACCUS      1     0    1
    DMACCHIS     1     1    0
    DMACCHIUS    1     1    1

    The number of valid bits in the operands differs depending on whether saturation processing is executed (sat = 1) or not (sat = 0).

    - When saturation processing is executed (sat = 1): DMACCS, DMACCUS, DMACCHIS, and DMACCHIUS instructions
      The contents of general register rs are multiplied by the contents of general register rt. If us = 1, the contents of both operands are handled as 16-bit unsigned data. If us = 0, the contents are handled as 16-bit signed integers. Sign/zero extension by software is required for bits 16 to 31 in the operands.
      The product of this multiply operation is added to the value in special register LO. If us = 1, this add operation handles the values being added as 32-bit unsigned data. If us = 0, the values are handled as 32-bit signed integers. Sign/zero extension by software is required for bits 32 to 63 in special register LO.
      After saturation processing of 32 bits has been performed (refer to the table below), the sum from this add operation is loaded to special register LO. When hi = 1, data that is the same as the data loaded to special register HI is also loaded to general register rd. When hi = 0, data that is the same as the data loaded to special register LO is also loaded to general register rd. Overflow exceptions do not occur.

    - When saturation processing is not executed (sat = 0): DMACC, DMACCU, DMACCHI, and DMACCHIU instructions
      The contents of general register rs are multiplied by the contents of general register rt. If us = 1, the contents of both operands are handled as 32-bit unsigned data. If us = 0, the contents are handled as 32-bit signed integers. Sign/zero extension by software is required for bits 32 to 63 in the operands.
      The product of this multiply operation is added to the value in special register LO. If us = 1, this add operation handles the values being added as 64-bit unsigned data. If us = 0, the values are handled as 64-bit signed integers.
      The sum from this add operation is loaded to special register LO. When hi = 1, data that is the same as the data loaded to special register HI is also loaded to general register rd. When hi = 0, data that is the same as the data loaded to special register LO is also loaded to general register rd. Overflow exceptions do not occur.

    These operations are defined for 64-bit mode and 32-bit kernel mode. A reserved instruction exception occurs if one of these instructions is executed during 32-bit user/supervisor mode.
    The correspondence of us and sat settings and values stored during saturation processing is shown below, along with the hazard cycles required between execution of the instruction for manipulating the HI and LO registers and execution of the DMACC instruction.

    Values stored during saturation processing:

    us   sat   Overflow                          Underflow
    0    0     Store calculation result as is    Store calculation result as is
    1    0     Store calculation result as is    Store calculation result as is
    0    1     0x0000 0000 7FFF FFFF             0xFFFF FFFF 8000 0000
    1    1     0xFFFF FFFF FFFF FFFF             None

    Hazard cycle counts:

    Instruction      Cycle count
    MULT, MULTU      Note 1
    DMULT, DMULTU    3
    DIV, DIVU        36
    DDIV, DDIVU      68
    MFHI, MFLO       Note 2
    MTHI, MTLO       0
    MACC             0
    DMACC            0

    Notes 1. VR4121, VR4122: 1
             VR4131: 0
             VR4181A: 1
          2. VR4121, VR4122: 2
             VR4131: 0
             VR4181A: 2

Operation:
    32, 64, sat = 0, hi = 0, us = 0 (DMACC instruction)
        T:  temp1 ← ((GPR[rs]_31)^32 || GPR[rs]) * ((GPR[rt]_31)^32 || GPR[rt])
            temp2 ← temp1 + LO
            LO ← temp2
            GPR[rd] ← LO

    32, 64, sat = 0, hi = 0, us = 1 (DMACCU instruction)
        T:  temp1 ← (0^32 || GPR[rs]) * (0^32 || GPR[rt])
            temp2 ← temp1 + LO
            LO ← temp2
            GPR[rd] ← LO

    32, 64, sat = 0, hi = 1, us = 0 (DMACCHI instruction)
        T:  temp1 ← ((GPR[rs]_31)^32 || GPR[rs]) * ((GPR[rt]_31)^32 || GPR[rt])
            temp2 ← temp1 + LO
            LO ← temp2
            GPR[rd] ← HI
    32, 64, sat = 0, hi = 1, us = 1 (DMACCHIU instruction)
        T:  temp1 ← (0^32 || GPR[rs]) * (0^32 || GPR[rt])
            temp2 ← temp1 + LO
            LO ← temp2
            GPR[rd] ← HI

    32, 64, sat = 1, hi = 0, us = 0 (DMACCS instruction)
        T:  temp1 ← ((GPR[rs]_31)^32 || GPR[rs]) * ((GPR[rt]_31)^32 || GPR[rt])
            temp2 ← saturation(temp1 + LO)
            LO ← temp2
            GPR[rd] ← LO

    32, 64, sat = 1, hi = 0, us = 1 (DMACCUS instruction)
        T:  temp1 ← (0^32 || GPR[rs]) * (0^32 || GPR[rt])
            temp2 ← saturation(temp1 + LO)
            LO ← temp2
            GPR[rd] ← LO

    32, 64, sat = 1, hi = 1, us = 0 (DMACCHIS instruction)
        T:  temp1 ← ((GPR[rs]_31)^32 || GPR[rs]) * ((GPR[rt]_31)^32 || GPR[rt])
            temp2 ← saturation(temp1 + LO)
            LO ← temp2
            GPR[rd] ← HI

    32, 64, sat = 1, hi = 1, us = 1 (DMACCHIUS instruction)
        T:  temp1 ← (0^32 || GPR[rs]) * (0^32 || GPR[rt])
            temp2 ← saturation(temp1 + LO)
            LO ← temp2
            GPR[rd] ← HI

Exceptions:
    Reserved instruction exception (in 32-bit user/supervisor mode)
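All eight DMACC mnemonics share one encoding and differ only in the sat, hi, and us bits. As an illustration, the C sketch below assembles the instruction word from those three flags, assuming the field positions read from the encoding diagram for this instruction (sat at bit 10, hi at bit 9, us at bit 6, function code 101001). The function name is hypothetical.

```c
#include <stdint.h>

/* Assemble a DMACC-family instruction word.
 * sat, hi, us are single-bit flags selecting the mnemonic:
 *   (0,0,0) DMACC   (0,0,1) DMACCU   (0,1,0) DMACCHI   (0,1,1) DMACCHIU
 *   (1,0,0) DMACCS  (1,0,1) DMACCUS  (1,1,0) DMACCHIS  (1,1,1) DMACCHIUS */
static uint32_t encode_dmacc(unsigned rd, unsigned rs, unsigned rt,
                             unsigned sat, unsigned hi, unsigned us)
{
    return ((uint32_t)(rs & 0x1Fu) << 21)   /* bits 25..21 */
         | ((uint32_t)(rt & 0x1Fu) << 16)   /* bits 20..16 */
         | ((uint32_t)(rd & 0x1Fu) << 11)   /* bits 15..11 */
         | ((uint32_t)(sat & 1u) << 10)     /* bit 10 */
         | ((uint32_t)(hi & 1u) << 9)       /* bit 9 */
         | ((uint32_t)(us & 1u) << 6)       /* bit 6; bits 8..7 stay 00 */
         | 0x29u;                           /* function 101001, SPECIAL opcode is 0 */
}
```

With all register fields zero, DMACC encodes as 0x00000029 and DMACCHIUS as 0x00000669.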
DMADD16    Doubleword Multiply and Add 16-bit Integer
(For VR4181 only)

Encoding:
    Bits     31..26    25..21   20..16   15..6          5..0
    Field    SPECIAL   rs       rt       0              DMADD16
    Value    000000                      00 0000 0000   101001

Format:
    DMADD16 rs, rt

Description:
    The contents of general registers rs and rt are multiplied, treating both operands as 16-bit 2's complement values. Bits 62 to 15 of the operand must be sign-extended values. This multiplied result and the contents of special register LO are added to form the result as a signed integer. When the operation completes, the doubleword result is loaded into special register LO.
    No integer overflow exception occurs under any circumstances.
    This operation is defined for the VR4181 operating in 64-bit mode or in 32-bit kernel mode. Execution of this instruction in 32-bit user or supervisor mode causes a reserved instruction exception.
    The following table shows hazard cycles between DMADD16 and other instructions.

    Instruction sequence         No. of cycles
    MULT/MULTU → DMADD16         1 cycle
    DMULT/DMULTU → DMADD16       4 cycles
    DIV/DIVU → DMADD16           36 cycles
    DDIV/DDIVU → DMADD16         68 cycles
    MFHI/MFLO → DMADD16          2 cycles
    MADD16 → DMADD16             0 cycles
    DMADD16 → DMADD16            0 cycles

Operation:
    32, 64  T-2:  LO ← undefined
                  HI ← undefined
            T-1:  LO ← undefined
                  HI ← undefined
            T:    temp ← GPR[rs] * GPR[rt]
                  temp ← temp + LO
                  LO ← temp
                  HI ← undefined

Exceptions:
    Reserved instruction exception (VR4181 in 32-bit user mode, VR4181 in 32-bit supervisor mode)
DMFC0    Doubleword Move from System Control Coprocessor

Encoding:
    Bits     31..26   25..21   20..16   15..11   10..0
    Field    COP0     DMF      rt       rd       0
    Value    010000   00001                      000 0000 0000

Format:
    DMFC0 rt, rd

Description:
    The contents of coprocessor register rd of the CP0 are loaded into general register rt.
    This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit kernel mode. Execution of this instruction in 32-bit user or supervisor mode causes a reserved instruction exception. All 64 bits of the general register destination are written from the coprocessor register source. The operation of DMFC0 on a 32-bit coprocessor 0 register is undefined.

Operation:
    32, 64  T:    data ← CPR[0, rd]
            T+1:  GPR[rt] ← data

Exceptions:
    Coprocessor unusable exception (user and supervisor mode if CP0 not enabled)
    Reserved instruction exception (VR4100 Series in 32-bit user mode, VR4100 Series in 32-bit supervisor mode)
DMTC0    Doubleword Move to System Control Coprocessor

Encoding:
    Bits     31..26   25..21   20..16   15..11   10..0
    Field    COP0     DMT      rt       rd       0
    Value    010000   00101                      000 0000 0000

Format:
    DMTC0 rt, rd

Description:
    The contents of general register rt are loaded into coprocessor register rd of the CP0.
    This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit kernel mode. Execution of this instruction in 32-bit user or supervisor mode causes a reserved instruction exception.
    All 64 bits of the coprocessor register destination are written from the general register source. The operation of DMTC0 on a 32-bit coprocessor 0 register is undefined.
    Because the state of the virtual address translation system may be altered by this instruction, the operation of load instructions, store instructions, and TLB operations immediately prior to and after this instruction is undefined.

Operation:
    32, 64  T:    data ← GPR[rt]
            T+1:  CPR[0, rd] ← data

Exceptions:
    Coprocessor unusable exception (user and supervisor mode if CP0 not enabled)
    Reserved instruction exception (VR4100 Series in 32-bit user mode, VR4100 Series in 32-bit supervisor mode)
DMULT    Doubleword Multiply

Encoding:
    Bits     31..26    25..21   20..16   15..6          5..0
    Field    SPECIAL   rs       rt       0              DMULT
    Value    000000                      00 0000 0000   011100

Format:
    DMULT rs, rt

Description:
    The contents of general registers rs and rt are multiplied, treating both operands as 2's complement values. No integer overflow exception occurs under any circumstances.
    When the operation completes, the low-order doubleword of the result is loaded into special register LO, and the high-order doubleword of the result is loaded into special register HI.
    If either of the two preceding instructions is MFHI or MFLO, the results of these instructions are undefined. Correct operation requires separating reads of HI or LO from writes by a minimum of two other instructions.
    This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit kernel mode. Execution of this instruction in 32-bit user or supervisor mode causes a reserved instruction exception.

Operation:
    32, 64  T-2:  LO ← undefined
                  HI ← undefined
            T-1:  LO ← undefined
                  HI ← undefined
            T:    t ← GPR[rs] * GPR[rt]
                  LO ← t_63..0
                  HI ← t_127..64

Exceptions:
    Reserved instruction exception (VR4100 Series in 32-bit user mode, VR4100 Series in 32-bit supervisor mode)
DMULTU    Doubleword Multiply Unsigned

Encoding:
    Bits     31..26    25..21   20..16   15..6          5..0
    Field    SPECIAL   rs       rt       0              DMULTU
    Value    000000                      00 0000 0000   011101

Format:
    DMULTU rs, rt

Description:
    The contents of general register rs and the contents of general register rt are multiplied, treating both operands as unsigned values. No overflow exception occurs under any circumstances.
    When the operation completes, the low-order doubleword of the result is loaded into special register LO, and the high-order doubleword of the result is loaded into special register HI.
    If either of the two preceding instructions is MFHI or MFLO, the results of these instructions are undefined. Correct operation requires separating reads of HI or LO from writes by a minimum of two instructions.
    This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit kernel mode. Execution of this instruction in 32-bit user or supervisor mode causes a reserved instruction exception.

Operation:
    32, 64  T-2:  LO ← undefined
                  HI ← undefined
            T-1:  LO ← undefined
                  HI ← undefined
            T:    t ← (0 || GPR[rs]) * (0 || GPR[rt])
                  LO ← t_63..0
                  HI ← t_127..64

Exceptions:
    Reserved instruction exception (VR4100 Series in 32-bit user mode, VR4100 Series in 32-bit supervisor mode)
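The split of the 128-bit product into LO (bits 63 to 0) and HI (bits 127 to 64) can be modeled in portable C without a 128-bit type by multiplying 32-bit halves. The sketch below is a reference model for checking results, not a description of how the hardware multiplier works; the function name is hypothetical.

```c
#include <stdint.h>

/* Unsigned 64 x 64 -> 128-bit multiply.
 * *lo receives product bits 63..0, *hi receives bits 127..64. */
static void dmultu_model(uint64_t rs, uint64_t rt, uint64_t *hi, uint64_t *lo)
{
    uint64_t a_lo = rs & 0xFFFFFFFFu, a_hi = rs >> 32;
    uint64_t b_lo = rt & 0xFFFFFFFFu, b_hi = rt >> 32;

    /* Four 32 x 32 -> 64-bit partial products. */
    uint64_t p0 = a_lo * b_lo;
    uint64_t p1 = a_lo * b_hi;
    uint64_t p2 = a_hi * b_lo;
    uint64_t p3 = a_hi * b_hi;

    /* Sum of the terms that land in bits 63..32; cannot overflow 64 bits. */
    uint64_t mid = (p0 >> 32) + (p1 & 0xFFFFFFFFu) + (p2 & 0xFFFFFFFFu);

    *lo = (mid << 32) | (p0 & 0xFFFFFFFFu);
    *hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
}
```

As a check, multiplying 0xFFFFFFFFFFFFFFFF by itself gives 2^128 - 2^65 + 1, so HI is 0xFFFFFFFFFFFFFFFE and LO is 1.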
DSLL  Doubleword Shift Left Logical

Encoding:
    Bits 31-26   SPECIAL   000000
    Bits 25-21   0         00000
    Bits 20-16   rt
    Bits 15-11   rd
    Bits 10-6    sa
    Bits 5-0     DSLL      111000

Format:
    DSLL rd, rt, sa

Description:
    The contents of general register rt are shifted left by sa bits, inserting zeros into the low-order bits. The result is placed in general register rd.
    This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.

Operation:
    32, 64   T:  s       <- 0 || sa
                 GPR[rd] <- GPR[rt][(63-s)..0] || 0^s

Exceptions:
    Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
DSLLV  Doubleword Shift Left Logical Variable

Encoding:
    Bits 31-26   SPECIAL   000000
    Bits 25-21   rs
    Bits 20-16   rt
    Bits 15-11   rd
    Bits 10-6    0         00000
    Bits 5-0     DSLLV     010100

Format:
    DSLLV rd, rt, rs

Description:
    The contents of general register rt are shifted left by the number of bits specified by the low-order six bits contained in general register rs, inserting zeros into the low-order bits. The result is placed in general register rd.
    This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.

Operation:
    32, 64   T:  s       <- GPR[rs][5..0]
                 GPR[rd] <- GPR[rt][(63-s)..0] || 0^s

Exceptions:
    Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
DSLL32  Doubleword Shift Left Logical + 32

Encoding:
    Bits 31-26   SPECIAL   000000
    Bits 25-21   0         00000
    Bits 20-16   rt
    Bits 15-11   rd
    Bits 10-6    sa
    Bits 5-0     DSLL32    111100

Format:
    DSLL32 rd, rt, sa

Description:
    The contents of general register rt are shifted left by 32 + sa bits, inserting zeros into the low-order bits. The result is placed in general register rd.
    This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.

Operation:
    32, 64   T:  s       <- 1 || sa
                 GPR[rd] <- GPR[rt][(63-s)..0] || 0^s

Exceptions:
    Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
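The three doubleword left shifts differ only in how the 6-bit shift amount s is formed: 0 || sa for DSLL, 1 || sa for DSLL32, and the low-order six bits of rs for DSLLV. The Python sketch below is an illustrative model only; the function names are hypothetical, registers are plain integers, and the reserved instruction exception is not modeled.

```python
MASK64 = (1 << 64) - 1


def dsll(rt, sa):
    """Shift left by s = 0 || sa (0 to 31 bits)."""
    return (rt << (sa & 0x1F)) & MASK64


def dsll32(rt, sa):
    """Shift left by s = 1 || sa (32 to 63 bits)."""
    return (rt << (32 + (sa & 0x1F))) & MASK64


def dsllv(rt, rs):
    """Shift left by the low-order six bits of rs (0 to 63 bits)."""
    return (rt << (rs & 0x3F)) & MASK64
```

Because the sa field is only five bits wide, a constant shift of 32 bits or more has to be encoded with DSLL32; in this model dsll32(x, 0) is the 32-bit shift.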
DSRA  Doubleword Shift Right Arithmetic

Encoding:
    Bits 31-26   SPECIAL   000000
    Bits 25-21   0         00000
    Bits 20-16   rt
    Bits 15-11   rd
    Bits 10-6    sa
    Bits 5-0     DSRA      111011

Format:
    DSRA rd, rt, sa

Description:
    The contents of general register rt are shifted right by sa bits, sign-extending the high-order bits. The result is placed in general register rd.
    This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.

Operation:
    32, 64   T:  s       <- 0 || sa
                 GPR[rd] <- (GPR[rt][63])^s || GPR[rt][63..s]

Exceptions:
    Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
DSRAV  Doubleword Shift Right Arithmetic Variable

Encoding:
    Bits 31-26   SPECIAL   000000
    Bits 25-21   rs
    Bits 20-16   rt
    Bits 15-11   rd
    Bits 10-6    0         00000
    Bits 5-0     DSRAV     010111

Format:
    DSRAV rd, rt, rs

Description:
    The contents of general register rt are shifted right by the number of bits specified by the low-order six bits of general register rs, sign-extending the high-order bits. The result is placed in general register rd.
    This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.

Operation:
    32, 64   T:  s       <- GPR[rs][5..0]
                 GPR[rd] <- (GPR[rt][63])^s || GPR[rt][63..s]

Exceptions:
    Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
DSRA32  Doubleword Shift Right Arithmetic + 32

Encoding:
    Bits 31-26   SPECIAL   000000
    Bits 25-21   0         00000
    Bits 20-16   rt
    Bits 15-11   rd
    Bits 10-6    sa
    Bits 5-0     DSRA32    111111

Format:
    DSRA32 rd, rt, sa

Description:
    The contents of general register rt are shifted right by 32 + sa bits, sign-extending the high-order bits. The result is placed in general register rd.
    This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.

Operation:
    32, 64   T:  s       <- 1 || sa
                 GPR[rd] <- (GPR[rt][63])^s || GPR[rt][63..s]

Exceptions:
    Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
DSRL  Doubleword Shift Right Logical

Encoding:
    Bits 31-26   SPECIAL   000000
    Bits 25-21   0         00000
    Bits 20-16   rt
    Bits 15-11   rd
    Bits 10-6    sa
    Bits 5-0     DSRL      111010

Format:
    DSRL rd, rt, sa

Description:
    The contents of general register rt are shifted right by sa bits, inserting zeros into the high-order bits. The result is placed in general register rd.
    This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.

Operation:
    32, 64   T:  s       <- 0 || sa
                 GPR[rd] <- 0^s || GPR[rt][63..s]

Exceptions:
    Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
DSRLV  Doubleword Shift Right Logical Variable

Encoding:
    Bits 31-26   SPECIAL   000000
    Bits 25-21   rs
    Bits 20-16   rt
    Bits 15-11   rd
    Bits 10-6    0         00000
    Bits 5-0     DSRLV     010110

Format:
    DSRLV rd, rt, rs

Description:
    The contents of general register rt are shifted right by the number of bits specified by the low-order six bits of general register rs, inserting zeros into the high-order bits. The result is placed in general register rd.
    This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.

Operation:
    32, 64   T:  s       <- GPR[rs][5..0]
                 GPR[rd] <- 0^s || GPR[rt][63..s]

Exceptions:
    Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
DSRL32  Doubleword Shift Right Logical + 32

Encoding:
    Bits 31-26   SPECIAL   000000
    Bits 25-21   0         00000
    Bits 20-16   rt
    Bits 15-11   rd
    Bits 10-6    sa
    Bits 5-0     DSRL32    111110

Format:
    DSRL32 rd, rt, sa

Description:
    The contents of general register rt are shifted right by 32 + sa bits, inserting zeros into the high-order bits. The result is placed in general register rd.
    This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.

Operation:
    32, 64   T:  s       <- 1 || sa
                 GPR[rd] <- 0^s || GPR[rt][63..s]

Exceptions:
    Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
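The arithmetic and logical doubleword right shifts differ only in what fills the vacated high-order bits: copies of bit 63 for the DSRA family, zeros for the DSRL family. The Python sketch below models that contrast under stated assumptions: the function names are hypothetical, registers are plain integers, and the reserved instruction exception is not modeled.

```python
MASK64 = (1 << 64) - 1


def shift_right_logical(rt, s):
    """0^s || GPR[rt][63..s]"""
    return (rt & MASK64) >> s


def shift_right_arithmetic(rt, s):
    """(GPR[rt][63])^s || GPR[rt][63..s]"""
    rt &= MASK64
    # s copies of the sign bit, placed in the s high-order bit positions.
    sign_fill = (MASK64 << (64 - s)) & MASK64 if rt >> 63 else 0
    return (rt >> s) | sign_fill


def dsrl(rt, sa):
    return shift_right_logical(rt, sa & 0x1F)


def dsrl32(rt, sa):
    return shift_right_logical(rt, 32 + (sa & 0x1F))


def dsrlv(rt, rs):
    return shift_right_logical(rt, rs & 0x3F)


def dsra(rt, sa):
    return shift_right_arithmetic(rt, sa & 0x1F)


def dsra32(rt, sa):
    return shift_right_arithmetic(rt, 32 + (sa & 0x1F))


def dsrav(rt, rs):
    return shift_right_arithmetic(rt, rs & 0x3F)
```

For the operand 0x8000000000000000 and a shift of 4, this model gives 0xF800000000000000 for dsra and 0x0800000000000000 for dsrl.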
DSUB  Doubleword Subtract

Encoding:
    Bits 31-26   SPECIAL   000000
    Bits 25-21   rs
    Bits 20-16   rt
    Bits 15-11   rd
    Bits 10-6    0         00000
    Bits 5-0     DSUB      101110

Format:
    DSUB rd, rs, rt

Description:
    The contents of general register rt are subtracted from the contents of general register rs to form a result. The result is placed into general register rd.
    An integer overflow exception takes place if the carries out of bits 62 and 63 differ (2's complement overflow). The destination register rd is not modified when an integer overflow exception occurs.
    This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.

Operation:
    32, 64   T:  GPR[rd] <- GPR[rs] - GPR[rt]

Exceptions:
    Integer overflow exception
    Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
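The overflow condition for DSUB (carries out of bits 62 and 63 differ) is equivalent to the true signed difference falling outside the 64-bit 2's complement range. The Python sketch below uses that equivalent range check rather than tracking carries. It is an illustration only: the function names are hypothetical, and the integer overflow exception is represented by Python's built-in OverflowError.

```python
MASK64 = (1 << 64) - 1
INT64_MIN = -(1 << 63)
INT64_MAX = (1 << 63) - 1


def to_signed64(value):
    """Interpret the low 64 bits of value as a 2's complement integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def dsub_overflows(rs, rt):
    """True if GPR[rs] - GPR[rt] does not fit in 64-bit 2's complement."""
    diff = to_signed64(rs) - to_signed64(rt)
    return not INT64_MIN <= diff <= INT64_MAX


def dsub(rs, rt):
    """Return the new value of rd, or raise so that rd stays unmodified."""
    if dsub_overflows(rs, rt):
        raise OverflowError("integer overflow exception")
    return (rs - rt) & MASK64


def dsubu(rs, rt):
    """Same subtraction as dsub, but never traps on overflow."""
    return (rs - rt) & MASK64
```

In this model, subtracting 1 from 0x8000000000000000 raises for dsub, whereas dsubu simply wraps to 0x7FFFFFFFFFFFFFFF.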
DSUBU  Doubleword Subtract Unsigned

Encoding:
    Bits 31-26   SPECIAL   000000
    Bits 25-21   rs
    Bits 20-16   rt
    Bits 15-11   rd
    Bits 10-6    0         00000
    Bits 5-0     DSUBU     101111

Format:
    DSUBU rd, rs, rt

Description:
    The contents of general register rt are subtracted from the contents of general register rs to form a result. The result is placed into general register rd.
    The only difference between this instruction and the DSUB instruction is that DSUBU never traps on overflow.
    This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.

Operation:
    32, 64   T:  GPR[rd] <- GPR[rs] - GPR[rt]

Exceptions:
    Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
ERET  Exception Return

Encoding:
    Bits 31-26   COP0      010000
    Bit  25      CO        1
    Bits 24-6    0         0000000000000000000
    Bits 5-0     ERET      011000

Format:
    ERET

Description:
    ERET is the instruction for returning from an interrupt, exception, or error trap. Unlike a branch or jump instruction, ERET does not execute the next instruction.
    ERET must not itself be placed in a branch delay slot.
    If the processor is servicing an error trap (SR[2] = 1), the PC is loaded from the ErrorEPC register and the ERL bit of the Status register is cleared (SR[2] = 0). Otherwise (SR[2] = 0), the PC is loaded from the EPC register and the EXL bit of the Status register is cleared (SR[1] = 0).
    When MIPS16 instructions are enabled, the value of the EPC or ErrorEPC register with its least significant bit cleared to 0 is loaded into the PC, and the original least significant bit is reflected in the ISA mode bit (internal).

Operation:
    32, 64   T:  if SR[2] = 1 then
                     if MIPS16EN = 1 then
                         PC       <- ErrorEPC[63..1] || 0
                         ISA MODE <- ErrorEPC[0]
                     else
                         PC <- ErrorEPC
                     endif
                     SR <- SR[31..3] || 0 || SR[1..0]
                 else
                     if MIPS16EN = 1 then
                         PC       <- EPC[63..1] || 0
                         ISA MODE <- EPC[0]
                     else
                         PC <- EPC
                     endif
                     SR <- SR[31..2] || 0 || SR[0]
                 endif

Exceptions:
    Coprocessor unusable exception
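The ERET decision flow above can be traced with a small state model. The Python sketch below is illustrative only: the class Cp0Model and its field names are hypothetical, only the SR, EPC, ErrorEPC, PC and ISA mode values are modeled, and the coprocessor unusable exception is not.

```python
SR_EXL = 1 << 1   # Status register bit 1
SR_ERL = 1 << 2   # Status register bit 2
MASK32 = 0xFFFFFFFF


class Cp0Model:
    """Minimal hypothetical container for the state that ERET touches."""

    def __init__(self, sr, epc, error_epc):
        self.sr = sr
        self.epc = epc
        self.error_epc = error_epc
        self.pc = 0
        self.isa_mode = 0


def eret(state, mips16en=False):
    """Apply the ERET operation to state and return it."""
    if state.sr & SR_ERL:
        source = state.error_epc
        state.sr &= MASK32 & ~SR_ERL      # clear ERL
    else:
        source = state.epc
        state.sr &= MASK32 & ~SR_EXL      # clear EXL
    if mips16en:
        state.pc = source & ~1            # bit 0 cleared in the PC
        state.isa_mode = source & 1       # bit 0 selects the ISA mode
    else:
        state.pc = source
    return state
```

With both ERL and EXL set, a first eret call in this model returns through ErrorEPC and clears only ERL; a second call returns through EPC and clears EXL.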
HIBERNATE  Hibernate

Encoding:
    Bits 31-26   COP0        010000
    Bit  25      CO          1
    Bits 24-6    0           0000000000000000000
    Bits 5-0     HIBERNATE   100011

Format:
    HIBERNATE

Description:
    The HIBERNATE instruction starts the mode transition from Fullspeed mode to Hibernate mode.
    When the HIBERNATE instruction finishes the WB stage, the VR4100 Series waits until the SysAD bus is idle, and then fixes all clocks generated by the CPU core at high level, thus freezing the pipeline.
    Once the VR4100 Series is in Hibernate mode, the Cold Reset sequence causes it to exit Hibernate mode and enter Fullspeed mode.

Operation:
    32, 64   T:
             T+1:  Hibernate operation ()

Exceptions:
    Coprocessor unusable exception

Remark  Refer to the Hardware User's Manual of each product for details about the operation of the peripheral units at mode transition.
J  Jump

Encoding:
    Bits 31-26   J         000010
    Bits 25-0    target

Format:
    J target

Description:
    The 26-bit target address is shifted left by two bits and combined with the high-order four bits of the address of the delay slot. The program unconditionally jumps to this calculated address with a delay of one instruction.

Operation:
    32   T:    temp <- target
         T+1:  PC   <- PC[31..28] || temp || 0^2

    64   T:    temp <- target
         T+1:  PC   <- PC[63..28] || temp || 0^2

Exceptions:
    None
JAL  Jump and Link

Encoding:
    Bits 31-26   JAL       000011
    Bits 25-0    target

Format:
    JAL target

Description:
    The 26-bit target address is shifted left by two bits and combined with the high-order four bits of the address of the delay slot. The program unconditionally jumps to this calculated address with a delay of one instruction. The address of the instruction immediately after the delay slot is placed in the link register (r31). When MIPS16 instructions are enabled, the value of bit 0 of r31 indicates the ISA mode bit (internal) before the jump.

Operation:
    32   T:    temp <- target
               if MIPS16EN = 1 then
                   GPR[31] <- (PC + 8)[31..1] || ISA MODE
               else
                   GPR[31] <- PC + 8
               endif
         T+1:  PC <- PC[31..28] || temp || 0^2

    64   T:    temp <- target
               if MIPS16EN = 1 then
                   GPR[31] <- (PC + 8)[63..1] || ISA MODE
               else
                   GPR[31] <- PC + 8
               endif
         T+1:  PC <- PC[63..28] || temp || 0^2

Exceptions:
    None
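The target calculation shared by J and JAL, and the link value written by JAL, can be sketched as follows. This Python model is an illustration, not the manual's definition: the function names are hypothetical, pc is assumed to be the address of the J or JAL instruction itself (so the delay slot is at pc + 4), and addresses are plain integers.

```python
MASK32 = 0xFFFFFFFF
MASK64 = (1 << 64) - 1


def jump_target(pc, target, bits64=False):
    """Combine the 26-bit target with the upper bits of the delay slot address."""
    width_mask = MASK64 if bits64 else MASK32
    delay_slot = (pc + 4) & width_mask
    upper = delay_slot & ~0x0FFFFFFF          # PC[31..28] or PC[63..28]
    return upper | ((target & 0x03FFFFFF) << 2)


def jal_link(pc, mips16en=False, isa_mode=0, bits64=False):
    """Value placed in r31: the address after the delay slot."""
    width_mask = MASK64 if bits64 else MASK32
    link = (pc + 8) & width_mask
    if mips16en:
        link = (link & ~1) | (isa_mode & 1)   # bit 0 records the ISA mode
    return link
```

Because the upper bits come from the delay slot rather than from the jump itself, a J located at 0x8FFFFFFC in this model jumps into the region starting at 0x90000000.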
JALR  Jump and Link Register

Encoding:
    Bits 31-26   SPECIAL   000000
    Bits 25-21   rs
    Bits 20-16   0         00000
    Bits 15-11   rd
    Bits 10-6    0         00000
    Bits 5-0     JALR      001001

Format:
    JALR rs
    JALR rd, rs

Description:
    The program unconditionally jumps to the address contained in general register rs, with a delay of one instruction. When MIPS16 instructions are enabled, the program unconditionally jumps, with a delay of one instruction, to the address obtained by clearing the least significant bit of general register rs to 0. The content of the least significant bit of general register rs is then set in the ISA mode bit (internal).
    The address of the instruction immediately after the delay slot is placed in general register rd. The default value of rd, if omitted in the assembly language instruction, is 31. When MIPS16 instructions are enabled, the value of bit 0 of rd indicates the ISA mode bit before the jump.
    Register specifiers rs and rd should not be equal: storing the link address destroys the contents of rs, so such an instruction does not have the same effect when re-executed. However, an attempt to execute this instruction is not trapped, and the result of executing such an instruction is undefined.
    Since 32-bit length instructions must be word-aligned, a Jump and Link Register (JALR) instruction must specify a target register (rs) that contains an address whose two low-order bits are zero when MIPS16 instructions are enabled. If these low-order bits are not zero, an address error exception will occur when the jump target instruction is subsequently fetched.

Operation:
    32, 64   T:    temp <- GPR[rs]
                   if MIPS16EN = 1 then
                       GPR[rd] <- (PC + 8)[63..1] || ISA MODE
                   else
                       GPR[rd] <- PC + 8
                   endif
             T+1:  if MIPS16EN = 1 then
                       PC       <- temp[63..1] || 0
                       ISA MODE <- temp[0]
                   else
                       PC <- temp
                   endif

Exceptions:
    None
JALX  Jump and Link Exchange

Encoding:
    Bits 31-26   JALX      011101
    Bits 25-0    target

Format:
    JALX target

Description:
    When MIPS16 instructions are enabled, the 26-bit target address is shifted left by two bits and combined with the high-order four bits of the address of the delay slot. The program unconditionally jumps to the calculated address with a delay of one instruction. The address of the instruction immediately after the delay slot is placed in the link register (r31). The ISA mode bit is inverted with a delay of one instruction. The value of bit 0 of the link register (r31) indicates the ISA mode bit (internal) before the jump.

Operation:
    32   T:    temp    <- target
               GPR[31] <- (PC + 8)[31..1] || ISA MODE
         T+1:  PC      <- PC[31..28] || temp || 0^2
               ISA MODE toggle

    64   T:    temp    <- target
               GPR[31] <- (PC + 8)[63..1] || ISA MODE
         T+1:  PC      <- PC[63..28] || temp || 0^2
               ISA MODE toggle

Exceptions:
    Reserved instruction exception (when MIPS16 instruction execution is disabled)
JR  Jump Register

Encoding:
    Bits 31-26   SPECIAL   000000
    Bits 25-21   rs
    Bits 20-6    0         000000000000000
    Bits 5-0     JR        001000

Format:
    JR rs

Description:
    The program unconditionally jumps to the address contained in general register rs, with a delay of one instruction. When MIPS16 instructions are enabled, the program unconditionally jumps, with a delay of one instruction, to the address obtained by clearing the least significant bit of general register rs to 0. The content of the least significant bit of general register rs is then set in the ISA mode bit (internal).
    Since 32-bit length instructions must be word-aligned, a Jump Register (JR) instruction must specify a target register (rs) that contains an address whose two low-order bits are zero when MIPS16 instructions are enabled. If these low-order bits are not zero, an address error exception will occur when the jump target instruction is subsequently fetched.

Operation:
    32, 64   T:    temp <- GPR[rs]
             T+1:  if MIPS16EN = 1 then
                       PC       <- temp[63..1] || 0
                       ISA MODE <- temp[0]
                   else
                       PC <- temp
                   endif

Exceptions:
    None
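For the register jumps, bit 0 of rs doubles as the ISA mode selector when MIPS16 instructions are enabled. The Python sketch below models how JR and JALR split rs into a new PC and a new ISA mode bit. It is a sketch under stated assumptions: the function names are hypothetical, pc is taken to be the address of the jump instruction, and the address error raised on a later misaligned fetch is not modeled.

```python
MASK64 = (1 << 64) - 1


def jr(rs_value, mips16en=False, isa_mode=0):
    """Return (new_pc, new_isa_mode) for a jump through rs."""
    if mips16en:
        return rs_value & ~1 & MASK64, rs_value & 1
    return rs_value & MASK64, isa_mode


def jalr(pc, rs_value, mips16en=False, isa_mode=0):
    """Return (new_pc, new_isa_mode, link), where link is written to rd."""
    link = (pc + 8) & MASK64
    if mips16en:
        link = (link & ~1) | (isa_mode & 1)   # bit 0 records the old ISA mode
    new_pc, new_mode = jr(rs_value, mips16en, isa_mode)
    return new_pc, new_mode, link
```

In this model, jumping through rs = 0x80001001 with MIPS16 enabled lands at 0x80001000 with the ISA mode bit set to 1.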
LB  Load Byte

Encoding:
    Bits 31-26   LB        100000
    Bits 25-21   base
    Bits 20-16   rt
    Bits 15-0    offset

Format:
    LB rt, offset(base)

Description:
    The 16-bit offset is sign-extended and added to the contents of general register base to form a virtual address. The contents of the byte at the memory location specified by the effective address are sign-extended and loaded into general register rt.

Operation:
    32   T:  vAddr <- ((offset[15])^16 || offset[15..0]) + GPR[base]
             (pAddr, uncached) <- AddressTranslation (vAddr, DATA)
             pAddr <- pAddr[PSIZE-1..3] || (pAddr[2..0] xor ReverseEndian^3)
             mem <- LoadMemory (uncached, BYTE, pAddr, vAddr, DATA)
             byte <- vAddr[2..0] xor BigEndianCPU^3
             GPR[rt] <- (mem[7+8*byte])^24 || mem[7+8*byte..8*byte]

    64   T:  vAddr <- ((offset[15])^48 || offset[15..0]) + GPR[base]
             (pAddr, uncached) <- AddressTranslation (vAddr, DATA)
             pAddr <- pAddr[PSIZE-1..3] || (pAddr[2..0] xor ReverseEndian^3)
             mem <- LoadMemory (uncached, BYTE, pAddr, vAddr, DATA)
             byte <- vAddr[2..0] xor BigEndianCPU^3
             GPR[rt] <- (mem[7+8*byte])^56 || mem[7+8*byte..8*byte]

Exceptions:
    TLB refill exception
    TLB invalid exception
    Bus error exception
    Address error exception
LBU  Load Byte Unsigned

Encoding:
    Bits 31-26   LBU       100100
    Bits 25-21   base
    Bits 20-16   rt
    Bits 15-0    offset

Format:
    LBU rt, offset(base)

Description:
    The 16-bit offset is sign-extended and added to the contents of general register base to form a virtual address. The contents of the byte at the memory location specified by the effective address are zero-extended and loaded into general register rt.

Operation:
    32   T:  vAddr <- ((offset[15])^16 || offset[15..0]) + GPR[base]
             (pAddr, uncached) <- AddressTranslation (vAddr, DATA)
             pAddr <- pAddr[PSIZE-1..3] || (pAddr[2..0] xor ReverseEndian^3)
             mem <- LoadMemory (uncached, BYTE, pAddr, vAddr, DATA)
             byte <- vAddr[2..0] xor BigEndianCPU^3
             GPR[rt] <- 0^24 || mem[7+8*byte..8*byte]

    64   T:  vAddr <- ((offset[15])^48 || offset[15..0]) + GPR[base]
             (pAddr, uncached) <- AddressTranslation (vAddr, DATA)
             pAddr <- pAddr[PSIZE-1..3] || (pAddr[2..0] xor ReverseEndian^3)
             mem <- LoadMemory (uncached, BYTE, pAddr, vAddr, DATA)
             byte <- vAddr[2..0] xor BigEndianCPU^3
             GPR[rt] <- 0^56 || mem[7+8*byte..8*byte]

Exceptions:
    TLB refill exception
    TLB invalid exception
    Bus error exception
    Address error exception
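LB and LBU share the same effective address calculation and differ only in how the loaded byte is extended to register width. The Python sketch below illustrates both steps. It makes simplifying assumptions that are not in the manual: memory is a flat bytes object indexed directly by the virtual address, address translation, endianness selection and exceptions are not modeled, and all function names are hypothetical.

```python
def sign_extend(value, from_bits, to_bits):
    """Sign-extend a from_bits-wide value to an unsigned to_bits-wide pattern."""
    value &= (1 << from_bits) - 1
    if value >> (from_bits - 1):
        value -= 1 << from_bits
    return value & ((1 << to_bits) - 1)


def effective_address(base, offset, bits=64):
    """Add the sign-extended 16-bit offset to the base register value."""
    return (base + sign_extend(offset, 16, bits)) & ((1 << bits) - 1)


def lb(memory, base, offset, bits=64):
    """Load a byte and sign-extend it to the register width."""
    return sign_extend(memory[effective_address(base, offset, bits)], 8, bits)


def lbu(memory, base, offset, bits=64):
    """Load a byte and zero-extend it to the register width."""
    return memory[effective_address(base, offset, bits)]
```

With the byte 0x80 in memory, lb yields 0xFFFFFFFFFFFFFF80 in the 64-bit model while lbu yields 0x80. An offset field of 0xFFFF acts as -1.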
LD  Load Doubleword

Encoding:
    Bits 31-26   LD        110111
    Bits 25-21   base
    Bits 20-16   rt
    Bits 15-0    offset

Format:
    LD rt, offset(base)

Description:
    The 16-bit offset is sign-extended and added to the contents of general register base to form a virtual address. The contents of the 64-bit doubleword at the memory location specified by the effective address are loaded into general register rt.
    If any of the three least-significant bits of the effective address are non-zero, an address error exception occurs.
    This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.

Operation:
    32   T:  vAddr <- ((offset[15])^16 || offset[15..0]) + GPR[base]
             (pAddr, uncached) <- AddressTranslation (vAddr, DATA)
             data <- LoadMemory (uncached, DOUBLEWORD, pAddr, vAddr, DATA)
             GPR[rt] <- data

    64   T:  vAddr <- ((offset[15])^48 || offset[15..0]) + GPR[base]
             (pAddr, uncached) <- AddressTranslation (vAddr, DATA)
             data <- LoadMemory (uncached, DOUBLEWORD, pAddr, vAddr, DATA)
             GPR[rt] <- data

Exceptions:
    TLB refill exception
    TLB invalid exception
    Bus error exception
    Address error exception
    Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
LDL  Load Doubleword Left

Encoding:
    Bits 31-26   LDL       011010
    Bits 25-21   base
    Bits 20-16   rt
    Bits 15-0    offset

Format:
    LDL rt, offset(base)

Description:
    This instruction can be used in combination with the LDR instruction to load a register with eight consecutive bytes from memory, when the bytes cross a doubleword boundary. LDL loads the left portion of the register with the appropriate part of the high-order doubleword in memory; LDR loads the right portion of the register with the appropriate part of the low-order doubleword.
    The LDL instruction adds its sign-extended 16-bit offset to the contents of general register base to form a virtual address that can specify an arbitrary byte. It reads bytes only from the doubleword in memory that contains the specified starting byte, and places them in the high-order part of general register rt. The contents of the remaining part of general register rt are retained. From one to eight bytes will be loaded, depending on the starting byte specified.
    Conceptually, it starts at the specified byte in memory and loads that byte into the high-order (left-most) byte of the register; then it loads bytes from memory into the register until it reaches the low-order byte of the doubleword in memory. The least-significant (right-most) byte(s) of the register will not be changed.

    Example: LDL $24, 12($0)

        Memory (little endian)
            Address 8:   15 14 13 12 11 10  9  8
            Address 0:    7  6  5  4  3  2  1  0

        Register $24
            Before:       A  B  C  D  E  F  G  H
            After:       12 11 10  9  8  F  G  H
LDL  Load Doubleword Left (continued)

    The contents of general register rt are internally bypassed within the processor so that no NOP is needed between an immediately preceding load instruction which specifies register rt and a following LDL (or LDR) instruction which also specifies register rt.
    No address error exceptions due to alignment are possible.
    This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.

Operation:
    32   T:  vAddr <- ((offset[15])^16 || offset[15..0]) + GPR[base]
             (pAddr, uncached) <- AddressTranslation (vAddr, DATA)
             pAddr <- pAddr[PSIZE-1..3] || (pAddr[2..0] xor ReverseEndian^3)
             if BigEndianMem = 0 then
                 pAddr <- pAddr[PSIZE-1..3] || 0^3
             endif
             byte <- vAddr[2..0] xor BigEndianCPU^3
             mem <- LoadMemory (uncached, byte, pAddr, vAddr, DATA)
             GPR[rt] <- mem[7+8*byte..0] || GPR[rt][55-8*byte..0]

    64   T:  vAddr <- ((offset[15])^48 || offset[15..0]) + GPR[base]
             (pAddr, uncached) <- AddressTranslation (vAddr, DATA)
             pAddr <- pAddr[PSIZE-1..3] || (pAddr[2..0] xor ReverseEndian^3)
             if BigEndianMem = 0 then
                 pAddr <- pAddr[PSIZE-1..3] || 0^3
             endif
             byte <- vAddr[2..0] xor BigEndianCPU^3
             mem <- LoadMemory (uncached, byte, pAddr, vAddr, DATA)
             GPR[rt] <- mem[7+8*byte..0] || GPR[rt][55-8*byte..0]
LDL  Load Doubleword Left (continued)

    Given a doubleword in a register and a doubleword in memory, the operation of LDL is as follows:

        Register:  A B C D E F G H
        Memory:    I J K L M N O P

                     BigEndianCPU = 0                    BigEndianCPU = 1 (Note)
        vAddr[2..0]  Destination  Type  Offset           Destination  Type  Offset
                                        LEM  BEM                            LEM  BEM
        0            PBCDEFGH     0     0    7           IJKLMNOP     7     0    0
        1            OPCDEFGH     1     0    6           JKLMNOPH     6     0    1
        2            NOPDEFGH     2     0    5           KLMNOPGH     5     0    2
        3            MNOPEFGH     3     0    4           LMNOPFGH     4     0    3
        4            LMNOPFGH     4     0    3           MNOPEFGH     3     0    4
        5            KLMNOPGH     5     0    2           NOPDEFGH     2     0    5
        6            JKLMNOPH     6     0    1           OPCDEFGH     1     0    6
        7            IJKLMNOP     7     0    0           PBCDEFGH     0     0    7

    Note    For VR4131 only

    Remark  Type   : Access type (see Figure 2-2) sent to memory
            Offset : pAddr[2..0] sent to memory
            LEM    : Little-endian memory (BigEndianMem = 0)
            BEM    : Big-endian memory (BigEndianMem = 1)

Exceptions:
    TLB refill exception
    TLB invalid exception
    Bus error exception
    Address error exception
    Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
LDR  Load Doubleword Right

Encoding:
    Bits 31-26   LDR       011011
    Bits 25-21   base
    Bits 20-16   rt
    Bits 15-0    offset

Format:
    LDR rt, offset(base)

Description:
    This instruction can be used in combination with the LDL instruction to load a register with eight consecutive bytes from memory, when the bytes cross a doubleword boundary. LDR loads the right portion of the register with the appropriate part of the low-order doubleword in memory; LDL loads the left portion of the register with the appropriate part of the high-order doubleword.
    The LDR instruction adds its sign-extended 16-bit offset to the contents of general register base to form a virtual address that can specify an arbitrary byte. It reads bytes only from the doubleword in memory that contains the specified starting byte, and places them in the low-order part of general register rt. The contents of the remaining part of general register rt are retained. From one to eight bytes will be loaded, depending on the starting byte specified.
    Conceptually, it starts at the specified byte in memory and loads that byte into the low-order (right-most) byte of the register; then it loads bytes from memory into the register until it reaches the high-order byte of the doubleword in memory. The most significant (left-most) byte(s) of the register will not be changed.

    Example: LDR $24, 5($0)

        Memory (little endian)
            Address 8:   15 14 13 12 11 10  9  8
            Address 0:    7  6  5  4  3  2  1  0

        Register $24
            Before:       A  B  C  D  E  F  G  H
            After:        A  B  C  D  E  7  6  5
LDR  Load Doubleword Right (continued)

    The contents of general register rt are internally bypassed within the processor so that no NOP is needed between an immediately preceding load instruction which specifies register rt and a following LDR (or LDL) instruction which also specifies register rt.
    No address error exceptions due to alignment are possible.
    This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.

Operation:
    32   T:  vAddr <- ((offset[15])^16 || offset[15..0]) + GPR[base]
             (pAddr, uncached) <- AddressTranslation (vAddr, DATA)
             pAddr <- pAddr[PSIZE-1..3] || (pAddr[2..0] xor ReverseEndian^3)
             if BigEndianMem = 1 then
                 pAddr <- pAddr[PSIZE-1..3] || 0^3
             endif
             byte <- vAddr[2..0] xor BigEndianCPU^3
             mem <- LoadMemory (uncached, DOUBLEWORD - byte, pAddr, vAddr, DATA)
             GPR[rt] <- GPR[rt][63..64-8*byte] || mem[63..8*byte]

    64   T:  vAddr <- ((offset[15])^48 || offset[15..0]) + GPR[base]
             (pAddr, uncached) <- AddressTranslation (vAddr, DATA)
             pAddr <- pAddr[PSIZE-1..3] || (pAddr[2..0] xor ReverseEndian^3)
             if BigEndianMem = 1 then
                 pAddr <- pAddr[PSIZE-1..3] || 0^3
             endif
             byte <- vAddr[2..0] xor BigEndianCPU^3
             mem <- LoadMemory (uncached, DOUBLEWORD - byte, pAddr, vAddr, DATA)
             GPR[rt] <- GPR[rt][63..64-8*byte] || mem[63..8*byte]
LDR  Load Doubleword Right (continued)

    Given a doubleword in a register and a doubleword in memory, the operation of LDR is as follows:

        Register:  A B C D E F G H
        Memory:    I J K L M N O P

                     BigEndianCPU = 0                    BigEndianCPU = 1 (Note)
        vAddr[2..0]  Destination  Type  Offset           Destination  Type  Offset
                                        LEM  BEM                            LEM  BEM
        0            IJKLMNOP     7     0    0           ABCDEFGI     0     7    0
        1            AIJKLMNO     6     1    0           ABCDEFIJ     1     6    0
        2            ABIJKLMN     5     2    0           ABCDEIJK     2     5    0
        3            ABCIJKLM     4     3    0           ABCDIJKL     3     4    0
        4            ABCDIJKL     3     4    0           ABCIJKLM     4     3    0
        5            ABCDEIJK     2     5    0           ABIJKLMN     5     2    0
        6            ABCDEFIJ     1     6    0           AIJKLMNO     6     1    0
        7            ABCDEFGI     0     7    0           IJKLMNOP     7     0    0

    Note    For VR4131 only

    Remark  Type   : Access type (see Figure 2-2) sent to memory
            Offset : pAddr[2..0] sent to memory
            LEM    : Little-endian memory (BigEndianMem = 0)
            BEM    : Big-endian memory (BigEndianMem = 1)

Exceptions:
    TLB refill exception
    TLB invalid exception
    Bus error exception
    Address error exception
    Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
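The way LDL and LDR combine to load an unaligned doubleword can be checked with a small model of the BigEndianCPU = 0 case. The Python sketch below is an illustration under stated assumptions: memory is a flat little-endian bytes object indexed by the virtual address, address translation and exceptions are not modeled, and the function names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def load_aligned_dword(memory, vaddr):
    """Read the aligned doubleword that contains vaddr (little-endian)."""
    start = vaddr & ~7
    return int.from_bytes(memory[start:start + 8], "little")


def ldl(reg, memory, vaddr):
    """Load bytes vaddr down to the doubleword start into the high-order part of reg."""
    byte = vaddr & 7
    kept_bits = 8 * (7 - byte)                      # low-order register bits retained
    mem = load_aligned_dword(memory, vaddr)
    return ((mem << kept_bits) & MASK64) | (reg & ((1 << kept_bits) - 1))


def ldr(reg, memory, vaddr):
    """Load bytes vaddr up to the doubleword end into the low-order part of reg."""
    byte = vaddr & 7
    keep_mask = MASK64 & ~((1 << (64 - 8 * byte)) - 1)  # high-order bytes retained
    mem = load_aligned_dword(memory, vaddr)
    return (mem >> (8 * byte)) | (reg & keep_mask)


def load_unaligned_dword(memory, vaddr):
    """LDR at vaddr followed by LDL at vaddr + 7, as a little-endian pair."""
    reg = ldr(0, memory, vaddr)
    return ldl(reg, memory, vaddr + 7)
```

With memory holding the byte values 0 to 15 and a register holding the ASCII bytes "ABCDEFGH", this model reproduces the two register examples given for LDL $24, 12($0) and LDR $24, 5($0).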
LH  Load Halfword

Encoding:
    Bits 31-26   LH        100001
    Bits 25-21   base
    Bits 20-16   rt
    Bits 15-0    offset

Format:
    LH rt, offset(base)

Description:
    The 16-bit offset is sign-extended and added to the contents of general register base to form a virtual address. The contents of the halfword at the memory location specified by the effective address are sign-extended and loaded into general register rt.
    If the least-significant bit of the effective address is non-zero, an address error exception occurs.

Operation:
    32   T:  vAddr <- ((offset[15])^16 || offset[15..0]) + GPR[base]
             (pAddr, uncached) <- AddressTranslation (vAddr, DATA)
             pAddr <- pAddr[PSIZE-1..3] || (pAddr[2..0] xor (ReverseEndian^2 || 0))
             mem <- LoadMemory (uncached, HALFWORD, pAddr, vAddr, DATA)
             byte <- vAddr[2..0] xor (BigEndianCPU^2 || 0)
             GPR[rt] <- (mem[15+8*byte])^16 || mem[15+8*byte..8*byte]

    64   T:  vAddr <- ((offset[15])^48 || offset[15..0]) + GPR[base]
             (pAddr, uncached) <- AddressTranslation (vAddr, DATA)
             pAddr <- pAddr[PSIZE-1..3] || (pAddr[2..0] xor (ReverseEndian^2 || 0))
             mem <- LoadMemory (uncached, HALFWORD, pAddr, vAddr, DATA)
             byte <- vAddr[2..0] xor (BigEndianCPU^2 || 0)
             GPR[rt] <- (mem[15+8*byte])^48 || mem[15+8*byte..8*byte]

Exceptions:
    TLB refill exception
    TLB invalid exception
    Bus error exception
    Address error exception
LHU  Load Halfword Unsigned

Encoding:
    Bits 31-26   LHU       100101
    Bits 25-21   base
    Bits 20-16   rt
    Bits 15-0    offset

Format:
    LHU rt, offset(base)

Description:
    The 16-bit offset is sign-extended and added to the contents of general register base to form a virtual address. The contents of the halfword at the memory location specified by the effective address are zero-extended and loaded into general register rt.
    If the least-significant bit of the effective address is non-zero, an address error exception occurs.

Operation:
    32   T:  vAddr <- ((offset[15])^16 || offset[15..0]) + GPR[base]
             (pAddr, uncached) <- AddressTranslation (vAddr, DATA)
             pAddr <- pAddr[PSIZE-1..3] || (pAddr[2..0] xor (ReverseEndian^2 || 0))
             mem <- LoadMemory (uncached, HALFWORD, pAddr, vAddr, DATA)
             byte <- vAddr[2..0] xor (BigEndianCPU^2 || 0)
             GPR[rt] <- 0^16 || mem[15+8*byte..8*byte]

    64   T:  vAddr <- ((offset[15])^48 || offset[15..0]) + GPR[base]
             (pAddr, uncached) <- AddressTranslation (vAddr, DATA)
             pAddr <- pAddr[PSIZE-1..3] || (pAddr[2..0] xor (ReverseEndian^2 || 0))
             mem <- LoadMemory (uncached, HALFWORD, pAddr, vAddr, DATA)
             byte <- vAddr[2..0] xor (BigEndianCPU^2 || 0)
             GPR[rt] <- 0^48 || mem[15+8*byte..8*byte]

Exceptions:
    TLB refill exception
    TLB invalid exception
    Bus error exception
    Address error exception
LUI  Load Upper Immediate

Encoding:
    Bits 31-26   LUI       001111
    Bits 25-21   0         00000
    Bits 20-16   rt
    Bits 15-0    immediate

Format:
    LUI rt, immediate

Description:
    The 16-bit immediate is shifted left by 16 bits and concatenated to 16 bits of zeros. The result is placed into general register rt. In 64-bit mode, the loaded word is sign-extended.

Operation:
    32   T:  GPR[rt] <- immediate || 0^16

    64   T:  GPR[rt] <- (immediate[15])^32 || immediate || 0^16

Exceptions:
    None
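The effect of the 64-bit sign extension in LUI is easiest to see with an immediate whose bit 15 is set. The Python sketch below is an illustrative model only; the function name lui and the bits64 parameter are hypothetical, and the register is represented as a plain integer.

```python
def lui(immediate, bits64=True):
    """Return the value loaded into rt for LUI rt, immediate."""
    word = (immediate & 0xFFFF) << 16          # immediate || 0^16
    if bits64 and word & 0x80000000:
        word |= 0xFFFFFFFF00000000             # (immediate[15])^32 prefix
    return word
```

In this model, lui(0x8000) gives 0xFFFFFFFF80000000 in 64-bit mode and 0x80000000 in 32-bit mode, while lui(0x1234) gives 0x12340000 in both.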
LW  Load Word

Encoding:
    Bits 31-26   LW        100011
    Bits 25-21   base
    Bits 20-16   rt
    Bits 15-0    offset

Format:
    LW rt, offset(base)

Description:
    The 16-bit offset is sign-extended and added to the contents of general register base to form a virtual address. The contents of the word at the memory location specified by the effective address are loaded into general register rt. In 64-bit mode, the loaded word is sign-extended.
    If either of the two least-significant bits of the effective address is non-zero, an address error exception occurs.

Operation:
    32   T:  vAddr <- ((offset[15])^16 || offset[15..0]) + GPR[base]
             (pAddr, uncached) <- AddressTranslation (vAddr, DATA)
             pAddr <- pAddr[PSIZE-1..3] || (pAddr[2..0] xor (ReverseEndian || 0^2))
             mem <- LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
             byte <- vAddr[2..0] xor (BigEndianCPU || 0^2)
             GPR[rt] <- mem[31+8*byte..8*byte]

    64   T:  vAddr <- ((offset[15])^48 || offset[15..0]) + GPR[base]
             (pAddr, uncached) <- AddressTranslation (vAddr, DATA)
             pAddr <- pAddr[PSIZE-1..3] || (pAddr[2..0] xor (ReverseEndian || 0^2))
             mem <- LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
             byte <- vAddr[2..0] xor (BigEndianCPU || 0^2)
             GPR[rt] <- (mem[31+8*byte])^32 || mem[31+8*byte..8*byte]

Exceptions:
    TLB refill exception
    TLB invalid exception
    Bus error exception
    Address error exception
LWL    Load Word Left

  Bits:   31..26    25..21   20..16   15..0
  Field:  LWL       base     rt       offset
  Value:  100010

Format: LWL rt, offset(base)

Description: This instruction can be used in combination with the LWR instruction to load a register with four consecutive bytes from memory, when the bytes cross a word boundary. LWL loads the left portion of the register with the appropriate part of the high-order word in memory; LWR loads the right portion of the register with the appropriate part of the low-order word.

The LWL instruction adds its sign-extended 16-bit offset to the contents of general register base to form a virtual address that can specify an arbitrary byte. It reads bytes only from the word in memory that contains the specified starting byte, and places them in the high-order part of general register rt. The contents of the remaining part of general register rt are retained. From one to four bytes will be loaded, depending on the starting byte specified. In 64-bit mode, the loaded word is sign-extended.

Conceptually, it starts at the specified byte in memory and loads that byte into the high-order (left-most) byte of the register; then it loads bytes from memory into the register until it reaches the low-order byte of the word in memory. The least-significant (right-most) byte(s) of the register will not be changed.

Example (little-endian memory):
  Memory:                address 4: 7 6 5 4      address 0: 3 2 1 0
  Register $24 before:   A B C D
  Instruction:           LWL $24, 4($0)
  Register $24 after:    4 B C D
LWL    Load Word Left (continued)

The contents of general register rt are internally bypassed within the processor so that no NOP is needed between an immediately preceding load instruction which specifies register rt and a following LWL (or LWR) instruction which also specifies register rt. No address error exceptions due to alignment are possible.

Operation:
  32  T:  vAddr ← ((offset_15)^16 || offset_{15..0}) + GPR[base]
          (pAddr, uncached) ← AddressTranslation(vAddr, DATA)
          pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor ReverseEndian^3)
          if BigEndianMem = 0 then
              pAddr ← pAddr_{PSIZE-1..2} || 0^2
          endif
          byte ← vAddr_{1..0} xor BigEndianCPU^2
          word ← vAddr_2 xor BigEndianCPU
          mem ← LoadMemory(uncached, byte, pAddr, vAddr, DATA)
          temp ← mem_{32*word+8*byte+7..32*word} || GPR[rt]_{23-8*byte..0}
          GPR[rt] ← temp
  64  T:  vAddr ← ((offset_15)^48 || offset_{15..0}) + GPR[base]
          (pAddr, uncached) ← AddressTranslation(vAddr, DATA)
          pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor ReverseEndian^3)
          if BigEndianMem = 0 then
              pAddr ← pAddr_{PSIZE-1..2} || 0^2
          endif
          byte ← vAddr_{1..0} xor BigEndianCPU^2
          word ← vAddr_2 xor BigEndianCPU
          mem ← LoadMemory(uncached, 0 || byte, pAddr, vAddr, DATA)
          temp ← mem_{32*word+8*byte+7..32*word} || GPR[rt]_{23-8*byte..0}
          GPR[rt] ← (temp_31)^32 || temp
LWL    Load Word Left (continued)

Given a doubleword in a register and a doubleword in memory, the operation of LWL is as follows:

  Register:  A B C D E F G H
  Memory:    I J K L M N O P

               BigEndianCPU = 0                         BigEndianCPU = 1 (Note)
  vAddr_2..0   Destination  Type  Offset                Destination  Type  Offset
                                  LEM   BEM (Note)                         LEM   BEM
  0            SSSSPFGH     0     0     7               SSSSIJKL     3     4     0
  1            SSSSOPGH     1     0     6               SSSSJKLH     2     4     1
  2            SSSSNOPH     2     0     5               SSSSKLGH     1     4     2
  3            SSSSMNOP     3     0     4               SSSSLFGH     0     4     3
  4            SSSSLFGH     0     4     3               SSSSMNOP     3     0     4
  5            SSSSKLGH     1     4     2               SSSSNOPH     2     0     5
  6            SSSSJKLH     2     4     1               SSSSOPGH     1     0     6
  7            SSSSIJKL     3     4     0               SSSSPFGH     0     0     7

Note: For VR4131 only

Remark:
  Type    Access type (see Figure 2-2) sent to memory
  Offset  pAddr_2..0 sent to memory
  LEM     Little-endian memory (BigEndianMem = 0)
  BEM     Big-endian memory (BigEndianMem = 1)
  S       Sign-extend of destination bit 31

Exceptions:
  TLB refill exception
  TLB invalid exception
  Bus error exception
  Address error exception
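The little-endian behaviour of LWL (BigEndianCPU = 0, BigEndianMem = 0) can be modelled in Python as below. This is a sketch under stated assumptions: lwl_le is a hypothetical helper, memory is a plain bytes object, only the low 32 bits of the register are modelled, and the 64-bit sign extension is left out.

```python
M32 = 0xFFFFFFFF

def lwl_le(reg: int, mem: bytes, addr: int) -> int:
    """Little-endian LWL on a 32-bit register image (hypothetical model)."""
    base = addr & ~3                       # aligned word containing the byte
    word = int.from_bytes(mem[base:base + 4], "little")
    shift = 8 * (3 - (addr & 3))           # addressed byte moves to bits 31..24
    loaded = (word << shift) & M32         # 1 to 4 bytes, left-justified
    keep = (1 << shift) - 1                # low-order register bytes retained
    return loaded | (reg & keep)

mem = bytes(range(8))                      # byte at address n holds value n
print(hex(lwl_le(0x0A0B0C0D, mem, 4)))     # 0x40b0c0d, i.e. "4 B C D"
```

With the register holding A B C D = 0x0A0B0C0D, address 4 reproduces the manual's LWL $24, 4($0) example, and addresses 0 to 3 reproduce the first four BigEndianCPU = 0 rows of the table (without the S bytes).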
LWR    Load Word Right

  Bits:   31..26    25..21   20..16   15..0
  Field:  LWR       base     rt       offset
  Value:  100110

Format: LWR rt, offset(base)

Description: This instruction can be used in combination with the LWL instruction to load a register with four consecutive bytes from memory, when the bytes cross a word boundary. LWR loads the right portion of the register with the appropriate part of the low-order word in memory; LWL loads the left portion of the register with the appropriate part of the high-order word.

The LWR instruction adds its sign-extended 16-bit offset to the contents of general register base to form a virtual address that can specify an arbitrary byte. It reads bytes only from the word in memory that contains the specified starting byte, and places them in the low-order part of general register rt. The contents of the remaining part of general register rt are retained. From one to four bytes will be loaded, depending on the starting byte specified. In 64-bit mode, the loaded word is sign-extended.

Conceptually, it starts at the specified byte in memory and loads that byte into the low-order (right-most) byte of the register; then it loads bytes from memory into the register until it reaches the high-order byte of the word in memory. The most-significant (left-most) byte(s) of the register will not be changed.

Example (little-endian memory):
  Memory:                address 4: 7 6 5 4      address 0: 3 2 1 0
  Register $24 before:   A B C D
  Instruction:           LWR $24, 1($0)
  Register $24 after:    A 3 2 1
LWR    Load Word Right (continued)

The contents of general register rt are internally bypassed within the processor so that no NOP is needed between an immediately preceding load instruction which specifies register rt and a following LWR (or LWL) instruction which also specifies register rt. No address error exceptions due to alignment are possible.

Operation:
  32  T:  vAddr ← ((offset_15)^16 || offset_{15..0}) + GPR[base]
          (pAddr, uncached) ← AddressTranslation(vAddr, DATA)
          pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor ReverseEndian^3)
          if BigEndianMem = 1 then
              pAddr ← pAddr_{PSIZE-1..3} || 0^3
          endif
          byte ← vAddr_{1..0} xor BigEndianCPU^2
          word ← vAddr_2 xor BigEndianCPU
          mem ← LoadMemory(uncached, 0 || byte, pAddr, vAddr, DATA)
          temp ← GPR[rt]_{31..32-8*byte} || mem_{31+32*word..32*word+8*byte}
          GPR[rt] ← temp
  64  T:  vAddr ← ((offset_15)^48 || offset_{15..0}) + GPR[base]
          (pAddr, uncached) ← AddressTranslation(vAddr, DATA)
          pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor ReverseEndian^3)
          if BigEndianMem = 1 then
              pAddr ← pAddr_{PSIZE-1..3} || 0^3
          endif
          byte ← vAddr_{1..0} xor BigEndianCPU^2
          word ← vAddr_2 xor BigEndianCPU
          mem ← LoadMemory(uncached, WORD-byte, pAddr, vAddr, DATA)
          temp ← GPR[rt]_{31..32-8*byte} || mem_{31+32*word..32*word+8*byte}
          GPR[rt] ← (temp_31)^32 || temp
LWR    Load Word Right (continued)

Given a word in a register and a word in memory, the operation of LWR is as follows:

  Register:  A B C D E F G H
  Memory:    I J K L M N O P

               BigEndianCPU = 0                         BigEndianCPU = 1 (Note)
  vAddr_2..0   Destination  Type  Offset                Destination  Type  Offset
                                  LEM   BEM (Note)                         LEM   BEM
  0            SSSSMNOP     3     0     4               SSSSEFGI     0     7     0
  1            SSSSEMNO     2     1     4               SSSSEFIJ     1     6     0
  2            SSSSEFMN     1     2     4               SSSSEIJK     2     5     0
  3            SSSSEFGM     0     3     4               SSSSIJKL     3     4     0
  4            SSSSIJKL     3     4     0               SSSSEFGM     0     3     4
  5            SSSSEIJK     2     5     0               SSSSEFMN     1     2     4
  6            SSSSEFIJ     1     6     0               SSSSEMNO     2     1     4
  7            SSSSEFGI     0     7     0               SSSSMNOP     3     0     4

Note: For VR4131 only

Remark:
  Type    Access type (see Figure 2-2) sent to memory
  Offset  pAddr_2..0 sent to memory
  LEM     Little-endian memory (BigEndianMem = 0)
  BEM     Big-endian memory (BigEndianMem = 1)
  S       Sign-extend of destination bit 31

Exceptions:
  TLB refill exception
  TLB invalid exception
  Bus error exception
  Address error exception
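The way LWR pairs with LWL to fetch a word that crosses a word boundary can be sketched in Python. The helpers lwr_le, lwl_le and load_unaligned_le are hypothetical names introduced for illustration; the model assumes little-endian CPU and memory, works on a 32-bit register image, and omits 64-bit sign extension.

```python
M32 = 0xFFFFFFFF

def lwr_le(reg: int, mem: bytes, addr: int) -> int:
    """Little-endian LWR: addressed byte lands in bits 7..0 (model only)."""
    base = addr & ~3
    word = int.from_bytes(mem[base:base + 4], "little")
    shift = 8 * (addr & 3)
    keep = ~((1 << (32 - shift)) - 1) & M32    # high-order bytes retained
    return (reg & keep) | (word >> shift)

def lwl_le(reg: int, mem: bytes, addr: int) -> int:
    """Little-endian LWL: addressed byte lands in bits 31..24 (model only)."""
    base = addr & ~3
    word = int.from_bytes(mem[base:base + 4], "little")
    shift = 8 * (3 - (addr & 3))
    return ((word << shift) & M32) | (reg & ((1 << shift) - 1))

def load_unaligned_le(mem: bytes, addr: int) -> int:
    """LWR at addr followed by LWL at addr + 3 assembles the full word."""
    reg = lwr_le(0, mem, addr)
    return lwl_le(reg, mem, addr + 3)

mem = bytes(range(8))                          # byte at address n holds value n
print(hex(lwr_le(0x0A0B0C0D, mem, 1)))         # 0xa030201, i.e. "A 3 2 1"
print(hex(load_unaligned_le(mem, 1)))          # 0x4030201
```

The first call reproduces the manual's LWR $24, 1($0) example; the second shows the four bytes at addresses 1 to 4 assembled into one register value.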
LWU    Load Word Unsigned

  Bits:   31..26    25..21   20..16   15..0
  Field:  LWU       base     rt       offset
  Value:  100111

Format: LWU rt, offset(base)

Description: The 16-bit offset is sign-extended and added to the contents of general register base to form a virtual address. The contents of the word at the memory location specified by the effective address are loaded into general register rt. The loaded word is zero-extended. If either of the two least-significant bits of the effective address is non-zero, an address error exception occurs.

This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit kernel mode. Execution of this instruction in 32-bit user or supervisor mode causes a reserved instruction exception.

Operation:
  32  T:  vAddr ← ((offset_15)^16 || offset_{15..0}) + GPR[base]
          (pAddr, uncached) ← AddressTranslation(vAddr, DATA)
          pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor (ReverseEndian || 0^2))
          mem ← LoadMemory(uncached, WORD, pAddr, vAddr, DATA)
          byte ← vAddr_{2..0} xor (BigEndianCPU || 0^2)
          GPR[rt] ← 0^32 || mem_{31+8*byte..8*byte}
  64  T:  vAddr ← ((offset_15)^48 || offset_{15..0}) + GPR[base]
          (pAddr, uncached) ← AddressTranslation(vAddr, DATA)
          pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor (ReverseEndian || 0^2))
          mem ← LoadMemory(uncached, WORD, pAddr, vAddr, DATA)
          byte ← vAddr_{2..0} xor (BigEndianCPU || 0^2)
          GPR[rt] ← 0^32 || mem_{31+8*byte..8*byte}

Exceptions:
  TLB refill exception
  TLB invalid exception
  Bus error exception
  Address error exception
  Reserved instruction exception (VR4100 Series in 32-bit user mode, VR4100 Series in 32-bit supervisor mode)
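The difference between the sign-extending word load (LW in 64-bit mode) and LWU comes down to how bits 63..32 are filled. The Python sketch below illustrates that, along with the alignment check. AddressError and load_word_64 are hypothetical names, and little-endian byte order is assumed.

```python
class AddressError(Exception):
    """Stands in for the address error exception (illustrative only)."""

def load_word_64(mem: bytes, addr: int, zero_extend: bool) -> int:
    """Aligned word load into a 64-bit register image (hypothetical model)."""
    if addr & 3:                               # either of the two low bits set
        raise AddressError(hex(addr))
    word = int.from_bytes(mem[addr:addr + 4], "little")
    if not zero_extend and (word & 0x80000000):
        word |= 0xFFFFFFFF00000000             # LW: replicate bit 31 upward
    return word                                # LWU: upper 32 bits stay zero

mem = bytes([0x00, 0x00, 0x00, 0x80])
print(hex(load_word_64(mem, 0, zero_extend=False)))  # 0xffffffff80000000
print(hex(load_word_64(mem, 0, zero_extend=True)))   # 0x80000000
```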
MACC    Multiply and Add Accumulate
(for VR4121, VR4122, VR4131, and VR4181A)

  Bits:   31..26    25..21  20..16  15..11  10   9   8..7  6   5..0
  Field:  SPECIAL   rs      rt      rd      sat  hi  0     us  MACC
  Value:  000000                                     00        101000

Format:
  MACC rd, rs, rt
  MACCU rd, rs, rt
  MACCHI rd, rs, rt
  MACCHIU rd, rs, rt
  MACCS rd, rs, rt
  MACCUS rd, rs, rt
  MACCHIS rd, rs, rt
  MACCHIUS rd, rs, rt

Description: The mnemonics of the MACC instruction differ as shown in the table below by the setting of the sat, hi, or us bits.

  Mnemonic   sat  hi  us
  MACC       0    0   0
  MACCU      0    0   1
  MACCHI     0    1   0
  MACCHIU    0    1   1
  MACCS      1    0   0
  MACCUS     1    0   1
  MACCHIS    1    1   0
  MACCHIUS   1    1   1

The number of valid bits in the operands differs depending on whether saturation processing is executed (sat = 1) or not (sat = 0).

* When saturation processing is executed (sat = 1): MACCS, MACCUS, MACCHIS, and MACCHIUS instructions

The contents of general register rs are multiplied by the contents of general register rt. If us = 1, the contents of both operands are handled as 16-bit unsigned data. If us = 0, the contents are handled as 16-bit signed integers. Sign/zero extension by software is required for bits 16 to 31 in the operands.
MACC    Multiply and Add Accumulate (continued)
(for VR4121, VR4122, VR4131, and VR4181A)

The product of this multiply operation is added to the 64-bit value (of which only the low-order 32 bits are valid) formed by concatenating special registers HI and LO. If us = 1, this add operation handles the values being added as 32-bit unsigned data. If us = 0, the values are handled as 32-bit signed integers. Sign/zero extension by software is required for bits 32 to 63 of the value formed by concatenating special registers HI and LO.

After saturation processing of 32 bits has been performed (refer to the table below), the sum from this add operation is loaded to special registers HI and LO. When hi = 1, data that is the same as the data loaded to special register HI is also loaded to general register rd. When hi = 0, data that is the same as the data loaded to special register LO is also loaded to general register rd. Overflow exceptions do not occur.

* When saturation processing is not executed (sat = 0): MACC, MACCU, MACCHI, and MACCHIU instructions

The contents of general register rs are multiplied by the contents of general register rt. If us = 1, the contents of both operands are handled as 32-bit unsigned data. If us = 0, the contents are handled as 32-bit signed integers. Sign/zero extension by software is required for bits 32 to 63 in the operands.

The product of this multiply operation is added to the 64-bit value formed by concatenating special registers HI and LO. If us = 1, this add operation handles the values being added as 64-bit unsigned data. If us = 0, the values are handled as 64-bit signed integers.

The low-order word of the sum from this add operation is loaded to special register LO, and the high-order word to special register HI. When hi = 1, data that is the same as the data loaded to special register HI is also loaded to general register rd. When hi = 0, data that is the same as the data loaded to special register LO is also loaded to general register rd. Overflow exceptions do not occur.
MACC    Multiply and Add Accumulate (continued)
(for VR4121, VR4122, VR4131, and VR4181A)

The correspondence of us and sat settings and values stored during saturation processing is shown below, along with the hazard cycles required between execution of the instruction for manipulating the HI and LO registers and execution of the MACC instruction.

Values stored during saturation processing:

  us  sat  Overflow                          Underflow
  0   0    Store calculation result as is    Store calculation result as is
  1   0    Store calculation result as is    Store calculation result as is
  0   1    0x0000 0000 7FFF FFFF             0xFFFF FFFF 8000 0000
  1   1    0xFFFF FFFF FFFF FFFF             None

Hazard cycle counts:

  Instruction     Cycle count
  MULT, MULTU     Note 1
  DMULT, DMULTU   3
  DIV, DIVU       36
  DDIV, DDIVU     68
  MFHI, MFLO      Note 2
  MTHI, MTLO      0
  MACC            0
  DMACC           0

Notes 1. VR4121, VR4122: 1;  VR4131: 0;  VR4181A: 1
      2. VR4121, VR4122: 2;  VR4131: 0;  VR4181A: 2

Operation:
  32, sat = 0, hi = 0, us = 0 (MACC instruction)
      T:  temp1 ← GPR[rs] * GPR[rt]
          temp2 ← temp1 + (HI || LO)
          LO ← temp2_{31..0}
          HI ← temp2_{63..32}
          GPR[rd] ← LO
  32, sat = 0, hi = 0, us = 1 (MACCU instruction)
      T:  temp1 ← (0 || GPR[rs]) * (0 || GPR[rt])
          temp2 ← temp1 + ((0 || HI) || (0 || LO))
          LO ← temp2_{31..0}
          HI ← temp2_{63..32}
          GPR[rd] ← LO
MACC    Multiply and Add Accumulate (continued)
(for VR4121, VR4122, VR4131, and VR4181A)

  32, sat = 0, hi = 1, us = 0 (MACCHI instruction)
      T:  temp1 ← GPR[rs] * GPR[rt]
          temp2 ← temp1 + (HI || LO)
          LO ← temp2_{31..0}
          HI ← temp2_{63..32}
          GPR[rd] ← HI
  32, sat = 0, hi = 1, us = 1 (MACCHIU instruction)
      T:  temp1 ← (0 || GPR[rs]) * (0 || GPR[rt])
          temp2 ← temp1 + ((0 || HI) || (0 || LO))
          LO ← temp2_{31..0}
          HI ← temp2_{63..32}
          GPR[rd] ← HI
  32, sat = 1, hi = 0, us = 0 (MACCS instruction)
      T:  temp1 ← GPR[rs] * GPR[rt]
          temp2 ← saturation(temp1 + (HI || LO))
          LO ← temp2_{31..0}
          HI ← temp2_{63..32}
          GPR[rd] ← LO
  32, sat = 1, hi = 0, us = 1 (MACCUS instruction)
      T:  temp1 ← (0 || GPR[rs]) * (0 || GPR[rt])
          temp2 ← saturation(temp1 + ((0 || HI) || (0 || LO)))
          LO ← temp2_{31..0}
          HI ← temp2_{63..32}
          GPR[rd] ← LO
  32, sat = 1, hi = 1, us = 0 (MACCHIS instruction)
      T:  temp1 ← GPR[rs] * GPR[rt]
          temp2 ← saturation(temp1 + (HI || LO))
          LO ← temp2_{31..0}
          HI ← temp2_{63..32}
          GPR[rd] ← HI
  32, sat = 1, hi = 1, us = 1 (MACCHIUS instruction)
      T:  temp1 ← (0 || GPR[rs]) * (0 || GPR[rt])
          temp2 ← saturation(temp1 + ((0 || HI) || (0 || LO)))
          LO ← temp2_{31..0}
          HI ← temp2_{63..32}
          GPR[rd] ← HI
MACC    Multiply and Add Accumulate (continued)
(for VR4121, VR4122, VR4131, and VR4181A)

  64, sat = 0, hi = 0, us = 0 (MACC instruction)
      T:  temp1 ← ((GPR[rs]_31)^32 || GPR[rs]) * ((GPR[rt]_31)^32 || GPR[rt])
          temp2 ← temp1 + (HI_{31..0} || LO_{31..0})
          LO ← (temp2_31)^32 || temp2_{31..0}
          HI ← (temp2_63)^32 || temp2_{63..32}
          GPR[rd] ← LO
  64, sat = 0, hi = 0, us = 1 (MACCU instruction)
      T:  temp1 ← (0^32 || GPR[rs]) * (0^32 || GPR[rt])
          temp2 ← temp1 + (HI_{31..0} || LO_{31..0})
          LO ← (temp2_31)^32 || temp2_{31..0}
          HI ← (temp2_63)^32 || temp2_{63..32}
          GPR[rd] ← LO
  64, sat = 0, hi = 1, us = 0 (MACCHI instruction)
      T:  temp1 ← ((GPR[rs]_31)^32 || GPR[rs]) * ((GPR[rt]_31)^32 || GPR[rt])
          temp2 ← temp1 + (HI_{31..0} || LO_{31..0})
          LO ← (temp2_31)^32 || temp2_{31..0}
          HI ← (temp2_63)^32 || temp2_{63..32}
          GPR[rd] ← HI
  64, sat = 0, hi = 1, us = 1 (MACCHIU instruction)
      T:  temp1 ← (0^32 || GPR[rs]) * (0^32 || GPR[rt])
          temp2 ← temp1 + (HI_{31..0} || LO_{31..0})
          LO ← (temp2_31)^32 || temp2_{31..0}
          HI ← (temp2_63)^32 || temp2_{63..32}
          GPR[rd] ← HI
  64, sat = 1, hi = 0, us = 0 (MACCS instruction)
      T:  temp1 ← ((GPR[rs]_31)^32 || GPR[rs]) * ((GPR[rt]_31)^32 || GPR[rt])
          temp2 ← saturation(temp1 + (HI_{31..0} || LO_{31..0}))
          LO ← (temp2_31)^32 || temp2_{31..0}
          HI ← (temp2_63)^32 || temp2_{63..32}
          GPR[rd] ← LO
  64, sat = 1, hi = 0, us = 1 (MACCUS instruction)
      T:  temp1 ← (0^32 || GPR[rs]) * (0^32 || GPR[rt])
          temp2 ← saturation(temp1 + (HI_{31..0} || LO_{31..0}))
          LO ← (temp2_31)^32 || temp2_{31..0}
          HI ← (temp2_63)^32 || temp2_{63..32}
          GPR[rd] ← LO
MACC    Multiply and Add Accumulate (continued)
(for VR4121, VR4122, VR4131, and VR4181A)

  64, sat = 1, hi = 1, us = 0 (MACCHIS instruction)
      T:  temp1 ← ((GPR[rs]_31)^32 || GPR[rs]) * ((GPR[rt]_31)^32 || GPR[rt])
          temp2 ← saturation(temp1 + (HI_{31..0} || LO_{31..0}))
          LO ← (temp2_31)^32 || temp2_{31..0}
          HI ← (temp2_63)^32 || temp2_{63..32}
          GPR[rd] ← HI
  64, sat = 1, hi = 1, us = 1 (MACCHIUS instruction)
      T:  temp1 ← (0^32 || GPR[rs]) * (0^32 || GPR[rt])
          temp2 ← saturation(temp1 + (HI_{31..0} || LO_{31..0}))
          LO ← (temp2_31)^32 || temp2_{31..0}
          HI ← (temp2_63)^32 || temp2_{63..32}
          GPR[rd] ← HI

Exceptions:
  None
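As an illustration of the non-saturating forms (MACC, MACCU, MACCHI, MACCHIU) in 32-bit mode, the Python sketch below follows the prose description: the product is added to HI || LO, the low-order word of the sum goes to LO, the high-order word to HI, and rd receives a copy of HI or LO. The function macc and its keyword parameters are assumptions made for this example; the saturating variants are deliberately not modelled.

```python
M32 = 0xFFFFFFFF
M64 = (1 << 64) - 1

def macc(rs, rt, hi, lo, unsigned=False, to_hi=False):
    """Non-saturating MACC family, 32-bit mode. Returns (HI, LO, rd)."""
    def s32(x):
        # Interpret a 32-bit register image as a signed integer.
        x &= M32
        return x - (1 << 32) if x & 0x80000000 else x

    a = (rs & M32) if unsigned else s32(rs)
    b = (rt & M32) if unsigned else s32(rt)
    acc = ((hi & M32) << 32) | (lo & M32)      # HI || LO
    total = (a * b + acc) & M64                # 64-bit wrap-around sum
    new_hi, new_lo = total >> 32, total & M32
    return new_hi, new_lo, (new_hi if to_hi else new_lo)

print(macc(3, 4, 0, 10))                                      # (0, 22, 22)
print(macc(0x10000, 0x10000, 0, 0, unsigned=True, to_hi=True))  # (1, 0, 1)
```

Because the sum wraps at 64 bits, the signed and unsigned forms differ only in how the 32-bit operands are widened before the multiply.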
MADD16    Multiply and Add 16-bit Integer
(for VR4181 only)

  Bits:   31..26    25..21  20..16  15..6        5..0
  Field:  SPECIAL   rs      rt      0            MADD16
  Value:  000000                    0000000000   101000

Format: MADD16 rs, rt

Description: The contents of general registers rs and rt are multiplied, treating both operands as 16-bit 2's complement values. Bits 62 to 15 of each operand must be valid sign-extended values. If not, the result is unpredictable.

The product is added to the 64-bit value formed by concatenating special registers HI and LO. When the operation completes, the low-order word of the result is loaded into special register LO, and the high-order word of the result is loaded into special register HI. No integer overflow exception occurs under any circumstances.

Hazard cycles required between MADD16 and other instructions are as follows.

  Instruction sequence        No. of cycles
  MULT/MULTU → MADD16         1 cycle
  DMULT/DMULTU → MADD16       4 cycles
  DIV/DIVU → MADD16           36 cycles
  DDIV/DDIVU → MADD16         68 cycles
  MFHI/MFLO → MADD16          2 cycles
  DMADD16 → MADD16            0 cycles
  MADD16 → MADD16             0 cycles

Operation:
  32, 64  T:  temp1 ← GPR[rs] * GPR[rt]
              temp2 ← temp1 + (HI_{31..0} || LO_{31..0})
              LO ← (temp2_31)^32 || temp2_{31..0}
              HI ← (temp2_63)^32 || temp2_{63..32}

Exceptions:
  None
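A minimal Python sketch of the MADD16 accumulate step follows. It assumes the operands are already valid 16-bit two's complement values and returns only the low 32 bits of HI and LO (the sign extension into bits 63..32 is omitted). The name madd16 and the tuple return are choices made for this example.

```python
M32 = 0xFFFFFFFF
M64 = (1 << 64) - 1

def madd16(rs, rt, hi, lo):
    """Model of MADD16: returns the 32-bit images of (HI, LO)."""
    def s16(x):
        # Interpret the low 16 bits as a signed value.
        x &= 0xFFFF
        return x - 0x10000 if x & 0x8000 else x

    acc = ((hi & M32) << 32) | (lo & M32)      # HI_31..0 || LO_31..0
    total = (s16(rs) * s16(rt) + acc) & M64
    return total >> 32, total & M32

print(madd16(0xFFFE, 3, 0, 10))   # (-2 * 3) + 10 = 4  ->  (0, 4)
```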
MFC0    Move from System Control Coprocessor

  Bits:   31..26    25..21   20..16  15..11  10..0
  Field:  COP0      MF       rt      rd      0
  Value:  010000    00000                    00000000000

Format: MFC0 rt, rd

Description: The contents of coprocessor register rd of the CP0 are loaded into general register rt.

Operation:
  32  T:    data ← CPR[0, rd]
      T+1:  GPR[rt] ← data
  64  T:    data ← CPR[0, rd]
      T+1:  GPR[rt] ← (data_31)^32 || data_{31..0}

Exceptions:
  Coprocessor unusable exception (user and supervisor mode if CP0 not enabled)
MFHI    Move from HI

  Bits:   31..26    25..16       15..11  10..6   5..0
  Field:  SPECIAL   0            rd      0       MFHI
  Value:  000000    0000000000           00000   010000

Format: MFHI rd

Description: The contents of special register HI are loaded into general register rd.

To ensure proper operation in the event of interruptions, the two instructions which follow an MFHI instruction may not be any of the instructions which modify the HI register: MACC, DMACC, MADD16, DMADD16, MULT, MULTU, DIV, DIVU, MTHI, DMULT, DMULTU, DDIV, DDIVU.

Operation:
  32, 64  T:  GPR[rd] ← HI

Exceptions:
  None
MFLO    Move from LO

  Bits:   31..26    25..16       15..11  10..6   5..0
  Field:  SPECIAL   0            rd      0       MFLO
  Value:  000000    0000000000           00000   010010

Format: MFLO rd

Description: The contents of special register LO are loaded into general register rd.

To ensure proper operation in the event of interruptions, the two instructions which follow an MFLO instruction may not be any of the instructions which modify the LO register: MACC, DMACC, MADD16, DMADD16, MULT, MULTU, DIV, DIVU, MTLO, DMULT, DMULTU, DDIV, DDIVU.

Operation:
  32, 64  T:  GPR[rd] ← LO

Exceptions:
  None
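The two-instruction rule for MFHI and MFLO lends itself to a simple static check. The Python sketch below scans a list of lowercase mnemonics and reports any HI/LO writer that appears within two instructions after a read. The function find_hazards and the list-of-strings input are assumptions for illustration, and only the mnemonics named in the restriction text are recognised.

```python
# Instructions listed in the manual as modifying both HI and LO.
HI_LO_WRITERS = {
    "macc", "dmacc", "madd16", "dmadd16", "mult", "multu",
    "div", "divu", "dmult", "dmultu", "ddiv", "ddivu",
}

def find_hazards(mnemonics):
    """Return (read_index, write_index) pairs that violate the rule."""
    violations = []
    for i, op in enumerate(mnemonics):
        if op not in ("mfhi", "mflo"):
            continue
        # MTHI conflicts only with MFHI, MTLO only with MFLO.
        move_to = "mthi" if op == "mfhi" else "mtlo"
        for j in range(i + 1, min(i + 3, len(mnemonics))):
            if mnemonics[j] in HI_LO_WRITERS or mnemonics[j] == move_to:
                violations.append((i, j))
    return violations

print(find_hazards(["mult", "mflo", "nop", "mult"]))         # [(1, 3)]
print(find_hazards(["mult", "mflo", "nop", "nop", "mult"]))  # []
```

A checker like this only inspects straight-line sequences; branches and delay slots would need extra handling that is outside this sketch.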
MTC0    Move to Coprocessor 0

  Bits:   31..26    25..21   20..16  15..11  10..0
  Field:  COP0      MT       rt      rd      0
  Value:  010000    00100                    00000000000

Format: MTC0 rt, rd

Description: The contents of general register rt are loaded into coprocessor register rd of CP0.

Because the state of the virtual address translation system may be altered by this instruction, the operation of load instructions, store instructions, and TLB operations immediately before and after this instruction is undefined. When instructions before or after an MTC0 use the register that the MTC0 accesses, refer to Chapter 11 Coprocessor 0 Hazards and place those instructions at an appropriate distance.

Operation:
  32, 64  T:    data ← GPR[rt]
          T+1:  CPR[0, rd] ← data

Exceptions:
  Coprocessor unusable exception (user and supervisor mode if CP0 not enabled)
MTHI    Move to HI

  Bits:   31..26    25..21  20..6             5..0
  Field:  SPECIAL   rs      0                 MTHI
  Value:  000000            000000000000000   010001

Format: MTHI rs

Description: The contents of general register rs are loaded into special register HI.

Restrictions: The operation results written to the HI/LO register pair via a DDIV, DDIVU, DIV, DIVU, DMULT, DMULTU, MULT, or MULTU instruction should be read by the MFHI or MFLO instruction before another result is written to either of the registers. If the MTHI instruction is executed prior to the MFLO or MFHI instruction following the execution of any one of the arithmetic instructions, the contents of the LO register are undefined, as shown in the example below.

  MULT r2, r4    # start operation that will eventually write to HI, LO
  ...            # code not containing MFHI or MFLO
  MTHI r6
  ...            # code not containing MFLO
  MFLO r3        # this MFLO would get an undefined value

Operation:
  32, 64  T-2:  HI ← undefined
          T-1:  HI ← undefined
          T:    HI ← GPR[rs]

Exceptions:
  None
MTLO    Move to LO

  Bits:   31..26    25..21  20..6             5..0
  Field:  SPECIAL   rs      0                 MTLO
  Value:  000000            000000000000000   010011

Format: MTLO rs

Description: The contents of general register rs are loaded into special register LO.

Restrictions: The operation results written to the HI/LO register pair via a DDIV, DDIVU, DIV, DIVU, DMULT, DMULTU, MULT, or MULTU instruction should be read by the MFHI or MFLO instruction before another result is written to either of the registers. If the MTLO instruction is executed prior to the MFLO or MFHI instruction following the execution of any one of the arithmetic instructions, the contents of the HI register are undefined, as shown in the example below.

  MULT r2, r4    # start operation that will eventually write to HI, LO
  ...            # code not containing MFHI or MFLO
  MTLO r6
  ...            # code not containing MFHI
  MFHI r3        # this MFHI would get an undefined value

Operation:
  32, 64  T-2:  LO ← undefined
          T-1:  LO ← undefined
          T:    LO ← GPR[rs]

Exceptions:
  None
MULT    Multiply

  Bits:   31..26    25..21  20..16  15..6        5..0
  Field:  SPECIAL   rs      rt      0            MULT
  Value:  000000                    0000000000   011000

Format: MULT rs, rt

Description: The contents of general registers rs and rt are multiplied, treating both operands as signed 32-bit integers. No integer overflow exception occurs under any circumstances. In 64-bit mode, the operands must be valid 32-bit, sign-extended values.

When the operation completes, the low-order word of the 64-bit result is loaded into special register LO, and the high-order word of the result is loaded into special register HI.

If either of the two preceding instructions is MFHI or MFLO, the results of these instructions are undefined. Correct operation requires separating reads of HI or LO from writes by a minimum of two other instructions.

Restrictions: If the value of either general register rt or general register rs is not a sign-extended 32-bit value (bits 63 to 31 have the same value), the result of this operation will be undefined.

Operation:
  32  T-2:  LO ← undefined
            HI ← undefined
      T-1:  LO ← undefined
            HI ← undefined
      T:    t ← GPR[rs] * GPR[rt]
            LO ← t_{31..0}
            HI ← t_{63..32}
  64  T-2:  LO ← undefined
            HI ← undefined
      T-1:  LO ← undefined
            HI ← undefined
      T:    t ← GPR[rs]_{31..0} * GPR[rt]_{31..0}
            LO ← (t_31)^32 || t_{31..0}
            HI ← (t_63)^32 || t_{63..32}

Exceptions:
  None
MULTU    Multiply Unsigned

  Bits:   31..26    25..21  20..16  15..6        5..0
  Field:  SPECIAL   rs      rt      0            MULTU
  Value:  000000                    0000000000   011001

Format: MULTU rs, rt

Description: The contents of general registers rs and rt are multiplied, treating both operands as unsigned values. No overflow exception occurs under any circumstances. In 64-bit mode, the operands must be valid 32-bit, sign-extended values.

When the operation completes, the low-order word of the 64-bit result is loaded into special register LO, and the high-order word of the result is loaded into special register HI.

If either of the two preceding instructions is MFHI or MFLO, the results of these instructions are undefined. Correct operation requires separating reads of HI or LO from writes by a minimum of two instructions.

Restrictions: If the value of either general register rt or general register rs is not a sign-extended 32-bit value (bits 63 to 31 have the same value), the result of this operation will be undefined.

Operation:
  32  T-2:  LO ← undefined
            HI ← undefined
      T-1:  LO ← undefined
            HI ← undefined
      T:    t ← (0 || GPR[rs]) * (0 || GPR[rt])
            LO ← t_{31..0}
            HI ← t_{63..32}
  64  T-2:  LO ← undefined
            HI ← undefined
      T-1:  LO ← undefined
            HI ← undefined
      T:    t ← (0 || GPR[rs]_{31..0}) * (0 || GPR[rt]_{31..0})
            LO ← (t_31)^32 || t_{31..0}
            HI ← (t_63)^32 || t_{63..32}

Exceptions:
  None
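The only difference between MULT and MULTU is how the 32-bit operands are interpreted before the 64-bit product is split across HI and LO. A small Python model makes the contrast concrete; mult32 is a hypothetical helper, and it returns the 32-bit images of HI and LO without the 64-bit-mode sign extension.

```python
M32 = 0xFFFFFFFF
M64 = (1 << 64) - 1

def mult32(rs, rt, unsigned=False):
    """Model of MULT (default) or MULTU: returns (HI, LO) as 32-bit values."""
    def s32(x):
        # Interpret a 32-bit register image as a signed integer.
        x &= M32
        return x - (1 << 32) if x & 0x80000000 else x

    a = (rs & M32) if unsigned else s32(rs)
    b = (rt & M32) if unsigned else s32(rt)
    t = (a * b) & M64                  # 64-bit product
    return t >> 32, t & M32            # high-order word, low-order word

hi, lo = mult32(0xFFFFFFFF, 2)                  # signed: -1 * 2 = -2
print(hex(hi), hex(lo))                         # 0xffffffff 0xfffffffe
hi, lo = mult32(0xFFFFFFFF, 2, unsigned=True)   # unsigned: 4294967295 * 2
print(hex(hi), hex(lo))                         # 0x1 0xfffffffe
```

The low-order word is identical in both cases; only the high-order word depends on signedness.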
NOR    NOR

  Bits:   31..26    25..21  20..16  15..11  10..6   5..0
  Field:  SPECIAL   rs      rt      rd      0       NOR
  Value:  000000                            00000   100111

Format: NOR rd, rs, rt

Description: The contents of general register rs are combined with the contents of general register rt in a bit-wise logical NOR operation. The result is placed into general register rd.

Operation:
  32, 64  T:  GPR[rd] ← GPR[rs] nor GPR[rt]

Exceptions:
  None
OR    OR

  Bits:   31..26    25..21  20..16  15..11  10..6   5..0
  Field:  SPECIAL   rs      rt      rd      0       OR
  Value:  000000                            00000   100101

Format: OR rd, rs, rt

Description: The contents of general register rs are combined with the contents of general register rt in a bit-wise logical OR operation. The result is placed into general register rd.

Operation:
  32, 64  T:  GPR[rd] ← GPR[rs] or GPR[rt]

Exceptions:
  None
ORI    OR Immediate

  Bits:   31..26    25..21  20..16  15..0
  Field:  ORI       rs      rt      immediate
  Value:  001101

Format: ORI rt, rs, immediate

Description: The 16-bit immediate is zero-extended and combined with the contents of general register rs in a bit-wise logical OR operation. The result is placed into general register rt.

Operation:
  32  T:  GPR[rt] ← GPR[rs]_{31..16} || (immediate or GPR[rs]_{15..0})
  64  T:  GPR[rt] ← GPR[rs]_{63..16} || (immediate or GPR[rs]_{15..0})

Exceptions:
  None
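Because the ORI immediate is zero-extended, the instruction leaves bits above bit 15 of rs untouched. That property is what makes the common two-step idiom for building a 32-bit constant work: an upper half is placed in the register first (for example by LUI), then ORI fills in the lower half. The Python sketch below illustrates this; ori is a hypothetical helper and the constant 0x12345678 is an arbitrary example value.

```python
M64 = (1 << 64) - 1

def ori(rs_value: int, immediate: int) -> int:
    """Model of ORI: OR the zero-extended 16-bit immediate into rs."""
    return (rs_value | (immediate & 0xFFFF)) & M64

upper = 0x1234 << 16               # register contents after LUI with 0x1234
const = ori(upper, 0x5678)
print(hex(const))                  # 0x12345678

# Zero extension: bit 15 of the immediate is not propagated upward.
print(hex(ori(0, 0x8000)))         # 0x8000
```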
SB    Store Byte

  Bits:   31..26    25..21   20..16   15..0
  Field:  SB        base     rt       offset
  Value:  101000

Format: SB rt, offset(base)

Description: The 16-bit offset is sign-extended and added to the contents of general register base to form a virtual address. The least-significant byte of register rt is stored at the effective address.

Operation:
  32  T:  vAddr ← ((offset_15)^16 || offset_{15..0}) + GPR[base]
          (pAddr, uncached) ← AddressTranslation(vAddr, DATA)
          pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor ReverseEndian^3)
          byte ← vAddr_{2..0} xor BigEndianCPU^3
          data ← GPR[rt]_{63-8*byte..0} || 0^{8*byte}
          StoreMemory(uncached, BYTE, data, pAddr, vAddr, DATA)
  64  T:  vAddr ← ((offset_15)^48 || offset_{15..0}) + GPR[base]
          (pAddr, uncached) ← AddressTranslation(vAddr, DATA)
          pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor ReverseEndian^3)
          byte ← vAddr_{2..0} xor BigEndianCPU^3
          data ← GPR[rt]_{63-8*byte..0} || 0^{8*byte}
          StoreMemory(uncached, BYTE, data, pAddr, vAddr, DATA)

Exceptions:
  TLB refill exception
  TLB invalid exception
  TLB modified exception
  Bus error exception
  Address error exception
SD    Store Doubleword

  Bits:   31..26    25..21   20..16   15..0
  Field:  SD        base     rt       offset
  Value:  111111

Format: SD rt, offset(base)

Description: The 16-bit offset is sign-extended and added to the contents of general register base to form a virtual address. The contents of general register rt are stored at the memory location specified by the effective address. If any of the three least-significant bits of the effective address is non-zero, an address error exception occurs.

This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit kernel mode. Execution of this instruction in 32-bit user or supervisor mode causes a reserved instruction exception.

Operation:
  32  T:  vAddr ← ((offset_15)^16 || offset_{15..0}) + GPR[base]
          (pAddr, uncached) ← AddressTranslation(vAddr, DATA)
          data ← GPR[rt]
          StoreMemory(uncached, DOUBLEWORD, data, pAddr, vAddr, DATA)
  64  T:  vAddr ← ((offset_15)^48 || offset_{15..0}) + GPR[base]
          (pAddr, uncached) ← AddressTranslation(vAddr, DATA)
          data ← GPR[rt]
          StoreMemory(uncached, DOUBLEWORD, data, pAddr, vAddr, DATA)

Exceptions:
  TLB refill exception
  TLB invalid exception
  TLB modified exception
  Bus error exception
  Address error exception
  Reserved instruction exception (VR4100 Series in 32-bit user mode, VR4100 Series in 32-bit supervisor mode)
SDL    Store Doubleword Left

  Bits:   31..26    25..21   20..16   15..0
  Field:  SDL       base     rt       offset
  Value:  101100

Format: SDL rt, offset(base)

Description: This instruction can be used with the SDR instruction to store the contents of a register into eight consecutive bytes of memory, when the bytes cross a doubleword boundary. SDL stores the left portion of the register into the appropriate part of the high-order doubleword in memory; SDR stores the right portion of the register into the appropriate part of the low-order doubleword.

The SDL instruction adds its sign-extended 16-bit offset to the contents of general register base to form a virtual address that may specify an arbitrary byte. It alters only the doubleword in memory that contains the specified starting byte, with the high-order part of general register rt. From one to eight bytes will be stored, depending on the starting byte specified.

Conceptually, it starts at the most-significant (leftmost) byte of the register and copies it to the specified byte in memory; then it copies bytes from register to memory until it reaches the low-order byte of the doubleword in memory.

Example (little-endian memory):
  Register $24:    A B C D E F G H
  Memory before:   address 8: 15 14 13 12 11 10 9 8      address 0: 7 6 5 4 3 2 1 0
  Instruction:     SDL $24, 8($0)
  Memory after:    address 8: 15 14 13 12 11 10 9 A      address 0: 7 6 5 4 3 2 1 0
SDL    Store Doubleword Left (continued)

No address error exceptions due to alignment are possible.

This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit kernel mode. Execution of this instruction in 32-bit user or supervisor mode causes a reserved instruction exception.

Operation:
  32  T:  vAddr ← ((offset_15)^16 || offset_{15..0}) + GPR[base]
          (pAddr, uncached) ← AddressTranslation(vAddr, DATA)
          pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor ReverseEndian^3)
          if BigEndianMem = 0 then
              pAddr ← pAddr_{PSIZE-1..3} || 0^3
          endif
          byte ← vAddr_{2..0} xor BigEndianCPU^3
          data ← 0^{56-8*byte} || GPR[rt]_{63..56-8*byte}
          StoreMemory(uncached, byte, data, pAddr, vAddr, DATA)
  64  T:  vAddr ← ((offset_15)^48 || offset_{15..0}) + GPR[base]
          (pAddr, uncached) ← AddressTranslation(vAddr, DATA)
          pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor ReverseEndian^3)
          if BigEndianMem = 0 then
              pAddr ← pAddr_{PSIZE-1..3} || 0^3
          endif
          byte ← vAddr_{2..0} xor BigEndianCPU^3
          data ← 0^{56-8*byte} || GPR[rt]_{63..56-8*byte}
          StoreMemory(uncached, byte, data, pAddr, vAddr, DATA)
chapter 9 cpu instruction set details user?s manual u15509ej2v0um 337 sdl store doubleword left sdl (continued) given a doubleword in a register and a doubleword in memory, the operation of sdl is as follows: bcdefg ah jklmno ip register memory vaddr 2..0 bigendiancpu = 0 bigendiancpu = 1 note destination type offset destination type offset lem bem note lem bem 0 1 2 3 4 5 6 7 ijklmnoa ijklmnab ijklmabc ijklabcd i jkabcde ijabcdef i abcdefg abcdefgh 0 1 2 3 4 5 6 7 0 0 0 0 0 0 0 0 7 6 5 4 3 2 1 0 abcdefgh i abcdefg ijabcdef i jkabcde ijklabcd ijklmabc ijklmnab ijklmnoa 7 6 5 4 3 2 1 0 0 0 0 0 0 0 0 0 0 1 2 3 4 5 6 7 note for v r 4131 only remark type : access type (see figure 2-2 ) sent to memory offset : paddr 2..0 sent to memory lem : little-endian memory (bigendianmem = 0) bem : big-endian memory (bigendianmem = 1) exceptions: tlb refill exception tlb invalid exception tlb modified exception bus error exception address error exception reserved instruction exception (v r 4100 series in 32-bit user mode, v r 4100 series in 32-bit supervisor mode)
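To make the SDL bit layout concrete, the following minimal Python sketch packs the fields into a 32-bit instruction word. The helper name encode_sdl is hypothetical; only the field positions (opcode in bits 31-26, base in 25-21, rt in 20-16, offset in 15-0) and the opcode value 101100 are taken from the encoding of SDL.

```python
SDL_OPCODE = 0b101100  # bits 31..26, from the SDL encoding


def encode_sdl(rt, offset, base):
    """Pack 'SDL rt, offset(base)' into a 32-bit word (illustrative helper)."""
    if not (0 <= rt < 32 and 0 <= base < 32):
        raise ValueError("register numbers must be in 0..31")
    if not (-0x8000 <= offset <= 0x7FFF):
        raise ValueError("offset must fit in a signed 16-bit field")
    # The offset is stored as its low 16 bits; the CPU sign-extends it.
    return (SDL_OPCODE << 26) | (base << 21) | (rt << 16) | (offset & 0xFFFF)


word = encode_sdl(rt=24, offset=8, base=0)  # SDL $24, 8($0)
print(f"{word:#010x}")  # prints 0xb0180008
```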
SDR    Store Doubleword Right

  Bits     31-26     25-21    20-16    15-0
  Field    SDR       base     rt       offset
  Value    101101

Format:
  SDR rt, offset (base)

Description:
  This instruction can be used with the SDL instruction to store the contents of a register into eight consecutive bytes of memory, when the bytes cross a doubleword boundary. SDR stores the right portion of the register into the appropriate part of the low-order doubleword in memory; SDL stores the left portion of the register into the appropriate part of the high-order doubleword.

  The SDR instruction adds its sign-extended 16-bit offset to the contents of general register base to form a virtual address that may specify an arbitrary byte. It alters only the doubleword in memory that contains the specified starting byte, with the low-order part of general register rt. From one to eight bytes will be stored, depending on the starting byte specified.

  Conceptually, it starts at the least-significant (rightmost) byte of the register and copies it to the specified byte in memory; then it copies bytes from register to memory until it reaches the high-order byte of the doubleword in memory.

  Example (little-endian memory): SDR $24, 1 ($0)

    Register $24:   A  B  C  D  E  F  G  H

                    Before                     After
    Address 8:      15 14 13 12 11 10  9  8    15 14 13 12 11 10  9  8
    Address 0:       7  6  5  4  3  2  1  0     B  C  D  E  F  G  H  0

  No address error exceptions due to alignment are possible.

  This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.

Operation:
  32  T:  vAddr ← ((offset_15)^16 || offset_{15..0}) + GPR[base]
          (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
          pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor ReverseEndian^3)
          if BigEndianMem = 0 then
              pAddr ← pAddr_{PSIZE-1..3} || 0^3
          endif
          byte ← vAddr_{2..0} xor BigEndianCPU^3
          data ← GPR[rt]_{63-8*byte..0} || 0^{8*byte}
          StoreMemory (uncached, DOUBLEWORD-byte, data, pAddr, vAddr, DATA)

  64  T:  vAddr ← ((offset_15)^48 || offset_{15..0}) + GPR[base]
          (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
          pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor ReverseEndian^3)
          if BigEndianMem = 0 then
              pAddr ← pAddr_{PSIZE-1..3} || 0^3
          endif
          byte ← vAddr_{2..0} xor BigEndianCPU^3
          data ← GPR[rt]_{63-8*byte..0} || 0^{8*byte}
          StoreMemory (uncached, DOUBLEWORD-byte, data, pAddr, vAddr, DATA)

  Given a doubleword in a register and a doubleword in memory, the operation of SDR is as follows:

    Register:  A B C D E F G H
    Memory:    I J K L M N O P

                 BigEndianCPU = 0                      BigEndianCPU = 1 (Note)
    vAddr_{2..0} Destination  Type  Offset             Destination  Type  Offset
                                    LEM  BEM (Note)                       LEM  BEM
    0            ABCDEFGH     7     0    0             IJKLMNOA     0     7    0
    1            BCDEFGHP     6     1    0             IJKLMNAB     1     6    0
    2            CDEFGHOP     5     2    0             IJKLMABC     2     5    0
    3            DEFGHNOP     4     3    0             IJKLABCD     3     4    0
    4            EFGHMNOP     3     4    0             IJKABCDE     4     3    0
    5            FGHLMNOP     2     5    0             IJABCDEF     5     2    0
    6            GHKLMNOP     1     6    0             IABCDEFG     6     1    0
    7            HJKLMNOP     0     7    0             ABCDEFGH     7     0    0

  Note    For VR4131 only

  Remark  Type:   access type (see Figure 2-2) sent to memory
          Offset: pAddr_{2..0} sent to memory
          LEM:    little-endian memory (BigEndianMem = 0)
          BEM:    big-endian memory (BigEndianMem = 1)

Exceptions:
  TLB refill exception
  TLB invalid exception
  TLB modified exception
  Bus error exception
  Address error exception
  Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
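The way SDL and SDR cooperate on a doubleword that straddles a boundary can be sketched in Python. This is only a byte-level model under stated assumptions: little-endian memory with BigEndianCPU = 0, memory represented as a bytearray, and no address translation or exceptions. The function names are hypothetical.

```python
def sdl_le(mem, vaddr, reg):
    """SDL model: copy register bytes from the most-significant one downward,
    starting at vaddr and stopping at the low-order byte of that doubleword."""
    for k in range((vaddr & 7) + 1):
        mem[vaddr - k] = (reg >> (8 * (7 - k))) & 0xFF


def sdr_le(mem, vaddr, reg):
    """SDR model: copy register bytes from the least-significant one upward,
    starting at vaddr and stopping at the high-order byte of that doubleword."""
    for k in range(8 - (vaddr & 7)):
        mem[vaddr + k] = (reg >> (8 * k)) & 0xFF


def store_unaligned_doubleword(mem, addr, reg):
    """Store all eight bytes of reg at an arbitrary byte address."""
    sdr_le(mem, addr, reg)       # low-order part, up to the boundary
    sdl_le(mem, addr + 7, reg)   # high-order part, from the boundary onward


mem = bytearray(range(16))
store_unaligned_doubleword(mem, 1, 0x4142434445464748)  # bytes "ABCDEFGH"
print(mem[1:9] == (0x4142434445464748).to_bytes(8, "little"))  # prints True
```

With addr = 1 the pair corresponds to the two figure examples, SDR $24, 1 ($0) and SDL $24, 8 ($0): the first writes seven bytes into the doubleword at address 0, the second writes the remaining byte into the doubleword at address 8.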
SH    Store Halfword

  Bits     31-26     25-21    20-16    15-0
  Field    SH        base     rt       offset
  Value    101001

Format:
  SH rt, offset (base)

Description:
  The 16-bit offset is sign-extended and added to the contents of general register base to form an effective address. The least-significant halfword of register rt is stored at the effective address. If the least-significant bit of the effective address is non-zero, an address error exception occurs.

Operation:
  32  T:  vAddr ← ((offset_15)^16 || offset_{15..0}) + GPR[base]
          (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
          pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor (ReverseEndian^2 || 0))
          byte ← vAddr_{2..0} xor (BigEndianCPU^2 || 0)
          data ← GPR[rt]_{63-8*byte..0} || 0^{8*byte}
          StoreMemory (uncached, HALFWORD, data, pAddr, vAddr, DATA)

  64  T:  vAddr ← ((offset_15)^48 || offset_{15..0}) + GPR[base]
          (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
          pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor (ReverseEndian^2 || 0))
          byte ← vAddr_{2..0} xor (BigEndianCPU^2 || 0)
          data ← GPR[rt]_{63-8*byte..0} || 0^{8*byte}
          StoreMemory (uncached, HALFWORD, data, pAddr, vAddr, DATA)

Exceptions:
  TLB refill exception
  TLB invalid exception
  TLB modified exception
  Bus error exception
  Address error exception
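The address formation and the SH alignment rule can be illustrated with a short Python sketch. The function names are hypothetical, and the model covers only the arithmetic: the 16-bit offset is sign-extended, added to base, and the sum is checked for a zero least-significant bit.

```python
def effective_address(base, offset16, bits=32):
    """Add the sign-extended 16-bit offset to base, wrapping to `bits` bits."""
    offset = offset16 & 0xFFFF
    if offset & 0x8000:        # offset bit 15 set: negative displacement
        offset -= 0x10000
    return (base + offset) & ((1 << bits) - 1)


def sh_address_ok(vaddr):
    """SH raises an address error unless the least-significant bit is zero."""
    return (vaddr & 1) == 0


vaddr = effective_address(0x1000, 0xFFFE)  # offset field encodes -2
print(hex(vaddr), sh_address_ok(vaddr))    # prints 0xffe True
```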
SLL    Shift Left Logical

  Bits     31-26     25-21    20-16    15-11    10-6     5-0
  Field    SPECIAL   0        rt       rd       sa       SLL
  Value    000000    00000                               000000

Format:
  SLL rd, rt, sa

Description:
  The contents of general register rt are shifted left by sa bits, inserting zeros into the low-order bits. The result is placed in register rd.

  In 64-bit mode, the 32-bit result is sign-extended when placed in the destination register. It is sign-extended for all shift amounts, including zero; SLL with a zero shift amount truncates a 64-bit value to 32 bits and then sign-extends this 32-bit value. SLL, unlike nearly all other word operations, does not require an operand to be a properly sign-extended word value to produce a valid sign-extended word result.

Operation:
  32  T:  GPR[rd] ← GPR[rt]_{31-sa..0} || 0^sa

  64  T:  s ← 0 || sa
          temp ← GPR[rt]_{31-s..0} || 0^s
          GPR[rd] ← (temp_31)^32 || temp

Exceptions:
  None

Caution  SLL with a shift amount of zero may be treated as a NOP by some assemblers, at some optimization levels. If using SLL with a zero shift to truncate 64-bit values, check the assembler you are using.
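The 64-bit behavior of SLL, including the truncating effect of a zero shift, can be modeled in a few lines of Python. This is a sketch of the arithmetic only; the function name sll64 is hypothetical and registers are represented as unsigned 64-bit integers.

```python
def sll64(rt, sa):
    """Model SLL in 64-bit mode: return the 64-bit value written to rd."""
    temp = (rt << sa) & 0xFFFFFFFF   # GPR[rt]_{31-s..0} || 0^s
    if temp & 0x80000000:            # replicate temp bit 31 into bits 63..32
        temp |= 0xFFFFFFFF00000000
    return temp


# A zero shift keeps the low word and sign-extends it.
print(hex(sll64(0x123456789ABCDEF0, 0)))  # prints 0xffffffff9abcdef0
```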
SLLV    Shift Left Logical Variable

  Bits     31-26     25-21    20-16    15-11    10-6     5-0
  Field    SPECIAL   rs       rt       rd       0        SLLV
  Value    000000                               00000    000100

Format:
  SLLV rd, rt, rs

Description:
  The contents of general register rt are shifted left by the number of bits specified by the low-order five bits of general register rs, inserting zeros into the low-order bits. The result is placed in register rd.

  In 64-bit mode, the 32-bit result is sign-extended when placed in the destination register. It is sign-extended for all shift amounts, including zero; SLLV with a zero shift amount truncates a 64-bit value to 32 bits and then sign-extends this 32-bit value. SLLV, unlike nearly all other word operations, does not require an operand to be a properly sign-extended word value to produce a valid sign-extended word result.

Operation:
  32  T:  s ← GPR[rs]_{4..0}
          GPR[rd] ← GPR[rt]_{31-s..0} || 0^s

  64  T:  s ← 0 || GPR[rs]_{4..0}
          temp ← GPR[rt]_{31-s..0} || 0^s
          GPR[rd] ← (temp_31)^32 || temp

Exceptions:
  None

Caution  SLLV with a shift amount of zero may be treated as a NOP by some assemblers, at some optimization levels. If using SLLV with a zero shift to truncate 64-bit values, check the assembler you are using.
SLT    Set On Less Than

  Bits     31-26     25-21    20-16    15-11    10-6     5-0
  Field    SPECIAL   rs       rt       rd       0        SLT
  Value    000000                               00000    101010

Format:
  SLT rd, rs, rt

Description:
  The contents of general register rt are subtracted from the contents of general register rs. Considering both quantities as signed integers, if the contents of general register rs are less than the contents of general register rt, the result is set to one; otherwise the result is set to zero. The result is placed into general register rd.

  No integer overflow exception occurs under any circumstances. The comparison is valid even if the subtraction used during the comparison overflows.

Operation:
  32  T:  if GPR[rs] < GPR[rt] then
              GPR[rd] ← 0^31 || 1
          else
              GPR[rd] ← 0^32
          endif

  64  T:  if GPR[rs] < GPR[rt] then
              GPR[rd] ← 0^63 || 1
          else
              GPR[rd] ← 0^64
          endif

Exceptions:
  None
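The statement that the comparison stays valid even when the subtraction overflows is easiest to see with a counterexample. The Python sketch below models SLT in 32-bit mode and compares it with a naive test that only inspects the sign bit of the wrapped difference. Both function names are hypothetical.

```python
def to_signed32(x):
    """Interpret the low 32 bits of x as a two's complement integer."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def slt32(rs, rt):
    """SLT model: 1 if rs < rt as signed integers, else 0."""
    return 1 if to_signed32(rs) < to_signed32(rt) else 0


def sign_of_wrapped_difference(rs, rt):
    """Naive test: sign bit of (rs - rt) truncated to 32 bits."""
    return ((rs - rt) & 0xFFFFFFFF) >> 31


# rs is the most negative 32-bit value, so rs - rt overflows.
print(slt32(0x80000000, 1), sign_of_wrapped_difference(0x80000000, 1))  # prints 1 0
```

When no overflow occurs the two agree, which is why the distinction is easy to overlook.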
SLTI    Set On Less Than Immediate

  Bits     31-26     25-21    20-16    15-0
  Field    SLTI      rs       rt       immediate
  Value    001010

Format:
  SLTI rt, rs, immediate

Description:
  The 16-bit immediate is sign-extended and subtracted from the contents of general register rs. Considering both quantities as signed integers, if the contents of general register rs are less than the sign-extended immediate, the result is set to 1; otherwise the result is set to 0. The result is placed into general register rt.

  No integer overflow exception occurs under any circumstances. The comparison is valid even if the subtraction used during the comparison overflows.

Operation:
  32  T:  if GPR[rs] < (immediate_15)^16 || immediate_{15..0} then
              GPR[rt] ← 0^31 || 1
          else
              GPR[rt] ← 0^32
          endif

  64  T:  if GPR[rs] < (immediate_15)^48 || immediate_{15..0} then
              GPR[rt] ← 0^63 || 1
          else
              GPR[rt] ← 0^64
          endif

Exceptions:
  None
SLTIU    Set On Less Than Immediate Unsigned

  Bits     31-26     25-21    20-16    15-0
  Field    SLTIU     rs       rt       immediate
  Value    001011

Format:
  SLTIU rt, rs, immediate

Description:
  The 16-bit immediate is sign-extended and subtracted from the contents of general register rs. Considering both quantities as unsigned integers, if the contents of general register rs are less than the sign-extended immediate, the result is set to 1; otherwise the result is set to 0. The result is placed into general register rt.

  No integer overflow exception occurs under any circumstances. The comparison is valid even if the subtraction used during the comparison overflows.

Operation:
  32  T:  if (0 || GPR[rs]) < (0 || (immediate_15)^16 || immediate_{15..0}) then
              GPR[rt] ← 0^31 || 1
          else
              GPR[rt] ← 0^32
          endif

  64  T:  if (0 || GPR[rs]) < (0 || (immediate_15)^48 || immediate_{15..0}) then
              GPR[rt] ← 0^63 || 1
          else
              GPR[rt] ← 0^64
          endif

Exceptions:
  None
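A point that is easy to miss in SLTIU is that the immediate is sign-extended first and only then compared as an unsigned quantity. The Python sketch below models this for 64-bit mode; the function names are hypothetical.

```python
def sign_extend16(imm, bits=64):
    """Sign-extend a 16-bit immediate to an unsigned `bits`-bit pattern."""
    imm &= 0xFFFF
    if imm & 0x8000:
        imm |= ((1 << bits) - 1) & ~0xFFFF
    return imm


def sltiu(rs, imm, bits=64):
    """SLTIU model: unsigned compare of rs against the sign-extended immediate."""
    mask = (1 << bits) - 1
    return 1 if (rs & mask) < sign_extend16(imm, bits) else 0


# 0xFFFF sign-extends to the largest unsigned value, so 5 compares as less.
print(sltiu(5, 0xFFFF))  # prints 1
```

As a consequence, an immediate with bit 15 set addresses the top 32768 values of the unsigned range rather than the values 32768 to 65535.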
SLTU    Set On Less Than Unsigned

  Bits     31-26     25-21    20-16    15-11    10-6     5-0
  Field    SPECIAL   rs       rt       rd       0        SLTU
  Value    000000                               00000    101011

Format:
  SLTU rd, rs, rt

Description:
  The contents of general register rt are subtracted from the contents of general register rs. Considering both quantities as unsigned integers, if the contents of general register rs are less than the contents of general register rt, the result is set to 1; otherwise the result is set to 0. The result is placed into general register rd.

  No integer overflow exception occurs under any circumstances. The comparison is valid even if the subtraction used during the comparison overflows.

Operation:
  32  T:  if (0 || GPR[rs]) < (0 || GPR[rt]) then
              GPR[rd] ← 0^31 || 1
          else
              GPR[rd] ← 0^32
          endif

  64  T:  if (0 || GPR[rs]) < (0 || GPR[rt]) then
              GPR[rd] ← 0^63 || 1
          else
              GPR[rd] ← 0^64
          endif

Exceptions:
  None
SRA    Shift Right Arithmetic

  Bits     31-26     25-21    20-16    15-11    10-6     5-0
  Field    SPECIAL   0        rt       rd       sa       SRA
  Value    000000    00000                               000011

Format:
  SRA rd, rt, sa

Description:
  The contents of general register rt are shifted right by sa bits, sign-extending the high-order bits. The result is placed in register rd. In 64-bit mode, the 32-bit result is sign-extended when placed in the destination register.

Restrictions:
  If the value of general register rt is not a sign-extended 32-bit value (bits 63 to 31 have the same value), the result of this operation will be undefined.

Operation:
  32  T:  GPR[rd] ← (GPR[rt]_31)^sa || GPR[rt]_{31..sa}

  64  T:  s ← 0 || sa
          temp ← (GPR[rt]_31)^s || GPR[rt]_{31..s}
          GPR[rd] ← (temp_31)^32 || temp

Exceptions:
  None
SRAV    Shift Right Arithmetic Variable

  Bits     31-26     25-21    20-16    15-11    10-6     5-0
  Field    SPECIAL   rs       rt       rd       0        SRAV
  Value    000000                               00000    000111

Format:
  SRAV rd, rt, rs

Description:
  The contents of general register rt are shifted right by the number of bits specified by the low-order five bits of general register rs, sign-extending the high-order bits. The result is placed in register rd. In 64-bit mode, the 32-bit result is sign-extended when placed in the destination register.

Restrictions:
  If the value of general register rt is not a sign-extended 32-bit value (bits 63 to 31 have the same value), the result of this operation will be undefined.

Operation:
  32  T:  s ← GPR[rs]_{4..0}
          GPR[rd] ← (GPR[rt]_31)^s || GPR[rt]_{31..s}

  64  T:  s ← GPR[rs]_{4..0}
          temp ← (GPR[rt]_31)^s || GPR[rt]_{31..s}
          GPR[rd] ← (temp_31)^32 || temp

Exceptions:
  None
SRL    Shift Right Logical

  Bits     31-26     25-21    20-16    15-11    10-6     5-0
  Field    SPECIAL   0        rt       rd       sa       SRL
  Value    000000    00000                               000010

Format:
  SRL rd, rt, sa

Description:
  The contents of general register rt are shifted right by sa bits, inserting zeros into the high-order bits. The result is placed in register rd. In 64-bit mode, the 32-bit result is sign-extended when placed in the destination register.

Restrictions:
  If the value of general register rt is not a sign-extended 32-bit value (bits 63 to 31 have the same value), the result of this operation will be undefined.

Operation:
  32  T:  GPR[rd] ← 0^sa || GPR[rt]_{31..sa}

  64  T:  s ← 0 || sa
          temp ← 0^s || GPR[rt]_{31..s}
          GPR[rd] ← (temp_31)^32 || temp

Exceptions:
  None
SRLV    Shift Right Logical Variable

  Bits     31-26     25-21    20-16    15-11    10-6     5-0
  Field    SPECIAL   rs       rt       rd       0        SRLV
  Value    000000                               00000    000110

Format:
  SRLV rd, rt, rs

Description:
  The contents of general register rt are shifted right by the number of bits specified by the low-order five bits of general register rs, inserting zeros into the high-order bits. The result is placed in register rd. In 64-bit mode, the 32-bit result is sign-extended when placed in the destination register.

Restrictions:
  If the value of general register rt is not a sign-extended 32-bit value (bits 63 to 31 have the same value), the result of this operation will be undefined.

Operation:
  32  T:  s ← GPR[rs]_{4..0}
          GPR[rd] ← 0^s || GPR[rt]_{31..s}

  64  T:  s ← GPR[rs]_{4..0}
          temp ← 0^s || GPR[rt]_{31..s}
          GPR[rd] ← (temp_31)^32 || temp

Exceptions:
  None
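The difference between the logical and arithmetic right shifts in 64-bit mode, and the masking of the variable shift amount to five bits, can be sketched in Python as follows. The function names are hypothetical, and the inputs are assumed to be properly sign-extended 32-bit values, as the restrictions require.

```python
def sign_extend32(word):
    """Replicate bit 31 of a 32-bit word into bits 63..32."""
    word &= 0xFFFFFFFF
    if word & 0x80000000:
        word |= 0xFFFFFFFF00000000
    return word


def srl64(rt, sa):
    """SRL model: zeros enter from the left, then the word is sign-extended."""
    return sign_extend32((rt & 0xFFFFFFFF) >> sa)


def sra64(rt, sa):
    """SRA model: copies of bit 31 enter from the left."""
    word = rt & 0xFFFFFFFF
    temp = word >> sa
    if word & 0x80000000:
        temp |= (0xFFFFFFFF << (32 - sa)) & 0xFFFFFFFF
    return sign_extend32(temp)


def srlv64(rt, rs):
    """SRLV model: only the low-order five bits of rs select the shift."""
    return srl64(rt, rs & 0x1F)


def srav64(rt, rs):
    """SRAV model: only the low-order five bits of rs select the shift."""
    return sra64(rt, rs & 0x1F)


x = 0xFFFFFFFF80000000  # sign-extended 0x80000000
print(hex(srl64(x, 4)), hex(sra64(x, 4)))  # prints 0x8000000 0xfffffffff8000000
```

Note that SRL with a zero shift amount still yields a sign-extended result, because the final sign extension applies to every shift amount.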
STANDBY    Standby

  Bits     31-26     25    24-6                    5-0
  Field    COP0      CO    0                       STANDBY
  Value    010000    1     0000000000000000000     100001

Format:
  STANDBY

Description:
  The STANDBY instruction starts the mode transition from Fullspeed mode to Standby mode.

  When the STANDBY instruction finishes the WB stage, the VR4100 Series waits until the SysAD bus is idle and then fixes the internal clocks to high level, thus freezing the pipeline. In the VR4131 and VR4181A, the IE bit of the Status register in CP0 is also set to 1. The PLL, timer/interrupt clocks, and the internal bus clocks (TClock and MasterOut) continue to run.

  Once the VR4100 Series is in Standby mode, any interrupt, including the internally generated timer interrupt, NMI, Soft Reset, and Cold Reset, will cause the VR4100 Series to exit Standby mode and enter Fullspeed mode.

Operation:
  32, 64  T:
          T+1:  Standby operation ()

Exceptions:
  Coprocessor unusable exception

Remark  Refer to the hardware user's manual of each product for details about the operation of the peripheral units at mode transition.

Program examples to enter Standby mode are shown below.

  For VR4121, VR4122, and VR4181

      # insert process to mask interrupts in the interrupt control unit (ICU)
      :
      # insert process for entering Standby mode
      :
      # insert process to enable interrupts in the ICU
      standby

  For VR4131 and VR4181A

      mfc0    t5, psr       # read the Status register
      ori     t5, t5, 1     # force bit 0 to 1
      xori    t5, t5, 1     # toggle bit 0, leaving it 0
      mtc0    t5, psr       # write the Status register back
      # insert process for entering Standby mode
      standby
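The ORI/XORI pair in the VR4131 and VR4181A example is a two-instruction way of clearing bit 0 without needing a mask constant in a register. The Python sketch below mirrors the sequence on a plain integer; it assumes that bit 0 of the value is the IE bit, as in the Status register layout of MIPS III processors, and the function name is hypothetical.

```python
def clear_bit0_like_example(status):
    """Mirror the example sequence on an integer standing in for the register."""
    t5 = status   # mfc0 t5, psr
    t5 |= 1       # ori  t5, t5, 1   (bit 0 becomes 1 whatever it was)
    t5 ^= 1       # xori t5, t5, 1   (bit 0 is toggled to 0)
    return t5     # mtc0 t5, psr


# The result equals the input with bit 0 cleared and all other bits unchanged.
print(all(clear_bit0_like_example(x) == (x & ~1) for x in range(1024)))  # prints True
```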
SUB    Subtract

  Bits     31-26     25-21    20-16    15-11    10-6     5-0
  Field    SPECIAL   rs       rt       rd       0        SUB
  Value    000000                               00000    100010

Format:
  SUB rd, rs, rt

Description:
  The contents of general register rt are subtracted from the contents of general register rs to form a result. The result is placed into general register rd. In 64-bit mode, the 32-bit result is sign-extended when placed in the destination register.

  An integer overflow exception takes place if the carries out of bits 30 and 31 differ (2's complement overflow). The destination register rd is not modified when an integer overflow exception occurs.

Restrictions:
  If the value of either general register rt or general register rs is not a sign-extended 32-bit value (bits 63 to 31 have the same value), the result of this operation will be undefined.

Operation:
  32  T:  GPR[rd] ← GPR[rs] - GPR[rt]

  64  T:  temp ← GPR[rs] - GPR[rt]
          GPR[rd] ← (temp_31)^32 || temp_{31..0}

Exceptions:
  Integer overflow exception
SUBU    Subtract Unsigned

  Bits     31-26     25-21    20-16    15-11    10-6     5-0
  Field    SPECIAL   rs       rt       rd       0        SUBU
  Value    000000                               00000    100011

Format:
  SUBU rd, rs, rt

Description:
  The contents of general register rt are subtracted from the contents of general register rs to form a result. The result is placed into general register rd. In 64-bit mode, the 32-bit result is sign-extended when placed in the destination register.

  The only difference between this instruction and the SUB instruction is that SUBU never traps on overflow. No integer overflow exception occurs under any circumstances.

Restrictions:
  If the value of either general register rt or general register rs is not a sign-extended 32-bit value (bits 63 to 31 have the same value), the result of this operation will be undefined.

Operation:
  32  T:  GPR[rd] ← GPR[rs] - GPR[rt]

  64  T:  temp ← GPR[rs] - GPR[rt]
          GPR[rd] ← (temp_31)^32 || temp_{31..0}

Exceptions:
  None
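The overflow rule for SUB (the carries out of bits 30 and 31 differ) can be checked with a small Python model. The sketch computes rs - rt as rs + (not rt) + 1 so that both carries are visible; the exception class and function names are hypothetical, and results are returned as 32-bit values without the 64-bit sign extension.

```python
class IntegerOverflow(Exception):
    """Stands in for the integer overflow exception."""


def sub32(rs, rt):
    """SUB model: raise on two's complement overflow, else return the result."""
    a = rs & 0xFFFFFFFF
    not_b = ~rt & 0xFFFFFFFF
    carry_out_30 = ((a & 0x7FFFFFFF) + (not_b & 0x7FFFFFFF) + 1) >> 31
    total = a + not_b + 1
    carry_out_31 = total >> 32
    if carry_out_30 != carry_out_31:
        raise IntegerOverflow("rd is not modified")
    return total & 0xFFFFFFFF


def subu32(rs, rt):
    """SUBU model: same result bits, never raises."""
    return (rs - rt) & 0xFFFFFFFF


print(hex(sub32(5, 7)), hex(subu32(0x80000000, 1)))  # prints 0xfffffffe 0x7fffffff
```

Calling sub32(0x80000000, 1) raises IntegerOverflow in this model, since the most negative 32-bit value minus one cannot be represented.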
SUSPEND    Suspend

  Bits     31-26     25    24-6                    5-0
  Field    COP0      CO    0                       SUSPEND
  Value    010000    1     0000000000000000000     100010

Format:
  SUSPEND

Description:
  The SUSPEND instruction starts the mode transition from Fullspeed mode to Suspend mode.

  When the SUSPEND instruction finishes the WB stage, the VR4100 Series waits until the SysAD bus is idle and then fixes the internal clocks, including TClock, to high level, thus freezing the pipeline. In the VR4131 and VR4181A, the IE bit of the Status register in CP0 is also set to 1. The PLL, timer/interrupt clocks, and MasterOut continue to run.

  Once the VR4100 Series is in Suspend mode, any interrupt, including the internally generated timer interrupt, NMI, Soft Reset, and Cold Reset, will cause the VR4100 Series to exit Suspend mode and enter Fullspeed mode.

Operation:
  32, 64  T:
          T+1:  Suspend operation ()

Exceptions:
  Coprocessor unusable exception

Remark  Refer to the hardware user's manual of each product for details about the operation of the peripheral units at mode transition.

Program examples to enter Suspend mode are shown below.

  For VR4121, VR4122, and VR4181

      # insert process to mask interrupts in the interrupt control unit (ICU)
      :
      # insert process for entering Suspend mode
      :
      # insert process to enable interrupts in the ICU
      suspend

  For VR4131 and VR4181A

      mfc0    t5, psr       # read the Status register
      ori     t5, t5, 1     # force bit 0 to 1
      xori    t5, t5, 1     # toggle bit 0, leaving it 0
      mtc0    t5, psr       # write the Status register back
      # insert process for entering Suspend mode
      suspend
SW    Store Word

  Bits     31-26     25-21    20-16    15-0
  Field    SW        base     rt       offset
  Value    101011

Format:
  SW rt, offset (base)

Description:
  The 16-bit offset is sign-extended and added to the contents of general register base to form a virtual address. The contents of general register rt are stored at the memory location specified by the effective address. If either of the two least-significant bits of the effective address is non-zero, an address error exception occurs.

Operation:
  32  T:  vAddr ← ((offset_15)^16 || offset_{15..0}) + GPR[base]
          (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
          pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor (ReverseEndian || 0^2))
          byte ← vAddr_{2..0} xor (BigEndianCPU || 0^2)
          data ← GPR[rt]_{63-8*byte..0} || 0^{8*byte}
          StoreMemory (uncached, WORD, data, pAddr, vAddr, DATA)

  64  T:  vAddr ← ((offset_15)^48 || offset_{15..0}) + GPR[base]
          (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
          pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor (ReverseEndian || 0^2))
          byte ← vAddr_{2..0} xor (BigEndianCPU || 0^2)
          data ← GPR[rt]_{63-8*byte..0} || 0^{8*byte}
          StoreMemory (uncached, WORD, data, pAddr, vAddr, DATA)

Exceptions:
  TLB refill exception
  TLB invalid exception
  TLB modified exception
  Bus error exception
  Address error exception
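The byte and data steps of the SW operation place the word in the correct lane of the 64-bit data path. The Python sketch below follows those two pseudocode lines literally; the function name is hypothetical, address translation is omitted, and a ValueError stands in for the address error exception.

```python
MASK64 = (1 << 64) - 1


def sw_store_data(rt, vaddr, big_endian_cpu):
    """Return (byte, data) as computed by the SW operation pseudocode."""
    if vaddr & 3:
        raise ValueError("address error: word store to unaligned address")
    # byte <- vAddr_{2..0} xor (BigEndianCPU || 0^2)
    byte = (vaddr & 7) ^ (4 if big_endian_cpu else 0)
    # data <- GPR[rt]_{63-8*byte..0} || 0^{8*byte}
    data = (rt << (8 * byte)) & MASK64
    return byte, data


byte, data = sw_store_data(0x11223344, 0x1004, big_endian_cpu=False)
print(byte, hex(data))  # prints 4 0x1122334400000000
```

For the same address with big_endian_cpu=True, byte becomes 0 and the word stays in the low half of the doubleword.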
SWL    Store Word Left

  Bits     31-26     25-21    20-16    15-0
  Field    SWL       base     rt       offset
  Value    101010

Format:
  SWL rt, offset (base)

Description:
  This instruction can be used with the SWR instruction to store the contents of a register into four consecutive bytes of memory, when the bytes cross a word boundary. SWL stores the left portion of the register into the appropriate part of the high-order word in memory; SWR stores the right portion of the register into the appropriate part of the low-order word.

  The SWL instruction adds its sign-extended 16-bit offset to the contents of general register base to form a virtual address that may specify an arbitrary byte. It alters only the word in memory that contains the specified starting byte, with the high-order part of general register rt. From one to four bytes will be stored, depending on the starting byte specified.

  Conceptually, it starts at the most-significant (leftmost) byte of the register and copies it to the specified byte in memory; then it copies bytes from register to memory until it reaches the low-order byte of the word in memory.

  No address error exceptions due to alignment are possible.

  Example (little-endian memory): SWL $24, 4 ($0)

    Register $24:   A  B  C  D

                    Before         After
    Address 4:      7  6  5  4     7  6  5  A
    Address 0:      3  2  1  0     3  2  1  0

Operation:
  32  T:  vAddr ← ((offset_15)^16 || offset_{15..0}) + GPR[base]
          (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
          pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor ReverseEndian^3)
          if BigEndianMem = 0 then
              pAddr ← pAddr_{PSIZE-1..2} || 0^2
          endif
          byte ← vAddr_{1..0} xor BigEndianCPU^2
          if (vAddr_2 xor BigEndianCPU) = 0 then
              data ← 0^32 || 0^{24-8*byte} || GPR[rt]_{31..24-8*byte}
          else
              data ← 0^{24-8*byte} || GPR[rt]_{31..24-8*byte} || 0^32
          endif
          StoreMemory (uncached, byte, data, pAddr, vAddr, DATA)

  64  T:  vAddr ← ((offset_15)^48 || offset_{15..0}) + GPR[base]
          (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
          pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor ReverseEndian^3)
          if BigEndianMem = 0 then
              pAddr ← pAddr_{PSIZE-1..2} || 0^2
          endif
          byte ← vAddr_{1..0} xor BigEndianCPU^2
          if (vAddr_2 xor BigEndianCPU) = 0 then
              data ← 0^32 || 0^{24-8*byte} || GPR[rt]_{31..24-8*byte}
          else
              data ← 0^{24-8*byte} || GPR[rt]_{31..24-8*byte} || 0^32
          endif
          StoreMemory (uncached, byte, data, pAddr, vAddr, DATA)

  Given a doubleword in a register and a doubleword in memory, the operation of SWL is as follows:

    Register:  A B C D E F G H
    Memory:    I J K L M N O P

                 BigEndianCPU = 0                      BigEndianCPU = 1 (Note)
    vAddr_{2..0} Destination  Type  Offset             Destination  Type  Offset
                                    LEM  BEM (Note)                       LEM  BEM
    0            IJKLMNOE     0     0    7             EFGHMNOP     3     4    0
    1            IJKLMNEF     1     0    6             IEFGMNOP     2     4    1
    2            IJKLMEFG     2     0    5             IJEFMNOP     1     4    2
    3            IJKLEFGH     3     0    4             IJKEMNOP     0     4    3
    4            IJKEMNOP     0     4    3             IJKLEFGH     3     0    4
    5            IJEFMNOP     1     4    2             IJKLMEFG     2     0    5
    6            IEFGMNOP     2     4    1             IJKLMNEF     1     0    6
    7            EFGHMNOP     3     4    0             IJKLMNOE     0     0    7

  Note    For VR4131 only

  Remark  Type:   access type (see Figure 2-2) sent to memory
          Offset: pAddr_{2..0} sent to memory
          LEM:    little-endian memory (BigEndianMem = 0)
          BEM:    big-endian memory (BigEndianMem = 1)

Exceptions:
  TLB refill exception
  TLB invalid exception
  TLB modified exception
  Bus error exception
  Address error exception
SWR    Store Word Right

  Bits     31-26     25-21    20-16    15-0
  Field    SWR       base     rt       offset
  Value    101110

Format:
  SWR rt, offset (base)

Description:
  This instruction can be used with the SWL instruction to store the contents of a register into four consecutive bytes of memory, when the bytes cross a word boundary. SWR stores the right portion of the register into the appropriate part of the low-order word in memory; SWL stores the left portion of the register into the appropriate part of the high-order word.

  The SWR instruction adds its sign-extended 16-bit offset to the contents of general register base to form a virtual address that may specify an arbitrary byte. It alters only the word in memory that contains the specified starting byte, with the low-order part of general register rt. From one to four bytes will be stored, depending on the starting byte specified.

  Conceptually, it starts at the least-significant (rightmost) byte of the register and copies it to the specified byte in memory; then it copies bytes from register to memory until it reaches the high-order byte of the word in memory.

  No address error exceptions due to alignment are possible.

  Example (little-endian memory): SWR $24, 1 ($0)

    Register $24:   A  B  C  D

                    Before         After
    Address 4:      7  6  5  4     7  6  5  4
    Address 0:      3  2  1  0     B  C  D  0

Operation:
  32  T:  vAddr ← ((offset_15)^16 || offset_{15..0}) + GPR[base]
          (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
          pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor ReverseEndian^3)
          if BigEndianMem = 1 then
              pAddr ← pAddr_{PSIZE-1..2} || 0^2
          endif
          byte ← vAddr_{1..0} xor BigEndianCPU^2
          if (vAddr_2 xor BigEndianCPU) = 0 then
              data ← 0^32 || GPR[rt]_{31-8*byte..0} || 0^{8*byte}
          else
              data ← GPR[rt]_{31-8*byte..0} || 0^{8*byte} || 0^32
          endif
          StoreMemory (uncached, WORD-byte, data, pAddr, vAddr, DATA)

  64  T:  vAddr ← ((offset_15)^48 || offset_{15..0}) + GPR[base]
          (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
          pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor ReverseEndian^3)
          if BigEndianMem = 1 then
              pAddr ← pAddr_{PSIZE-1..2} || 0^2
          endif
          byte ← vAddr_{1..0} xor BigEndianCPU^2
          if (vAddr_2 xor BigEndianCPU) = 0 then
              data ← 0^32 || GPR[rt]_{31-8*byte..0} || 0^{8*byte}
          else
              data ← GPR[rt]_{31-8*byte..0} || 0^{8*byte} || 0^32
          endif
          StoreMemory (uncached, WORD-byte, data, pAddr, vAddr, DATA)

  Given a doubleword in a register and a doubleword in memory, the operation of SWR is as follows:

    Register:  A B C D E F G H
    Memory:    I J K L M N O P

                 BigEndianCPU = 0                      BigEndianCPU = 1 (Note)
    vAddr_{2..0} Destination  Type  Offset             Destination  Type  Offset
                                    LEM  BEM (Note)                       LEM  BEM
    0            IJKLEFGH     3     0    4             HJKLMNOP     0     7    0
    1            IJKLFGHP     2     1    4             GHKLMNOP     1     6    0
    2            IJKLGHOP     1     2    4             FGHLMNOP     2     5    0
    3            IJKLHNOP     0     3    4             EFGHMNOP     3     4    0
    4            EFGHMNOP     3     4    0             IJKLHNOP     0     3    4
    5            FGHLMNOP     2     5    0             IJKLGHOP     1     2    4
    6            GHKLMNOP     1     6    0             IJKLFGHP     2     1    4
    7            HJKLMNOP     0     7    0             IJKLEFGH     3     0    4

  Note    For VR4131 only

  Remark  Type:   access type (see Figure 2-2) sent to memory
          Offset: pAddr_{2..0} sent to memory
          LEM:    little-endian memory (BigEndianMem = 0)
          BEM:    big-endian memory (BigEndianMem = 1)

Exceptions:
  TLB refill exception
  TLB invalid exception
  TLB modified exception
  Bus error exception
  Address error exception
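As a check on the byte movement described for SWL and SWR, the Python sketch below stores a word at each of the four possible alignments and reads it back. It assumes little-endian memory with BigEndianCPU = 0, models memory as a bytearray, and ignores translation and exceptions; the function names are hypothetical.

```python
def swl_le(mem, vaddr, reg):
    """SWL model: from the most-significant register byte downward,
    starting at vaddr and ending at the low-order byte of that word."""
    for k in range((vaddr & 3) + 1):
        mem[vaddr - k] = (reg >> (8 * (3 - k))) & 0xFF


def swr_le(mem, vaddr, reg):
    """SWR model: from the least-significant register byte upward,
    starting at vaddr and ending at the high-order byte of that word."""
    for k in range(4 - (vaddr & 3)):
        mem[vaddr + k] = (reg >> (8 * k)) & 0xFF


def store_unaligned_word(mem, addr, reg):
    swr_le(mem, addr, reg)       # low-order part
    swl_le(mem, addr + 3, reg)   # high-order part


value = 0x41424344  # bytes "ABCD"
results = []
for addr in range(4):
    mem = bytearray(8)
    store_unaligned_word(mem, addr, value)
    results.append(int.from_bytes(mem[addr:addr + 4], "little") == value)
print(results)  # prints [True, True, True, True]
```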
SYNC    Synchronize

  Bits     31-26     25-6                     5-0
  Field    SPECIAL   0                        SYNC
  Value    000000    00000000000000000000     001111

Format:
  SYNC

Description:
  The SYNC instruction is executed as a NOP on the VR4100 Series. This operation is compatible with code compiled for the VR4000.

  This instruction is defined for the purpose of maintaining software compatibility with the VR4000 and VR4400.

Operation:
  32, 64  T:  SyncOperation ()

Exceptions:
  None
SYSCALL    System Call

  Bits     31-26     25-6     5-0
  Field    SPECIAL   code     SYSCALL
  Value    000000             001100

Format:
  SYSCALL

Description:
  A system call exception occurs by executing this instruction, immediately and unconditionally transferring control to the exception handler.

  The code field is available for use as software parameters, but is retrieved by the exception handler only by loading the contents of the memory word containing the instruction.

Operation:
  32, 64  T:  SystemCallException

Exceptions:
  System call exception
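Because the code field is only available by reading the instruction word itself, a handler has to extract bits 25-6 from that word. The Python sketch below shows the bit manipulation on an integer; how the handler locates and loads the word is outside its scope, and the function names are hypothetical.

```python
SYSCALL_FUNCT = 0b001100  # bits 5..0, from the SYSCALL encoding


def encode_syscall(code=0):
    """Build a SYSCALL instruction word carrying a 20-bit code."""
    if not 0 <= code < (1 << 20):
        raise ValueError("code must fit in 20 bits")
    return (code << 6) | SYSCALL_FUNCT


def syscall_code(instruction_word):
    """Extract the code field (bits 25..6) from a SYSCALL instruction word."""
    if (instruction_word >> 26) != 0 or (instruction_word & 0x3F) != SYSCALL_FUNCT:
        raise ValueError("not a SYSCALL instruction")
    return (instruction_word >> 6) & 0xFFFFF


word = encode_syscall(0x12345)
print(f"{word:#010x}", hex(syscall_code(word)))  # prints 0x0048d14c 0x12345
```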
TEQ    Trap If Equal

  Bits     31-26     25-21    20-16    15-6     5-0
  Field    SPECIAL   rs       rt       code     TEQ
  Value    000000                               110100

Format:
  TEQ rs, rt

Description:
  The contents of general register rt are compared to general register rs. If the contents of general register rs are equal to the contents of general register rt, a trap exception occurs.

  The code field is available for use as software parameters, but is retrieved by the exception handler only by loading the contents of the memory word containing the instruction.

Operation:
  32, 64  T:  if GPR[rs] = GPR[rt] then
                  TrapException
              endif

Exceptions:
  Trap exception
TEQI    Trap If Equal Immediate

  Bits     31-26     25-21    20-16    15-0
  Field    REGIMM    rs       TEQI     immediate
  Value    000001             01100

Format:
  TEQI rs, immediate

Description:
  The 16-bit immediate is sign-extended and compared to the contents of general register rs. If the contents of general register rs are equal to the sign-extended immediate, a trap exception occurs.

Operation:
  32  T:  if GPR[rs] = (immediate_15)^16 || immediate_{15..0} then
              TrapException
          endif

  64  T:  if GPR[rs] = (immediate_15)^48 || immediate_{15..0} then
              TrapException
          endif

Exceptions:
  Trap exception
TGE    Trap If Greater Than Or Equal

  Bits     31-26     25-21    20-16    15-6     5-0
  Field    SPECIAL   rs       rt       code     TGE
  Value    000000                               110000

Format:
  TGE rs, rt

Description:
  The contents of general register rt are compared to the contents of general register rs. Considering both quantities as signed integers, if the contents of general register rs are greater than or equal to the contents of general register rt, a trap exception occurs.

  The code field is available for use as software parameters, but is retrieved by the exception handler only by loading the contents of the memory word containing the instruction.

Operation:
  32, 64  T:  if GPR[rs] ≥ GPR[rt] then
                  TrapException
              endif

Exceptions:
  Trap exception
TGEI    Trap If Greater Than Or Equal Immediate

  Bits     31-26     25-21    20-16    15-0
  Field    REGIMM    rs       TGEI     immediate
  Value    000001             01000

Format:
  TGEI rs, immediate

Description:
  The 16-bit immediate is sign-extended and compared to the contents of general register rs. Considering both quantities as signed integers, if the contents of general register rs are greater than or equal to the sign-extended immediate, a trap exception occurs.

Operation:
  32  T:  if GPR[rs] ≥ (immediate_15)^16 || immediate_{15..0} then
              TrapException
          endif

  64  T:  if GPR[rs] ≥ (immediate_15)^48 || immediate_{15..0} then
              TrapException
          endif

Exceptions:
  Trap exception
TGEIU    Trap If Greater Than Or Equal Immediate Unsigned

Bits:    31...26         25...21   20...16       15...0
Fields:  REGIMM 000001   rs        TGEIU 01001   immediate

Format: TGEIU rs, immediate

Description: The 16-bit immediate is sign-extended and compared to the contents of general register rs. Considering both quantities as unsigned integers, if the contents of general register rs are greater than or equal to the sign-extended immediate, a trap exception occurs.

Operation:
32  T: if (0 || GPR [rs]) ≥ (0 || (immediate_15)^16 || immediate_15...0) then
         TrapException
       endif
64  T: if (0 || GPR [rs]) ≥ (0 || (immediate_15)^48 || immediate_15...0) then
         TrapException
       endif

Exceptions: Trap exception
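The combination of sign extension with an unsigned comparison is easy to misread, so a small model may help. This is a sketch under the assumption that registers are modelled as unsigned integers of `width` bits; `tgeiu_traps` is a hypothetical name, not something defined by the manual.

```python
def tgeiu_traps(gpr_rs: int, imm16: int, width: int = 32) -> bool:
    """Model TGEIU: sign-extend the immediate, then compare as unsigned values."""
    mask = (1 << width) - 1
    imm16 &= 0xFFFF
    if imm16 & 0x8000:
        extended = imm16 | (mask & ~0xFFFF)  # upper bits filled with ones
    else:
        extended = imm16
    # Python integers are unbounded, so comparing the masked values directly
    # plays the role of the "0 ||" prefix in the operation pseudocode.
    return (gpr_rs & mask) >= extended
```

With an immediate of 0x8000 the comparison value becomes 0xFFFF8000 in 32-bit mode, so a register holding 0x00010000 does not trap in this model even though it is larger than 0x8000.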
TGEU    Trap If Greater Than Or Equal Unsigned

Bits:    31...26          25...21   20...16   15...6   5...0
Fields:  SPECIAL 000000   rs        rt        code     TGEU 110001

Format: TGEU rs, rt

Description: The contents of general register rt are compared to the contents of general register rs. Considering both quantities as unsigned integers, if the contents of general register rs are greater than or equal to the contents of general register rt, a trap exception occurs. The code field is available for use as software parameters, but is retrieved by the exception handler only by loading the contents of the memory word containing the instruction.

Operation:
32, 64  T: if (0 || GPR [rs]) ≥ (0 || GPR [rt]) then
             TrapException
           endif

Exceptions: Trap exception
TLBP    Probe TLB For Matching Entry

Bits:    31...26       25     24...6   5...0
Fields:  COP0 010000   CO 1   0        TLBP 001000

Format: TLBP

Description: The Index register is loaded with the address of the TLB entry whose contents match the contents of the EntryHi register. If no TLB entry matches, the high-order bit of the Index register is set. The architecture does not specify the operation of memory references associated with the instruction immediately after a TLBP instruction, nor is the operation specified if more than one TLB entry matches.

Operation:
32  T: Index ← 1 || 0^25 || undefined^6
       for i in 0...TLBEntries - 1
         if (TLB [i]_95...77 = EntryHi_31...13) and
            (TLB [i]_76 or (TLB [i]_71...64 = EntryHi_7...0)) then
           Index ← 0^26 || i_5...0
         endif
       endfor
64  T: Index ← 1 || 0^25 || undefined^6
       for i in 0...TLBEntries - 1
         if (TLB [i]_167...141 and not (0^15 || TLB [i]_216...205))
            = (EntryHi_39...13 and not (0^15 || TLB [i]_216...205)) and
            (TLB [i]_140 or (TLB [i]_135...126 = EntryHi_7...0)) then
           Index ← 0^26 || i_5...0
         endif
       endfor

Exceptions: Coprocessor unusable exception
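The 32-bit probe loop can be sketched in software as follows. This is only an illustrative model: the `TlbEntry` class, the `tlbp32` function, and the choice to leave the six undefined low bits at zero on a miss are assumptions made here, and the field names simply label the bit ranges used in the 32-bit operation (bits 95 to 77, bit 76, bits 71 to 64).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TlbEntry:
    vpn2: int   # TLB[i] bits 95...77, compared with EntryHi bits 31...13
    g: bool     # TLB[i] bit 76, the global bit
    asid: int   # TLB[i] bits 71...64, compared with EntryHi bits 7...0


PROBE_FAILED = 1 << 31  # high-order bit of Index set when nothing matches


def tlbp32(tlb: List[TlbEntry], entryhi: int) -> int:
    """Return the modelled Index value after a 32-bit-mode TLBP."""
    index = PROBE_FAILED  # the manual leaves the low 6 bits undefined; 0 here
    vpn2 = (entryhi >> 13) & 0x7FFFF
    asid = entryhi & 0xFF
    for i, entry in enumerate(tlb):
        if entry.vpn2 == vpn2 and (entry.g or entry.asid == asid):
            index = i & 0x3F
    return index
```

A global entry matches regardless of the ASID in EntryHi, while a non-global entry matches only when both the virtual page number and the ASID agree.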
TLBR    Read Indexed TLB Entry

Bits:    31...26       25     24...6   5...0
Fields:  COP0 010000   CO 1   0        TLBR 000001

Format: TLBR

Description: The EntryHi and EntryLo registers are loaded with the contents of the TLB entry pointed at by the contents of the Index register. The G bit (which controls ASID matching) read from the TLB is written into both of the EntryLo0 and EntryLo1 registers. The operation is invalid (and the results are unspecified) if the contents of the Index register are greater than the number of TLB entries in the processor.

Operation:
32  T: PageMask ← TLB [Index_5...0]_127...96
       EntryHi  ← TLB [Index_5...0]_95...64 and not TLB [Index_5...0]_127...96
       EntryLo1 ← TLB [Index_5...0]_63...33 || TLB [Index_5...0]_76
       EntryLo0 ← TLB [Index_5...0]_31...1 || TLB [Index_5...0]_76
64  T: PageMask ← TLB [Index_5...0]_255...192
       EntryHi  ← TLB [Index_5...0]_191...128 and not TLB [Index_5...0]_255...192
       EntryLo1 ← TLB [Index_5...0]_127...65 || TLB [Index_5...0]_140
       EntryLo0 ← TLB [Index_5...0]_63...1 || TLB [Index_5...0]_140

Exceptions: Coprocessor unusable exception
TLBWI    Write Indexed TLB Entry

Bits:    31...26       25     24...6   5...0
Fields:  COP0 010000   CO 1   0        TLBWI 000010

Format: TLBWI

Description: The TLB entry pointed at by the contents of the Index register is loaded with the contents of the EntryHi and EntryLo registers. The G bit of the TLB is written with the logical AND of the G bits in the EntryLo0 and EntryLo1 registers. The operation is invalid (and the results are unspecified) if the contents of the Index register are greater than the number of TLB entries in the processor.

Operation:
32, 64  T: TLB [Index_5...0] ← PageMask || (EntryHi and not PageMask) || EntryLo1 || EntryLo0

Exceptions: Coprocessor unusable exception
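The concatenation in the TLBWI operation and the AND of the two G bits can be modelled as below. This is a rough sketch for 32-bit mode only, assuming each register is 32 bits wide and that the G bit is bit 0 of each EntryLo register; `tlb_entry32` and `tlb_g_bit` are hypothetical helper names.

```python
WORD = 0xFFFFFFFF


def tlb_entry32(pagemask: int, entryhi: int, entrylo1: int, entrylo0: int) -> int:
    """Build PageMask || (EntryHi and not PageMask) || EntryLo1 || EntryLo0."""
    masked_hi = entryhi & ~pagemask & WORD
    return ((pagemask & WORD) << 96) | (masked_hi << 64) \
        | ((entrylo1 & WORD) << 32) | (entrylo0 & WORD)


def tlb_g_bit(entrylo0: int, entrylo1: int) -> int:
    """G bit written to the TLB: logical AND of the G bits of both EntryLo registers."""
    return (entrylo0 & 1) & (entrylo1 & 1)
```

In this model the entry is global only when software sets G in both EntryLo0 and EntryLo1 before the write.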
TLBWR    Write Random TLB Entry

Bits:    31...26       25     24...6   5...0
Fields:  COP0 010000   CO 1   0        TLBWR 000110

Format: TLBWR

Description: The TLB entry pointed at by the contents of the Random register is loaded with the contents of the EntryHi and EntryLo registers. The G bit of the TLB is written with the logical AND of the G bits in the EntryLo0 and EntryLo1 registers.

Operation:
32, 64  T: TLB [Random_5...0] ← PageMask || (EntryHi and not PageMask) || EntryLo1 || EntryLo0

Exceptions: Coprocessor unusable exception
TLT    Trap If Less Than

Bits:    31...26          25...21   20...16   15...6   5...0
Fields:  SPECIAL 000000   rs        rt        code     TLT 110010

Format: TLT rs, rt

Description: The contents of general register rt are compared to the contents of general register rs. Considering both quantities as signed integers, if the contents of general register rs are less than the contents of general register rt, a trap exception occurs. The code field is available for use as software parameters, but is retrieved by the exception handler only by loading the contents of the memory word containing the instruction.

Operation:
32, 64  T: if GPR [rs] < GPR [rt] then
             TrapException
           endif

Exceptions: Trap exception
TLTI    Trap If Less Than Immediate

Bits:    31...26         25...21   20...16      15...0
Fields:  REGIMM 000001   rs        TLTI 01010   immediate

Format: TLTI rs, immediate

Description: The 16-bit immediate is sign-extended and compared to the contents of general register rs. Considering both quantities as signed integers, if the contents of general register rs are less than the sign-extended immediate, a trap exception occurs.

Operation:
32  T: if GPR [rs] < (immediate_15)^16 || immediate_15...0 then
         TrapException
       endif
64  T: if GPR [rs] < (immediate_15)^48 || immediate_15...0 then
         TrapException
       endif

Exceptions: Trap exception
TLTIU    Trap If Less Than Immediate Unsigned

Bits:    31...26         25...21   20...16       15...0
Fields:  REGIMM 000001   rs        TLTIU 01011   immediate

Format: TLTIU rs, immediate

Description: The 16-bit immediate is sign-extended and compared to the contents of general register rs. Considering both quantities as unsigned integers, if the contents of general register rs are less than the sign-extended immediate, a trap exception occurs.

Operation:
32  T: if (0 || GPR [rs]) < (0 || (immediate_15)^16 || immediate_15...0) then
         TrapException
       endif
64  T: if (0 || GPR [rs]) < (0 || (immediate_15)^48 || immediate_15...0) then
         TrapException
       endif

Exceptions: Trap exception
TLTU    Trap If Less Than Unsigned

Bits:    31...26          25...21   20...16   15...6   5...0
Fields:  SPECIAL 000000   rs        rt        code     TLTU 110011

Format: TLTU rs, rt

Description: The contents of general register rt are compared to the contents of general register rs. Considering both quantities as unsigned integers, if the contents of general register rs are less than the contents of general register rt, a trap exception occurs. The code field is available for use as software parameters, but is retrieved by the exception handler only by loading the contents of the memory word containing the instruction.

Operation:
32, 64  T: if (0 || GPR [rs]) < (0 || GPR [rt]) then
             TrapException
           endif

Exceptions: Trap exception
TNE    Trap If Not Equal

Bits:    31...26          25...21   20...16   15...6   5...0
Fields:  SPECIAL 000000   rs        rt        code     TNE 110110

Format: TNE rs, rt

Description: The contents of general register rt are compared to the contents of general register rs. If the contents of general register rs are not equal to the contents of general register rt, a trap exception occurs. The code field is available for use as software parameters, but is retrieved by the exception handler only by loading the contents of the memory word containing the instruction.

Operation:
32, 64  T: if GPR [rs] ≠ GPR [rt] then
             TrapException
           endif

Exceptions: Trap exception
TNEI    Trap If Not Equal Immediate

Bits:    31...26         25...21   20...16      15...0
Fields:  REGIMM 000001   rs        TNEI 01110   immediate

Format: TNEI rs, immediate

Description: The 16-bit immediate is sign-extended and compared to the contents of general register rs. If the contents of general register rs are not equal to the sign-extended immediate, a trap exception occurs.

Operation:
32  T: if GPR [rs] ≠ (immediate_15)^16 || immediate_15...0 then
         TrapException
       endif
64  T: if GPR [rs] ≠ (immediate_15)^48 || immediate_15...0 then
         TrapException
       endif

Exceptions: Trap exception
XOR    Exclusive Or

Bits:    31...26          25...21   20...16   15...11   10...6   5...0
Fields:  SPECIAL 000000   rs        rt        rd        00000    XOR 100110

Format: XOR rd, rs, rt

Description: The contents of general register rs are combined with the contents of general register rt in a bit-wise logical exclusive OR operation. The result is placed into general register rd.

Operation:
32, 64  T: GPR [rd] ← GPR [rs] xor GPR [rt]

Exceptions: None
XORI    Exclusive Or Immediate

Bits:    31...26       25...21   20...16   15...0
Fields:  XORI 001110   rs        rt        immediate

Format: XORI rt, rs, immediate

Description: The 16-bit immediate is zero-extended and combined with the contents of general register rs in a bit-wise logical exclusive OR operation. The result is placed into general register rt.

Operation:
32  T: GPR [rt] ← GPR [rs] xor (0^16 || immediate)
64  T: GPR [rt] ← GPR [rs] xor (0^48 || immediate)

Exceptions: None
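Unlike the trap-immediate instructions, XORI zero-extends its immediate, so only the low 16 bits of the register can change. A minimal sketch, with the hypothetical name `xori` and registers modelled as unsigned integers of `width` bits:

```python
def xori(gpr_rs: int, imm16: int, width: int = 32) -> int:
    """Model GPR[rt] <- GPR[rs] xor (0^(width-16) || immediate)."""
    mask = (1 << width) - 1
    # Masking to 16 bits is the zero extension: the upper bits stay zero.
    return (gpr_rs ^ (imm16 & 0xFFFF)) & mask
```

For example, applying an immediate of 0xFFFF to a register of all ones clears only the low halfword in this model.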
9.4 CPU Instruction Opcode Bit Encoding

The remainder of this chapter presents the opcode bit encoding for the CPU instruction set (ISA and extensions), as implemented by the VR4100 Series. Figure 9-1 lists the VR4100 Series opcode bit encoding.

Figure 9-1. CPU Instruction Opcode Bit Encoding (1/3)

Opcode (rows: bits 31...29, columns: bits 28...26)

       0         1         2         3         4         5         6         7
  0    SPECIAL   REGIMM    J         JAL       BEQ       BNE       BLEZ      BGTZ
  1    ADDI      ADDIU     SLTI      SLTIU     ANDI      ORI       XORI      LUI
  2    COP0                          *         BEQL      BNEL      BLEZL     BGTZL
  3    DADDI     DADDIU    LDL       LDR       *         JALX      *         *
  4    LB        LH        LWL       LW        LBU       LHU       LWR       LWU
  5    SB        SH        SWL       SW        SDL       SDR       SWR       CACHE
  6    *                             *         *                             LD
  7    *                             *         *                             SD

SPECIAL function (rows: bits 5...3, columns: bits 2...0)

       0         1         2         3         4         5         6         7
  0    SLL       *         SRL       SRA       SLLV      *         SRLV      SRAV
  1    JR        JALR      *         *         SYSCALL   BREAK     *         SYNC
  2    MFHI      MTHI      MFLO      MTLO      DSLLV     *         DSRLV     DSRAV
  3    MULT      MULTU     DIV       DIVU      DMULT     DMULTU    DDIV      DDIVU
  4    ADD       ADDU      SUB       SUBU      AND       OR        XOR       NOR
  5    Note 1    Note 2    SLT       SLTU      DADD      DADDU     DSUB      DSUBU
  6    TGE       TGEU      TLT       TLTU      TEQ       *         TNE       *
  7    DSLL      *         DSRL      DSRA      DSLL32    *         DSRL32    DSRA32

REGIMM rt (rows: bits 20...19, columns: bits 18...16)

       0         1         2         3         4         5         6         7
  0    BLTZ      BGEZ      BLTZL     BGEZL     *         *         *         *
  1    TGEI      TGEIU     TLTI      TLTIU     TEQI      *         TNEI      *
  2    BLTZAL    BGEZAL    BLTZALL   BGEZALL   *         *         *         *
  3    *         *         *         *         *         *         *         *

Notes 1. VR4121, VR4122, VR4131, VR4181A: MACC
         VR4181: MADD16
      2. VR4121, VR4122, VR4131, VR4181A: DMACC
         VR4181: DMADD16
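The row and column indices of the opcode table correspond directly to bit fields of the instruction word, which the following sketch illustrates. Only three rows of the table are transcribed (rows 0, 1 and 4); the dictionary `OPCODE_ROWS` and the function `lookup_opcode` are names invented for this example, and rows that are not transcribed simply return `None`.

```python
from typing import Optional

# Rows are indexed by bits 31...29, columns by bits 28...26 (Figure 9-1, 1/3).
OPCODE_ROWS = {
    0: ["SPECIAL", "REGIMM", "J", "JAL", "BEQ", "BNE", "BLEZ", "BGTZ"],
    1: ["ADDI", "ADDIU", "SLTI", "SLTIU", "ANDI", "ORI", "XORI", "LUI"],
    4: ["LB", "LH", "LWL", "LW", "LBU", "LHU", "LWR", "LWU"],
}


def lookup_opcode(word: int) -> Optional[str]:
    """Map a 32-bit instruction word to its major opcode mnemonic, if transcribed."""
    opcode = (word >> 26) & 0x3F
    row, col = opcode >> 3, opcode & 0x7
    names = OPCODE_ROWS.get(row)
    return names[col] if names else None
```

A word whose top six bits are 001110 falls in row 1, column 6, which is XORI, matching the encoding given on the XORI page.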
Figure 9-1. CPU Instruction Opcode Bit Encoding (2/3)

COP0 rs (rows: bits 25...24, columns: bits 23...21)

       0         1         2         3         4         5         6         7
  0    MF        DMF       ?         ?         MT        DMT       ?         ?
  1    BC        ?         ?         ?         ?         ?         ?         ?
  2    CO
  3    CO

COP0 rt (rows: bits 20...19, columns: bits 18...16)

       0         1         2         3         4         5         6         7
  0    BCF       BCT       BCFL      BCTL      ?         ?         ?         ?
  1    ?         ?         ?         ?         ?         ?         ?         ?
  2    ?         ?         ?         ?         ?         ?         ?         ?
  3    ?         ?         ?         ?         ?         ?         ?         ?

CP0 function (rows: bits 5...3, columns: bits 2...0)

       0         1         2         3         4         5         6         7
  0    ?         TLBR      TLBWI     ?         ?         ?         TLBWR     ?
  1    TLBP      ?         ?         ?         ?         ?         ?         ?
  2    ?         ?         ?         ?         ?         ?         ?         ?
  3    ERET      ?         ?         ?         ?         ?         ?         ?
  4    ?         STANDBY   SUSPEND   HIBERNATE ?         ?         ?         ?
  5    ?         ?         ?         ?         ?         ?         ?         ?
  6    ?         ?         ?         ?         ?         ?         ?         ?
  7    ?         ?         ?         ?         ?         ?         ?         ?
Figure 9-1. CPU Instruction Opcode Bit Encoding (3/3)

Key:
*  Operation codes marked with an asterisk cause reserved instruction exceptions in all current implementations and are reserved for future versions of the architecture.
γ  Operation codes marked with a gamma cause a reserved instruction exception. They are reserved for future versions of the architecture.
δ  Operation codes marked with a delta are valid only for processors conforming to the MIPS III instruction set or later with CP0 enabled, and cause a reserved instruction exception on other processors.
φ  Operation codes marked with a phi are invalid but do not cause reserved instruction exceptions in VR4100 Series implementations.
ξ  Operation codes marked with a xi cause a reserved instruction exception on VR4100 Series processors.
χ  Operation codes marked with a chi are valid only on processors conforming to the MIPS III instruction set or later.
ε  Operation codes marked with an epsilon are valid when the processor is operating in 64-bit mode or in 32-bit Kernel mode. These instructions cause a reserved instruction exception if the processor operates in 32-bit User or Supervisor mode.
π  Operation codes marked with a pi are invalid and cause a coprocessor unusable exception on VR4100 Series processors.
θ  Operation codes marked with a theta are valid when MIPS16 instruction execution is enabled, and cause a reserved instruction exception when MIPS16 instruction execution is disabled.
Chapter 10 MIPS16 Instruction Set Format

This chapter describes, in alphabetical order, the format of each MIPS16 instruction and the format of the 32-bit MIPS instruction into which it is converted. For details of MIPS16 instruction conversion and opcodes, refer to Chapter 3 MIPS16 Instruction Set.

Caution  For some instructions, the format or syntax may no longer be valid after conversion to a 32-bit instruction. For details of the formats and syntax of 32-bit instructions, refer to Chapter 2 CPU Instruction Set Summary and Chapter 9 CPU Instruction Set Details.
ADDIU    Add Immediate Unsigned (1/2)

ADDIU ry, rx, immediate
  32-bit:  31...26 ADDIU 001001 | 25...21 Trx | 20...16 Try | 15...4 sign | 3...0 immediate
  MIPS16:  15...11 RRI-A 01000 | 10...8 rx | 7...5 ry | 4 ADDIU 0 | 3...0 immediate

ADDIU rx, immediate
  32-bit:  31...26 ADDIU 001001 | 25...21 Trx | 20...16 Trx | 15...8 sign | 7...0 immediate
  MIPS16:  15...11 ADDIU8 01001 | 10...8 rx | 7...0 immediate

ADDIU sp, immediate
  32-bit:  31...26 ADDIU 001001 | 25...21 sp 11101 | 20...16 sp 11101 | 15...11 sign | 10...3 immediate | 2...0 000
  MIPS16:  15...11 I8 01100 | 10...8 ADJSP 011 | 7...0 immediate
ADDIU    Add Immediate Unsigned (2/2)

ADDIU rx, pc, immediate
  32-bit:  31...26 ADDIU 001001 | 25...21 00000 (Note) | 20...16 Trx | 15...10 000000 | 9...2 immediate | 1...0 00
  MIPS16:  15...11 ADDIUPC 00001 | 10...8 rx | 7...0 immediate

Note  Zeros are shown in the field of bits 21 to 25 as placeholders. The 32-bit PC-relative instruction format shown above is provided here only to make the description complete; it is not a valid 32-bit MIPS instruction. See Chapter 3 for a complete definition of the semantics of the MIPS16 PC-relative instructions.

ADDIU rx, sp, immediate
  32-bit:  31...26 ADDIU 001001 | 25...21 sp 11101 | 20...16 Trx | 15...10 000000 | 9...2 immediate | 1...0 00
  MIPS16:  15...11 ADDIUSP 00000 | 10...8 rx | 7...0 immediate
ADDU    Add Unsigned

ADDU rz, rx, ry
  32-bit:  31...26 SPECIAL 000000 | 25...21 Trx | 20...16 Try | 15...11 Trz | 10...6 00000 | 5...0 ADDU 100001
  MIPS16:  15...11 RRR 11100 | 10...8 rx | 7...5 ry | 4...2 rz | 1...0 ADDU 01

AND    And

AND rx, ry
  32-bit:  31...26 SPECIAL 000000 | 25...21 Trx | 20...16 Try | 15...11 Trx | 10...6 00000 | 5...0 AND 100100
  MIPS16:  15...11 RR 11101 | 10...8 rx | 7...5 ry | 4...0 AND 01100
B    Branch Unconditional

B immediate
  32-bit:  31...26 BEQ 000100 | 25...21 zero 00000 | 20...16 zero 00000 | 15...11 sign | 10...0 immediate (Note)
  MIPS16:  15...11 B 00010 | 10...0 immediate

Note  In MIPS16 mode, the branch offset is interpreted as halfword aligned. This is unlike 32-bit MIPS mode, which interprets the offset value as word aligned. The 32-bit branch instruction format shown above is provided here only to make the description complete; it is not a valid 32-bit MIPS instruction. See Chapter 2 and Chapter 9 for a complete definition of the semantics of the branch instructions.

BEQZ    Branch On Equal To Zero

BEQZ rx, immediate
  32-bit:  31...26 BEQ 000100 | 25...21 Trx | 20...16 zero 00000 | 15...8 sign | 7...0 immediate (Note)
  MIPS16:  15...11 BEQZ 00100 | 10...8 rx | 7...0 immediate

Note  In MIPS16 mode, the branch offset is interpreted as halfword aligned. This is unlike 32-bit MIPS mode, which interprets the offset value as word aligned. The 32-bit branch instruction format shown above is provided here only to make the description complete; it is not a valid 32-bit MIPS instruction. See Chapter 2 and Chapter 9 for a complete definition of the semantics of the branch instructions.
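The difference between halfword-aligned and word-aligned offsets amounts to a different left shift of the sign-extended immediate. The sketch below only computes the byte displacement encoded by the immediate field; which address that displacement is added to is defined in Chapter 3 and is not modelled here. The function names are hypothetical, and the immediate widths (11 bits for B, 8 bits for BEQZ) are read from the field diagrams on this page.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def mips16_branch_byte_offset(immediate: int, bits: int) -> int:
    """MIPS16 mode: the offset counts halfwords, so shift left by 1."""
    return sign_extend(immediate, bits) << 1


def mips32_branch_byte_offset(immediate16: int) -> int:
    """32-bit MIPS mode: the offset counts words, so shift left by 2."""
    return sign_extend(immediate16, 16) << 2
```

The same immediate value of 0x10 therefore encodes 32 bytes in a MIPS16 branch and 64 bytes in a 32-bit branch.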
BNEZ    Branch On Not Equal To Zero

BNEZ rx, immediate
  32-bit:  31...26 BNE 000101 | 25...21 Trx | 20...16 zero 00000 | 15...8 sign | 7...0 immediate (Note)
  MIPS16:  15...11 BNEZ 00101 | 10...8 rx | 7...0 immediate

Note  In MIPS16 mode, the branch offset is interpreted as halfword aligned. This is unlike 32-bit MIPS mode, which interprets the offset value as word aligned. The 32-bit branch instruction format shown above is provided here only to make the description complete; it is not a valid 32-bit MIPS instruction. See Chapter 2 and Chapter 9 for a complete definition of the semantics of the branch instructions.

BREAK    Breakpoint

BREAK immediate
  32-bit:  31...26 SPECIAL 000000 | 25...6 code (Note 2) | 5...0 BREAK 001101
  MIPS16:  15...11 RR 11101 | 10...8 rx (Note 1) | 7...5 rx (Note 1) | 4...0 BREAK 00101

Notes 1. The two register fields in the MIPS16 BREAK instruction may be used as a 6-bit code (immediate) field for software parameters. The 6-bit code can be retrieved by the exception handler.
      2. The 32-bit BREAK instruction format shown above is provided here only to make the description complete; it is not a valid 32-bit MIPS instruction. The code field is entirely ignored by the pipeline, and it is not visible in any way to the software executing on the processor.
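An exception handler that wants the 6-bit code from a MIPS16 BREAK can mask it out of the two register fields. This is a sketch assuming the halfword has already been fetched and that the two 3-bit fields (bits 10 to 8 and bits 7 to 5) are read as one contiguous 6-bit value; the helper names are hypothetical.

```python
def encode_mips16_break(code: int) -> int:
    """Assemble RR 11101 | 6-bit code | BREAK 00101 into a 16-bit halfword."""
    return (0b11101 << 11) | ((code & 0x3F) << 5) | 0b00101


def mips16_break_code(halfword: int) -> int:
    """Extract the 6-bit code carried in bits 10...5 of a MIPS16 BREAK."""
    return (halfword >> 5) & 0x3F
```

Round-tripping a value through `encode_mips16_break` and `mips16_break_code` returns the original 6-bit code.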
BTEQZ    Branch On T Equal To Zero

BTEQZ immediate
  32-bit:  31...26 BEQ 000100 | 25...21 T8 11000 | 20...16 zero 00000 | 15...8 sign | 7...0 immediate (Note)
  MIPS16:  15...11 I8 01100 | 10...8 BTEQZ 000 | 7...0 immediate

Note  In MIPS16 mode, the branch offset is interpreted as halfword aligned. This is unlike 32-bit MIPS mode, which interprets the offset value as word aligned. The 32-bit branch instruction format shown above is provided here only to make the description complete; it is not a valid 32-bit MIPS instruction. See Chapter 2 and Chapter 9 for a complete definition of the semantics of the branch instructions.

BTNEZ    Branch On T Not Equal To Zero

BTNEZ immediate
  32-bit:  31...26 BNE 000101 | 25...21 T8 11000 | 20...16 zero 00000 | 15...8 sign | 7...0 immediate (Note)
  MIPS16:  15...11 I8 01100 | 10...8 BTNEZ 001 | 7...0 immediate

Note  In MIPS16 mode, the branch offset is interpreted as halfword aligned. This is unlike 32-bit MIPS mode, which interprets the offset value as word aligned. The 32-bit branch instruction format shown above is provided here only to make the description complete; it is not a valid 32-bit MIPS instruction. See Chapter 2 and Chapter 9 for a complete definition of the semantics of the branch instructions.
CMP    Compare

CMP rx, ry
  32-bit:  31...26 SPECIAL 000000 | 25...21 Trx | 20...16 Try | 15...11 T8 11000 | 10...6 00000 | 5...0 XOR 100110
  MIPS16:  15...11 RR 11101 | 10...8 rx | 7...5 ry | 4...0 CMP 01010

CMPI    Compare Immediate

CMPI rx, immediate
  32-bit:  31...26 XORI 001110 | 25...21 Trx | 20...16 T8 11000 | 15...8 00000000 | 7...0 immediate
  MIPS16:  15...11 CMPI 01110 | 10...8 rx | 7...0 immediate
DADDIU    Doubleword Add Immediate Unsigned (1/2)

DADDIU ry, rx, immediate
  32-bit:  31...26 DADDIU 011001 | 25...21 Trx | 20...16 Try | 15...4 sign | 3...0 immediate
  MIPS16:  15...11 RRI-A 01000 | 10...8 rx | 7...5 ry | 4 DADDIU 1 | 3...0 immediate

DADDIU ry, immediate
  32-bit:  31...26 DADDIU 011001 | 25...21 Try | 20...16 Try | 15...5 sign | 4...0 immediate
  MIPS16:  15...11 I64 11111 | 10...8 DADDIU5 101 | 7...5 ry | 4...0 immediate

DADDIU ry, pc, immediate
  32-bit:  31...26 DADDIU 011001 | 25...21 00000 (Note) | 20...16 Try | 15...7 000000000 | 6...2 immediate | 1...0 00
  MIPS16:  15...11 I64 11111 | 10...8 DADDIUPC 110 | 7...5 ry | 4...0 immediate

Note  Zeros are shown in the field of bits 21 to 25 as placeholders. The 32-bit PC-relative instruction format shown above is provided here only to make the description complete; it is not a valid 32-bit MIPS instruction. See Chapter 3 for a complete definition of the semantics of the MIPS16 PC-relative instructions.
DADDIU    Doubleword Add Immediate Unsigned (2/2)

DADDIU ry, sp, immediate
  32-bit:  31...26 DADDIU 011001 | 25...21 sp 11101 | 20...16 Try | 15...7 000000000 | 6...2 immediate | 1...0 00
  MIPS16:  15...11 I64 11111 | 10...8 DADDIUSP 111 | 7...5 ry | 4...0 immediate

DADDIU sp, immediate
  32-bit:  31...26 DADDIU 011001 | 25...21 sp 11101 | 20...16 sp 11101 | 15...11 sign | 10...3 immediate | 2...0 000
  MIPS16:  15...11 I64 11111 | 10...8 DADJSP 011 | 7...0 immediate

DADDU    Doubleword Add Unsigned

DADDU rz, rx, ry
  32-bit:  31...26 SPECIAL 000000 | 25...21 Trx | 20...16 Try | 15...11 Trz | 10...6 00000 | 5...0 DADDU 101101
  MIPS16:  15...11 RRR 11100 | 10...8 rx | 7...5 ry | 4...2 rz | 1...0 DADDU 00
DDIV    Doubleword Divide

DDIV rx, ry
  32-bit:  31...26 SPECIAL 000000 | 25...21 Trx | 20...16 Try | 15...6 0000000000 | 5...0 DDIV 011110
  MIPS16:  15...11 RR 11101 | 10...8 rx | 7...5 ry | 4...0 DDIV 11110

DDIVU    Doubleword Divide Unsigned

DDIVU rx, ry
  32-bit:  31...26 SPECIAL 000000 | 25...21 Trx | 20...16 Try | 15...6 0000000000 | 5...0 DDIVU 011111
  MIPS16:  15...11 RR 11101 | 10...8 rx | 7...5 ry | 4...0 DDIVU 11111
DIV    Divide

DIV rx, ry
  32-bit:  31...26 SPECIAL 000000 | 25...21 Trx | 20...16 Try | 15...6 0000000000 | 5...0 DIV 011010
  MIPS16:  15...11 RR 11101 | 10...8 rx | 7...5 ry | 4...0 DIV 11010

DIVU    Divide Unsigned

DIVU rx, ry
  32-bit:  31...26 SPECIAL 000000 | 25...21 Trx | 20...16 Try | 15...6 0000000000 | 5...0 DIVU 011011
  MIPS16:  15...11 RR 11101 | 10...8 rx | 7...5 ry | 4...0 DIVU 11011
DMULT    Doubleword Multiply

DMULT rx, ry
  32-bit:  31...26 SPECIAL 000000 | 25...21 Trx | 20...16 Try | 15...6 0000000000 | 5...0 DMULT 011100
  MIPS16:  15...11 RR 11101 | 10...8 rx | 7...5 ry | 4...0 DMULT 11100

DMULTU    Doubleword Multiply Unsigned

DMULTU rx, ry
  32-bit:  31...26 SPECIAL 000000 | 25...21 Trx | 20...16 Try | 15...6 0000000000 | 5...0 DMULTU 011101
  MIPS16:  15...11 RR 11101 | 10...8 rx | 7...5 ry | 4...0 DMULTU 11101
DSLL    Doubleword Shift Left Logical

DSLL rx, ry, immediate
  32-bit:  31...26 SPECIAL 000000 | 25...21 00000 | 20...16 Try | 15...11 Trx | 10...6 sa | 5...0 DSLL 111000
  MIPS16:  15...11 SHIFT 00110 | 10...8 rx | 7...5 ry | 4...2 shamt | 1...0 DSLL 01

DSLLV    Doubleword Shift Left Logical Variable

DSLLV ry, rx
  32-bit:  31...26 SPECIAL 000000 | 25...21 Trx | 20...16 Try | 15...11 Try | 10...6 00000 | 5...0 DSLLV 010100
  MIPS16:  15...11 RR 11101 | 10...8 rx | 7...5 ry | 4...0 DSLLV 10100
DSRA    Doubleword Shift Right Arithmetic

DSRA ry, immediate
  32-bit:  31...26 SPECIAL 000000 | 25...21 00000 | 20...16 Try | 15...11 Try | 10...6 sa | 5...0 DSRA 111011
  MIPS16:  15...11 RR 11101 | 10...8 shamt | 7...5 ry | 4...0 DSRA 10011

DSRAV    Doubleword Shift Right Arithmetic Variable

DSRAV ry, rx
  32-bit:  31...26 SPECIAL 000000 | 25...21 Trx | 20...16 Try | 15...11 Try | 10...6 00000 | 5...0 DSRAV 010111
  MIPS16:  15...11 RR 11101 | 10...8 rx | 7...5 ry | 4...0 DSRAV 10111
DSRL    Doubleword Shift Right Logical

DSRL ry, immediate
  32-bit:  31...26 SPECIAL 000000 | 25...21 00000 | 20...16 Try | 15...11 Try | 10...6 sa | 5...0 DSRL 111010
  MIPS16:  15...11 RR 11101 | 10...8 shamt | 7...5 ry | 4...0 DSRL 01000

DSRLV    Doubleword Shift Right Logical Variable

DSRLV ry, rx
  32-bit:  31...26 SPECIAL 000000 | 25...21 Trx | 20...16 Try | 15...11 Try | 10...6 00000 | 5...0 DSRLV 010110
  MIPS16:  15...11 RR 11101 | 10...8 rx | 7...5 ry | 4...0 DSRLV 10110

DSUBU    Doubleword Subtract Unsigned

DSUBU rz, rx, ry
  32-bit:  31...26 SPECIAL 000000 | 25...21 Trx | 20...16 Try | 15...11 Trz | 10...6 00000 | 5...0 DSUBU 101111
  MIPS16:  15...11 RRR 11100 | 10...8 rx | 7...5 ry | 4...2 rz | 1...0 DSUBU 10
JAL    Jump And Link

JAL target
  32-bit:  31...26 JAL 000011 | 25...0 target address
  MIPS16:  first halfword:   15...11 JAL 00011 | 10 0 | 9...5 immediate 20:16 | 4...0 immediate 25:21
           second halfword:  15...0 immediate 15:0

JALR    Jump And Link Register

JALR ra, rx
  32-bit:  31...26 SPECIAL 000000 | 25...21 Trx | 20...16 00000 | 15...11 ra 11111 | 10...6 00000 | 5...0 JALR 001001
  MIPS16:  15...11 RR 11101 | 10...8 rx | 7...5 010 | 4...0 JALR 00000
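Because the MIPS16 JAL format stores the upper target bits out of order, reassembling the 26-bit target takes a few shifts. The sketch below assumes the layout read from the diagram above: bits 9 to 5 of the first halfword hold immediate 20:16, bits 4 to 0 hold immediate 25:21, bit 10 distinguishes JAL (0) from JALX (1), and the second halfword holds immediate 15:0. The function names are hypothetical.

```python
def mips16_jal_target(first_halfword: int, second_halfword: int) -> int:
    """Rebuild the 26-bit target field from the two halfwords of a MIPS16 JAL/JALX."""
    imm_20_16 = (first_halfword >> 5) & 0x1F
    imm_25_21 = first_halfword & 0x1F
    imm_15_0 = second_halfword & 0xFFFF
    return (imm_25_21 << 21) | (imm_20_16 << 16) | imm_15_0


def mips16_is_jalx(first_halfword: int) -> bool:
    """Bit 10 of the first halfword is 1 for JALX and 0 for JAL."""
    return bool((first_halfword >> 10) & 1)
```

For instance, a first halfword of 0x1955 followed by 0x1234 yields the target field 0x2AA1234 in this model.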
JALX    Jump And Link Exchange

JALX target
  32-bit:  31...26 JALX 011101 | 25...0 target address
  MIPS16:  first halfword:   15...11 JALX 00011 | 10 1 | 9...5 immediate 20:16 | 4...0 immediate 25:21
           second halfword:  15...0 immediate 15:0

JR    Jump Register

JR rx
  32-bit:  31...26 SPECIAL 000000 | 25...21 Trx | 20...6 000000000000000 | 5...0 JR 001000
  MIPS16:  15...11 RR 11101 | 10...8 rx | 7...5 000 | 4...0 JR 00000

JR ra
  32-bit:  31...26 SPECIAL 000000 | 25...21 ra 11111 | 20...6 000000000000000 | 5...0 JR 001000
  MIPS16:  15...11 RR 11101 | 10...8 000 | 7...5 001 | 4...0 JR 00000
LB    Load Byte

LB ry, offset (rx)
  32-bit:  31...26 LB 100000 | 25...21 Trx | 20...16 Try | 15...5 00000000000 | 4...0 immediate
  MIPS16:  15...11 LB 10000 | 10...8 rx | 7...5 ry | 4...0 immediate

LBU    Load Byte Unsigned

LBU ry, offset (rx)
  32-bit:  31...26 LBU 100100 | 25...21 Trx | 20...16 Try | 15...5 00000000000 | 4...0 immediate
  MIPS16:  15...11 LBU 10100 | 10...8 rx | 7...5 ry | 4...0 immediate
LD    Load Doubleword

LD ry, offset (rx)
  32-bit:  31...26 LD 110111 | 25...21 Trx | 20...16 Try | 15...8 00000000 | 7...3 immediate | 2...0 000
  MIPS16:  15...11 LD 00111 | 10...8 rx | 7...5 ry | 4...0 immediate

LD ry, offset (pc)
  32-bit:  31...26 LD 110111 | 25...21 00000 (Note) | 20...16 Try | 15...8 00000000 | 7...3 immediate | 2...0 000
  MIPS16:  15...11 I64 11111 | 10...8 LDPC 100 | 7...5 ry | 4...0 immediate

Note  Zeros are shown in the field of bits 21 to 25 as placeholders. The 32-bit PC-relative instruction format shown above is provided here only to make the description complete; it is not a valid 32-bit MIPS instruction. See Chapter 3 for a complete definition of the semantics of the MIPS16 PC-relative instructions.

LD ry, offset (sp)
  32-bit:  31...26 LD 110111 | 25...21 sp 11101 | 20...16 Try | 15...8 00000000 | 7...3 immediate | 2...0 000
  MIPS16:  15...11 I64 11111 | 10...8 LDSP 000 | 7...5 ry | 4...0 immediate
LH    Load Halfword

LH ry, offset (rx)
  32-bit:  31...26 LH 100001 | 25...21 Trx | 20...16 Try | 15...6 0000000000 | 5...1 immediate | 0 0
  MIPS16:  15...11 LH 10001 | 10...8 rx | 7...5 ry | 4...0 immediate

LHU    Load Halfword Unsigned

LHU ry, offset (rx)
  32-bit:  31...26 LHU 100101 | 25...21 Trx | 20...16 Try | 15...6 0000000000 | 5...1 immediate | 0 0
  MIPS16:  15...11 LHU 10101 | 10...8 rx | 7...5 ry | 4...0 immediate

LI    Load Immediate

LI rx, immediate
  32-bit:  31...26 ORI 001101 | 25...21 zero 00000 | 20...16 Trx | 15...8 00000000 | 7...0 immediate
  MIPS16:  15...11 LI 01101 | 10...8 rx | 7...0 immediate
LW    Load Word

LW ry, offset (rx)
  32-bit:  31...26 LW 100011 | 25...21 Trx | 20...16 Try | 15...7 000000000 | 6...2 immediate | 1...0 00
  MIPS16:  15...11 LW 10011 | 10...8 rx | 7...5 ry | 4...0 immediate

LW rx, offset (pc)
  32-bit:  31...26 LW 100011 | 25...21 00000 (Note) | 20...16 Trx | 15...10 000000 | 9...2 immediate | 1...0 00
  MIPS16:  15...11 LWPC 10110 | 10...8 rx | 7...0 immediate

Note  Zeros are shown in the field of bits 21 to 25 as placeholders. The 32-bit PC-relative instruction format shown above is provided here only to make the description complete; it is not a valid 32-bit MIPS instruction. See Chapter 3 for a complete definition of the semantics of the MIPS16 PC-relative instructions.

LW rx, offset (sp)
  32-bit:  31...26 LW 100011 | 25...21 sp 11101 | 20...16 Trx | 15...10 000000 | 9...2 immediate | 1...0 00
  MIPS16:  15...11 LWSP 10010 | 10...8 rx | 7...0 immediate
LWU    Load Word Unsigned

LWU ry, offset (rx)
  32-bit:  31...26 LWU 100111 | 25...21 Trx | 20...16 Try | 15...7 000000000 | 6...2 immediate | 1...0 00
  MIPS16:  15...11 LWU 10111 | 10...8 rx | 7...5 ry | 4...0 immediate

MFHI    Move From HI Register

MFHI rx
  32-bit:  31...26 SPECIAL 000000 | 25...16 0000000000 | 15...11 Trx | 10...6 00000 | 5...0 MFHI 010000
  MIPS16:  15...11 RR 11101 | 10...8 rx | 7...5 000 | 4...0 MFHI 10000

MFLO    Move From LO Register

MFLO rx
  32-bit:  31...26 SPECIAL 000000 | 25...16 0000000000 | 15...11 Trx | 10...6 00000 | 5...0 MFLO 010010
  MIPS16:  15...11 RR 11101 | 10...8 rx | 7...5 000 | 4...0 MFLO 10010
MOVE    Move

MOVE ry, r32
  32-bit:  31...26 SPECIAL 000000 | 25...21 r32 | 20...16 zero 00000 | 15...11 Try | 10...6 00000 | 5...0 OR 100101
  MIPS16:  15...11 I8 01100 | 10...8 MOVR32 111 | 7...5 ry | 4...0 r32

MOVE r32, rz
  32-bit:  31...26 SPECIAL 000000 | 25...21 Trz | 20...16 zero 00000 | 15...11 r32 | 10...6 00000 | 5...0 OR 100101
  MIPS16:  15...11 I8 01100 | 10...8 MOV32R 101 | 7...5 r32 2:0 | 4...3 r32 4:3 | 2...0 rz
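In the MOVE r32, rz form the 5-bit register number is split into two fields that are stored in swapped order. The following sketch reassembles it, assuming the layout read from the diagram above (r32 2:0 in bits 7 to 5, r32 4:3 in bits 4 to 3, rz in bits 2 to 0); the function names are hypothetical.

```python
def encode_mov32r(r32: int, rz: int) -> int:
    """Assemble I8 01100 | MOV32R 101 | r32[2:0] | r32[4:3] | rz."""
    low = r32 & 0x7          # r32 bits 2:0
    high = (r32 >> 3) & 0x3  # r32 bits 4:3
    return (0b01100 << 11) | (0b101 << 8) | (low << 5) | (high << 3) | (rz & 0x7)


def mov32r_r32(halfword: int) -> int:
    """Recover the 5-bit r32 register number from a MOV32R halfword."""
    low = (halfword >> 5) & 0x7
    high = (halfword >> 3) & 0x3
    return (high << 3) | low
```

Encoding register 29 with rz field 2 gives the halfword 0x65BA in this model, and decoding it returns 29.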
MULT    Multiply

MULT rx, ry
  32-bit:  31...26 SPECIAL 000000 | 25...21 Trx | 20...16 Try | 15...6 0000000000 | 5...0 MULT 011000
  MIPS16:  15...11 RR 11101 | 10...8 rx | 7...5 ry | 4...0 MULT 11000

MULTU    Multiply Unsigned

MULTU rx, ry
  32-bit:  31...26 SPECIAL 000000 | 25...21 Trx | 20...16 Try | 15...6 0000000000 | 5...0 MULTU 011001
  MIPS16:  15...11 RR 11101 | 10...8 rx | 7...5 ry | 4...0 MULTU 11001
NEG    Negate

NEG rx, ry
  32-bit:  31...26 SPECIAL 000000 | 25...21 zero 00000 | 20...16 Try | 15...11 Trx | 10...6 00000 | 5...0 SUBU 100011
  MIPS16:  15...11 RR 11101 | 10...8 rx | 7...5 ry | 4...0 NEG 01011

NOT    Not

NOT rx, ry
  32-bit:  31...26 SPECIAL 000000 | 25...21 zero 00000 | 20...16 Try | 15...11 Trx | 10...6 00000 | 5...0 NOR 100111
  MIPS16:  15...11 RR 11101 | 10...8 rx | 7...5 ry | 4...0 NOT 01111
OR    Or

OR rx, ry
  32-bit:  31...26 SPECIAL 000000 | 25...21 Trx | 20...16 Try | 15...11 Trx | 10...6 00000 | 5...0 OR 100101
  MIPS16:  15...11 RR 11101 | 10...8 rx | 7...5 ry | 4...0 OR 01101

SB    Store Byte

SB ry, offset (rx)
  32-bit:  31...26 SB 101000 | 25...21 Trx | 20...16 Try | 15...5 00000000000 | 4...0 immediate
  MIPS16:  15...11 SB 11000 | 10...8 rx | 7...5 ry | 4...0 immediate
SD    Store Doubleword

SD ry, offset (rx)
  32-bit:  31...26 SD 111111 | 25...21 Trx | 20...16 Try | 15...8 00000000 | 7...3 immediate | 2...0 000
  MIPS16:  15...11 SD 01111 | 10...8 rx | 7...5 ry | 4...0 immediate

SD ry, offset (sp)
  32-bit:  31...26 SD 111111 | 25...21 sp 11101 | 20...16 Try | 15...8 00000000 | 7...3 immediate | 2...0 000
  MIPS16:  15...11 I64 11111 | 10...8 SDSP 001 | 7...5 ry | 4...0 immediate

SD ra, offset (sp)
  32-bit:  31...26 SD 111111 | 25...21 sp 11101 | 20...16 ra 11111 | 15...11 00000 | 10...3 immediate | 2...0 000
  MIPS16:  15...11 I64 11111 | 10...8 SDRASP 010 | 7...0 immediate
SH    Store Halfword

SH ry, offset (rx)
  32-bit:  31...26 SH 101001 | 25...21 Trx | 20...16 Try | 15...6 0000000000 | 5...1 immediate | 0 0
  MIPS16:  15...11 SH 11001 | 10...8 rx | 7...5 ry | 4...0 immediate

SLL    Shift Left Logical

SLL rx, ry, immediate
  32-bit:  31...26 SPECIAL 000000 | 25...21 00000 | 20...16 Try | 15...11 Trx | 10...6 sa | 5...0 SLL 000000
  MIPS16:  15...11 SHIFT 00110 | 10...8 rx | 7...5 ry | 4...2 shamt | 1...0 SLL 00

SLLV    Shift Left Logical Variable

SLLV ry, rx
  32-bit:  31...26 SPECIAL 000000 | 25...21 Trx | 20...16 Try | 15...11 Try | 10...6 00000 | 5...0 SLLV 000100
  MIPS16:  15...11 RR 11101 | 10...8 rx | 7...5 ry | 4...0 SLLV 00100
SLT  Set on Less Than
  MIPS16 syntax:     SLT rx, ry
  MIPS16 encoding:   RR 11101 [15:11] | rx [10:8] | ry [7:5] | SLT 00010 [4:0]
  32-bit equivalent: SPECIAL 000000 [31:26] | trx [25:21] | try [20:16] | t8 11000 [15:11] | 00000 [10:6] | SLT 101010 [5:0]

SLTI  Set on Less Than Immediate
  MIPS16 syntax:     SLTI rx, immediate
  MIPS16 encoding:   SLTI 01010 [15:11] | rx [10:8] | immediate [7:0]
  32-bit equivalent: SLTI 001010 [31:26] | trx [25:21] | t8 11000 [20:16] | 00000000 [15:8] | immediate [7:0]
SLTIU  Set on Less Than Immediate Unsigned
  MIPS16 syntax:     SLTIU rx, immediate
  MIPS16 encoding:   SLTIU 01011 [15:11] | rx [10:8] | immediate [7:0]
  32-bit equivalent: SLTIU 001011 [31:26] | trx [25:21] | t8 11000 [20:16] | 00000000 [15:8] | immediate [7:0]

SLTU  Set on Less Than Unsigned
  MIPS16 syntax:     SLTU rx, ry
  MIPS16 encoding:   RR 11101 [15:11] | rx [10:8] | ry [7:5] | SLTU 00011 [4:0]
  32-bit equivalent: SPECIAL 000000 [31:26] | trx [25:21] | try [20:16] | t8 11000 [15:11] | 00000 [10:6] | SLTU 101011 [5:0]
SRA  Shift Right Arithmetic
  MIPS16 syntax:     SRA rx, ry, immediate
  MIPS16 encoding:   SHIFT 00110 [15:11] | rx [10:8] | ry [7:5] | shamt [4:2] | SRA 11 [1:0]
  32-bit equivalent: SPECIAL 000000 [31:26] | 00000 [25:21] | try [20:16] | trx [15:11] | sa [10:6] | SRA 000011 [5:0]

SRAV  Shift Right Arithmetic Variable
  MIPS16 syntax:     SRAV ry, rx
  MIPS16 encoding:   RR 11101 [15:11] | rx [10:8] | ry [7:5] | SRAV 00111 [4:0]
  32-bit equivalent: SPECIAL 000000 [31:26] | trx [25:21] | try [20:16] | try [15:11] | 00000 [10:6] | SRAV 000111 [5:0]
SRL  Shift Right Logical
  MIPS16 syntax:     SRL rx, ry, immediate
  MIPS16 encoding:   SHIFT 00110 [15:11] | rx [10:8] | ry [7:5] | shamt [4:2] | SRL 10 [1:0]
  32-bit equivalent: SPECIAL 000000 [31:26] | 00000 [25:21] | try [20:16] | trx [15:11] | sa [10:6] | SRL 000010 [5:0]

SRLV  Shift Right Logical Variable
  MIPS16 syntax:     SRLV ry, rx
  MIPS16 encoding:   RR 11101 [15:11] | rx [10:8] | ry [7:5] | SRLV 00110 [4:0]
  32-bit equivalent: SPECIAL 000000 [31:26] | trx [25:21] | try [20:16] | try [15:11] | 00000 [10:6] | SRLV 000110 [5:0]
SW  Store Word
  MIPS16 syntax:     SW ry, offset (rx)
  MIPS16 encoding:   SW 11011 [15:11] | rx [10:8] | ry [7:5] | immediate [4:0]
  32-bit equivalent: SW 101011 [31:26] | trx [25:21] | try [20:16] | 000000000 [15:7] | immediate [6:2] | 00 [1:0]

  MIPS16 syntax:     SW rx, offset (sp)
  MIPS16 encoding:   SWSP 11010 [15:11] | rx [10:8] | immediate [7:0]
  32-bit equivalent: SW 101011 [31:26] | sp 11101 [25:21] | trx [20:16] | 000000 [15:10] | immediate [9:2] | 00 [1:0]

  MIPS16 syntax:     SW ra, offset (sp)
  MIPS16 encoding:   I8 01100 [15:11] | SW RASP 010 [10:8] | immediate [7:0]
  32-bit equivalent: SW 101011 [31:26] | sp 11101 [25:21] | ra 11111 [20:16] | 000000 [15:10] | immediate [9:2] | 00 [1:0]
SUBU  Subtract Unsigned
  MIPS16 syntax:     SUBU rz, rx, ry
  MIPS16 encoding:   RRR 11100 [15:11] | rx [10:8] | ry [7:5] | rz [4:2] | SUBU 11 [1:0]
  32-bit equivalent: SPECIAL 000000 [31:26] | trx [25:21] | try [20:16] | trz [15:11] | 00000 [10:6] | SUBU 100011 [5:0]

SYSCALL  System Call
  MIPS16 syntax:     SYSCALL
  MIPS16 encoding:   RR 11101 [15:11] | 000000 [10:5] | SYSCALL 01001 [4:0]
  32-bit equivalent: SPECIAL 000000 [31:26] | 00000000000000000000 [25:6] | SYSCALL 001100 [5:0]

XOR  Exclusive Or
  MIPS16 syntax:     XOR rx, ry
  MIPS16 encoding:   RR 11101 [15:11] | rx [10:8] | ry [7:5] | XOR 01110 [4:0]
  32-bit equivalent: SPECIAL 000000 [31:26] | trx [25:21] | try [20:16] | trx [15:11] | 00000 [10:6] | XOR 100110 [5:0]
Chapter 11 Coprocessor 0 Hazards

The CPU core of the VR4100 Series avoids contention of its internal resources by causing a pipeline interlock in cases such as when the contents of the destination register of an instruction are used as a source in the succeeding instruction. Therefore, instructions such as NOP do not need to be inserted between instructions.

However, interlocks do not occur for operations related to the CP0 registers and the TLB. Contention of internal resources must therefore be considered when writing a program that manipulates the CP0 registers or the TLB. The CP0 hazards define the number of NOP instructions, or the number of instructions unrelated to the contention, required to avoid contention of internal resources. This chapter describes the CP0 hazards.

The CP0 hazards of the CPU core of the VR4100 Series are equally or less stringent than those of the VR4000. Table 11-1 lists the Coprocessor 0 hazards of the CPU core of the VR4100 Series. Code that complies with these hazards will run without modification on the VR4000 Series.

The contents of the CP0 registers or the bits in the "Source" column of this table can be used as a source after they are fixed. The contents of the CP0 registers or the bits in the "Destination" column of this table become available as a destination after they are stored.

Based on this table, the number of NOP instructions (or instructions unrelated to the contention) required between two TLB-related instructions A and B, where B follows A, is computed by the following formula:

    (Destination hazard number of A) - [(Source hazard number of B) + 1]

As an example, the number of instructions required between an MTC0 and a subsequent MFC0 instruction is:

    5 - (3 + 1) = 1 instruction

The CP0 hazards do not generate pipeline interlocks. Therefore, the required number of instructions must be controlled by the program.
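The formula above can be sketched in a few lines of Python. This is only an illustration: the function name `nops_required` is hypothetical, and clamping the result at zero is an assumption of the sketch rather than something the manual states.

```python
def nops_required(dest_cycles: int, src_cycles: int) -> int:
    """Instructions needed between A (destination hazard) and B (source hazard)."""
    # Formula from the text:
    #   (destination hazard number of A) - [(source hazard number of B) + 1]
    # Clamping at zero is an assumption of this sketch, not taken from the manual.
    return max(0, dest_cycles - (src_cycles + 1))


# MTC0 (destination: 5 cycles) followed by MFC0 (source: 3 cycles),
# the worked example given in the text.
print(nops_required(5, 3))  # 1
```

With the hazard numbers from the worked example, the function returns 1, which agrees with the caution that the instruction immediately following MTC0 must not be MFC0.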
Table 11-1. Coprocessor 0 Hazards (1/2)

(a) VR4121, VR4122, VR4181, and VR4181A

Operation | Source name | Source no. of cycles | Destination name | Destination no. of cycles
----------|-------------|----------------------|------------------|--------------------------
MTC0 | - | - | CPR | 5
MFC0 | CPR | 3 | - | -
TLBR | Index, TLB | 2 | PageMask, EntryHi, EntryLo0, EntryLo1 | 5
TLBWI, TLBWR | Index or Random, PageMask, EntryHi, EntryLo0, EntryLo1 | 2 | TLB | 5
TLBP | PageMask, EntryHi | 2 | Index | 6
ERET | EPC or ErrorEPC, TLB | 2 | Status[EXL], [ERL] | 4
     | Status | 2 | |
CACHE Index_Load_Tag | - | - | TagLo, TagHi, PErr | 5
CACHE Index_Store_Tag | TagLo, TagHi, PErr | 3 | - | -
CACHE Hit ops. | Cache line | 3 | Cache line | 5
Coprocessor usable test | Status[CU], [KSU], [EXL], [ERL] | 2 | - | -
Instruction fetch | EntryHi[ASID], Status[KSU], [EXL], [ERL], [RE], Config[K0] | 2 | - | -
     | TLB | 2 | |
Instruction fetch exception | - | - | EPC, Status | 4
     | | | Cause, BadVAddr, Context, XContext | 5
Interrupt signals | Cause[IP], Status[IM], [IE], [EXL], [ERL] | 2 | - | -
Loads/stores | EntryHi[ASID], Status[KSU], [EXL], [ERL], [RE], Config[K0], TLB | 3 | - | -
     | Config[AD], [EP] | 3 | |
     | WatchHi, WatchLo | 3 | |
Load/store exception | - | - | EPC, Status, Cause, BadVAddr, Context, XContext | 5
TLB shutdown (VR4181 only) | - | - | Status[TS] | 2 (inst.), 4 (data)

Remark: Brackets indicate a bit name or a field name of registers.
Table 11-1. Coprocessor 0 Hazards (2/2)

(b) VR4131

Operation | Source name | Source no. of cycles | Destination name | Destination no. of cycles
----------|-------------|----------------------|------------------|--------------------------
MTC0 | - | - | CPR | 6
MFC0 | CPR | 4 | - | -
TLBR | Index, TLB | 3 | PageMask, EntryHi, EntryLo0, EntryLo1 | 6
TLBWI, TLBWR | Index or Random, PageMask, EntryHi, EntryLo0, EntryLo1 | 3 | TLB | 6
TLBP | PageMask, EntryHi | 3 | Index | 6
ERET | EPC or ErrorEPC, TLB | 5 | Status[EXL], [ERL] | 6
     | Status | 5 | |
CACHE Index_Load_Tag | - | - | TagLo, TagHi, PErr | 6
CACHE Index_Store_Tag | TagLo, TagHi, PErr | 4 | - | -
CACHE Hit ops. | Cache line | 4 | Cache line | 6
Coprocessor usable test | Status[CU], [KSU], [EXL], [ERL] | 2 | - | -
Instruction fetch | EntryHi[ASID], Status[KSU], [EXL], [ERL], [RE], Config[K0] | 2 | - | -
     | TLB | 2 | |
Instruction fetch exception | - | - | EPC, Status | 6
     | | | Cause, BadVAddr, Context, XContext | 6
Interrupt signals | Cause[IP], Status[IM], [IE], [EXL], [ERL] | 2 | - | -
Loads/stores | EntryHi[ASID], Status[KSU], [EXL], [ERL], [RE], Config[K0], TLB | 4 | - | -
     | Config[AD], [EP] | 4 | |
     | WatchHi, WatchLo | 4 | |
Load/store exception | - | - | EPC, Status, Cause, BadVAddr, Context, XContext | 6

Remark: Brackets indicate a bit name or a field name of registers.
Cautions
1. If the setting of the K0 bit in the Config register is changed by MTC0 for the kseg0 or ckseg0 area, the change is reflected at the first to third instruction after MTC0.
2. The instruction following MTC0 must not be MFC0.
3. The five instructions following an MTC0 to the Status register that changes the KSU bit and sets the EXL and ERL bits may be executed in the new mode, and not in Kernel mode. This can be avoided by setting the EXL bit first, leaving the KSU bit set to Kernel, and changing the KSU bit later.
4. If interrupts are disabled by setting the EXL bit in the Status register with MTC0, an interrupt may occur immediately after MTC0 without a change to the contents of the EPC register. This can be avoided by clearing the IE bit first, and setting the EXL bit later.
5. There must be two non-load, non-CACHE instructions between a store and a CACHE instruction directed to the same primary cache line as the store.

The status during execution of each of the following instructions, for which CP0 hazards must be considered, is described below.

(1) MTC0
    Destination: The completion of writing to a destination register (CP0) of MTC0.

(2) MFC0
    Source: The confirmation of a source register (CP0) of MFC0.

(3) TLBR
    Source: The confirmation of the status of the TLB and the Index register before the execution of TLBR.
    Destination: The completion of writing to a destination register (CP0) of TLBR.

(4) TLBWI, TLBWR
    Source: The confirmation of a source register of these instructions and of the registers used to specify a TLB entry.
    Destination: The completion of writing to the TLB by these instructions.

(5) TLBP
    Source: The confirmation of the PageMask register and the EntryHi register before the execution of TLBP.
    Destination: The completion of writing the result of execution of TLBP to the Index register.

(6) ERET
    Source: The confirmation of registers containing information necessary for executing ERET.
    Destination: The completion of the processor state transition by the execution of ERET.

(7) CACHE Index_Load_Tag
    Destination: The completion of writing the results of execution of this instruction to the related registers.

(8) CACHE Index_Store_Tag
    Source: The confirmation of registers containing information necessary for executing this instruction.

(9) Coprocessor usable test
    Source: The confirmation of modes set by the bits of the CP0 registers in the "Source" column.
    Examples
    1. When accessing the CP0 registers in User mode after the CU0 bit of the Status register is modified, or when executing an instruction such as TLB instructions, CACHE instructions, or branch instructions that use the resource of the CP0.
    2. When accessing the CP0 registers in the operating mode set in the Status register after the KSU, EXL, and ERL bits of the Status register are modified.

(10) Instruction fetch
    Source: The confirmation of the operating mode and TLB necessary for instruction fetch.
    Examples
    1. When changing the operating mode from User to Kernel and fetching instructions after the KSU, EXL, and ERL bits of the Status register are modified.
    2. When fetching instructions using the modified TLB entry after TLB modification.

(11) Instruction fetch exception
    Destination: The completion of writing to registers containing information related to the exception when an exception occurs on instruction fetch.

(12) Interrupts
    Source: The confirmation of registers judging the condition of occurrence of an interrupt when an interrupt factor is detected.

(13) Loads/stores
    Source: The confirmation of the operating mode related to the address generation of load/store instructions, TLB entries, the cache mode set in the K0 bit of the Config register, and the registers setting the condition of occurrence of a Watch exception.
    Example
    When loads/stores are executed in the kernel field after changing the mode from User to Kernel.

(14) Load/store exception
    Destination: The completion of writing to registers containing information related to the exception when an exception occurs on a load or store operation.

(15) TLB shutdown (VR4181 only)
    Destination: The completion of writing to the TS bit of the Status register when a TLB shutdown occurs.
Table 11-2 shows calculation examples.

Table 11-2. Calculation Example of CP0 Hazard and Number of Instructions Inserted

In the columns below, "VR412x/418x" stands for VR4121, VR4122, VR4181, and VR4181A.

Destination | Source | Contending internal resource | Instructions inserted (VR412x/418x) | Instructions inserted (VR4131) | Formula (VR412x/418x) | Formula (VR4131)
------------|--------|------------------------------|-------------------------------------|--------------------------------|-----------------------|-----------------
TLBWR/TLBWI | TLBP | TLB entry | 2 | 2 | 5 - (2 + 1) | 6 - (3 + 1)
TLBWR/TLBWI | Load or store using newly modified TLB | TLB entry | 1 | 1 | 5 - (3 + 1) | 6 - (4 + 1)
TLBWR/TLBWI | Instruction fetch using newly modified TLB | TLB entry | 2 | 3 | 5 - (2 + 1) | 6 - (2 + 1)
MTC0 Status[CU] | Coprocessor instruction that requires the setting of CU | Status[CU] | 2 | 3 | 5 - (2 + 1) | 6 - (2 + 1)
TLBR | MFC0 EntryHi | EntryHi | 1 | 1 | 5 - (3 + 1) | 6 - (4 + 1)
MTC0 EntryLo0 | TLBWR/TLBWI | EntryLo0 | 2 | 2 | 5 - (2 + 1) | 6 - (3 + 1)
TLBP | MFC0 Index | Index | 2 | 1 | 6 - (3 + 1) | 6 - (4 + 1)
MTC0 EntryHi | TLBP | EntryHi | 2 | 2 | 5 - (2 + 1) | 6 - (3 + 1)
MTC0 EPC | ERET | EPC | 2 | 0 | 5 - (2 + 1) | 6 - (5 + 1)
MTC0 Status | ERET | Status | 2 | 0 | 5 - (2 + 1) | 6 - (5 + 1)
MTC0 Status[IE] | Instruction that causes an interrupt (Note) | Status[IE] | 2 | 0 | 5 - (2 + 1) | 6 - (5 + 1)

Note: The number of hazards is undefined if the instruction execution sequence is changed by exceptions. In such a case, the minimum number of hazards until the IE bit value is confirmed may be the same as the maximum number of hazards until an interrupt request occurs that is pending and enabled.

Remark: Brackets indicate a bit name or a field name of registers.
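As a rough illustration of how the rows of Table 11-2 follow from Table 11-1, the sketch below pads a pair of operations with NOPs. The cycle counts are copied from Table 11-1 for a handful of operations only. The dictionary layout, the core keys `"VR4121"` (standing for VR4121, VR4122, VR4181, and VR4181A) and `"VR4131"`, the bare mnemonic strings, and the function name `pad_sequence` are all hypothetical choices made for this example.

```python
# Destination and source hazard cycle counts, copied from Table 11-1.
# Only the operations used below are included; the structure is illustrative.
DEST = {
    "VR4121": {"MTC0": 5, "TLBR": 5, "TLBWI": 5, "TLBP": 6},
    "VR4131": {"MTC0": 6, "TLBR": 6, "TLBWI": 6, "TLBP": 6},
}
SRC = {
    "VR4121": {"MFC0": 3, "TLBWI": 2, "TLBP": 2, "ERET": 2},
    "VR4131": {"MFC0": 4, "TLBWI": 3, "TLBP": 3, "ERET": 5},
}


def pad_sequence(core: str, first: str, second: str) -> list[str]:
    """Return [first, NOP..., second] with the gap given by the hazard formula."""
    # (destination hazard of first) - [(source hazard of second) + 1];
    # the clamp at zero is an assumption of this sketch.
    gap = max(0, DEST[core][first] - (SRC[core][second] + 1))
    return [first] + ["NOP"] * gap + [second]


# MTC0 EntryLo0 followed by TLBWI on a VR4121-class core: 5 - (2 + 1) = 2 NOPs.
print(pad_sequence("VR4121", "MTC0", "TLBWI"))
# MTC0 EPC followed by ERET on the VR4131: 6 - (5 + 1) = 0 NOPs.
print(pad_sequence("VR4131", "MTC0", "ERET"))
```

In real code the NOP slots can be filled with any instructions unrelated to the contending resource, as the chapter notes; the list of mnemonics here only makes the required spacing visible.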
Appendix Index

A
access types  36, 227
address error exception  179
address spaces  133
address translation  128, 131, 132
addressing  26
addressing modes  30, 124, 164

B
BadVAddr register  160
big endian  26, 27
branch delay  90
branch instructions  47, 82, 227
branch prediction  31, 94, 155
breakpoint exception  185
bus error exception  183
bypassing  123

C
cache
    accessing  204
    index  204
    line size  204
    operations  202
    organization  200
    size  155, 204
    states  205
cache algorithm  149
cache data  200
    coherency  203
    placement  202
cache error register  170
cache line  200, 201
    replacement  203
cache memory  198
cache tag  200
Cause register  165
Cold Reset exception  176
Compare register  161
computational instructions  40, 74
Config register  153
Context register  159
coprocessor 0  19
coprocessor 0 hazards  421
coprocessor unusable exception  186
coprocessors  21
Count register  160
CP0  21
CP0 registers  22
CPU core  19
CPU instruction set  33, 224
CPU registers  20

D
data cache  19, 201
data formats  26
delay slot  36, 47, 70, 90
direct mapping  202
doubleword  26

E
endian  26
EntryHi register  127, 151
EntryLo register  127, 148
EPC register  167
ErrorEPC register  171
exception  116
    priority  175
    types  173
    vector address  173
exception code  166
exception conditions  119
exception processing  157
exception processing registers  158
extend instruction  68

G
general-purpose register  20, 55

H
halfword  26
hardware interrupts  222
HI register  20, 56

I
Index register  127, 147
instruction cache  19, 200
instruction formats  23, 25, 34, 59
instruction notation conventions  224
instruction set architecture  33
instruction streaming  101, 154
integer overflow exception  188
interlock  116
interrupt enable  164
interrupt exception  190
interrupt signals  222, 223
interrupts  221
ISA mode  56
ISA mode bit  56, 57

J
joint TLB  30
JTLB  30
jump instruction  47, 82, 227

K
Kernel mode  124, 138
Kernel mode address space  139

L
line lock function  203
little endian  26, 28
LLAddr register  155
LO register  20, 56
load delay  101
load instructions  36, 71, 226

M
MACC instructions  35
memory hierarchy  198
memory management  30, 124
memory management registers  146
MIPS III instructions  23
MIPS16 instruction set  54, 386
MIPS16 instructions  24

N
NMI  221
NMI exception  178
non-maskable interrupt  221

O
on-chip caches  199
opcode  64, 383
operating modes  30, 124, 164
ordinary interrupts  221

P
page sizes  149
PageMask register  127, 149
parity error register  170
PC  20, 56
PC-relative instructions  67
physical address  128, 133
pipeline  31, 84
pipeline activities  102
pipeline stages  85, 87, 89
power mode instructions  35
PRId register  152
product-sum operation instructions  35

R
Random register  127, 147
reserved instruction exception  187

S
set associative  202
slip conditions  121
Soft Reset exception  177
software interrupts  222
special instructions  51, 83
special registers  20, 56
stall conditions  120
stall cycles  46
Status register  161
store instructions  36, 71, 226
superscalar  87
Supervisor mode  124, 135
Supervisor mode address space  136
system call exception  184
system control coprocessor  21
system control coprocessor (CP0) instructions  52, 228

T
TagHi register  156
TagLo register  156
timer interrupt  222
TLB  30, 125
    entry  125
    exceptions  127
    instructions  127
    manipulation  126
TLB exceptions  180
TLB invalid exception  181
TLB modified exception  182
TLB refill exception  180
translation lookaside buffer  30, 125
trap exception  188

U
User mode  124, 133
User mode address space  134

V
virtual address  128, 133

W
watch exception  189
WatchHi register  168
WatchLo register  168
way(s)  91, 202
Wired register  150
word  26
writeback  203

X
XContext register  169
