Title: Unnecessary extp's (ST10) Post by: woj on March 14, 2022, 01:34:22 PM Hi,
So here is one, in some of the ME7.9.10 bins (ST10, if that is not yet clear) that I dissassembled I see unnecessary extp instructions, like this: Code: flash_0324A0:[D7,40,3C,00] extp #0x3C, #1 Unnecessary because dpp2 is 0x3C, so the addressing could be dpp register based, instead of extp-ed. (And no, this is not in those weird places where dpp register is temporarily modified, nothing of this sort). In the same region I find a bit lower addresses (lower F3DXX-s for example) where the dpp register addressing is used where the addresses fall in the same page. So it seems totally unnecessary. Even more weird, the end of page address F3FFC is addressed through dpp register again. So, the question is - is this a compiler bug / feature? Or a CPU bug of some sort that is remedied by those explicit extp-s? (Or something that I missed about ST10 addressing in my now 10 experince experience with it). Title: Re: Unnecessary extp's (ST10) Post by: 360trev on May 05, 2022, 10:26:10 AM The very old GCC based compilers for C167/ST10 are laughably poor when it comes to the code optimization...
It may simply be that the 'project setup' used for compiling the C code was code 'configured' in large memory model and the compiler switches generated included the large memory execution in the IDE, potential simply due to this you see this kind of strangeness. So maybe not even a bug... just poor code optimization. Recall that you can also configure specifically if you want addressing via stack, registers, directly or automatic. That too could be a reason (stupid yes but a reason). If the code was slight better generated it could actually be useful in certain use cases. For instance you've got data currently in a flash segmented area that is accessibly directly (without use of extp instructions due to dpp setup) and you want to support capabilities such as dynamic map switching, etc. but in this instance I would have felt it would make more sense to pass in the segment via a register or on the stack to the function itself, that way if the code was built as a shared library it would be possible to re-use it regardless. Having the segment in the instruction is dumb... I've found myself having to rebuild/patch small memory model compiled functions like gfk_ipol_s8() and patch them to include extp instructions so I can switch maps but in this case i did it something like this... For the C167 so segments are different but the concept is identical... ; ------------------------------------------------------------------------------ ; gfk_ipol_s8() ; ;$Inputs: ; ; r10 = e.g. KFZW2 segment <---- added ; r11 = e.g. SRL12ZUUB segment <---- added ; ; r12 = e.g. KFZW2 ; r13 = e.g. SRL12ZUUB ; r14 = e.g. esst_snm16zuub ; r15 = e.g. esst_srl12zuub ; ------------------------------------------------------------------------------ GKF_IPOL_S8_SEG204: MOV R10, #204h ; E6FA0402 ; KFZW2 SEG MOV R11, #204h ; E6FB0402 ; SRL12ZUUB SEG JMPR CC_UC, GKF_IPOL_S8 ; 0D04 GKF_IPOL_S8_SEG207: MOV R10, #207h ; E6FA0702 ; KFZW2 SEG MOV R11, #207h ; E6FB0702 ; SRL12ZUUB SEG GKF_IPOL_S8: EXTP R11, #1 ; DC4B ; EXTP MOVB RL2, [R13] ; A94D ; *(SRL12ZUUB) ... cut .... I retained the original function prototypes and registers (drop in replacement) but added stubs so that the segments could be configured by the stubs themselves either by setting up registers OR directly invoking the stubs. This allowed these new version of the lookup's to be substituted so you don't need to change function stubs at all if you didn't want to and still access alternative flash segments, particuarly useful when you don't have much space and cannot be bothered to relocate the function or you doing it with automated patching (like I do). Title: Re: Unnecessary extp's (ST10) Post by: 360trev on May 05, 2022, 11:00:39 AM Here's a good example of VERY POOR code optimization... And this is typical of the whole code base by the way...
/**************************************************************** Process: zwgru_10ms Purpose: sequence calls x/_10ms ****************************************************************/ ZWGRU_10ms_BasicIgnitionAngle: ; Code Reference: RHS_raster_10ms 88 90 mov [-r0], r9 88 70 mov [-r0], r7 88 60 mov [-r0], r6 if(fnwue != 0xFF) { // ignition timing #1 zwnws = gkf_ipol_S8(KFZW2, *(SRL12ZUUB), esst_snm16zuub, esst_srl12zuub); } else if(fnwue != 0x00) { // ignition timing #2 zwnws = gkf_ipol_S8(KFZW , *(SRL12ZUUB), esst_snm16zuub, esst_srl12zuub); } else { ... } 28 02 sub r0, #2 F3 F8 67 8B movb rl4, fnwue ; fnwue: Weighting factor camshaft overlap (inlet) 47 F8 FF 00 cmpb rl4, #0FFh 3D 10 jmpr cc_NZ, lookup_KFZW E6 FC 36 1F mov r12, #KFZW2_CELLS ; KFZW2 : Znndwinkelkennfeld Variante 2 [ZWGRU] E6 FD 13 01 mov r13, #SRL12ZUUB ; number of items on Y F2 FE 64 8E mov r14, esst_snm16zuub ; x index into CURVE F2 FF 76 8E mov r15, esst_srl12zuub ; y index into CURVE DA 00 B8 78 calls 0, gkf_ipol_S8_curve F1 E8 movb rl7, rl4 F7 F8 2D 8D movb zwnws, rl4 ; zwnws : [ZWGRU] E1 0C movb rl6, #0 EA 00 56 F3 jmpa cc_UC, clamp_values ; --------------------------------------------------------------------------- lookup_KFZW: ; Code Reference: ZWGRU_10ms_BasicIgnitionAngle F3 F8 67 8B movb rl4, fnwue ; fnwue : [NWS KWPDATR MDBAS ZWGRU] 3D 0F jmpr cc_NZ, lookup_KFZW2 E6 FC 76 1E mov r12, #KFZW_CELLS ; KFZW : Znndwinkelkennfeld [ZWGRU] E6 FD 13 01 mov r13, #SRL12ZUUB F2 FE 64 8E mov r14, esst_snm16zuub ; esst_snm16zuub : [SSTB ZWGRU ZWMIN] F2 FF 76 8E mov r15, esst_srl12zuub ; esst_srl12zuub : [SSTB ZWGRU ZWMIN] DA 00 B8 78 calls 0, gkf_ipol_S8_curve F1 C8 movb rl6, rl4 F7 F8 2D 8D movb zwnws, rl4 ; zwnws : Grundznndwinkel mit Berncksichtigung von Nockenwellensteuerung [ZWGRU] E1 0E movb rl7, #0 0D 2F jmpr cc_UC, clamp_values ; --------------------------------------------------------------------------- lookup_KFZW2: ; Code Reference: ZWGRU_10ms_BasicIgnitionAngle And how easily it could be optimized... /**************************************************************** Process: zwgru_10ms Purpose: sequence calls x/_10ms ****************************************************************/ ZWGRU_10ms_BasicIgnitionAngle: ; Code Reference: RHS_raster_10ms 88 90 mov [-r0], r9 88 70 mov [-r0], r7 88 60 mov [-r0], r6 if(fnwue != 0xFF) { // ignition timing #1 zwnws = gkf_ipol_S8(KFZW2, *(SRL12ZUUB), esst_snm16zuub, esst_srl12zuub); } else if(fnwue != 0x00) { // ignition timing #2 zwnws = gkf_ipol_S8(KFZW , *(SRL12ZUUB), esst_snm16zuub, esst_srl12zuub); } else { ... } 28 02 sub r0, #2 E6 FC 76 1E mov r12, #KFZW_CELLS ; KFZW : [ZWGRU] F3 F8 67 8B movb rl10, fnwue ; fnwue: Weighting factor camshaft overlap (inlet) 47 F8 FF 00 cmpb rl10, #0FFh 3D 10 jmpr cc_NZ, lookup_KFZW E6 FC 36 1F mov r12, #KFZW2_CELLS ; KFZW2 : [ZWGRU] E6 FD 13 01 lookup_KFZW: mov r13, #SRL12ZUUB F2 FE 64 8E mov r14, esst_snm16zuub ; esst_snm16zuub : [SSTB ZWGRU ZWMIN] F2 FF 76 8E mov r15, esst_srl12zuub ; esst_srl12zuub : [SSTB ZWGRU ZWMIN] DA 00 B8 78 calls 0, gkf_ipol_S8_curve F1 C8 movb rl6, rl4 F7 F8 2D 8D movb zwnws, rl4 ; zwnws : [ZWGRU] E1 0E movb rl7, #0 Title: Re: Unnecessary extp's (ST10) Post by: prj on May 05, 2022, 05:02:05 PM Not sure they even used GCC... Keil is just as shit, there's zero optimization basically braindead translation of C to asm...
Title: Re: Unnecessary extp's (ST10) Post by: woj on May 06, 2022, 11:50:17 AM Right, OK, so it is not a CPU bug that I should be wary of (I sort of know that already because I made code patches for the very same addresses directly using the dpp addresses), I cannot recall a particular example, but I came across things like this before, that you need some seemingly obsolete code to bypass these (no, not in C167/ST10). In this particular case it seems there is just one narrow range of addresses treated with those unnecessary extp's, throughout the whole code. After what you described it might be that there was indeed some variable relocation after the code was generated. Or someone has forgotten some pragma-s in the source, or whatever else... And about the code optimizations - sometimes it feels like I could rewrite the whole program to use half the flash memory ;) And in rare case one can even do an "in place" code patch for something small.
Title: Re: Unnecessary extp's (ST10) Post by: 360trev on May 07, 2022, 05:21:09 AM Try this tool I just found for C compilation. Never heard of Cosmic before myself! Its from China it seems..
http://www.jzxy.com.cn/support/download/Cosmic/cxST10_C16X_idea_sim_eval.exe Yes its old, last updated in circa 2005 so pretty 'dead' now (unsurprisingly since ST10's/C16x's are no longer manufactured) but it seems the code generation is a bit better from C... Yes this is the demo version and you'll need to copy the 'example' folder elsewhere as its trying to write to it but since its in the "programs folder" permission is blocked by modern windows versions. (yes its that old)... Demo is restricted to 4kbytes code generation however that's trivial to overcome. Code generation seems pretty clean and readable.. 111 ; 26 int fact(int n) 111 ; 27 { 112 switch .text 113 even 114 001a _fact: 116 001a ecf6 push r6 119 001c f06c mov r6,r12 120 ; 28 if (n < 2) 122 001e 4862 cmp r6,#2 123 0020 dd02 jmp cc_sge,L35 124 ; 29 return (1); 126 0022 e014 mov r4,#1 128 0024 0d07 jmp L01 129 0026 L35: 130 ; 31 return (n * fact(n - 1)); 132 0026 f0c6 mov r12,r6 133 0028 28c1 sub r12,#1 134 002a da1a1a00 calls _fact 136 002e 1b46 mulu r4,r6 137 0030 f2f40efe mov r4,mdl 139 0034 L01: 141 0034 fcf6 pop r6 142 0036 db00 rets Disclaimer: only spent 5 minutes looking at it so ymmv... Title: Re: Unnecessary extp's (ST10) Post by: prj on May 20, 2022, 03:25:11 PM unsurprisingly since ST10's/C16x's are no longer manufactured XC166 is alive and kicking.Used for example in Haldex. Maybe latest gen has something else, but they are very much produced and can be bought still. |