Pages: [1]
Author Topic: Unnecessary extp's (ST10)  (Read 3164 times)
woj
Hero Member
*****

Karma: +42/-3
Offline Offline

Posts: 500


« on: March 14, 2022, 01:34:22 PM »

Hi,

So here is one, in some of the ME7.9.10 bins (ST10, if that is not yet clear) that I dissassembled I see unnecessary extp instructions, like this:

Code:
flash_0324A0:[D7,40,3C,00]      extp    #0x3C, #1
flash_0324A4:[F2,FC,0A,3F]      mov     r12, xram2_word_0F3F0A

Unnecessary because dpp2 is 0x3C, so the addressing could be dpp register based, instead of extp-ed. (And no, this is not in those weird places where dpp register is temporarily modified, nothing of this sort). In the same region I find a bit lower addresses (lower F3DXX-s for example) where the dpp register addressing is used where the addresses fall in the same page. So it seems totally unnecessary. Even more weird, the end of page address F3FFC is addressed through dpp register again.

So, the question is - is this a compiler bug / feature? Or a CPU bug of some sort that is remedied by those explicit extp-s? (Or something that I missed about ST10 addressing in my now 10 experince experience with it).
Logged
360trev
Full Member
***

Karma: +68/-2
Offline Offline

Posts: 235


« Reply #1 on: May 05, 2022, 10:26:10 AM »

The very old GCC based compilers for C167/ST10 are laughably poor when it comes to the code optimization...

It may simply be that the 'project setup' used for compiling the C code was code 'configured' in large memory model and the compiler switches generated included the large memory execution in the IDE, potential simply due to this you see this kind of strangeness. So maybe not even a bug... just poor code optimization. Recall that you can also configure specifically if you want addressing via stack, registers, directly or automatic. That too could be a reason (stupid yes but a reason).
 
If the code was slight better generated it could actually be useful in certain use cases.

For instance you've got data currently in a flash segmented area that is accessibly directly (without use of extp instructions due to dpp setup) and you want to support capabilities such as dynamic map switching, etc. but in this instance I would have felt it would make more sense to pass in the segment via a register or on the stack to the function itself, that way if the code was built as a shared library it would be possible to re-use it regardless. Having the segment in the instruction is dumb...

I've found myself having to rebuild/patch small memory model compiled functions like gfk_ipol_s8() and patch them to include extp instructions so I can switch maps but in this case i did it something like this... For the C167 so segments are different but the concept is identical...

; ------------------------------------------------------------------------------
; gfk_ipol_s8()
;
;$Inputs:
;
;                               r10 = e.g. KFZW2 segment   <---- added
;                               r11 = e.g. SRL12ZUUB segment <---- added
;
;                               r12 = e.g. KFZW2
;                               r13 = e.g. SRL12ZUUB
;                               r14 = e.g. esst_snm16zuub
;                               r15 = e.g. esst_srl12zuub
; ------------------------------------------------------------------------------
                                                       
GKF_IPOL_S8_SEG204:                                     
                MOV     R10,  #204h           ; E6FA0402         ; KFZW2 SEG
                MOV     R11,  #204h           ; E6FB0402         ; SRL12ZUUB SEG
                JMPR    CC_UC, GKF_IPOL_S8    ; 0D04     
                                                       
GKF_IPOL_S8_SEG207:                                     
                MOV     R10,  #207h           ; E6FA0702         ; KFZW2 SEG
                MOV     R11,  #207h           ; E6FB0702         ; SRL12ZUUB SEG

                                                       
GKF_IPOL_S8:                                           
                EXTP    R11, #1               ; DC4B             ; EXTP
                MOVB    RL2, [R13]            ; A94D             ; *(SRL12ZUUB)

                ... cut ....


I retained the original function prototypes and registers (drop in replacement) but added stubs so that the segments could be configured by the stubs themselves either by setting up registers OR directly invoking the stubs.

This allowed these new version of the lookup's to be substituted so you don't need to change function stubs at all if you didn't want to and still access alternative flash segments, particuarly useful when you don't have much space and cannot be bothered to relocate the function or you doing it with automated patching (like I do).


« Last Edit: May 05, 2022, 10:29:04 AM by 360trev » Logged
360trev
Full Member
***

Karma: +68/-2
Offline Offline

Posts: 235


« Reply #2 on: May 05, 2022, 11:00:39 AM »

Here's a good example of VERY POOR code optimization... And this is typical of the whole code base by the way...





                                                /****************************************************************
                                                Process: zwgru_10ms
                                                Purpose: sequence calls x/_10ms
                                                ****************************************************************/
                                               
                                                ZWGRU_10ms_BasicIgnitionAngle:          ; Code Reference: RHS_raster_10ms
88 90                                                           mov     [-r0], r9
88 70                                                           mov     [-r0], r7
88 60                                                           mov     [-r0], r6

                                                   if(fnwue != 0xFF)
                                                   {
                                                       // ignition timing #1
                                                       zwnws = gkf_ipol_S8(KFZW2, *(SRL12ZUUB), esst_snm16zuub, esst_srl12zuub);
                                                   } else if(fnwue != 0x00)
                                                   {
                                                       // ignition timing #2
                                                       zwnws = gkf_ipol_S8(KFZW , *(SRL12ZUUB), esst_snm16zuub, esst_srl12zuub);
                                                   } else {

                                                       ...
                                                   }

28 02                                                           sub     r0, #2

F3 F8 67 8B                                                     movb    rl4, fnwue      ; fnwue: Weighting factor camshaft overlap (inlet)
47 F8 FF 00                                                     cmpb    rl4, #0FFh
3D 10                                                           jmpr    cc_NZ, lookup_KFZW

E6 FC 36 1F                                                     mov     r12, #KFZW2_CELLS ; KFZW2 : Znndwinkelkennfeld Variante 2 [ZWGRU]
E6 FD 13 01                                                     mov     r13, #SRL12ZUUB ; number of items on Y
F2 FE 64 8E                                                     mov     r14, esst_snm16zuub ; x index into CURVE
F2 FF 76 8E                                                     mov     r15, esst_srl12zuub ; y index into CURVE
DA 00 B8 78                                                     calls   0, gkf_ipol_S8_curve
F1 E8                                                           movb    rl7, rl4
F7 F8 2D 8D                                                     movb    zwnws, rl4      ; zwnws : [ZWGRU]
E1 0C                                                           movb    rl6, #0
EA 00 56 F3                                                     jmpa    cc_UC, clamp_values
                                                ; ---------------------------------------------------------------------------
                                                lookup_KFZW:                            ; Code Reference: ZWGRU_10ms_BasicIgnitionAngle
F3 F8 67 8B                                                     movb    rl4, fnwue      ; fnwue : [NWS KWPDATR MDBAS ZWGRU]
3D 0F                                                           jmpr    cc_NZ, lookup_KFZW2

E6 FC 76 1E                                                     mov     r12, #KFZW_CELLS ; KFZW : Znndwinkelkennfeld [ZWGRU]
E6 FD 13 01                                                     mov     r13, #SRL12ZUUB
F2 FE 64 8E                                                     mov     r14, esst_snm16zuub ; esst_snm16zuub :  [SSTB ZWGRU ZWMIN]
F2 FF 76 8E                                                     mov     r15, esst_srl12zuub ; esst_srl12zuub :  [SSTB ZWGRU ZWMIN]
DA 00 B8 78                                                     calls   0, gkf_ipol_S8_curve
F1 C8                                                           movb    rl6, rl4
F7 F8 2D 8D                                                     movb    zwnws, rl4      ; zwnws : Grundznndwinkel mit Berncksichtigung von Nockenwellensteuerung [ZWGRU]
E1 0E                                                           movb    rl7, #0
0D 2F                                                           jmpr    cc_UC, clamp_values
                                                ; ---------------------------------------------------------------------------
                                                lookup_KFZW2:                           ; Code Reference: ZWGRU_10ms_BasicIgnitionAngle



And how easily it could be optimized...


                                               /****************************************************************
                                                Process: zwgru_10ms
                                                Purpose: sequence calls x/_10ms
                                                ****************************************************************/
                                               
                                                ZWGRU_10ms_BasicIgnitionAngle:          ; Code Reference: RHS_raster_10ms
88 90                                                           mov     [-r0], r9
88 70                                                           mov     [-r0], r7
88 60                                                           mov     [-r0], r6

                                                   if(fnwue != 0xFF)
                                                   {
                                                       // ignition timing #1
                                                       zwnws = gkf_ipol_S8(KFZW2, *(SRL12ZUUB), esst_snm16zuub, esst_srl12zuub);
                                                   } else if(fnwue != 0x00)
                                                   {
                                                       // ignition timing #2
                                                       zwnws = gkf_ipol_S8(KFZW , *(SRL12ZUUB), esst_snm16zuub, esst_srl12zuub);
                                                   } else {

                                                       ...
                                                   }


28 02                                                           sub     r0, #2

E6 FC 76 1E                                                     mov     r12, #KFZW_CELLS ; KFZW : [ZWGRU]

F3 F8 67 8B                                                     movb    rl10, fnwue      ; fnwue: Weighting factor camshaft overlap (inlet)
47 F8 FF 00                                                     cmpb    rl10, #0FFh
3D 10                                                           jmpr    cc_NZ, lookup_KFZW

E6 FC 36 1F                                                     mov     r12, #KFZW2_CELLS ; KFZW2 : [ZWGRU]

E6 FD 13 01   lookup_KFZW:                                      mov     r13, #SRL12ZUUB
F2 FE 64 8E                                                     mov     r14, esst_snm16zuub ; esst_snm16zuub :  [SSTB ZWGRU ZWMIN]
F2 FF 76 8E                                                     mov     r15, esst_srl12zuub ; esst_srl12zuub :  [SSTB ZWGRU ZWMIN]
DA 00 B8 78                                                     calls   0, gkf_ipol_S8_curve
F1 C8                                                           movb    rl6, rl4
F7 F8 2D 8D                                                     movb    zwnws, rl4      ; zwnws : [ZWGRU]
E1 0E                                                           movb    rl7, #0
« Last Edit: May 05, 2022, 11:02:44 AM by 360trev » Logged
prj
Hero Member
*****

Karma: +1033/-437
Offline Offline

Posts: 5882


« Reply #3 on: May 05, 2022, 05:02:05 PM »

Not sure they even used GCC... Keil is just as shit, there's zero optimization basically braindead translation of C to asm...
Logged

PM's will not be answered, so don't even try.
Log your car properly.
woj
Hero Member
*****

Karma: +42/-3
Offline Offline

Posts: 500


« Reply #4 on: May 06, 2022, 11:50:17 AM »

Right, OK, so it is not a CPU bug that I should be wary of (I sort of know that already because I made code patches for the very same addresses directly using the dpp addresses), I cannot recall a particular example, but I came across things like this before, that you need some seemingly obsolete code to bypass these (no, not in C167/ST10). In this particular case it seems there is just one narrow range of addresses treated with those unnecessary extp's, throughout the whole code. After what you described it might be that there was indeed some variable relocation after the code was generated. Or someone has forgotten some pragma-s in the source, or whatever else... And about the code optimizations - sometimes it feels like I could rewrite the whole program to use half the flash memory Wink And in rare case one can even do an "in place" code patch for something small.
Logged
360trev
Full Member
***

Karma: +68/-2
Offline Offline

Posts: 235


« Reply #5 on: May 07, 2022, 05:21:09 AM »

Try this tool I just found for C compilation. Never heard of Cosmic before myself! Its from China it seems..
 
http://www.jzxy.com.cn/support/download/Cosmic/cxST10_C16X_idea_sim_eval.exe

Yes its old, last updated in circa 2005 so pretty 'dead' now (unsurprisingly since ST10's/C16x's are no longer manufactured) but it seems the code generation is a bit better from C...
Yes this is the demo version and you'll need to copy the 'example' folder elsewhere as its trying to write to it but since its in the "programs folder" permission is blocked by modern windows versions. (yes its that old)...

Demo is restricted to 4kbytes code generation however that's trivial to overcome.

Code generation seems pretty clean and readable..

111                         ; 26 int fact(int n)
 111                         ; 27    {
 112                            switch   .text
 113                            even
 114  001a                   _fact:
 116  001a ecf6                 push   r6
 119  001c f06c                 mov   r6,r12
 120                         ; 28    if (n < 2)
 122  001e 4862                 cmp   r6,#2
 123  0020 dd02                 jmp   cc_sge,L35
 124                         ; 29       return (1);
 126  0022 e014                 mov   r4,#1
 128  0024 0d07                 jmp   L01
 129  0026                   L35:
 130                         ; 31       return (n * fact(n - 1));
 132  0026 f0c6                 mov   r12,r6
 133  0028 28c1                 sub   r12,#1
 134  002a da1a1a00             calls   _fact
 136  002e 1b46                 mulu   r4,r6
 137  0030 f2f40efe             mov   r4,mdl
 139  0034                   L01:
 141  0034 fcf6                 pop   r6
 142  0036 db00                 rets



Disclaimer: only spent 5 minutes looking at it so ymmv...

« Last Edit: May 07, 2022, 06:23:19 AM by 360trev » Logged
prj
Hero Member
*****

Karma: +1033/-437
Offline Offline

Posts: 5882


« Reply #6 on: May 20, 2022, 03:25:11 PM »

unsurprisingly since ST10's/C16x's are no longer manufactured
XC166 is alive and kicking.

Used for example in Haldex. Maybe latest gen has something else, but they are very much produced and can be bought still.
Logged

PM's will not be answered, so don't even try.
Log your car properly.
Pages: [1]
  Print  
 
Jump to:  

Powered by SMF 1.1.21 | SMF © 2015, Simple Machines Page created in 0.084 seconds with 18 queries. (Pretty URLs adds 0s, 0q)