NefMoto

Technical => Reverse Engineering => Topic started by: superglitch on July 28, 2021, 10:32:59 AM



Title: Tricore Custom Code
Post by: superglitch on July 28, 2021, 10:32:59 AM
I'm fairly familiar with IDA Pro and with writing assembly by hand, and it's a pain. I'd like to start using a compiler to help create more complex functions and to reduce the effort required to create patching programs that carry the changes over to similar ECU software.

I found HighTec's free compiler here: https://free-entry-toolchain.hightec-rt.com/

I have that working and generating assembly without issue. My problem is writing the code so that it reads and writes variables correctly. I'll keep it very basic so we can build up knowledge on how to adapt this to whatever problem or situation someone is working on.

Here's the C code:
Code:
#include <tc1767.h>

void main()
{
volatile float * xa = (volatile float *)0xD000D000;
volatile float * ya = (volatile float *)0xD000D004;
volatile float * za = (volatile float *)0xD000D008;

float x = *xa;
float y = *ya;
float z = x * y;

*za = z;
}

Which does compile and generates this assembly:
Code:
main:
mov16.aa        a14, sp
sub16.a         sp, #0x18
movh            d15, #0xD001
addi            d15, d15, #-0x3000
st32.w          [a14]-4, d15
movh            d15, #0xD001
addi            d15, d15, #-0x2FFC
st32.w          [a14]-8, d15
movh            d15, #0xD001
addi            d15, d15, #-0x2FF8
st32.w          [a14]-0xC, d15
ld32.w          d15, [a14]-4
mov16.a         a15, d15
ld16.w          d15, [a15]0
st32.w          [a14]-0x10, d15
ld32.w          d15, [a14]-8
mov16.a         a15, d15
ld16.w          d15, [a15]0
st32.w          [a14]-0x14, d15
ld32.w          d2, [a14]-0x10
ld32.w          d15, [a14]-0x14
mul.f           d15, d2, d15
st32.w          [a14]-0x18, d15
ld32.w          d2, [a14]-0xC
ld32.w          d15, [a14]-0x18
mov16.a         a15, d2
st16.w          [a15]0, d15
ret16

Inserting this code causes the ECU to not respond, requiring boot to restore.  Almost as if it's stuck in a loop in the ASW.   :'(


I have also written assembly by hand for the same task, and it can be called without issue:
Code:
movh.a          a9, #@HIS(unk_D000C000)
lea             a9, [a9]@LOS(unk_D000C000)
ld32.bu         d15, [a9](unk_D000D000 - unk_D000C000)
ld32.bu         d14, [a9](unk_D000D004 - unk_D000C000)
mul16           d14, d15
st32.b          unk_D000D008, d14
ret16

Obviously this is an oversimplification of what I'm actually trying to accomplish, but I figure it's best to keep everything simple until I can get the compiler approach working, so I can then write some real functions and math.

Any help with the method of writing my C or understanding what is going wrong would be greatly appreciated.


Title: Re: Tricore Custom Code
Post by: daniel2345 on July 28, 2021, 11:21:46 AM
Some registers are used by generic functions in the ASW and some are dedicated to the BSW.

This changes from build to build. Try using other registers in the generated assembly code, then adjust the toolchain to match the correct register usage.

Modifying the stack pointer (sp) may also not be a good idea, depending on the main loop functions or the BSW.


Title: Re: Tricore Custom Code
Post by: nihalot on July 28, 2021, 12:38:19 PM
Compile the C/C++ code with the -mcpu=TC1767 and -O2 or -O3 flags.
That way the compiler usually avoids the stack pointer unless absolutely necessary.


Title: Re: Tricore Custom Code
Post by: superglitch on July 28, 2021, 01:01:38 PM
@nihalot that resolved the stack pointer stuff, I am now generating much simpler code

Now we are sitting at
Code:
movh.a          a15, #0xD001
lea             a15, [a15]-0x3000
ld16.w          d2, [a15]0
movh.a          a15, #0xD001
lea             a15, [a15]-0x2FFC
ld16.w          d15, [a15]0
movh.a          a15, #0xD001
mul.f           d15, d2, d15
lea             a15, [a15]-0x2FF8
st16.w          [a15]0, d15
ret16

I have not had a chance to test it yet, but regarding what @daniel2345 brings up: it would be best, if possible, to tell the compiler which existing registers to use, for example when I know the mappings of a0, a1, or a9 throughout the file, so that a15 doesn't need to be modified. That said, I do think that in the particular binary I'm testing with, a15 is safe to modify at the point where I'm placing the code.

Is there a better manual out there on the compiler than -v --help?  The verbose help output is definitely something everyone should review.


Title: Re: Tricore Custom Code
Post by: d3irb on July 28, 2021, 01:39:11 PM
https://gcc.gnu.org/onlinedocs/gcc-4.6.1/gcc/Global-Reg-Vars.html#Global-Reg-Vars

Yes, the HighTec compiler is just TriCore backend patches to GCC (the GNU Compiler Collection), so there is a wealth of documentation dating back many years.


Title: Re: Tricore Custom Code
Post by: prj on July 29, 2021, 04:06:26 AM
Read TriCore manual and EABI.

You're not gonna get anywhere without.


Title: Re: Tricore Custom Code
Post by: jcsbanks on July 29, 2021, 08:17:38 AM
Apart from the good advice already given: I never tried floats, since the OS in my targets doesn't use them; they might need some initialization that you don't have. You can use the stack, but it is rarely needed or used at high -O levels. Don't use a0, a1, a8 or a9; your compiler doesn't use them either, and it will generally follow the EABI and work with existing code. Read up on upper and lower contexts, and check out the calling convention: the first variable goes in d4, the first pointer in a4, and the return value comes back in d2, or in a2 if you are returning a pointer. IIRC you can pass up to 4 variables and 4 pointers in registers before the compiler, following the EABI, starts using the stack.
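
To make the register assignment described above concrete, here is a minimal sketch. The function and its names are made up for illustration, and the register comments only state what the EABI convention described in this post would lead you to expect, so check the actual disassembly before relying on them.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only.
 * Expected TriCore EABI placement (verify in the disassembly):
 *   gain   -> d4   (first data argument)
 *   offset -> d5   (second data argument)
 *   src    -> a4   (first pointer argument)
 *   dst    -> a5   (second pointer argument)
 *   return -> d2
 */
uint32_t scale_and_store(uint32_t gain, uint32_t offset,
                         const volatile uint16_t *src,
                         volatile uint16_t *dst)
{
    uint32_t result = (uint32_t)(*src) * gain + offset;
    *dst = (uint16_t)result;   /* keeps only the low 16 bits */
    return result;
}

/* Host-side check of the C logic only; it says nothing about registers. */
void scale_and_store_demo(void)
{
    volatile uint16_t in = 10;
    volatile uint16_t out = 0;
    assert(scale_and_store(3, 5, &in, &out) == 35);
    assert(out == 35);
}
```

Mixing data and pointer arguments like this is a cheap way to see both register banks being used in one short disassembly.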


Title: Re: Tricore Custom Code
Post by: Herleybob on August 30, 2021, 11:02:52 PM
I use these guidelines when compiling code. 99% of the time I'm patching a function, or jumping out of the end of a call tree into my own code, so I can either use the registers that HighTec assigns or swap them out so they don't modify existing ones. I like to use a new function for each code segment I'm writing, because Ghidra makes it easy to find.

You will need to disassemble the output (obviously) and inspect the asm before patching.

Code:
#define CUSTOM_VARIABLE_ONE (*((volatile unsigned char *)  0xD0003F11))
#define CUSTOM_VARIABLE_TWO (*((volatile unsigned char *)  0xD0003F12))
#define CUSTOM_VARIABLE_THREE (*((volatile unsigned char *)  0xD0003F13))

void PROCESS_EXAMPLE() {

unsigned char adc = CUSTOM_VARIABLE_ONE ;

if(adc > CUSTOM_VARIABLE_TWO ) {
CUSTOM_VARIABLE_THREE = 2;
return;
} else {
CUSTOM_VARIABLE_THREE = 1;
return;
}
}

If you don't need to access the address repeatedly, just make a pointer to it.

Code:
void COPY_PARAM_TO_RAM() {
volatile short *delay_ms =   (volatile short *)0xD0003F24;
volatile short *const_delay_ms =   (volatile short *)0x800A02F8;

*delay_ms = *const_delay_ms;
}

Also, I've found multiple functions that process 2D and 3D tables and return various types (chars, shorts, longs, signed, unsigned, etc.). I create a faux function that stands in for the map interpolation function I want to use and give it the same arguments, so that I can call it from C and then just replace the call address when I disassemble and patch it in. (Hopefully that makes sense; I'm just trying to interpolate my own 2D/3D tables, so I use the built-in functions.)

Once I compile it and disassemble it in Ghidra, I know the call addresses of the actual interpolation functions, so I replace those in my asm and verify that it puts the correct values in the registers before each call.

Code:
unsigned char FUNC_PROCESS_2D_TABLE_8BIT(unsigned char AXIS_LENGTH, unsigned char MONITOR ,volatile unsigned char *AXIS_VALUES, volatile unsigned char *TABLE_VALUES) 
{
    return MONITOR; // Ignore this, it isn't actually used once I disassemble (only here as a "placeholder")
}

unsigned char FUNC_PROCESS_2D_TABLE(unsigned char MONITOR, volatile unsigned char *TABLE)
{
    return MONITOR; // Ignore this, it isn't actually used once I disassemble (only here as a "placeholder")
}

unsigned char INTERPOLATE_ALL_TABLE_ETHANOL(volatile unsigned char * TABLE_ONE, volatile unsigned char * TABLE_TWO) {
FLEX_SWITCH_T1V = FUNC_PROCESS_2D_TABLE(ECT, TABLE_ONE);
FLEX_SWITCH_T2V = FUNC_PROCESS_2D_TABLE(ECT, TABLE_TWO);

return FUNC_PROCESS_2D_TABLE_8BIT(0x2, FLEX_ETH_PERC, &FLEX_SWITCH_AXIS, &FLEX_SWITCH_T1V);
}
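
A different technique from the placeholder functions above, sketched here only as an option: call the stock routine through a function pointer cast from its fixed address, so the call target does not have to be swapped by hand afterwards. Everything named here is an assumption for illustration: the `ECU_BUILD` switch, the signature, the address `0x80012340`, and the host stub. The compiler may also emit an indirect call rather than a direct one, so the disassembly still needs checking.

```c
#include <assert.h>
#include <stdint.h>

/* Signature assumed to match the stock 2D lookup (hypothetical). */
typedef uint8_t (*lookup_2d_fn)(uint8_t input, const volatile uint8_t *table);

#ifdef ECU_BUILD   /* hypothetical build switch */
/* Hypothetical address: use the routine found in your own disassembly. */
#define STOCK_LOOKUP_2D ((lookup_2d_fn)0x80012340u)
#else
/* Host stand-in so the surrounding logic can be exercised off-target. */
static uint8_t host_lookup_2d(uint8_t input, const volatile uint8_t *table)
{
    return table[input & 0x03u];
}
#define STOCK_LOOKUP_2D (host_lookup_2d)
#endif

/* Calls the lookup for both tables and returns the larger result. */
uint8_t lookup_max(uint8_t input,
                   const volatile uint8_t *table_one,
                   const volatile uint8_t *table_two)
{
    uint8_t first  = STOCK_LOOKUP_2D(input, table_one);
    uint8_t second = STOCK_LOOKUP_2D(input, table_two);
    return (first > second) ? first : second;
}

#ifndef ECU_BUILD
/* Host-side check; the expected values depend on the stub above. */
void lookup_max_demo(void)
{
    const volatile uint8_t table_one[4] = {1, 2, 3, 4};
    const volatile uint8_t table_two[4] = {9, 0, 0, 0};
    assert(lookup_max(0, table_one, table_two) == 9);
    assert(lookup_max(2, table_one, table_two) == 3);
}
#endif
```

The trade-off is that the address is baked into the source, so it has to be updated for every software version, whereas the placeholder approach keeps the C generic and moves that step to patch time.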

As a final example, I'm working on adding turn signals to a SxS, which didn't originally come with that function. See below for my code. It might not be 100% correct, but it works and gives me the least amount of asm.

Code:
void PROCESS_TURN_SIGNALS() {
volatile char *tmp_status =       (volatile char *)0xD0003F20;
volatile char *flash_pos =        (volatile char *)0xD0003F21;
volatile char *flash_count =      (volatile char *)0xD0003F22;
volatile short *delay_ms =        (volatile short *)0xD0003F24;
volatile short *timer =           (volatile short *)0xD0003F26;
volatile short *num_flashes =     (volatile short *)0xD0003F28;
volatile short *triple_timer =    (volatile short *)0xD0003F2C;

switch(TURN_ADC_IN >> 2)  {
case 257 ... 512: // Left turn
*flash_pos = 1;
*flash_count = 0;
*tmp_status = (*tmp_status & ~0b00000011) | (*flash_pos & 0b00000011);
goto enable;
case 513 ... 768: // Right turn
*flash_pos = 2;
*flash_count = 0;
*tmp_status = (*tmp_status & ~0b00000011) | (*flash_pos & 0b00000011);
goto enable;
case 769 ... 1024: // Hazards
*flash_pos = 3;
*flash_count = 0;
*tmp_status = (*tmp_status & ~0b00000011) | (*flash_pos & 0b00000011);
goto enable;
default:
if(*flash_count > 0) {
goto enable;
}

if(*num_flashes == 0 && *triple_timer >= *timer) {
*flash_count = 5;
goto enable;
}

*tmp_status = 12; // sets outputs to high at the start of flashing.
*timer = 0;
*num_flashes = 0;
goto end;
}

enable:
if(*delay_ms >= *timer) {
*timer += 1;
goto end;
} else {
*timer = 0;
if(*flash_count != 0) *flash_count -= 1;
*tmp_status ^= (*flash_pos << 2);
(*num_flashes)++; // parentheses needed: *num_flashes++ advances the pointer, not the counter
goto end;
}

end:
return;
}

In the end I decompile my code, verify the flow looks correct, patch it in, make sure it isn't overwriting any registers that are needed, and then try it out. The above code needs no modification before patching in and trying.


EDIT:

See below for some macros to work with bits. I have needed these quite a bit and just got the storebit working.

These will translate into:

JZ.T/JNZ.T - For getting a bit
ST.T - For storing a bit

Code:
#define GETBIT(var, bit)	(((char)(var) >> (bit)) & 1) 
#define STOREBIT(addr,bpos,b)   __asm("st.t %0,%1,%2"::"i"(addr),"i"(bpos),"i"(b))

Examples:

#define NOTAREALADDRESS   (*((volatile unsigned char *) 0xD00017AA))

//Getting a bit
char brakeStatus = GETBIT(NOTAREALADDRESS, 3);

//Setting a bit
STOREBIT(&NOTAREALADDRESS, 0x3, 0x1);
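
For comparison, here is a portable sketch of the same two operations in plain C, which can be handy for testing logic on a PC. The names `GETBIT_U8` and `store_bit_u8` are made up for this example. Unlike the ST.T-based macro above, this version is an ordinary read-modify-write in C, so it should not be assumed to be safe against an interrupt or another task touching the same byte.

```c
#include <assert.h>
#include <stdint.h>

/* Returns bit `bit` (0..7) of an 8-bit value as 0 or 1. */
#define GETBIT_U8(var, bit)  ((uint8_t)(((uint8_t)(var) >> (bit)) & 1u))

/* Sets or clears bit `bit` (0..7) at `addr` with a plain read-modify-write. */
static inline void store_bit_u8(volatile uint8_t *addr, unsigned bit, unsigned value)
{
    uint8_t mask = (uint8_t)(1u << bit);
    if (value) {
        *addr |= mask;
    } else {
        *addr &= (uint8_t)~mask;
    }
}

/* Host-side check of both helpers. */
void bit_helpers_demo(void)
{
    volatile uint8_t flags = 0x00;
    store_bit_u8(&flags, 3, 1);
    assert(flags == 0x08);
    assert(GETBIT_U8(flags, 3) == 1);
    store_bit_u8(&flags, 3, 0);
    assert(flags == 0x00);
}
```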


Title: Re: Tricore Custom Code
Post by: fknbrkn on January 22, 2025, 12:17:57 AM
This is my first attempt at MED17 code: a simple routine to change the NLLM / NLLMGFS selection logic based on gwhpos (DSG mode).
I just want to have my workflow confirmed by more experienced users.

The stock selection is based on b_fs (gearbox type, IIRC), but the file I'm working on is a great example of today's lazy-optimization strategy. Anyway, the code looks like this:

Quote
<...some axis things ..>
PFLASH:800D88F6 D9 F4 3E 00                 lea             a4, [a15](nllmgfs_map - unk_801E117C)
PFLASH:800D88FA 02 26                       mov16           d6, d2
PFLASH:800D88FC 59 02 74 39                 st32.w          [a0](unk_D0004EF4 - unk_D000BA00), d2 ;axis
PFLASH:800D8900 09 C4 00 08                 ld.b            d4, [a12]0
PFLASH:800D8904 6D 01 93 81                 call32          func_map3d_8bit
PFLASH:800D8908 02 F5                       mov16           d5, d15
PFLASH:800D890A D9 F4 08 00                 lea             a4, [a15](nllm_map - unk_801E117C)
PFLASH:800D890E 09 C4 00 08                 ld.b            d4, [a12]0
PFLASH:800D8912 02 28                       mov16           d8, d2 ;result of NLLMGFS goes to d8
PFLASH:800D8914 19 06 74 39                 ld32.w          d6, [a0](unk_D0004EF4 - unk_D000BA00)
PFLASH:800D8918 6D 01 89 81                 call32          func_map3d_8bit
PFLASH:800D891C 05 DF D7 24                 ld32.bu         d15, byte_D0000C97 ;b_fs here
PFLASH:800D8920 87 FF 42 F1                 nor.t           d15, d15:2, d15:2
PFLASH:800D8924 2B 82 40 FF                 sel             d15, d15, d2, d8 ; both maps are evaluated but only one result is selected: d2 for NLLM, d8 for NLLMGFS
PFLASH:800D8928 25 DF 44 63                 st32.b          nsolbas, d15


So the idea is to call from here
 sel             d15, d15, d2, d8

to my routine with
Quote
PFLASH:800D8924 ED 8B 00 B8                 calla           sub_80177000
and return the selected value in d15.

But since both maps are already calculated and stored in the d2 and d8 registers, I've decided to simplify the routine by using predefined asm instructions.

My project looks like this (options: -ffixed-d15 -ffixed-a15 -O2 -O3 -mcpu=tc1767)

Code:
#define gwhpos (*((volatile char  *) 0xD000370C )) 
#define asm_(dst) __asm__(dst)


void getNLLMVariant()
{

asm_("mov %d6, %d2");
if ((gwhpos == 12) || (gwhpos == 14))
{
asm_("mov %d15, %d8");
}
else
{
asm_("mov %d15, %d6");
}


return; //tmp;

}

and the listing:

Code:
0000 0226      mov %d6,%d2
0002 05D24CC3  ld.b %d2,0xd000370c
0006 8BC20022  eq %d2,%d2,12
000a F628      jnz %d2,.L3
000c 05D24CC3  ld.b %d2,0xd000370c
0010 8BE22022  ne %d2,%d2,14
0014 7623      jz %d2,.L3
0016 026F      mov %d15,%d6
0018 0090      ret
.L3:
001a 028F      mov %d15,%d8
001c 0090      ret

finally in IDA:

Code:
PFLASH:80177000             sub_80177000:                           ; CODE XREF: sub_800D88B8+6C↑p
PFLASH:80177000 02 26                       mov16           d6, d2
PFLASH:80177002 05 D2 4C C3                 ld.b            d2, gwhpos
PFLASH:80177006 8B C2 00 22                 eq32            d2, d2, #12
PFLASH:8017700A F6 28                       jnz16           d2, loc_8017701A
PFLASH:8017700C 05 D2 4C C3                 ld.b            d2, gwhpos
PFLASH:80177010 8B E2 20 22                 ne              d2, d2, #14
PFLASH:80177014 76 23                       jz16            d2, loc_8017701A
PFLASH:80177016 02 6F                       mov16           d15, d6
PFLASH:80177018 00 90                       ret16
PFLASH:8017701A             ; ---------------------------------------------------------------------------
PFLASH:8017701A
PFLASH:8017701A             loc_8017701A:                           ; CODE XREF: sub_80177000+A↑j
PFLASH:8017701A                                                     ; sub_80177000+14↑j
PFLASH:8017701A 02 8F                       mov16           d15, d8
PFLASH:8017701C 00 90                       ret16
PFLASH:8017701C             ; End of function sub_80177000

That's the only way I've found to avoid using registers that hold data needed by the code after my call.
Is this OK at all?

I'm not sure about the sizing of variables. I've defined gwhpos as char and it compiles to an ld.b instruction, but the stock routines use ld32.bu, so I'm a bit confused, since gwhpos is an 8-bit (char) value in RAM.



Title: Re: Tricore Custom Code
Post by: prj on January 22, 2025, 04:48:51 AM
No, this is not OK. Read the EABI and the description of the CALL instruction in the instruction set manual.
d15 gets stored when calling and then restored on a ret.
Any modifications you make in the routine to d15 will not affect anything.

Your code does nothing.

Overwrite the st32.b to nsolbas with a call and store whatever you like in there.
Don't use asm in the C code when it can be avoided, just use straight C.

Because you no longer care about b_fs, the bit extraction or the selection, just overwrite the nor.t and the sel with two moves and a call:
Code:
mov d4, d2
mov d5, d8
call getNLLMVariant
And finally change the st32.b from d15 to d2:
Code:
st32.b nsolbas, d2

And the routine looks like this:

Code:
#include <stdint.h>

uint8_t getNLLMVariant(uint8_t nllm_value, uint8_t nllmgfs_value) {
   return ((gwhpos == 12) || (gwhpos == 14)) ? nllmgfs_value : nllm_value;
}

Work smarter :p
Alternatively you can also move nsolbas to a4 and add a pointer argument, and do the store directly in the routine.
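
A rough sketch of that pointer-argument variant, under a few assumptions: `storeNLLMVariant` is a made-up name, `ECU_BUILD` is a hypothetical build switch that selects the real RAM address, gwhpos is treated as an unsigned byte, and the register comment is what the EABI would lead you to expect rather than something verified. Unlike the version above, it reads gwhpos once into a local.

```c
#include <assert.h>
#include <stdint.h>

#ifdef ECU_BUILD                       /* hypothetical build switch */
#define gwhpos (*((volatile uint8_t *)0xD000370C))
#else
static volatile uint8_t host_gwhpos;   /* host stand-in for the RAM cell */
#define gwhpos host_gwhpos
#endif

/* Expected per EABI: nllm_value -> d4, nllmgfs_value -> d5, nsolbas_ptr -> a4 */
void storeNLLMVariant(uint8_t nllm_value, uint8_t nllmgfs_value,
                      volatile uint8_t *nsolbas_ptr)
{
    uint8_t gear = gwhpos;   /* read the volatile cell once */
    *nsolbas_ptr = ((gear == 12) || (gear == 14)) ? nllmgfs_value : nllm_value;
}

#ifndef ECU_BUILD
/* Host-side check of the selection logic. */
void storeNLLMVariant_demo(void)
{
    volatile uint8_t nsolbas = 0;
    host_gwhpos = 12;
    storeNLLMVariant(80, 90, &nsolbas);
    assert(nsolbas == 90);
    host_gwhpos = 3;
    storeNLLMVariant(80, 90, &nsolbas);
    assert(nsolbas == 80);
}
#endif
```

With this variant the patch site would also have to load the address of nsolbas into a4 before the call, and the original st32.b would no longer be needed.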


Title: Re: Tricore Custom Code
Post by: prj on January 22, 2025, 04:54:30 AM
Quote
I'm not sure about the sizing of variables. I've defined gwhpos as char and it compiles to an ld.b instruction, but the stock routines use ld32.bu, so I'm a bit confused, since gwhpos is an 8-bit (char) value in RAM.

It is probably UBYTE.
char is signed...

Anyway I recommend using stdint.
#include <stdint.h>

And use the stdint types.
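
A small host-side sketch of why the type matters, with made-up function names: reading the same byte as int8_t sign-extends it (the ld.b behaviour), while reading it as uint8_t zero-extends it (the ld.bu behaviour). For small values such as 12 or 14 both give the same result; they only diverge once the top bit is set.

```c
#include <assert.h>
#include <stdint.h>

/* Reads the byte as signed: the value is sign-extended (what ld.b does). */
int32_t read_as_int8(const volatile void *addr)
{
    return *(const volatile int8_t *)addr;
}

/* Reads the same byte as unsigned: zero-extended (what ld.bu does). */
int32_t read_as_uint8(const volatile void *addr)
{
    return *(const volatile uint8_t *)addr;
}

/* Host-side check showing where the two reads agree and where they differ. */
void byte_width_demo(void)
{
    volatile uint8_t small = 14;     /* e.g. a gear position */
    volatile uint8_t large = 0xC8;   /* 200 as an unsigned byte */
    assert(read_as_int8(&small) == 14);
    assert(read_as_uint8(&small) == 14);
    assert(read_as_uint8(&large) == 200);
    assert(read_as_int8(&large) == -56);
}
```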


Title: Re: Tricore Custom Code
Post by: fknbrkn on January 23, 2025, 02:10:00 AM
Quote
The conventions described here assume the use of the TriCore call / return mechanism, which automatically saves
registers D[8] through D[15] and A[10] through A[15] as a side effect of the CALL instruction, and restores them
as a side effect of the RET instruction. The registers saved automatically include the stack pointer A[10], so a
called function requires no epilog to restore the caller's stack pointer value prior to returning


Oh, I see now, so those are kind of like local variables. A bit tricky after ME7, but definitely much more comfortable for the developer.

While analyzing another tuner's modifications, I wondered why he doesn't use CALLA and instead uses a JUMPA to his code and a JUMPA back to the stock routine; he probably hasn't read the EABI either, lol. By the way, is it safe to run multiple nested calls, and should I care about stack overflow?

Thanks for the input prj, brilliant as usual.


Title: Re: Tricore Custom Code
Post by: prj on January 23, 2025, 05:19:39 AM
Stack is usually deeper than you think.

Jumps are used because:
a) No room to place call
b) Lazy
c) Don't know EABI


Title: Re: Tricore Custom Code
Post by: fknbrkn on January 24, 2025, 05:04:55 PM
The code works perfectly.

Also, since I need to make asm changes in the original code, like inserting a CALLA, mov, etc., I found a way to generate them with asm in C:

Code:
#define asm_(dst) __asm__(dst)

asm_("mov %d4, %d2");
asm_("mov %d5, %d15");
asm_("CALLA 0x80177000");


which gives me a listing with the hex codes for that part:

Code:
0000 0224      mov %d4,%d2
0002 02F5      mov %d5,%d15
0004 ED8B00B8  CALLA 0x80177000

By the way, what are the best practices for importing the changes into the original file?