superglitch
Jr. Member
« on: July 28, 2021, 10:32:59 AM »
So I'm fairly familiar with IDA Pro and writing assembly, and it's a pain. I'm looking to start using a compiler to help create more complex functions and to reduce the effort required to create patches that translate to similar ECU software. I found HighTec's free compiler here: https://free-entry-toolchain.hightec-rt.com/

I have that working and generating assembly without issue; my problem is writing the C so that variables are read and written correctly. I'll keep it very basic so we can build up knowledge on how to adapt this to any problem or situation someone is working on. Here's the C code:

    #include <tc1767.h>

    void main() {
        volatile float * xa = (volatile float *)0xD000D000;
        volatile float * ya = (volatile float *)0xD000D004;
        volatile float * za = (volatile float *)0xD000D008;

        float x = *xa;
        float y = *ya;
        float z = x * y;

        *za = z;
    }
Which does compile and generates this assembly:

    main:
        mov16.aa a14, sp
        sub16.a sp, #0x18
        movh d15, #0xD001
        addi d15, d15, #-0x3000
        st32.w [a14]-4, d15
        movh d15, #0xD001
        addi d15, d15, #-0x2FFC
        st32.w [a14]-8, d15
        movh d15, #0xD001
        addi d15, d15, #-0x2FF8
        st32.w [a14]-0xC, d15
        ld32.w d15, [a14]-4
        mov16.a a15, d15
        ld16.w d15, [a15]0
        st32.w [a14]-0x10, d15
        ld32.w d15, [a14]-8
        mov16.a a15, d15
        ld16.w d15, [a15]0
        st32.w [a14]-0x14, d15
        ld32.w d2, [a14]-0x10
        ld32.w d15, [a14]-0x14
        mul.f d15, d2, d15
        st32.w [a14]-0x18, d15
        ld32.w d2, [a14]-0xC
        ld32.w d15, [a14]-0x18
        mov16.a a15, d2
        st16.w [a15]0, d15
        ret16
Inserting this code causes the ECU to stop responding, requiring boot mode to restore. It's almost as if it's stuck in a loop in the ASW. I have manually written assembly to accomplish the same thing, and it can be called without issue:

    movh.a a9, #@HIS(unk_D000C000)
    lea a9, [a9]@LOS(unk_D000C000)
    ld32.bu d15, [a9](unk_D000D000 - unk_D000C000)
    ld32.bu d14, [a9](unk_D000D004 - unk_D000C000)
    mul16 d14, d15
    st32.b unk_D000D008, d14
    ret16
Obviously this is an oversimplification of what I'm actually trying to accomplish, but I figure it's best to keep everything simple until I get the compiler approach working, so I can then move on to real functions and math. Any help with how to write my C, or with understanding what is going wrong, would be greatly appreciated.
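For reference, here is a minimal sketch of the same read-multiply-write pattern with the addresses passed in as pointer arguments instead of being created as locals. The function name `multiply_cells` and the parameter layout are assumptions for illustration, not something from the toolchain; the point is only that the logic can then be compiled and checked on a PC before inspecting the TriCore output.

```c
#include <assert.h>

/* Hypothetical helper: read two floats through volatile pointers and
 * store their product through a third. */
void multiply_cells(const volatile float *xa, const volatile float *ya,
                    volatile float *za)
{
    assert(xa && ya && za);   /* host-side sanity check only */
    float x = *xa;
    float y = *ya;
    *za = x * y;
}
```

On the target, the caller would pass the fixed addresses from the post, for example `(volatile float *)0xD000D000`, `(volatile float *)0xD000D004` and `(volatile float *)0xD000D008`.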
daniel2345
Full Member
« Reply #1 on: July 28, 2021, 11:21:46 AM »
Some registers are used by generic functions in the ASW and some are dedicated to the BSW.
This changes from build to build. Try using other registers in the generated assembly code, then configure the toolchain to match the correct register usage.
Modifying the stack pointer (sp) may also be a bad idea, depending on the main loop functions or the BSW.
nihalot
Full Member
« Reply #2 on: July 28, 2021, 12:38:19 PM »
Compile the C/C++ code with the -mcpu=TC1767 and -O2 or -O3 flags. That way the compiler usually avoids the stack pointer unless absolutely necessary.
superglitch
Jr. Member
« Reply #3 on: July 28, 2021, 01:01:38 PM »
@nihalot that resolved the stack pointer stuff, and I am now generating much simpler code. Now we are sitting at:

    movh.a a15, #0xD001
    lea a15, [a15]-0x3000
    ld16.w d2, [a15]0
    movh.a a15, #0xD001
    lea a15, [a15]-0x2FFC
    ld16.w d15, [a15]0
    movh.a a15, #0xD001
    mul.f d15, d2, d15
    lea a15, [a15]-0x2FF8
    st16.w [a15]0, d15
    ret16

I have not had a chance to test yet. Regarding what @daniel2345 brings up: it would be best, if possible, to somehow tell the compiler which existing registers to use, for example if I know the mappings of a0, a1, or a9 throughout the file, so that a15 does not need to be modified. Though I do think that in the particular binary I'm testing with, a15 is safe to modify at the position where I'm placing the code. Is there a better manual for the compiler than -v --help? The verbose help output is definitely something everyone should review.
prj
« Reply #5 on: July 29, 2021, 04:06:26 AM »
Read TriCore manual and EABI.
You're not gonna get anywhere without.
jcsbanks
Full Member
« Reply #6 on: July 29, 2021, 08:17:38 AM »
Apart from the good advice already given: I never tried floats, since the OS in my targets doesn't use them; they might need some initialization that you are missing. You can use the stack, but it is rarely needed or used at high -O levels. Don't use a0, a1, a8 or a9; your compiler doesn't, and it will generally follow the EABI and work with existing code. Read up on upper and lower contexts, and check out the calling convention: the first variable goes in d4, the first pointer in a4, the return value in d2 and, if you are returning a pointer, in a2. IIRC you can pass up to 4 variables and 4 pointers in registers before the compiler, following the EABI, falls back to the stack.
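To make the register mapping concrete, here is a small sketch of a routine whose arguments would be expected to land in the registers described above. The function `scale_first_entry` is hypothetical, and the register notes in the comment are the convention as described in this thread, so confirm them in the disassembly of your own build.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical routine. Under the convention described above, a TriCore
 * EABI compiler would be expected to place:
 *   scale  -> d4 (first non-pointer argument)
 *   offset -> d5 (second non-pointer argument)
 *   table  -> a4 (first pointer argument)
 *   result -> d2 (return value)
 */
int32_t scale_first_entry(int32_t scale, int32_t offset,
                          const volatile int16_t *table)
{
    assert(table != 0);   /* host-side sanity check only */
    return (int32_t)table[0] * scale + offset;
}
```

When patching a call to such a routine into stock code, the instructions before the call only need to load d4, d5 and a4, and the instructions after it read the result from d2.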
Herleybob
Newbie
« Reply #7 on: August 30, 2021, 11:02:52 PM »
I use these guidelines when compiling code. 99% of the time I will be patching a function, or jumping out of the end of a call tree into my own code, so I can either use the registers that HighTec assigns or swap them out to avoid modifying existing ones. I like to use a new function per code segment I'm writing because Ghidra makes it easy to find. You will need to disassemble (obviously) and then inspect the asm before patching.

    #define CUSTOM_VARIABLE_ONE   (*((volatile unsigned char *) 0xD0003F11))
    #define CUSTOM_VARIABLE_TWO   (*((volatile unsigned char *) 0xD0003F12))
    #define CUSTOM_VARIABLE_THREE (*((volatile unsigned char *) 0xD0003F13))

    void PROCESS_EXAMPLE() {
        unsigned char adc = CUSTOM_VARIABLE_ONE;
        if (adc > CUSTOM_VARIABLE_TWO) {
            CUSTOM_VARIABLE_THREE = 2;
            return;
        } else {
            CUSTOM_VARIABLE_THREE = 1;
            return;
        }
    }
If you don't need to repeatedly access the address, just make a pointer to it.

    void COPY_PARAM_TO_RAM() {
        /* cast to the same type as the pointer being initialised */
        volatile short *delay_ms = (volatile short *)0xD0003F24;
        volatile short *const_delay_ms = (volatile short *)0x800A02F8;
        *delay_ms = *const_delay_ms;
    }

Also, I've found multiple functions that process 2D and 3D tables, returning various types (chars, shorts, longs, signed, unsigned, etc.). I created a faux function that stands in for the map interpolation function I want to use and takes the same arguments, so that I can call it and then just replace the call address when I disassemble and patch it in. (Hopefully that made sense; I'm just trying to interpolate my own 2D/3D tables, so I use the built-in functions.) Once I compile it and disassemble it in Ghidra, I know the call address of the actual interpolation functions, so I replace those in my asm and verify that it puts the correct variables in the registers before the call.

    unsigned char FUNC_PROCESS_2D_TABLE_8BIT(unsigned char AXIS_LENGTH, unsigned char MONITOR,
                                             volatile unsigned char *AXIS_VALUES,
                                             volatile unsigned char *TABLE_VALUES) {
        return MONITOR; // Ignore this, it isn't actually used when I disassemble (only here as a "placeholder")
    }

    unsigned char FUNC_PROCESS_2D_TABLE(unsigned char MONITOR, volatile unsigned char *TABLE) {
        return MONITOR; // Ignore this, it isn't actually used when I disassemble (only here as a "placeholder")
    }

    unsigned char INTERPOLATE_ALL_TABLE_ETHANOL(volatile unsigned char * TABLE_ONE,
                                                volatile unsigned char * TABLE_TWO) {
        FLEX_SWITCH_T1V = FUNC_PROCESS_2D_TABLE(ECT, TABLE_ONE);
        FLEX_SWITCH_T2V = FUNC_PROCESS_2D_TABLE(ECT, TABLE_TWO);
        return FUNC_PROCESS_2D_TABLE_8BIT(0x2, FLEX_ETH_PERC, &FLEX_SWITCH_AXIS, &FLEX_SWITCH_T1V);
    }
As a final example, I'm working on adding turn signals to a SxS, something that didn't originally come with the function. See below for my code. It might not be 100% correct, but it works and gives me the least amount of asm.

    void PROCESS_TURN_SIGNALS() {
        volatile char *tmp_status = (volatile char *)0xD0003F20;
        volatile char *flash_pos = (volatile char *)0xD0003F21;
        volatile char *flash_count = (volatile char *)0xD0003F22;
        volatile short *delay_ms = (volatile short *)0xD0003F24;
        volatile short *timer = (volatile short *)0xD0003F26;
        volatile short *num_flashes = (volatile short *)0xD0003F28;
        volatile short *triple_timer = (volatile short *)0xD0003F2C;

        switch (TURN_ADC_IN >> 2) {
        case 257 ... 512: // Left turn
            *flash_pos = 1;
            *flash_count = 0;
            *tmp_status = (*tmp_status & ~0b00000011) | (*flash_pos & 0b00000011);
            goto enable;
        case 513 ... 768: // Right turn
            *flash_pos = 2;
            *flash_count = 0;
            *tmp_status = (*tmp_status & ~0b00000011) | (*flash_pos & 0b00000011);
            goto enable;
        case 769 ... 1024: // Hazards
            *flash_pos = 3;
            *flash_count = 0;
            *tmp_status = (*tmp_status & ~0b00000011) | (*flash_pos & 0b00000011);
            goto enable;
        default:
            if (*flash_count > 0) {
                goto enable;
            }
            if (*num_flashes == 0 && *triple_timer >= *timer) {
                *flash_count = 5;
                goto enable;
            }
            *tmp_status = 12; // sets outputs to high at the start of flashing.
            *timer = 0;
            *num_flashes = 0;
            goto end;
        }

    enable:
        if (*delay_ms >= *timer) {
            *timer += 1;
            goto end;
        } else {
            *timer = 0;
            if (*flash_count != 0)
                *flash_count -= 1;
            *tmp_status ^= (*flash_pos << 2);
            (*num_flashes)++; // parentheses needed: *num_flashes++ would advance the pointer, not the count
            goto end;
        }

    end:
        return;
    }

In the end I disassemble my code, verify the flow looks correct, make sure it isn't overwriting any registers that are needed, and then patch it in and try it out. The above code needs no modification before patching in and trying.

EDIT: See below for some macros to work with bits. I have needed these quite a bit and just got the storebit working. These will translate into:
JZ.T/JNZ.T - for getting a bit
ST.T - for storing a bit

    #define GETBIT(var, bit) (((char)(var) >> (bit)) & 1)
    #define STOREBIT(addr,bpos,b) __asm("st.t %0,%1,%2"::"i"(addr),"i"(bpos),"i"(b))
Examples:
    #define NOTAREALADDRESS (*((volatile unsigned char *) 0xD00017AA))

    // Getting a bit
    char brakeStatus = GETBIT(NOTAREALADDRESS, 3);

    // Setting a bit
    STOREBIT(&NOTAREALADDRESS, 0x3, 0x1);
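For checking bit logic on a PC before patching, the same two operations can be sketched in plain C. The helper names `get_bit` and `store_bit` are hypothetical, and this is ordinary read-modify-write code: the compiler is not required to emit ST.T for it, so it is a test aid, not a drop-in replacement for the inline-asm macro.

```c
#include <assert.h>
#include <stdint.h>

/* Read one bit (0 or 1) from a byte. */
static inline uint8_t get_bit(uint8_t value, unsigned bit)
{
    assert(bit < 8u);
    return (uint8_t)((value >> bit) & 1u);
}

/* Set (b != 0) or clear (b == 0) one bit in a byte via read-modify-write. */
static inline void store_bit(volatile uint8_t *addr, unsigned bit, unsigned b)
{
    assert(bit < 8u);
    uint8_t mask = (uint8_t)(1u << bit);
    if (b)
        *addr = (uint8_t)(*addr | mask);
    else
        *addr = (uint8_t)(*addr & (uint8_t)~mask);
}
```

Using `uint8_t` for the value avoids any dependence on whether plain `char` is signed with a given compiler.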
« Last Edit: September 16, 2021, 11:50:00 PM by Herleybob »
fknbrkn
Hero Member
« Reply #8 on: January 22, 2025, 12:17:57 AM »
This is my first attempt at MED17 code: a simple routine to change the NLLM / NLLMGFS selection logic based on gwhpos (DSG mode). I just want to confirm my workflow with more experienced users.

The stock selection is based on b_fs (gearbox type, IIRC), but the file I'm working on is a great example of today's lazy-optimization strategy... Anyway, the code looks like this:

    <...some axis things ..>
    PFLASH:800D88F6 D9 F4 3E 00    lea a4, [a15](nllmgfs_map - unk_801E117C)
    PFLASH:800D88FA 02 26          mov16 d6, d2
    PFLASH:800D88FC 59 02 74 39    st32.w [a0](unk_D0004EF4 - unk_D000BA00), d2 ; axis
    PFLASH:800D8900 09 C4 00 08    ld.b d4, [a12]0
    PFLASH:800D8904 6D 01 93 81    call32 func_map3d_8bit
    PFLASH:800D8908 02 F5          mov16 d5, d15
    PFLASH:800D890A D9 F4 08 00    lea a4, [a15](nllm_map - unk_801E117C)
    PFLASH:800D890E 09 C4 00 08    ld.b d4, [a12]0
    PFLASH:800D8912 02 28          mov16 d8, d2 ; result of NLLMGFS goes to d8
    PFLASH:800D8914 19 06 74 39    ld32.w d6, [a0](unk_D0004EF4 - unk_D000BA00)
    PFLASH:800D8918 6D 01 89 81    call32 func_map3d_8bit
    PFLASH:800D891C 05 DF D7 24    ld32.bu d15, byte_D0000C97 ; b_fs here
    PFLASH:800D8920 87 FF 42 F1    nor.t d15, d15:2, d15:2
    PFLASH:800D8924 2B 82 40 FF    sel d15, d15, d2, d8 ; both maps are processed but only one result is selected; d2 for NLLM and d8 for NLLMGFS
    PFLASH:800D8928 25 DF 44 63    st32.b nsolbas, d15

So the idea is to replace the sel d15, d15, d2, d8 with a call to my routine:

    PFLASH:800D8924 ED 8B 00 B8    calla sub_80177000

and return the selected value in d15. Since both maps are already calculated and stored in d2 and d8, I decided to simplify the routine by using predefined asm instructions. My project looks like this (options: -ffixed-d15 -ffixed-a15 -O2 -O3 -mcpu=tc1767):

    #define gwhpos (*((volatile char *) 0xD000370C ))
    #define asm_(dst) __asm__(dst)

    void getNLLMVariant() {
        asm_("mov %d6, %d2");
        if ((gwhpos == 12) || (gwhpos == 14)) {
            asm_("mov %d15, %d8");
        } else {
            asm_("mov %d15, %d6");
        }

        return; //tmp;
    }

and the listing:

    0000 0226        mov %d6,%d2
    0002 05D24CC3    ld.b %d2,0xd000370c
    0006 8BC20022    eq %d2,%d2,12
    000a F628        jnz %d2,.L3
    000c 05D24CC3    ld.b %d2,0xd000370c
    0010 8BE22022    ne %d2,%d2,14
    0014 7623        jz %d2,.L3
    0016 026F        mov %d15,%d6
    0018 0090        ret
    .L3:
    001a 028F        mov %d15,%d8
    001c 0090        ret

and finally in IDA:

    PFLASH:80177000 sub_80177000:                 ; CODE XREF: sub_800D88B8+6C↑p
    PFLASH:80177000 02 26          mov16 d6, d2
    PFLASH:80177002 05 D2 4C C3    ld.b d2, gwhpos
    PFLASH:80177006 8B C2 00 22    eq32 d2, d2, #12
    PFLASH:8017700A F6 28          jnz16 d2, loc_8017701A
    PFLASH:8017700C 05 D2 4C C3    ld.b d2, gwhpos
    PFLASH:80177010 8B E2 20 22    ne d2, d2, #14
    PFLASH:80177014 76 23          jz16 d2, loc_8017701A
    PFLASH:80177016 02 6F          mov16 d15, d6
    PFLASH:80177018 00 90          ret16
    PFLASH:8017701A ; ---------------------------------------------------------------------------
    PFLASH:8017701A
    PFLASH:8017701A loc_8017701A:                 ; CODE XREF: sub_80177000+A↑j
    PFLASH:8017701A                               ; sub_80177000+14↑j
    PFLASH:8017701A 02 8F          mov16 d15, d8
    PFLASH:8017701C 00 90          ret16
    PFLASH:8017701C ; End of function sub_80177000

That's the only way I found to avoid using registers that hold data used by the code after my call. Is it OK at all? I'm not sure about variable sizing: I defined gwhpos as char and it compiles to ld.b, but the stock routines use ld32.bu, and I'm a bit confused because gwhpos is an 8-bit (char) value in RAM.
« Last Edit: January 22, 2025, 12:30:31 AM by fknbrkn »
prj
« Reply #9 on: January 22, 2025, 04:48:51 AM »
No, this is not OK. Read the EABI and the instruction set manual entry for CALL. d15 gets saved on the call and restored on the ret, so any modification you make to d15 inside the routine will not affect anything. Your code does nothing.

Overwrite the st32.b to nsolbas with a call and store whatever you like in there. Don't use asm in the C code when it can be avoided, just use straight C.

Because you don't care about b_fs anymore, nor about the bit extraction or the selection, just overwrite the nor and the sel with two moves and a call:

    mov d4, d2
    mov d5, d8
    call getNLLMVariant

And finally change the st32.b to store d2 instead of d15. The routine then looks like this:

    uint8_t getNLLMVariant(uint8_t nllm_value, uint8_t nllmgfs_value) {
        return ((gwhpos == 12) || (gwhpos == 14)) ? nllmgfs_value : nllm_value;
    }

Work smarter :p

Alternatively, you can also move the address of nsolbas into a4, add a pointer argument, and do the store directly in the routine.
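As a rough way to test the selection logic off-target, here is a sketch of the same routine with the gear-position byte passed by pointer instead of read from a hard-coded address. The name `select_nllm` and the extra `gear_pos` parameter are assumptions for illustration; per the convention discussed earlier in the thread, that pointer would be expected in a4.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical variant of the routine above: the gear-position byte is
 * passed by pointer so the logic can be exercised on a PC. */
uint8_t select_nllm(uint8_t nllm_value, uint8_t nllmgfs_value,
                    const volatile uint8_t *gear_pos)
{
    assert(gear_pos != 0);      /* host-side sanity check only */
    uint8_t pos = *gear_pos;    /* read the volatile byte once */
    return (pos == 12 || pos == 14) ? nllmgfs_value : nllm_value;
}
```

Copying the volatile byte into a local first means it is loaded once; comparing the volatile directly twice forces two loads, which is what the two ld.b instructions in the earlier listing show.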
prj
« Reply #10 on: January 22, 2025, 04:54:30 AM »
"im not sure about sizing of variables, ive defined gwhpos as char and it compiles as ld.b operand but stock routines using ld32.bu and im a bit confused here due to gwhpos is a 8 bit (char) value in a RAM"

It is probably UBYTE. Plain char is signed with this compiler, which is why you get ld.b instead of ld.bu. Anyway, I recommend using stdint:

    #include <stdint.h>

And use the stdint types.
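The difference between the two loads only shows up for byte values of 0x80 and above. A small sketch, with hypothetical helper names, of what sign extension (as LD.B does) and zero extension (as LD.BU does) produce for the same raw byte:

```c
#include <stdint.h>

/* Widen a raw byte as a signed value (sign-extended, like LD.B). */
int32_t widen_signed(uint8_t raw)
{
    return raw >= 0x80 ? (int32_t)raw - 256 : (int32_t)raw;
}

/* Widen a raw byte as an unsigned value (zero-extended, like LD.BU). */
int32_t widen_unsigned(uint8_t raw)
{
    return (int32_t)raw;
}
```

For a value like gear position 12 or 14 both give the same result, so the comparison still works, but a byte such as 0xC8 reads as 200 unsigned and as -56 signed. Declaring the variable as `uint8_t` makes the compiler match the stock code's unsigned load.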
fknbrkn
Hero Member
« Reply #11 on: January 23, 2025, 02:10:00 AM »
"The conventions described here assume the use of the TriCore call / return mechanism, which automatically saves registers D[8] through D[15] and A[10] through A[15] as a side effect of the CALL instruction, and restores them as a side effect of the RET instruction. The registers saved automatically include the stack pointer A[10], so a called function requires no epilog to restore the caller's stack pointer value prior to returning."

Oh, I see now, those are kind of like local variables. A bit tricky after ME7, but definitely much more comfortable for the developer.

While analyzing another tuner's modifications I wondered why he doesn't use CALLA, and instead uses JUMPA and then a JUMPA back to the stock routine. Probably he hasn't read the EABI either, lol.

Is it safe to run multiple nested calls, by the way? Should I care about stack overflow?

Thanks for the input prj, brilliant as usual.
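The effect described in the quote can be illustrated with a toy model on a PC. This is only a sketch: the `Cpu` struct, `call_with_context` and `callee` are hypothetical names, only the data registers are modelled, and the real hardware saves the context in CSA memory, not on a C stack.

```c
#include <stdint.h>

/* Toy model: only d0..d15 are modelled. The real upper context also
 * holds a10..a15, PSW and PCXI. */
typedef struct {
    uint32_t d[16];
} Cpu;

typedef void (*Routine)(Cpu *);

/* Model CALL/RET: save d8..d15 on entry, restore them on return. */
void call_with_context(Cpu *cpu, Routine fn)
{
    uint32_t saved[8];
    for (int i = 0; i < 8; i++)
        saved[i] = cpu->d[8 + i];     /* side effect of CALL */
    fn(cpu);
    for (int i = 0; i < 8; i++)
        cpu->d[8 + i] = saved[i];     /* side effect of RET */
}

/* Example callee: writes d15 (discarded on return) and d2 (kept). */
void callee(Cpu *cpu)
{
    cpu->d[15] = 0xAAu;
    cpu->d[2]  = 0xBBu;
}
```

After `call_with_context(&cpu, callee)`, d15 holds whatever the caller had in it before the call, while d2 carries the callee's value back, which is why results are returned in d2 and not in d15.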
« Last Edit: January 23, 2025, 02:14:11 AM by fknbrkn »
prj
« Reply #12 on: January 23, 2025, 05:19:39 AM »
Stack is usually deeper than you think.
Jumps are used because:
a) No room to place call
b) Lazy
c) Don't know EABI
fknbrkn
Hero Member
« Reply #13 on: January 24, 2025, 05:04:55 PM »
The code works perfectly. Also, since I need to make asm changes in the original code (CALLA, mov, etc.), I found a way to do it with asm in C:

    #define asm_(dst) __asm__(dst)

    asm_("mov %d4, %d2");
    asm_("mov %d5, %d15");
    asm_("CALLA 0x80177000");

which gives me a listing with the hex codes for that part:

    0000 0224        mov %d4,%d2
    0002 02F5        mov %d5,%d15
    0004 ED8B00B8    CALLA 0x80177000

By the way, what are the best practices for importing changes into the original file?
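If you only need the four CALLA bytes, they can also be computed directly. The sketch below assumes the 24-bit absolute-address layout (address bits 31..28 and 20..1 encodable, all other bits zero) and the byte order seen in the listing above; `encode_calla` is a hypothetical helper, it has only been checked against the single ED 8B 00 B8 example in this thread, so verify it against the instruction set manual before relying on it.

```c
#include <assert.h>
#include <stdint.h>

/* Encode CALLA (opcode 0xED) for an absolute target address.
 * Assumed layout: disp24 = { addr[31:28], addr[20:1] }, stored as
 * opcode, disp24[23:16], disp24[7:0], disp24[15:8]. */
void encode_calla(uint32_t target, uint8_t out[4])
{
    /* bits 27..21 and bit 0 cannot be encoded and must be zero */
    assert((target & 0x0FE00001u) == 0);

    uint32_t disp24 = ((target >> 28) << 20) | ((target >> 1) & 0xFFFFFu);
    out[0] = 0xED;
    out[1] = (uint8_t)(disp24 >> 16);
    out[2] = (uint8_t)(disp24 & 0xFFu);
    out[3] = (uint8_t)((disp24 >> 8) & 0xFFu);
}
```

For the target 0x80177000 this produces ED 8B 00 B8, matching the listing.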
« Last Edit: January 25, 2025, 06:26:52 AM by fknbrkn »