Author Topic: Disassembling MED/EDC17  (Read 75511 times)
prj
Hero Member
*****

Karma: +915/-427
Offline Offline

Posts: 5840


« Reply #75 on: January 21, 2024, 11:05:25 AM »

Quote
I cannot tell you if they have dropped it or not on a certain binary. But for UDS there might be a similar table.

This is irrelevant on MEDC17 outside of TP2.0.
On UDS on VAG every ODX ID can have completely arbitrary identifiers assigned to the variables.
I currently have 16220 definitions in my logger for VAG/Porsche UDS identifiers ($22) for this reason. There are no shortcuts.

It makes no sense to parse the tables when all the IDs can have a random offset from one software version to the next.
Often you cannot infer any information at all between different ODX IDs.
That ship sailed with MED9.
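
For readers unfamiliar with the $22 service mentioned above, here is a minimal sketch of how a ReadDataByIdentifier request and its positive response are framed under ISO 14229. The function names are invented for illustration, and the identifier and payload in the usage note are made-up values, not taken from any real definition. Transport framing (ISO-TP) is left out.

```python
def build_read_did_request(did: int) -> bytes:
    """Frame a UDS ReadDataByIdentifier ($22) request for one 16-bit DID."""
    if not 0 <= did <= 0xFFFF:
        raise ValueError("DID must fit in 16 bits")
    return bytes([0x22, did >> 8, did & 0xFF])


def parse_read_did_response(did: int, response: bytes) -> bytes:
    """Return the raw data bytes of a positive ($62) response for `did`."""
    # A negative response is 0x7F, the request SID, then a response code.
    if len(response) >= 3 and response[0] == 0x7F:
        raise RuntimeError(f"negative response, NRC 0x{response[2]:02X}")
    expected = bytes([0x62, did >> 8, did & 0xFF])
    if response[:3] != expected:
        raise ValueError("response does not match the requested DID")
    return response[3:]
```

For example, `build_read_did_request(0xF40C)` yields the three bytes `22 F4 0C`. The framing is fixed by the standard; what varies per ODX ID is which variable sits behind which identifier, and that is the part that needs the large definition set described above.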

Also, even on MED9, parsing those things to find RAM cells is a waste of time anyway when you can do this:
Code:
[20:08:28] ######## SOURCE BIN ########
[20:08:28] Loading and parsing files...
[20:08:28] Removing bitfields...
[20:08:28] Code segments:
[20:08:28] 0x005500 - 0x00B3C5 Len: 0x005EC6 70: 95%%
[20:08:28] 0x00B3EC - 0x0573F4 Len: 0x04C009 70: 101%%
[20:08:28] 0x057400 - 0x062DD8 Len: 0x00B9D9 70: 107%%
[20:08:28] 0x090E04 - 0x09E787 Len: 0x00D984 70: 100%%
[20:08:28] 0x09FFC0 - 0x0AB161 Len: 0x00B1A2 70: 82%%
[20:08:28] 0x0AC1B8 - 0x0B9C08 Len: 0x00DA51 70: 33%%
[20:08:28] 0x0B9C10 - 0x0D9BF7 Len: 0x01FFE8 70: 91%%
[20:08:28] 0x0D9C18 - 0x0E49E7 Len: 0x00ADD0 70: 108%%
[20:08:28] 0x0E49F0 - 0x0E7EFE Len: 0x00350F 70: 120%%
[20:08:28] 0x0E7F08 - 0x0EC510 Len: 0x004609 70: 105%%
[20:08:28] 0x103FE0 - 0x106B7C Len: 0x002B9D 70: 93%%
[20:08:28] 0x106B8C - 0x10ADAC Len: 0x004221 70: 83%%
[20:08:28] 0x10AE40 - 0x116B18 Len: 0x00BCD9 70: 87%%
[20:08:28] 0x118264 - 0x11AEBC Len: 0x002C59 70: 81%%
[20:08:28] 0x120000 - 0x124C29 Len: 0x004C2A 70: 88%%
[20:08:28] 0x124C30 - 0x1303B2 Len: 0x00B783 70: 61%%
[20:08:28] 0x1305A8 - 0x134250 Len: 0x003CA9 70: 101%%
[20:08:28] 0x138C70 - 0x13E11C Len: 0x0054AD 70: 106%%
[20:08:28] 0x13F138 - 0x212BD4 Len: 0x0D3A9D 70: 85%%
[20:08:28] r2: 0x5C9FF0
[20:08:28] r13: 0x7FFFF0
[20:08:28] Parsing instructions...
[20:08:29] Got 16762 refs
[20:08:29] Matching refs to A2L...
[20:08:29] A2L addresses: 10762, with refs: 6870
[20:08:29] ######## TARGET BIN ########
[20:08:29] Code segments:
[20:08:29] 0x005500 - 0x00C391 Len: 0x006E92 70: 95%%
[20:08:29] 0x00C3B8 - 0x065BE8 Len: 0x059831 70: 102%%
[20:08:29] 0x090E04 - 0x09E787 Len: 0x00D984 70: 100%%
[20:08:29] 0x09FFC0 - 0x0A7DE4 Len: 0x007E25 70: 84%%
[20:08:29] 0x0A7DF0 - 0x0AB2F1 Len: 0x003502 70: 76%%
[20:08:29] 0x0AC348 - 0x0B9D98 Len: 0x00DA51 70: 33%%
[20:08:29] 0x0B9DA0 - 0x0DB244 Len: 0x0214A5 70: 89%%
[20:08:29] 0x0DB250 - 0x0DC697 Len: 0x001448 70: 117%%
[20:08:29] 0x0DC6B8 - 0x0E7487 Len: 0x00ADD0 70: 108%%
[20:08:29] 0x0E7490 - 0x0EA99E Len: 0x00350F 70: 120%%
[20:08:29] 0x0EA9A8 - 0x0EF0A0 Len: 0x0046F9 70: 104%%
[20:08:29] 0x103FE0 - 0x106B7C Len: 0x002B9D 70: 93%%
[20:08:29] 0x106B8C - 0x10ADAC Len: 0x004221 70: 83%%
[20:08:29] 0x10AE40 - 0x116B18 Len: 0x00BCD9 70: 87%%
[20:08:29] 0x118264 - 0x11AEBC Len: 0x002C59 70: 81%%
[20:08:29] 0x120000 - 0x1253F9 Len: 0x0053FA 70: 87%%
[20:08:29] 0x125400 - 0x1308C6 Len: 0x00B4C7 70: 61%%
[20:08:29] 0x130ABC - 0x1345C0 Len: 0x003B05 70: 105%%
[20:08:29] 0x139278 - 0x13E7F4 Len: 0x00557D 70: 106%%
[20:08:29] 0x13F810 - 0x21557F Len: 0x0D5D70 70: 85%%
[20:08:29] r2: 0x5C9FF0
[20:08:29] r13: 0x7FFFF0
[20:08:29] Parsing instructions...
[20:08:29] Got 16897 refs
[20:08:29] ######## MATCHING ########
[20:08:29] Found 90175 derived references via COPYTRACK in SOURCE.
[20:08:30] Found 73615 derived references via COPYTRACK in TARGET.
[20:08:30] Total refs to match - Source: 87017 Target: 88199
[20:08:30] Running REFMATCH: 0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%
[20:08:31] Matched 83803 references.
[20:08:31] Conflict resolution...OK
[20:08:32] cnt 180
[20:08:32] instrsum 50
[20:08:32] discard 2
[20:08:32] doublerefs 40
[20:08:32]
[20:08:32] Matched 6701 addresses via REFMATCH, out of them 1126 single.
[20:08:32] Array fill...  OK, Got 2 addresses.
[20:08:32] Fill...
[20:08:32] Matched 2488 addresses via FILL.
[20:08:32] Total matches: 9191
[20:08:32] 252/9192 with same address.
[20:08:32] Writing ECU file: C:\Tuning\Tools\tsunami\ecus\52AC6A0KG_44L1.ecu
[20:08:32] Writing ADR file: C:\Tuning\Tools\tsunami\ecus\52AC6A0KG_44L1.adr
[20:08:32] Writing GEN file: C:\Tuning\Tools\tsunami\ecus\52AC6A0KG_44L1.gen
[20:08:32] FINISHED, time: 00:00:03

Completely automatically... and it's not limited to VAG or Bosch.
« Last Edit: January 21, 2024, 11:17:04 AM by prj » Logged

PM's will not be answered, so don't even try.
Log your car properly.
elias
Full Member
***

Karma: +17/-3
Offline Offline

Posts: 59


« Reply #76 on: January 21, 2024, 12:12:17 PM »


What is this tool and where can I find it? It seems like something that could make life easier for a lot of people.
Logged
prj
Hero Member
*****

Karma: +915/-427
Offline Offline

Posts: 5840


« Reply #77 on: January 21, 2024, 04:10:30 PM »

Quote
What is this tool and where can I find it? It seems like something that could make life easier for a lot of people.

Nowhere. It is the engine that makes my logger possible, and it is how I have measurement data for almost any car.
I just gave an idea of what to work towards. I don't see much gain in writing parsers for specific tables and specific ECUs. The issue can be approached in a completely different manner.
Logged

elias
Full Member
***

Karma: +17/-3
Offline Offline

Posts: 59


« Reply #78 on: January 22, 2024, 07:04:45 AM »

OK, I get your point. I'm not sure exactly what you are doing, but I have the following idea for how to implement such an "A2L converter". Challenge me:

1. Let's assume you have a binary A with an A2L and a binary B without one. Your goal is to correctly guess as many memory variables in binary B as possible.
2. Let's assume you have already loaded both into Ghidra, got the memory map right, and correctly set the registers used for indirect addressing.
3. You then export the Ghidra output (P-code / C listing) of every function in binary A to a database.
4. Do the same with binary B, but into a different database.
5. Then you start comparing the function texts from database A with those from database B. You could remove all variable names and function calls from the text and then compare the two texts. If they match, you know it is the same code, just with different variable memory locations. It is more complicated than that for sure, but it is a starting point, and this is where the "magic" must happen.
6. As soon as the function mapping is correct, you can see which variables are used in a particular function in binary A (using the A2L) and name them.
7. Then you can rename the variables in binary B to the same names (since you know which functions match up).
8. Export these variables in A2L format somehow.
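
The comparison step in the list above can be sketched as follows. This is only an illustration under stated assumptions: it presumes Ghidra-style auto-generated names (`FUN_<addr>` for functions, `DAT_<addr>` for globals) in the exported listings, and the helper names are made up. It does exact matching after masking, nothing smarter.

```python
import hashlib
import re

# Assumed naming scheme of the exported listings (Ghidra-style defaults).
_FUNC = re.compile(r"\bFUN_[0-9a-fA-F]+\b")
_DATA = re.compile(r"\bDAT_[0-9a-fA-F]+\b")


def normalize(listing: str) -> str:
    """Mask address-dependent names and collapse whitespace."""
    text = _FUNC.sub("FUNC", listing)
    text = _DATA.sub("VAR", text)
    return " ".join(text.split())


def fingerprint(listing: str) -> str:
    return hashlib.sha256(normalize(listing).encode()).hexdigest()


def match_functions(bin_a: dict[str, str], bin_b: dict[str, str]) -> dict[str, str]:
    """Pair functions whose masked text is identical and unique on both sides."""
    def index(funcs: dict[str, str]) -> dict[str, list[str]]:
        by_hash: dict[str, list[str]] = {}
        for name, listing in funcs.items():
            by_hash.setdefault(fingerprint(listing), []).append(name)
        return by_hash

    idx_a, idx_b = index(bin_a), index(bin_b)
    return {
        names_a[0]: idx_b[h][0]
        for h, names_a in idx_a.items()
        if len(names_a) == 1 and len(idx_b.get(h, [])) == 1
    }


def transfer_variables(listing_a: str, listing_b: str) -> dict[str, str]:
    """Pair globals positionally in two listings that mask to the same text."""
    if normalize(listing_a) != normalize(listing_b):
        raise ValueError("listings do not match after masking")
    return dict(zip(_DATA.findall(listing_a), _DATA.findall(listing_b)))
```

Exact hashing only pairs functions whose text is identical once names are masked, so any change in compiler or optimization settings defeats it. Ambiguous fingerprints (small getters, stubs) are deliberately dropped rather than guessed.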

« Last Edit: January 22, 2024, 07:11:33 AM by elias » Logged
prj
Hero Member
*****

Karma: +915/-427
Offline Offline

Posts: 5840


« Reply #79 on: January 22, 2024, 08:09:45 AM »

I don't use any 3rd party tools, the entire disassembly and analysis process is done by my tool.
So that means it has disassemblers for all the architectures.
Also, it automatically detects all the code segments based on statistical analysis (that is actually quite simple).

Comparing Ghidra text is a waste of time, because if the compiler or the optimization settings change, the output is completely different.
Some PhD researchers from the USA who specialize in this kind of thing tried, and they got less than 30% of the matches of my tool. I was looking to increase the accuracy, but this turned out not to be a viable approach.
Not to mention Ghidra is insanely slow compared to this.

So a different approach needs to be done - for starters cataloguing all the memory references, accesses, copies and categorizing instructions into different subtypes.
Then looking through another binary and trying to find the same places based on the pattern of the pseudocode (or even more crude methods).
Then you need strong false positive rejection algorithms.
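
To make the idea above concrete, here is a rough sketch of reference-based matching with a simple rejection rule. It is not the method used by the tool in the log, which is not disclosed. The signature format (a tuple of instruction subtypes around an access) and the thresholds are hypothetical.

```python
from collections import Counter, defaultdict

# A reference is (ram_address, signature). The signature is any
# address-independent description of the access site; its format here
# is an assumption made for illustration.
Ref = tuple[int, tuple[str, ...]]


def match_addresses(source: list[Ref], target: list[Ref],
                    min_votes: int = 2, min_ratio: float = 2.0) -> dict[int, int]:
    """Vote source addresses onto target addresses via shared signatures."""
    target_by_sig = defaultdict(set)
    for addr, sig in target:
        target_by_sig[sig].add(addr)

    votes = defaultdict(Counter)
    for addr, sig in source:
        candidates = target_by_sig.get(sig, ())
        # Only signatures that point at exactly one target address vote.
        if len(candidates) == 1:
            votes[addr][next(iter(candidates))] += 1

    matches = {}
    for addr, counter in votes.items():
        ranked = counter.most_common(2)
        best, best_votes = ranked[0]
        runner_up = ranked[1][1] if len(ranked) > 1 else 0
        # Rejection rule: enough votes, and a clear lead over the runner-up.
        if best_votes >= min_votes and best_votes >= min_ratio * runner_up:
            matches[addr] = best
    return matches
```

The rejection rule trades recall for precision: an address referenced only once is dropped at the default settings, which is the sort of case a later fill pass over neighbouring addresses could recover.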

And before you even start you need some kind of A2L parser engine....

The log is all that there is to see: bin and A2L (preprocessed) dragged into one box, another bin dragged into another, and then a button pushed.
There are no hidden steps, no Ghidra, no IDA, no nothing, and no pre-processing of the binary data. The only things that are pre-processed are the A2Ls, into a much more efficient format (because I do that for the logger anyway).
From start of button push until end of the process it is 3 seconds. This includes detection and disassembly of all the code areas and the algos mentioned above.
Logged

elias
Full Member
***

Karma: +17/-3
Offline Offline

Posts: 59


« Reply #80 on: January 22, 2024, 09:21:20 AM »

Quote
Also, it automatically detects all the code segments based on statistical analysis (that is actually quite simple).
Can you elaborate on what code segments are? Do you mean these "Bosch blocks"?

Quote
So a different approach needs to be done - for starters cataloguing all the memory references, accesses, copies and categorizing instructions into different subtypes.
Quote
Then looking through another binary and trying to find the same places based on the pattern of the pseudocode (or even more crude methods).
Then you need strong false positive rejection algorithms.
Comparing Ghidra text was just an example; the matching algorithm is the secret sauce. It was just an idea.

Quote
And before you even start you need some kind of A2L parser engine....
I have a self-made A2L parser "engine" which produces JSON files:
Code:
    "AACCMNOFF": {
        "name": "AACCMNOFF",
        "description": "\"Offset auf neg. Beschleunigungsbegrenzung\"",
        "type": "static_variable",
        "addr": "0x5dff1c",
        "size": 2
    }
I think I have shared it somewhere already; if not, I can upload it, no secrets here. I am using it to compare MED9 binaries with each other.
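
As a small illustration of working with that JSON, the sketch below loads entries of the shape shown above and indexes them by address. The outer braces around the sample and the helper names are assumptions; the real files may be organized differently.

```python
import json

# The sample entry from the post, wrapped in an object (assumed layout).
A2L_JSON = r"""
{
    "AACCMNOFF": {
        "name": "AACCMNOFF",
        "description": "\"Offset auf neg. Beschleunigungsbegrenzung\"",
        "type": "static_variable",
        "addr": "0x5dff1c",
        "size": 2
    }
}
"""


def load_variables(text: str) -> dict[int, dict]:
    """Index A2L-derived entries by their integer start address."""
    entries = json.loads(text)
    return {int(entry["addr"], 16): entry for entry in entries.values()}


def find_by_address(index: dict[int, dict], addr: int):
    """Return the entry whose [addr, addr + size) range contains addr."""
    for start, entry in index.items():
        if start <= addr < start + entry["size"]:
            return entry
    return None
```

An address-keyed index is the form that is useful for the matching discussed in this thread, since a disassembler yields addresses, not names.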


@prj: Let's say you were a hobbyist and wanted some A2Ls that do not exist, for your hobbyist needs.
Which approach would you choose for the "A2L transfer tool"? Your time is limited and you cannot spend years in R&D developing the perfect secret-sauce method.
« Last Edit: January 22, 2024, 09:24:59 AM by elias » Logged
prj
Hero Member
*****

Karma: +915/-427
Offline Offline

Posts: 5840


« Reply #81 on: January 22, 2024, 09:44:23 AM »

Quote
Can you elaborate on what code segments are? Do you mean these "Bosch blocks"?
No, I mean literal chunks in the binary that are code. You need to somehow separate data from code in the file when you know nothing about it. You only want to process the code, not attempt a (futile) disassembly of data areas.
Bosch has nothing to do with it, my tool doesn't care if it's Bosch, Conti, Kefico or something else.
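
One way such a code/data split could be approached is sketched below. This is a guess at the general idea only, not the tool's algorithm. It assumes a big-endian PowerPC image (as on MED9), scores fixed windows by how many 32-bit words carry a primary opcode that is common in integer code, and uses an opcode set and threshold that are illustrative, not tuned.

```python
import struct

# Primary opcodes (top 6 bits) that dominate typical PowerPC integer code:
# cmpli, cmpi, addi, addis, bc, b, XL-form, rlwinm, ori, X-form,
# lwz, lbz, stw, stb, lhz, sth. Chosen for illustration only.
COMMON_OPCODES = {10, 11, 14, 15, 16, 18, 19, 21, 24, 31, 32, 34, 36, 38, 40, 44}


def code_score(window: bytes) -> float:
    """Fraction of big-endian 32-bit words whose primary opcode looks like code."""
    count = len(window) // 4
    if count == 0:
        return 0.0
    words = struct.unpack(f">{count}I", window[: count * 4])
    return sum((w >> 26) in COMMON_OPCODES for w in words) / count


def find_code_segments(image: bytes, window: int = 256, threshold: float = 0.8):
    """Return (start, end) offsets of runs of windows scoring above threshold."""
    segments, start = [], None
    for off in range(0, len(image), window):
        is_code = code_score(image[off:off + window]) >= threshold
        if is_code and start is None:
            start = off
        elif not is_code and start is not None:
            segments.append((start, off))
            start = None
    if start is not None:
        segments.append((start, len(image)))
    return segments
```

A fixed window gives coarse boundaries. A real implementation would refine segment edges and would need a different statistic for variable-length instruction sets such as TriCore.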

Quote
@prj: Let's say you were a hobbyist and wanted some A2Ls that do not exist, for your hobbyist needs.
Which approach would you choose for the "A2L transfer tool"? Your time is limited and you cannot spend years in R&D developing the perfect secret-sauce method.

For logging you buy my tool; then you use IDA to locate the things you need, based on a similar A2L. Usually you're not going to need all variables, but only very few of them.
IDA's binary search (Alt+B) takes masks, so you can mask out registers, data, etc. You need some familiarity with the binary opcode encoding, of course.
I mean, technically you could also sniff the CAN bus for the variables you need; the $2C specification is not exactly a secret.
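
For anyone without IDA at hand, the masked search described above can be approximated in a few lines. This sketch is a stand-in, not IDA's implementation: it uses a made-up pattern syntax with `??` as a whole-byte wildcard, so it cannot mask individual bit fields such as a register number inside a byte.

```python
def parse_mask(pattern: str) -> list:
    """Turn 'A1 ?? 3C' into [0xA1, None, 0x3C]; None marks a wildcard byte."""
    return [None if tok == "??" else int(tok, 16) for tok in pattern.split()]


def find_masked(data: bytes, pattern: str) -> list[int]:
    """Return every offset where the pattern matches, ignoring wildcards."""
    mask = parse_mask(pattern)
    hits = []
    for off in range(len(data) - len(mask) + 1):
        if all(m is None or data[off + i] == m for i, m in enumerate(mask)):
            hits.append(off)
    return hits
```

The workflow is then: take the bytes around a known variable access in the binary that has an A2L, wildcard the address and register bytes, and search for the same pattern in the binary that has none.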

But if you just want to work on one ECU, then this is the way to go. Building any tools is a waste of time in that case. I already spent the 1y+ of R&D on it and made datalogging accessible to everyone for what I think is a quite reasonable fee, and my licensing allows hobbyists to purchase perpetual licenses for their cars so they don't have to keep paying monthly fees. There are also no setup fees like dropping 5k on a flash tool. I think it's extremely friendly towards hobbyists, and I don't think it's cost effective for anyone to develop their own solution, at least for datalogging, at this point. Maybe only to satisfy curiosity, or as a highly specialized dedicated package for a handful of platforms.
Logged

elias
Full Member
***

Karma: +17/-3
Offline Offline

Posts: 59


« Reply #82 on: January 22, 2024, 05:19:03 PM »

Okay, my mistake, I didn't state my goals properly:

Currently I want to upgrade from 1Q0 907 115C to 1Q0 907 115F. For 115C I have a suitable A2L, but not for 115F. So naturally the question arises: how do I get an A2L for the 115F?

Why do I need to upgrade? Well, I hope they fixed some bugs there, and it's always good to have the newest software, right?

Why do I need an A2L? Well, it would make my life easier when writing (new) features (like the mapswitch). I could do it manually, but it's a lot of effort.

Besides: I appreciate your work on your logger, but it won't help me at the moment. I have never logged my car beyond watching some measurement values in VCDS. I have never tuned my car; it came with a tune from the previous owner. I am mostly working on some features as a hobby, and I need A2Ls for further development. I have also sorted through one of the "collections" using my "med9info" plus some scripting to find A2Ls, but unfortunately it did not contain the ones I wanted.

What would be your approach to getting your hands on A2Ls which do not exist in the wild? Obviously it would be writing some tool, but without spending years of work on it.
Logged
prj
Hero Member
*****

Karma: +915/-427
Offline Offline

Posts: 5840


« Reply #83 on: January 23, 2024, 12:21:22 AM »

Quote
What would be your approach to getting your hands on A2Ls which do not exist in the wild? Obviously it would be writing some tool, but without spending years of work on it.
Either find somewhere to buy it, or if you can't then spend years writing a tool.

If not, then deal with it and use IDA with binary matching to find the things you need.
Logged

elias
Full Member
***

Karma: +17/-3
Offline Offline

Posts: 59


« Reply #84 on: January 27, 2024, 05:54:23 AM »

Okay, I put together a tool which is completely suitable for my needs.

I hope it helps the MED17 folks. The approach is written down there, and it is very similar to the one you described, prj:
https://github.com/EliasKotlyar/A2L-Transfer

The heavy lifting is done by BinDiff, which matches up binary chunks for me. I don't have to deal with that myself, and it produces reliable results. I tested it on various MED9.1 A2Ls and it looks good. There are some limitations due to the "hobbyist" approach and not spending too much time on configuring all cases. However, it is enough to match a large portion of the variables, and that is completely sufficient for me. Even if it does not work or produces garbage in some cases, I get a good overview from the matching variables and can work more "efficiently" than with no A2L at all.
Logged
prj
Hero Member
*****

Karma: +915/-427
Offline Offline

Posts: 5840


« Reply #85 on: January 27, 2024, 10:34:41 AM »

MED9 is actually super easy to do and the code base differs very little. Same for ME7.

MED17 needs a lot more processing.
Your thing will work only for MED9 that you have hardcoded register values for (because they are the same for nearly all MED9 binaries).
On MED17 and MG1 they change a lot depending on the size of smalldata1 and 2. Also, you hardcoded the flash layout; just as an example, for VAG alone there are well over 100 different flash layouts on newer ECUs. Never mind the other makes and models.

Bindiff looks actually very nice.

Unfortunate thing is that it needs a Ghidra or IDA project to work. This is very slow on newer ECU's, Ghidra can easily take half an hour to process a binary of a more modern ECU.
With IDA you need manual scripts and logic to process the globals, at least on TriCore. You also need to feed it info about where the code segments are and at what address to load the binary.

So I would say your thing is not useful for VAG MED17 or MG1. Only for VAG MED9.1, but already not for MED9.1.2 that has intflash, because that requires manually setting everything again...

I'd say if you're going to use bindiff, then I don't see much point in your stuff. You can just use it with IDA directly and transfer the symbols.
All you did is call ghidra and bindiff and hardcode some MED9 locations - the same could have been done by loading two binaries in ida, using my script to populate the symbols from a2l and then use bindiff directly.

In case of MED17/MG1 you will have to load the binary anyway to find the globals or you need a disassembler that can find it in the binary, thus making your scripts redundant.

I don't want to shit on your work or anything, I just have a hard time seeing the value in it.
« Last Edit: January 27, 2024, 10:44:40 AM by prj » Logged

d3irb
Full Member
***

Karma: +131/-1
Offline Offline

Posts: 186


« Reply #86 on: January 27, 2024, 11:04:47 AM »

Don't let prj's lack of respect for modern tooling dissuade you :) +1 for BinDiff. I've been using the same approach myself and it works very well. If anyone is interested in trying something new, the latest Ghidra has a tool called BSim that also works extremely well. Yes, even on MED17 binaries; I don't usually mess with Bosch stuff, but I just tried a few. Far better than needle-in-haystack pattern mask matching.

Adding detection of Tricore register values isn't rocket science. Especially now that Unicorn supports Tricore it's very easy to auto-configure setup state. And using emulation to do setup plus using bindiff, you just need a memory map, a loader script for whatever BIN format you have, and the entry point for ASW (and the memory map doesn't even need to be right, really, just 0xD -> RAM, 0xB -> LMU RAM, 0xC -> Scratchpad RAM, 0x8 / 0xA -> Flash is sufficient for Tricore usually since there isn't overlap).
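
The rule-of-thumb memory map from the paragraph above is small enough to write down directly. The sketch below just classifies an address by its top nibble; the segment labels and the function name are made up for illustration, and real parts have finer-grained maps.

```python
# Coarse TriCore segment map keyed by the top nibble of the address,
# following the rule of thumb in the post above.
SEGMENTS = {
    0x8: "flash",
    0xA: "flash",
    0xB: "lmu_ram",
    0xC: "scratchpad_ram",
    0xD: "ram",
}


def classify(addr: int) -> str:
    """Return the coarse memory region for a 32-bit address."""
    return SEGMENTS.get((addr >> 28) & 0xF, "unknown")
```

This is enough to decide whether a resolved reference points at calibration data in flash or at a RAM variable, which is the distinction the matching needs.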

The whole point of automation instead of "using it with IDA directly and transferring the symbols" is that if something takes 30 minutes, it's not wasting your day anyway. For example, BSim in Ghidra is really slow but is trivial to fully automate from the command line, so it's no issue. It even has all of the plumbing already to let you make a service that will do a ton of binaries in parallel and store the results in a Postgres database.

I think this project could easily be extended to be very useful for MED17. I've written a very similar but simpler tool for Simos18 and it's pretty straightforward to get very strong matching working using off the shelf tooling.

Logged
elias
Full Member
***

Karma: +17/-3
Offline Offline

Posts: 59


« Reply #87 on: January 27, 2024, 01:06:16 PM »

Quote
MED17 needs a lot more processing.
Yes, I know, you need to adjust the "import into Ghidra" script to your needs. I expect that someone who knows how to deal with a certain ECU knows how to set up Ghidra or IDA Pro.
It's a manual process, and it depends on the ECU.

Quote
Unfortunate thing is that it needs a Ghidra or IDA project to work. This is very slow on newer ECU's, Ghidra can easily take half an hour to process a binary of a more modern ECU.
That may be the case, but as I mentioned before, it's a "hobbyist" approach. I can wait 30 minutes and go watch TV. I am (currently) not doing it to make $$$. I respect people who do, but for me it's a hobby.

Quote
You can just use it with IDA directly and transfer the symbols.
How do you do that? I am not familiar with IDA Pro. I have looked at it, but to me it looks like a stripped-down version of Ghidra. I understand it to be a more sophisticated disassembler.

Quote
All you did is call ghidra and bindiff and hardcode some MED9 locations - the same could have been done by loading two binaries in ida, using my script to populate the symbols from a2l and then use bindiff directly.
Yes, it would have helped if someone had told me before I started.

Quote
I don't want to shit on your work or anything, I just have a hard time seeing the value in it.
The value is that I got my A2L ported in a small amount of time. I also learned something: I am now more confident with Ghidra scripting and can do much more sophisticated things with it.
I also shared my approach and triggered a discussion about it, which seems to be a good thing, as nobody needs to reinvent the wheel and repeat my approach.

Quote
I've written a very similar but simpler tool for Simos18
What exactly did you simplify? I would like to learn more about it.
Logged
d3irb
Full Member
***

Karma: +131/-1
Offline Offline

Posts: 186


« Reply #88 on: January 27, 2024, 01:51:31 PM »

Quote
How do you do that? I am not familiar with IDA Pro. I have looked at it, but to me it looks like a stripped-down version of Ghidra. I understand it to be a more sophisticated disassembler.

IDA Pro has a built in cross-correlation system similar to Bindiff, called Lumina. It's unclear if prj was referring to that or just using Bindiff inside of IDA (dismissing writing a tool to automate it basically and suggesting that you might as well use a plugin instead, he's just being cranky).

Compared to Ghidra overall: IDA has better ergonomics (key shortcuts, UI, etc.), runs faster, has a _way_ better dynamic analysis feature set (debugging) and sometimes a better disassembler in terms of bugs. There's also a much bigger scripting ecosystem for IDA since all the old school people still use it.

But Ghidra is better for ECUs IMO by virtue of having a generalized decompiler. Basically on Ghidra, you get decompilation for any supported processor. On IDA, you don't, because the decompiler is a separate product from the disassembler. So there's no decompilation for Tricore or C167 in IDA. This also means that IDA chokes on weird things (generating XRefs from pointers that are calculated twice being a big one) because the disassembler is just a disassembler (vs Ghidra, where the whole thing is hoisted into IR / PCode and then evaluated).

I personally use IDA for any OS-hosted binary stuff on a supported platform (ie - Windows x86 or ARM Linux binaries), where it's very polished and fast to use, and Ghidra for everything else.

Quote
What exactly did you simplify? I would like to learn more about it.

It's pretty similar conceptually, I just don't use A2L as the input/output format for my automation. Kind of like what you've done with JSON, I have a separate tool that turns A2L into CSV first, and I never had a need to write A2L back out the other side, so I don't. I also autodetect writes to the global address registers in the ASW execution flow. You can do this using raw disassembly (Ghidra scripting, basically keep fetching the next instruction until it's a load immediate with the global register as target) or emulation (Unicorn).
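
The raw-disassembly variant of that register detection can be sketched without any particular disassembler. The version below assumes the instructions have already been decoded into hypothetical `(mnemonic, dest_reg, immediate)` tuples, and that the address is built with the usual TriCore pair of MOVH.A (high half) followed by LEA with a signed 16-bit offset on the same register. Other load sequences are not handled.

```python
from typing import Iterable, Optional


def sign_extend16(value: int) -> int:
    """Interpret a 16-bit field as a signed offset."""
    return value - 0x10000 if value & 0x8000 else value


def find_global_register(instructions: Iterable[tuple], reg: str) -> Optional[int]:
    """Walk decoded instructions until `reg` has been fully loaded.

    Each item is a hypothetical (mnemonic, dest_reg, immediate) tuple.
    Returns the 32-bit value, or None if no MOVH.A/LEA pair is found.
    """
    high = None
    for mnemonic, dest, imm in instructions:
        if dest != reg:
            continue
        if mnemonic == "movh.a":
            high = (imm & 0xFFFF) << 16
        elif mnemonic == "lea" and high is not None:
            return (high + sign_extend16(imm & 0xFFFF)) & 0xFFFFFFFF
    return None
```

Note the sign extension: a base of 0xD0008000 is encoded as a high half of 0xD001 plus an offset of -0x8000, which is an easy thing to get wrong when reading the bytes by eye.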

Simos18 is also just simpler in general with primitive tooling because there is less indirection. For example, maps are directly referenced rather than indirectly referenced through a load function like they are in Bosch, so calibration data also gets XRefs by default. This is a double edged sword since it makes hooking and map replacement harder (without using flash emulation at least), but it makes cross-correlation tooling a lot more straightforward.
Logged
jcsbanks
Full Member
***

Karma: +17/-3
Offline Offline

Posts: 126


« Reply #89 on: January 27, 2024, 04:27:32 PM »

If you make names, comments and function names from the A2L in Bin_A, you could perhaps script the BinDiff GUI's "import symbols and comments" into Bin_B. Perhaps this could be a simplification, optimization or generalization of the script, because all you otherwise have to do is process Bin_A and Bin_B the same way (e.g. on TriCore: finding a0, a1, a8, a9, marking code as code and data as data, defining functions).
Logged