About the “Data Trace” feature of our chip emulators

Introduction

The ‘Data Trace’ feature is probably the most misunderstood feature of our emulators.  Trace is intended as a last resort: it gives you some idea of which portions of an emulated ROM are being used by the target system when you have no way of establishing communications for logging.  Trace is supported only by the Ostrich 2.0 and the RoadRunner (with the latest firmware).  This article documents what Trace is, how it works, what it can do, what its limitations are and how it can go wrong.

In order to understand how data trace works, it is necessary to understand the electrical signals used by a microcontroller in your ECU (or target system) to access RAM or ROM using parallel access.  There are many explanations of this out there but this one seemed decently concise.  It will also be necessary to understand the commands used to set up Trace and the mechanism that the emulator uses to gather and report data back to the PC, along with what happens to that data in the application running on the PC.  It will also be very helpful to understand TunerPro RT definition creation.

Bottom line: Trace is complicated, finicky, temperamental and is not designed to provide the same kind of steady, consistent data that can be obtained through communicating with the ECM using some form of data logging.  Our emulators were NOT designed from the ground up to provide 100% accurate address trace data and we do not expect them to be able to deliver that level of performance.

What Trace is and How it Works

In normal operation, the PC sends commands to the Emulator to change emulator memory, allowing changes to a “chip” to be made without having to stop, remove, reprogram and reinstall the chip.  The Emulator has a microcontroller which is responsible for receiving and processing commands from the PC and communicating with a memory controller.  In order to allow changes to be made without disturbing the target system, our emulators sneak updates from the PC in between accesses by the target system.  (If the target expects data too fast, collisions between PC and target memory accesses may cause glitches.)

Trace lets an application monitor which addresses the target system accesses, provided the application sends the appropriate setup commands to the emulator.  When trace is enabled, the microcontroller on the emulator starts querying the same memory controller used for realtime updates to see which addresses are read by the target system.  In order to determine which memory is accessed, two main sets of signals are monitored by the memory controller:

  1. The address lines on the emulator, used by the target system to specify which data it wants to see
  2. The !OE (Output Enable) and !CE (Chip Enable) pins, which are used by the target to control the timing of a data output request

After the control lines indicate a memory access, the memory controller stores the last address used by the target system.  As fast as it can, the microcontroller retrieves the address information from the memory controller.  Address responses are always 3 bytes and take a minimum of 8 MCU clock cycles, or around 0.6uS, to retrieve from the memory controller.  Setup commands sent by the PC control how the Emulator handles each address retrieved from the memory controller: it can store/buffer the address, send it to the PC, or ignore it and wait for another hit.  If you are curious, you can look at the setup command structure in our documentation.  It is possible to control the range of addresses which trigger a match, the number of address hits to gather before reporting to the PC, whether addresses are streamed continuously or reported once before returning to normal control, whether duplicate hits are reported multiple times or once, the format of responses in terms of number of bytes reported, and more.
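The polling scheme described above can be sketched as a toy model in Python.  This is only an illustration, not the emulator firmware: the names LastAddressLatch and trace are hypothetical, the polling interval is expressed as a count of target accesses rather than real time, and clearing the latch after each poll is an assumption made to keep the model simple.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class LastAddressLatch:
    """Toy memory controller: remembers only the most recent address read."""
    last: Optional[int] = None

    def bus_cycle(self, address: int, ce_n: int, oe_n: int) -> None:
        # A read is assumed to happen when both active-low strobes are asserted (0).
        if ce_n == 0 and oe_n == 0:
            self.last = address  # overwrites any earlier, unpolled address

    def poll(self) -> Optional[int]:
        # Assumption: reading the latch clears it.
        address, self.last = self.last, None
        return address


def trace(accesses: List[int], poll_every: int, window: Tuple[int, int],
          report_duplicates: bool = False) -> List[int]:
    """Return the addresses a poller would report for a stream of target reads."""
    low, high = window
    latch = LastAddressLatch()
    seen = set()
    hits = []
    for count, address in enumerate(accesses, start=1):
        latch.bus_cycle(address, ce_n=0, oe_n=0)
        if count % poll_every != 0:
            continue  # the microcontroller is busy; this access may be lost
        polled = latch.poll()
        if polled is None or not (low <= polled <= high):
            continue  # ignore addresses outside the trigger window
        if not report_duplicates and polled in seen:
            continue  # report each address only once
        seen.add(polled)
        hits.append(polled)
    return hits


reads = [0x100, 0x101, 0x102, 0x103, 0x104, 0x105]
hits = trace(reads, poll_every=3, window=(0x100, 0x1FF))
print([hex(a) for a in hits])  # ['0x102', '0x105']
```

Only every third read shows up in this example because the latch holds a single address; anything the target reads between polls is overwritten before it can be collected.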

Our emulators communicate with the PC at 921,600 baud 8N1 over an FTDI USB-Serial connection.  Because 8N1 framing puts 10 bits on the wire for every data byte (one start bit, eight data bits and one stop bit), approximately 92,160 bytes can be transferred each second, and each byte takes roughly 10-11uS to send.  The system is bandwidth-limited because it can gather trace responses from the memory controller faster than it can supply them to the PC.
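The serial budget can be worked out directly from the baud rate.  The short Python sketch below assumes standard 8N1 framing (one start bit, eight data bits, no parity, one stop bit) and compares the result with the ~0.6uS polling time quoted earlier.  The function and variable names are illustrative only.

```python
def serial_throughput(baud: int, data_bits: int = 8, parity_bits: int = 0,
                      stop_bits: int = 1) -> float:
    """Bytes per second on an asynchronous serial link (one start bit per frame)."""
    bits_per_frame = 1 + data_bits + parity_bits + stop_bits
    return baud / bits_per_frame


BAUD = 921_600
POLL_TIME_US = 0.6  # time to fetch one address from the memory controller

bytes_per_second = serial_throughput(BAUD)
us_per_byte = 1_000_000 / bytes_per_second
polls_per_byte = us_per_byte / POLL_TIME_US

print(f"{bytes_per_second:.0f} bytes/s")
print(f"{us_per_byte:.2f} uS per byte")
print(f"about {polls_per_byte:.0f} polls fit in the time it takes to send one byte")
```

With these assumptions the link carries 92,160 bytes per second, so the microcontroller can fetch many addresses from the memory controller in the time it takes to send a single byte to the PC.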

Software Support

At this point (April 2020), the only software packages we know of that have implemented support for data trace are TunerPro RT and RenoVelo Domino.  Specific software support for the trace feature is REQUIRED.  An application that supports the realtime tuning / emulation features of our products (e.g. EmUtility) may NOT support trace at all.

While we do not develop it in house, TunerPro RT is our reference platform that we use internally for testing and product development.  There are two methods of using the Trace feature of a compatible emulator in TunerPro RT.

The Address Watch Utility: (Note: it is “greyed out” / unavailable in this screenshot because I didn’t have a compatible emulator plugged in)

TPRT - Address watch utility

 

Trace can also be invoked to watch individual tables: (again, “greyed out” / unavailable in this screenshot because I didn’t have a compatible emulator plugged in)

TPRT - A for Address

Looking at the control protocols, an example auto-generated “T” command sent by TunerPro RT to the emulator to set up trace after clicking the ‘A’ icon appears to be

"54 23 00 00 01 01 08 44 38 08 44 73 BC    /      T#.....D8.Ds."
  • Control byte = 0x23: 0b00100011
  • NO streaming
  • report only windowed hits
  • report all
  • normal addr triggers
  • relative addressing
  • single hit buffers
  • single byte address report
  • windowed report (relative  vs. absolute address reports)
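The hex dump above can be unpacked with a few lines of Python.  This sketch only splits out the command character and the control byte and reproduces the ASCII column of the dump; it does not attempt to interpret individual control bits or the remaining bytes, since their layout is defined in the protocol documentation rather than here.  The helper name ascii_dump is illustrative.

```python
RAW = "54 23 00 00 01 01 08 44 38 08 44 73 BC"


def ascii_dump(data: bytes) -> str:
    """Render printable ASCII as-is and everything else as '.', like a hex viewer."""
    return "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in data)


packet = bytes.fromhex(RAW)   # bytes.fromhex ignores the spaces
command = chr(packet[0])      # first byte is the command character
control = packet[1]           # second byte is the control byte
set_bits = [bit for bit in range(8) if control & (1 << bit)]

print(command)                 # T
print(format(control, "08b"))  # 00100011
print(set_bits)                # [0, 1, 5]
print(ascii_dump(packet))      # T#.....D8.Ds.
```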

 

What can go wrong with Trace? / Limitations

I’m sure Trace sounds great, like the perfect solution for ECUs where limited communication is possible.  Unfortunately, there are many ways for trace to go wrong and not act like you might hope or expect it would.

  1. Memory controller limitations: missed hits inside the emulator.  The memory controller does not buffer memory hits.  It only reports the last accessed address.  The speed at which the microcontroller queries the memory controller limits how many hits can be captured.  As discussed, it takes several MCU clock ticks to retrieve data from the memory controller.  In ~0.6uS, at least 5x 100nS memory accesses can happen, and all but the last of them would be missed by the trace system.
  2. Processing received addresses: missed hits inside the emulator.  It takes time (albeit a VERY short amount of time) for the microcontroller on the emulator to process address hits and decide what to do with them.  Because it does not query the memory controller while deciding what to do with an address hit, this processing time limits the rate at which it can query the memory controller and therefore how many hits can be captured.
  3. Bandwidth / PC: missed hits due to serial comms.  This bandwidth and latency limitation is inherent to the hardware design and will not change.  There is very limited bandwidth to communicate with a PC compared to the speed of memory access.  It takes around 10uSec to communicate the shortest-format (abbreviated) address hit in streaming mode.  That means the target can make around 100 memory accesses (at 100nS each), all of them missed, during the time it takes ONE Trace hit to be communicated to the PC.  Multiple-byte responses (which will be necessary for larger monitor windows) will require 2 or 3 times as long to communicate.  These are best-case figures, assuming streaming mode.  If a single response is sent followed by a new command setup, the latency of the process could easily be increased by a factor of 20.  (Note: single response is the default monitoring scheme for TunerPro RT commands.)  If a large number of address hits are buffered and then bulk transferred, the latency between each hit is significantly decreased but the time to communicate with the PC is significantly increased, leading to a longer pause in between each group of responses.  Bottom line: serial communication limits the maximum potential address hit capability to a fraction of bus speed.
  4. Addressing mix-ups: software/XDF.  Under XDF … Edit XDF info it is possible to specify chip size and offset parameters.  TunerPro uses the address of the table in the XDF for the start and stop addresses it sends as part of the Trace setup command.  The XDF setup parameters control the relative location of tables within TunerPro’s memory model.  These need to be specified so that the addresses TunerPro RT uses to represent the bin on your PC match how the Ostrich stores the bin in its memory.  When the addresses match, TunerPro RT can match the Trace command responses it receives from the emulator with the correct bytes stored in PC memory and show you which bytes in a table are being accessed.  If the memory models differ, TunerPro RT will never show any bytes in the table being accessed because the responses to the Trace commands don’t match the information it has in memory.
  5. Memory Shadowing: target system behavior.  “Shadowing” refers to the practice of an embedded system copying memory from one place to another before using it.  In many cases, slow flash or ROM chip memory is copied to faster RAM memory and then accessed in RAM during normal operation.  In this style of use, there are no ROM accesses to trigger Trace hits after the initial shadow.  While this does not often happen, it is controlled by the target system and is not under the control of our emulators.
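The timing figures quoted in items 1 and 3 can be checked with simple arithmetic.  The Python sketch below assumes a target that reads memory back-to-back every 100nS (the figure used above), a 0.6uS polling time and roughly 10uS per serial byte; real targets will differ, and the function name is illustrative.

```python
ACCESS_NS = 100       # one target memory access, as assumed in the text
POLL_NS = 600         # ~0.6uS to fetch one address from the memory controller
SERIAL_BYTE_NS = 10_000  # ~10uS to send one byte to the PC


def accesses_during(interval_ns: int, access_ns: int = ACCESS_NS) -> int:
    """How many back-to-back target accesses fit inside an interval."""
    return interval_ns // access_ns


per_poll = accesses_during(POLL_NS)
per_one_byte_hit = accesses_during(1 * SERIAL_BYTE_NS)
per_three_byte_hit = accesses_during(3 * SERIAL_BYTE_NS)

print(per_poll)            # 6
print(per_one_byte_hit)    # 100
print(per_three_byte_hit)  # 300
```

In other words, under these assumptions the best-case streaming mode can report on the order of one access in every hundred, and wider address reports reduce that further.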

Conclusion

Trace was a “nifty extra” added to our emulators because we could and figured it might be handy in some cases.  We did NOT design our emulators around being able to deliver completely accurate and precise address tracing.  We do not have any plans to improve the data Trace feature.  We do not have any plans to release an emulator that has better trace performance.  Our emulators were designed to take the place of a chip and allow realtime changes – this they do well.  Trace should be considered a “bonus feature.”  Do not rely on it to gather all the data needed to tune an ECU.