I have an optical transmitter that encodes information into the position of pulses within a time window (pulse position modulation). I'm trying to see if it's possible to accurately and consistently (i.e., with nanosecond resolution) capture the rising edge of these pulses with a Teensy 4.0. To accomplish this, I enable the board's cycle counter and attach an interrupt to the input pin. I then store the timing information in a circular buffer, which is processed in loop() when the processor is available:

Code:
#define SIG 3

float clock_speed = 600*1e6;
const int buffer_size = 32;
// buffer and write_index are shared between the ISR and loop(), hence volatile
volatile uint32_t buffer[buffer_size];
uint32_t last = 0;
volatile int write_index = 0;
int read_index = 0;

void setup() {
	ARM_DEMCR |= ARM_DEMCR_TRCENA;
	ARM_DWT_CTRL |= ARM_DWT_CTRL_CYCCNTENA;
	attachInterrupt(digitalPinToInterrupt(SIG), pulseReceived, RISING);
}

void pulseReceived() {
	uint32_t current = ARM_DWT_CYCCNT;
	if (last != 0) {
		buffer[write_index % buffer_size] = current - last;
		write_index += 1;
	}
	last = current;
}

void loop() {
	while (read_index < write_index) {
		Serial.print(read_index);
		Serial.print(",");
		Serial.println(buffer[read_index % buffer_size] * 1/(clock_speed) * 1e9);
		read_index += 1;
	}
}
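One thing the code above does not guard against is the ISR lapping the reader: if more than buffer_size intervals arrive before loop() drains them, older entries are silently overwritten. Below is a minimal sketch of a single-producer/single-consumer ring buffer that counts dropped samples instead. The names (kBufferSize, pushDelta, popDelta, dropped) are made up for illustration; on the Teensy, pushDelta would be called from pulseReceived() and popDelta from loop().

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration. The size must be a power of two so
// that the index mask works and uint32_t index wraparound stays consistent.
constexpr uint32_t kBufferSize = 32;
static_assert((kBufferSize & (kBufferSize - 1)) == 0, "size must be a power of two");

volatile uint32_t ring[kBufferSize];
volatile uint32_t head = 0;     // written only by the producer (ISR)
volatile uint32_t tail = 0;     // written only by the consumer (loop)
volatile uint32_t dropped = 0;  // samples discarded because the buffer was full

// Producer side: returns false (and counts a drop) when the buffer is full.
bool pushDelta(uint32_t delta) {
	if (head - tail >= kBufferSize) {
		dropped = dropped + 1;
		return false;
	}
	ring[head & (kBufferSize - 1)] = delta;
	head = head + 1;
	return true;
}

// Consumer side: returns false when there is nothing to read.
bool popDelta(uint32_t* out) {
	assert(out != nullptr);
	if (tail == head) return false;
	*out = ring[tail & (kBufferSize - 1)];
	tail = tail + 1;
	return true;
}
```

A non-zero dropped count after a test run would indicate that some of the intervals never reached the serial output.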
To test the accuracy/stability, I generate 200 external pulses that are around 1560ns apart. Here are the results compared to my oscilloscope:

               Teensy 4.0    Oscilloscope
Max (ns)       1621          1559.810
Min (ns)       1512          1559.308
Mean (ns)      1559.577      1559.693
Std dev (ns)   9.234         0.114
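
For context on the Teensy column: the cycle counter can only resolve whole CPU cycles, so at the 600 MHz assumed in the code each count is worth about 1.667 ns, and a 1560 ns interval corresponds to 936 cycles. Here is a small sketch of that conversion, using unsigned subtraction so that a wrap of the 32-bit counter between two edges still gives the right interval (at 600 MHz the counter wraps roughly every 7.16 s). The helper names are illustrative and not part of any Teensy API.

```cpp
#include <cassert>
#include <cstdint>

// Interval between two cycle-counter readings. Unsigned arithmetic is
// modulo 2^32, so this stays correct across a single counter wrap.
uint32_t elapsedCycles(uint32_t now, uint32_t before) {
	return now - before;
}

// Convert a cycle count to nanoseconds for a given CPU clock in Hz.
double cyclesToNs(uint32_t cycles, double cpu_hz) {
	assert(cpu_hz > 0.0);
	return static_cast<double>(cycles) * 1e9 / cpu_hz;
}
```

For example, cyclesToNs(1, 600e6) gives the size of one quantization step, which puts a floor of roughly 1.67 ns on the resolution of this approach at the stock clock.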

The results aren't horrible, but now I want to optimize it to be as stable/accurate as possible to see if this is a feasible route. I have a couple of ideas but wanted to see what else I should consider:

1. Increase Interrupt Priority

It looks like it should be possible by using NVIC_SET_PRIORITY, although I'm not sure what port to use. The guides I've seen (here and here) recommend doing something like:

Code:
NVIC_SET_PRIORITY(IRQ_PORTA, 0);
but IRQ_PORTx is not defined for the Teensy 4.0. I looked in the IRQ_NUMBER_t enum for the 4.0 (imxrt.h) and saw a few potential candidates (IRQ_GPIO1_INT0 looks relevant?), but I have no clue which port corresponds to which pin. There is a macro, digitalPinToPort (pins_arduino.h), which might help identify it.
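
A minimal sketch of what this might look like on the Teensy 4.0, under two assumptions that should be checked against the core sources: that attachInterrupt() routes all pin interrupts through a single shared vector named IRQ_GPIO6789, and that the chip implements only the upper 4 bits of the 8-bit priority value (so 16 levels, with lower numbers meaning higher priority). The helper names are hypothetical, and the Teensy-specific call is guarded so that the snippet also compiles off-target.

```cpp
#include <cstdint>
#if defined(__IMXRT1062__)
#include <Arduino.h>
#endif

// Assumption: only the upper 4 bits of the priority are implemented, so
// requested values collapse onto 0, 16, 32, ..., 240.
constexpr uint8_t effectivePriority(uint8_t requested) {
	return static_cast<uint8_t>(requested & 0xF0);
}

// Intended to be called from setup() after attachInterrupt().
// Does nothing when built for anything other than the Teensy 4.x.
void raisePinInterruptPriority(uint8_t priority) {
#if defined(__IMXRT1062__)
	// Assumption: IRQ_GPIO6789 is the shared vector for pin interrupts.
	NVIC_SET_PRIORITY(IRQ_GPIO6789, priority);
#else
	(void)priority;
#endif
}
```

If the shared-vector assumption holds, the priority would apply to every pin interrupt at once rather than to pin 3 alone.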

2. Use a General Purpose Timer in Free-Running Mode

According to the data sheet for the CPU:

Each GPT is a 32-bit “free-running” or “set and forget” mode timer with programmable prescaler and compare and capture register. A timer counter value can be captured using an external event and can be configured to trigger a capture event on either the leading or trailing edges of an input pulse.
Would GPT1 or GPT2 be sufficient to capture the counter value of a rising edge with nanosecond resolution?
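
Part of the answer depends on the clock feeding the counter, since a capture can only resolve one timer tick. Here is a rough sketch of that arithmetic. The two clock rates below (a 24 MHz oscillator and a 150 MHz peripheral clock) are assumptions for illustration only and would need to be checked against the reference manual and the actual clock configuration.

```cpp
#include <cassert>

// Assumed example clock rates, not confirmed for any particular setup.
constexpr double kAssumedOscHz = 24e6;
constexpr double kAssumedPeripheralHz = 150e6;

// Duration of one timer tick in nanoseconds for a counter clocked at
// timer_hz through an integer divide-by-N prescaler (N >= 1).
double tickPeriodNs(double timer_hz, unsigned prescaler) {
	assert(timer_hz > 0.0 && prescaler >= 1);
	return 1e9 * prescaler / timer_hz;
}
```

Under these assumptions a capture would be quantized to about 6.7 ns at 150 MHz and about 41.7 ns at 24 MHz, which is coarser than the cycle counter's step at 600 MHz. The potential upside is that the capture is latched in hardware, so it should not depend on interrupt latency.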

3. Increase the Clock Frequency

When I overclocked the CPU, the standard deviation of the measured pulse intervals was lower (which doesn't seem surprising). Would doing this in conjunction with the other methods make the timing more stable?

4. Try Something Else!

I'm very interested in learning about other things I can try. I'm not set on using the Teensy 4.0, but it's an interesting challenge and I want to see how far I can take it.