Hi:
I would like to input from a GPIO location (specifically GPIO6_PSR, 0x42004008) as quickly as possible and transfer the words read to memory (or cache) uninterrupted. Up to 10,000 words or more may be required. Think of data acquisition in a digital storage oscilloscope, though that is not what this is for.
This is the assembly code loop currently being used:
"nextdata: \n\t" // loop entry label (branch target below)
"ldr r3, [r8] \n\t" // load value of GPIO6_PSR into r3
"str r3, [r9], #4 \n\t" // store value into gpioDataArray, then advance the write pointer by 4 bytes
"cmp r9, r10 \n\t" // compare the write pointer (r9) against the end address (r10)
"ble nextdata \n\t" // loop if limit not reached
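For readers who want to try the same loop off-target, here is a minimal C sketch of what the four instructions above do. The function name `capture_words` and its parameters are hypothetical (they are not from my actual code), and the end pointer here is exclusive, whereas the `ble` test above treats r10 as inclusive. A compiler will not necessarily emit the same instructions as the hand-written assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper, not the original code: read the register at `reg`
// `count` times and store each word into `buf`, one after another.
static void capture_words(volatile const uint32_t *reg, uint32_t *buf, size_t count)
{
    assert(reg != NULL && buf != NULL);
    uint32_t *end = buf + count;   // one past the last slot (exclusive limit)
    while (buf < end) {            // corresponds to cmp r9, r10 plus the branch
        *buf++ = *reg;             // corresponds to ldr r3, [r8] / str r3, [r9], #4
    }
}
```

On the real hardware `reg` would point at GPIO6_PSR (0x42004008); on a PC any `volatile uint32_t` variable can stand in for it.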
This works great, but the performance is pathetic given the 600 MHz CPU: around 15 ns for each data point. That's something like 9 to 11 clock cycles.
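The conversion between time per sample and clock cycles is simple arithmetic, sketched below. The helper names are made up for illustration, and the 600 MHz figure is the nominal CPU clock mentioned above. At that clock one cycle is about 1.67 ns, so 15 ns is 9 cycles, and 10 to 11 cycles would be roughly 16.7 to 18.3 ns.

```c
#include <assert.h>

// Hypothetical helpers for illustration only.

// Number of CPU cycles that fit in `ns` nanoseconds at a clock of `hz`.
static double ns_to_cycles(double ns, double hz)
{
    assert(hz > 0.0);
    return ns * hz / 1e9;
}

// Duration in nanoseconds of `cycles` CPU cycles at a clock of `hz`.
static double cycles_to_ns(double cycles, double hz)
{
    assert(hz > 0.0);
    return cycles * 1e9 / hz;
}
```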
Unrolling the loop into a block of repeated ldr/str pairs, which eliminates the compare and branch, only improves performance by 10 or 15 percent.
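To show what I mean by a block of repeated load/store pairs, here is a rough C sketch unrolled by four. The name `capture_words_unrolled4` and the unroll factor are only for illustration (my real test was a longer straight-line block in assembly), and this version stores whole groups of four only, returning how many words it wrote.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical illustration: four register reads per loop iteration, so the
// compare and branch cost is paid once per four samples instead of once per sample.
// Returns the number of words actually stored (count rounded down to a multiple of 4).
static size_t capture_words_unrolled4(volatile const uint32_t *reg,
                                      uint32_t *buf, size_t count)
{
    assert(reg != NULL && buf != NULL);
    size_t blocks = count / 4;
    for (size_t i = 0; i < blocks; i++) {
        buf[0] = *reg;   // each line is one ldr/str pair
        buf[1] = *reg;
        buf[2] = *reg;
        buf[3] = *reg;
        buf += 4;
    }
    return blocks * 4;
}
```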
Is there any better way?
Thanks,
--- sam