Using the Teensy 4.1 for high-level control and ease of development, this
AI approach breaks the computational bottlenecks
associated with real-time convolution, with the additional benefit
of radically reduced software overhead. The most
demanding AI tasks, such as frequency-domain matrix multiplies, are
reduced to fewer than a dozen
machine instructions.
We concentrate on performing the
algorithms using the efficiencies of the FFT.
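
As a rough illustration of how the FFT helps with convolution, the sketch
below convolves two short sequences by transforming both, multiplying
point by point in the frequency domain, and transforming back. It is a
plain software model written only for illustration, not the chip's
implementation; the function names (fft, fft_convolve, direct_convolve)
and the radix-2 recursion are assumptions of this example.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor applied to the odd-indexed half.
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out

def fft_convolve(a, b):
    """Linear convolution via a pointwise multiply in the frequency domain."""
    size = len(a) + len(b) - 1
    n = 1
    while n < size:          # zero-pad to the next power of two
        n *= 2
    fa = fft(list(a) + [0] * (n - len(a)))
    fb = fft(list(b) + [0] * (n - len(b)))
    prod = [p * q for p, q in zip(fa, fb)]
    y = fft(prod, inverse=True)
    return [v.real / n for v in y[:size]]   # divide by n to normalize the inverse

def direct_convolve(a, b):
    """Time-domain reference: every sample of a meets every sample of b."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, av in enumerate(a):
        for j, bv in enumerate(b):
            out[i + j] += av * bv
    return out
```

Both routines should agree to within floating-point rounding; the
time-domain version does len(a) x len(b) multiplies, while the FFT version
replaces that double loop with three transforms and one pointwise multiply.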
Applications that use time-domain techniques, such as those running on the
A100 and newer H100 from Nvidia, grow in operation count as N squared.
The same applications using FFT
techniques grow as N x log(N), a
reduction by a factor of roughly N / log(N).
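
To put numbers on the two growth rates, the short sketch below compares
N squared against N x log(N) for a given N. It assumes a base-2 logarithm
(the text does not state the base) and ignores constant factors, so the
results are proportional operation counts rather than measured
performance. The function names are illustrative only.

```python
import math

def direct_ops(n):
    """Operation count proportional to N squared (time-domain approach)."""
    return n * n

def fft_ops(n):
    """Operation count proportional to N x log2(N) (FFT approach); base 2 is assumed."""
    return n * math.log2(n)

def speedup(n):
    """Ratio of the two counts, which simplifies to N / log2(N)."""
    return direct_ops(n) / fft_ops(n)

for n in (1024, 2 ** 20):
    print(n, direct_ops(n), fft_ops(n), speedup(n))
```

Under these assumptions the ratio is about 102 at N = 1024 and about
52,000 at N = 1,048,576, so the advantage widens as N grows.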
Multiple chips can be cascaded for higher radices. Two
cascaded chips can perform a radix-1024 transform, which gives a
1-million-point (1024 x 1024 = 1,048,576) transform in just two passes.
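
The two-pass figure follows from the radix: each pass of a radix-R
transform multiplies the covered transform length by R. The sketch below
counts the passes needed for a given length. The helper name
passes_needed is hypothetical, and the count is a simple model of the
pass structure, not a description of the chip's internal scheduling.

```python
def passes_needed(points, radix):
    """Smallest number of radix-`radix` passes whose span covers `points` samples."""
    passes, span = 0, 1
    while span < points:
        span *= radix    # each pass extends the transform length by one radix factor
        passes += 1
    return passes

print(passes_needed(1024 * 1024, 1024))  # radix-1024, 1,048,576 points
print(passes_needed(1024 * 1024, 2))     # same length with a radix-2 transform
```

With radix 1024 the 1,048,576-point transform needs two passes, while a
radix-2 transform of the same length needs twenty.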
This chip is being designed using macro cells and super cells. The end customer can use
the design as a silicon core and surround it with
proprietary circuitry to further enhance the application
and reduce overall application hardware.