Interrupts for UART using Teensy 3.1 or 3.0

Status
Not open for further replies.
duff, thanks for creating this library.
I have a question about the buffered write:
Code:
    virtual size_t write( const uint8_t *buffer, size_t size ) {
        serial_dma_write( buffer, size );
        return size;
    }

which in turn calls this
Code:
void Uart1Event::serial_dma_write( const void *buf, unsigned int count ) {
    uint8_t * buffer = ( uint8_t * )buf;
    uint32_t head = tx_buffer_head;
    uint32_t cnt = count;
    uint32_t free = serial_dma_write_buffer_free( );
    if ( cnt > free ) cnt = free;
    uint32_t next = head + cnt;
    bool wrap = next >= TX_BUFFER_SIZE ? true : false;
    if ( wrap ) {
        uint32_t over = next - TX_BUFFER_SIZE;
        uint32_t under = TX_BUFFER_SIZE - head;
        memcpy_fast( tx_buffer+head, buffer, under );
        memcpy_fast( tx_buffer, buffer+under, over );
        head = over;
    }
    else {
        memcpy_fast( tx_buffer+head, buffer, count );
        head += cnt;
    }
    tx_buffer_head = head;
    if ( !transmitting ) {
        transmitting = true;
        __disable_irq( );
        tx.TCD->CITER = cnt;
        tx.TCD->BITER = cnt;
        tx.enable( );
        __enable_irq( );
    }
}

if TX_BUFFER_SIZE is 64 and count is >64, it looks like only 64 bytes are sent? what happens to the rest?

ok, I see memcpy_fast( tx_buffer+head, buffer, count );
but isn't tx_buffer only 64 bytes?
 
I think serial_dma_write should return the actual bytes sent (cnt) instead of void, and the buffered write returns that value instead of the size parameter. and memcpy should copy cnt, not count.
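A minimal sketch of the change proposed above, not the library's actual code: a ring-buffer write that clamps the request to the free space, copies only the clamped amount, and returns it. `TxRing`, `kTxBufferSize`, `free_space` and `drain` are hypothetical names made up for illustration, and `drain` only simulates the DMA consuming bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the transmit buffer; size chosen to match the thread.
constexpr std::size_t kTxBufferSize = 64;

struct TxRing {
    std::uint8_t data[kTxBufferSize] = {};
    std::size_t head = 0;  // next write position
    std::size_t used = 0;  // bytes queued but not yet sent

    std::size_t free_space() const { return kTxBufferSize - used; }

    // Copies at most free_space() bytes and returns how many were accepted.
    std::size_t write(const void *buf, std::size_t count) {
        const std::uint8_t *src = static_cast<const std::uint8_t *>(buf);
        std::size_t cnt = count < free_space() ? count : free_space();
        std::size_t first = kTxBufferSize - head;     // room before the wrap point
        if (first > cnt) first = cnt;
        std::memcpy(data + head, src, first);         // part up to the end of the buffer
        std::memcpy(data, src + first, cnt - first);  // wrapped remainder (may be 0 bytes)
        head = (head + cnt) % kTxBufferSize;
        used += cnt;
        return cnt;  // the caller learns how much was really queued
    }

    // Simulates the hardware having sent n bytes.
    void drain(std::size_t n) {
        assert(n <= used);
        used -= n;
    }
};
```

With a return value like this, a caller that passes 100 bytes to an empty 64-byte buffer gets 64 back and knows that 36 bytes still need to be queued later.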
 
if TX_BUFFER_SIZE is 64 and count is >64, it looks like only 64 bytes are sent? what happens to the rest?

ok, I see memcpy_fast( tx_buffer+head, buffer, count );
but isn't tx_buffer only 64 bytes?
It doesn't get sent; I wanted to make it "non-blocking". Earlier versions did send everything, but the performance on the whole was not what I wanted, so I had to make a tradeoff. If you look at the fill_internal_buffer_time example you will see that it actually gets through the print function faster than the stock serial code. If you increase the internal buffer sizes (stock and mine), the improvement over the stock code is dramatic.

The only downside is that you have to split up your messages if they are greater than the buffer size. It will never send more than the buffer size, or more than what is free in the buffer. But then I would just increase the internal buffer size to cover the worst case. If you're looking for a general-purpose driver you don't have to think about, use the serial drivers Paul has. I wrote this because I wanted something that didn't block my code flow as much as his when sending data, and then added the event stuff as a bonus since I was rewriting the serial driver anyway.

I think serial_dma_write should return the actual bytes sent (cnt) instead of void, and the buffered write returns that value instead of the size parameter. and memcpy should copy cnt, not count.
Maybe, if there are enough people calling for this.

cnt and count have the same value, but yes, I should use the local version (cnt). It will work the same though. I'll fix that.
 


by "it doesn't get sent", you mean it is lost right? That's not good, considering the higher level write function, you return the size parameter, making the caller think the whole "size" bytes got sent when in fact, only TX_BUFFER_SIZE got sent.

the fill_internal_buffer_time example is not the same, PACKET_SIZE is 16 which is not greater than the TX_BUFFER_SIZE.

Yes, the objective is non blocking code, but likewise no lost data should be an objective as well, so I don't think it is a question of how many people want it. Returning the actual bytes sent does not affect any current code that knows not to call write with size > TX_BUFFER_SIZE. And even if the caller never uses a size > TX_BUFFER_SIZE, only part of the data is sent whenever the tx buffer is not completely empty and the free space is less than the size requested.

The typical code at the high level can be

Code:
size_t total=0;
while ((total += Event1.write(buffer+total,size-total))<size) ;

you said "Earlier versions had this but the performance on a whole was not what I wanted "
What was the performance like? the dma has to make it perform better than stock Serial library no?
Even slightly better is ok, as long as it is not worse.
 
by "it doesn't get sent", you mean it is lost right? That's not good, considering the higher level write function, you return the size parameter, making the caller think the whole "size" bytes got sent when in fact, only TX_BUFFER_SIZE got sent.
You are correct, this should reflect the amount that got sent. I'll fix that. Thanks for pointing that out!

the fill_internal_buffer_time example is not the same, PACKET_SIZE is 16 which is not greater than the TX_BUFFER_SIZE.
it starts at 16 but increases up to the size of the buffer ->
Code:
PACKET_SIZE += INC_VAL;

Yes, the objective is non blocking code, but likewise no lost data should be an objective as well, so I don't think it is a question of how many people want it.
Yes and no. You really can't have it both ways: if you are sending more than the buffer has available, or more than the buffer size, then it has to block if you want it processed in one call to the print or write functions.

you said "Earlier versions had this but the performance on a whole was not what I wanted "
What was the performance like? the dma has to make it perform better than stock Serial library no?
Even slightly better is ok, as long as it is not worse.
It would wait until space was available in the buffer to send the whole message, if the message was larger than the buffer or no space was available, so I decided that it would send what it could and return. DMA doesn't let you send faster than the baud rate. That being said, the "write" code is what determines the performance at the user level, plus the interrupt overhead, and the DMA can be configured to interrupt less often, which my library does. It only interrupts once the message is sent, per call to "print" or "write" or their variants, since it inherits from the Print class.
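To put a number on "DMA doesn't let you send faster than the baud rate", here is a small sketch of the arithmetic. It assumes 8N1 framing (1 start bit, 8 data bits, 1 stop bit, so 10 bits on the wire per byte), which the thread does not state; `wire_time_us` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: 8N1 framing, i.e. 10 bits on the wire for every data byte.
constexpr std::uint64_t kBitsPerByte = 10;

// Minimum time on the wire, in whole microseconds, for `bytes` bytes
// at `baud` bits per second. The result is truncated, not rounded.
std::uint64_t wire_time_us(std::uint64_t bytes, std::uint64_t baud) {
    assert(baud > 0);
    return bytes * kBitsPerByte * 1000000ULL / baud;
}
```

Under that assumption a full 64-byte buffer at 115200 baud needs about 5.6 ms on the wire, however quickly the call to write returns.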
 
Thanks for the clarification.
I'll wait for the update. That will also make write function conform to how the function is defined in the Stream class, which is the parent class.

I think sending data>TX_BUFFER_SIZE can still be asynchronous if sending the next chunk is done in the txEventHandler so main loop can still do other stuff. Is this possible? considering txEventHandler is invoked from an ISR.

If the main program waits for txeventhandler callbacks to completely send all data before updating the buffer contents, would it save some cycles to not do the memcpy to the internal tx buffer and just dma direct from the original buffer? maybe the whole thing in one shot and just callback when complete?
 
question, does this
Code:
attachInterruptVector( IRQ_UART0_ERROR, user_isr_tx );
mean that the txEventHandler is only called when there is transmit error ? or does it include tx completion?
or is the tx completion in the rxEventHandler since that is attached to IRQ_UART0_STATUS.

maybe it should be called error event and status event instead of tx and rx event.
 
I think sending data>TX_BUFFER_SIZE can still be asynchronous if sending the next chunk is done in the txEventHandler so main loop can still do other stuff. Is this possible? considering txEventHandler is invoked from an ISR.
Yes, the user could put code in the handler to do that, using the availableForWrite function.
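One way that user-side code could look, as a rough sketch only. `ChunkedSender` is a hypothetical helper, and `Port` is assumed to be any type with an `availableForWrite()` call and a `write(ptr, len)` that returns the number of bytes accepted; whether the library's write already returns that count depends on the fix discussed earlier in the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper that feeds a long message to a port in buffer-sized chunks.
template <typename Port>
class ChunkedSender {
public:
    explicit ChunkedSender(Port &port) : port_(port) {}

    // Begin sending; the buffer must stay valid until done() returns true.
    void start(const std::uint8_t *buf, std::size_t len) {
        assert(buf != nullptr || len == 0);
        next_ = buf;
        remaining_ = len;
        pump();
    }

    // Queue as much as currently fits, then return without waiting.
    // Intended to be called again from the transmit event handler.
    void pump() {
        if (remaining_ == 0) return;
        std::size_t room = static_cast<std::size_t>(port_.availableForWrite());
        std::size_t chunk = remaining_ < room ? remaining_ : room;
        if (chunk == 0) return;
        std::size_t sent = port_.write(next_, chunk);
        next_ += sent;
        remaining_ -= sent;
    }

    bool done() const { return remaining_ == 0; }

private:
    Port &port_;
    const std::uint8_t *next_ = nullptr;
    std::size_t remaining_ = 0;
};
```

The main loop would call `start` once and the handler would call `pump` each time a transfer completes. Because the handler runs from an ISR, real code would also need to protect the shared state (for example by briefly disabling interrupts around `start`), which this sketch leaves out.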

If the main program waits for txeventhandler callbacks to completely send all data before updating the buffer contents, would it save some cycles to not do the memcpy to the internal tx buffer and just dma direct from the original buffer? maybe the whole thing in one shot and just callback when complete?
I already tried this, and it only works if the user's data is at global scope. If it's defined locally, the memory is released when the user's function returns, and the pointer no longer refers to a valid region. Otherwise you have to block until it's sent.

BUT, if you could set it up so the user has to define their outgoing data with global scope, you can blaze right through that "write" function. I mean it's nanoseconds. BUT if you try to send anything else before that data is finished, you either have to block, skip until it's done, or kill the current DMA transfer and start a new one.
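The "skip until it's done" option could be sketched roughly like this. `ZeroCopyTx` is a hypothetical class, not part of the library, and it only tracks state: on real hardware `send` would also program the DMA source address and count, and `on_complete` would be called from the DMA-complete interrupt.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical zero-copy transmitter state (no real DMA access here).
class ZeroCopyTx {
public:
    // Returns false ("skip") if a previous transfer is still running.
    // The caller must keep `buf` alive (static or global storage)
    // until on_complete() has run.
    bool send(const std::uint8_t *buf, std::size_t len) {
        assert(buf != nullptr && len > 0);
        if (busy_) return false;
        busy_ = true;
        src_ = buf;  // no memcpy: the DMA would read straight from the caller's buffer
        len_ = len;
        return true;
    }

    // Marks the transfer finished so the next send() is accepted.
    void on_complete() {
        busy_ = false;
        src_ = nullptr;
        len_ = 0;
    }

    bool busy() const { return busy_; }
    const std::uint8_t *source() const { return src_; }
    std::size_t length() const { return len_; }

private:
    volatile bool busy_ = false;  // would be shared with the ISR
    const std::uint8_t *src_ = nullptr;
    std::size_t len_ = 0;
};
```

The call itself does almost no work, which is where the speed comes from, but a second `send` made before completion is simply refused, so the caller has to retry, block, or drop the data.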

Another performance gain could be achieved by not inheriting Print class at all and rolling your own top level code but the gain is probably minimal.
 