Options for 'bare-metal' development

Status
Not open for further replies.
@Paul, that's not really a "great" solution, replacing one ISR with another slightly more complex ISR called half as many times. It's interesting though :) Edit: or did I misunderstand, and you're talking about much larger buffers?

As I mentioned above, you'd generate buffers of about 100 steps. That's a 100X reduction in interrupt overhead.
 
It wasn't clear to me when you mentioned it, sorry.

But I still think (using the 10% ISR overhead guess) at 41KHz it would only increase by 4KHz = 45KHz

The extra complexity isn't justified, I don't think.
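
To put rough numbers on that estimate, here is a minimal sketch. It assumes each step costs a fixed amount of work plus a fixed interrupt entry/exit overhead, and that batching pays the overhead once per batch. The function name and the model itself are illustrative assumptions, not measurements.

```cpp
// Rough model, not a measurement: each step = useful work + ISR overhead.
// With batching, the overhead is paid once per interrupt instead of once per step.
double batchedStepRateHz(double baseRateHz, double overheadFraction, int stepsPerInterrupt) {
    double perStepSeconds = 1.0 / baseRateHz;                  // cost of one step today
    double overheadSeconds = perStepSeconds * overheadFraction; // assumed ISR entry/exit share
    double workSeconds = perStepSeconds - overheadSeconds;      // the part batching cannot remove
    return 1.0 / (workSeconds + overheadSeconds / stepsPerInterrupt);
}
```

With a 41 kHz starting point, a 10% overhead guess and 100 steps per interrupt, this model gives roughly 45.5 kHz, in line with the "about 45KHz" figure above. The gain is capped by the overhead share, however large the buffer gets.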
 
Bummer the standard display is such a slow one. Trying to look up your model just turned up whole units, not the display on its own. Those other displays are all under $10 on eBay.

SparkFun shows an LPC1768 at $55? You could get a T_3.6 and a T_3.5 for that price.

I wonder how the T_3.5 OC'd to 144 or 168 MHz would run in your case? With more FLASH and 3X the RAM and the built in faster SDIO SD that should be less overhead - there should be ways to gain on the other MCU.
 
It's not just about a cheap display, the repraps have both controls and a full size SD card slot.

For the LPC1768 I'm using a RE-ARM board that I got from Kickstarter for ~$40 (I think it's going to be $49 though). It's already in a MEGA2560 form factor (so a RAMPS 3D printer board plugs right in) and has the 3.3 V to 5 V logic level conversion required. One of my Teensys is soldered into a MEGA2560 (originally blank) shield, with a bunch of extra components now, and its price so far is about the same. But that's the price of having fun, isn't it.

More flash and more RAM aren't going to do much for this project, as they both have enough for what I've got planned. OC'ing isn't a bad thought though, but the T3.6 is sounding better because I'm not really using the 5 V tolerance of the T3.5. It was just a safety feature in my mind. 240MHz will probably do what I want, but there's a bit of work getting one of them set up like my T3.5 and I'm not on holidays anymore, so it'll have to wait some time.

(Edit: I think the RRLCD uses an ST7920)
 
It's not just about a cheap display, the repraps have both controls and a full size SD card slot.

That's a bit more about the display - but still don't know what it is. Monochrome i2c with buttons? With SD is it SPI sharing the bus?

Not sure of the nature of the process - thought perhaps the RAM and extra Flash might lead to more room to work ahead. OC'ing is YMMV - would be nice if it fits your use case though.
 
ST7920, monochrome, SPI and on a Mega2560 it uses different pins for the SD SPI but the RE-ARM is all wired to the same SPI port. (That's a bit annoying actually, along with the decision to not use a PWM pin for "D8".)
 
But I still think (using the 10% ISR overhead guess) at 41KHz it would only increase by 4KHz = 45KHz

The extra complexity isn't justified, I don't think.

DMA should allow 4 outputs to all run up to at least 500 kHz with very little CPU overhead.

I think the much bigger issue is a very well established way of thinking in the CNC & 3D printer communities.
 
I'm not sure what the bigger issue is, it's better to talk about it and then try stuff out, I find.

I've "moved" my code from the LPC1768 to the T3.5 (I didn't change much, just added a heap of volatile variables that match the hardware of the LPC chip and a loop calling the ISR for each stepper in turn) and I saw that the extra 20MHz on the T3.5 gave about 30% extra performance. This equates to about 55KHz stepping instead of 41KHz.

Then I removed all the code to do with control, leaving just the timer calculations. I added loops of 100 so that it was all done in one function (plus 4 extra multiplies: match_register_x *= prescaler_x; since this would be needed on the T3.5) and tested how many calculations could be completed in 1 second for 4 steppers. The result was 126KHz. So I expect on the LPC chip that would be ~97KHz.

My conclusions are:
1/ The extra complexity does give you something not to be sneezed at. However, in pulling the code out I realised "end-stops" need to be tested by DMA now, which is a further complication.
2/ "500KHz" is not going to happen with any of these devices.
3/ "little CPU overhead" is a dream, and basically ignored a large part of my original argument, but I admit my initial calculations were way off.
 
Right now I'm working on EHCI support for the USB host port on Teensy 3.6. But in a month or so when that's done, I'll look into possibly creating a library or solid example of using DMA and the combine-compare timer feature.
 
I feel inclined to throw my two cents in here, because I play in both worlds, Teensy and CNC router stuff.

I think Teensy for cnc is one of the best ideas I have chanced across in this Forum.

I own a little YooCNC 6040, which I retrofitted with new electronics and a water system for routing PCBs and G10 glass composites. I currently run LinuxCNC.

20150528_170406_smaller.jpg

I think it's a great idea to create a CNC controller based upon Teensy. I know I would buy it, assuming the board had nice isolated screw-terminal connections on it like others.

(By the way, on the topic of opto-isolated IO pins, limit and home switches, etc, do not settle for 5 volts, have option for the isolated inputs to accept up to 24 volts like MESA 7i76 does, for noise immunity, it solves many common noise issues especially on lesser machines which might be poorly grounded, etc. Plus many common inductive proximity sensors work up to 24 volts.)

That would be a hot seller, because it would be right at the center of several hobby universes; fast Arduino (faster than others), CAN bus?, Ethernet option (think smooth stepper), various user-developed sensors & probes, user firmware tweaking, open source, tons of IO for home/limit switches. And then throw in the 3d printer folks and their requirements = big market.

I am currently working on a Teensy/Arduino optical displacement sensor for workpiece height correction, to make the router bit follow warped PCBs, etc to solve the un-even height issues when removing PCB copper on cheaper machines. I strongly suspect it would be easier for me to make my Teensy-based sensor talk to a Teensy-based cnc controller, than to LinuxCNC. If enough CPU resources were freed (via those DMA tricks for handing the pulses) on the Teensy cnc controller to use my sensor with it directly, that would be even better.

I think if you can leverage DMA to run stepper pulses as fast as "4 outputs to all run up to at least 500 kHz", that would be icing on the cake. Most hobby routers need only about 100,000 pulses per second or less to run respectably, so there is a nice margin, and there would be more "room" for dividing the pulses down for micro-stepping.

The DMA benefits might be a lot worse than 500 kHz max per motor, but looking only at pulse speed misses the other big issue: leaving enough CPU to do other things. In my case, that means running a latency-sensitive sensor on the same Teensy without significantly increasing jitter and latency for the sensor or the motor pulses. From this perspective, the more CPU cycles that are "freed up" to handle other tasks, the better.

My other thought was that the Teensy 3.6 has a CAN bus, which I suspect is becoming a popular bus for industrial control of servos and steppers. Maybe that's a nice touch, but I know little of the specs and requirements for CAN bus in general, and less about its use for controlling CNC machines. Maybe it could bring advantages? I think LinuxCNC supports the CAN bus now, and there is some buzz over there about its advantages, etc.

The LinuxCNC developers and enthusiasts are a great source of quick info on technical subjects like timing requirements, etc and are generally available on their forum and on their IRC chat, see http://linuxcnc.org

Done right, I bet it would become its own thing, a TeensyCNC-based controller, with its own website(s), etc, from folks who build it out.

Do get that DMA working for stepper pulses, that's huge potential. I am happy to offer any facts I can as an operator, and testing such a beast on my machine.

Now I wonder if Teensy 3.6 has enough power/DMA tricks to do closed loop servo too, hmmm. (another topic, really)

But it already sounds like a killer app for controlling steppers at those suggested pulse rates,
and enough IO to handle home & limit switches, a probe input, some relays, motor speed 1 pulse-per-turn encoder input, etc.

And once again my mind wanders into the topic of could Teensy 3.5/6 handle a closed loop requirements, 4 motor encoders, servos, hmmm.

One challenge at a time, though.
 
...
(By the way, on the topic of opto-isolated IO pins, limit and home switches, etc, do not settle for 5 volts, have option for the isolated inputs to accept up to 24 volts like MESA 7i76 does, for noise immunity, it solves many common noise issues especially on lesser machines which might be poorly grounded, etc. Plus many common inductive proximity sensors work up to 24 volts.)

Very good remark! - I struggled some time with false triggers on limit switches myself. All measures will be taken to protect against this, and your suggestion to have 12 or 24V signals here is a good tip. As we're opto-isolating them this is easy to do.

I'm aware of the existing CNC communities (linuxcnc, smoothieware, tinyG, GRBL...), and I'm getting inspiration there - thanks.


...
One challenge at a time, though.

Absolutely !
Let's get this controller working first, and then improve it :
* higher speed through DMA
* more interface types (Ethernet, CAN, ...)
* more GCode support (cutter compensation)
* ...
 
I made a first experiment to have DMA generate a configurable waveform.

https://github.com/PaulStoffregen/StepperPulse/blob/master/k66_dma_stepper.ino

file.png

The waveform comes from this data:

Code:
    (10000u << 16) | 0xDD, // set on match
    (30000u << 16) | 0xD9, // clear on match
    (40000u << 16) | 0xDD, // set on match
    (45000u << 16) | 0xD9, // clear on match
    (25000u << 16) | 0xD9, // clear on match

The first wide yellow pulse is 10000 to 30000 for the first 2 lines. Then the next narrow pulse is 40000 to 45000. The blue waveform is a 50% duty cycle showing the timer count, where 0-32767 is high and 32768-65535 is low. The long time the yellow stays low is due to a redundant clear at 25000, which occurs in the next timer cycle, and then another timer rollover is needed before the 10000 match can happen.
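
As a reading aid for the table above, here is a minimal sketch of how each 32-bit entry could be unpacked. The layout is an assumption inferred from the data and its comments (compare value in the upper 16 bits, a channel configuration byte below it, with 0xDD meaning "set on match" and 0xD9 meaning "clear on match"). The struct and function names are hypothetical and do not come from the linked sketch.

```cpp
#include <cstdint>

// Hypothetical view of one table entry.
struct MatchEvent {
    uint16_t count;   // timer count at which the match fires
    bool setsOutput;  // true = set on match, false = clear on match
};

// Assumed layout: compare value in bits 31..16, configuration byte in bits 7..0.
MatchEvent decodeEntry(uint32_t word) {
    MatchEvent e;
    e.count = static_cast<uint16_t>(word >> 16);
    e.setsOutput = (word & 0xFFu) == 0xDDu;  // 0xDD = set, 0xD9 = clear, per the comments above
    return e;
}

// Width of a pulse in timer ticks, allowing for 16-bit counter wrap-around.
uint16_t pulseWidthTicks(const MatchEvent& rise, const MatchEvent& fall) {
    return static_cast<uint16_t>(fall.count - rise.count);
}
```

Under that assumption, the first two entries decode to a pulse 20000 ticks wide (10000 to 30000) and the next two to a pulse 5000 ticks wide (40000 to 45000), matching the description of the yellow trace.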

So far, this is just experimenting with the DMA and timer settings. Next up will probably be playing with the scatter-gather feature to automatically chain blocks of outputs without any timing gaps, and the channel linking so a 2nd I/O pin can be controlled for the direction signal.
 
Thanks Paul, will take a look at your code asap.

I don't know how familiar you are with controlling steppers for this kind of machine, so let me summarize the design challenge here :

* for each motion (= a movement from one position to a next position) there are 3 phases :
1. an acceleration phase
2. a constant speed phase
3. a deceleration phase

Each of the phases is optional, but there must be at least 1 of course.
Constant speed is simple, as the stepper pulses have a constant spacing in time (ie. a fixed number of timer-ticks)

Acceleration/Deceleration is more challenging, as speed is constantly changing and so is the time between steps.
* In the so-called T-profile (Trapezoid = 2nd Order) speed changes linearly over time (constant acceleration)
* In the so-called S-profile (= 3rd Order) speed changes quadratically over time (constant jerk)
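
To illustrate why the step spacing keeps changing in the T-profile case, here is a minimal sketch for constant acceleration from rest. It uses only the kinematic relation n = a·t²/2, so step n happens at t = sqrt(2n/a). The function names are hypothetical, and a real controller would likely use integer timer ticks and an incremental approximation rather than a square root per step.

```cpp
#include <cmath>

// Time in seconds at which step n occurs when accelerating from rest
// at accelStepsPerSec2 (constant acceleration, T-profile ramp).
double stepTimeSeconds(unsigned n, double accelStepsPerSec2) {
    return std::sqrt(2.0 * n / accelStepsPerSec2);
}

// Delay between step n-1 and step n (n must be at least 1).
// It shrinks as n grows, so every step needs a new timer interval.
double stepIntervalSeconds(unsigned n, double accelStepsPerSec2) {
    return stepTimeSeconds(n, accelStepsPerSec2) - stepTimeSeconds(n - 1, accelStepsPerSec2);
}
```

For the S-profile the acceleration itself changes over time, so the same idea applies but the step times no longer follow a simple square root.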

For the tool (spindle / laser / 3d-print-head) to exactly follow the prescribed trajectory, the speeds and accelerations of all motors need to be coordinated.
A problem here is that when 2 axes (eg X and Y) move at relative speeds of (eg) 3 and 2, for every 3 steps of X there are 2 steps for Y.
Ideally these steps are spread evenly in time (X: 0 - 1 - 2 - 3 - 4 - 5 -, Y: 0 - 1.5 - 3 - 4.5 - ),
but many algorithms (eg. Bresenham) will drive the fast axis (X) and at each step of X optionally drive Y. (X: 0 - 1 - 2 - 3 - 4 - 5 -, Y: 0 - 1 - 3 - 4 - )
In that case the steps for Y will not be distributed evenly in time, decreasing accuracy and performance.
A possible solution is to 'oversample', eg X makes 3 steps in (eg) 24 ISR-ticks, and Y makes 2 steps in those 24 ISR-ticks. Now the steps for all axes are better spread in time. The price is (eg) 8 times more ISR calls.
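
The difference between the two approaches can be sketched as follows. This is a minimal illustration with hypothetical function names, not code from any of the firmwares mentioned. The first function is a simple Bresenham-style accumulator that can only step the slow axis on a fast-axis tick. The second spreads each axis evenly over a finer tick grid, as in the 24-tick example.

```cpp
#include <vector>

// Fast-axis tick indices at which the slow axis also steps
// (Bresenham-style error accumulator, starting from zero).
std::vector<int> bresenhamSlowAxisTicks(int fastSteps, int slowSteps) {
    std::vector<int> ticks;
    int acc = 0;
    for (int i = 0; i < fastSteps; ++i) {
        acc += slowSteps;
        if (acc >= fastSteps) {  // enough error accumulated: step the slow axis too
            acc -= fastSteps;
            ticks.push_back(i);
        }
    }
    return ticks;
}

// Oversampled alternative: place `steps` pulses evenly on a grid of `totalTicks`.
std::vector<int> evenlySpacedTicks(int steps, int totalTicks) {
    std::vector<int> ticks;
    for (int k = 0; k < steps; ++k) {
        ticks.push_back(k * totalTicks / steps);
    }
    return ticks;
}
```

For 6 fast steps and 4 slow steps the accumulator yields ticks 1, 2, 4, 5, so the gaps alternate between 1 and 2 fast-axis periods instead of a steady 1.5. The exact phase depends on how the accumulator is initialised, which is why it differs from the 0 - 1 - 3 - 4 pattern quoted above. On a 24-tick grid, X lands on 0, 8, 16 and Y on 0, 12, so both axes are evenly spaced.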

Finally, at any time the user may issue a 'feed hold', or a limit switch may trigger, which means all motors should come to a controlled stop, asap. This is a challenge for pre-calculating steps : at any time you should be able to divert from the pre-calculated plan, and switch to a different one. When doing this, it is important not to lose position etc, as after a resume, the machine has to resume the original plan.

I'm eager to learn from your FlexTimer/DMA experiments and see how we can make this work.
 
Indeed pre-computing steps will make "feed hold" a big challenge. I'll keep this in mind while experimenting with the DMA scatter-gather.

Accumulating total steps taken will also be a challenge.

The good news is step sizes will have very fine granularity, basically the same as if the ISR were running & oversampling at 60 MHz rate.
 
Ok, let's talk about hardware. Should there be a baseline hardware board, perhaps similar to RAMPS 1.4 for Arduino Mega? Is RAMPS 1.4 a good starting point, or too old to be useful?

With all board designs there's a lot of trade-offs to make between features, size, complexity and cost. Is it even possible to come up with a decent baseline board that has what most people need and can be expanded or modified for the rest?
 
My ideas on hardware are on a dedicated wiki-page

In short : what we need on top of the Teensy itself is :
1. power supply (going from 24..36V to 3.3V)
2. open-collector drivers towards the stepper drivers and solid state relays - think ULN2803
3. opto-coupled inputs for limit-switches, probe, buttons, etc. - think ILQ615 - to some extent we may need some (overvoltage) protection / (low pass) filtering on these inputs.
4. maybe TTL-RS232 signal conversion - think MAX3232 - I'm still not sure which is most reliable here

Then later, we can think displays (SPI / I2C), and Spindle Control (Serial RS485), Ethernet etc..

Also check the reply from macaba in this thread
 
Actually, where your expertise could be really helpful is in validating which GPIO's are 'safe' to use.

I'd like to use a consecutive number of bits on a port, to keep things clean and simple.
For example :

PortC[0..11] for 6 axis Step-pulse and Direction
PortD[0..7] for limit switches
PortE[0..1] for serial communication
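
As a sketch of what consecutive bits buy you, the 12 step/direction lines could be addressed with simple shifts. The post does not say which bit is step and which is direction, so the interleaving below (axis n uses bit 2n for step and bit 2n+1 for direction) is purely an assumption, and all names are hypothetical.

```cpp
#include <cstdint>

// Assumed interleaving on PortC[0..11]: axis n -> bit 2n = step, bit 2n+1 = direction.
constexpr unsigned kAxisCount = 6;
constexpr uint32_t kMotionMask = 0x0FFFu;  // PortC bits 0..11

constexpr uint32_t stepMask(unsigned axis) { return 1u << (2u * axis); }
constexpr uint32_t dirMask(unsigned axis)  { return 1u << (2u * axis + 1u); }

// Build one port word from per-axis flags (bit n of each argument = axis n).
uint32_t buildMotionWord(uint32_t stepAxes, uint32_t dirAxes) {
    uint32_t word = 0;
    for (unsigned axis = 0; axis < kAxisCount; ++axis) {
        if (stepAxes & (1u << axis)) word |= stepMask(axis);
        if (dirAxes & (1u << axis))  word |= dirMask(axis);
    }
    return word & kMotionMask;
}
```

With a contiguous block like this, all six axes can be updated with a single masked port write, which is the "clean and simple" property the layout is after.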

But at the same time I want to keep as many options open as possible for future add-ons (SPI, I2C, Ethernet, CAN, you-name-it). I haven't done this check yet.

Remark : the above layout would also work for a Teensy3.2 in case you want just 3 (or less) axis, and less speed..

Here is a link to an Excel cheat sheet I've been collecting to make these decisions
 
Ok, let's talk about hardware. Should there be a baseline hardware board, perhaps similar to RAMPS 1.4 for Arduino Mega? Is RAMPS 1.4 a good starting point, or too old to be useful?

With all board designs there's a lot of trade-offs to make between features, size, complexity and cost. Is it even possible to come up with a decent baseline board that has what most people need and can be expanded or modified for the rest?

It seems odd that people start these CNC control threads based on AVR hardware/firmware as their starting point. There is already an open source project based on LPC1769, both hardware and software, wouldn't that be the place to start?

http://smoothieware.org/
http://smoothieware.org/smoothieboards
http://smoothieware.org/smoothieboard-v1

Review video: https://www.youtube.com/watch?v=vsu_vAKvRO0

If I was going to port to Teensy I would use that as a base.
 
Yes, I know, and there are even more boards and SW : GRBL, TinyG, LinuxCNC, xPro, RAMPS, ... (I am currently running GRBL on my machine, with a self-made shield).
I started with an xPro V2 board. I soon replaced it with decent stepper drivers and the performance (speed, accuracy, torque, reliability) of my machine went up 500%. I also had long lasting problems with false limit-switch triggers with the xPro.

So for the hardware, I know that the all-in-one boards are no decent solution for CNC. I understand they are an attractive one-stop-shop offer for beginners, but that's not what I am building.

Then for the software : I'm quite happy with the minimalist approach from GRBL, but it's really squeezed into the Arduino, and there is not a free byte left to play with. We could port GRBL to the Teensy, just to have more memory and IO's. FYI, GRBL has been ported to the Smoothieboard, STM32 etc. After studying the GRBL code, I decided it was a better option to rewrite rather than do a dumb port.
 
I've also created a spreadsheet that shows the alternate pin functions of a T3.5 (see attachment).

RE: RAMPS 1.4, there are a few issues I often have with them, end-stop switches picking up stepper motor wire cross-talk, the power plug melting when using high powered heaters, only 5 drivers (having just one more would be really nice) and having 3 more MOSFETs for fans would be ideal too (two for extruder cooling and one for printed part cooling). But they are dirt cheap, and many existing machines (of all types) are built around them. It's also not hard to add an extra driver and/or more MOSFETs since they have a lot of spare header pins.

Regarding Smoothie, yep, that's what came with my RE-ARM board and it worked right away, well, with the supplied config file. The whole reason the RE-ARM was designed on "AVR hardware" though was so people could upgrade their existing machines without any hassle. But Smoothie is based on GRBL along with all its design-decisions/limitations (so it [GRBL] can work on an AVR) which is something a couple of us have said we want to move away from. I'm also underwhelmed with its LCD interface, it's very basic and a long way away from firmwares like Marlin that have a nice graphic status screen and remember menu selections when returning to previous menus.

Using Smoothie as a place to start will see you throwing a lot of it away, but to be honest, my work on the LPC1768 did start there by me pulling out the library code for USB Serial, USB Mass Storage Devices, SD card, FAT-FS, Interrupt controller, Analog to Digital... Basically all the stuff that they got from mbed under the MIT license but no files that said "Smoothie" at the top. None of that is useful on the Teensy though. The only thing you might keep from Smoothie is the gcode processor, but I've already written my own, and it's actually not very complex.

Also, it's reported (the guy maintaining GRBL points this out) that Smoothie doesn't perform as well as expected: https://github.com/gnea/grbl/issues/67#issuecomment-267372291
 

Attachments

  • TeensyALTfuncs.zip (13.2 KB)
 
In my mind, to establish a well-regarded Teensy based motion controller project, it's important to determine if a significant improvement can be made over existing platforms, so my opinion says that's either of these things:

- Higher order motion.
(LinuxCNC setting the benchmark? The aforementioned S-profile is a good step in the right direction)

- Zero/negligible CPU-load pulse generation.
(Paul's DMA experiments are a good step in the right direction. Really lightweight interrupts are another option. A higher pulse rate is likely to be possible as a side-effect, but this potentially falls under the category of "Something people think they need but they don't really")

- More axes of motion.

Everything else is either:
- An already-solved problem (e.g. GCode parsing).
- A bolt-on feature (e.g. optoisolators = add on circuitry, CAN/RS485 = add on circuitry & library, Ethernet = add circuitry & library).
- A case of alternate board layouts (e.g. more IO, 3D printer specific variants, onboard stepper driver ICs).

Everything I've seen so far in this thread indicates that there is proof-of-concept work occurring of higher order motion and zero/low CPU-load pulse generation so I'm very encouraged by that, thank you. I'm looking forwards to seeing the results of experiments as they occur.
 
I have a diy delta 3D printer running with the following setup:
- Arduino Mega 2560 (with Repetier firmware)
- Ramps 1.4 modified to 24V
- DRV8825 stepper drivers (cooled from the bottom with a heatpipe)
- Nema 23 stepper motors

It works well, but I'm constrained in some ways. I would recommend to improve the following points:
- Overvoltage protection for the stepper drivers (like here https://my3dprinter.wordpress.com/category/schrittmotortreiber/)
(The voltage induced when moving the stepper motors by hand easily destroys the stepper driver.)
- Higher step frequency would reduce the noise and vibration (like TMC2100). Keyword microstepping.
- The Arduino runs at the limit (probably because of the delta calculations). In my case it is not possible to connect an LCD display to the same Arduino without losing print quality.

I also have a diy CNC machine which runs LinuxCNC, with the open-loop stepper drivers simply connected to the parallel port.
For such external stepper drivers (in my case DQ860MA) a Teensy hardware board would essentially need enable, step and direction terminals for each axis and some input terminals for the limit switches (24V preferred).

I can very well imagine that a Teensy hardware board could be produced universal for CNC machines as well as for 3D printers.
 
In an old Teensy 3D Printer thread I have been discussing the hardware design for a Teensy-based 3D printer controller.
There is also some work done by GhostProtoype to port the Marlin firmware HAL to the Teensy 3.5 and 3.6.
https://github.com/teemuatlut/Marlin/tree/32bit-HAL

I have started the schematic and layout work for a Teensy 3.5 or 3.6 based 3D printer controller: https://github.com/Flydroid/Teensy-3DPC
It is designed to use the Pololu style stepstick drivers, but in the SilentStepStick TMC2130 version with SPI configuration. Flyback diodes are already included in the design. I haven't considered optocouplers for input protection, probably because I haven't seen them on other printer controller boards.
As I haven't come far with the design yet, I'm very much open to input.
Especially if the DMA step generation has some requirements on the use of specific pins on Teensy.

I like the idea of a common Teensy motion controller design on which people can spin CNC or 3D printing specific boards.
 
GPIO Pinout for MY 3D printer based on Teensy 3.1

A couple years ago, I prototyped up a Teensy 3.1 based 3D printer based on some work by an engineer at Freescale.

http://arduino-pi.blogspot.ca/2014/12/teensy-31-repstrap-printer-with-dc.html

teensy-printer-pinout-v1.png

I have been dabbling with it off and on for the past two years, so it's nice to see others joining the quest.




 