The reason "why" has to do with trade-offs in the actual silicon fabrication processes.
On Teensy, the processor, RAM and non-volatile storage are all on the same piece of silicon inside a single chip. The silicon can't be overly optimized for any one thing. Instead it has to strike a balance that achieves good overall performance for all 3.
On Raspberry Pi, the processor, RAM and non-volatile storage are separate chips. Each piece of silicon can be highly optimized for its specific task. The Broadcom BCM2835 processor on the RPi actually has a memory chip stacked on top of it, so the 512 MB of RAM is separate silicon optimized for memory density (silicon like that would probably only be able to implement a 1980s-era processor). Some other "single chip" SoC boards actually have 2 pieces of silicon mounted inside 1 plastic package. On a RPi, the non-volatile storage in the SD card is probably also 2 pieces of silicon: one chip optimized for high density flash and a small controller chip.
The key point, the reason why, involves dramatically optimizing the silicon fabrication for a particular purpose, at the expense of other applications. There are people who are truly experts in this silicon fab stuff. I'm not one of them. This is only my general knowledge. Someone who really does this stuff could speak much better about the specific silicon trade-offs (except much of this stuff is closely guarded trade secrets of some of the world's most powerful companies). But here are some very general ideas....
Flash memory's dual gate requirement has traditionally been the big speed-limiting issue in silicon fabrication. Normal CMOS fabrication requires only 1 thin oxide layer to separate the gates from the chip's substrate, and it only needs to insulate well enough for the transistor to work at relatively slow clock speeds. In flash memory, 2 thin oxide layers are needed. The non-volatile storage is achieved by trapping electrons on a floating transistor gate between the oxide layers. Each layer needs to insulate extremely well, since those electrons are supposed to remain trapped there for over 100 years at room temperature.
Silicon fabrication is done in layers, usually by growing an oxide layer (pure glass) on top of the wafer plus everything done in the previous steps. Then a photosensitive mask chemical (the resist) is added and exposed to light through the masks that define where the circuit features will be. The wafer is then exposed to a strong acid that etches away the oxide (glass) where the mask allowed light. Then the wafer is coated with other stuff and baked at very high temperature, causing that stuff to become part of the chip (eg, "stuff" can be materials that implant into the silicon itself, or grow more layers on top of remaining oxide that may itself be on top of other layers). Then more acid or other chemicals remove the excess stuff, the resist and unneeded oxide. This is repeated many times, building up the many features and layers inside the chip, starting with the N+ and P+ implants that form the source and drain of transistors, then the "polysilicon" layers that form the transistor gates, and finally metal layers for signal routing.
One pretty incredible challenge in this fabrication process is not destroying the work from all the previous layers. My understanding is the general approach involves using progressively lower temperatures for each step. The upper layers tend to have lower resolution and other limitations. The entire process is really a pretty marvelous achievement of modern technology. But it's far from magic. There are a LOT of difficult tradeoffs.
Those tradeoffs can be made in many different ways, which can optimize the process for a particular application, but too much optimization for one thing can cause that silicon fab to be nearly useless for others.
Flash memory's requirement for floating gates apparently imposes a lot of limitations on all the other layers that can be fabricated inside the chip. Again, there are people who really know the details, but I sadly only have general knowledge in this area. I've been told DRAM processes involve multiple layers of polysilicon, which has high resistivity (slow circuit speeds) but can be made with incredibly fine resolution and is made at much higher temperature. For designing logic circuits that run at high speed, you generally want to connect the polysilicon gates to low impedance metal routing as closely as possible, since the gates are capacitive. There are a LOT of trade-offs in these silicon processes.
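To put rough numbers on why resistive polysilicon driving capacitive gates is slow, here's a toy RC estimate in Python. Every value in it (the sheet resistances, the 10 fF load, the wire dimensions) is an assumption picked only for illustration, not data from any real fab process, and it ignores the wire's own capacitance. It only shows how the delay scales with the resistance of the routing material.

```python
# Toy RC estimate: a long polysilicon run versus a metal run of the same shape.
# All numbers below are illustrative assumptions, not values from a real process.

def wire_resistance(sheet_ohms_per_square, length_um, width_um):
    """Resistance of a rectangular wire: sheet resistance times number of squares."""
    return sheet_ohms_per_square * (length_um / width_um)

def rc_time_constant(resistance_ohms, capacitance_farads):
    """Single-pole RC time constant, in seconds."""
    return resistance_ohms * capacitance_farads

POLY_SHEET = 25.0    # ohms per square, assumed for polysilicon
METAL_SHEET = 0.05   # ohms per square, assumed for a metal layer
GATE_LOAD = 10e-15   # 10 fF of gate capacitance being driven, assumed
LENGTH_UM = 100.0    # wire length in micrometers, assumed
WIDTH_UM = 0.5       # wire width in micrometers, assumed

poly_tau = rc_time_constant(
    wire_resistance(POLY_SHEET, LENGTH_UM, WIDTH_UM), GATE_LOAD)
metal_tau = rc_time_constant(
    wire_resistance(METAL_SHEET, LENGTH_UM, WIDTH_UM), GATE_LOAD)

print(f"polysilicon route: {poly_tau * 1e12:.2f} ps")
print(f"metal route:       {metal_tau * 1e12:.2f} ps")
print(f"ratio:             {poly_tau / metal_tau:.0f}x")
```

With these assumed values the ratio is just the ratio of the two sheet resistances, which is the whole point: the same gate load charges far more slowly through the more resistive material.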
So that's why. On Teensy and all flash-based microcontrollers, the silicon is optimized for balance to achieve good overall performance. You get logic circuitry, volatile and non-volatile memory all on the same piece of silicon, but it can't be too heavily optimized for any one of those things without sacrificing performance on the others. The on-chip flash memory imposes a lot of constraints on the silicon design. You also tend to get fairly low power, because everything is on the same chip, and also because the optimizations for flash memory tend to be similar to the optimizations for low power.
Even though Raspberry Pi might be called a SoC (System on Chip), in reality it's separate chips for the CPU+GPU and the DRAM, where the Broadcom chip is highly optimized for fast logic circuitry and the DRAM chip on top is optimized for dense volatile memory. Inside the SD card, there's a high density flash chip (or stack of such chips in the larger cards) fabricated on a silicon process that's optimized for flash only. In fact, NAND flash is so optimized for density that a small percentage of the sectors are bad and more defects develop over time, so SD cards have a small controller (fabricated on different silicon) that manages the media defects and performs wear leveling. Extremely optimized NAND flash might not even retain data for 10+ years, since the controller makes heavy use of error detection and correction algorithms and automatically reassigns data to new areas of the media as defects develop. They don't tell you such things when you buy a 32GB card at a retail store, but internally that capacity is made possible by these types of extreme optimizations.
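For a feel of what "manages the media defects and performs wear leveling" means, here's a very simplified Python sketch. The class and method names (ToyFlashController, mark_bad and so on) are made up for this example. Real SD card controllers are proprietary and far more complicated (error correction, pages within erase blocks, garbage collection), so treat this purely as a toy model of the idea: logical blocks get remapped to whichever good physical block has the least wear, and data is moved when a block goes bad.

```python
# Toy model of a flash controller: bad-block remapping plus naive wear leveling.
# Hypothetical names throughout; this is not how any real SD card is implemented.

class ToyFlashController:
    def __init__(self, physical_blocks, factory_bad=()):
        self.erase_counts = [0] * physical_blocks  # wear per physical block
        self.bad = set(factory_bad)                # blocks never to be used
        self.data = {}                             # physical block -> contents
        self.mapping = {}                          # logical block -> physical block

    def _pick_free_block(self):
        """Choose the least-worn good block that is not currently mapped."""
        in_use = set(self.mapping.values())
        candidates = [b for b in range(len(self.erase_counts))
                      if b not in self.bad and b not in in_use]
        if not candidates:
            raise RuntimeError("no spare blocks left")
        return min(candidates, key=lambda b: self.erase_counts[b])

    def write(self, logical, contents):
        """Write to a fresh physical block so repeated writes spread the wear."""
        new_block = self._pick_free_block()
        self.erase_counts[new_block] += 1  # simplification: one erase per write
        self.data[new_block] = contents
        old_block = self.mapping.get(logical)
        if old_block is not None:
            self.data.pop(old_block, None)  # old copy is now stale
        self.mapping[logical] = new_block

    def read(self, logical):
        return self.data[self.mapping[logical]]

    def mark_bad(self, physical):
        """Retire a block that developed a defect, relocating any data on it."""
        self.bad.add(physical)
        for logical, phys in list(self.mapping.items()):
            if phys == physical:
                contents = self.data.pop(physical)
                del self.mapping[logical]
                self.write(logical, contents)


# Rewrite the same logical block many times on a card with one factory defect.
ctrl = ToyFlashController(physical_blocks=8, factory_bad={3})
for i in range(20):
    ctrl.write(0, f"version {i}")
print(ctrl.read(0))
print(ctrl.erase_counts)
```

Even though the host only ever writes logical block 0, the wear ends up spread across all the good physical blocks, and the factory-bad block is never touched. That hidden indirection is what lets the card present a clean, defect-free capacity on top of imperfect media.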
This turned out really long, but hopefully that gives you a better idea of why the market is filled with 2 different classes of products.