AFAIK, you can move the SPI signals to a limited set of alternate pins, but I have never attempted to do so: too much pain for too little gain, IMO.
FWIW, I would look at SdFat as a means of storing data on an SD card. Bill Greiman's SdFat library has some very good examples of fast logging to an SD card.
Whether you can get 500 samples/s out of four channels will likely depend on a couple of things, e.g. what resolution you are looking for, how often the data has to be written to disk, and so on. On the Teensy 3.1 you can make use of two ADCs, and you should be able to get at least 2 ksps out of each of them, even at 13 bits of resolution. However, choose your pin assignments carefully to make that happen (i.e. two channels per ADC). In an ideal world, you may also discover that you can sample differentially.
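To put rough numbers on that, here is a small back-of-the-envelope sketch in plain C++ (it runs on a PC, it is not Teensy code). The struct and function names are made up for illustration, and the 2 bytes per sample is my assumption (a 13-bit result stored in a 16-bit word). The 2 ksps per ADC is just the conservative figure from above, not a datasheet limit.

```cpp
#include <cstdint>

// Hypothetical helper for a rough sampling budget. Not part of any library.
struct AdcBudget {
    std::uint32_t perAdcRate;      // samples/s each ADC has to deliver
    std::uint32_t totalRate;       // samples/s summed over all channels
    std::uint32_t bytesPerSecond;  // raw data rate before any compression
    bool fits;                     // true if perAdcRate <= adcCapacity
};

inline AdcBudget adcBudget(std::uint32_t channels,
                           std::uint32_t ratePerChannel,
                           std::uint32_t adcCount,
                           std::uint32_t adcCapacity,
                           std::uint32_t bytesPerSample) {
    AdcBudget b{0, 0, 0, false};
    if (adcCount == 0) {
        return b;  // nothing to sample with
    }
    // Spread the channels as evenly as possible; round up for the busiest ADC.
    const std::uint32_t channelsPerAdc = (channels + adcCount - 1) / adcCount;
    b.perAdcRate = channelsPerAdc * ratePerChannel;
    b.totalRate = channels * ratePerChannel;
    b.bytesPerSecond = b.totalRate * bytesPerSample;
    b.fits = b.perAdcRate <= adcCapacity;
    return b;
}
```

Calling adcBudget(4, 500, 2, 2000, 2) gives 1000 samples/s per ADC, 2000 samples/s in total and 4000 bytes/s of raw data. On those assumptions the ADCs themselves are not the bottleneck. What matters more is how and when the data gets written out.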
In my data logger application, I compress the samples into second-by-second data and store it locally until a master CPU calls for it. Then the local MCU sends the lot in one burst and goes back to work with a minimum of disruption (I'm using a Teensy LC for the measurements and a 3.1 as the master). One reason I'm using this two-step process is that very consistent timing is important in my application: I'm measuring power, and the phase lag between the current and voltage measurements has an impact on accuracy.
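For what it's worth, here is a minimal sketch of what "second-by-second" reduction could look like for a power measurement. This is not my actual logger code, just plain C++ that runs on a PC. The class and field names are made up for illustration, and it assumes the samples have already been scaled to volts and amps and that voltage and current are sampled as a pair.

```cpp
#include <cmath>
#include <cstdint>

// One record per second instead of hundreds of raw samples.
struct SecondSummary {
    double vrms;         // RMS voltage over the interval
    double irms;         // RMS current over the interval
    double realPower;    // mean of v * i
    double powerFactor;  // realPower / (vrms * irms), 0 if undefined
};

// Hypothetical accumulator: add() once per sample pair, finish() once per second.
class SecondAccumulator {
public:
    void add(double v, double i) {
        sumVV_ += v * v;
        sumII_ += i * i;
        sumVI_ += v * i;
        ++count_;
    }

    std::uint32_t count() const { return count_; }

    // Returns the summary for the samples seen so far and resets the sums.
    SecondSummary finish() {
        SecondSummary s{0.0, 0.0, 0.0, 0.0};
        if (count_ > 0) {
            const double n = static_cast<double>(count_);
            s.vrms = std::sqrt(sumVV_ / n);
            s.irms = std::sqrt(sumII_ / n);
            s.realPower = sumVI_ / n;
            const double apparent = s.vrms * s.irms;
            s.powerFactor = (apparent > 0.0) ? s.realPower / apparent : 0.0;
        }
        sumVV_ = 0.0;
        sumII_ = 0.0;
        sumVI_ = 0.0;
        count_ = 0;
        return s;
    }

private:
    double sumVV_ = 0.0;
    double sumII_ = 0.0;
    double sumVI_ = 0.0;
    std::uint32_t count_ = 0;
};
```

Because real power is the mean of v * i, any extra delay between the voltage sample and the current sample looks like additional phase lag and shifts the result. That is why consistent timing matters so much here. The upside is that the master only has to fetch four numbers per second instead of the raw sample stream.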
The Teensy can still talk serially via USB with your PC. However, as with any other input/output, I'd limit how much you send: you don't want to keep your MCU from hitting its potential on the measurements just for the sake of spitting everything out to your serial console.