If you are referring to the 32-button limit per USB Joystick, I recommend making your Teensy a USB Keyboard + Joystick pair, with the extra buttons producing standard keypresses.
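To make the split concrete, here is a minimal sketch of how button numbers could be divided between the joystick and the keyboard. It is plain C++ so it can be tried on a PC; the names `Action`, `mapButton` and the key table `kKeys` are made up for this example, and the key assignment is just an assumption.

```cpp
#include <cstddef>

// Hypothetical split: the first 32 physical buttons become joystick buttons,
// everything beyond that becomes a keyboard key.
constexpr int kJoystickButtons = 32;

// Assumed key assignment for the overflow buttons; any table would do.
constexpr char kKeys[] = "abcdefghijklmnopqrstuvwxyz0123456789";
constexpr int kKeyCount = static_cast<int>(sizeof(kKeys)) - 1;  // minus the '\0'

struct Action {
    bool isKey;  // false: joystick button, true: keyboard key
    int  code;   // joystick button number (1..32), key character, or -1 if unmapped
};

// Maps a zero-based physical button index to the action it should produce.
Action mapButton(int index) {
    if (index < 0) {
        return Action{false, -1};
    }
    if (index < kJoystickButtons) {
        return Action{false, index + 1};  // joystick buttons are numbered from 1
    }
    const int k = index - kJoystickButtons;
    if (k < kKeyCount) {
        return Action{true, kKeys[k]};
    }
    return Action{true, -1};  // more buttons than keys in the table
}
```

On the Teensy itself, with a USB Type that includes both Keyboard and Joystick selected, the result would then go to something like `Joystick.button(code, pressed)` or `Keyboard.press(code)` / `Keyboard.release(code)`. Check the Teensyduino documentation for the exact key codes you want to send.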
This thread has a custom Joystick that can support up to 128 buttons, but a lot of OSes and programs have issues with joysticks having that many buttons.
(I think the first microcontroller project I ever did was this arcade joystick with a couple of buttons, built around a Teensy++ 2.0. It produced configurable keyboard key codes so I could play SuperTux and online Flash games with it, and a rotary switch selected between preprogrammed key sets.)
If you use a Schottky diode (for the low voltage drop) per button (or a diode pair per button pair), you need R digital outputs and C digital inputs for R×C buttons. The diode allows you to detect each button/key separately. 100 BAT54W-HG3 Schottky diodes in SOD123 package (first one that popped up in a Mouser search; so definitely just an example) should cost less than $10 USD. I'd add a current-limiting resistor on each output, say 10 kOhm, for safety; and those are always useful to have around.
The idea is that the R outputs each select one "row" of buttons, with a diode each between the row and the button, and the button connected to the corresponding "column" input. Only one of the rows is selected at a time, so the inputs directly read the states of the buttons on that row.
If you connect the diodes from the row to the button, with anodes on the rows, and cathodes (lines/bars) on the buttons, then a high output selects a row, and the column inputs are inputs with internal pulldowns enabled. (That way, when a line "floats", it reads as low; and high when connected to the row via a pressed button and the diode.)
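The scanning scheme above can be sketched roughly like this. It is only an illustration: the matrix size (`kRows`, `kCols`) is assumed, and the hardware access is hidden behind two callbacks (`setRow`, `readCol`) that are hypothetical names, so the same logic can be tried on a PC.

```cpp
#include <array>
#include <cstddef>
#include <functional>

constexpr std::size_t kRows = 4;  // R row-select outputs (assumed size)
constexpr std::size_t kCols = 5;  // C column inputs (assumed size)

using ButtonStates = std::array<std::array<bool, kCols>, kRows>;

// setRow(r, on) drives row output r high (on) or low (off).
// readCol(c) returns true if column input c currently reads high.
ButtonStates scanMatrix(const std::function<void(std::size_t, bool)>& setRow,
                        const std::function<bool(std::size_t)>& readCol) {
    ButtonStates pressed{};
    for (std::size_t r = 0; r < kRows; ++r) {
        setRow(r, true);                    // select exactly one row
        for (std::size_t c = 0; c < kCols; ++c) {
            pressed[r][c] = readCol(c);     // high means the button at (r, c) is closed
        }
        setRow(r, false);                   // deselect before moving on
    }
    return pressed;
}
```

On a Teensy, `setRow` could wrap `digitalWrite()` and `readCol` could wrap `digitalRead()` on pins set up with `pinMode(pin, INPUT_PULLDOWN)`. With a series resistor on each row, a short delay between selecting the row and reading the columns may be needed for the lines to settle; how long is something to measure rather than assume.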
When you have multiple buttons pressed, the diodes stop the other columns from being pulled high through the other buttons. Without the diodes, certain combinations of pressed buttons make other buttons look like they're pressed too. The voltage is 3.3 V or so, and with a 10 kOhm current-limiting resistor on each row, the current is at most 3.3 V / 10 kOhm = 0.33 mA. Not much.
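The effect of the diodes is easy to see in a small simulation. The sketch below is a simplified logic-level model, not a circuit simulator: it assumes unselected rows are left floating, treats a pressed button as a plain connection, and lets a diode pass a "high" only from row to column. The names `Matrix` and `simulateScan` are invented for the example.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kRows = 3;
constexpr std::size_t kCols = 3;
using Matrix = std::array<std::array<bool, kCols>, kRows>;

// Returns what a row-by-row scan would report for the physically pressed buttons.
Matrix simulateScan(const Matrix& pressed, bool withDiodes) {
    Matrix seen{};
    for (std::size_t sel = 0; sel < kRows; ++sel) {
        std::array<bool, kRows> rowHigh{};
        std::array<bool, kCols> colHigh{};
        rowHigh[sel] = true;  // only the selected row is driven high
        bool changed = true;
        while (changed) {     // spread the "high" level until nothing changes
            changed = false;
            for (std::size_t r = 0; r < kRows; ++r) {
                for (std::size_t c = 0; c < kCols; ++c) {
                    if (!pressed[r][c]) continue;
                    // Row to column: the diode's forward direction, always allowed.
                    if (rowHigh[r] && !colHigh[c]) { colHigh[c] = true; changed = true; }
                    // Column back to row: only possible when there is no diode.
                    if (!withDiodes && colHigh[c] && !rowHigh[r]) { rowHigh[r] = true; changed = true; }
                }
            }
        }
        seen[sel] = colHigh;
    }
    return seen;
}
```

Pressing (0,0), (0,1) and (1,0) is the classic case: without diodes, selecting row 1 pulls column 0 high, which back-feeds row 0 through button (0,0) and then pulls column 1 high through button (0,1), so a phantom press shows up at (1,1). With diodes, the scan result matches the pressed buttons exactly.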
If you are running out of pins, you can instead use a 3-to-8-line decoder/demultiplexer chip, like a 74VHC238, so that just 3 output pins control 7 or 8 rows of buttons. (The extra row, usually row 0, is a special "not connected" row with no buttons, so that when not being actively scanned, the decoder does not need to source or sink any current. Note that the direction of the diodes also dictates the type of decoder: here, you'd want one that pulls only one output high at a time. The other type pulls one output low and all others high. That type would also work if you reverse the diodes and enable the internal pullups on the inputs.)
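As a rough model of the two decoder types and the "idle row" idea, consider the following. The functions describe only the logic (enable inputs are assumed to be asserted), and the row numbering in `addressForRow` is an assumption for the example: button rows 0..6 sit on decoder outputs 1..7, and output 0 is left unconnected.

```cpp
#include <cstdint>

// Active-high 3-to-8 decoder (74x238 style): exactly one of eight outputs is high.
// Bit n of the result is the level of output n.
std::uint8_t decode238(std::uint8_t address) {
    return static_cast<std::uint8_t>(1u << (address & 0x07u));
}

// Active-low variant (74x138 style): exactly one output is low, all others high.
std::uint8_t decode138(std::uint8_t address) {
    return static_cast<std::uint8_t>(~decode238(address));
}

// Address 0 selects the unconnected output, so no button row is driven.
constexpr std::uint8_t kIdleAddress = 0;

// Hypothetical numbering: button row 0..6 maps to decoder address 1..7.
std::uint8_t addressForRow(std::uint8_t row) {
    return static_cast<std::uint8_t>(row + 1u);
}
```

Scanning then means stepping the three address pins through 1..7, reading the columns at each step, and parking the address at `kIdleAddress` afterwards.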
With say three of those (so 9 row selector output pins for 7+7+7 = 21 rows) and 10 column input pins you get 210 individually detectable buttons.
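One way to drive three decoders, sketched under the assumption that each decoder has its own 3 address pins and that address 0 is its idle (unconnected) output: to select a row, give the decoder that owns it an address of 1..7 and park the other two at 0. All names here are hypothetical.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kDecoders = 3;
constexpr std::size_t kRowsPerDecoder = 7;  // the eighth output is the idle one
constexpr std::size_t kColumns = 10;

constexpr std::size_t kTotalRows = kDecoders * kRowsPerDecoder;  // 7+7+7 = 21
constexpr std::size_t kTotalButtons = kTotalRows * kColumns;     // 21 * 10 = 210
static_assert(kTotalButtons == 210, "matches the count in the text");

// Returns the 3-bit address for each decoder needed to select one global row.
// Decoders that do not own the row stay at address 0 (idle).
std::array<unsigned, kDecoders> addressesForRow(std::size_t row) {
    std::array<unsigned, kDecoders> addr{};  // all zero: everything idle
    if (row < kTotalRows) {
        addr[row / kRowsPerDecoder] = static_cast<unsigned>(row % kRowsPerDecoder) + 1u;
    }
    return addr;
}

// Zero-based button number for a given row and column.
std::size_t buttonIndex(std::size_t row, std::size_t col) {
    return row * kColumns + col;
}
```

For example, global row 8 belongs to the second decoder, so the addresses would be 0, 2, 0.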
If you don't need the pins on Teensy 4.0 for anything else, you can do e.g. 10×13 for 130 buttons. (I'd use dual common anode diodes then for sure, maybe BAS40 or BAT54A in SOT-23-3 packages, just to reduce the component count!)
I've designed a couple of example boards using EasyEDA for Teensy LC: this one for through-hole components, 32 buttons and 9 pots, and this one using surface-mount components. I'm only a hobbyist, and haven't even had any of these made (feel free to, if you like, though), but perhaps they give you some ideas!