I don't want to discourage you. I also don't want to speak too harshly here, but from the nature of these questions it's pretty clear you're only at the very beginning of your project.
When/if you dig deeper into the code, you'll discover the USB support in Teensyduino for those 8 bit AVR chips is basically the same as those obsolete stand-alone zip files. The single C file approach was split into two files: the original C code for the USB control stuff, and basically the same C code renamed to C++ functions for the sake of Arduino. On the keyboard stuff, a translation layer for non-US layouts and UTF-8 decoding was added, but even that is pretty straightforward C-only code inside C++ functions, and it's just another layer on top, not actual USB device code. The actual USB stuff is pretty much identical.
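To give a rough idea of how little is involved in a UTF-8 decoding layer like that, here's a minimal sketch in plain C. This is not the actual Teensyduino code. The function name utf8_decode and its signature are made up for illustration, and the strict rejection of overlong and surrogate encodings is my assumption about what a careful decoder would do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (NOT the Teensyduino function): decodes one UTF-8
 * sequence starting at s, reading at most len bytes, into *codepoint.
 * Returns the number of bytes consumed, or 0 if the input is malformed
 * or truncated. */
size_t utf8_decode(const uint8_t *s, size_t len, uint32_t *codepoint)
{
    assert(s != NULL && codepoint != NULL);
    if (len == 0) return 0;

    uint8_t b0 = s[0];
    if (b0 < 0x80) {            /* plain ASCII: one byte, no decoding needed */
        *codepoint = b0;
        return 1;
    }

    size_t need;                /* total bytes in this sequence */
    uint32_t cp;                /* code point being assembled */
    uint32_t min;               /* smallest value allowed for this length */

    if ((b0 & 0xE0) == 0xC0)      { need = 2; cp = b0 & 0x1F; min = 0x80; }
    else if ((b0 & 0xF0) == 0xE0) { need = 3; cp = b0 & 0x0F; min = 0x800; }
    else if ((b0 & 0xF8) == 0xF0) { need = 4; cp = b0 & 0x07; min = 0x10000; }
    else return 0;              /* stray continuation or invalid lead byte */

    if (len < need) return 0;   /* sequence is cut off */

    for (size_t i = 1; i < need; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;   /* not a continuation byte */
        cp = (cp << 6) | (uint32_t)(s[i] & 0x3F);
    }

    /* Reject overlong forms, values past U+10FFFF, and UTF-16 surrogates. */
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return 0;

    *codepoint = cp;
    return need;
}
```

A caller would loop over a string, decode one code point at a time, and then look each code point up in a layout table to get a keycode and modifier. None of that touches the USB hardware, which is the point: it's a layer on top.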
You're probably also going to discover that your "plan to port my project to many, many more chips" may not be very realistic. Or at least, starting from my C-only code or my nearly identical C/C++ code in Teensyduino probably isn't a good idea. That code is pretty tightly coupled to the specifics of the chips used on Teensy 2.0 and Teensy++ 2.0. If you look at the USB code inside Teensyduino for Teensy LC & 3.x, you'll see I pretty much started over from scratch. That ought to give you some idea of where you'll be headed...
I'd advise you to take a look at Dean Camera's LUFA library.
http://www.fourwalledcubicle.com/LUFA.php
As you look into the code, you'll see Dean and I have pretty much opposite coding styles! Where I take a minimal, coupled-tightly-to-hardware approach, Dean puts in a lot of work to abstract almost everything. His library was designed to support many chips. I believe it supports all the AVR USB chips, whereas my work only supports the specific chips we used on 4 Teensy models, and 2 of those have been discontinued for nearly 8 years and haven't been supported for a very long time. I believe Dean also did some work for NXP to port his library to some of their LPC chips. Even if you don't use Dean's code, it has a design that's aligned with your many-chips goal. My code has pretty much the opposite design! I personally find all that extra abstraction stuff distracting, but then I've always been laser-focused on specific chips. For your project's goal of supporting a very wide range of chips, you're probably going to need that sort of approach.
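To make "that sort of approach" a bit more concrete, here's a minimal sketch in plain C of what an abstraction boundary can look like. To be clear, this is neither LUFA's API nor my code. Every name here (usb_hw_ops, usb_device, usb_keyboard_send, fake_hw) is hypothetical. The only real-world detail is the 8-byte boot-protocol keyboard report layout: modifier byte, reserved byte, then up to 6 keycodes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-chip driver interface: each supported chip would
 * supply its own implementation of these functions. */
typedef struct usb_hw_ops {
    void   (*init)(void *ctx);
    size_t (*ep_write)(void *ctx, uint8_t ep, const uint8_t *data, size_t len);
} usb_hw_ops;

/* A device is a set of chip-specific operations plus that chip's state. */
typedef struct usb_device {
    const usb_hw_ops *ops;
    void *ctx;
} usb_device;

/* Chip-independent code: builds a boot-protocol keyboard report and hands
 * it to whatever hardware backend is plugged in.
 * Returns 0 on success, -1 if the backend did not accept the full report. */
int usb_keyboard_send(usb_device *dev, uint8_t ep,
                      uint8_t modifiers, uint8_t keycode)
{
    assert(dev != NULL && dev->ops != NULL);
    uint8_t report[8] = { modifiers, 0, keycode, 0, 0, 0, 0, 0 };
    size_t n = dev->ops->ep_write(dev->ctx, ep, report, sizeof report);
    return n == sizeof report ? 0 : -1;
}

/* Fake backend that just records the last write, so the chip-independent
 * code can be exercised on a PC. A real backend would poke the chip's
 * endpoint registers or DMA descriptors here instead. */
typedef struct fake_hw {
    int     inited;
    uint8_t last_ep;
    uint8_t buf[64];
    size_t  len;
} fake_hw;

static void fake_init(void *ctx)
{
    ((fake_hw *)ctx)->inited = 1;
}

static size_t fake_ep_write(void *ctx, uint8_t ep,
                            const uint8_t *data, size_t len)
{
    fake_hw *hw = (fake_hw *)ctx;
    if (len > sizeof hw->buf) len = sizeof hw->buf;   /* clamp to buffer */
    memcpy(hw->buf, data, len);
    hw->last_ep = ep;
    hw->len = len;
    return len;
}

const usb_hw_ops fake_ops = { fake_init, fake_ep_write };
```

The cost is visible even in this toy: every hardware access goes through a function pointer and a context pointer. That's the extra layer I find distracting, but it's also what lets usb_keyboard_send stay unchanged when a new chip is added.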
You're also going to discover two unpleasant realities of USB. First, the obvious: the protocol is very complex, especially when you factor in the huge range of device class specs and the quirks of the 3 major operating systems across many older versions still in widespread use. The second reality you'll find is that the hardware support in different microcontrollers varies dramatically. That's why I basically rewrote everything from scratch for the 32 bit boards. It's also why I haven't (yet) supported the 2nd USB port on Teensy 3.6... it's completely different from the 1st USB port, even though they're on the same chip.
Again, I don't want to discourage you. I do hope you can find something useful in Dean's library or even my code. But please understand I'm very focused on just the specific hardware we use on Teensy. As you can see from the still-lacking support of the 2nd USB port on Teensy 3.6, I have a tremendous amount of work to do. Other than pointing you towards LUFA and the general idea of abstractions you'll need to support many chips, I really can't help you any more.