Mildly disappointing initial results.

Using the internal reference with gain set to two I expect 0 to 5v = 0 to 5 octaves.

The characteristics that affect the relationship between your input code and the output voltage are, for the DAC:

- the number of bits (16)

- the integral non-linearity (±4 typ, ±12 max LSB)

- the differential non-linearity (±0.2 typ, ±1 max LSB)

- the offset error and zero-code error, which affect the whole line and in particular the values near counts of zero (±1 typ, ±4 max mV offset; 1 typ, 4 max mV zero-code)

- the full-scale error, which affects the values near max count (±0.03 typ, ±0.2 max %FSR)

- the gain error, which affects the slope of the line (±0.01 typ, ±0.15 max %FSR)

- the initial accuracy of the voltage reference

and then the way all of those change with temperature. The spec sheet uses a variety of different units for these, to hide the scary large values. Convert them all to mV or μV. Yes, 0.2% FSR is 10mV.
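As a rough sketch of that conversion, here is the arithmetic in Python. It assumes a 16-bit DAC with a 0 to 5V full-scale range, as in your setup; the helper names are made up for illustration, and the spec values are the ones listed above.

```python
# Convert datasheet error specs into millivolts so they can be compared.
# Assumptions: 16-bit DAC, 0 to 5 V full-scale range.
FULL_SCALE_V = 5.0
BITS = 16
LSB_V = FULL_SCALE_V / 2**BITS  # size of one code step, in volts


def lsb_to_mv(lsb):
    """Convert a spec given in LSB to millivolts."""
    return lsb * LSB_V * 1e3


def pct_fsr_to_mv(pct):
    """Convert a spec given in % of full-scale range to millivolts."""
    return pct / 100 * FULL_SCALE_V * 1e3


worst_case_mv = {
    "INL (12 LSB max)": lsb_to_mv(12),
    "DNL (1 LSB max)": lsb_to_mv(1),
    "offset (4 mV max)": 4.0,
    "full-scale error (0.2 %FSR max)": pct_fsr_to_mv(0.2),
    "gain error (0.15 %FSR max)": pct_fsr_to_mv(0.15),
}

for name, mv in worst_case_mv.items():
    print(f"{name}: {mv:.3f} mV")
```

Seen in the same unit, the full-scale and gain errors are clearly the big ones, which is why the slope needs trimming.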

For the surrounding circuitry, also consider

- the power supply (needs to be greater than the voltage you are outputting)

- noise on the ground (affects effective number of bits and how close you can get to zero count)

- the gain and offset of any external buffering

- the output impedance (ideally close to zero, 5Ω for this DAC with no external buffering), and the input impedance of the circuit you are driving (typically 50k to 100k for Eurorack modular).

To isolate the effects of those parameters that affect the ends of the line, it's typical to measure linearity and gain between values close to, but not at, the extreme count values. For example, the datasheet says "Using line passing through codes 512 and 65,024" for how they measure linearity.

I suggest measuring the actual voltages you get for the codes corresponding to 0.5V, 1V, 1.5V and so on up to 4.5V. That lets you check linearity and slope. Your voltmeter should have a higher accuracy than the DAC you are measuring. For a 16-bit DAC providing 0 to 5V, 1LSB is 76μV, so on a 5V range (or 4V if that is the max on your meter) you need better than 0.1mV precision (50,000 counts, or 40,000 counts, with only 1 count of error). Otherwise, the errors you measure are just as likely to be from your meter as from the DAC.
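Something like the following sketch would give you the codes to send and then fit a line through your readings. It assumes the usual transfer function Vout = code / 65536 × 5V (check the datasheet for your part), the function names are my own, and the "readings" are synthetic numbers generated from an invented slope and offset just to show the fit works; replace them with what your meter says.

```python
# Pick test codes for 0.5 V steps, then fit a straight line to the readings.
# Assumption: Vout = code / 2**16 * 5 V (internal reference, gain of two).
FULL_SCALE_V = 5.0
BITS = 16
LSB_V = FULL_SCALE_V / 2**BITS


def code_for_voltage(v):
    """Nearest DAC code for a target output voltage."""
    return round(v / LSB_V)


def fit_line(xs, ys):
    """Least-squares straight line; returns (slope, offset)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


targets = [0.5 * k for k in range(1, 10)]        # 0.5 V ... 4.5 V
codes = [code_for_voltage(v) for v in targets]
ideal = [c * LSB_V for c in codes]               # what a perfect DAC gives

# Synthetic stand-in for meter readings (invented slope and offset).
demo_slope, demo_offset = 0.999, 0.002
readings = [demo_slope * v + demo_offset for v in ideal]

slope, offset = fit_line(ideal, readings)
print(f"slope {slope:.5f} V/V, offset {offset * 1e3:.3f} mV")
for c, v_ideal, v_read in zip(codes, ideal, readings):
    residual_uv = (v_read - (slope * v_ideal + offset)) * 1e6
    print(f"code {c:5d}: residual {residual_uv:+.1f} uV")
```

The slope tells you the gain error, the offset tells you the offset error, and the residuals are your non-linearity (plus meter error).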

(My meter, for example, does not meet that requirement. It has 40,000 counts and ±(0.05% + 5) accuracy, so at exactly 3.0000V the error is 1.5mV from the percentage term plus 0.5mV from the 5 counts, ±2.0mV in total, and it could read anywhere between 2.9980 and 3.0020V. That is ±26LSB!) So you can only adjust to within the limits of your meter. Check the specs on your voltmeter. Check too the specs on the frequency meter on your VCO and the frequency meter on your voltmeter, if it has one. Assuming you have a TipTop Z3000 VCO, it has a display tolerance of 0.5Hz at values below 1kHz and makes no claims regarding accuracy.
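To run the same check for your own meter, here is a small sketch of the arithmetic. The function name is made up, and it assumes the spec has the common ±(% of reading + counts) form with one count equal to the display resolution on the range in use (0.1mV on the 4V range of a 40,000 count meter).

```python
# Meter uncertainty from a +/-(% of reading + counts) accuracy spec.
# Assumption: one count equals the display resolution on the chosen range.
def meter_uncertainty_v(reading_v, pct_of_reading, counts, resolution_v):
    """Worst-case error in volts for a given reading."""
    return reading_v * pct_of_reading / 100 + counts * resolution_v


LSB_V = 5.0 / 2**16  # 16-bit DAC, 0 to 5 V

err_v = meter_uncertainty_v(3.0, 0.05, 5, 0.0001)
print(f"+/-{err_v * 1e3:.2f} mV, or +/-{err_v / LSB_V:.0f} LSB")
```

If the LSB figure comes out much larger than the DAC's own INL, the meter is the limit, not the DAC.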

If I set up C on my VCO with 0v (it has a frequency display) and then change the DAC to output 5v I don't quite get C+5

"Don't quite" tells me little, and you mentioned your VCO has a frequency display. How many cents off? If you set up C with 1V, and then send 1V and then 4V, how close are those values to C and C+3 (in cents)?
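In case it helps, this is how to turn two frequency readings into a tracking error in cents. It is just the standard definition (1200 cents per octave, one octave per doubling of frequency); the function names are mine and the frequencies in the example are round numbers, not measurements.

```python
import math


def cents(f, f_ref):
    """Interval from f_ref up to f, in cents (1200 cents per octave)."""
    return 1200 * math.log2(f / f_ref)


def tracking_error_cents(f_low, f_high, octaves_expected):
    """How far the measured interval is from the expected number of octaves."""
    return cents(f_high, f_low) - 1200 * octaves_expected


# Example with made-up readings: 1 V gives 100.0 Hz, 4 V gives 795.0 Hz,
# and 3 V of CV should be exactly 3 octaves.
print(f"{tracking_error_cents(100.0, 795.0, 3):+.1f} cents")
```

A negative result means the interval is flat (slope below 1V/oct), a positive one means it is sharp.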

That's with the DAC directly plugged in to the VCO - no opamp etc

OK good so I don't need to ask if you copied some circuit with a 1k output resistor, which is common and causes a perfect DAC to have 0.99V/oct.
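For reference, the effect is just a voltage divider between the output resistance and the input impedance of whatever you plug into. A quick sketch, with function names of my own and the resistances mentioned above (1k or 5Ω source, 50k or 100k load):

```python
# Slope seen by the VCO when the source resistance forms a divider
# with the VCO's input impedance.
def loaded_volts_per_octave(r_out, r_in, nominal=1.0):
    """Effective V/oct at the VCO input for a source that sends `nominal` V/oct."""
    return nominal * r_in / (r_out + r_in)


def cents_error_per_octave(r_out, r_in):
    """Tracking error per intended octave for a 1 V/oct VCO (negative = flat)."""
    return 1200 * (loaded_volts_per_octave(r_out, r_in) - 1.0)


for r_out in (1000.0, 5.0):
    for r_in in (50e3, 100e3):
        print(
            f"Rout {r_out:6.0f} ohm into {r_in / 1e3:.0f}k: "
            f"{loaded_volts_per_octave(r_out, r_in):.5f} V/oct, "
            f"{cents_error_per_octave(r_out, r_in):+.2f} cents/oct"
        )
```

With 5Ω the divider error is negligible, so whatever error you are seeing is coming from somewhere else.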

Having said that, while it is OK to send the output of a DAC (provided that DAC has short-circuit protection) direct to VCAs etc, for pitch CVs where you care about accurate tracking over 4+ octaves you will need an output buffer. This should use a precision, low-Vos op-amp and precision resistors (better than 1%, better than 100ppm/°C), plus small-value, 25-turn trimmers to adjust the gain and the offset over a range somewhat greater than the worst-case values calculated for your DAC. It should also have the output resistor inside the feedback loop. The buffer is then adjusted to give the correct slope (and offset, although that is less critical since your VCO can be trimmed for a small offset).

You should be able to get tracking within 20 cents over almost the entire 5V range, with worse values for one or perhaps two notes at the ends.

The Teensy did better with accuracy.

That is a hard statement to evaluate without some measurements. In particular, the Teensy 3.1 did not produce 5V out of the DAC, that much is certain. Also, have a look at the on-chip DAC specifications for Teensy 3.1 and compare them to the DAC you are using.