Don't be fooled by determinism
A lot of research goes into getting models to behave completely deterministically, all the way down to hardware cycles on the GPU. This may be useful in academic and low-level research settings where experimental reproducibility is a necessity. But in applied AI, absolute determinism from models isn't very useful. In fact, it's probably damaging to overall result quality. Why?
It's not useful because foundation models receive unstructured data from an effectively unbounded range of inputs. Even if you get what appears to be absolutely consistent behavior from a model on a certain input, some extremely subtle variation of that input (even so much as an extra comma, a misspelling, or a word rearrangement) can result in a different output. This is the reality of real-world data: it's messy, and it will constantly surprise you in new ways.
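One way to see this on your own inputs is to perturb a prompt slightly and count how often the output changes. The sketch below is a minimal illustration, not a definitive test harness. The names `perturb`, `agreement_rate`, and `toy_model` are hypothetical, and `toy_model` is a deliberately brittle stand-in for a real model call, which you would swap in yourself.

```python
import random
from typing import Callable, List


def perturb(text: str, rng: random.Random) -> List[str]:
    """Return a few small surface-level variants of `text`."""
    words = text.split()
    if not words:
        return []
    variants = []

    # Variant 1: an extra comma after the first word.
    if len(words) > 1:
        variants.append(" ".join([words[0] + ","] + words[1:]))

    # Variant 2: a misspelling, made by swapping the 2nd and 3rd letters
    # of the longest word (the first one, if several tie).
    longest = max(range(len(words)), key=lambda i: len(words[i]))
    w = words[longest]
    if len(w) > 2:
        typo = w[0] + w[2] + w[1] + w[3:]
        variants.append(" ".join(words[:longest] + [typo] + words[longest + 1:]))

    # Variant 3: a word rearrangement, made by swapping two adjacent words.
    if len(words) > 2:
        i = rng.randrange(len(words) - 1)
        swapped = words[:]
        swapped[i], swapped[i + 1] = swapped[i + 1], swapped[i]
        variants.append(" ".join(swapped))

    return variants


def agreement_rate(model: Callable[[str], str], text: str, seed: int = 0) -> float:
    """Fraction of perturbed inputs whose output matches the original's output."""
    rng = random.Random(seed)
    baseline = model(text)
    variants = perturb(text, rng)
    if not variants:
        return 1.0
    matches = sum(model(v) == baseline for v in variants)
    return matches / len(variants)


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for a model call: deterministic but brittle."""
    # Assumption for illustration only: a stray comma flips the label.
    return "refund" if "refund" in prompt and "," not in prompt else "other"


if __name__ == "__main__":
    rate = agreement_rate(toy_model, "please refund my order today")
    print(f"agreement under small perturbations: {rate:.2f}")
```

Note that `toy_model` always returns the same output for the same input, yet its agreement rate under perturbation is below 1. Determinism on a fixed input says nothing about stability across the inputs you will actually receive.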
It can be damaging because some creativity can help with reasoning. Teams often set model temperature to 0 in the hope of increasing consistency, but in doing so they're somewhat hamstringing the model's ability to think about the task. It's like putting handcuffs on someone trying to complete a jigsaw puzzle. Maybe they can get it done, but their range of motion will be pretty limited.
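For readers unfamiliar with what the temperature setting does mechanically, here is a rough sketch of the common formulation: logits are divided by the temperature before the softmax, so lower values concentrate probability on the top token. The function names are hypothetical, and treating temperature 0 as pure greedy selection is an assumption made for illustration; real providers may implement that edge case differently.

```python
import math
import random
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Turn logits into probabilities after scaling them by 1 / temperature."""
    if temperature <= 0:
        # Assumption: temperature 0 means greedy decoding, so all
        # probability mass goes to the single highest logit.
        top = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == top else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits: List[float], temperature: float, rng: random.Random) -> int:
    """Sample a token index from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
    for t in (0.0, 0.7, 1.5):
        probs = softmax_with_temperature(logits, t)
        print(t, [round(p, 3) for p in probs])
```

At temperature 0 the second and third candidates can never be chosen, no matter how close their scores are to the first. At higher temperatures they keep some probability, which is the room to explore that the paragraph above is describing.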