The weirdness of quantum theory is largely self-imposed. The EPR paper demonstrates that if you believe the wave function is a complete description of the physical system, it leads to absurdities. Ever since Bell's theorem, people just accepted that maybe quantum mechanics is absurd, and so they started to adopt viewpoints of nonlocal collapsing wavefunctions, things not having properties until you measure them, or even an ever-branching multiverse.
But Bell's theorem has a major flaw.
> Let N denote a specification of all the beables, of some theory, belonging to the overlap of the backward light cones of spacelike separated regions 1 and 2. Let Λ be a specification of some beables from the remainder of the backward light cone of 1, and B of some beables in the region 2. Then in a locally causal theory {A|Λ, N, B} = {A|Λ, N} whenever both probabilities are given by the theory.
Notice how Bell talks about "backward light cones." How, given the postulates of quantum mechanics alone, can you mathematically derive the "backwardness" of a light cone? Every operator in quantum mechanics is time-symmetric; that's one of its foundational postulates. There is no way within quantum theory to single out one light cone as the "backward" one as opposed to the other. The rest of what Bell wrote is fine: just delete the word "backward," and his formulation of causality becomes consistent with the postulates of quantum theory, but then violations of Bell inequalities no longer pose a problem or contradict local realism in any way. The supposed conflict between local realism and Bell's theorem is mathematically impossible to derive from the postulates of quantum mechanics alone; it requires introducing an additional postulate from outside the theory.
Of course, that's not to say that speaking of the asymmetry of time is meaningless, but it's ultimately a macroscopic feature of the universe: making sense of it requires referencing the past hypothesis, a boundary condition on the early universe that can't be derived from the postulates of quantum theory (to our knowledge). It's physically real, but macroscopic. If you zoom in on little particles buzzing around, you can't tell whether their motion runs forward in time or in reverse; the concept is meaningless at that scale.
If you just take the mathematics of quantum theory at face value and don't try to introduce extra postulates on top of it, then much of the supposed weirdness goes away. There is no need to pretend that particles are in multiple places at once until you measure them; the wave function becomes an epistemic description.
Indeed, one thing most believers in a non-epistemic wave function love to sweep under the rug is that any wave function can be expanded into a list of expectation values for all of the observables of the system. You can likewise expand every unitary operator so that it acts directly on that list of expectation values rather than on the wave function, and you will compute exactly the same results you would using the wave function.
The wave function is not even necessary for calculation; it's just more mathematically convenient, because it's effectively a compressed form of the mathematically equivalent vector of expectation values. If you compute with the vector of expectation values instead, it becomes far more obvious that you're working with something epistemic: you're just manipulating statistics. You don't even need a Born rule, because the calculation ends with a list of expectation values, which already gives you the statistics of the final results directly. All complex numbers also disappear; they are purely an artifact of the mathematical compression.
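The single-qubit case makes this concrete. In the sketch below (my own illustration, with numpy), the state's expectation values for the Pauli observables X, Y, Z form a real three-component vector, and any unitary becomes a real 3x3 matrix acting on that vector; both routes give identical predictions, and no complex numbers survive.

```python
import numpy as np

# Pauli matrices: the basic observables of a single qubit
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
PAULIS = (X, Y, Z)

def expectation_values(psi):
    """Expand a state vector into its list of expectation values (<X>, <Y>, <Z>)."""
    return np.array([np.real(psi.conj() @ P @ psi) for P in PAULIS])

def unitary_as_real_matrix(U):
    """Expand a unitary into the real 3x3 matrix it induces on the
    expectation-value vector: R[i, j] = Tr(P_i U P_j U†) / 2."""
    return np.array([[np.real(np.trace(Pi @ U @ Pj @ U.conj().T)) / 2
                      for Pj in PAULIS] for Pi in PAULIS])

H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)  # Hadamard gate
psi = np.array([1, 0], dtype=complex)                        # the Z = +1 state
R = unitary_as_real_matrix(H)

via_wavefunction = expectation_values(H @ psi)   # evolve psi, then expand
via_expectations = R @ expectation_values(psi)   # evolve the expectations directly

print(via_wavefunction)   # [1. 0. 0.]
print(via_expectations)   # [1. 0. 0.]
```

For mixed states the same expansion is usually written with density matrices (the Bloch vector), but the point stands: the matrix R is real, and the final vector already contains the measurement statistics with no separate Born-rule step.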
In a time-symmetric approach, you can also evolve the list of expectation values from both ends until they meet at an intermediate point, and this applies enough constraints to the system to compute what are called weak values. The weak values can be shown to evolve locally and continuously, describable with differential equations, and at any intermediate point you can compute the probabilities of measurement outcomes using the Aharonov-Bergmann-Lebowitz rule. You can also compute the weak values using two state vectors rather than vectors of expectation values and get the same results, which is why this is sometimes called the Two-State Vector Formalism.
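The two formulas involved are standard: the weak value of an observable A between a preselected |ψ⟩ and a postselected ⟨φ| is ⟨φ|A|ψ⟩ / ⟨φ|ψ⟩, and the ABL rule gives the outcome probabilities of an intermediate projective measurement given both boundary conditions. A minimal single-qubit sketch (my own choice of states, for illustration):

```python
import numpy as np

Z = np.array([[1, 0], [0, -1]], dtype=complex)

def weak_value(phi, A, psi):
    """Weak value of A between preselected |psi> and postselected <phi|."""
    return (phi.conj() @ A @ psi) / (phi.conj() @ psi)

def abl_probabilities(phi, projectors, psi):
    """Aharonov-Bergmann-Lebowitz rule: probabilities for the outcomes of an
    intermediate projective measurement, given pre- and post-selection."""
    weights = np.array([abs(phi.conj() @ P @ psi) ** 2 for P in projectors])
    return weights / weights.sum()

psi = np.array([1, 0], dtype=complex)               # preselect Z = +1
phi = np.array([1, 1], dtype=complex) / np.sqrt(2)  # postselect X = +1

# Projectors onto the Z = +1 and Z = -1 outcomes
Pz = [np.diag([1, 0]).astype(complex), np.diag([0, 1]).astype(complex)]

print(weak_value(phi, Z, psi))            # (1+0j)
print(abl_probabilities(phi, Pz, psi))    # [1. 0.]
```

Here both rules agree with the boundary conditions: preselected on Z = +1, an intermediate Z measurement yields +1 with certainty, and the weak value of Z is likewise 1.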
Hence, the information needed to explain the probability distribution of the particles is locally available at the particles themselves, with no need for any "spooky action at a distance," for treating anything as if it's in multiple places at once, or for waves spread out in space that "collapse" when measured.
When you expand the operators so that they act directly on expectation values, quantum theory suddenly becomes rather simple and intuitive. There are rules governing what kinds of operators you can construct: they have to be time-reversible, preserve handedness, and be completely positive. These constraints make it impossible to construct a non-perturbing measurement operator, so it's straightforward to prove that any operator describing the interaction between a measuring device and a particle must perturb the properties of the particle it is *not* measuring.
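A minimal sketch of that trade-off (my own illustrative model, not a general proof): take the simplest unitary, hence time-reversible, measurement interaction, a CNOT that copies the system qubit's Z value into a pointer qubit. Recording Z necessarily wipes out the system's X value.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

# CNOT with the system as control: a toy measurement interaction that
# copies the system's Z value into a pointer qubit. It is unitary, so
# it is perfectly time-reversible -- yet it still perturbs.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

def expval(state, op):
    return np.real(state.conj() @ op @ state)

plus = np.array([1, 1], dtype=complex) / np.sqrt(2)   # system: X = +1 exactly
pointer = np.array([1, 0], dtype=complex)             # pointer in its ready state
state = np.kron(plus, pointer)

before = np.array([expval(state, np.kron(X, I2)), expval(state, np.kron(Z, I2))])
after_state = CNOT @ state
after = np.array([expval(after_state, np.kron(X, I2)), expval(after_state, np.kron(Z, I2))])

print(before)  # [1. 0.]  definite X, indefinite Z
print(after)   # [0. 0.]  recording Z destroyed the system's X value
```

The interaction copies the Z information faithfully, but the system's initially sharp X expectation drops to zero: the unmeasured property is the one that gets perturbed.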
This is why you have to do statistics: not because the particles are in all places at once, but because every time you measure a property, the physical interaction also perturbs the other properties you did not measure. If you measure position, you perturb momentum; if you measure X, you perturb Y and Z.
That's where "wave function collapse" comes from. Nothing is collapsing. If I know Z=+1 but don't know X or Y, and I measure X anyway, I will measure something I could only describe statistically beforehand. Say I learn that X=-1; in the process I have also perturbed Z and Y, so I no longer know whether Z=+1 still holds.
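That update is easy to simulate directly with ordinary projective measurements (an illustrative sketch): start certain that Z = +1, measure X, then re-measure Z, and the Z outcomes are back to 50/50.

```python
import numpy as np

rng = np.random.default_rng(0)

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def measure(psi, op):
    """Projective measurement: sample an eigenvalue of op with Born weights,
    returning (outcome, post-measurement state)."""
    vals, vecs = np.linalg.eigh(op)
    probs = np.abs(vecs.conj().T @ psi) ** 2
    k = rng.choice(len(vals), p=probs)
    return vals[k], vecs[:, k]

psi = np.array([1, 0], dtype=complex)   # start knowing Z = +1 exactly

z_after = []
for _ in range(10000):
    _, post = measure(psi, X)           # measuring X perturbs Z...
    z, _ = measure(post, Z)             # ...so re-measuring Z is a coin flip
    z_after.append(z)

print(np.mean(z_after))   # ≈ 0.0: the prior certainty about Z is gone
```

Nothing nonlocal happened in this simulation; the X measurement simply perturbed Z, and the "collapse" is the bookkeeping update on what I know.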
This is not spooky, and you can reproduce such effects classically. Indeed, many aspects of quantum theory deemed "spooky" can be trivially reproduced in a classical model: the double-slit experiment, the Elitzur–Vaidman paradox, Deutsch's algorithm, superdense coding, quantum encryption and key distribution, etc.
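To be concrete about the double-slit case: the fringe pattern itself is ordinary classical wave superposition, nothing more. A sketch with arbitrary, assumed parameters (slit separation, wavelength, and screen distance are all made up for illustration):

```python
import numpy as np

# Two coherent classical wave sources (the "slits"); all parameters are
# arbitrary assumed values, in the same length units.
wavelength = 0.5
d = 5.0        # slit separation
L = 1000.0     # distance to the screen

def intensity(x):
    """Time-averaged intensity of two superposed classical waves at screen position x."""
    r1 = np.hypot(L, x - d / 2)
    r2 = np.hypot(L, x + d / 2)
    phase = 2 * np.pi * (r2 - r1) / wavelength
    return 2 + 2 * np.cos(phase)   # 0 at dark fringes, 4 at bright ones

print(intensity(0.0))    # 4.0 (central bright fringe)
print(intensity(50.0))   # ≈ 0 (a dark fringe)
```

This only shows that interference fringes need no quantum machinery; the distinctly quantum part of the experiment is how individual detections build the pattern up, which is where the statistical account above comes in.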
It's only in the *contextual* cases, like the GHZ paradox or the Frauchiger-Renner paradox, that you actually find it hard to give a classical explanation. But even these become easy to explain the moment you abandon the additional postulate that imposes time-asymmetry onto quantum theory, a postulate that is mathematically impossible to derive from the theory itself. Drop it, compute the weak values, and you will see very clearly how the information evolves locally through the system and why the correlations are what they are.
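For reference, the GHZ correlations themselves are a short computation in standard quantum mechanics, independent of any interpretation. No assignment of preexisting ±1 values to X and Y on each particle can satisfy all four at once (the product of the four would have to be +1 classically, but quantum mechanics gives -1), which is the paradox:

```python
import numpy as np
from functools import reduce

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)

# GHZ state (|000> + |111>) / sqrt(2) on three qubits
ghz = np.zeros(8, dtype=complex)
ghz[0] = ghz[7] = 1 / np.sqrt(2)

def corr(ops):
    """Expectation value of a product of one-qubit observables in the GHZ state."""
    O = reduce(np.kron, ops)
    return np.real(ghz.conj() @ O @ ghz)

for label, ops in [("XXX", (X, X, X)), ("XYY", (X, Y, Y)),
                   ("YXY", (Y, X, Y)), ("YYX", (Y, Y, X))]:
    print(label, f"{corr(ops):+.0f}")   # XXX +1, the other three -1
```

These four perfect (not merely statistical) correlations are what any local account has to explain, and they are exactly the kind of case where the author's argument leans on the time-symmetric weak-value analysis.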
It seems almost everyone falls into one of two camps. The first is "shut up and calculate": no one can understand quantum theory, so don't bother. The other uses it as a springboard to justify sci-fi beliefs, like a multiverse, or straight-up mystical beliefs, like consciousness somehow having something to do with it. Despite the fact that much simpler, borderline-classical explanations have existed in the literature for decades, and despite the fact that every supposed "paradox" has been given a simple explanation, the science media only ever reports on the wacky "paradoxes" that supposedly disprove objective reality or some such nonsense, while most people remain unaware of the responses.