An inherent flaw in the transformer architecture (what all LLMs use under the hood) is that the cost of attention grows quadratically with context length. The model needs roughly four times as much memory to attend over its last 1000 output tokens as it needed for the last 500. When coding anything complex, the amount of code the model has to consider quickly grows beyond these limits. At least, if you want it to work.
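As a rough illustration of that scaling, here is a minimal sketch in plain Python with arbitrary example context lengths, not tied to any particular model:

```python
# Rough illustration of the quadratic cost of full self-attention.
# (Illustrative only; real models also keep per-token KV cache memory,
# but the n x n score matrix is what grows quadratically.)

def attention_scores(context_length: int) -> int:
    """Number of entries in a full n x n self-attention score matrix."""
    return context_length * context_length

for n in (500, 1000, 2000, 4000):
    print(f"context {n:>5} tokens -> {attention_scores(n):>12,} attention scores")

# Doubling the context from 500 to 1000 tokens quadruples the count
# (250,000 -> 1,000,000), which is the 4x growth described above.
```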
This is a fundamental flaw of transformer-based LLMs, an inherent limit on the complexity of task they can ‘understand’. It isn’t feasible to just keep throwing memory at the problem; a fundamental change in the underlying model structure is required. This is a subject of intense research, but nothing has emerged yet.
Transformers themselves were old hat and well studied long before these models broke into the mainstream with DALL-E and ChatGPT.
It depends on the type of fusion.
The easiest fusion reaction is deuterium/tritium, two isotopes of hydrogen. The vast majority of the energy of that reaction is released as neutrons, which are very difficult to contain and will irradiate the reactor’s containment vessel. The walls of the reactor will degrade and will eventually need to be replaced, with the originals treated as radioactive waste.
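For a sense of the numbers (standard textbook values for the D-T reaction, added here for reference):

```latex
\mathrm{D} + \mathrm{T} \;\rightarrow\; {}^{4}\mathrm{He}\,(3.5\ \mathrm{MeV}) + n\,(14.1\ \mathrm{MeV}),
\qquad
\frac{14.1}{17.6} \approx 80\%\ \text{of the energy is carried by the neutron.}
```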
Lithium/deuterium fusion releases most of its energy in the form of alpha particles - making it much more practical to harness the energy for electrical generation - and releases something like 80% fewer high-energy neutrons, meaning much less radioactive waste. As a trade-off, the conditions required to sustain the reaction are even more extreme and difficult to maintain.
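Assuming the reaction meant here is deuterium + lithium-6 (the commonly cited branch that yields two alpha particles), the primary channel is roughly:

```latex
{}^{6}\mathrm{Li} + \mathrm{D} \;\rightarrow\; 2\,{}^{4}\mathrm{He} + 22.4\ \mathrm{MeV}
```

The energy goes to charged alpha particles, which can be handled by the containment field; the neutrons come from secondary branches and side reactions rather than the main channel, which is why the result is fewer neutrons rather than none.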
There are many, many possible fusion reactions and multiple containment methods - some produce significant radioactive waste and some do not. In terms of energy output, the energy released per unit mass of fuel is much higher than in fission, but it is much harder to concentrate reaction events, so overall energy output is much lower until some significant advancement is made on the engineering challenges that have plagued fusion for 70+ years.
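To put rough numbers on that comparison (textbook ballpark values, not from the posts above):

```latex
\text{D-T fusion: } \frac{17.6\ \mathrm{MeV}}{5\ \text{nucleons}} \approx 3.5\ \mathrm{MeV/nucleon},
\qquad
{}^{235}\mathrm{U}\ \text{fission: } \frac{\sim 200\ \mathrm{MeV}}{236\ \text{nucleons}} \approx 0.85\ \mathrm{MeV/nucleon}
```

So per unit mass of fuel, D-T fusion yields roughly four times what uranium fission does, even though a single fission event releases far more energy than a single fusion event.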