Doing the Lord’s work in the Devil’s basement

Joined 4 months ago
Cake day: May 8th, 2024

  • Then these models are stupid

    Yup, that is kind of the point. They are math functions designed to approximate human tasks.

    These models should start out with the basics of language, so they don’t have to learn it from the ground up. That’s the next step. Right now they’re just well-read idiots.

    I’m not sure what you’re pointing at here. Simplified, the way it works right now is: a small model cuts text into tokens (“knowledge of syllables”), those tokens are fed into a larger model which turns them into semantic information (“knowledge of language”), and that is fed to a ridiculously fat model which “accomplishes the task” (“knowledge of things”).

    The first two models are small enough that they can be trained on the kind of data you describe: classic books, movie scripts, etc. A couple hundred billion words, maybe. But the last one requires orders of magnitude more data, in the trillions of words.
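
To make the three stages a bit more concrete, here is a rough Python sketch. Everything in it is made up for illustration: the four-word vocabulary, the 2-number embeddings and the `big_model` function (which just averages vectors) are toy stand-ins, not how any real system is built. Real tokenizers usually split text into subword pieces rather than whole words, and in many real LLMs the embedding layer is trained together with the big model rather than separately.

```python
# Toy stand-ins for the three stages; all names and numbers are invented.

# Stage 1: "knowledge of syllables" -> cut text into token ids.
VOCAB = {"the": 0, "cat": 1, "sat": 2, "<unk>": 3}

def tokenize(text):
    """Whitespace split; words missing from the vocabulary map to <unk>."""
    return [VOCAB.get(word, VOCAB["<unk>"]) for word in text.lower().split()]

# Stage 2: "knowledge of language" -> one small vector per token id.
EMBEDDINGS = [
    [0.1, 0.0],  # the
    [0.9, 0.3],  # cat
    [0.2, 0.8],  # sat
    [0.0, 0.0],  # <unk>
]

def embed(token_ids):
    """Look up the vector for each token id."""
    return [EMBEDDINGS[i] for i in token_ids]

# Stage 3: "knowledge of things" -> placeholder for the huge network.
def big_model(vectors):
    """Here it only averages the input vectors, dimension by dimension."""
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]

tokens = tokenize("The cat sat")
vectors = embed(tokens)
output = big_model(vectors)
print(tokens, output)
```

The point is only the shape of the pipeline: text goes in, token ids come out of stage 1, vectors come out of stage 2, and the expensive part is whatever replaces `big_model`.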



  • Very useful in some contexts, but it doesn’t “learn” the way a neural network can. When you’re feeding corrections into, say, ChatGPT, you’re making small, temporary, cached adjustments to its data model, but you’re not actually teaching it anything, because by its nature, it can’t learn.

    But that’s true of all (most?) neural networks? Are you saying neural networks are not AI and that they can’t learn?

    NNs don’t retrain while they are being used. They are trained once, and after that they cannot learn new behaviour or correct existing behaviour. If you want to make them better, you need to run them a bunch of times, collect and annotate good/bad runs, then re-train them from scratch (or fine-tune them) with this new data. Same for LLMs, because LLMs are neural networks.
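
A minimal sketch of that train/use split, assuming the smallest "network" possible: a single weight fitted with plain gradient descent. The `TinyModel` class and the data are hypothetical and only there to show the mechanics. `predict` reads the weight and never changes it; only an explicit `train` call on collected examples does.

```python
class TinyModel:
    """One-weight model: prediction = w * x."""

    def __init__(self):
        self.w = 0.0

    def predict(self, x):
        # Inference only reads the weight, it never updates it.
        return self.w * x

    def train(self, data, lr=0.1, epochs=50):
        # Gradient descent on squared error, one example at a time.
        for _ in range(epochs):
            for x, y in data:
                error = self.predict(x) - y
                self.w -= lr * error * x


model = TinyModel()

# Initial training: the data follows y = 2x.
model.train([(1.0, 2.0), (2.0, 4.0)])
w_after_training = model.w

# "Using" the model, however many times, leaves the weight untouched.
for x in [1.0, 2.0, 3.0]:
    model.predict(x)
weights_unchanged = (model.w == w_after_training)

# To change its behaviour you collect new annotated examples
# and run training again (a stand-in for fine-tuning).
model.train([(1.0, 3.0), (2.0, 6.0)])
print(w_after_training, weights_unchanged, model.w)
```

The second `train` call plays the role of fine-tuning: the weight only moves from roughly 2 to roughly 3 once the new examples are fed through a training step.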