Cake day: June 30th, 2023


  • If it is just a repackaging of ChatGPT’s existing “search the web” function, I don’t know why they’d bother. It can at best summarize a page of search results for a very literal-minded query, and even then it’s often lobotomized by the fact that OpenAI has made it easy for websites to opt out of having their pages accessible to its search crawler, and many top sites have, which means that for those sites you’re only getting a summary of the search result snippet and metadata. A competent user of Google search can run rings around it in terms of research, even with Google’s decline in quality. I guess it makes it faster to answer basic queries for recent information not in the training data, but that hardly seems worthy of a big event.

  • IIRC, based on the source paper, the “verbatim” text is common stuff like legal boilerplate, shared code snippets, book jacket blurbs, alphabetical lists of countries, and other text repeated countless times across the web. It’s the text equivalent of DALL-E “memorizing” a meme template or a stock image – it doesn’t mean all or even most of the training data is stored within the model, just that certain pieces of highly duplicated data have ascended to the level of concept and can be reproduced under unusual circumstances.

  • Tbh, I block ads when I can but have a hard time getting angry about this. YouTube is both incredibly useful and incredibly expensive to operate – seriously, what other service lets you upload hours of HD video which anyone in the world can access instantly, indefinitely, for free, and at the same scale YT does? It’s a peerless engineering marvel and it would be a tragedy if it were to shut down. If seeing some short skippable ads is what it takes to keep that resource viable, that’s honestly pretty fair.