
fionwe1987

Members
  • Posts

    3,855
  • Joined

  • Last visited

About fionwe1987

  • Birthday 07/03/1987

Profile Information

  • Gender
    Male

Recent Profile Visitors

7,476 profile views

fionwe1987's Achievements

Council Member

Council Member (8/8)

  1. Sigh. You might want to read the articles you link. No, you don't need supercomputers to do what this architecture is allowing: in which century does this qualify as a supercomputer? Additionally, 70B parameter models are great, if you're in late 2022 or early 2023. Meanwhile, the new Nvidia GB200 is designed to support 27 trillion parameter models (a rough memory-footprint sketch follows at the end of this list). Source: https://www.theverge.com/2024/3/18/24105157/nvidia-blackwell-gpu-b200-ai The Phison architecture is primarily a way to save on cost. It's really impressive, and there are a heck of a lot of use cases where 70B parameters is more than enough, and a company wanting to retrain Meta's or Mistral's open source models with private data would be well advised to go that route rather than hope to lay its hands on Nvidia chips. But we're very far from the state of the art being challenged. It will happen, of course. Neuromorphic chips and other architectures are coming that may totally wipe out Nvidia's current lead. But nothing stops regulation from being adjusted for that. Well, the complete clusterfuck that is the hapless US government does, but nothing conceptually prevents state-of-the-art chips, of whatever architecture, from being regulated, especially if the goal is to keep them from China. Individual companies can still get slower, cheaper systems and do plenty of mischief, but that genie got out of the bottle before we knew enough to even think of regulating it. Now we know better, and the focus should be on conditional access to these chips. A 27 trillion parameter model, I think, can start approaching video- and audio-based training with the level of success earlier models had with text. And that's truly terrifying; regulation should focus there, to make sure the tech doesn't reach actors who will use it to cause the kind of complete mayhem you get once video can be generated at the quality today's models manage with text.
  2. Right, that's the thing. I think a possible explanation would be that FEM gave rise to proto-consciousness, which was then hacked and further evolved in response to the pressures of social living, where minimization is harder, since the value of stimuli is significantly harder to predict. But that's a very convenient explanation, and I don't know that it's falsifiable.
  3. That's a nice piece. However, it's worth noting that while FEM may explain why consciousness first evolved, it doesn't explain how that evolution has proceeded since. My brain may indeed be on autopilot while driving a familiar route. Yet it often is not on autopilot when I'm stationary on my even more familiar couch, thinking about writing a story or picturing the next watercolor I want to work on. I'm surely engaging conscious processes in those moments, but nothing external has changed. My brain is not at its minimal-energy rest state, despite there being no sensory input to disturb it. I could meditate and take it there, but often, I don't. Nor is this failure to minimize goal-directed, or an optimal strategy, because whether the time spent thinking up a story or a watercolor is worth the energy consumed is not something we can predict well, especially if we are not published authors or famed watercolorists (or insert whatever art/skill we spend time cogitating on). Yet this kind of cognition is critical for all the external-facing success stories of human beings, at least. I definitely don't think it's unique to us, but we're somehow wired for this kind of hoped-for-future-state-driven non-minimization of energy use in our brains, which must have had different evolutionary drivers than whatever first evolved consciousness, if FEM is true. It's the "useless" cogitating that we need to explain if we want to figure out consciousness. And AI today doesn't run unprompted. The servers running LLMs do no meaningful computation when no prompts are entered. So whatever argument can be made for their consciousness, that consciousness does not exist when they're on but not given any external prompts.
  4. I thoroughly enjoyed the movie, till the very rushed end. And there's nothing earlier I'd seriously consider cutting, so they'd have had to make the movie longer. But that was an excellent adaptation, on the whole. There's a lot to quibble about, and I agree with several of the quibbles I read here, but it's definitely a movie I want to watch again. And the IMAX experience was truly worth it.
  5. Yeah. It's bonkers, really. I get it from a cost perspective, but you can't just use that to justify any and every mangling of a story. If doing longer and better stories means somewhat less visually stunning shows, I'll take that tradeoff any day.
  6. Rushed, and mostly meh. That's my verdict after finishing the season. It isn't a disaster, and I agree with whoever said it has the potential to recover, but if they cannot understand why this story needs breathing room, side quests, and character-centred episodes, it never will. The CGI was nice. Some decent set pieces, some decent action. Utterly unsure why they felt they had to mash up Hei Bai, Koh, and the Fog of Lost Souls; that's too much packed together. Bumi and Omashu felt weird, too. They were much more successful with Kyoshi Island. Having Azula be the one driving Zhao wasn't the worst idea, but it meant Zhao became a completely useless villain, and the animated version is just far superior.
  7. Korra ain't for 10-year-olds! The more I watch this show, the more I feel Netflix fucked up by picking The Last Airbender to live-actionize. Korra is much better suited, tonally, and eight hour-long episodes would actually allow them to deepen that story; without the restrictions of the animated medium, which prevented them from doing more than hinting at Korra and Asami's attraction, that could be shown more explicitly. For ATLA, I think the cost of making the live-action show drove them to change it tonally, to attract a broader audience, in ways that end up seriously hurting the story. What works so well in the animation is the innocence and genuine childishness of the Gaang against the backdrop of a generational war of conquest. It's clear now why Bryke left the show; this just doesn't feel right.
  8. I just started. It's a terrible call to start with the Airbender genocide. It just doesn't land. If I didn't know the show, why would I care about any of this, and so soon? This should have been a flashback, as it was in the animation. A grittier flashback, fine, but don't lead with it; that makes no sense.
  9. Not watched this yet, gotta wait till the weekend. The reviews are very mixed though. Pacing seems to be the major complaint. And the acting, which was expected I suppose.
  10. Can you cite some sources on this? The big foundation models from Google, OpenAI, Meta, Mistral, etc. are reported to take millions of GPU-hours on NVIDIA's A100s (a rough training-compute sketch follows at the end of this list). The notion that a cluster of PS5s can match this is fantastical, and if you're going to make such a claim, please prove it. No. But that's the point. The kind of video models we're seeing now require the best, and still struggle. The less-than-best gets you Will Smith eating spaghetti, which isn't as much of a threat to elections. What are the sizes and parameters for these models, roughly? Are they comparable in scale to foundation models? That's the point. It's not the concept of GPUs, it's the manufacturing bottlenecks that make them regulatable. There's exactly one company in the Netherlands (ASML) that makes the photolithography equipment you need to make A100s, or any advanced chip. What they actually are is a cluster of closely related component manufacturers that is just not easy to replicate. We shall see. You seem to have a view of the extraordinary ease of this that matches nothing I've read or seen. As this article notes: That's regulation, working. China may have the resources and know-how to overcome this. At some point. But every individual company will not. Thus, my point stands. Regulating AI is possible, and the route to do it is through the high-end chips needed to train foundation models.
  11. In other news, the One Power is now real! https://venturebeat.com/ai/true-source-unveils-ai-llm-service-based-on-the-wheel-of-time/ With these jokers in charge, I can hardly wait...
  12. From your lips to our AI-overlord's ears: https://venturebeat.com/ai/true-source-unveils-ai-llm-service-based-on-the-wheel-of-time/ And which geniuses are behind this? These dinguses!
  13. Running the model is not training the model (a rough training-vs-inference memory sketch follows at the end of this list). The larger the model you have to train, the longer it will take on older/less powerful chips, and you hit real limits if you use the GPUs in regular PCs and Xboxes. Now, this holds for the largest foundation models, which are the ones that, so far, have been able to do the kind of impressive things the Sora model is doing here. That is beginning to change, but we're still nowhere close to a PC GPU being able to train a relatively large model. When/if that happens, we truly would be in an un-policeable madland. Nope. There's a reason NVIDIA shot up to being more valuable than Amazon and Google recently. The tech is proprietary, and they have a pretty deep moat. There are definitely attempts by other companies, and also China, to get to their level, but foundation model training right now, at the scale OpenAI etc. do it, requires that class of chips. Google, and maybe Amazon, are the only folks running close. Apple is a wild card, and probably has something they're yet to release. Fei-Fei Li is a good person to read or listen to about this. The concern has been that these chips have been hogged by big tech, so even all the universities in the US combined cannot come close to the level of compute needed to train something like GPT-4 or Gemini 1.0. This isn't stuff you can do in your garage. Yes, I'm talking about training only.
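
For post 1, a rough memory-footprint sketch of the model sizes being compared. The bytes-per-parameter figure is an assumption (bf16 weights only, ignoring activations and KV cache), so treat the results as order-of-magnitude only:

    # Back-of-the-envelope weight memory for a 70B vs a 27T parameter model.
    # Assumes 2 bytes per parameter (bf16) and counts weights only.
    def weight_memory_gb(params, bytes_per_param=2):
        return params * bytes_per_param / 1e9

    for name, params in [("70B model", 70e9), ("27T model", 27e12)]:
        print(f"{name}: ~{weight_memory_gb(params):,.0f} GB of weights")

    # 70B model:  ~140 GB    -> a couple of 80 GB GPUs, or one node with offloading
    # 27T model:  ~54,000 GB -> rack-scale systems with pooled memory (GB200 NVL72 class)

That roughly 400x gap in weight memory alone is why a Phison-style single-box setup and a GB200 rack are playing different games.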
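For post 10, a rough training-compute comparison using the common C ≈ 6·N·D approximation (N = parameters, D = training tokens). The model size, token count, and utilization figures are assumptions for illustration, not published numbers for any particular model:

    # Rough training FLOPs via C ~ 6 * N * D, then device-hours at assumed utilization.
    N = 70e9              # parameters (a 70B-class model, assumed)
    D = 2e12              # training tokens (assumed)
    C = 6 * N * D         # ~8.4e23 FLOPs total

    a100 = 312e12 * 0.4   # A100 bf16 peak ~312 TFLOPS at ~40% utilization (assumed)
    ps5  = 10e12 * 0.4    # PS5 GPU ~10 TFLOPS peak at the same utilization (assumed)

    print(f"~{C / a100 / 3600:,.0f} A100-hours")  # on the order of 2 million
    print(f"~{C / ps5 / 3600:,.0f} PS5-hours")    # ~30x more device-hours, before
                                                  # interconnect and memory problems

Even granting the PS5 cluster perfect scaling, the interconnect and memory capacity just aren't there, which is the point about manufacturing bottlenecks.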
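And for post 13, a rough sketch of why training is so much heavier than inference: with mixed precision and Adam you hold weights, gradients, and optimizer states, not just the weights. The ~16 bytes-per-parameter rule of thumb is an assumption taken from common mixed-precision setups:

    # Inference vs training memory for a 70B-parameter model (weights/state only,
    # activations excluded). Byte counts per parameter are assumptions.
    N = 70e9

    inference_gb = N * 2 / 1e9                    # bf16 weights only
    training_gb  = N * (2 + 2 + 4 + 4 + 4) / 1e9  # bf16 weights + grads, fp32 master
                                                  # weights + Adam moments (~16 B/param)

    print(f"inference: ~{inference_gb:,.0f} GB")  # ~140 GB
    print(f"training:  ~{training_gb:,.0f} GB")   # ~1,120 GB, before activations

Which is why you can run a big open model on fairly modest hardware but still need data-center clusters to train one.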