Meta has thrown down the gauntlet in the race for more efficient artificial intelligence. The tech giant released pre-trained models on Wednesday that leverage a novel multi-token prediction approach, potentially changing how large language models (LLMs) are developed and deployed.

This new technique, first outlined in a Meta research paper in April, breaks from the traditional method of training LLMs to predict just the next word in a sequence. Instead, Meta’s approach tasks models with forecasting multiple future words simultaneously, promising enhanced performance and drastically reduced training times.
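To make the idea concrete, here is a minimal PyTorch sketch of a multi-token prediction loss: a shared transformer trunk produces hidden states, and several independent output heads each predict the token a fixed number of positions ahead. Everything here (the `MultiTokenPredictor` name, `d_model`, `n_future`, and the other hyperparameters) is an illustrative assumption, not Meta's actual architecture or configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiTokenPredictor(nn.Module):
    """Illustrative multi-token prediction head: n_future independent
    linear heads share one trunk representation; head i predicts the
    token (i + 1) positions ahead. All sizes are placeholders."""

    def __init__(self, d_model: int = 512, vocab_size: int = 32000,
                 n_future: int = 4):
        super().__init__()
        self.n_future = n_future
        self.heads = nn.ModuleList(
            nn.Linear(d_model, vocab_size) for _ in range(n_future)
        )

    def forward(self, hidden: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        # hidden:  (batch, seq_len, d_model) from any transformer trunk
        # targets: (batch, seq_len) ground-truth token ids
        loss = hidden.new_zeros(())
        for i, head in enumerate(self.heads):
            offset = i + 1
            # Position t must predict the token at t + offset, so trim
            # `offset` positions from the end of the hidden states and
            # from the start of the targets.
            logits = head(hidden[:, :-offset, :])
            loss = loss + F.cross_entropy(
                logits.reshape(-1, logits.size(-1)),
                targets[:, offset:].reshape(-1),
            )
        # Average over heads; a plain next-token model is n_future == 1.
        return loss / self.n_future

# Toy usage with random stand-ins for real trunk outputs and token ids.
predictor = MultiTokenPredictor()
hidden = torch.randn(2, 128, 512)
targets = torch.randint(0, 32000, (2, 128))
loss = predictor(hidden, targets)
loss.backward()
```

At inference time the extra heads can simply be dropped, falling back to standard next-token decoding; Meta's paper also describes reusing them to draft several tokens at once, a form of self-speculative decoding that speeds up generation.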

  • hendrik@palaver.p3x.de · 4 days ago

    Nice. And we need some way to scale LLMs beyond current limitations, or making them substantially more intelligent will soon become prohibitively expensive.

    They also released two other kinds of models… one concerned with images and one with audio.