7
Wisecrack
280d

Someone figured out how to make LLMs obey context-free grammars, which opens up the possibility of really fine-grained control over generation and the structure of outputs.
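
For concreteness, here's roughly how that kind of constraint gets enforced in the schemes I've seen: at each step the decoder masks out every token the grammar would not accept next. This is only a sketch; `allowed_next` is a hypothetical stand-in for an incremental grammar checker, not a real library call.

```python
import math

def constrained_step(logits, prefix, allowed_next):
    """Pick the next token, considering only tokens the grammar allows.

    `logits` is one score per vocabulary id; `allowed_next(prefix)` is a
    hypothetical incremental grammar checker returning permitted ids.
    """
    allowed = allowed_next(prefix)
    masked = [s if i in allowed else -math.inf for i, s in enumerate(logits)]
    return max(range(len(masked)), key=masked.__getitem__)  # greedy argmax
```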

And I was thinking, what if we did the same for something that consumed and validated tokens?

The thinking is that the option to backtrack already exists: if a token is invalid, the system can backtrack and regenerate. The machinery for this is mostly there in sampling schemes like 'temperature' and 'top-k', where the model scores multiple candidate next tokens and then selects from a subsample of them, typically weighted towards the highest-scoring one.
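
To make that concrete, here's a minimal sketch of top-k sampling. Nothing here is tied to a particular library; `logits` just stands in for the model's scores over the vocabulary. The point is that the runners-up are already scored, which is what makes backtracking cheap.

```python
import math
import random

def sample_top_k(logits, k=5, temperature=1.0):
    """Sample one token from the k highest-scoring candidates.

    Returns the chosen token id plus the full candidate list; the
    runners-up remain available as fallbacks if the chosen token is
    later rejected.
    """
    ranked = sorted(enumerate(logits), key=lambda p: p[1], reverse=True)[:k]
    weights = [math.exp(score / temperature) for _, score in ranked]
    total = sum(weights)
    candidates = [(tok, w / total) for (tok, _), w in zip(ranked, weights)]
    chosen = random.choices([t for t, _ in candidates],
                            weights=[p for _, p in candidates])[0]
    return chosen, candidates
```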

But it occurs to me that a process could be run in front of that, one that conditions the stream against a grammar and takes as its input the output of the base process. The instruction prompt to it would be a simple binary filter:
"If the next token conforms to the provided grammar, output it to the stream; otherwise trigger backtracking in the LLM that gave you the input."

This is very much a compliance thing, but it could be used for finer-grained control over how a machine examines its own output, rather than the current approach where you simply feed its output back in as input, as we do now for systems able to continuously produce new output (such as the planners some people have built).

link here:
https://news.ycombinator.com/item/...

Comments
  • 3
    It's very hard to find decent stuff about LLMs, since most articles are commercial pieces with no programming involved. I finally managed to load a book and can ask questions about it. It took a long time to figure out.
  • 2
    @retoor could you share? I'm struggling to find anything decent as well
  • 2
    @dmonkey just load a whole book into the context var. If you need a book, search for "Harry Potter book txt github".

    Let me know if you need help: https://huggingface.co/tasks/...
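
    Roughly what that looks like with the Hugging Face pipeline API (just a sketch; a whole book blows past most models' context limits, so in practice you'd chunk the text first):

    ```python
    from transformers import pipeline

    qa = pipeline("question-answering")  # downloads a default QA model
    context = open("book.txt").read()    # the "context var"
    print(qa(question="Who is the main character?", context=context))
    ```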
  • 1
    I think it would be more efficient to enforce the grammar during training via an appropriate loss (rough sketch below). If the model's output generally isn't in the ballpark of the grammar, you might have to backtrack a lot. So you can't just pick any grammar; it must be adjacent to, or rather a subset of, the one the model was trained with.
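
    Something like this, very roughly: `masks` would come from a hypothetical grammar checker marking which tokens are legal at each position, and the extra term penalizes probability mass the model puts on illegal ones. Not a worked-out method, just the shape of the idea.

    ```python
    import torch
    import torch.nn.functional as F

    def grammar_aware_loss(logits, targets, masks, penalty=0.1):
        # logits: (seq_len, vocab); targets: (seq_len,)
        # masks: (seq_len, vocab), 1 where the grammar allows a token
        ce = F.cross_entropy(logits, targets)
        probs = torch.softmax(logits, dim=-1)
        illegal = (probs * (1 - masks)).sum(dim=-1).mean()
        return ce + penalty * illegal
    ```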
  • 1
    @TheSilent

    That's perfectly plausible. It doesn't give you hard guarantees, but I don't see why it wouldn't work in a soft regime.