I have argued for awhile now that the probabilistic nature of LLMs can represent a form of context that, when applied, can be utilized for robotic applications.

Seems I’m not the only one that had this idea. While simple, Microsoft researched applied a high level control library to demonstrate LLMs (ChatGPT) developing robotic task code.

  • hlfshell@programming.devOP
    link
    fedilink
    arrow-up
    1
    ·
    1 year ago

    This isn’t mine, it’s just an interesting blogpost I came across. Nor am I arguing that it should replace a robotics engineer.

    My main thought, not fully represented in the post, is that LLMs can act as a context engine for high level understanding of instructions + spatial awareness, and then apply it to actuation. This is somewhat touched upon in the article.

    I do think that there is some interesting work in LLM powered task level planning. I’m hoping to find the time put together a good example of this, utilizing the ability for LLMs to make logical leaps based on instruction. In the article, it took the command “I’m thirsty” to mean move to a drink. In a more applicable application, we can use LLMs to identify that a room with multiple identified objects (refrigerator, oven, stove, cabinets, etc) is in fact a kitchen. Then, from there, determine that “I’ve seen a room I’ve identified as a kitchen - I can navigate there to attempt to find a drink”.