Large Language Models Enable Few-Shot Clustering

paper

Authors: Vijay Viswanathan, Kiril Gashteovski, Carolin Lawrence, Tongshuang Wu, Graham Neubig

Word Count: 4893 words

Estimated Average Read Time: 9 - 10 minutes

This article discusses how large language models (LLMs) can be used to enable more query-efficient, few-shot semi-supervised text clustering. The authors explore three stages at which an LLM can be incorporated into the clustering pipeline:

Before clustering - by enriching the input features with keyphrases generated by an LLM (a sketch follows this list)

During clustering - by providing constraints to the clusterer using an LLM as a pairwise constraint pseudo-oracle

After clustering - by using an LLM to post-correct cluster assignments
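
As a concrete reference for the first stage above, here is a minimal sketch of keyphrase-based enrichment before clustering. The prompt wording, the gpt-4o-mini model name, the all-MiniLM-L6-v2 embedder, and the helper names are illustrative assumptions, not the paper's exact setup.

```python
# Sketch: enrich each document with LLM-generated keyphrases, embed the
# enriched text, then cluster as usual. Assumes the openai,
# sentence-transformers, and scikit-learn packages and an OPENAI_API_KEY
# in the environment.
from openai import OpenAI
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

client = OpenAI()

def keyphrases_for(text: str) -> str:
    """Ask the LLM for a few keyphrases capturing what the text is about."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model; any capable chat model would do
        messages=[{
            "role": "user",
            "content": f"List 3-5 short keyphrases describing the topic of:\n{text}",
        }],
    )
    return response.choices[0].message.content

def cluster_with_keyphrases(texts: list[str], n_clusters: int):
    # Concatenating the keyphrases lets the embedding reflect the LLM's view
    # of each document's topic, not just its surface wording.
    enriched = [f"{t}\n{keyphrases_for(t)}" for t in texts]
    embeddings = SentenceTransformer("all-MiniLM-L6-v2").encode(enriched)
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(embeddings)
```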

They find that incorporating LLMs before and during clustering yields significant improvements in cluster quality. In particular, using an LLM to generate keyphrases that enrich the textual representation achieved state-of-the-art results on the datasets they tested.
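
To make the "during clustering" stage concrete, the sketch below treats an LLM as the pairwise constraint pseudo-oracle that would otherwise be a human annotator. The prompt, model name, and helper functions are assumptions for illustration; the returned must-link and cannot-link pairs would be fed to a constrained clusterer such as PCKMeans rather than used on their own.

```python
# Sketch: an LLM answers "do these two texts belong together?" queries, and the
# answers become must-link / cannot-link constraints for constrained clustering.
from openai import OpenAI

client = OpenAI()

def same_cluster(text_a: str, text_b: str) -> bool:
    """Pseudo-oracle query: do these two texts express the same intent/topic?"""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model
        messages=[{
            "role": "user",
            "content": ("Do the following two utterances express the same intent? "
                        "Answer Yes or No.\n"
                        f"A: {text_a}\nB: {text_b}"),
        }],
    )
    return response.choices[0].message.content.strip().lower().startswith("yes")

def collect_constraints(candidate_pairs, texts):
    """Spend a fixed query budget on candidate pairs and return constraint lists."""
    must_link, cannot_link = [], []
    for i, j in candidate_pairs:
        (must_link if same_cluster(texts[i], texts[j]) else cannot_link).append((i, j))
    return must_link, cannot_link
```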

Incorporating LLMs does add query cost, but they find that an LLM can match or exceed the performance of a human oracle at a fraction of the labeling cost. This suggests that LLMs have the potential to enable more effective semi-supervised text clustering.

For application development using LLMs or GANs, their work demonstrates how large language models can be used in a simple, targeted way to augment existing machine learning models and achieve performance gains. While their examples focus on clustering, a similar approach could potentially benefit other tasks where LLMs can provide “pseudo labels” or simulated demonstrations.

Overall, their results indicate that LLMs have great potential to enable more efficient use of human feedback and thereby improve model performance.