Large Language Models Enable Few-Shot Clustering
Authors: Vijay Viswanathan, Kiril Gashteovski, Carolin Lawrence, Tongshuang Wu, Graham Neubig
Word Count: 4893
Estimated Read Time: 9-10 minutes
This article discusses how large language models (LLMs) can be used to enable more query-efficient, few-shot, semi-supervised text clustering. The authors explore three stages at which an LLM can be incorporated into the clustering pipeline:
Before clustering - by improving the input features with keyphrases generated by an LLM (see the sketch after this list)
During clustering - by providing constraints to the clusterer, using an LLM as a pairwise-constraint pseudo-oracle
After clustering - by using an LLM to post-correct cluster assignments
They find that incorporating LLMs before and during clustering can significantly improve cluster quality. In particular, using an LLM to generate keyphrases that enrich each document's textual representation achieved state-of-the-art results on the datasets they tested.
Incorporating LLMs does add expense, but they find that an LLM oracle can match or exceed the performance of a human oracle at a fraction of the labeling cost. This suggests that LLMs have the potential to enable more effective semi-supervised text clustering.
For developers building applications with LLMs, their work demonstrates how large language models can be used in a simple, targeted way to augment existing machine learning models and achieve performance gains. While their examples focus on clustering, a similar approach could potentially benefit other tasks where LLMs can provide “pseudo labels” or simulated demonstrations.
Overall, their results indicate that LLMs have great potential to make human feedback go further and thereby improve model performance.