Introducing PromptCompressor, a web service that makes using language models like ChatGPT more cost-effective. PromptCompressor uses a deep learning model trained with reinforcement learning to shorten your prompts by approximately 20-30% while preserving their performance.
Enter your prompt into the text box at promptcompressor.com, then copy the compressed prompt and paste it into ChatGPT.
The compressed prompts may contain many grammatical errors but still work effectively. Sometimes they even outperform the originals!
PromptCompressor has been trained on the following prompt format. For stable performance, please write your prompts according to this format:
Instruction: <Write the instruction here. For example, 'Summarize the key points of the text.'>
Input: <Enter the sentence or data to be processed according to the instruction here.>
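For example, a summarization request in this format might look like the following (an illustrative prompt, not a required one):

```
Instruction: Summarize the key points of the text.
Input: Large language models are billed per token, so the length of a prompt directly affects the cost of each API call.
```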
If you are providing a service using the GPT API or using it personally, you might have encountered issues with API costs or the constraints of the context window. The current pricing policy and maximum context size of GPT are as follows:
| Model | Context | Input | Output |
|---|---|---|---|
| GPT-3.5-turbo | 4K context | $0.0015 / 1K tokens | $0.002 / 1K tokens |
| GPT-3.5-turbo | 16K context | $0.003 / 1K tokens | $0.004 / 1K tokens |
| GPT-4 | 8K context | $0.03 / 1K tokens | $0.06 / 1K tokens |
| GPT-4 | 32K context | $0.06 / 1K tokens | $0.12 / 1K tokens |
In simple terms: you pay for every token you send, so a shorter prompt means a lower input cost. PromptCompressor is therefore best suited for tasks like summarization, prediction, classification, and question answering, where the input contains more tokens than the output.
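As a rough illustration of the savings, here is a small sketch assuming GPT-3.5-turbo (4K) input pricing from the table above, a 25% compression rate, and an illustrative request volume. These figures are assumptions for the arithmetic, not measurements of PromptCompressor:

```python
# Rough monthly savings from prompt compression.
# Assumptions: GPT-3.5-turbo (4K) input pricing, 1,000-token prompts,
# 25% compression, 10,000 requests per month -- all illustrative.
INPUT_PRICE_PER_1K = 0.0015  # $ per 1K input tokens
PROMPT_TOKENS = 1000         # original prompt length in tokens
COMPRESSION = 0.25           # fraction of tokens removed
REQUESTS = 10_000            # requests per month

original = PROMPT_TOKENS / 1000 * INPUT_PRICE_PER_1K * REQUESTS
compressed = original * (1 - COMPRESSION)
print(f"original:   ${original:.2f}")               # original:   $15.00
print(f"compressed: ${compressed:.2f}")             # compressed: $11.25
print(f"saved:      ${original - compressed:.2f}")  # saved:      $3.75
```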
Humans can understand the meaning of a sentence even if some words are missing, such as articles like "a" and "the" or prepositions like "in" and "of". GPT, trained on human language, behaves similarly. We used this idea to train a model that removes tokens with little impact on GPT's output, shortening the prompt while retaining its meaning and context.
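To illustrate the idea (a hypothetical hand-made example, not actual PromptCompressor output):

```
Original:   Summarize the key points of the text in three sentences.
Compressed: Summarize key points of text in three sentences.
```

The compressed version drops the articles but still conveys the same instruction.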
We conducted experiments with over 100,000 instructions using the Llama-2-7b-chat-hf model and confirmed that compressed prompts maintain about 95% of the original performance while shortening prompts by roughly a quarter on average.
PromptCompressor generates optimized prompts based on the context, so the compression rate can vary by task. For instance, a prompt may not be compressed at all if it contains hardly any unnecessary tokens to remove.
Currently, PromptCompressor has a limit of 507 tokens due to constraints of the backbone model. We aim to raise this limit and are continuously researching and developing improvements to the service.
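If you want to check whether a prompt fits under the limit before submitting it, you can count its tokens with a tokenizer. Here is a minimal sketch using OpenAI's tiktoken library; note that PromptCompressor's 507-token limit is defined by its own backbone model's tokenizer, so this count is only an approximation:

```python
import tiktoken

# cl100k_base is the encoding used by GPT-3.5/GPT-4; PromptCompressor's
# backbone may tokenize differently, so treat this count as an estimate.
encoding = tiktoken.get_encoding("cl100k_base")

prompt = "Instruction: Summarize the key points of the text.\nInput: ..."
num_tokens = len(encoding.encode(prompt))
print(f"{num_tokens} tokens", "(over the 507-token limit)" if num_tokens > 507 else "")
```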
If you have questions or feedback about the service, please contact [email protected]. Your opinions will be a great help in improving the service.