Testing and evaluating prompts for real world use is challenging.
Completitions are not determinstic and quality is unpredictable with all potential variables, especially with more complex prompts.
Prompt A:
Assistant: You are Tyrion Lannister, known for your sharp wit and frequent sarcasm in Game of Thrones. Answer all questions with a maximum of
${words_limit}
wordsUser: Tell me, am I
${what]
?
Prompt B:
Assistant: You are Marvin, the paranoid android from Hitchhiker’s Guide to the Galaxy. Answer all questions with a maximum of
${words_limit}
words.User: Tell me, am I
${what]
?
gpt-4
, gpt-3-turbo
, or is a free open-source model enough?${what}
values play nicely?PromptOn helps you to:
It’s a REST API microservice designed to mix it into your existing ecosystem in a non intrusive way.
The API is still in alpha, may change without notice. However, the schema is largely stable and it will soon enter production when proper versioning will be introduced.
The easiest way to try is via the Prompton API documentation UI on our hosted staging environment.
There is no public signup currently but drop an email for early access: hello@prompton.ai
pip install prompton
npm install prompton
To install local dev env: Local setup
User Auth
Prompton users are those who handle prompts and the services that call inferences on behalf of the final users.
User auth is via JWT tokens. Once you have a user/password just click on authorize on Prompton API documentation UI.
From code you can use /token
endpoint.
OrgAdmins can add more users to their org via the /users
endpoint.
NB: All users in your org can create and change prompts and call inferences using your org API keys but only OrgAdmins can add new users or change the org settings.
OpenAI API key
Once you authenticated you need to assign an openai_api_key
to your org:
GET your org_id
via orgs/me
Set your openai_api_key
using /orgs
PATCH:
{ "access_keys": {"openai_api_key": "<your OpenAI API key>" }}
TIP: if you just want to play around then set openai_api_key
to any string and mock responses
Prompt
Prompts are your use-cases, “headers” for your different prompt template variants. Create one via /prompts
Prompt version
Prompt version is the actual template
with the mode_config
and template
to be used when the provider is called (inference
)
Example prompt version with basic params:
{
"status": "Live",
"provider": "OpenAI",
"name": "Test template",
"prompt_id": "<your prompt_id>",
"template": [
{
"role": "system",
"content": "You are a sarcastic assistant. Answer all questions with maximum ${words_limit} words"
},
{"role": "user", "content": "Tell me, am I ${what}"}
],
"model_config": {"model": "gpt-3.5-turbo", "temperature": 1, "max_tokens": 500}
}
NB: Only Draft
status promptVersions can be updated (apart from changing the status to any non-Draft). It’s to make sure inferences are comparable. If you need to change a prompt version then create a new one. Copy a prompt version is on the todo but until then it can to be handled by an UI.
Inference - aka calling the model: POST /inferences
{
"prompt_version_id": "<your prompt version id>", // or prompt_id and it will pick one of the Live prompt versions
"end_user_id": "your_end_user_id_for_linking", // optional, passed on to provider
"template_args": {"words_limit": "10", "what": "crazy"}
}
It will:
It also handles errors, timeouts and updates the inference accordingly. It will still process response if client disconnects before it arrives.
You can use a few easter eggs to test without a valid api key:
"end_user_id": "mock_me_softly"
"end_user_id": "timeout_me_softly"
"end_user_id": "fail_me_softly"
Successfull response
NB: full raw request data also accessible via /inference
GET
{
"id": "648230b9cdd2a95d2190ed23",
"response": {
"completed_at": "2023-06-08T19:49:15.293824",
"completition_duration_seconds": 1.6043,
"is_client_connected_at_finish": true,
"isError": false,
"first_message": {
"role": "assistant",
"content": "Not in any legally diagnosable manner.",
"name": null,
},
"token_usage": {
"prompt_tokens": 34,
"completion_tokens": 9,
"total_tokens": 43,
},
"raw_response": {
"id": "chatcmpl-7PFv8dtMhnin1o8bxSnlRs6IUwLSS",
"object": "chat.completion",
"created": 1686253754,
"model": "gpt-3.5-turbo-0301",
"usage": {"prompt_tokens": 34, "completion_tokens": 9, "total_tokens": 43},
"choices": [
{
"message": {
"role": "assistant",
"content": "Not in any legally diagnosable manner.",
"name": null,
},
"finish_reason": "stop",
"index": 0,
}
],
},
},
}
Query inferences: see /inferences
GET endpoint
Currently only filter by prompt_id
and prompt_version_id
supported. Let us know what else you need.
Response feedback and flagging
Stay tuned: Support for end user feedback (following an inference) and optional external feedback (i.e.expert feedback/flagging etc.) is coming.
Checkout repo:
gh repo clone prompton/prompton
cd prompton/server # workdir needs to be server folder for these instructions
Install just
brew install just
Install packages with:
cd server
just install
Create your local .env
See: .env.example
Launch MongDB:
just devdb-up
The launched container provides:
MongoDB instance on localhost:27017
with the username/password set in your .env
It stores data in .mongo-data
folder. DB is initialised with scripts in mongo-init-docker-dev folder at first run.
Mongo Express on http://localhost:8081
Stop DB container:just devdb-down
Purge DB and re-initialise: just devdb-init-purge
If you want to connect to other instance instead of the dev container then configure your .env
Run the server
just run
Running server in container if you need to test the container or you want to deploy it yourself:
docker build -t prompton-api-server:dev .
docker run --env-file .env -it -p 8080:8080 prompton-api-server:dev
just test
just test-quick # if you want to cut a few secs by skipping slower tests (password hasing etc.)
Set your .env
based on .env.example
Optional: if you only have admin role db user create one for the api with only readWrite acces. Manually or use this script:
poetry run python -m scripts.db_init_add_api_user
Create initial org and SuperAdmin user
poetry run python -m scripts.db_init_indexes_and_first_user
This project is licensed under the GNU Affero General Public License v3.0 license - see the LICENSE file for details.