This project is designed to help you learn and experiment with dspy.ai by providing simple, "hello world"-style examples using different language models (LLMs and SLMs).
The goal is to demonstrate how to send basic instructions (such as "Say Hello World") to various language models using dspy.ai, with examples for OpenAI, Anthropic, and an SLM running on-device.
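All of the chat scripts follow the same basic pattern: configure a language model, define a signature, and call a module. A minimal sketch, assuming DSPy's `dspy.LM` interface; the model names and the Ollama endpoint below are illustrative defaults, not necessarily what each script uses:

```python
import dspy

# Pick one backend: OpenAI, Anthropic, or a local Ollama server.
lm = dspy.LM("openai/gpt-4o")  # reads OPENAI_API_KEY from the environment
# lm = dspy.LM("anthropic/claude-3-5-sonnet-20240620")
# lm = dspy.LM("ollama_chat/llama3.2:1b", api_base="http://localhost:11434", api_key="")
dspy.configure(lm=lm)

# A signature names the input and output fields; Predict builds the prompt.
hello = dspy.Predict("instruction -> response")
print(hello(instruction="Say Hello World").response)
```

The scripts below apply this pattern with different backends and DSPy modules.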
- `chatresponse_claude.py`
  Sends a simple instruction to an LLM via Anthropic's Claude 3.5 Sonnet model and prints the response.
- `chatresponse_openai.py`
  Sends a simple instruction (e.g., "Say Hello World") to an LLM via OpenAI's GPT-4o model and prints the response.
- `chatresponse_slm.py`
  Sends a simple instruction to a locally running SLM (small language model), Ollama's Llama3.2-1b, and prints the response.
- `classify_slm.py`
  Classifies the sentiment of a sentence into one of three values (positive, negative, or neutral) using Ollama's locally running Llama3.2-1b, and prints the label and a confidence score (see the classification sketch after this list).
- `cot_slm.py`
  Uses the Chain of Thought primitive to reason through a mathematical problem on Ollama's locally running Llama3.2-1b, and prints the response.
- `cot.py`
  Uses the Chain of Thought primitive to reason through a mathematical problem on an OpenAI model (gpt-4o-mini), and prints the response (see the chain-of-thought sketch after this list).
- `followuptask.py`
  Given a sentence, finds the top three follow-up tasks using an OpenAI model (gpt-4o-mini) and prints the response.
- `infoextraction.py`
  Extracts entities and generates headlines from a sentence using Ollama's locally running Llama3.2-1b, and prints the response.
- `tool_example.py`
  Demonstrates DSPy's tool integration using the ReAct (Reasoning + Acting) pattern with a custom FizzBuzz tool: it shows how to create custom tools, integrate them with DSPy modules, and use iterative reasoning to solve problems step by step (see the ReAct sketch after this list).
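`classify_slm.py` builds on a typed signature. A minimal sketch, assuming Ollama serves the model under the `llama3.2:1b` tag; the sample sentence is illustrative:

```python
from typing import Literal

import dspy

# Point DSPy at the locally running Ollama server.
dspy.configure(lm=dspy.LM("ollama_chat/llama3.2:1b",
                          api_base="http://localhost:11434", api_key=""))

class Classify(dspy.Signature):
    """Classify the sentiment of a sentence."""
    sentence: str = dspy.InputField()
    sentiment: Literal["positive", "negative", "neutral"] = dspy.OutputField()
    confidence: float = dspy.OutputField()

classify = dspy.Predict(Classify)
result = classify(sentence="The new release fixed every bug I reported.")
print(result.sentiment, result.confidence)
```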
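`cot.py` and `cot_slm.py` use the `dspy.ChainOfThought` module, which asks the model for intermediate reasoning before the final answer. A minimal sketch against gpt-4o-mini; the question is illustrative:

```python
import dspy

dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))

# ChainOfThought injects a `reasoning` output field ahead of `answer`.
cot = dspy.ChainOfThought("question -> answer")
pred = cot(question="Two dice are tossed. What is the probability that the sum equals two?")
print(pred.reasoning)
print(pred.answer)
```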
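The ReAct pattern in `tool_example.py` can be sketched as follows; the `fizzbuzz` function below is a stand-in for the repository's custom tool, not its exact code:

```python
import dspy

dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))

def fizzbuzz(n: int) -> str:
    """Return 'Fizz', 'Buzz', 'FizzBuzz', or the number itself as a string."""
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)

# ReAct alternates between reasoning steps and tool calls until it can answer.
agent = dspy.ReAct("question -> answer", tools=[fizzbuzz])
print(agent(question="What does fizzbuzz return for 45?").answer)
```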
Contains scripts and data for evaluating summarization quality:

- `summarization_metric.py`
  Evaluates the quality of generated summaries using custom or model-based metrics. Can be used to compare different summarization models or approaches (see the metric sketch after this list).
- `dataset.jsonl`
  Example dataset in JSON Lines format for summarization evaluation.
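In DSPy, a metric is just a function over an example and a prediction, so a model-based judge can serve as one. A minimal sketch, assuming each line of `dataset.jsonl` carries `document` and `summary` fields; the judge signature and field names are assumptions, not the script's exact code:

```python
import json

import dspy
from dspy.evaluate import Evaluate

dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))

# An LLM judge: does the summary stay faithful to the source document?
judge = dspy.Predict("document, summary -> faithful: bool")

def summarization_metric(example, pred, trace=None):
    """Return 1.0 when the judge deems the summary faithful, else 0.0."""
    verdict = judge(document=example.document, summary=pred.summary)
    return 1.0 if verdict.faithful else 0.0

# Load the dataset and score a simple summarizer against it.
devset = [dspy.Example(**json.loads(line)).with_inputs("document")
          for line in open("dataset.jsonl")]
summarize = dspy.ChainOfThought("document -> summary")
evaluator = Evaluate(devset=devset, metric=summarization_metric, display_progress=True)
evaluator(summarize)
```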
Contains scripts and results for evaluating the style of generated text:

- `style_evaluation_metric.py`
  Evaluates whether generated answers match a requested style (formal, casual, or neutral) using DSPy and LLMs. Includes metrics for style match and answer length, and prints a results table (see the style-metric sketch after this list).
- `results_gpt_4o.txt`, `results_gpt_4_1_mini.txt`, `results_claude_sonnet4.txt`
  Example output files showing evaluation results for different models.
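A style-match metric can follow the same judge pattern; the sketch below is illustrative (the signature, field names, and length cap are assumptions, not the script's code):

```python
import dspy

dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))

# An LLM judge: does the answer read in the requested style?
style_judge = dspy.Predict("answer, requested_style -> style_matches: bool")

def style_metric(example, pred, trace=None):
    """Full credit for a style match, halved when the answer runs long."""
    matches = style_judge(answer=pred.answer,
                          requested_style=example.requested_style).style_matches
    within_length = len(pred.answer.split()) <= 100  # illustrative cap
    return (1.0 if matches else 0.0) * (1.0 if within_length else 0.5)
```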
- Clone this repository.
- Create a `.env` file and fill in the `OPENAI_API_KEY` and `CLAUDE_API_KEY` API keys (see the example `.env` below).
- Run a script, for example:

```bash
python chatresponse_openai.py
```
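The scripts read the two keys named above from the environment; a minimal `.env` layout with placeholder values:

```
OPENAI_API_KEY=...
CLAUDE_API_KEY=...
```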