diff --git a/examples/Custom-LLM-as-a-Judge.ipynb b/examples/Custom-LLM-as-a-Judge.ipynb
index 894ed0c..98b862c 100644
--- a/examples/Custom-LLM-as-a-Judge.ipynb
+++ b/examples/Custom-LLM-as-a-Judge.ipynb
@@ -499,7 +499,7 @@
  "It looks like the numeric rater scored almost 94% in total. That's not bad, but if 6% of your evals are incorrectly judged, that could make it very hard to trust them. Let's dig into the Braintrust\n",
  "UI to get some insight into what's going on.\n",
  "\n",
- "![Partial credit](../images/Custom-LLM-as-a-Judge/Partial-Credit.gif)\n",
+ "![Partial credit](../images/Custom-LLM-as-a-Judge-Partial-Credit.gif)\n",
  "\n",
  "It looks like a number of the incorrect answers were scored with numbers between 1 and 10. However, we do not currently have any insight into why the model gave these scores. Let's see if we can\n",
  "fix that next.\n"
@@ -670,11 +670,11 @@
  "It doesn't look like adding reasoning helped the score (in fact, it's half a percent worse). However, if we look at one of the failures, we'll get some insight into\n",
  "what the model was thinking. Here is an example of a hallucinated answer:\n",
  "\n",
- "![Output](../images/Custom-LLM-as-a-Judge/Output.png)\n",
+ "![Output](../images/Custom-LLM-as-a-Judge-Output.png)\n",
  "\n",
  "And the score along with its reasoning:\n",
  "\n",
- "![Reasoning](../images/Custom-LLM-as-a-Judge/Reasoning.png)\n"
+ "![Reasoning](../images/Custom-LLM-as-a-Judge-Reasoning.png)\n"
  ]
  },
  {
diff --git a/images/Custom-LLM-as-a-Judge/Output.png b/images/Custom-LLM-as-a-Judge-Output.png
similarity index 100%
rename from images/Custom-LLM-as-a-Judge/Output.png
rename to images/Custom-LLM-as-a-Judge-Output.png
diff --git a/images/Custom-LLM-as-a-Judge/Partial-Credit.gif b/images/Custom-LLM-as-a-Judge-Partial-Credit.gif
similarity index 100%
rename from images/Custom-LLM-as-a-Judge/Partial-Credit.gif
rename to images/Custom-LLM-as-a-Judge-Partial-Credit.gif
diff --git a/images/Custom-LLM-as-a-Judge/Reasoning.png b/images/Custom-LLM-as-a-Judge-Reasoning.png
similarity index 100%
rename from images/Custom-LLM-as-a-Judge/Reasoning.png
rename to images/Custom-LLM-as-a-Judge-Reasoning.png
diff --git a/images/Custom-LLM-as-a-Judge/Classifier.png b/images/Custom-LLM-as-a-Judge/Classifier.png
deleted file mode 100644
index 8326d0e..0000000
Binary files a/images/Custom-LLM-as-a-Judge/Classifier.png and /dev/null differ