@@ -32,14 +32,15 @@ def helper_function():
 
 ### 2. Evaluator (`evaluator.py`)
 
-Your evaluator must return a **dictionary** with specific metric names:
+Your evaluator can return either a **dictionary** or an **`EvaluationResult`** object:
 
 ```python
 def evaluate(program_path: str) -> Dict:
     """
-    Evaluate the program and return metrics as a dictionary.
-
-    CRITICAL: Must return a dictionary, not an EvaluationResult object.
+    Evaluate the program and return metrics.
+
+    Can return either a dict or an EvaluationResult object.
+    Use EvaluationResult if you want to include artifacts for debugging.
     """
     try:
         # Import and run your program
@@ -60,10 +61,11 @@ def evaluate(program_path: str) -> Dict:
 ```
 
 **Critical Requirements:**
-- ✅ **Return a dictionary**, not `EvaluationResult` object
+- ✅ **Return a dictionary or `EvaluationResult`** - both are supported
 - ✅ **Must include `'combined_score'`** - this is the primary metric OpenEvolve uses
 - ✅ Higher `combined_score` values should indicate better programs
 - ✅ Handle exceptions and return `combined_score: 0.0` on failure
+- ✅ Use `EvaluationResult` with artifacts for richer debugging feedback (see the sketch below)
 
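For concreteness, here is a minimal sketch of an evaluator that satisfies these requirements and attaches artifacts. The `openevolve.evaluation_result` import path and the `program.run()` entry point are assumptions for illustration; adjust them to your OpenEvolve version and to whatever your evolved program actually exposes.

```python
# Minimal sketch, not the canonical implementation.
# Assumptions: EvaluationResult is importable from openevolve.evaluation_result,
# and the evolved program exposes a run() entry point.
import importlib.util
from typing import Dict, Union

from openevolve.evaluation_result import EvaluationResult  # assumed import path


def evaluate(program_path: str) -> Union[Dict, EvaluationResult]:
    try:
        # Load the evolved program from its file path
        spec = importlib.util.spec_from_file_location("program", program_path)
        program = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(program)

        score = program.run()  # hypothetical entry point in the evolved program

        # Option A: return a plain dict with 'combined_score' as the primary metric
        # return {"combined_score": score}

        # Option B: return an EvaluationResult with artifacts for richer debugging feedback
        return EvaluationResult(
            metrics={"combined_score": score},
            artifacts={"stdout": f"run() returned {score}"},
        )
    except Exception:
        # On failure, still report combined_score: 0.0 so evolution can continue
        return {"combined_score": 0.0}
```
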
 ### 3. Configuration (`config.yaml`)
 
@@ -121,18 +123,17 @@ log_level: "INFO"
 
 ## Common Configuration Mistakes
 
-❌ **Wrong:** `feature_dimensions: 2`
+❌ **Wrong:** `feature_dimensions: 2`
 ✅ **Correct:** `feature_dimensions: ["score", "complexity"]`
 
-❌ **Wrong:** Returning `EvaluationResult` object
-✅ **Correct:** Returning `{'combined_score': 0.8, ...}` dictionary
-
-❌ **Wrong:** Using `'total_score'` metric name
+❌ **Wrong:** Using `'total_score'` metric name
 ✅ **Correct:** Using `'combined_score'` metric name
 
-❌ **Wrong:** Multiple EVOLVE-BLOCK sections
+❌ **Wrong:** Multiple EVOLVE-BLOCK sections
 ✅ **Correct:** Exactly one EVOLVE-BLOCK section
 
+💡 **Tip:** Both a `{'combined_score': 0.8, ...}` dict and an `EvaluationResult(metrics={...}, artifacts={...})` object are valid return types
+
 ## MAP-Elites Feature Dimensions Best Practices
 
 When using custom feature dimensions, your evaluator must return **raw continuous values**, not pre-computed bin indices:
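
For example, with `feature_dimensions: ["score", "complexity"]` in `config.yaml`, the evaluator reports raw measurements and lets OpenEvolve compute the bins. This is only a sketch; the placeholder score and the use of source length as a complexity measure are illustrative assumptions, not part of the OpenEvolve API.

```python
# Sketch only: metric names must match the feature_dimensions in config.yaml.
# The placeholder score and length-based complexity measure are illustrative.
def evaluate(program_path: str) -> dict:
    with open(program_path) as f:
        source = f.read()

    score = 0.87  # placeholder; compute your program's real score here

    return {
        "combined_score": score,           # primary fitness metric
        "score": score,                    # raw continuous value (correct)
        "complexity": float(len(source)),  # raw continuous value (correct)
        # "complexity": 3,                 # wrong: a pre-computed bin index
    }
```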