Running Benchmarks¶
This guide explains how to run a benchmark and understand the results.
Prerequisites¶
Before running benchmarks, ensure you have:
Completed the initialization process (see Getting Started)
A valid masbench_config.yml file in your project root
Your server executable (.jar file) accessible
Test levels in your specified levels directory
A working client implementation
Basic Benchmark Execution¶
To run a benchmark, use the masbench run command followed by a benchmark name:
masbench run my-first-benchmark
The benchmark name is used to:
Create a unique folder for the results
Name the output files
Organize multiple benchmark runs
Important
Benchmark names must be unique. If you try to run a benchmark with a name that already exists, masbench will display an error and exit.
Adding Notes to Your Benchmark¶
You can add a descriptive message to your benchmark run using the -m or --message flag:
masbench run algorithm-v2 -m "Testing A* with improved heuristic"
This message helps you remember what changes you were testing when reviewing results later.
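Because benchmark names must be unique, scripted runs can benefit from generating the name automatically. The following is a minimal sketch, not part of masbench itself: the helper names (`unique_benchmark_name`, `build_run_command`, `run_benchmark`) and the timestamp suffix format are assumptions chosen for illustration. It relies only on the `masbench run <name> -m <message>` form shown above.

```python
import subprocess
from datetime import datetime
from typing import List, Optional


def unique_benchmark_name(prefix: str, now: Optional[datetime] = None) -> str:
    """Append a timestamp so repeated runs do not reuse a name (hypothetical scheme)."""
    now = now or datetime.now()
    return f"{prefix}-{now:%Y%m%d-%H%M%S}"


def build_run_command(name: str, message: Optional[str] = None) -> List[str]:
    """Build the `masbench run` invocation, adding -m only when a note is given."""
    cmd = ["masbench", "run", name]
    if message:
        cmd += ["-m", message]
    return cmd


def run_benchmark(prefix: str, message: Optional[str] = None) -> int:
    """Run masbench with a timestamped name and return its exit code."""
    cmd = build_run_command(unique_benchmark_name(prefix), message)
    return subprocess.run(cmd).returncode
```

Calling `run_benchmark("algorithm-v2", "Testing A* with improved heuristic")` would start a run named something like `algorithm-v2-20240102-030405`, assuming `masbench` is on your PATH.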
Output Structure¶
After running a benchmark, you’ll find the following structure in your benchmark folder:
benchmarks/
└── my-first-benchmark/
    ├── logs/
    │   ├── my-first-benchmark_server.zip
    │   └── my-first-benchmark_client.clog
    └── my-first-benchmark_results.csv
File Descriptions¶
- Server Logs (*_server.zip): Contains detailed server execution logs, including level loading, client communication, and any server-side errors.
- Client Logs (*_client.clog): Raw output from your client, including debug information, algorithm progress, and any client-side errors.
- Results CSV (*_results.csv): Processed benchmark data in CSV format with the following columns:
  - LevelName: Name of the level file
  - Solved: Whether the level was solved (true/false)
  - Actions: Number of actions in the solution
  - Time: Execution time in milliseconds
  - Generated: Number of nodes generated during search
  - Explored: Number of nodes explored during search
  - MemoryAlloc: Memory allocated during execution
  - MaxAlloc: Peak memory allocation
Default Output vs Extended Metrics¶
By default, masbench only captures the following basic metrics from your benchmark runs:
- LevelName: Name of the level file
- Solved: Whether the level was solved (true/false)
- Actions: Number of actions in the solution
- Time: Execution time in milliseconds
The additional metrics (Generated, Explored, MemoryAlloc, and MaxAlloc) are not included in the output by default.
To track these extended metrics, your client must output specific strings at the end of execution as comments. Each string must be on a new line:
#Explored:
#Generated:
#Alloc:
When masbench detects these strings in your client output, it will parse the values and include them in the results CSV. If these strings are not present, the corresponding columns will be empty or contain default values.
Note
The output must be formatted as comments (starting with #). For example, in Python you should add something like:
print("#Explored: 123", flush=True)
print("#Generated: 1039", flush=True)
print("#Alloc: 200", flush=True)
This will produce output that masbench can parse:
[client][message] #Explored: 123
[client][message] #Generated: 1039
[client][message] #Alloc: 200
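To keep the metric lines consistent across a client, the formatting can be wrapped in one small helper that is called once at the end of the search. This is only a sketch: the function names `format_metrics` and `report_metrics` are hypothetical, and the unit of the `#Alloc` value is not specified here, so pass whatever your client already measures.

```python
import sys
from typing import List


def format_metrics(explored: int, generated: int, alloc: int) -> List[str]:
    """Return the three comment lines, one metric per line."""
    return [
        f"#Explored: {explored}",
        f"#Generated: {generated}",
        f"#Alloc: {alloc}",
    ]


def report_metrics(explored: int, generated: int, alloc: int, stream=None) -> None:
    """Print the metric lines and flush so they reach the log before exit."""
    stream = stream if stream is not None else sys.stdout
    for line in format_metrics(explored, generated, alloc):
        print(line, file=stream, flush=True)
```

Calling `report_metrics(123, 1039, 200)` as the last step of the client prints the same three lines as the example above.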
Example Results¶
Here’s an example of what the CSV results might look like:
LevelName,Solved,Actions,Time,Generated,Explored,MemoryAlloc,MaxAlloc
SAsoko1_01.lvl,true,12,45,127,89,2048,4096
SAsoko1_02.lvl,true,18,78,234,156,3072,6144
SAsoko1_03.lvl,false,0,300000,5670,4321,8192,16384
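A results file in this shape can be summarized with Python's standard `csv` module. The sketch below is illustrative rather than a masbench feature: `summarize_results` is a hypothetical helper, and it assumes the column names and the lowercase `true`/`false` values shown in the example. Averages are taken over solved levels only, since an unsolved row's Time and Actions do not describe a solution.

```python
import csv
import io
from typing import TextIO

# The example rows from above, reduced to the columns used here.
SAMPLE = """LevelName,Solved,Actions,Time
SAsoko1_01.lvl,true,12,45
SAsoko1_02.lvl,true,18,78
SAsoko1_03.lvl,false,0,300000
"""


def summarize_results(handle: TextIO) -> dict:
    """Count solved levels and average Time/Actions over the solved ones."""
    rows = list(csv.DictReader(handle))
    solved = [r for r in rows if r["Solved"].strip().lower() == "true"]
    count = len(solved)
    return {
        "levels": len(rows),
        "solved": count,
        "mean_time_ms": sum(float(r["Time"]) for r in solved) / count if count else 0.0,
        "mean_actions": sum(float(r["Actions"]) for r in solved) / count if count else 0.0,
    }


summary = summarize_results(io.StringIO(SAMPLE))
```

For a real run, open the results file instead, for example `with open(path, newline="") as f: summarize_results(f)`, where `path` points at your `*_results.csv`.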
Troubleshooting Common Issues¶
Benchmark Already Exists¶
If you see this error:
Error: Benchmark with name 'my-benchmark' already exists. Please remove it before running a new one.
You have several options:
Choose a different name: Use a new benchmark name
Remove the existing benchmark: Delete the folder manually
Rename the existing benchmark: Move it to preserve the results
# Remove the existing benchmark (adjust the path if your benchmark folder is named differently)
rm -rf benchmark-results/my-benchmark
# Or rename it to preserve the results
mv benchmark-results/my-benchmark benchmark-results/my-benchmark-old
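The rename option can also be scripted so that older results are never overwritten. The following sketch uses a hypothetical helper, `archive_existing`, and an assumed `-old`, `-old-2`, ... suffix scheme. The results folder is passed in as a parameter because its location depends on your setup.

```python
from pathlib import Path
from typing import Optional


def archive_existing(results_root: Path, name: str) -> Optional[Path]:
    """Rename <root>/<name> to the first free <name>-old, <name>-old-2, ...

    Returns the new path, or None when no benchmark with that name exists.
    """
    source = results_root / name
    if not source.is_dir():
        return None
    target = results_root / f"{name}-old"
    counter = 2
    while target.exists():
        target = results_root / f"{name}-old-{counter}"
        counter += 1
    source.rename(target)
    return target
```

For example, `archive_existing(Path("benchmark-results"), "my-benchmark")` frees the name `my-benchmark` for a new run while keeping the earlier results.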
Server Not Found¶
If you get an error about the server not being found:
Check that the ServerPath in your config points to the correct file
Ensure the .jar file exists and is accessible
Verify you have Java installed and available in your PATH
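These checks can be run up front with a few lines of Python. This is a sketch under stated assumptions: `check_server_setup` is a hypothetical helper, and it takes the server path as an argument (copy the ServerPath value from your config) rather than reading the config file itself.

```python
import shutil
from pathlib import Path
from typing import List


def check_server_setup(server_path: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    jar = Path(server_path)
    if not jar.is_file():
        problems.append(f"server file not found: {jar}")
    elif jar.suffix.lower() != ".jar":
        problems.append(f"server file is not a .jar: {jar}")
    # shutil.which searches PATH the same way the shell would.
    if shutil.which("java") is None:
        problems.append("java not found on PATH")
    return problems
```

An empty result from `check_server_setup("path/to/server.jar")` suggests the server file and Java are both reachable. It does not confirm that the server itself starts correctly.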
Client Command Issues¶
If your client fails to run:
Test your client command manually first
Ensure all dependencies are installed
Check that the client command in your config is correct
Review the client logs for specific error messages
Performance Tips¶
Selecting Test Levels¶
Start with a small set of representative levels
Include levels of varying difficulty
Consider creating a separate “quick test” folder for rapid iteration
Use the full level set for final benchmarks
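One way to build a "quick test" folder is to copy an evenly spaced sample of the full level set. The sketch below is a rough starting point, not a masbench feature: `build_quick_set` and `pick_every_nth` are hypothetical helpers, the `.lvl` extension is taken from the example results, and sampling every n-th file by sorted name is only a proxy for "varying difficulty" that works if your file names roughly follow difficulty.

```python
import shutil
from pathlib import Path
from typing import List


def pick_every_nth(items: List[Path], step: int) -> List[Path]:
    """Sort the paths and keep every step-th one, starting with the first."""
    if step < 1:
        raise ValueError("step must be at least 1")
    return sorted(items)[::step]


def build_quick_set(levels_dir: Path, quick_dir: Path, step: int = 5) -> List[Path]:
    """Copy a sample of *.lvl files into quick_dir and return the copied paths."""
    quick_dir.mkdir(parents=True, exist_ok=True)
    chosen = pick_every_nth(list(levels_dir.glob("*.lvl")), step)
    return [Path(shutil.copy2(level, quick_dir / level.name)) for level in chosen]
```

After building the folder, point the levels directory in your config at it for quick iterations, and switch back to the full set for final benchmarks.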
Next Steps¶
Once you have benchmark results:
Analyze the CSV data to identify performance patterns
Compare different algorithm implementations
Use the comparison tools to visualize differences
See also
For comparing benchmark results, see the Comparison guide
For initial setup, see the Getting Started guide