Model Evaluator for OpenClaw — Review, Trust Score & Install Guide

Short answer: Model Evaluator is a verified OpenClaw skill in the AI & LLMs category. Its Trust Score is 92/100, based on source transparency, permission scope, install safety, update recency, community signal, and documentation quality.

Trust Score: 92/100

What Model Evaluator does

Model Evaluator provides a comprehensive framework for evaluating AI model performance. Run standardized benchmarks, create custom evaluation suites, compare models head-to-head with statistical significance testing, and track quality over time. Supports automated grading with rubrics, human preference collection, and regression detection.
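To make "statistical significance testing" concrete, here is a minimal sketch of one common way to compare two models head-to-head: a paired bootstrap over per-example scores. This is an illustration only, not the skill's own implementation; the function name `paired_bootstrap`, its parameters, and the sample scores are all assumptions made for the example.

```python
import random
from statistics import mean

def paired_bootstrap(scores_a, scores_b, n_resamples=10_000, seed=0):
    """Hypothetical helper: compare two models scored on the same examples.

    Returns (observed mean difference A - B, fraction of bootstrap
    resamples in which A does not beat B). A small fraction suggests
    the advantage of A is unlikely to be resampling noise.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must be paired, one entry per example")
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = mean(diffs)
    n = len(diffs)
    not_better = 0
    for _ in range(n_resamples):
        # Resample the per-example differences with replacement.
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        if mean(sample) <= 0:
            not_better += 1
    return observed, not_better / n_resamples

# Made-up scores for two models on the same three prompts.
observed, p_not_better = paired_bootstrap([0.9, 0.8, 0.7], [0.5, 0.6, 0.4])
print(f"mean difference: {observed:.3f}, share of resamples where A <= B: {p_not_better}")
```

The test is paired because both models are scored on the same examples, which removes variation in example difficulty from the comparison. With only a handful of examples the estimate is coarse, so a real evaluation would want a much larger set.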

How to install Model Evaluator

  1. Install the clawhub CLI: `npm install -g clawhub@latest`
  2. Install this skill: `npx clawhub@latest install model-evaluator`
  3. Verify the install: `openclaw skills list`

Security review

This skill is currently classified as Verified with a low risk profile. Our reviewers inspected the SKILL.md manifest, dependency tree, declared permissions, network calls, and shell commands before publishing this score. See our editorial policy and Trust Score methodology for the full rubric.

Best for

Avoid if

Alternatives & related skills

Frequently asked questions

What metrics does it track?

It tracks accuracy, fluency, relevance, factuality, latency, cost, and custom rubric scores, all configurable per use case.
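One way to picture "configurable per use case" is a weighted average over per-metric scores. The sketch below is an assumption about how such a combination could work, not the skill's documented configuration format; `weighted_score`, the metric names, and the weights are hypothetical. It assumes every metric has already been normalized to the range 0 to 1, which latency and cost would need before they could be mixed in.

```python
def weighted_score(metrics, weights):
    """Hypothetical helper: combine normalized metric scores into one number.

    metrics: mapping of metric name -> score in [0, 1]
    weights: mapping of metric name -> non-negative weight
    """
    missing = set(weights) - set(metrics)
    if missing:
        raise KeyError(f"no score recorded for: {sorted(missing)}")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    # Weighted average, so the result stays in [0, 1].
    return sum(metrics[name] * w for name, w in weights.items()) / total_weight

# Example: a use case that counts accuracy twice as heavily as the rest.
result = weighted_score(
    {"accuracy": 0.9, "fluency": 0.8, "factuality": 0.7},
    {"accuracy": 2.0, "fluency": 1.0, "factuality": 1.0},
)
print(round(result, 3))  # 0.825
```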

Can it evaluate subjective quality?

Yes. It uses an LLM-as-judge approach with configurable rubrics and also supports human preference annotation.
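For readers unfamiliar with the pattern, the following is a rough sketch of LLM-as-judge grading against a rubric. It does not show Model Evaluator's actual interface: `RUBRIC`, `build_judge_prompt`, `judge_answer`, and the `judge` callable are all hypothetical, and the judge here is a stub standing in for a real model call.

```python
import re
from typing import Callable, Dict

# Hypothetical rubric: score -> description of what earns that score.
RUBRIC: Dict[int, str] = {
    1: "Incorrect or off-topic.",
    3: "Partly correct, with gaps or minor errors.",
    5: "Correct, complete, and clearly explained.",
}

def build_judge_prompt(question: str, answer: str, rubric: Dict[int, str]) -> str:
    """Assemble the grading instructions sent to the judge model."""
    rubric_lines = "\n".join(f"{score}: {desc}" for score, desc in sorted(rubric.items()))
    return (
        "Grade the answer using this rubric and reply with the score only.\n"
        f"{rubric_lines}\n\nQuestion: {question}\nAnswer: {answer}\nScore:"
    )

def judge_answer(question: str, answer: str, rubric: Dict[int, str],
                 judge: Callable[[str], str]) -> int:
    """Ask the judge for a score and check that it is one the rubric defines."""
    reply = judge(build_judge_prompt(question, answer, rubric))
    match = re.search(r"\d+", reply)  # take the first integer in the reply
    if match is None:
        raise ValueError(f"judge reply contained no score: {reply!r}")
    score = int(match.group())
    if score not in rubric:
        raise ValueError(f"score {score} is not defined in the rubric")
    return score

# Stub judge for illustration; a real setup would call a model here.
def stub_judge(prompt: str) -> str:
    return "Score: 5"

score = judge_answer("What is 2 + 2?", "4", RUBRIC, stub_judge)
print(score)  # 5
```

Passing the judge in as a callable keeps the rubric handling and score validation testable without a network call, and makes it easy to swap judge models.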

How do I install Model Evaluator?

Run `npx clawhub@latest install model-evaluator` from any directory on a machine with Claude Code or OpenClaw installed. The skill is added to your local SKILL.md registry and is available to your agent immediately, with no restart required.