evaluation - Concepts
Explore concepts tagged with "evaluation"
Total concepts: 17
Concepts
- Heuristic Evaluation - A usability inspection method where experts evaluate an interface against established design principles to identify potential usability problems.
- Summative Assessment - Evaluation at the end of a learning period to measure what has been learned.
- Go/No-Go Decision - A binary decision point where a project, deal, or action is either approved to proceed or stopped.
- Prompt Adherence - The degree to which a large language model follows the instructions, constraints, and formatting specified in a prompt.
- Nielsen's 10 Usability Heuristics - Ten general principles for interaction design developed by Jakob Nielsen, used as guidelines for evaluating user interface usability.
- AI Benchmarks - Standardized tests and evaluation suites used to measure and compare AI model capabilities across tasks.
- Usability - The degree to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency, and satisfaction.
- Distinction Bias - The tendency to view options as more dissimilar when evaluating them simultaneously than when evaluating them separately.
- Steerability - The ability to control and direct an AI model's behavior, tone, style, and output characteristics through instructions and configuration.
- Perplexity - A measurement of how well a language model predicts text, computed as the exponential of the average negative log-likelihood per token; lower values mean the model assigned higher probability to the text it was evaluated on.
- Guerrilla Usability Testing - Steve Krug's low-budget, do-it-yourself approach to usability testing: a few users, once a month, on whatever you've got — valuing frequency and simplicity over scientific rigor.
- Decision Audit - A structured retrospective review of past decisions to evaluate the quality of the decision process and extract lessons for future decision-making.
- AI Red Teaming - Systematic adversarial testing of AI systems to discover vulnerabilities, biases, and failure modes before deployment.
- Pros and Cons - A simple decision-making technique that involves listing the advantages and disadvantages of each option to clarify thinking and facilitate comparison.
- Litmus Test - A decisive test or criterion used to quickly evaluate whether something meets a key threshold or standard.
- Decision Quality - A framework for evaluating decisions based on the quality of the process used rather than the outcome achieved, recognizing that good decisions can have bad outcomes and vice versa.
- Trunk Test - A usability test by Steve Krug that checks whether a user dropped onto any page of a site can instantly identify where they are, what the site is, and what they can do there.
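
The Perplexity entry above can be made concrete with a small calculation. The following is a minimal sketch, not a reference implementation: it assumes you already have the probability the model assigned to each actual token, and the function name `perplexity` and the sample probabilities are invented for illustration.

```python
import math


def perplexity(token_probs):
    """Return the perplexity of a sequence.

    token_probs: probabilities (each in (0, 1]) that a model assigned to
    the tokens that actually occurred. Hypothetical input format.
    """
    if not token_probs:
        raise ValueError("need at least one token probability")
    # Average negative log-likelihood per token (natural log).
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    # Exponentiating turns the average back into an "effective number of choices".
    return math.exp(avg_nll)


# Illustrative values only: a model that puts high probability on the
# observed tokens scores lower than one that spreads its probability thin.
confident = perplexity([0.9, 0.8, 0.95])
uncertain = perplexity([0.2, 0.1, 0.25])
print(confident < uncertain)
```

One way to read the result: a model that assigns every token a probability of 0.25 has a perplexity of 4, as if it were choosing uniformly among four options at each step.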