alignment - Concepts
Explore concepts tagged with "alignment"
Total concepts: 11
Concepts
- Constitutional AI - An AI training method that uses a set of principles (a constitution) to guide model behavior and self-improvement.
- Shared Understanding - Common knowledge, perspectives, and mental models that enable effective team collaboration.
- Direct Preference Optimization - A simplified alternative to RLHF that fine-tunes language models directly on human preference data without training a separate reward model.
- Reinforcement Learning from Human Feedback (RLHF) - A training technique that aligns LLM outputs with human preferences by using human feedback to guide model behavior.
- Instruction Tuning - A fine-tuning technique that trains language models to follow natural language instructions by learning from examples of instruction-response pairs.
- Reward Model - A neural network trained to predict human preferences, used to provide a scalar reward signal for optimizing language model behavior in RLHF.
- Strategic Alignment - The process of ensuring that an organization's structure, resources, and activities are consistently directed toward achieving its mission and vision.
- Team Charter - A document defining a team's purpose, goals, roles, and operating principles.
- Reward Hacking - A failure mode in reinforcement learning where an agent exploits flaws in the reward function to achieve high reward without fulfilling the intended objective.
- Shared Vision - A collectively held picture of the future that members of a group genuinely want to create together, generating intrinsic commitment rather than mere compliance.
- Sales Enablement - The practice of equipping sales teams with the content, tools, knowledge, and processes they need to effectively engage buyers and close deals.
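Several of the concepts above (RLHF, Reward Model, Direct Preference Optimization) revolve around learning from human preference pairs. As a minimal illustration of how DPO fine-tunes directly on preferences without a separate reward model, here is a sketch of the DPO loss for a single preference pair; the log-probability values and the `beta` default are hypothetical, chosen only to show the mechanics.

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one preference pair.

    Inputs are log-probabilities of the chosen and rejected responses
    under the policy being trained (pi_*) and a frozen reference model (ref_*).
    """
    # Implicit reward margin: how much more the policy favors the chosen
    # response over the rejected one, relative to the reference model.
    margin = (pi_chosen - ref_chosen) - (pi_rejected - ref_rejected)
    # Negative log-sigmoid of the scaled margin (Bradley-Terry likelihood).
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# Hypothetical log-probabilities: a policy that prefers the chosen
# response yields a lower loss than one that prefers the rejected response.
low = dpo_loss(pi_chosen=-4.0, pi_rejected=-9.0, ref_chosen=-6.0, ref_rejected=-6.0)
high = dpo_loss(pi_chosen=-9.0, pi_rejected=-4.0, ref_chosen=-6.0, ref_rejected=-6.0)
```

In practice the loss is averaged over a batch of preference pairs and minimized with gradient descent on the policy's parameters, while the reference model stays frozen; `beta` controls how far the policy may drift from the reference.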