| Term | Description |
| --- | --- |
| AI Recommendation | A feature that suggests optimal settings for model compression or fine-tuning based on user requirements and model characteristics. |
| API | Application Programming Interface; a set of protocols and tools for building software applications. |
| App Design | A no-code/low-code platform within LLMOps for creating LLM-powered workflows and applications. |
| Bias Check | A validator in the Monitoring feature that checks LLM outputs for language biased against specific demographic groups. |
| Compression | The process of reducing the size of an LLM while maintaining its performance capabilities. |
| Data Source Manager | A component in LLMOps for managing and organizing datasets used for fine-tuning. |
| Fine-Tuning | The process of adapting a pre-trained language model to specific tasks or domains using additional training data. |
| Gibberish Text | A validator that checks whether LLM-generated text is coherent and meaningful rather than nonsensical. |
| Governance Configuration | Settings and rules applied to ensure LLM usage complies with organizational policies and regulations. |
| LLM | Large Language Model; an AI model trained on vast amounts of text data to understand and generate human-like text. |
| LLM Tracing | A feature that records and displays detailed information about each LLM interaction, including prompts, responses, and performance metrics. |
| LoRA | Low-Rank Adaptation; a fine-tuning technique that adapts only a small set of parameters, reducing computational requirements. |
| Model Quantization | A technique to reduce the precision of model weights, decreasing model size and potentially improving inference speed. |
| Monitoring Dashboard | A visual interface displaying real-time insights into LLM usage, performance, and trends. |
| PII | Personally Identifiable Information; data that could potentially identify a specific individual. |
| Prompt | The input text given to an LLM to elicit a specific type of response or completion. |
| QLoRA | Quantized LoRA; a combination of quantization and LoRA techniques for efficient fine-tuning. |
| RAG | Retrieval-Augmented Generation; a technique that combines information retrieval with text generation to produce more accurate and informative responses. |
| Response Time | The time taken by an LLM to generate a response to a given prompt. |
| Saliency Check | A validator that ensures an LLM-generated summary covers the main topics present in the source document. |
| Sensitive Topic | A validator that checks whether the input or output touches on potentially sensitive or controversial subjects. |
| Token | The basic unit of text that an LLM processes. A token can be a word, part of a word, or a single character, depending on the model's tokenization scheme. |
| Validation Flow | A customizable sequence of checks in the Monitoring feature to ensure LLM outputs meet safety, relevance, and quality standards. |
| Vector Store | A database optimized for storing and retrieving high-dimensional vectors, often used in RAG systems for efficient similarity search. |
| Wiki Provenance | A validator that checks if LLM-generated text contains hallucinations by comparing it with relevant Wikipedia information. |
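
To make the Model Quantization entry above concrete, here is a minimal sketch of symmetric int8 quantization: each float weight is mapped to an 8-bit integer plus a shared scale factor, so storage drops from 32 bits to roughly 8 bits per weight at the cost of small rounding error. This is an illustrative toy, not any specific library's implementation.

```python
def quantize_int8(weights):
    """Map float weights to int8 values plus a shared scale factor.

    Symmetric scheme: the largest absolute weight maps to +/-127.
    """
    scale = max(abs(w) for w in weights) / 127 or 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from int8 values."""
    return [q * scale for q in quantized]


weights = [0.12, -0.50, 0.33, 0.99]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
# approx is close to weights, but each value now fits in a single byte.
```

Real quantization schemes (including the one underlying QLoRA) add refinements such as per-channel scales and 4-bit formats, but the size/precision trade-off is the same as in this sketch.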
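
The Vector Store and RAG entries can likewise be illustrated with a toy similarity search: documents are stored as embedding vectors, and retrieval returns the document whose vector is closest to the query by cosine similarity. The two-dimensional vectors and document names here are made up for illustration; production systems use high-dimensional embeddings and approximate nearest-neighbor indexes.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy "vector store": (document, embedding) pairs with hypothetical 2-D embeddings.
store = [
    ("doc_a", [1.0, 0.0]),
    ("doc_b", [0.7, 0.7]),
    ("doc_c", [0.0, 1.0]),
]

query_embedding = [1.0, 0.1]

# Retrieval step of RAG: pick the stored document most similar to the query.
best_doc, _ = max(store, key=lambda item: cosine_similarity(query_embedding, item[1]))
# best_doc would then be passed to the LLM as context for generation.
```

The retrieved text is appended to the prompt so the model can ground its answer in it, which is what makes the generation "retrieval-augmented."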