- Add model reference resolution for LLM, Question Classifier, and Parameter Extractor nodes.
- Support parsing various model reference formats, including dictionaries, UUID strings, and name strings, when `model_id` is present.
- Add warning logs for cases where model resolution fails.
- Extract duplicate model binding logic into `_get_model_by_name_or_fallback`.
- Implement logic to prioritize workspace default configuration, falling back to the tenant's first available model if not found.
- Simplify binding code for embedding, rerank, and LLM models.
- Add logic to detect tenant plan QPS limits and return a specific error message when triggered.
- Simplify boolean check in model activation quota validation.
Corrected variable reference in get_end_user_connected_config log statement. The previous code referenced app.workspace_id which could be incorrect or undefined in this context.
- Add logic to detect tenant plan QPS limits and return a specific error message when triggered.
- Simplify boolean check in model activation quota validation.
- Fix issue where tenant ID lookup from shared records failed to query the workspace correctly.
- Switch QPS counting from sliding window to simple counter to improve performance and simplify logic.
- Remove unnecessary `time` module import.
- Replace direct memory agent service calls with unified MemoryService in read endpoint
- Update query preprocessor to use new prompt format and return structured queries
- Enhance MemorySearchResult model with filtering, merging, and ID tracking capabilities
- Add intermediate outputs display for problem split, perceptual retrieval, and search results
- Fix parameter alignment and remove unused history parameter in memory agent service
- Streamline rate limit check flow by removing redundant tenant-level QPS checks.
- Restrict checks to API Key QPS and plan degradation protection only.
- Update constant naming and error message handling for consistency.
Change Body(...) to Body(None) for the message parameter which is never
used directly (request body is read via request.json() instead).
The required marker caused unnecessary 422 validation errors.
- Refactor rate limiting mechanism to limit per API Key instead of per tenant (workspace).
- Update error code logic and Redis key naming conventions.
- Adjust quota usage statistics to display the QPS of the API Key closest to its limit.
- Add `API_KEY_RATE_LIMIT_EXCEEDED` error code.
- Refactor `QuotaExceededError` to support resource type localization.
- Optimize rate limiting service by implementing the sliding window algorithm.
- Add rate limit validation for tenant plans.
- Unify quota check decorator to support both synchronous and asynchronous operations.
- Enhance quota usage statistics endpoints.