-
Notifications
You must be signed in to change notification settings - Fork 569
fix(litellm): fix gen_ai.request.messages to be as expected
#5255
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: constantinius/fix/redact-message-parts-type-blob
Are you sure you want to change the base?
Conversation
| return { | ||
| "type": "blob", | ||
| "modality": "image", | ||
| "mime_type": item["image_url"]["url"].split(";base64,")[0], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Incorrect mime_type includes data: prefix
The mime_type extraction is incorrect. Using split(";base64,")[0] on a URL like "data:image/jpeg;base64,..." returns "data:image/jpeg" instead of "image/jpeg" as documented in the function's docstring. The "data:" prefix needs to be removed to get the proper MIME type.
| for message in messages: | ||
| content = message.get("content") | ||
| if isinstance(content, list): | ||
| message["content"] = [_map_item(item) for item in content] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Function mutates original kwargs messages corrupting API request
The _convert_message_parts function modifies message dictionaries in-place rather than creating copies. Since messages comes directly from kwargs.get("messages", []), this mutates the original request data that LiteLLM will use for the actual API call to the LLM provider, potentially corrupting the request. A deep copy of the messages should be made before modification.
| "type": "blob", | ||
| "modality": "image", | ||
| "mime_type": item["image_url"]["url"].split(";base64,")[0], | ||
| "content": item["image_url"]["url"].split(";base64,")[1], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: IndexError when data URL lacks base64 encoding marker
The code assumes all data URLs contain ";base64," after checking only that they start with "data:". Data URLs can omit base64 encoding (e.g., "data:image/jpeg,rawdata"). In such cases, split(";base64,") returns a single-element list, and accessing [1] will raise an IndexError.
| return { | ||
| "type": "uri", | ||
| "uri": item["image_url"]["url"], | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Else branch crashes when image_url is None or incomplete
The code on line 78 safely retrieves image_url = item.get("image_url") or {} to handle missing or None values, but the else branch on line 89 directly accesses item["image_url"]["url"] instead of using the safe image_url variable. When item["image_url"] is None, empty, or lacks a "url" key, the safe check on line 79 evaluates to False, routing to this else branch which then crashes with TypeError or KeyError.
| "content": item["image_url"]["url"].split(";base64,")[1], | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: The data URI parsing in _convert_message_parts is unsafe. It can cause an IndexError with malformed URIs and incorrectly extracts the mime_type with the "data:" prefix.
Severity: HIGH | Confidence: High
🔍 Detailed Analysis
In the _convert_message_parts function, a data URI from item["image_url"]["url"] is parsed by splitting on ";base64,". This implementation has two flaws. First, if the URI is malformed and does not contain ";base64,", accessing the second element of the split result will raise an IndexError, crashing the callback as it is not wrapped in a try-except block. Second, the first element of the split is assigned to mime_type, which incorrectly includes the "data:" prefix (e.g., "data:image/png" instead of "image/png"), contradicting docstrings and other tests in the codebase.
💡 Suggested Fix
Implement robust parsing for the data URI. First, validate that the url string contains ";base64,". If not, handle the error gracefully. If it is valid, extract the MIME type by taking the part before ";base64," and removing the "data:" prefix. Consider using a regular expression for more reliable parsing.
🤖 Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: sentry_sdk/integrations/litellm.py#L84-L85
Potential issue: In the `_convert_message_parts` function, a data URI from
`item["image_url"]["url"]` is parsed by splitting on ";base64,". This implementation has
two flaws. First, if the URI is malformed and does not contain ";base64,", accessing the
second element of the split result will raise an `IndexError`, crashing the callback as
it is not wrapped in a `try-except` block. Second, the first element of the split is
assigned to `mime_type`, which incorrectly includes the "data:" prefix (e.g.,
"data:image/png" instead of "image/png"), contradicting docstrings and other tests in
the codebase.
Did we get this right? 👍 / 👎 to inform future reviews.
Reference ID: 7657057
| else: | ||
| return { | ||
| "type": "uri", | ||
| "uri": item["image_url"]["url"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Unsafe direct access bypasses null-safe variable check
Line 78 creates a null-safe image_url variable using item.get("image_url") or {}, but lines 83, 84, and 89 bypass this by directly accessing item["image_url"]["url"]. If item["image_url"] is None, missing, or an empty dict, the code enters the else branch (since empty string doesn't start with data:), and line 89 crashes with TypeError or KeyError. The code should use the safe image_url variable instead of re-accessing item["image_url"].
Issues
Closes https://linear.app/getsentry/issue/TET-1635/redact-images-litellm