Skip to content

PoC for lightweight analysis for OpenAI#4802

Draft
bradlarsen wants to merge 3 commits intomainfrom
lightweight-analyze-poc
Draft

PoC for lightweight analysis for OpenAI#4802
bradlarsen wants to merge 3 commits intomainfrom
lightweight-analyze-poc

Conversation

@bradlarsen
Copy link
Contributor

@bradlarsen bradlarsen commented Mar 9, 2026

This draft PR is a PoC for what "lightweight analysis" could look like.


Note

Medium Risk
Changes detector verification flows for OpenAI and DigitalOcean to always read/close HTTP response bodies and attach richer ExtraData, which could affect verification behavior and memory/logging characteristics. Risk is moderate because it touches live API calls and parsing in token verification paths.

Overview
Adds a new lightweight_analyze helper package to support lightweight analysis by copying/closing HTTP response bodies and standardizing common ExtraData fields (lwa.*).

Updates the OpenAI and DigitalOceanV2 detectors’ verification code to use a logger-aware context, always preserve the raw API response in ExtraData (lwa.response), and (when parsing succeeds) augment results with standard identity fields (id/name/email); DigitalOcean verification now also switches to typed response structs and returns ExtraData from both access- and refresh-token verification paths.

Written by Cursor Bugbot for commit ac8f950. This will update automatically on new commits. Configure here.

@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ bradlarsen
❌ lukem-ts
You have signed the CLA already but the status is still pending? Let us recheck it.

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

var resData accountResponse
if err = json.Unmarshal(resBody, &resData); err != nil {
ctx.Logger().Error(err, "failed to parse response")
return false, extraData, err
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parsing failure incorrectly invalidates verified access tokens

Medium Severity

The old verifyAccessToken returned verified=true unconditionally on HTTP 200, since the status code alone confirms the token is valid. The new code adds JSON body parsing and returns verified=false with an error if parsing fails. This means a valid token could be incorrectly reported as unverified due to a transient body-read issue or an API response format change — even though the HTTP 200 already confirmed validity. The LWA data extraction failure here is incorrectly treated as a verification failure.

Fix in Cursor Fix in Web

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lukem-ts cursor is correct in its explanation here. The changes here for lightweight analyze do change the detector behavior: if a response body doesn't parse, the finding would be considered live (i.e. verified) before, but now the error would be propagated, making it an "indeterminate" finding.

It's a question of judgment about how error handling should be done in the detectors. For now, let's keep existing detector behavior the same as it was before lightweight analyze, and only log parsing errors (like you've done here).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some reason GitHub is not showing the "add a suggestion" UI buttons, so I can't write a patch here directly, so here goes:

You probably want to change line 171 from this:

			return false, extraData, err

to this in order to preserve the old behavior:

			return true, extraData, nil

Email: &resData.Info.Email,
})

return true, extraData, nil, resData.AccessToken
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed access_token validation in refresh token verification

Low Severity

The old code explicitly validated that access_token existed in the OAuth response and was a string, returning verified=false with an error if missing. The new code unconditionally returns verified=true with resData.AccessToken, which defaults to an empty string if the field is absent from the JSON. This means a malformed 200 response without an access_token would result in verified=true with an empty newAccessToken being set in AnalysisInfo.

Fix in Cursor Fix in Web

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it looks like cursor is correct here also: if the API response doesn't include an access_token, instead of indeterminate status, the finding will get verified status?

var resData accountResponse
if err = json.Unmarshal(resBody, &resData); err != nil {
ctx.Logger().Error(err, "failed to parse response")
return false, extraData, err
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lukem-ts cursor is correct in its explanation here. The changes here for lightweight analyze do change the detector behavior: if a response body doesn't parse, the finding would be considered live (i.e. verified) before, but now the error would be propagated, making it an "indeterminate" finding.

It's a question of judgment about how error handling should be done in the detectors. For now, let's keep existing detector behavior the same as it was before lightweight analyze, and only log parsing errors (like you've done here).

var resData accountResponse
if err = json.Unmarshal(resBody, &resData); err != nil {
ctx.Logger().Error(err, "failed to parse response")
return false, extraData, err
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some reason GitHub is not showing the "add a suggestion" UI buttons, so I can't write a patch here directly, so here goes:

You probably want to change line 171 from this:

			return false, extraData, err

to this in order to preserve the old behavior:

			return true, extraData, nil

Comment on lines +127 to +131
lwa.AugmentExtraData(extraData, lwa.Fields{
ID: &resData.Info.UUID,
Name: &resData.Info.Name,
Email: &resData.Info.Email,
})
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to confirm: there is no appropriate field in the digitalocean API response that we could use for the URL field?

Email: &resData.Info.Email,
})

return true, extraData, nil, resData.AccessToken
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it looks like cursor is correct here also: if the API response doesn't include an access_token, instead of indeterminate status, the finding will get verified status?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants