Skip to content

Conversation

@hmac
Copy link
Contributor

@hmac hmac commented Nov 3, 2022

This recognises barriers of the form

STRINGS = ["foo", "bar"]

case foo
when "some string literal"
  foo
when *["other", "strings"]
  foo
when *STRINGS
  foo
end

where the reads of foo inside each when are guarded by the comparison
of foo with the string literals.

We don't yet recognise this construct:

case foo
when "foo", "bar"
  foo
end

This is due to a limitation in the shared barrier guard logic.

@hmac hmac mentioned this pull request Nov 3, 2022
@hmac hmac marked this pull request as ready for review November 4, 2022 00:49
@hmac hmac requested a review from a team as a code owner November 4, 2022 00:49
@calumgrant calumgrant requested a review from aibaars November 7, 2022 09:29

final override SplatExpr getExpr() { result = super.getExpr() }

final ExprCfgNode getOperand() { result.getExpr() = e.(SplatExpr).getOperand() }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a class SplatExprChildMapping and change this to

Suggested change
final ExprCfgNode getOperand() { result.getExpr() = e.(SplatExpr).getOperand() }
final ExprCfgNode getLhs() { e.hasCfgChild(e.getOperand(), this, result) }

Copy link
Contributor Author

@hmac hmac Nov 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, but why (the extra class)?

Copy link
Contributor Author

@hmac hmac Nov 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well regardless of why, I think that's fixed the issue below with SplatExprCfgNode.getAnOperand() not giving results for array literals. Still, in the absence of any documentation comments, it would be nice to know why.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There may be multiple CFG nodes for the same AST node due to control flow graph splitting. The e.hasCfgChild predicate picks the CfgNode that matches the AstNode child node and is connected to the current CfgNode.

branch = true and
exists(CfgNodes::ExprNodes::CaseExprCfgNode case |
case.getValue() = testedNode and
exists(CfgNodes::ExprNodes::WhenClauseCfgNode branchNode |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about InClauses?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't handle in clauses. It looks much harder to be sure that an expression is guarded by an in pattern, because the patterns can be complex. I guess we'd have to check that there are no new variable bindings in the pattern or something. Anyway, I think that's something to tackle in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've pushed a commit that handles very simple in clauses where the pattern is a single string literal, such as

case foo
in "foo"
  foo
end

) {
branch = true and
exists(CfgNodes::ExprNodes::CaseExprCfgNode case |
case.getValue() = testedNode and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also handle the case where getValue does not exist?

case
  when foo == "foo" then ...
end

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird, I didn't know you could do this. This appears to already be covered by our existing StringConstCompareBarrier. I'll add a test for it.

pattern instanceof ExprNodes::StringLiteralCfgNode
or
// array literals behave weirdly in the CFG so we need to drop down to the AST level for this bit
// specifically: `SplatExprCfgNode.getOperand()` does not return results for array literals
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit surprised by this comment. It could be that there is a bug in the CFG library, but other than that getOperand should return an array literal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your suggested change for SplatExprCfgNode appears to have fixed this. I've pushed an update.

hmac added 6 commits November 9, 2022 15:03
This recognises barriers of the form

    STRINGS = ["foo", "bar"]

    case foo
    when "some string literal"
      foo
    when *["other", "strings"]
      foo
    when *STRINGS
      foo
    end

where the reads of `foo` inside each `when` are guarded by the comparison
of `foo` with the string literals.

We don't yet recognise this construct:

    case foo
    when "foo", "bar"
      foo
    end

This is due to a limitation in the shared barrier guard logic.
foo
end

if foo == "foo" or foo == "bar" # not recognised
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be fixed by #11153

@aibaars
Copy link
Contributor

aibaars commented Nov 9, 2022

@hmac I trivially fixed all merge conflicts in the test cases through the GitHub UI. You probably need to refresh the expected output, though.

hmac added 3 commits November 10, 2022 16:38
Recognise if statements with conditionals made up or logical `and` or
`or` clauses as barrier guards.
The `when` node now acts as a join point for patterns in the when
clause, with match/no-match completions. This is similar to how `or`
expressions work.

The result of this is that the `when` clause "controls" the body of the
`when`, which allows us to model barrier guards for multi-pattern when
clauses.

For this code

case x
when 1, 2
  y
end

The old CFG was

x --> when --> 1 --no-match--> 2 ---no-match---> case
                \               \                  ^
                  \               \                |
                   \                --match----+   |
                     \                         |   |
                       \                       |   |
                         ------match---------> y --+

The new CFG is

x --> 1 --no-match--> 2 --no-match--> [no-match] when --no-match--> case
       \               \                                             ^
         \               \                                           |
           \               --match--> [match] when --match--> y -----+
             \                       /
               \                   /
                 -------match-----

i.e. all patterns flow to the `when` node, which is split based on
whether the pattern matched or not. The body of the when clause then has
a single predecessor `[match] when`, which acts as condition block that
controls `y`.
The `or` case ran extremely slowly before this change. Also exclude
string interpolations from consideration, for correctness, and add some
more tests.
@hmac hmac force-pushed the case-barrier-guard-3 branch 2 times, most recently from a2fe67c to f7d17d6 Compare November 11, 2022 05:29
@hmac hmac force-pushed the case-barrier-guard-3 branch from f7d17d6 to 2b4217b Compare November 11, 2022 05:42
@hmac
Copy link
Contributor Author

hmac commented Nov 11, 2022

I've updated this with quite a lot more stuff:

  • We now recognise guards of the form if x and y, if x or y
  • We recognise guards from multi-pattern when clauses, such as when "a", "b", "c"

To support multi-pattern when clauses I reworked the CFG for when clauses so that control flows from patterns to the when with the correct match/no-match edge. This is a bit weird, because it means that the when controlflow tree is neither post-order nor pre-order - instead, the last node is sometimes the when (if there's no match) and sometimes the body of the when (if there's a match).

This means we get one edge to the body instead of one per pattern, and makes it easy to define a barrier guard for when clauses. The downside is it is a bit weird, and if there's a nicer way to do it I would love to know.

Another result of this is a strange failure in one of the dataflow tests (see CI). This seems to imply that dataflow through case expressions is no longer working correctly, but I haven't found any obvious problems apart from that test failure, so I don't know what's going on there.

@hvitved
Copy link
Contributor

hvitved commented Nov 17, 2022

Another result of this is a strange failure in one of the dataflow tests (see CI). This seems to imply that dataflow through case expressions is no longer working correctly, but I haven't found any obvious problems apart from that test failure, so I don't know what's going on there.

I have pushed some CFG fixes that resolve this.

hmac added 2 commits November 18, 2022 18:14
The existing barrier guard machinery recognises guards such as `if x and y`,
so there's no need to explicitly model them.
@hmac
Copy link
Contributor Author

hmac commented Nov 21, 2022

@hvitved is there anything remaining to change on this PR or do you think it's ready to go?

}

/** A control-flow node that wraps a `SplatExpr` AST expression. */
class SplatExprCfgNode extends ExprCfgNode {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May as well extend UnaryOperationCfgNode.


private class WhenTree extends PreOrderTree, WhenClause {
private class WhenTree extends ControlFlowTree, WhenClause {
final override predicate propagatesAbnormal(AstNode child) { child = this.getAPattern() }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should add getBody() here as well.

or
last(this.getBody(), last, c)
last(this.getBody(), last, c) and
c instanceof NormalCompletion
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed.

* ```
*/
private predicate stringConstCompareOr(
CfgNodes::ExprNodes::BinaryOperationCfgNode guard, Ssa::Definition def, boolean branch
Copy link
Contributor

@hvitved hvitved Nov 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

def is not used by the caller, so can be removed from the parameter list.

exists(CfgNodes::ExprNodes::CaseExprCfgNode case |
case.getValue() = testedNode and
(
exists(CfgNodes::ExprNodes::WhenClauseCfgNode branchNode |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer guard = any(CfgNodes::ExprNodes::WhenClauseCfgNode branchNode |

// when "foo", "bar"
pattern instanceof ExprNodes::StringLiteralCfgNode
or
exists(CfgNodes::ExprNodes::SplatExprCfgNode splat | splat = pattern |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, I prefer pattern = any(CfgNodes::ExprNodes::SplatExprCfgNode splat |

* x == "foo" or x == "bar"
* ```
*/
private predicate stringConstCompareOr(

Check warning

Code scanning / CodeQL

Missing QLDoc for parameter

The QLDoc has no documentation for branch, but the QLDoc mentions or
c.(ConditionalCompletion).getValue() = false
or
last(this.getBody(), last, c) and
c instanceof NormalCompletion
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry; I was not clear: What I meant was that the c instanceof NormalCompletion should be removed; we still need last(this.getBody(), last, c) (I wonder why the CFG tests don't fail).

Copy link
Contributor Author

@hmac hmac Nov 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked this and thought it was OK to remove because there's a clause in the succ predicate for case trees that I thought might cover it. On my phone so I can't link easily, but L431 of this file on main.

Happy to re-add this bit I deleted though. I'll do that tomorrow.

Copy link
Contributor

@hvitved hvitved Nov 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, it works because of the (somewhat non-compositional)

last(this.getABranch().(WhenClause).getBody(), pred, c)

I still think we should add this back, though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll also see if we can fix up that line to something like last(this.getABranch(), pred, c)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing ^ caused some problems, so I've just left it as-is for now. The change in WhenTree is fixed though.

@hmac hmac force-pushed the case-barrier-guard-3 branch from 3e974fe to 2822c94 Compare November 24, 2022 20:13
@hmac hmac merged commit 375403f into github:main Nov 29, 2022
@hmac hmac deleted the case-barrier-guard-3 branch November 29, 2022 22:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants