Woozle Hypertwin

@Girgias

Thanks for your input on this!

I don't know if the answer to this question is impossibly complicated, but what I don't understand is why type-coercion has to take place for comparison operators when both sides of the comparison are the same type.

TLDR: I think what is happening here is that the == operator has a clear preference for numeric coercion -- if a string can be interpreted as a properly-formatted number, it will do that. Only if a string cannot be interpreted that way will it reluctantly do a string comparison:

echo ('0e11' == '0e22');
// ^ shows "1"
echo ('0e11' == '0e2a');
// ^ shows nothing
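
A minimal sketch of why those two results differ, using the built-in is_numeric() as a stand-in for the engine's "is this a numeric string?" check (the loop and the extra 'true' entry are only for illustration):

```php
<?php
// Which of the example strings does PHP consider numeric?
foreach (['0e11', '0e22', '0e2a', 'true'] as $s) {
    echo $s, ' => ', var_export(is_numeric($s), true), "\n";
}

// Both numeric strings are scientific notation for zero (0 x 10^n),
// so they convert to the same float.
var_dump((float)'0e11'); // float(0)
var_dump((float)'0e22'); // float(0)
```

Only '0e11' and '0e22' pass the check, which is consistent with == comparing them as numbers and comparing '0e11' with '0e2a' as strings.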

Also, this is clearly happening at the point of the comparison, and not before:

$x='0e11';
$y='0e22';
var_dump($x);
// string(4) "0e11"
var_dump($y);
// string(4) "0e22"
echo $x == $y;
// 1

The same rule appears to be in effect for !=, >, and <.
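
A short sketch of the ordering case, with values picked purely for illustration; strcmp() is used here as the reference for plain byte-wise string comparison:

```php
<?php
// '10' and '9' are both numeric strings, so < compares 10 with 9.
var_dump('10' < '9');            // bool(false)

// strcmp() always compares byte by byte; '1' sorts before '9'.
var_dump(strcmp('10', '9') < 0); // bool(true)

// '9a' is not a numeric string, so < falls back to string comparison.
var_dump('10' < '9a');           // bool(true)
```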

Also, this only happens for numerically-formatted contents; the strings 'true' and 'false' are not coerced to boolean values.

Editorializing

(The following is intended not so much as arguing with you as hoping to arm you with more information to present to those who might influence the decision-making process.)

There may be valid mental model reasons for doing this, but I think they really need to be spelled out clearly in the manual somewhere and mentioned whenever affected operators are discussed.

To me, right now, it makes no sense. The comparison is between two things that are unambiguously the same type, so there is no reason to coerce either of them to anything else.

There's not even any reason to be attempting to parse the contents of either string (to see if either one might be intended to represent a number).

A string should be a string unless it can't be. I'd rather see string-to-number comparisons generate an error than have this behavior.

The Manual Is Wrong?

This page heavily implies that coercion only takes place if there's a type-mismatch:

If both operands are numeric strings, or one operand is a number and the other one is a numeric string, then the comparison is done numerically.

A warning example on that page also clearly shows that while 0 == "a" used to coerce "a" to a number before the comparison (in PHP 7), PHP 8 no longer does. To my mind, this implies that in the problematic example which started this discussion, the coercion is taking place via a different mechanism, to wit:

Because the string contents can be interpreted as a number, the '==' operator inexplicably decides to convert it.

It's not even following the usual string-to-int cast rule, under which a non-numeric string such as 'a' becomes 0:

echo ((int)'a' == (int)'b');
// ^ shows "1"
echo ('a' == 'b');
// ^ shows nothing
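
For reference, a small sketch of what the explicit casts produce. The int casts explain why the first comparison above is true; the bool casts are included only for contrast, since that is where "empty is false, anything else is true" (with '0' as the exception) applies:

```php
<?php
// Explicit int casts: a string with no leading digits becomes 0.
var_dump((int)'a');     // int(0)
var_dump((int)'b');     // int(0)
var_dump((int)'');      // int(0)
var_dump((int)'12abc'); // int(12), leading digits are kept

// Explicit bool casts follow a different rule.
var_dump((bool)'');     // bool(false)
var_dump((bool)'a');    // bool(true)
var_dump((bool)'0');    // bool(false)
```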

Even when both values have been parsed as strings and put into variables marked as type "string", these operators are peering inside the contents to see if they can be interpreted differently.

How does this make sense?

cc: @IceWolf
@cptwtf @ramsey
@Crell

5 comments
Paul Reinheimer

@woozle @Girgias @IceWolf @cptwtf @ramsey @Crell

My guess is that a lot of choices were made when ~everything came in as a string over GET or POST. PHP tried really hard to do what people probably wanted. Register Globals filled the world with strings!

Woozle Hypertwin

@preinheimer @Girgias @IceWolf @cptwtf @ramsey @Crell

Well, PHP has remedied many other decisions that later turned out to be problematic -- I'd say this one is definitely ready to take its turn ^.^

Larry Garfield

@woozle @preinheimer @Girgias @IceWolf @cptwtf @ramsey It's been a long effort to get this far. :-) And every change has someone whining that we've broken their app because it relies on silly implicit type conversions. Caution is needed as we go.

Gina Peter Banyard

@preinheimer @woozle @IceWolf @cptwtf @ramsey @Crell yes, and frankly, considering this premise, comparing two strings numerically makes _more_ sense than forcefully casting the string to an int, as it did prior to PHP 8.

But more importantly, `==` vs `===` is a nothing burger. The behaviour of `==` is important because this is the behaviour used with the ordering operators (e.g. `<=`).

And saying "eh just use `===`" kinda misses the point of why it is a problem.

Woozle Hypertwin

@Girgias @preinheimer @IceWolf @cptwtf @ramsey @Crell

What doesn't make sense to me is that the operator would ever try to do type-conversion when it's comparing the same type on both sides, and where that type already has comparison rules (even for < and >).

Note that when the string-contents are not formatted as a number, all four operators just do what you'd expect (i.e. use string-comparison rules).

That means it must be first attempting to parse the string as a number, and only if that fails does it use string rules.
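
That two-step behaviour can be modelled roughly as follows. This is a simplified sketch and not the engine's actual code: looseStringEquals is a hypothetical name, is_numeric() stands in for the internal numeric-string check, and edge cases such as integers too large to represent exactly are ignored.

```php
<?php
// Simplified model of == applied to two strings (hypothetical helper).
function looseStringEquals(string $a, string $b): bool
{
    // Step 1: if both strings look numeric, compare their numeric values.
    if (is_numeric($a) && is_numeric($b)) {
        return ($a + 0) == ($b + 0);
    }
    // Step 2: otherwise compare them as plain strings.
    return $a === $b;
}

var_dump(looseStringEquals('0e11', '0e22')); // bool(true)
var_dump(looseStringEquals('0e11', '0e2a')); // bool(false)
var_dump(looseStringEquals('1.0', '1'));     // bool(true)
var_dump(looseStringEquals('a', 'b'));       // bool(false)
```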

This seems wasteful of CPU, however trivially, in addition to the unexpected behavior.

...and given that this can cause false-positive matches specifically when looking at the output of a hash-function, I have to wonder if it doesn't also represent a security vulnerability.
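
A minimal sketch of that concern. The two values below are hypothetical digest-shaped strings chosen only for illustration (they are not claimed to be the output of any particular hash function); hash_equals() is PHP's built-in string comparison intended for digests.

```php
<?php
// Hypothetical 32-character strings: "0e" followed only by digits.
$stored   = '0e462097431906509019562988736854';
$supplied = '0e830400451993494058024219903391';

// Both are numeric strings meaning zero, so == reports a match.
var_dump($stored == $supplied);            // bool(true)

// Strict comparison and hash_equals() compare the actual characters.
var_dump($stored === $supplied);           // bool(false)
var_dump(hash_equals($stored, $supplied)); // bool(false)
```

This only avoids the problem for equality checks; it does nothing for the ordering operators.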

Language opinions aside, the manual should at least clarify the point that all 4 operators will first try to compare two strings as numeric before falling back to what you'd expect -- and should probably note conditions under which two differing strings can be evaluated as equal (there are others, though probably more obvious/expected).
