Dirty Jobs: Debugging Till the Last Minute by@offcode

Dirty Jobs: Debugging Till the Last Minute

image
Adam Schmideg HackerNoon profile picture

Adam Schmideg

PHP/Laravel is as sexy as sandals with socks. It is the staple food of a million mediocre web programmers. That was my verdict with zero prior experience with it. I was not vocal about my opinion, though, because I did not have to touch PHP code with my bare hands. Until Workwize hired me. As a lead developer.

You would expect a lead developer to be the most knowledgeable among the devs. Lead by example, recruiters say. Be the alpha male or female of the pack, they unconsciously mean.

I had quite some catching up to do. Database seeding, factories, the whole Eloquent ORM. But reading the docs does not generate code, the ultimate measure of our progression. I picked a task, a simple one to warm up my backend muscles: Display the total amount spent by the customer.

An endpoint already took care of it but it returned a value based on an outdated calculation. The pricing model changed since it got more sophisticated. All I had to do, I thought, was update the calculation. The database looks decent. Enough to replace the query, with a few lines of code.

Debugging

I ran the tests, made a minimal change, and ran the tests again. It is like saying hello to a codebase I have never seen before. Two-hundred something tests in twenty seconds. Ok, we can be friends. Let’s focus on my task and run testTotalSpent() only. Nine seconds. What the? Tried again. (Do you have also this habit of not believing your eyes and trying the same thing again, fingers crossed?) Nine seconds. Annoying. But I set out to solve a different problem, I realized.

My first Eloquent query felt like “look ma, no hands.” It had some syntactic errors, though. I left out a few parens accidentally. And found a few peculiarities of the database. The customers.customer_id refers to … no, not to itself (what would that mean anyway?) but to users.id. Some references are even reversed which is fine as long it is only a one-to-one relation. One day, they will come back and shoot you in the foot. Some fields are remnants of an optimistic design decision back in the old days, like products.price. They are not used anymore. Probably.

No one had the courage to remove them and rely on the new prices table only. The code does some gymnastics to handle these cases. I seriously considered refactoring the whole pricing and product model. A long-time friend talked me out of it. One of the reasons I call him a friend.

I tamed the query to the point it hurried from table to pivot to table, it followed all the connections, aggregated the expenses, and returned–nothing. I checked the database. Really empty. I was wrong and the query was right. One of the humiliating aspects of programming. But I remembered the test setUp called a couple of model factories. Where are those records? Gone with the wind?

How stupid I am! Each test needs a clean slate, so records inserted earlier will not mess with my currently executing function. This is implemented by running each test in a transaction and rolling it back as soon as the test has finished its walk of life, no matter if it passed or failed. (Testing has a spiritual element to it.) To run each test in isolation makes sense. But can we make an exception this time? Commit the transaction and keep the data that was created, so I can inspect it.

This insight did not come to me for free. I had to run the test so many times to come to this conclusion and every time the lag, those nine seconds I mentioned earlier. You can imagine my level of frustration: I complained about it to a junior colleague. And to the right person, it turned out. He knew what kept the test runner busy. It performed a spring cleaning (truncate tables, migrate data, dust off the shelves) before even listening to my commands.

Now I had two reasons to stop the obsessed lady sweeping in front of my database and to turn off her cleaning mania. To speed up running a test and to finally get my hands on the data I created. Two reasons, perfectly justifiable from a technical and productivity point of view. And I had a third one I preferred to keep to myself.

Looking good for the team

At this point in time, I was already two weeks into fixing the bug. A trivial two-line change, my ass. A junior helped me out with a question I could have solved myself. Nobody challenged me about what I had spent my time on and my plate was full with other fish to fry. I still felt pressure to have something to show for my title, lead developer. Once I cannot fix the bug in a day (quite probable, unfortunately), I should bring something else to the table. To speed up test execution for all my teammates sounded like a perfect first contribution.

My junior colleague already had a simple solution. He commented out a line in the test case, the line responsible for all the cleanup magic that says use RefreshTable. He just had to pay attention and never commit this change. If he accidentally did, he could fix it in the next commit. Not very elegant. I can do better.

Four commits later, it worked on my machine. I dug elbow-deep into phpunit configuration. Finally, the tables stood naked in front of me with all their data exposed. It was easy to see now what I had got wrong. The OrderItem table contained so many fields that they did not fit all on my screen. I scrolled to the right to see what to be had, then I simply ignored what I did not understand. The fields are displayed not alphabetically but in the order of creation. Some important fields were added later, like price whose value gets copied from the Prices table at the time the order is placed.

I typed a neat Eloquent query, six lines altogether, some joins, some wheres, and a groupBy. Let me reproduce it here to keep it for eternity

OrderItem::query
  ->join('orders', 'orders.customer_id', '=', 'customer.id')
  ->join('order_items', 'order_items.order_id', '=', 'orders.id')
  ->where('order_items.created_at', '>=', $startDate)
  ->where('customer.id', $customer->id)
  ->sum('order_item.price')
  ->groupBy('order_item.type');

I captured the essence of the data structure. This is the moment when the symphonic orchestra starts to play in a movie. The main character slowly lifts his hand. And hits Enter on the keyboard. Cut to his face. You see anticipation. The first signs of victory. Then something goes awry. His face melts into disbelief. Cut to the screen. The cursor blinks. Data is printed. totalSpent: [Buy => 3]. Obviously wrong. No “Rent”, only “Buy”? And a month worth of spending cannot be three.

In SQL we trust, in Eloquent we do not. The query builder has a toSql() method. It is too smart (or too dumb, depending on how you look at it) and returns a question mark for the parameters. This is not a reason to despair, getBindings() will give you the parameters you want. A few lines of PHP wizardry from Stackoverflow merge these two into a SQL statement.

$query = str_replace(array('?'), array('\'%s\''), $builder->toSql());
$query = vsprintf($query, $builder->getBindings());

This method is not flawless, though. I tried to execute the generated query. Mysql complained about the last line, about the group-by clause. Nobrainer. I deleted the line. After all, I wanted to see the records selected. Grouping them is secondary at this point. The bastard did not run out of complaints. About the select clause this time. Understandable, without a group-by, it makes no sense to aggregate the prices. Delete that line too. Replace it with a simple select *, it never hurts. It does not execute, either, since no table is specified. Ok, select order_items.*. Now shut up and show me what you have.

The records, oh my god, the records I so badly wanted to see got listed on the screen. I am basically done, I thought, the algorithm works. The only thing left is to convince Eloquent to do what my raw SQL can already do.

Eloquent probably originated behind the walls of Hogwarts. Sometimes I even thought Valdemort himself had coded its first version. I could get the records properly on the Eloquent side. But when I tried to get the sum of them, it returned only the first. The total price of items bought was properly calculated. But it failed to give me how much was rented or leased.

Here is my situation back then in numbers

number_of_commits_ahead_of_main: 8
number_of_commits_behind_main: 32
days_passed_fixing_the_damn_thing: 19
pressure_that_i_should_show_something: 1.8 bar

I decided to start afresh. A new branch, a new life. Instead of the select() method of my first attempt, I used

->selectRaw('order_items.type, sum(order_items.price * order_items.quantity)'

Yes, I realized I had forgotten to multiply the price by the quantity. This error remained hidden because my test data contained only one of each item. My previous version worked only for a quantity of one.

This new version produced correct results and Eloquent returned all the objects. Victory. Case closed, branch merged to main. I did not feel the satisfaction I deserved. It is not enough to fix a bug, you need a lesson you learned. You need a car chase and finally, the criminal caught. One question bugged me still. Why did Eloquent return only one record?

I know only a part of the answer. When I wrote these lines

OrderItem::select()
  ->groupBy('order_item.type')
  ->sum('order_item.price')

(ignore the issue with quantity), it actually made no sense from a SQL perspective. Here is what it says in plain English. Select all fields of the order items, group the records by their type, and sum their prices. It says nothing about what to do with the other fields? Group by creation date too? No. Aggregate them? Definitely not. Take the first, the last, the median?

This is exactly what Mysql complained about. How many times do I have to learn the ancient lesson?! Read the whole error message and understand what it says.

How come Eloquent managed to run that query and return the first record? Why did it not give me a SQL error? This is the part I cannot answer. I suspect Eloquent performs some black magic. In certain edge cases, it produces a different raw SQL than what it actually runs. But I have no evidence for it.


Epilogue

As I was writing this story, I wanted to paste here the actual error message. Which I have not read until two minutes ago. Here it is.

Error Code: 1055. Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'order_items.id' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by

There is this sql_mode parameter that seems to be set to only_full_group_by by default. But look at this snippet from my config/database.php

'connections' => [
  'mysql' => [
    'strict' => false,
  ]
]

It instructs Mysql to behave a lot more relaxed. That includes skipping checking the group-by clause. Eloquent was not at fault here. As I said earlier: read the error message.


This article was first published here.


PHP/Laravel is as sexy as sandals with socks. It is the staple food of a million mediocre web programmers. That was my verdict with zero prior experience with it. I was not vocal about my opinion, though, because I did not have to touch PHP code with my bare hands. Until Workwize hired me. As a lead developer.

You would expect a lead developer to be the most knowledgeable among the devs. Lead by example, recruiters say. Be the alpha male or female of the pack, they unconsciously mean.

I had quite some catching up to do. Database seeding, factories, the whole Eloquent ORM. But reading the docs does not generate code, the ultimate measure of our progression. I picked a task, a simple one to warm up my backend muscles: Display the total amount spent by the customer.

An endpoint already took care of it but it returned a value based on an outdated calculation. The pricing model changed since it got more sophisticated. All I had to do, I thought, was update the calculation. The database looks decent. Enough to replace the query, with a few lines of code.

Debugging

I ran the tests, made a minimal change, and ran the tests again. It is like saying hello to a codebase I have never seen before. Two-hundred something tests in twenty seconds. Ok, we can be friends. Let’s focus on my task and run testTotalSpent() only. Nine seconds. What the? Tried again. (Do you have also this habit of not believing your eyes and trying the same thing again, fingers crossed?) Nine seconds. Annoying. But I set out to solve a different problem, I realized.

My first Eloquent query felt like “look ma, no hands.” It had some syntactic errors, though. I left out a few parens accidentally. And found a few peculiarities of the database. The customers.customer_id refers to … no, not to itself (what would that mean anyway?) but to users.id. Some references are even reversed which is fine as long it is only a one-to-one relation. One day, they will come back and shoot you in the foot. Some fields are remnants of an optimistic design decision back in the old days, like products.price. They are not used anymore. Probably.

No one had the courage to remove them and rely on the new prices table only. The code does some gymnastics to handle these cases. I seriously considered refactoring the whole pricing and product model. A long-time friend talked me out of it. One of the reasons I call him a friend.

I tamed the query to the point it hurried from table to pivot to table, it followed all the connections, aggregated the expenses, and returned–nothing. I checked the database. Really empty. I was wrong and the query was right. One of the humiliating aspects of programming. But I remembered the test setUp called a couple of model factories. Where are those records? Gone with the wind?

How stupid I am! Each test needs a clean slate, so records inserted earlier will not mess with my currently executing function. This is implemented by running each test in a transaction and rolling it back as soon as the test has finished its walk of life, no matter if it passed or failed. (Testing has a spiritual element to it.) To run each test in isolation makes sense. But can we make an exception this time? Commit the transaction and keep the data that was created, so I can inspect it.

This insight did not come to me for free. I had to run the test so many times to come to this conclusion and every time the lag, those nine seconds I mentioned earlier. You can imagine my level of frustration: I complained about it to a junior colleague. And to the right person, it turned out. He knew what kept the test runner busy. It performed a spring cleaning (truncate tables, migrate data, dust off the shelves) before even listening to my commands.

Now I had two reasons to stop the obsessed lady sweeping in front of my database and to turn off her cleaning mania. To speed up running a test and to finally get my hands on the data I created. Two reasons, perfectly justifiable from a technical and productivity point of view. And I had a third one I preferred to keep to myself.

Looking good for the team

At this point in time, I was already two weeks into fixing the bug. A trivial two-line change, my ass. A junior helped me out with a question I could have solved myself. Nobody challenged me about what I had spent my time on and my plate was full with other fish to fry. I still felt pressure to have something to show for my title, lead developer. Once I cannot fix the bug in a day (quite probable, unfortunately), I should bring something else to the table. To speed up test execution for all my teammates sounded like a perfect first contribution.

My junior colleague already had a simple solution. He commented out a line in the test case, the line responsible for all the cleanup magic that says use RefreshTable. He just had to pay attention and never commit this change. If he accidentally did, he could fix it in the next commit. Not very elegant. I can do better.

Four commits later, it worked on my machine. I dug elbow-deep into phpunit configuration. Finally, the tables stood naked in front of me with all their data exposed. It was easy to see now what I had got wrong. The OrderItem table contained so many fields that they did not fit all on my screen. I scrolled to the right to see what to be had, then I simply ignored what I did not understand. The fields are displayed not alphabetically but in the order of creation. Some important fields were added later, like price whose value gets copied from the Prices table at the time the order is placed.

I typed a neat Eloquent query, six lines altogether, some joins, some wheres, and a groupBy. Let me reproduce it here to keep it for eternity

OrderItem::query
  ->join('orders', 'orders.customer_id', '=', 'customer.id')
  ->join('order_items', 'order_items.order_id', '=', 'orders.id')
  ->where('order_items.created_at', '>=', $startDate)
  ->where('customer.id', $customer->id)
  ->sum('order_item.price')
  ->groupBy('order_item.type');

I captured the essence of the data structure. This is the moment when the symphonic orchestra starts to play in a movie. The main character slowly lifts his hand. And hits Enter on the keyboard. Cut to his face. You see anticipation. The first signs of victory. Then something goes awry. His face melts into disbelief. Cut to the screen. The cursor blinks. Data is printed. totalSpent: [Buy => 3]. Obviously wrong. No “Rent”, only “Buy”? And a month worth of spending cannot be three.

In SQL we trust, in Eloquent we do not. The query builder has a toSql() method. It is too smart (or too dumb, depending on how you look at it) and returns a question mark for the parameters. This is not a reason to despair, getBindings() will give you the parameters you want. A few lines of PHP wizardry from Stackoverflow merge these two into a SQL statement.

$query = str_replace(array('?'), array('\'%s\''), $builder->toSql());
$query = vsprintf($query, $builder->getBindings());

This method is not flawless, though. I tried to execute the generated query. Mysql complained about the last line, about the group-by clause. Nobrainer. I deleted the line. After all, I wanted to see the records selected. Grouping them is secondary at this point. The bastard did not run out of complaints. About the select clause this time. Understandable, without a group-by, it makes no sense to aggregate the prices. Delete that line too. Replace it with a simple select *, it never hurts. It does not execute, either, since no table is specified. Ok, select order_items.*. Now shut up and show me what you have.

The records, oh my god, the records I so badly wanted to see got listed on the screen. I am basically done, I thought, the algorithm works. The only thing left is to convince Eloquent to do what my raw SQL can already do.

Eloquent probably originated behind the walls of Hogwarts. Sometimes I even thought Valdemort himself had coded its first version. I could get the records properly on the Eloquent side. But when I tried to get the sum of them, it returned only the first. The total price of items bought was properly calculated. But it failed to give me how much was rented or leased.

Here is my situation back then in numbers

number_of_commits_ahead_of_main: 8
number_of_commits_behind_main: 32
days_passed_fixing_the_damn_thing: 19
pressure_that_i_should_show_something: 1.8 bar

I decided to start afresh. A new branch, a new life. Instead of the select() method of my first attempt, I used

->selectRaw('order_items.type, sum(order_items.price * order_items.quantity)'

Yes, I realized I had forgotten to multiply the price by the quantity. This error remained hidden because my test data contained only one of each item. My previous version worked only for a quantity of one.

This new version produced correct results and Eloquent returned all the objects. Victory. Case closed, branch merged to main. I did not feel the satisfaction I deserved. It is not enough to fix a bug, you need a lesson you learned. You need a car chase and finally, the criminal caught. One question bugged me still. Why did Eloquent return only one record?

I know only a part of the answer. When I wrote these lines

OrderItem::select()
  ->groupBy('order_item.type')
  ->sum('order_item.price')

(ignore the issue with quantity), it actually made no sense from a SQL perspective. Here is what it says in plain English. Select all fields of the order items, group the records by their type, and sum their prices. It says nothing about what to do with the other fields? Group by creation date too? No. Aggregate them? Definitely not. Take the first, the last, the median?

This is exactly what Mysql complained about. How many times do I have to learn the ancient lesson?! Read the whole error message and understand what it says.

How come Eloquent managed to run that query and return the first record? Why did it not give me a SQL error? This is the part I cannot answer. I suspect Eloquent performs some black magic. In certain edge cases, it produces a different raw SQL than what it actually runs. But I have no evidence for it.


Epilogue

As I was writing this story, I wanted to paste here the actual error message. Which I have not read until two minutes ago. Here it is.

Error Code: 1055. Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'order_items.id' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by

There is this sql_mode parameter that seems to be set to only_full_group_by by default. But look at this snippet from my config/database.php

'connections' => [
  'mysql' => [
    'strict' => false,
  ]
]

It instructs Mysql to behave a lot more relaxed. That includes skipping checking the group-by clause. Eloquent was not at fault here. As I said earlier: read the error message.


This article was first published here.

Comments

Signup or Login to Join the Discussion

Tags

Related Stories