The work is truly never finished. Software is art and as an artist I’m never satisfied with the result nor am I satisfied with the experience on behalf of my users.
Just add a button there.
Just hook it up to the API.
Just upload it somewhere.
Using „just“ assumes that the task at hand requires no thought, and all that’s left is to implement it. But every change to a system can bring its own problems – for users, for maintainability, or for performance. Stop and consider whether it even needs to be implemented at all, or whether there isn’t some more elegant way to solve it.
Teach your clients, your bosses, and your colleagues that nothing is that simple, and that everything requires you to put in more than just hammering at the keyboard while writing code.
I’ll walk you through several areas you should focus on if you’re not happy with the number of bugs that make it to your production server. My intention isn’t to publish yet another „avoid duplication“ or „write tests“ article – there are plenty of those – but to highlight practices that aren’t yet so deeply ingrained. Probably also because you can prevent bugs not only with technical skills but with communication skills as well.
First let me clarify what I even consider a bug. I don’t want to dive into an unpopular academic exercise about the most precise possible definition, but I like the claim that a bug is any deviation in the application’s behavior from what the user expects. That covers not only obvious application crashes and 500 errors, but also incorrect data operations and their presentation, inconsistencies, unintuitive interfaces, and the forming of a wrong mental model in the user’s head. [1]
Requirements and feedback
A lot of bugs are born while there isn’t a single line of code written yet. You shouldn’t start programming based on an ambiguous or incomplete specification. Requesting its clarification or correction is cheap; reworking a finished feature or an entire application is expensive, time-consuming, and no fun for anyone.
Even when the specification is fine, you need to keep reassuring yourself that you understand it correctly and that you’re sticking to it. If you’re developing a larger whole, I recommend keeping the client informed about its state and shape not just at the end, but throughout. How often depends on the form of cooperation. For internal development, by all means several times a week; in a vendor-client relationship, at least twice a month.
Some changes, by their nature, allow deployment to production even when they aren’t completely finished yet. If, for example, you’re reworking the application’s data model so it can accommodate some new functionality whose user interface you’ll develop in a later phase, deploy even such an „invisible“ change to production right away. The goal is to get feedback as quickly as possible – you can immediately observe the impact on server load, you’ll catch any problems with integrating it into the rest of the system early, and you’ll warn the client about the threat of an inevitable delay. Similarly, you can deploy and test, say, data gathering weeks or months before anyone sees that data in the application. [2]
Deadlines
- Client: „How long will it take you to develop X?“
- Developer: „Four to five weeks.“
- Client: „But we need it in two!“
How do you usually react in a situation like that? Do you resign yourself to having to work sixteen hours a day instead of eight? Do you promise the client it’ll be done in two weeks, but once they’re up, tell them it’ll take another three?
Just as nine women can’t deliver a baby in one month, just as twice the number of developers won’t cut the development time in half, the estimates you give can’t be magically circumvented either. A rushed product will contain more bugs.
The right solution is to agree on whether it’s more important to meet the requested deadline or the scope of the specification, and to come up with some compromise – typically a shorter development of a trimmed-down version. The important thing is to stand by the fact that the estimate for the original scope is non-negotiable.
Code reviews
An application written by a single programmer will never be as high-quality as one developed by more people. But you only benefit from a larger team if you do code reviews. All code, except for trivial bugfixes, should be seen by at least one developer besides the author before it’s merged into the main branch and deployed.
When examining a colleague’s code, I first familiarize myself with the specification of their task and figure out how I would probably approach solving it and everything I’d have to do. I consider the appropriateness of the chosen data structures, the architecture, and the names of classes, methods, and variables. The code must make it clear what it does. I then discuss the differences between the imaginary and the actual solution with the author – sometimes we find that mine wasn’t thought through, sometimes that theirs wasn’t.
You need to look at every piece of code very critically and think about what combination of input data could break it. But the ego has to be set aside – both sides must realize that it’s the code under the microscope, not its author. The reviewer therefore must not attack the author, and the author must not take criticism of their own code personally. Sometimes it’s hard to say goodbye to that method we put so much care into. But if it doesn’t fit into the broader context, it has to go.
Thorough testing should be part of the review as well.
Version control
This isn’t just backing up. I consider version control an integral part of the code; I personally spend roughly a third of my development time on it. Used correctly, a version control tool will make team collaboration easier and help you find the sources of bugs. You should learn to use yours perfectly.
Make small atomic commits. That is, ones that concern only a single task and don’t break anything. The goal is to increase the readability of diffs for review and to make it easier to revert those changes if needed. [3]
If you use Git, learn to use the reset and rebase commands (including its interactive variant). Rewriting history is important for preventing merge hell, and it also nudges you toward committing often, so you don’t accidentally lose your work. Since frequent commits don’t always reconcile with atomicity and stability, you can use the aforementioned commands to clean up the history right before submitting it, so it meets those criteria. [4]
Stable commits let you use the bisect command. You’ll use it the moment you have some bug that you know wasn’t in the system at some point, but you can’t track down where it arises. Using binary search, git bisect finds the first commit in which it manifests. But if you have commits in your history in which the application is significantly broken, this search is made harder.
Automation
If any frequently performed operation consists of several non-trivial steps, it’s very prone to human error. That’s why operations like deploying the application, running migrations, compiling JavaScript and CSS, clearing caches, and so on should take the form of a single command. You can use build tools like Phing or Make for this. Besides gaining resilience to errors, you’ll also save time.
And once the application is bootable from scratch by invoking a single command, nothing stands in the way of checking it automatically on a so-called continuous integration server such as Travis or Jenkins. These can run static analysis, automated tests, or coding standard compliance checks on every commit in the repository. If you write tests but haven’t set up their automatic execution and notifications when they fail, it’s as if you didn’t have them at all. Continuous integration isn’t meant to replace code reviews, but to complement them. That way, during reviews you can focus on substantive problems and not bother with code formatting, whose correctness can be verified automatically.
Among the most fatal bugs I count the so-called gulf of execution and gulf of evaluation. These terms from HCI theory express the two most common sources of user frustration: „I know what I want to do, but I don’t know how“ and „I did something, but I don’t know whether what I wanted actually happened.“ ↩︎
Crawling other people’s websites, downloading data from an API, or importing e-mails. ↩︎
The
git revertcommand, which takes the hash of the commit to revert as a parameter, creates a new commit that is the exact opposite. So the lines the original commit added will be removed, and the removed lines added back. That’s why it pays off to make small commits – larger ones usually can’t be reverted this easily and you have to „cherry-pick“ the changes by hand. ↩︎„With Git, the trick is to think of history less as ‚history‘ and more as a step-by-step description of your codebase.“ – Max Howell ↩︎
Avery Pennarun, a Google employee, on the not-so-simple job of working with the smartest programmers on the planet:
Logic is a pretty powerful tool, but it only works if you give it good input. As the famous computer science maxim says, „garbage in, garbage out.“ If you know all the constraints and weights - with perfect precision - then you can use logic to find the perfect answer. But when you don’t, which is always, there’s a pretty good chance your logic will lead you very, very far astray.
Overtime may bring short-term benefits, but it doesn’t work in the long run and can even be harmful.
The feeling of finishing tons of work in a short period and depriving oneself of quality personal time can be addicting, especially when it results in „saving the day“ for a project. Rolling up your sleeves and cranking to the end of a deadline makes you feel valuable in a very concrete way. Without your overtime, the project doesn’t get done on time. With it, the project is saved. It’s hard to find such black and white ways to add value in daily „normal“ work.
To broaden my horizons, I replaced Nette and Texy on my site with Jekyll and Markdown. Instead of dynamic PHP, it’s now fully statically generated.
As a bonus, you can also follow it on GitHub.
*if you know how to use it and configure it correctly
A certain bad habit is spreading around here. From time to time someone shows up with the claim that Doctrine is slow and therefore unusable on real projects. This misconception stems from several myths that Doctrine 2 is shrouded in.
As I wrote three and a half years ago, the main motivation for using an ORM is representing data from the database in the application through consistent objects. Objects that are not generic universal hashmaps with unknown contents, but objects that have a fixed, well-defined interface and therefore I know what data I can get from them and how to manipulate it. So I don’t want the ORM to save me from writing SQL queries, and I don’t even want to be shielded from the specific database I’m using. I can therefore afford to optimize data retrieval very much like with other „lightweight“ libraries which, according to the malicious tongues linked above, are no longer slow.
So if I want a high-performance application, I don’t leave most SQL queries to be generated by Doctrine, but write them myself using DQL, which is a related language that Doctrine uses. Instead of tables of columns, in it you reference entity classes and attributes. It’s easily extensible, which comes in handy if you need to use some feature of your database that DQL doesn’t support out of the box.
In case I want to select data from the database for reading only (typically for output in a template), there’s no need to transfer whole objects. In that case I list the columns I’m interested in within DQL, and Doctrine returns the query result to me in plain arrays. Thanks to DQL you can also avoid the 1+N problem (lazy loading inside a loop). However, above a certain traffic level you can’t afford to keep reaching for fresh data from the database with any tool, and you have to deploy application-level caching either way.
Doctrine’s opponents often argue that it’s a behemoth. I’m not sure what exactly they mean by that. They probably don’t like that its source is spread across many files, and they find the approach of mPDF far more appealing. In any case, though, the size of the library has no impact on performance if you enable and correctly configure OpCache, which significantly eases PHP’s workload.
Furthermore, in production you should disable automatic generation of proxy classes:
$config->setAutoGenerateProxyClasses(FALSE);
Otherwise Doctrine will generate a proxy for every entity on every request, which is, yes, slow.[1]
You should also set up a cache for metadata (entity configuration) and compiled DQL queries. See the list of all available drivers.
$cache = new \Doctrine\Common\Cache\ApcCache();
$config->setMetadataCacheImpl($cache);
$config->setQueryCacheImpl($cache);
You can verify the correctness of these settings with the CLI command orm:ensure-production-settings.
The worst idea you can get is to write your own ORM. It’s an incredibly complex and demanding area. The problems you’ll be solving were solved long ago by Doctrine’s authors. But if you have two years to develop your own ORM full-time and think you can do it better than they did, I won’t stop you.
Proxy classes serve for lazy loading of associated data – Doctrine generates code that, when an unloaded property is accessed, issues a query to the database. ↩︎
A brilliantly funny read, though only the most initiated will appreciate it.
1842 – Ada Lovelace writes the first program. She is hampered in her efforts by the minor inconvenience that she doesn’t have any actual computers to run her code. Enterprise architects will later relearn her techniques in order to program in UML.
A fascinating series describing several stories from life at both small and large companies. Although most of us don’t take the obstacles and problems it describes all that seriously, almost everyone has encountered some variation of them.
Benedict Evans on how general-purpose tools like Office are gradually being replaced by single-purpose but efficient apps, which in turn reduces the need for classic computers with a keyboard and mouse:
When I worked at Orange there was a multi-megabyte Excel file on the network drive called, I believe, ‘sum_of_x.xls’ containing complex macros and every major operating metric for the entire company, there for anyone who needed to analyze high-level data. That should probably not, really, be in Excel today.
The same applies to Powerpoint - it’s a very good tool for that 150 slide deck, but what if you’re making a 10-slide deck each week that consists entirely of operating metrics pulled out of a back-end system, manipulated in Excel and pasted into slides, plus commentary, that are emailed to 25 people? Shouldn’t that change from a 2 hour task to a SAAS dashboard and a 30-second task? And would it still need a mouse and keyboard?
I’ve gone through several phases in my professional career. I started programming by clicking together and writing simple windowed applications in C++Builder. Then came the Internet, and I turned my attention to the web. I learned HTML and tried out a dead end in the form of Microsoft FrontPage. It served me well for a long time. Around 2005 I started learning PHP and earning a regular few thousand crowns from small gigs. At the same time, as I started studying at CTU FEE, I learned Nette, and in my second year I responded to a job ad at a company where I experienced — and still experience — the biggest leaps in my knowledge and skills.
My progress in programming over the last 15 years has always stemmed from envying other people’s creations and wanting my own. Whether it was compiled .exe files, websites, content management systems, Nette extensions — simply everything that was the center of my attention at the time.
I’ve had an iPhone for three and a half years now. Over that time I’ve developed a warm relationship with all those icons on my home screen. I follow Apple news closely and know about everything that stirs in the iOS and Mac community. Thanks to that I’ve refined a sense for quality and user experience that I try to apply when building websites too.
But that’s of course not enough for me. I needed to find out how mobile developers make my favorite apps. I installed Xcode and started learning from the Stanford lectures. I wrote a few simple apps to understand the principles being covered.
Mobile development is something completely different from the web — it’s not enough to answer a stateless HTTP request and die. An app can run for dozens of minutes and must behave correctly under any combination of user inputs. Demanding computations have to be offloaded to background threads, because the user interface has to stay smooth at all times. Everything runs much closer to the metal, and sometimes you can tell from how low-level the code is. The whole thing is a serious challenge, and I love those.
A user on a phone doesn’t have a keyboard and mouse at their disposal, so they won’t cope with the hellishly programmed forms that are common on the web; instead, all functionality has to be thought through in detail and served to them on a silver platter, so they reach their goal as easily as possible. While on the web you can get away with churning out four forms a day that only need to behave roughly like this, on iOS you can spend a month building a single screen and nobody will be surprised.
After six months of development, today I’m releasing my first app, which I developed together with Michal Langmajer — we’re really proud of it. If you’re in a muddle about which friends you owe money to and who owes you, definitely check it out.
Mobile development is so far just a hobby for me, for long winter and summer evenings, but maybe one day it’ll turn into a serious livelihood. Wish me luck!
A lifehack that actually works for reaching high productivity. I figured it out for myself long ago, but it’s great to see it spelled out in black and white.
In short: if you need to get task X done, you find a seemingly far more important task Y that you’re avoiding, and by procrastinating on Y you end up finishing X. Then, when you need to do Y, you find an even more urgent task Z to avoid.
