Clean Documentation

Fabian Wesner, CTO Spryker
25 September 2014 in Technology

Clean code is a common concept among software developers. But what about clean documentation? How are comments used properly? And what should a technical documentation contain?

When I do due diligence, I always ask for the technical documentation in advance. In most cases, I receive nothing, or just a few pieces of information that were collated quickly. Although many developers know how important good technical documentation is, they quite often have no clue how to create it. While there are proven methods for writing clean code, there is nothing comparable for clean documentation. In this post, I want to show our best practices and how we deal with the technical documentation of Yves & Zed.

First and foremost, it’s essential to be aware of the different types of technical documentation and their corresponding target groups. First, there are /** comments **/ within the code, which are used by the IDE to provide autocompletion and which may help other developers maintain the code. Secondly, there is the technical documentation, which describes the architecture and technologies and serves as a guide for new colleagues who need to learn to work with your system. And thirdly, there is the user manual, which describes how to use the software. User manuals are out of the scope of this blog post. They’re usually written by technical writers, and there is a lot of information on how to write proper user manuals anyway.

No Comments === Good Comments

In my experience, many developers aren’t sure how to deal with comments. I personally think that textual comments in code very often indicate code smells which need to be refactored. Clean code has good names and short methods, and it follows the step-down rule. This way it’s easy to read, and you can drill down into the methods if you need more details. Whenever comments are necessary to understand the logic, you should refactor the code and remove the comments afterward. When you add new textual comments to the code, keep in mind that comments are not free: they need to be updated whenever the code changes. It’s much better to write readable, clean code than to use comments to explain bad code.

The following code snippet shows a part of our order calculator. As you can see there are two loops which are described by comments:

 

public function getShippingCostsForOrder(SalesOrder $order)
{
    $shippingCosts = 0;

    // sum up order shipping costs
    foreach ($order->getExpenses() as $expense) {
        if ($expense->getType() === ExpenseConstants::EXPENSE_SHIPPING) {
            $shippingCosts += $expense->getValue();
            break;
        }
    }

    $orderItems = $this->factory->createModelOrderprocessFinder()->getOrderItemsForGrandTotalAggreate($order);

    // sum up order-item shipping costs
    foreach ($orderItems as $item) {
        foreach ($item->getExpenses() as $expense) {
            if ($expense->getType() === ExpenseConstants::EXPENSE_SHIPPING) {
                $shippingCosts += $expense->getValue();
                break;
            }
        }
    }

    return $shippingCosts;
}

 

When you look at the code, you probably read the comments first to get an idea of what happens. You may also notice that this method violates the “functions should do one thing” rule, because it adds up the costs of both the order and the order items. Therefore it makes sense to refactor it and extract methods. As you can see, the resulting method doesn’t require any comments to make the logic understandable. You can just read the code:

public function getShippingCostsForOrder(SalesOrder $order)
{
    $shippingCosts = 0;

    $shippingCosts = $this->addOrderShippingCosts($order, $shippingCosts);

    $orderItems = $this->getItemsForOrder($order);

    $shippingCosts = $this->addOrderItemsShippingCosts($orderItems, $shippingCosts);

    return $shippingCosts;
}
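The extracted helper methods are not shown in the post. The following is a minimal sketch of how they might look, assuming that order items expose getExpenses() in the same way as the order and can be passed as a plain array. The classes ExpenseContainer, Expense, SalesOrderItem and ShippingCostCalculator, the private helper getFirstShippingExpenseValue() and the value of EXPENSE_SHIPPING are hypothetical stand-ins so that the example runs on its own. The helpers are public here only so they can be tried out in isolation.

```php
<?php

// Hypothetical minimal stand-ins for the real domain classes,
// defined only so that this sketch runs on its own.
class ExpenseConstants
{
    const EXPENSE_SHIPPING = 'shipping';
}

class Expense
{
    private $type;
    private $value;

    public function __construct($type, $value)
    {
        $this->type = $type;
        $this->value = $value;
    }

    public function getType()
    {
        return $this->type;
    }

    public function getValue()
    {
        return $this->value;
    }
}

// Assumption: orders and order items both carry a list of expenses.
class ExpenseContainer
{
    private $expenses;

    public function __construct(array $expenses)
    {
        $this->expenses = $expenses;
    }

    public function getExpenses()
    {
        return $this->expenses;
    }
}

class SalesOrder extends ExpenseContainer
{
}

class SalesOrderItem extends ExpenseContainer
{
}

class ShippingCostCalculator
{
    public function addOrderShippingCosts(SalesOrder $order, $shippingCosts)
    {
        return $shippingCosts + $this->getFirstShippingExpenseValue($order->getExpenses());
    }

    public function addOrderItemsShippingCosts(array $orderItems, $shippingCosts)
    {
        foreach ($orderItems as $item) {
            $shippingCosts += $this->getFirstShippingExpenseValue($item->getExpenses());
        }

        return $shippingCosts;
    }

    // Returns the value of the first shipping expense, or 0 if there is none.
    // Stopping at the first match mirrors the `break` in the original loops.
    private function getFirstShippingExpenseValue(array $expenses)
    {
        foreach ($expenses as $expense) {
            if ($expense->getType() === ExpenseConstants::EXPENSE_SHIPPING) {
                return $expense->getValue();
            }
        }

        return 0;
    }
}
```

Extracting the shared loop into one private method also removes the duplication between the two original loops, which is one more reason the comments are no longer needed.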

 

As described above, textual comments are not useful for controllers, models, and views. Very often, they’re outdated or simply wrong. Even if you provide a public library, it makes more sense to write technical documentation and provide a set of unit tests that demonstrate how to use it.

Sometimes comments are used to explain the possible keys of an array-typed configuration parameter. Instead of comments, you should provide setter methods which are self-explanatory (e.g. setTimeout() is easier to find and use than $config['timeout']).
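As a minimal sketch of that idea, the following hypothetical ConnectionConfig class replaces a commented configuration array with setters. The class name, the second option and the default values are assumptions made for illustration only.

```php
<?php

// Hypothetical configuration object, used only to illustrate
// setters as a replacement for a commented $config array.
class ConnectionConfig
{
    private $timeout = 30;
    private $host = 'localhost';

    /**
     * @param int $seconds
     *
     * @return $this
     */
    public function setTimeout($seconds)
    {
        // A setter can validate its input, which an array key cannot.
        if (!is_int($seconds) || $seconds <= 0) {
            throw new InvalidArgumentException('Timeout must be a positive integer.');
        }
        $this->timeout = $seconds;

        return $this;
    }

    /**
     * @param string $host
     *
     * @return $this
     */
    public function setHost($host)
    {
        $this->host = $host;

        return $this;
    }

    /**
     * @return int
     */
    public function getTimeout()
    {
        return $this->timeout;
    }

    /**
     * @return string
     */
    public function getHost()
    {
        return $this->host;
    }
}

// Usage: the IDE can autocomplete both options, and a typo in a
// method name fails immediately instead of being silently ignored.
$config = new ConnectionConfig();
$config->setTimeout(5)->setHost('db.example.com');
```

With an array, a misspelled key such as $config['timout'] would go unnoticed until the default value causes trouble. With setters, the available options are visible in the class itself.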

In other cases, people use comments to name design patterns. Of course, it’s a good idea to name patterns, but it’s always better to use the class name as an indicator (e.g. CatalogFacade, VoucherFactory, BookstoreSingleton, etc.) rather than comments.

The only real use case for comments is the generated DocBlock before every class and method. Those comments are needed by the IDE to detect the variables’ types and to provide autocompletion. In addition, there are some rare cases where it makes sense to provide textual comments: if the implementation is cumbersome or unusual, it is a good idea to leave a short note explaining why.
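To make both cases concrete, here is a small sketch with hypothetical classes (Voucher and VoucherPool are invented for this example, as is the reason given in the inline comment). The DocBlocks only carry type information for the IDE, and the single textual comment explains why an unusual choice was made, not what the code does.

```php
<?php

// Hypothetical classes; only the DocBlocks and the one "why" comment matter.
class Voucher
{
    /** @var string */
    private $code;

    /**
     * @param string $code
     */
    public function __construct($code)
    {
        $this->code = $code;
    }

    /**
     * @return string
     */
    public function getCode()
    {
        return $this->code;
    }
}

class VoucherPool
{
    /** @var Voucher[] */
    private $vouchers = [];

    /**
     * @param Voucher $voucher
     *
     * @return void
     */
    public function add(Voucher $voucher)
    {
        // Keyed by the upper-cased code on purpose: customers type codes
        // in any case, so lookups have to be case-insensitive.
        $this->vouchers[strtoupper($voucher->getCode())] = $voucher;
    }

    /**
     * @param string $code
     *
     * @return Voucher|null
     */
    public function find($code)
    {
        $key = strtoupper($code);

        return isset($this->vouchers[$key]) ? $this->vouchers[$key] : null;
    }
}
```

Thanks to the `@return Voucher|null` annotation, an IDE can offer getCode() on the result of find() and warn about the possible null value.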

Clean Documentation

The technical documentation is usually written in a wiki and is addressed to developers. For Yves & Zed we use Confluence in combination with the Gliffy plugin. Our technical documentation has three purposes:

  • It is supposed to function as an entry point for new developers who need to learn about the system.
  • It serves as the architecture specification and provides orientation for all developers.
  • It is needed at the latest when there is a due diligence (e.g. for a funding round or exit) and somebody wants to check it.

So there are good reasons for writing clean documentation, but very often developers have no idea where to start and what to write. Good technical documentation doesn’t aim to describe everything. It focuses on the main topics: the overall architecture, a glossary of specific terms, tutorials for recurring workflows, and a short description of each aspect of the application. The biggest mistake people make is to document features. There is no benefit at all in trying to explain how your checkout works or which mails you send to your customers. Features are explained by the code itself, and if the code is clean you can read it easily. Feature documentation also carries a high risk of becoming outdated, because features often undergo changes.

The overall architecture is best described with diagrams. Most important is the bird’s eye view of your application:

 



In addition, you can provide different views on your system. Think in terms of layers, patterns and dependencies. You start from the overall view and drill down into the structure of the code.

 


Here you should also name the key technologies and tools (like MySQL or RabbitMQ). Usually there’s no need to explain them in detail, but you can link to the related online documentation. Especially regarding tools, you have to make sure that your information is always up to date. Instead of stating the MySQL version that was in use when the documentation was written, you can provide the CLI call ‘mysql --version’ so that everybody can obtain the current information when needed.

Especially with regard to the architecture, we use very little text in our documentation. Instead, we provide a glossary of specific terms. We give a short explanation, show relationships, and sometimes give an example.

In addition to the architecture description, we provide tutorials for frequent workflows like the installation of our developers’ VM or deployments to production servers. Here, it’s crucial to be as precise and complete as possible. It’s a good idea to provide all CLI steps, so that developers just need to copy and paste without wasting time.

The most extensive part of our documentation is about the individual aspects of the system. By aspects I simply mean ‘everything that a developer needs to know in order to work effectively’. Typical aspects are:

  • Naming and Coding Conventions (e.g. “We use PSR-2.”)
  • What are the layers of your application, how are they structured, and what exactly belongs to them (e.g. “The communication layer contains all controllers, forms, validators and grids. It must not contain any kind of business logic.”)
  • Major policies (e.g. “Every model needs a unit test.”)
  • Most important patterns (e.g. “We use factories instead of instantiation with new.”)
  • Important files in your structure and their purpose (e.g. “composer.json defines the application’s dependencies on external packages.”)
  • Frequently used tasks, like “How to add a page to the navigation”, “How to change the configuration”, or “How to change the database structure”
  • Any kind of automatic generation of files and directories (e.g. via GruntJS)

When you start writing about these aspects, you should try to keep the text as short and simple as possible. Nobody will read the documentation as if it were a book. Usually people are looking for a concrete answer, and it’s your job to deliver it as quickly and clearly as possible. It’s good practice to use diagrams whenever feasible. Good documentation doesn’t need to be changed frequently, because (hopefully) the described architecture is mostly invariant.

If you don’t know how to start, the best way is to give an ad hoc talk to your team about your system. When you stand in front of your team you’ll automatically focus on the most fundamental aspects, and you can use the audience’s questions to get a feeling for what’s important and what’s unclear. Before you start: make sure that somebody takes notes! ;)

Copyright 2014 Project A Ventures | All code in this post is licensed under the MIT License unless otherwise declared.
