土狗屋

"Clean Code" The Clean Code Way

The first seven chapters were read from a physical book, while the subsequent content is sourced from the electronic book link below:

Clean Code

HOW DO YOU WRITE FUNCTIONS LIKE THIS?#

Writing software is like any other kind of writing. When you write a paper or an article, you get your thoughts down first, then you massage it until it reads well. The first draft might be clumsy and disorganized, so you wordsmith it and restructure it and refine it until it reads the way you want it to read.

Writing code is very similar to writing other things. When writing a paper or an article, you first write down whatever comes to mind, and then you refine it. The first draft may be rough and disorganized, so you ponder and polish it until it reaches your desired form.

When I write functions, they come out long and complicated. They have lots of indenting and nested loops. They have long argument lists. The names are arbitrary, and there is duplicated code. But I also have a suite of unit tests that cover every one of those clumsy lines of code.

When I write functions, they initially tend to be lengthy and complex. There are too many indents and nested loops. The argument lists are long. The names are arbitrary, and there may be duplicated code. However, I accompany them with a suite of unit tests that cover each line of ugly code.

So then I massage and refine that code, splitting out functions, changing names, eliminating duplication. I shrink the methods and reorder them. Sometimes I break out whole classes, all the while keeping the tests passing.

Then I refine the code, breaking it into functions, changing names, and eliminating duplication. I shorten and reorder methods. Sometimes I even break apart classes, all while ensuring the tests continue to pass. Finally, following the rules outlined in this chapter, I assemble these functions.

In the end, I wind up with functions that follow the rules I’ve laid down in this chapter. I don’t write them that way to start. I don’t think anyone could.

I don’t write functions according to the rules from the beginning. I don’t think anyone can.

  • The art of programming is and always has been the art of language design.
  • The proper use of comments is to compensate for our failure to express our intentions in code. Notice that I used the word "failure." I mean it. Comments are always a failure. We must have them because we cannot always find a way to express ourselves without them, but their use is not something to celebrate.
    • P.S. So I still have to write them, as I cannot yet avoid all such "failures."
  • Why am I so down on comments? Because they lie. Not always, and not intentionally, but far too often. The older a comment is, and the farther it sits from the code it describes, the more likely it is to be wrong. The reason is simple: programmers cannot realistically keep comments maintained.

Again, we see the complementary nature of these two definitions; they are virtual opposites! This exposes the fundamental dichotomy between objects and data structures:

Once again we see that these two definitions complement each other; they are virtual opposites. This exposes the fundamental dichotomy between objects and data structures:

Procedural code (code using data structures) makes it easy to add new functions without changing the existing data structures. OO code, on the other hand, makes it easy to add new classes without changing existing functions.

Procedural code (code that uses data structures) makes it easy to add new functions without altering existing data structures. Object-oriented code, on the other hand, makes it easy to add new classes without changing existing functions.

The complement is also true:

The converse is also true:

Procedural code makes it hard to add new data structures because all the functions must change. OO code makes it hard to add new functions because all the classes must change.

Procedural code makes it difficult to add new data structures because all functions must be modified. Object-oriented code makes it difficult to add new functions because all classes must be modified.

So, the things that are hard for OO are easy for procedures, and the things that are hard for procedures are easy for OO!

Therefore, what is difficult for object-oriented programming is easier for procedural code, and vice versa!

In any complex system, there are going to be times when we want to add new data types rather than new functions. For these cases, objects and OO are most appropriate. On the other hand, there will also be times when we’ll want to add new functions as opposed to data types. In that case, procedural code and data structures will be more appropriate.

In any complex system, there will be times when we need to add new data types rather than new functions. In these cases, objects and object-oriented programming are more suitable. Conversely, there will also be times when we want to add new functions instead of data types. In that case, procedural code and data structures are more appropriate.

Mature programmers know that the idea that everything is an object is a myth. Sometimes you really do want simple data structures with procedures operating on them.
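
A minimal sketch of this dichotomy, loosely following the book's shape example. The class names (`Square`, `Circle`, `Geometry`, `PolySquare`, `PolyCircle`) are illustrative assumptions, not code from the book.

```java
// Procedural style: plain data structures, with behavior in a separate function.
// Adding a perimeter() function touches nothing that exists already, but adding
// a new shape means editing every function like area().
class Square {
    double side;
    Square(double side) { this.side = side; }
}

class Circle {
    double radius;
    Circle(double radius) { this.radius = radius; }
}

class Geometry {
    static double area(Object shape) {
        if (shape instanceof Square) {
            Square s = (Square) shape;
            return s.side * s.side;
        }
        if (shape instanceof Circle) {
            Circle c = (Circle) shape;
            return Math.PI * c.radius * c.radius;
        }
        throw new IllegalArgumentException("unknown shape: " + shape);
    }
}

// Object-oriented style: data is hidden behind behavior.
// Adding a new shape is one new class, but adding perimeter() to the
// interface means editing every existing class.
interface Shape {
    double area();
}

class PolySquare implements Shape {
    private final double side;
    PolySquare(double side) { this.side = side; }
    public double area() { return side * side; }
}

class PolyCircle implements Shape {
    private final double radius;
    PolyCircle(double radius) { this.radius = radius; }
    public double area() { return Math.PI * radius * radius; }
}
```

Neither half is "better"; which one fits depends on whether the system is more likely to grow new types or new operations.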

Wrappers like the one we defined for ACMEPort can be very useful. In fact, wrapping third-party APIs is a best practice. When you wrap a third-party API, you minimize your dependencies upon it: You can choose to move to a different library in the future without much penalty. Wrapping also makes it easier to mock out third-party calls when you are testing your own code.

Wrappers like the one we defined for ACMEPort are very useful. In fact, wrapping third-party APIs is a best practice. When you wrap a third-party API, you reduce your dependency on it: You can switch to a different library in the future with minimal pain. Wrapping also facilitates mocking third-party calls when testing your own code.

One final advantage of wrapping is that you aren’t tied to a particular vendor’s API design choices. You can define an API that you feel comfortable with. In the preceding example, we defined a single exception type for port device failure and found that we could write much cleaner code.

Another benefit of wrapping is that you are not bound to a specific vendor's API design. You can define an API that you are comfortable with. In the previous example, we defined a single exception type for port device errors and discovered that we could write cleaner code.

Often a single exception class is fine for a particular area of code. The information sent with the exception can distinguish the errors. Use different classes only if there are times when you want to catch one exception and allow the other one to pass through.

For a specific area of code, a single exception class is often sufficient. The information sent with the exception can differentiate the errors. Use different exception classes only if there are instances where you want to catch one exception and let the other pass through.
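
A rough sketch of the wrapping idea. `VendorPort` and its two exceptions are hypothetical stand-ins for a vendor class such as ACMEPort; `LocalPort` and `PortDeviceFailure` follow the names used in the book, but the bodies here are assumptions.

```java
// Hypothetical vendor API that reports failures through several exception types.
class VendorTimeoutException extends Exception {
    VendorTimeoutException(String message) { super(message); }
}

class VendorLockException extends Exception {
    VendorLockException(String message) { super(message); }
}

class VendorPort {
    private final int portNumber;

    VendorPort(int portNumber) { this.portNumber = portNumber; }

    // Fake behavior so the sketch can run: negative ports are "locked",
    // port 0 "times out", everything else opens normally.
    void open() throws VendorTimeoutException, VendorLockException {
        if (portNumber < 0) throw new VendorLockException("port is locked");
        if (portNumber == 0) throw new VendorTimeoutException("no response");
    }
}

// The single exception type our own code has to know about.
class PortDeviceFailure extends RuntimeException {
    PortDeviceFailure(Throwable cause) { super(cause.getMessage(), cause); }
}

// The wrapper: the only class that depends on the vendor API.
class LocalPort {
    private final VendorPort innerPort;

    LocalPort(int portNumber) { this.innerPort = new VendorPort(portNumber); }

    void open() {
        try {
            innerPort.open();
        } catch (VendorTimeoutException | VendorLockException e) {
            // Translate every vendor failure into one type; the cause keeps the detail.
            throw new PortDeviceFailure(e);
        }
    }
}
```

Callers now catch only `PortDeviceFailure`, and can still inspect `getCause()` when they need to tell the failures apart.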


The previous content was mainly read through a physical book, and the notes were difficult to organize (actually, it was laziness), so they were scattered above. The next few chapters were primarily read through the online translation by experts (see the link at the top), which was easier to record.

Chapter 8: Boundaries#

  • We are not responsible for testing third-party code, but writing tests for the third-party code we need to use may be in our best interest.

Learning the third-party code is hard. Integrating the third-party code is hard too. Doing both at the same time is doubly hard. What if we took a different approach? Instead of experimenting and trying out the new stuff in our production code, we could write some tests to explore our understanding of the third-party code. Jim Newkirk calls such tests learning tests.

Learning third-party code is difficult. Integrating third-party code is also challenging. Doing both simultaneously is even harder. What if we took a different approach? Instead of experimenting with new things in our production code, we could write tests to explore our understanding of the third-party code. Jim Newkirk refers to these as learning tests.

In learning tests, we call the third-party API, as we expect to use it in our application. We’re essentially doing controlled experiments that check our understanding of that API. The tests focus on what we want out of the API.

In learning tests, we call the third-party API as we expect to use it in our application. We are essentially conducting controlled experiments to verify our understanding of that API. The tests focus on what we want from the API.
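
A small sketch of what a learning test can look like. To keep it self-contained it uses the JDK's `String.split` as a stand-in for a third-party API and plain checks instead of a test framework; the class and method names are assumptions.

```java
// Learning tests: each method records one thing we verified about the API
// in exactly the way we intend to use it.
class SplitLearningTest {
    static void check(boolean condition, String what) {
        if (!condition) throw new AssertionError(what);
    }

    // We want to parse CSV-like lines. Does split keep empty trailing fields?
    static void trailingEmptyFieldsAreDroppedByDefault() {
        String[] fields = "a,b,,".split(",");
        check(fields.length == 2, "default split drops trailing empty fields");
        check(fields[0].equals("a") && fields[1].equals("b"), "remaining fields are intact");
    }

    // A negative limit turns out to be what we need to keep them.
    static void negativeLimitKeepsTrailingEmptyFields() {
        String[] fields = "a,b,,".split(",", -1);
        check(fields.length == 4, "limit -1 keeps trailing empty fields");
        check(fields[3].isEmpty(), "last field is the empty string");
    }

    static void runAll() {
        trailingEmptyFieldsAreDroppedByDefault();
        negativeLimitKeepsTrailingEmptyFields();
    }
}
```

If a later release of the library changes this behavior, these tests fail and tell us before our production code does.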

Chapter 9: Unit Testing#

Some of you reading this might sympathize with that decision. Perhaps, long in the past, you wrote tests of the kind that I wrote for that Timer class. It’s a huge step from writing that kind of throw-away test to writing a suite of automated unit tests. So, like the team I was coaching, you might decide that having dirty tests is better than having no tests.

Some readers might agree with this approach. Perhaps, long ago, you also wrote tests similar to those I wrote for that Timer class. It is a significant leap from writing throwaway tests to writing a complete suite of automated unit tests. So, like the team I was coaching, you might think that having dirty tests is better than having no tests.

What this team did not realize was that having dirty tests is equivalent to, if not worse than, having no tests. The problem is that tests must change as the production code evolves. The dirtier the tests, the harder they are to change. The more tangled the test code, the more likely it is that you will spend more time cramming new tests into the suite than it takes to write the new production code. As you modify the production code, old tests start to fail, and the mess in the test code makes it hard to get those tests to pass again. So the tests become viewed as an ever-increasing liability.

This team did not realize that having dirty tests is equivalent to, if not worse than, having no tests. The issue is that tests must evolve alongside production code. The dirtier the tests, the more challenging they are to modify. The more tangled the test code, the more likely you are to spend more time forcing new tests into the suite than it takes to write the new production code. As you change the production code, old tests begin to fail, and the chaos in the test code makes it difficult to get those tests to pass again. Thus, tests become seen as an ever-growing liability.

9.3 - Clean Tests#

The BUILD-OPERATE-CHECK pattern is made obvious by the structure of these tests. Each of the tests is clearly split into three parts. The first part builds up the test data, the second part operates on that test data, and the third part checks that the operation yielded the expected results.

These tests clearly exhibit the BUILD-OPERATE-CHECK pattern. Each test is distinctly divided into three parts. The first part constructs the test data, the second part operates on that test data, and the third part verifies that the operation produced the expected results.
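
A minimal sketch of the three-part structure. `ShoppingCart` is a hypothetical class under test, and the check is a plain `AssertionError` rather than a test framework so the snippet stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class under test.
class ShoppingCart {
    private final List<Integer> pricesInCents = new ArrayList<>();

    void add(int priceInCents) { pricesInCents.add(priceInCents); }

    int totalInCents() {
        int total = 0;
        for (int price : pricesInCents) total += price;
        return total;
    }
}

class ShoppingCartTest {
    static void totalIsSumOfItemPrices() {
        // BUILD: set up the test data.
        ShoppingCart cart = new ShoppingCart();
        cart.add(250);
        cart.add(199);

        // OPERATE: exercise the behavior under test.
        int total = cart.totalInCents();

        // CHECK: verify the result.
        if (total != 449) throw new AssertionError("expected 449 but was " + total);
    }
}
```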

9.5 - F.I.R.S.T#

Timely: The tests need to be written in a timely fashion. Unit tests should be written just before the production code that makes them pass. If you write tests after the production code, then you may find the production code to be hard to test. You may decide that some production code is too hard to test. You may not design the production code to be testable.

Timely: Tests should be written promptly. Unit tests should be written just before the production code that makes them pass. If you write tests after the production code, you may find the production code difficult to test. You may conclude that some production code is too hard to test. You may not design the production code to be testable.

Chapter 10 - Classes#

In this chapter on classes, I try to transfer most of the concepts to Go's interface{} for understanding.

The name of a class should describe what responsibilities it fulfills. In fact, naming is probably the first way of helping determine class size. If we cannot derive a concise name for a class, then it’s likely too large. The more ambiguous the class name, the more likely it has too many responsibilities. For example, class names including weasel words like Processor or Manager or Super often hint at unfortunate aggregation of responsibilities.

The name of a class should describe its responsibilities. In fact, naming is likely the first way to help determine the size of a class. If we cannot derive a concise name for a class, it is probably too large. The more ambiguous the class name, the more likely it has too many responsibilities. For instance, class names that include weasel words like Processor, Manager, or Super often hint at an unfortunate aggregation of responsibilities.

Cohesion#

  • When a class loses cohesion, split it!

So breaking a large function into many smaller functions often gives us the opportunity to split several smaller classes out as well. This gives our program a much better organization and a more transparent structure.

Therefore, breaking a large function into many smaller functions often provides the opportunity to split several smaller classes as well. This results in a much better-organized program with a more transparent structure.

10.3 - Organized for Change#

If a system is decoupled enough to be tested in this way, it will also be more flexible and promote more reuse. The lack of coupling means that the elements of our system are better isolated from each other and from change. This isolation makes it easier to understand each element of the system.

If a system is decoupled enough to be tested in this manner, it will also be more flexible and promote greater reuse. The absence of coupling means that the components of our system are better isolated from one another and from changes. This isolation facilitates a better understanding of each element of the system.

By minimizing coupling in this way, our classes adhere to another class design principle known as the Dependency Inversion Principle (DIP). In essence, the DIP says that our classes should depend upon abstractions, not on concrete details.

By minimizing coupling in this manner, our classes adhere to another class design principle known as the Dependency Inversion Principle (DIP). Essentially, the DIP states that our classes should depend on abstractions rather than concrete details.
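
A sketch of the DIP, loosely modeled on the book's Portfolio example. The method signatures and the use of cents are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// The abstraction that Portfolio depends on.
interface StockExchange {
    long currentPriceInCents(String symbol);
}

// Portfolio knows nothing about where prices come from.
class Portfolio {
    private final StockExchange exchange;
    private final Map<String, Integer> holdings = new HashMap<>();

    Portfolio(StockExchange exchange) { this.exchange = exchange; }

    void add(int shares, String symbol) {
        holdings.merge(symbol, shares, Integer::sum);
    }

    long valueInCents() {
        long total = 0;
        for (Map.Entry<String, Integer> holding : holdings.entrySet()) {
            total += holding.getValue() * exchange.currentPriceInCents(holding.getKey());
        }
        return total;
    }
}

// A concrete detail used only by tests: prices are fixed by hand.
class FixedStockExchangeStub implements StockExchange {
    private final Map<String, Long> prices = new HashMap<>();

    void fix(String symbol, long priceInCents) { prices.put(symbol, priceInCents); }

    public long currentPriceInCents(String symbol) {
        Long price = prices.get(symbol);
        if (price == null) throw new IllegalArgumentException("no fixed price for " + symbol);
        return price;
    }
}
```

Because `Portfolio` depends only on the `StockExchange` interface, a test can hand it the stub while production code hands it a real exchange client.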

Chapter 11 - Systems#

It is a myth that we can get systems “right the first time.” Instead, we should implement only today’s stories, then refactor and expand the system to implement new stories tomorrow. This is the essence of iterative and incremental agility. Test-driven development, refactoring, and the clean code they produce make this work at the code level.

The idea that we can get systems "right the first time" is a myth. Instead, we should only implement today's stories, then refactor and expand the system to implement new stories tomorrow. This encapsulates the essence of iterative and incremental agility. Test-driven development, refactoring, and the clean code they produce facilitate this at the code level.

Don't Rush

We all know it is best to give responsibilities to the most qualified persons. We often forget that it is also best to postpone decisions until the last possible moment. This isn’t lazy or irresponsible; it lets us make informed choices with the best possible information. A premature decision is a decision made with suboptimal knowledge. We will have that much less customer feedback, mental reflection on the project, and experience with our implementation choices if we decide too soon.

It is well known that it is best to delegate responsibilities to the most qualified individuals. However, we often forget that it is also beneficial to delay decisions until the last possible moment. This is not laziness or irresponsibility; it allows us to make informed choices based on the best available information. A premature decision is one made with suboptimal knowledge. If we decide too early, we will have significantly less customer feedback, mental reflection on the project, and experience with our implementation choices.

Whether you are designing systems or individual modules, never forget to use the simplest thing that can possibly work.

Whether you are designing systems or individual modules, always remember to use the simplest solution that could possibly work.

Chapter 12 - Emergence#

According to Kent, a design is “simple” if it follows these rules:

According to Kent, a design is "simple" if it adheres to the following rules:

  • Runs all the tests
  • Contains no duplication
  • Expresses the intent of the programmer
  • Minimizes the number of classes and methods

Runs all the tests; contains no duplication; expresses the programmer's intent; minimizes the number of classes and methods.

The rules are given in order of importance.

The rules are listed in order of importance.

Remarkably, following a simple and obvious rule that says we need to have tests and run them continuously impacts our system’s adherence to the primary OO goals of low coupling and high cohesion. Writing tests leads to better designs.

Notably, adhering to a simple and clear rule that requires us to have tests and run them continuously significantly affects our system's alignment with the primary object-oriented goals of low coupling and high cohesion. Writing tests results in better designs.

During this refactoring step, we can apply anything from the entire body of knowledge about good software design. We can increase cohesion, decrease coupling, separate concerns, modularize system concerns, shrink our functions and classes, choose better names, and so on. This is also where we apply the final three rules of simple design: Eliminate duplication, ensure expressiveness, and minimize the number of classes and methods.

During this refactoring phase, we can apply all the knowledge we have about good software design. We can enhance cohesion, reduce coupling, separate concerns, modularize system concerns, reduce the size of our functions and classes, select better names, and so forth. This is also where we implement the last three rules of simple design: eliminate duplication, ensure expressiveness, and minimize the number of classes and methods.

Most of us have had the experience of working on convoluted code. Many of us have produced some convoluted code ourselves. It’s easy to write code that we understand, because at the time we write it we’re deep in an understanding of the problem we’re trying to solve. Other maintainers of the code aren’t going to have so deep an understanding.

Most of us have experienced working with convoluted code. Many of us have created convoluted code ourselves. It is easy to write code that we understand because, at the time of writing, we are deeply engaged with the problem we are trying to solve. Other maintainers of the code will not have such a deep understanding.

So one way to read source code is to start with tests (provided, of course, that the target you are preparing to read has sufficiently good test code and high coverage).

Well-written unit tests are also expressive. A primary goal of tests is to act as documentation by example. Someone reading our tests should be able to get a quick understanding of what a class is all about.

Well-written unit tests are also expressive. A primary goal of tests is to serve as documentation by example. Someone reading our tests should quickly grasp what a class is intended to do.

Chapter 13 - Concurrency#

“Objects are abstractions of processing. Threads are abstractions of schedule.”

—James O. Coplien

"Objects are abstractions of processing. Threads are abstractions of scheduling." —James O. Coplien

Here are a few more balanced sound bites regarding writing concurrent software:

Here are some more balanced insights regarding writing concurrent software:

  • Concurrency incurs some overhead, both in performance as well as writing additional code.
  • Correct concurrency is complex, even for simple problems.
  • Concurrency bugs aren’t usually repeatable, so they are often ignored as one-offs instead of the true defects they are.
  • Concurrency often requires a fundamental change in design strategy.

Concurrency introduces some overhead, both in terms of performance and in writing additional code. Correct concurrency is complex, even for simple problems. Concurrency bugs are often not repeatable, leading them to be dismissed as one-offs rather than genuine defects. Concurrency frequently necessitates a fundamental shift in design strategy.

  • Concurrency-related code has its own life cycle of development, change, and tuning.
  • Concurrency-related code has its own challenges, which are different from and often more difficult than nonconcurrency-related code.
  • The number of ways in which miswritten concurrency-based code can fail makes it challenging enough without the added burden of surrounding application code.

Concurrency-related code has its own life cycle of development, change, and tuning. Concurrency-related code presents its own challenges, which differ from and are often more difficult than those of nonconcurrency-related code. The many ways in which poorly written concurrent code can fail make it challenging enough without the additional burden of surrounding application code.

Recommendation: Keep your concurrency-related code separate from other code.

Recommendation: Separate concurrency-related code from other code.
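
One way to read this recommendation, sketched with a made-up word-counting task: the business logic (`WordCounter`) contains no threading at all, and everything thread-related lives in a second class (`ParallelWordCount`). Both class names are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Plain logic with no threads or shared state: easy to test on its own.
class WordCounter {
    static int countWords(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }
}

// All thread management is confined to this class.
class ParallelWordCount {
    static int countAll(List<String> documents, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (String document : documents) {
                tasks.add(() -> WordCounter.countWords(document));
            }
            int total = 0;
            // invokeAll blocks until every task has completed.
            for (Future<Integer> result : pool.invokeAll(tasks)) {
                total += result.get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```

If the threading strategy has to change later, only `ParallelWordCount` is touched; `WordCounter` and its tests stay as they are.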

Chapter 14 - Successive Refinement#

Let me set your mind at rest. I did not simply write this program from beginning to end in its current form. More importantly, I am not expecting you to be able to write clean and elegant programs in one pass. If we have learned anything over the last couple of decades, it is that programming is a craft more than it is a science. To write clean code, you must first write dirty code and then clean it.

Let me reassure you. I did not write this program from start to finish in its current form. More importantly, I do not expect you to write clean and elegant programs in a single attempt. If we have learned anything over the past few decades, it is that programming is more of a craft than a science. To write clean code, you must first write dirty code and then refine it.

Incrementalism demanded that I get this working quickly before making any other changes. Indeed, the fix was not too difficult. I just had to move the check for null. It was no longer the boolean being null that I needed to check; it was the ArgumentMarshaller.

Incrementalism required me to get this working quickly before making any other changes. Indeed, the fix was not too challenging. I simply had to relocate the null check. It was no longer the boolean that needed to be checked for null; it was the ArgumentMarshaller.

Chapters 14, 15, and 16 are primarily practical content, but since the examples in this book are based on Java, and I am not very familiar with Java (the cases inevitably involve some unique features or libraries of the Java world), I decided not to spend too much time on these three chapters. I directly shifted my focus back to the concluding chapter.

Chapter 17 - Smells and Heuristics#

This chapter serves as a summary of the previous content, so I will not quote the original English text. These notes also make no claim to completeness; I generally write down only the parts that resonate with me.

  • Comments that have become outdated, irrelevant, or incorrect are obsolete comments. Comments get old quickly. It is best not to write a comment that will become obsolete. If you find an obsolete comment, update or delete it as soon as possible. Obsolete comments tend to drift away from the code they once described, becoming floating islands of irrelevance and misdirection in the code.
  • If a comment describes something that is sufficiently self-descriptive, then the comment is redundant.

The following G series comes from section 17.4 General Issues

G2: Obvious Behavior Not Implemented#

  • Following the Principle of Least Surprise, functions or classes should implement behaviors that other programmers have reason to expect. For example, consider a function that translates day names into an enumeration representing that day.

    Day day = DayDate.StringToDay(String dayName);
    
  • If obvious behavior is not implemented, readers and users can no longer rely on their intuition regarding the function name. They lose trust in the original author and must delve into the code details.
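
A sketch of what "obvious behavior" could mean for such a function: accept the full day name, accept the common three-letter abbreviation, and ignore case. The `Day` enum and the `DayParser` class are assumptions for illustration, not the book's `DayDate` implementation.

```java
import java.util.Locale;

enum Day { MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, SATURDAY, SUNDAY }

class DayParser {
    // Translates a day name into a Day, ignoring case and surrounding spaces,
    // and accepting three-letter abbreviations such as "Mon".
    static Day stringToDay(String dayName) {
        String wanted = dayName.trim().toUpperCase(Locale.ROOT);
        for (Day day : Day.values()) {
            String fullName = day.name();
            if (wanted.equals(fullName) || wanted.equals(fullName.substring(0, 3))) {
                return day;
            }
        }
        throw new IllegalArgumentException("not a day name: " + dayName);
    }
}
```

A caller who writes `stringToDay("monday")` or `stringToDay("Mon")` gets `Day.MONDAY` either way, which is what the name leads them to expect.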

G3: Incorrect Boundary Behavior#

  • There is nothing that can replace caution. Every boundary condition, every extreme case, and every exception represents something that could disrupt an elegant and straightforward algorithm. Do not rely on intuition. Trace every boundary condition and write tests.

G5: Duplication#

  • Every time you see duplicated code, it signifies a missed abstraction. Duplicated code can become a subroutine or even another class. Folding duplication into such abstractions increases the vocabulary of your design language. Other programmers can use the abstractions you create. Coding becomes faster and less error-prone as you raise the level of abstraction.
  • A more subtle form is modules that use similar algorithms but have different lines of code. This is also a form of duplication that can be corrected using the template method pattern or strategy pattern.

G6: Code at the Wrong Level of Abstraction#

  • Good software design requires separating concepts at different levels and placing them in different containers. Sometimes, these containers are base classes or derived classes; other times, they are source files, modules, or components. In any case, separation must be complete. Lower-level concepts and higher-level concepts should not be mixed together.

G10: Vertical Separation#

  • Variables and functions should be defined close to where they are used. Local variables should be declared just above where they are first used, with minimal vertical distance. Local variables should not be declared hundreds of lines away from where they are used.
  • Private functions should be defined just below where they are first used. Private functions belong to the entire class, but we should still limit the vertical distance between calls and definitions. Finding a private function should be as simple as looking just a little down from where it is first used.

G11: Inconsistency#

  • If a variable named response is used to hold an HttpServletResponse object in a specific function, then the same variable name should be used in other functions that utilize the HttpServletResponse object. If a method is named processVerificationRequest, then similarly name methods that handle other request types, such as processDeletionRequest.

G17: Responsibilities in the Wrong Place#

  • The Principle of Least Surprise applies here. Code should be placed where readers naturally expect it to be. The PI constant should be declared where the trigonometric functions are defined. The OVERTIME_RATE constant should be declared in the HourlyPayCalculator class.

G20: Function Names Should Express Their Behavior#

  • If you have to look at the implementation (or documentation) of a function to know what it does, then you should change it to a better name or rearrange the functional code into a function with a better name.

G26: Accuracy#

  • Expecting the first match of a query to be the only match is probably naive. Using floating-point numbers to represent currency is nearly criminal. Skipping locks and/or transaction management because you think concurrent updates are unlikely is laziness. Declaring a variable as an ArrayList when a List would suffice is overly restrictive. Making all variables protected by default is not restrictive enough.
  • When making decisions in code, ensure that you are precise enough. Be clear about why you are making the decision and how you will handle exceptions. Do not be lazy about precision. If you call a function that may return null, check for null. If you query for what you believe is the only matching record in the database, make the code verify that there are no others. If you handle currency, use integers and deal with rounding appropriately. If concurrent updates are possible, make sure some locking mechanism is in place.
  • Ambiguity and inaccuracy in code stem either from differing opinions or from laziness. Regardless of the reason, eliminate them.
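
A small sketch of the currency point, assuming amounts are held as a `long` number of cents. The `Money` class and its methods are hypothetical; the split rule shown (hand out leftover cents one at a time) is just one possible rounding policy.

```java
class Money {
    // Binary floating point cannot represent 0.1 or 0.2 exactly,
    // so this comparison is true in Java.
    static boolean floatingPointSumIsInexact() {
        return 0.1 + 0.2 != 0.3;
    }

    // Splits a non-negative amount in cents into `parts` shares that always
    // add up to the original total; leftover cents go to the first shares.
    static long[] split(long totalCents, int parts) {
        if (totalCents < 0 || parts <= 0) {
            throw new IllegalArgumentException("need totalCents >= 0 and parts > 0");
        }
        long base = totalCents / parts;
        long remainder = totalCents % parts;
        long[] shares = new long[parts];
        for (int i = 0; i < parts; i++) {
            shares[i] = base + (i < remainder ? 1 : 0);
        }
        return shares;
    }
}
```

Splitting 1000 cents three ways gives 334, 333, and 333, so no cent is lost or invented along the way.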

G28: Encapsulate Conditions#

  • Boolean logic is hard enough to understand without having to read it inside an if or while statement. Extract functions that explain the intent of the condition.

    // For example:
    if (shouldBeDeleted(timer))

    // is preferable to
    if (timer.hasExpired() && !timer.isRecurrent())
    

G33: Encapsulate Boundary Conditions#

  • Boundary conditions are difficult to track. Concentrate code that handles boundary conditions in one place, rather than scattering it throughout the code. We do not want to see +1 and -1 scattered around.

    if(level + 1 < tags.length)
    {
      parts = new Parse(body, tags, level + 1, offset + endTag);
      body = null;
    }
    
    // Note that level + 1 appears twice. This is a boundary condition that should be encapsulated in a variable named nextLevel or similar.
    
    int nextLevel = level + 1;
    if(nextLevel < tags.length)
    {
      parts = new Parse(body, tags, nextLevel, offset + endTag);
      body = null;
    }
    

17.6 Naming#

N2: Names Should Match Abstraction Levels#

  • Do not choose names that communicate implementation; choose names that reflect the level of abstraction of the class or function. This is not easy. People are prone to mixing levels of abstraction. Each time you make a pass over the code, you will likely find some name that sits at too low a level. Take the opportunity to rename it. Making code readable requires continuous improvement.
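
A sketch along the lines of the book's Modem example, written from memory, so the exact signatures are assumptions. The point is that a method like `dial(String phoneNumber)` bakes phone lines into the interface, while `connect(String connectionLocator)` stays at the level of "a modem connects to something". `LoopbackModem` is a hypothetical implementation that exists only to make the sketch runnable.

```java
// Names chosen at the abstraction level of the interface: nothing here
// assumes that the connection is made over a phone line.
interface Modem {
    boolean connect(String connectionLocator);
    boolean disconnect();
    String getConnectedLocator();
}

// Hypothetical in-memory implementation.
class LoopbackModem implements Modem {
    private String connectedLocator; // null while disconnected

    public boolean connect(String connectionLocator) {
        connectedLocator = connectionLocator;
        return true;
    }

    public boolean disconnect() {
        boolean wasConnected = connectedLocator != null;
        connectedLocator = null;
        return wasConnected;
    }

    public String getConnectedLocator() {
        return connectedLocator;
    }
}
```

With these names, a modem that connects over USB or a hard-wired cable fits the interface as naturally as one that dials a number.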

N5: Choose Longer Names for Broader Scope#

  • The length of a name should correlate with the breadth of its scope. For a small scope, a short name is acceptable, while a longer name should be used for a larger scope.
    • Variable names like i and j are fine for a scope of five lines or less.