Python Class Attributes: An Overly Thorough Guide

This article was originally published at Toptal.

I had a programming interview recently, a phone-screen in which we used a collaborative text editor.

I was asked to implement a certain API, and chose to do so in Python. Abstracting away the problem statement, let’s say I needed a class whose instances stored some `data` and some `other_data`.

I took a deep breath and started typing. After a few lines, I had something like this:


class Service(object):
    data = []

    def __init__(self, other_data):
        self.other_data = other_data
    ...

My interviewer stopped me:

  • Interviewer: “That line: `data = []`. I don’t think that’s valid Python?”

  • Me: “I’m pretty sure it is. It’s just setting a default value for the instance attribute.”
  • Interviewer: “When does that code get executed?”
  • Me: “I’m not really sure. I’ll just fix it up to avoid confusion.”

For reference, and to give you an idea of what I was going for, here’s how I amended the code:


class Service(object):

    def __init__(self, other_data):
        self.data = []
        self.other_data = other_data
    ...

As it turns out, we were both wrong. The real answer lay in understanding the distinction between Python class attributes and Python instance attributes.

Python class attributes vs. Python instance attributes

Note: if you have an expert handle on class attributes, you can skip ahead to use cases.

Python Class Attributes

My interviewer was wrong in that the above code is syntactically valid.

I too was wrong in that it isn’t setting a “default value” for the instance attribute. Instead, it’s defining `data` as a class attribute with value `[]`.

In my experience, Python class attributes are a topic that many people know something about, but few understand completely.

Python Class Variable vs. Instance Variable: What’s the Difference?

A Python class attribute is an attribute of the class (circular, I know), rather than an attribute of an instance of a class.

Let’s use a Python class example to illustrate the difference. Here, `class_var` is a class attribute, and `i_var` is an instance attribute:


class MyClass(object):
    class_var = 1

    def __init__(self, i_var):
        self.i_var = i_var

Note that all instances of the class have access to `class_var`, and that it can also be accessed as a property of the class itself:


foo = MyClass(2)
bar = MyClass(3)

foo.class_var, foo.i_var
## 1, 2
bar.class_var, bar.i_var
## 1, 3
MyClass.class_var ## <— This is key
## 1

For Java or C++ programmers, the class attribute is similar—but not identical—to the static member. We’ll see how they differ later.

Class vs. Instance Namespaces

To understand what’s happening here, let’s talk briefly about Python namespaces.

A namespace is a mapping from names to objects, with the property that there is zero relation between names in different namespaces. Namespaces are usually implemented as Python dictionaries, although this is abstracted away.

Depending on the context, you may need to access a namespace using dot syntax (e.g., `object.name_from_objects_namespace`) or as a local variable (e.g., `object_from_namespace`). As a concrete example:


class MyClass(object):
    ## No need for dot syntax
    class_var = 1

    def __init__(self, i_var):
        self.i_var = i_var

## Need dot syntax as we've left scope of class namespace
MyClass.class_var
## 1

Python classes and instances of classes each have their own distinct namespaces represented by the pre-defined attributes `MyClass.__dict__` and `instance_of_MyClass.__dict__`, respectively.
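To make the two namespaces concrete, here is a minimal sketch that inspects both dictionaries directly. It reuses the MyClass example from this section; the variable name `instance_namespace` is only for illustration.

```python
class MyClass(object):
    class_var = 1

    def __init__(self, i_var):
        self.i_var = i_var


foo = MyClass(2)

## The instance namespace holds only what was assigned through self
instance_namespace = foo.__dict__
print(instance_namespace)
## {'i_var': 2}

## class_var lives in the class namespace, not in the instance namespace
print('class_var' in MyClass.__dict__)
## True
print('class_var' in foo.__dict__)
## False
```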

When you try to access an attribute from an instance of a class, it first looks at its instance namespace. If it finds the attribute, it returns the associated value. If not, it then looks in the class namespace and returns the attribute (if it’s present, throwing an error otherwise). For example:


foo = MyClass(2)

## Finds i_var in foo's instance namespace
foo.i_var
## 2

## Doesn't find class_var in instance namespace…
## So looks in class namespace (MyClass.__dict__)
foo.class_var
## 1

The instance namespace takes precedence over the class namespace: if there is an attribute with the same name in both, the instance namespace will be checked first and its value returned. Here’s a simplified version of the code (source) for attribute lookup:


def instlookup(inst, name):
    ## simplified algorithm...
    ## (`in` replaces dict.has_key, which Python 3 removed)
    if name in inst.__dict__:
        return inst.__dict__[name]
    else:
        return inst.__class__.__dict__[name]

And, in visual form:

[Figure: attribute lookup in visual form]

How Class Attributes Handle Assignment

With this in mind, we can make sense of how Python class attributes handle assignment:

  • If a class attribute is set by accessing the class, it will override the value for all instances. For example:
    
    
    foo = MyClass(2)
    foo.class_var
    ## 1
    MyClass.class_var = 2
    foo.class_var
    ## 2

    At the namespace level… we’re setting `MyClass.__dict__['class_var'] = 2`. (Note: this isn’t the exact code, which would be `setattr(MyClass, 'class_var', 2)`, as `__dict__` returns a dictproxy, an immutable wrapper that prevents direct assignment, but it helps for demonstration’s sake.) Then, when we access `foo.class_var`, `class_var` has a new value in the class namespace and thus 2 is returned.

  • If a Python class variable is set by accessing an instance, it will override the value only for that instance. This essentially overrides the class variable and turns it into an instance variable available, intuitively, only for that instance. For example:
    
    
    foo = MyClass(2)
    foo.class_var
    ## 1
    foo.class_var = 2
    foo.class_var
    ## 2
    MyClass.class_var
    ## 1

    At the namespace level… we’re adding the `class_var` attribute to `foo.__dict__`, so when we look up `foo.class_var`, we return 2. Meanwhile, other instances of `MyClass` will not have `class_var` in their instance namespaces, so they continue to find `class_var` in `MyClass.__dict__` and thus return 1.
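One way to check that the instance assignment only shadows the class attribute, rather than changing it, is to delete the instance entry again. The sketch below reuses the MyClass example; the names `shadowed` and `restored` are illustrative only.

```python
class MyClass(object):
    class_var = 1

    def __init__(self, i_var):
        self.i_var = i_var


foo = MyClass(2)
foo.class_var = 2          ## adds 'class_var' to foo.__dict__
shadowed = foo.class_var   ## found in the instance namespace

del foo.class_var          ## removes the entry from foo.__dict__ only
restored = foo.class_var   ## lookup falls through to MyClass.__dict__

print(shadowed)
## 2
print(restored)
## 1
```

After the `del`, the instance namespace no longer has a `class_var` entry, so the usual lookup order finds the untouched class attribute again.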

Mutability

Quiz question: What if your class attribute has a mutable type? You can manipulate (mutilate?) the class attribute by accessing it through a particular instance and, in turn, end up manipulating the referenced object that all instances are accessing (as pointed out by Timothy Wiseman).

This is best demonstrated by example. Let’s go back to the `Service` I defined earlier and see how my use of a class variable could have led to problems down the road.


class Service(object):
    data = []

    def __init__(self, other_data):
        self.other_data = other_data
    ...

My goal was to have the empty list (`[]`) as the default value for `data`, and for each instance of `Service` to have its own data that would be altered over time on an instance-by-instance basis. But in this case, we get the following behavior (recall that `Service` takes some argument `other_data`, which is arbitrary in this example):


s1 = Service(['a', 'b'])
s2 = Service(['c', 'd'])

s1.data.append(1)

s1.data
## [1]
s2.data
## [1]

s2.data.append(2)

s1.data
## [1, 2]
s2.data
## [1, 2]

This is no good—altering the class variable via one instance alters it for all the others!

At the namespace level… all instances of `Service` are accessing and modifying the same list in `Service.__dict__` without making their own `data` attributes in their instance namespaces.
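The sharing can be confirmed with an identity check. This is a small sketch built on the Service class above; `same_object` is just an illustrative name.

```python
class Service(object):
    data = []

    def __init__(self, other_data):
        self.other_data = other_data


s1 = Service(['a', 'b'])
s2 = Service(['c', 'd'])

s1.data.append(1)

## Both instances resolve `data` to the one list stored on the class
same_object = s1.data is s2.data is Service.data
print(same_object)
## True

## Appending never created an instance-level `data` entry
print('data' in s1.__dict__)
## False
```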

We could get around this using assignment; that is, instead of exploiting the list’s mutability, we could assign our `Service` objects to have their own lists, as follows:


s1 = Service(['a', 'b'])
s2 = Service(['c', 'd'])

s1.data = [1]
s2.data = [2]

s1.data
## [1]
s2.data
## [2]

In this case, we’re adding `s1.__dict__['data'] = [1]`, so the original `Service.__dict__['data']` remains unchanged.

Unfortunately, this requires that `Service` users have intimate knowledge of its variables, and is certainly prone to mistakes. In a sense, we’d be addressing the symptoms rather than the cause. We’d prefer something that was correct by construction.

My personal solution: if you’re just using a class variable to assign a default value to a would-be Python instance variable, don’t use mutable values. In this case, every instance of `Service` was going to override `Service.data` with its own instance attribute eventually, so using an empty list as the default led to a tiny bug that was easily overlooked. Instead of the above, we could’ve either:

  1. Stuck to instance attributes entirely, as demonstrated in the introduction.
  2. Avoided using the empty list (a mutable value) as our “default”:
    
    
    class Service(object):
        data = None

        def __init__(self, other_data):
            self.other_data = other_data
        ...

    Of course, we’d have to handle the `None` case appropriately, but that’s a small price to pay.
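What handling the `None` case looks like depends on the class. As one possible sketch, assuming a hypothetical `add` method that is not part of the original Service, the list can be created lazily on first use:

```python
class Service(object):
    data = None

    def __init__(self, other_data):
        self.other_data = other_data

    def add(self, item):
        ## Hypothetical helper: on first use, assignment through self
        ## creates a per-instance list and leaves Service.data as None
        if self.data is None:
            self.data = []
        self.data.append(item)


s1 = Service(['a', 'b'])
s2 = Service(['c', 'd'])
s1.add(1)

print(s1.data)
## [1]
print(s2.data)
## None
```

Because the `None` check assigns rather than mutates, each instance ends up with its own list and the class-level default is never touched.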

So When Should You Use Python Class Attributes?

Class attributes are tricky, but let’s look at a few cases when they would come in handy:

  1. Storing constants. As class attributes can be accessed as attributes of the class itself, it’s often nice to use them for storing class-wide, class-specific constants. For example:
    
    
    class Circle(object):
        pi = 3.14159

        def __init__(self, radius):
            self.radius = radius

        def area(self):
            return Circle.pi * self.radius * self.radius

    Circle.pi
    ## 3.14159

    c = Circle(10)
    c.pi
    ## 3.14159
    c.area()
    ## 314.159
  2. Defining default values. As a trivial example, we might create a bounded list (i.e., a list that can only hold a certain number of elements or fewer) and choose to have a default cap of 10 items:
    
    
    class MyClass(object):
        limit = 10

        def __init__(self):
            self.data = []

        def item(self, i):
            return self.data[i]

        def add(self, e):
            if len(self.data) >= self.limit:
                raise Exception("Too many elements")
            self.data.append(e)

    MyClass.limit
    ## 10

    We could then create instances with their own specific limits, too, by assigning to the instance’s `limit` attribute.

    
    
    foo = MyClass()
    foo.limit = 50
    ## foo can now hold 50 elements—other instances can hold 10

    This only makes sense if you will want your typical instance of `MyClass` to hold just 10 elements or fewer. If you’re giving all of your instances different limits, then `limit` should be an instance variable. (Remember, though: take care when using mutable values as your defaults.)

  3. Tracking all data across all instances of a given class. This is sort of specific, but I could see a scenario in which you might want to access a piece of data related to every existing instance of a given class. To make the scenario more concrete, let’s say we have a `Person` class, and every person has a `name`. We want to keep track of all the names that have been used. One approach might be to iterate over the garbage collector’s list of objects, but it’s simpler to use class variables.

    Note that, in this case, `all_names` will only be accessed as a class variable, so the mutable default is acceptable.

    
    
    class Person(object):
        all_names = []

        def __init__(self, name):
            self.name = name
            Person.all_names.append(name)

    joe = Person('Joe')
    bob = Person('Bob')
    print(Person.all_names)
    ## ['Joe', 'Bob']

    We could even use this design pattern to track all existing instances of a given class, rather than just some associated data.

    
    
    class Person(object):
        all_people = []

        def __init__(self, name):
            self.name = name
            Person.all_people.append(self)

    joe = Person('Joe')
    bob = Person('Bob')
    print(Person.all_people)
    ## [<__main__.Person object at 0x10e428c50>, <__main__.Person object at 0x10e428c90>]
  4. Performance (sort of… see below).
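One caveat on the instance-tracking pattern in item 3: a plain list holds strong references, so every `Person` ever created stays alive for as long as the class does. As a hedged variation (not part of the original example), the standard library’s `weakref.WeakSet` could be swapped in for the list so that entries vanish once an instance is no longer referenced elsewhere:

```python
import gc
import weakref


class Person(object):
    ## Weak references: an entry disappears once the instance is collected
    all_people = weakref.WeakSet()

    def __init__(self, name):
        self.name = name
        Person.all_people.add(self)


joe = Person('Joe')
bob = Person('Bob')
names_before = sorted(p.name for p in Person.all_people)
print(names_before)
## ['Bob', 'Joe']

del bob
gc.collect()  ## CPython frees bob immediately; this covers other interpreters
names_after = sorted(p.name for p in Person.all_people)
print(names_after)
## ['Joe']
```

The trade-off is that a `WeakSet` is unordered, so this variation suits "which instances still exist" questions better than "in what order were they created".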

Under-the-hood

Note: If you’re worrying about performance at this level, you might not want to be using Python in the first place, as the differences will be on the order of tenths of a millisecond. Still, it’s fun to poke around a bit, and it helps for illustration’s sake.

Recall that a class’s namespace is created and filled in at the time of the class’s definition. That means that we do just one assignment—ever—for a given class variable, while instance variables must be assigned every time a new instance is created. Let’s take an example.


def called_class():
    print("Class assignment")
    return 2

class Bar(object):
    y = called_class()

    def __init__(self, x):
        self.x = x

## "Class assignment"

def called_instance():
    print("Instance assignment")
    return 2

class Foo(object):
    def __init__(self, x):
        self.y = called_instance()
        self.x = x

Bar(1)
Bar(2)
Foo(1)
## "Instance assignment"
Foo(2)
## "Instance assignment"

We assign to `Bar.y` just once, but `instance_of_Foo.y` on every call to `__init__`.

As further evidence, let’s use the Python disassembler:


import dis

class Bar(object):
    y = 2

    def __init__(self, x):
        self.x = x

class Foo(object):
    def __init__(self, x):
        self.y = 2
        self.x = x

dis.dis(Bar)
##  Disassembly of __init__:
##  7           0 LOAD_FAST                1 (x)
##              3 LOAD_FAST                0 (self)
##              6 STORE_ATTR               0 (x)
##              9 LOAD_CONST               0 (None)
##             12 RETURN_VALUE

dis.dis(Foo)
## Disassembly of __init__:
## 11           0 LOAD_CONST               1 (2)
##              3 LOAD_FAST                0 (self)
##              6 STORE_ATTR               0 (y)

## 12           9 LOAD_FAST                1 (x)
##             12 LOAD_FAST                0 (self)
##             15 STORE_ATTR               1 (x)
##             18 LOAD_CONST               0 (None)
##             21 RETURN_VALUE

When we look at the byte code, it’s again obvious that `Foo.__init__` has to do two assignments, while `Bar.__init__` does just one.

In practice, what does this gain really look like? I’ll be the first to admit that timing tests are highly dependent on often uncontrollable factors and the differences between them are often hard to explain accurately.

However, I think these small snippets (run with the Python timeit module) help to illustrate the differences between class and instance variables, so I’ve included them anyway.

Note: I’m on a MacBook Pro with OS X 10.8.5 and Python 2.7.2.
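The exact timing harness isn’t shown here, so the following is only a sketch of how comparable numbers could be collected with `timeit`. It assumes the Bar and Foo classes from the disassembly example, passes callables instead of statement strings (which adds a small, equal lambda overhead to both measurements), and uses far fewer calls than the 10,000,000 behind the figures below.

```python
import timeit


class Bar(object):
    y = 2

    def __init__(self, x):
        self.x = x


class Foo(object):
    def __init__(self, x):
        self.y = 2
        self.x = x


## Number of calls per measurement (an arbitrary, smaller choice)
n = 100000

bar_time = timeit.timeit(lambda: Bar(2), number=n)
foo_time = timeit.timeit(lambda: Foo(2), number=n)

print('%d calls to Bar(2): %.3fs' % (n, bar_time))
print('%d calls to Foo(2): %.3fs' % (n, foo_time))
```

Absolute values will vary by machine and interpreter version, so only the relative gap between the two lines is worth reading into.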

Initialization


10000000 calls to `Bar(2)`: 4.940s
10000000 calls to `Foo(2)`: 6.043s

The initializations of `Bar` are faster by over a second, so the difference here does appear to be meaningful rather than noise.

So why is this the case? One speculative explanation: we do two assignments in `Foo.__init__`, but just one in `Bar.__init__`.

Assignment


10000000 calls to `Bar(2).y = 15`: 6.232s
10000000 calls to `Foo(2).y = 15`: 6.855s
10000000 `Bar` assignments: 6.232s - 4.940s = 1.292s
10000000 `Foo` assignments: 6.855s - 6.043s = 0.812s

Note: There’s no way to re-run your setup code on each trial with timeit, so we have to reinitialize our variable within each trial. The last two lines show the above times with the previously calculated initialization times deducted.

From the above, it looks like `Foo` only takes about 60% as long as `Bar` to handle assignments.

Why is this the case? One speculative explanation: when we assign to `Bar(2).y`, we first look in the instance namespace (`Bar(2).__dict__[y]`), fail to find `y`, and then look in the class namespace (`Bar.__dict__[y]`) before making the proper assignment. When we assign to `Foo(2).y`, we do half as many lookups, as we immediately assign to the instance namespace (`Foo(2).__dict__[y]`).

In summary, though these performance gains won’t matter in reality, these tests are interesting at the conceptual level. If anything, I hope these differences help illustrate the mechanical distinctions between class and instance variables.

In Conclusion

Class attributes seem to be underused in Python; a lot of programmers have different impressions of how they work and why they might be helpful.

My take: Python class variables have their place within the school of good code. When used with care, they can simplify things and improve readability. But when carelessly thrown into a given class, they’re sure to trip you up.

Appendix: Private Instance Variables

One thing I wanted to include but didn’t have a natural entrance point…

Python doesn’t have private variables so-to-speak, but another interesting relationship between class and instance naming comes with name mangling.

In the Python style guide, it’s said that pseudo-private variables should be prefixed with a double underscore: ‘__’. This is not only a sign to others that your variable is meant to be treated privately, but also a way to prevent access to it, of sorts. Here’s what I mean:


class Bar(object):
    def __init__(self):
        self.__zap = 1

a = Bar()
a.__zap
## Traceback (most recent call last):
##   File "<stdin>", line 1, in <module>
## AttributeError: 'Bar' object has no attribute '__zap'

## Hmm. So what's in the namespace?
a.__dict__
## {'_Bar__zap': 1}
a._Bar__zap
## 1

Look at that: the instance attribute `__zap` is automatically prefixed with the class name to yield `_Bar__zap`.

While still settable and gettable using `a._Bar__zap`, this name mangling is a means of creating a ‘private’ variable as it prevents you and others from accessing it by accident or through ignorance.

Edit: as Pedro Werneck kindly pointed out, this behavior is largely intended to help out with subclassing. In the PEP 8 style guide, they see it as serving two purposes: (1) preventing subclasses from accessing certain attributes, and (2) preventing namespace clashes in these subclasses. While useful, variable mangling shouldn’t be seen as an invitation to write code with an assumed public-private distinction, such as is present in Java.
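To illustrate the subclassing point, here is a small sketch in which a base class and a subclass both use the same double-underscore name. The Base and Child classes and the `__token` attribute are hypothetical and exist only for this example.

```python
class Base(object):
    def __init__(self):
        self.__token = 'base'      ## stored as _Base__token

    def base_token(self):
        return self.__token        ## compiled as self._Base__token


class Child(Base):
    def __init__(self):
        Base.__init__(self)
        self.__token = 'child'     ## stored as _Child__token, so no clash

    def child_token(self):
        return self.__token        ## compiled as self._Child__token


c = Child()
print(sorted(c.__dict__))
## ['_Base__token', '_Child__token']
print(c.base_token())
## base
print(c.child_token())
## child
```

Each class sees only its own mangled name, so the subclass cannot accidentally overwrite the attribute its parent relies on.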

 

Rails Service Objects: A Comprehensive Guide

This post was written by Amin Shah Gilani, Ruby Developer for Toptal.

Ruby on Rails ships with everything you need to prototype your application quickly, but when your codebase starts growing, you’ll run into scenarios where the conventional Fat Model, Skinny Controller mantra breaks. When your business logic can’t fit into either a model or a controller, that’s when service objects come in and let us separate every business action into its own Ruby object.

An example request cycle with Rails service objects

In this article, I’ll explain when a service object is required; how to go about writing clean service objects and grouping them together for contributor sanity; the strict rules I impose on my service objects to tie them directly to my business logic; and how not to turn your service objects into a dumping ground for all the code you don’t know what to do with.

Why Do I Need Service Objects?

Try this: What do you do when your application needs to tweet the text from `params[:message]`?

If you’ve been using vanilla Rails so far, then you’ve probably done something like this:


class TweetController < ApplicationController
  def create
    send_tweet(params[:message])
  end

  private

  def send_tweet(tweet)
    client = Twitter::REST::Client.new do |config|
      config.consumer_key        = ENV['TWITTER_CONSUMER_KEY']
      config.consumer_secret     = ENV['TWITTER_CONSUMER_SECRET']
      config.access_token        = ENV['TWITTER_ACCESS_TOKEN']
      config.access_token_secret = ENV['TWITTER_ACCESS_SECRET']
    end
    client.update(tweet)
  end
end

The problem here is that you’ve added at least ten lines to your controller, but they don’t really belong there. Also, what if you wanted to use the same functionality in another controller? Do you move this to a concern? Wait, but this code doesn’t really belong in controllers at all. Why can’t the Twitter API just come with a single prepared object for me to call?

The first time I did this, I felt like I’d done something dirty. My previously beautifully lean Rails controllers had started getting fat and I didn’t know what to do. Eventually, I fixed my controller with a service object.

Before you start reading this article, let’s pretend:

  • This application handles a Twitter account.
  • The Rails Way means “the conventional Ruby on Rails way of doing things” and the book doesn’t exist.
  • I’m a Rails expert… which I’m told every day that I am, but I have trouble believing it, so let’s just pretend that I really am one.

What Are Service Objects?

Service objects are Plain Old Ruby Objects (PORO) that are designed to execute one single action in your domain logic and do it well. Consider the example above: Our method already has the logic to do one single thing, and that is to create a tweet. What if this logic was encapsulated within a single Ruby class that we can instantiate and call a method on? Something like:


tweet_creator = TweetCreator.new(params[:message])
tweet_creator.send_tweet


# Later on in the article, we'll add syntactic sugar and shorten the above to:

TweetCreator.call(params[:message])

This is pretty much it; our `TweetCreator` service object, once created, can be called from anywhere, and it would do this one thing very well.

Creating a Service Object

First let’s create a new `TweetCreator` in a new folder called `app/services`:


$ mkdir app/services && touch app/services/tweet_creator.rb

And let’s just dump all our logic inside a new Ruby class:


# app/services/tweet_creator.rb
class TweetCreator
  def initialize(message)
    @message = message
  end

  def send_tweet
    client = Twitter::REST::Client.new do |config|
      config.consumer_key        = ENV['TWITTER_CONSUMER_KEY']
      config.consumer_secret     = ENV['TWITTER_CONSUMER_SECRET']
      config.access_token        = ENV['TWITTER_ACCESS_TOKEN']
      config.access_token_secret = ENV['TWITTER_ACCESS_SECRET']
    end
    client.update(@message)
  end
end

Then you can call `TweetCreator.new(params[:message]).send_tweet` anywhere in your app, and it will work. Rails will load this object magically because it autoloads everything under `app/`. Verify this by running:


$ rails c
Running via Spring preloader in process 12417
Loading development environment (Rails 5.1.5)
 > puts ActiveSupport::Dependencies.autoload_paths
...
/Users/gilani/Sandbox/nazdeeq/app/services

Want to know more about how `autoload` works? Read the Autoloading and Reloading Constants Guide.

Adding Syntactic Sugar to Make Rails Service Objects Suck Less

Look, this feels great in theory, but `TweetCreator.new(params[:message]).send_tweet` is just a mouthful. It’s far too verbose with redundant words… much like HTML (ba-dum tiss!). In all seriousness, though, why do people use HTML when HAML is around? Or even Slim. I guess that’s another article for another time. Back to the task at hand:

`TweetCreator` is a nice short class name, but the extra cruft around instantiating the object and calling the method is just too long! If only there were precedent in Ruby for calling something and having it execute itself immediately with the given parameters… oh wait, there is! It’s `Proc#call`.

`Proc#call` invokes the block, setting the block’s parameters to the values in params using something close to method calling semantics. It returns the value of the last expression evaluated in the block.


a_proc = Proc.new {|scalar, *values| values.map {|value| value*scalar } }
a_proc.call(9, 1, 2, 3)    #=> [9, 18, 27]
a_proc[9, 1, 2, 3]         #=> [9, 18, 27]
a_proc.(9, 1, 2, 3)        #=> [9, 18, 27]
a_proc.yield(9, 1, 2, 3)   #=> [9, 18, 27]

Documentation

If this confuses you, let me explain. A `proc` can be `call`-ed to execute itself with the given parameters. Which means that if `TweetCreator` were a `proc`, we could call it with `TweetCreator.call(message)` and the result would be equivalent to `TweetCreator.new(params[:message]).call`, which looks quite similar to our unwieldy old `TweetCreator.new(params[:message]).send_tweet`.

So let’s make our service object behave more like a `proc`!

First, because we probably want to reuse this behavior across all our service objects, let’s borrow from the Rails Way and create a class called `ApplicationService`:


# app/services/application_service.rb
class ApplicationService
  def self.call(*args, &block)
    new(*args, &block).call
  end
end

Did you see what I did there? I added a class method called `call` that creates a new instance of the class with the arguments or block you pass to it, and calls `call` on the instance. Exactly what we wanted! The last thing to do is to rename the method from our `TweetCreator` class to `call`, and have the class inherit from `ApplicationService`:


# app/services/tweet_creator.rb
class TweetCreator < ApplicationService
  attr_reader :message

  def initialize(message)
    @message = message
  end

  def call
    client = Twitter::REST::Client.new do |config|
      config.consumer_key        = ENV['TWITTER_CONSUMER_KEY']
      config.consumer_secret     = ENV['TWITTER_CONSUMER_SECRET']
      config.access_token        = ENV['TWITTER_ACCESS_TOKEN']
      config.access_token_secret = ENV['TWITTER_ACCESS_SECRET']
    end
    client.update(@message)
  end
end

And finally, let’s wrap this up by calling our service object in the controller:


class TweetController < ApplicationController
  def create
    TweetCreator.call(params[:message])
  end
end

Grouping Similar Service Objects for Sanity

The example above has only one service object, but in the real world, things can get more complicated. For example, what if you had hundreds of services, and half of them were related business actions, e.g., having a `Follower` service that followed another Twitter account? Honestly, I’d go insane if a folder contained 200 unique-looking files, so good thing there’s another pattern from the Rails Way that we can copy (I mean, use as inspiration): namespacing.

Let’s pretend we’ve been tasked to create a service object that follows other Twitter profiles.

Let’s look at the name of our previous service object: `TweetCreator`. It sounds like a person, or at the very least, a role in an organization. Someone that creates Tweets. I like to name my service objects as if they were just that: roles in an organization. Following this convention, I’ll call my new object: `ProfileFollower`.

Now, since I’m the supreme overlord of this app, I’m going to create a managerial position in my service hierarchy and delegate responsibility for both these services to that position. I’ll call this new managerial position TwitterManager.

Since this manager does nothing but manage, let’s make it a module and nest our service objects under this module. Our folder structure will now look like:


services
├── application_service.rb
└── twitter_manager
    ├── profile_follower.rb
    └── tweet_creator.rb

And our service objects:


# services/twitter_manager/tweet_creator.rb
module TwitterManager
  class TweetCreator < ApplicationService
  ...
  end
end

# services/twitter_manager/profile_follower.rb
module TwitterManager
  class ProfileFollower < ApplicationService
  ...
  end
end

And our calls will now become TwitterManager::TweetCreator.call(arg) and TwitterManager::ProfileFollower.call(arg).
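To make the namespacing concrete, here is a minimal, self-contained sketch. The ApplicationService base class below is a stand-in that assumes a class-level call which instantiates the service and runs its instance-level call. The method bodies and return strings are purely illustrative, and no Twitter API is involved.

```ruby
# Stand-in base class (an assumption for this sketch): a class-level `call`
# that builds the service and runs its instance-level `call`.
class ApplicationService
  def self.call(*args, &block)
    new(*args, &block).call
  end
end

module TwitterManager
  class TweetCreator < ApplicationService
    def initialize(message)
      @message = message
    end

    def call
      # A real implementation would talk to the Twitter API here.
      "tweeted: #{@message}"
    end
  end

  class ProfileFollower < ApplicationService
    def initialize(handle)
      @handle = handle
    end

    def call
      # Hypothetical placeholder for the actual follow request.
      "followed: #{@handle}"
    end
  end
end

puts TwitterManager::TweetCreator.call('Hello, world') # prints "tweeted: Hello, world"
puts TwitterManager::ProfileFollower.call('@toptal')   # prints "followed: @toptal"
```

Both classes live under the same module, so the folder and the constant lookup path stay in sync, and a reader can tell at the call site which "department" a service belongs to.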

Service Objects to Handle Database Operations

The example above made API calls, but service objects can also be used when all the calls are to your database instead of an API. This is especially helpful if some business actions require multiple database updates wrapped in a transaction. For example, this sample code would use services to record a currency exchange taking place.


module MoneyManager
  # exchange currency from one amount to another
  class CurrencyExchanger < ApplicationService
    ...
    def call
      ActiveRecord::Base.transaction do
        # transfer the original currency to the exchange's account
        outgoing_tx = CurrencyTransferrer.call(
          from: the_user_account,
          to: the_exchange_account,
          amount: the_amount,
          currency: original_currency
        )

        # get the exchange rate
        rate = ExchangeRateGetter.call(
          from: original_currency,
          to: new_currency
        )

        # transfer the new currency back to the user's account
        incoming_tx = CurrencyTransferrer.call(
          from: the_exchange_account,
          to: the_user_account,
          amount: the_amount * rate,
          currency: new_currency
        )

        # record the exchange happening
        ExchangeRecorder.call(
          outgoing_tx: outgoing_tx,
          incoming_tx: incoming_tx
        )
      end
    end
  end

  # record the transfer of money from one account to another in money_accounts
  class CurrencyTransferrer < ApplicationService
    ...
  end

  # record an exchange event in the money_exchanges table
  class ExchangeRecorder < ApplicationService
    ...
  end

  # get the exchange rate from an API
  class ExchangeRateGetter < ApplicationService
    ...
  end
end

What Do I Return from My Service Object?

We’ve discussed how to call our service object, but what should the object return? There are three ways to approach this:

  • Return true or false
  • Return a value
  • Return an Enum

Return true or false

This one is simple: If an action works as intended, return true; otherwise, return false:


  def call
    ...
    return true if client.update(@message)
    false
  end

Return a Value

If your service object fetches data from somewhere, you probably want to return that value:


  def call
    ...
    return false unless exchange_rate
    exchange_rate
  end

Respond with an Enum

If your service object is a bit more complex, and you want to handle different scenarios, you could just add enums to control the flow of your services:


class ExchangeRecorder < ApplicationService
  RETURNS = [
    SUCCESS = :success,
    FAILURE = :failure,
    PARTIAL_SUCCESS = :partial_success
  ]

  def call
    foo = do_something
    return SUCCESS if foo.success?
    return FAILURE if foo.failure?
    PARTIAL_SUCCESS
  end

  private

  def do_something
  end
end

And then in your app, you can use:


    case ExchangeRecorder.call
    when ExchangeRecorder::SUCCESS
      foo
    when ExchangeRecorder::FAILURE
      bar
    when ExchangeRecorder::PARTIAL_SUCCESS
      baz
    end
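The snippets above leave do_something empty, so they won’t run as written. Here is one way the enum approach might look end to end. The Result struct, the saved/total arguments, and the describe_outcome helper are all hypothetical names invented for this sketch, and ApplicationService is again a minimal stand-in.

```ruby
# Stand-in base class (an assumption for this sketch).
class ApplicationService
  def self.call(*args, &block)
    new(*args, &block).call
  end
end

# Hypothetical result object: how many records were saved out of how many.
Result = Struct.new(:saved, :total) do
  def success?
    saved == total
  end

  def failure?
    saved.zero?
  end
end

class ExchangeRecorder < ApplicationService
  RETURNS = [
    SUCCESS = :success,
    FAILURE = :failure,
    PARTIAL_SUCCESS = :partial_success
  ].freeze

  def initialize(saved, total)
    @saved = saved
    @total = total
  end

  def call
    foo = do_something
    return SUCCESS if foo.success?
    return FAILURE if foo.failure?
    PARTIAL_SUCCESS
  end

  private

  # Placeholder for the real work; here it only wraps the inputs.
  def do_something
    Result.new(@saved, @total)
  end
end

# Hypothetical caller that branches on the returned enum value.
def describe_outcome(outcome)
  case outcome
  when ExchangeRecorder::SUCCESS then 'everything was recorded'
  when ExchangeRecorder::FAILURE then 'nothing was recorded'
  when ExchangeRecorder::PARTIAL_SUCCESS then 'only some records were saved'
  end
end

puts describe_outcome(ExchangeRecorder.call(2, 2)) # prints "everything was recorded"
puts describe_outcome(ExchangeRecorder.call(1, 2)) # prints "only some records were saved"
```

Because the constants are plain symbols, callers can compare against them in a case statement without knowing anything about the service’s internals.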

Shouldn’t I Put Service Objects in lib/services Instead of app/services?

This is subjective. People’s opinions differ on where to put their service objects. Some people put them in lib/services, while some create app/services. I fall in the latter camp. Rails’ Getting Started Guide describes the lib/ folder as the place to put “extended modules for your application.”

In my humble opinion, “extended modules” means modules that don’t encapsulate core domain logic and can generally be used across projects. In the wise words of a random Stack Overflow answer, put code in there that “can potentially become its own gem.”

Are Service Objects a Good Idea?

It depends on your use case. Look—the fact that you’re reading this article right now suggests you’re trying to write code that doesn’t exactly belong in a model or controller. I recently read this article about how service objects are an anti-pattern. The author has his opinions, but I respectfully disagree.

Just because some other person overused service objects doesn’t mean they’re inherently bad. At my startup, Nazdeeq, we use service objects as well as non-ActiveRecord models. But the difference between what goes where has always been apparent to me: I keep all business actions in service objects while keeping resources that don’t really need persistence in non-ActiveRecord models. At the end of the day, it’s for you to decide what pattern is good for you.

However, do I think service objects in general are a good idea? Absolutely! They keep my code neatly organized, and what makes me confident in my use of POROs is that Ruby loves objects. No, seriously, Ruby loves objects. It’s insane, totally bonkers, but I love it! Case in point:


 > 5.is_a? Object # => true
 > 5.class # => Integer

 > class Integer
?>   def woot
?>     'woot woot'
?>   end
?> end # => :woot

 > 5.woot # => "woot woot"

See? 5 is literally an object.

In many languages, numbers and other primitive types are not objects. Ruby follows the influence of the Smalltalk language by giving methods and instance variables to all of its types. This eases one’s use of Ruby, since rules applying to objects apply to all of Ruby.
Ruby-lang.org

When Should I Not Use a Service Object?

This one’s easy. I have these rules:

  1. Does your code handle routing, params or do other controller-y things?
    If so, don’t use a service object—your code belongs in the controller.
  2. Are you trying to share your code in different controllers?
    In this case, don’t use a service object—use a concern.
  3. Is your code like a model that doesn’t need persistence?
    If so, don’t use a service object. Use a non-ActiveRecord model instead.
  4. Is your code a specific business action? (e.g., “Take out the trash,” “Generate a PDF using this text,” or “Calculate the customs duty using these complicated rules”)
    In this case, use a service object. That code probably doesn’t logically fit in either your controller or your model.

Of course, these are my rules, so you’re welcome to adapt them to your own use cases. These have worked very well for me, but your mileage may vary.

Rules for Writing Good Service Objects

I have four rules for creating service objects. These aren’t written in stone, and if you really want to break them, you can, but I will probably ask you to change it in code reviews unless your reasoning is sound.

Rule 1: Only One Public Method per Service Object

Service objects are single business actions. You can change the name of your public method if you like. I prefer using call, but GitLab CE’s codebase calls it execute and other people may use perform. Use whatever you want; you could call it nermin for all I care. Just don’t create two public methods for a single service object. Break it into two objects if you need to.

Rule 2: Name Service Objects Like Dumb Roles at a Company

Service objects are single business actions. Imagine if you hired one person at the company to do that one job, what would you call them? If their job is to create tweets, call them TweetCreator. If their job is to read specific tweets, call them TweetReader.

Rule 3: Don’t Create Generic Objects to Perform Multiple Actions

Service objects are single business actions. I broke the functionality into two pieces: TweetReader, and ProfileFollower. What I didn’t do is create a single generic object called TwitterHandler and dump all of the API functionality in there. Please don’t do this. This goes against the “business action” mindset and makes the service object look like the Twitter Fairy. If you want to share code among the business objects, just create a BaseTwitterManager object or module and mix that into your service objects.
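As a rough illustration of the mix-in idea, the sketch below shares one helper between two service objects. Only the name BaseTwitterManager comes from the text. The normalize_handle helper and the method bodies are assumptions made up for this example, and the base class is omitted to keep it short.

```ruby
# Shared behavior lives in a module; the helper itself is hypothetical.
module BaseTwitterManager
  private

  # Normalizes a handle so "@Toptal" and "toptal" are treated alike.
  def normalize_handle(handle)
    handle.to_s.delete_prefix('@').downcase
  end
end

class ProfileFollower
  include BaseTwitterManager

  def initialize(handle)
    @handle = handle
  end

  def call
    "following #{normalize_handle(@handle)}"
  end
end

class TweetReader
  include BaseTwitterManager

  def initialize(handle)
    @handle = handle
  end

  def call
    "reading tweets from #{normalize_handle(@handle)}"
  end
end

puts ProfileFollower.new('@Toptal').call # prints "following toptal"
puts TweetReader.new('toptal').call      # prints "reading tweets from toptal"
```

Each service still exposes a single public method. The shared helper is private, so it does not widen either object’s public interface.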

Rule 4: Handle Exceptions Inside the Service Object

For the umpteenth time: Service objects are single business actions. I can’t say this enough. If you’ve got a person that reads tweets, they’ll either give you the tweet, or say, “This tweet doesn’t exist.” Similarly, don’t let your service object panic, jump on your controller’s desk, and tell it to halt all work because “Error!” Just return false and let the controller move on from there.
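A minimal sketch of this rule might look like the following. TweetNotFound and FakeTwitterClient are hypothetical stand-ins for whatever error and client a real API wrapper would provide, and ApplicationService is again a simplified assumption.

```ruby
# Stand-in base class (an assumption for this sketch).
class ApplicationService
  def self.call(*args, &block)
    new(*args, &block).call
  end
end

# Hypothetical error and client, standing in for a real API wrapper.
class TweetNotFound < StandardError; end

class FakeTwitterClient
  TWEETS = { 1 => 'Hello, world' }.freeze

  def status(id)
    TWEETS.fetch(id) { raise TweetNotFound, "no tweet with id #{id}" }
  end
end

class TweetReader < ApplicationService
  def initialize(id, client = FakeTwitterClient.new)
    @id = id
    @client = client
  end

  def call
    @client.status(@id)
  rescue TweetNotFound
    # The expected failure is handled here and reported as a plain false.
    false
  end
end

p TweetReader.call(1)  # => "Hello, world"
p TweetReader.call(42) # => false
```

The sketch rescues only the one error it expects. Rescuing everything would also hide genuine bugs, so how wide to cast the net is a judgment call.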

Credits and Next Steps

This article wouldn’t have been possible without the amazing community of Ruby developers at Toptal. If I ever run into a problem, the community is the most helpful group of talented engineers I’ve ever met.

If you’re using service objects, you may find yourself wondering how to force certain answers while testing. I recommend reading this article on how to create mock service objects in RSpec that will always return the result you want, without actually hitting the service object!

If you want to learn more about Ruby tricks, I recommend Creating a Ruby DSL: A Guide to Advanced Metaprogramming by fellow Toptaler Máté Solymosi. He breaks down how the routes.rb file doesn’t feel like Ruby and helps you build your own DSL.

Trunk-based Development vs. Git Flow

In order to develop quality software, we need to be able to track all changes and reverse them if necessary. Version control systems fill that role by tracking project history and helping to merge changes made by multiple people. They greatly speed up work and give us the ability to find bugs more easily.

Moreover, working in distributed teams is possible mainly thanks to these tools. They enable several people to work on different parts of a project at the same time and later join their results into a single product. Let’s take a closer look at version control systems and explain how trunk-based development and Git flow came to being.

How Version Control Systems Changed the World

Before version control systems were created, people relied on manually backing up previous versions of projects. They were copying modified files by hand in order to incorporate the work of multiple developers on the same project.

It cost a lot of time, hard drive space, and money.

When we look at the history, we can broadly distinguish three generations of version control software.

Let’s take a look at them:

Generation | Operations            | Concurrency         | Networking  | Examples
First      | On a single file only | Locks               | Centralized | RCS
Second     | On multiple files     | Merge before commit | Centralized | Subversion, CVS
Third      | On multiple files     | Commit before merge | Distributed | Git, Mercurial

We notice that as version control systems mature, there is a tendency to increase the ability to work on projects in parallel.

One of the most groundbreaking changes was a shift from locking files to merging changes instead. It enabled programmers to work more efficiently.

Another considerable improvement was the introduction of distributed systems. Git was one of the first tools to incorporate this philosophy, and it helped the open-source world flourish. Git allows developers to copy the whole repository, by cloning it or (on hosting services such as GitHub) forking it, and to introduce the desired changes without affecting the original project.

Later, they can start a pull request in order to merge their changes into the original project. If the initial developer is not interested in incorporating those changes from other repositories, then they can turn them into separate projects on their own. It’s all possible thanks to the fact that there is no concept of central storage.

Development Styles

Nowadays, the most popular version control system is definitely Git, with a market share of about 70 percent in 2016.

Git was popularized with the rise of Linux and the open-source scene in general. GitHub, currently the most popular online storage for public projects, was also a considerable contributor to its prevalence. We owe the introduction of easy-to-manage pull requests to Git.

Put simply, a pull request is a request created by a software developer to combine their changes with the main project. It includes a review of those changes: reviewers can comment on anything they think could be improved or consider unnecessary.

After receiving feedback, the creator can respond to it, creating a discussion, or simply follow it and change their code accordingly.

Diagram of Git development style

Git is merely a tool. You can use it in many different ways. Currently, the two most popular development styles you can encounter are Git flow and trunk-based development. Quite often, people are familiar with one of those styles and neglect the other.

Let’s take a closer look at both of them and learn how and when we should use them.

Git Flow

In the Git flow development model, you have one main development branch with strict access to it. It’s often called the develop branch.

Developers create feature branches from this main branch and work on them. Once they are done, they create pull requests. In pull requests, other developers comment on changes and may have discussions, often quite lengthy ones.

It takes some time to agree on a final version of changes. Once it’s agreed upon, the pull request is accepted and merged to the main branch. Once it’s decided that the main branch has reached enough maturity to be released, a separate branch is created to prepare the final version. The application from this branch is tested and bug fixes are applied up to the moment that it’s ready to be published to final users. Once that is done, we merge the final product to the master branch and tag it with the release version. In the meantime, new features can be developed on the develop branch.

Below, you can see a Git flow diagram depicting the general workflow:

Git flow diagram depicting general workflow

One of the advantages of Git flow is strict control. Only authorized developers can approve changes after looking at them closely. It ensures code quality and helps eliminate bugs early.

However, you need to remember that it can also be a huge disadvantage. It creates a funnel slowing down software development. If speed is your primary concern, then it might be a serious problem. Features developed separately can create long-living branches that might be hard to combine with the main project.

What’s more, pull requests focus code review solely on new code. Instead of looking at code as a whole and working to improve it as such, they check only newly introduced changes. In some cases, they might lead to premature optimization since it’s always possible to implement something to perform faster.

Moreover, pull requests might lead to extensive micromanagement, where the lead developer literally manages every single line of code. If you have experienced developers you can trust, they can handle it, but you might be wasting their time and skills. It can also severely de-motivate developers.

In larger organizations, office politics during pull requests are another concern. It is conceivable that people who approve pull requests might use their position to purposefully block certain developers from making any changes to the code base. They could do this due to a lack of confidence, while some may abuse their position to settle personal scores.

Git Flow Pros and Cons

As you can see, doing pull requests might not always be the best choice. They should be used only where appropriate.

When Does Git Flow Work Best?

  • When you run an open-source project.
    This style comes from the open-source world and it works best there. Since everyone can contribute, you want to have very strict access to all the changes. You want to be able to check every single line of code, because frankly you can’t trust people contributing. Usually, those are not commercial projects, so development speed is not a concern.
  • When you have a lot of junior developers.
    If you work mostly with junior developers, then you want to have a way to check their work closely. You can give them multiple hints on how to do things more efficiently and help them improve their skills faster. People who accept pull requests have strict control over recurring changes so they can prevent deteriorating code quality.
  • When you have an established product.
    This style also seems to play well when you already have a successful product. In such cases, the focus is usually on application performance and load capabilities. That kind of optimization requires very precise changes. Usually, time is not a constraint, so this style works well here. What’s more, large enterprises are a great fit for this style. They need to control every change closely, since they don’t want to break their multi-million dollar investment.

When Can Git Flow Cause Problems?

  • When you are just starting up.
    If you are just starting up, then Git flow is not for you. Chances are you want to create a minimum viable product quickly. Doing pull requests creates a huge bottleneck that slows the whole team down dramatically. You simply can’t afford it. The problem with Git flow is the fact that pull requests can take a lot of time. It’s just not possible to provide rapid development that way.
  • When you need to iterate quickly.
    Once you reach the first version of your product, you will most likely need to pivot a few times to meet your customers’ needs. Again, multiple branches and pull requests reduce development speed dramatically and are not advised in such cases.
  • When you work mostly with senior developers.
    If your team consists mainly of senior developers who have worked with one another for a longer period of time, then you don’t really need the aforementioned pull request micromanagement. You trust your developers and know that they are professionals. Let them do their job and don’t slow them down with all the Git flow bureaucracy.

Trunk-based Development Workflow

In the trunk-based development model, all developers work on a single branch with open access to it. Often it’s simply the master branch. They commit code to it and run it. It’s super simple.

In some cases, they create short-lived feature branches. Once code on their branch compiles and passes all tests, they merge it straight to master. It ensures that development is truly continuous and prevents developers from creating merge conflicts that are difficult to resolve.

Let’s have a look at the trunk-based development workflow.

Trunk-based development diagram

The only way to review code in such an approach is to do full source code review. Usually, lengthy discussions are limited. No one has strict control over what is being modified in the source code base—that is why it’s important to have enforceable code style in place. Developers that work in such style should be experienced so that you know they won’t lower source code quality.

This style of work can be great when you work with a team of seasoned software developers. It enables them to introduce new improvements quickly and without unnecessary bureaucracy. It also shows them that you trust them, since they can introduce code straight into the master branch. Developers in this workflow are very autonomous: they deliver directly and are judged on final results in the working product. There is definitely much less micromanagement and less opportunity for office politics in this method.

If, on the other hand, you do not have a seasoned team or you don’t trust them for some reason, you shouldn’t go with this method—you should choose Git flow instead. It will save you unnecessary worries.

Pros and Cons of Trunk-based Development

Let’s take a closer look at both sides of the coin: the very best and very worst scenarios.

When Does Trunk-based Development Work Best?

  • When you are just starting up.
    If you are working on your minimum viable product, then this style is perfect for you. It offers maximum development speed with minimum formality. Since there are no pull requests, developers can deliver new functionality at the speed of light. Just be sure to hire experienced programmers.
  • When you need to iterate quickly.
    Once you have reached the first version of your product and noticed that your customers want something different, don’t think twice: use this style to pivot in a new direction. You are still in the exploration phase and you need to be able to change your product as fast as possible.
  • When you work mostly with senior developers.
    If your team consists mainly of senior developers, then you should trust them and let them do their job. This workflow gives them the autonomy that they need and enables them to wield their mastery of their profession. Just give them purpose (tasks to accomplish) and watch how your product grows.

When Can Trunk-based Development Cause Problems?

  • When you run an open-source project.
    If you are running an open-source project, then Git flow is the better option. You need very strict control over changes and you can’t trust contributors. After all, anyone can contribute. Including online trolls.
  • When you have a lot of junior developers.
    If you hire mostly junior developers, then it’s a better idea to tightly control what they are doing. Strict pull requests will help them to improve their skills and will help you find potential bugs more quickly.
  • When you have an established product or manage large teams.
    If you already have a prosperous product or manage large teams at a huge enterprise, then Git flow might be a better idea. You want to have strict control over what is happening with a well-established product worth millions of dollars. Probably, application performance and load capabilities are the most important things. That kind of optimization requires very precise changes.

Use the Right Tool for the Right Job

As I said before, Git is just a tool. Like every other tool, it needs to be used appropriately.

Git flow manages all changes through pull requests. It provides strict access control to all changes. It’s great for open-source projects, large enterprises, companies with established products, or a team of inexperienced junior developers. You can safely check what is being introduced into the source code. On the other hand, it might lead to extensive micromanagement, disputes involving office politics, and significantly slower development.

Trunk-based development gives programmers full autonomy and expresses more faith in them and their judgement. Access to source code is free, so you really need to be able to trust your team. It provides excellent software development speed and reduces processes. These factors make it perfect when creating new products or pivoting an existing application in an all-new direction. It works wonders if you work mostly with experienced developers.

Still, if you work with junior programmers or people you don’t fully trust, Git flow is a much better alternative.

Equipped with this knowledge, I hope you will be able to choose the workflow that perfectly matches your project.

UNDERSTANDING THE BASICS

What is a trunk in software?

In the world of software development, “trunk” means the main development branch under a version control system. It’s the base of a project, where all improvements are merged together.

Originally written by Konrad Gadzinowski, JavaScript developer for Toptal.