Trunk-based Development vs. Git Flow

In order to develop quality software, we need to be able to track all changes and reverse them if necessary. Version control systems fill that role by tracking project history and helping to merge changes made by multiple people. They greatly speed up work and give us the ability to find bugs more easily.

Moreover, working in distributed teams is possible mainly thanks to these tools. They enable several people to work on different parts of a project at the same time and later join their results into a single product. Let’s take a closer look at version control systems and explain how trunk-based development and Git flow came into being.

How Version Control Systems Changed the World

Before version control systems were created, people relied on manually backing up previous versions of projects. Developers copied modified files by hand in order to incorporate the work of multiple people into the same project.

It cost a lot of time, hard drive space, and money.

When we look at the history, we can broadly distinguish three generations of version control software.

Let’s take a look at them:

Generation | Operations             | Concurrency          | Networking  | Examples
First      | On a single file only  | Locks                | Centralized | RCS
Second     | On multiple files      | Merge before commit  | Centralized | Subversion, CVS
Third      | On multiple files      | Commit before merge  | Distributed | Git, Mercurial

We notice that as version control systems mature, there is a tendency to increase the ability to work on projects in parallel.

One of the most groundbreaking changes was a shift from locking files to merging changes instead. It enabled programmers to work more efficiently.

Another considerable improvement was the introduction of distributed systems. Git was one of the first tools to incorporate this philosophy, and it enabled the open-source world to flourish. Git allows developers to copy the whole repository, in an operation called forking, and introduce the desired changes without needing to worry about merge conflicts.

Later, they can start a pull request in order to merge their changes into the original project. If the original developer is not interested in incorporating changes from other repositories, the forks’ owners can continue them as separate projects of their own. It’s all possible thanks to the fact that there is no concept of central storage.

Development Styles

Nowadays, the most popular version control system is definitely Git, with a market share of about 70 percent in 2016.

Git was popularized with the rise of Linux and the open-source scene in general. GitHub, currently the most popular online storage for public projects, was also a considerable contributor to its prevalence. We owe the introduction of easy-to-manage pull requests to GitHub.

Put simply, a pull request is a request created by a software developer to merge the changes they have made into the main project. It includes a process of reviewing those changes: reviewers can insert comments on every bit they think could be improved or see as unnecessary.

After receiving feedback, the creator can respond to it, creating a discussion, or simply follow it and change their code accordingly.

Diagram of Git development style

Git is merely a tool. You can use it in many different ways. Currently, the two most popular development styles you can encounter are Git flow and trunk-based development. Quite often, people are familiar with one of those styles and neglect the other one.

Let’s take a closer look at both of them and learn how and when we should use them.

Git Flow

In the Git flow development model, you have one main development branch with strict access to it. It’s often called the develop branch.

Developers create feature branches from this main branch and work on them. Once they are done, they create pull requests. In pull requests, other developers comment on changes and may have discussions, often quite lengthy ones.

It takes some time to agree on a final version of changes. Once it’s agreed upon, the pull request is accepted and merged to the main branch. Once it’s decided that the main branch has reached enough maturity to be released, a separate branch is created to prepare the final version. The application from this branch is tested and bug fixes are applied up to the moment that it’s ready to be published to final users. Once that is done, we merge the final product to the master branch and tag it with the release version. In the meantime, new features can be developed on the develop branch.

Below, you can see a Git flow diagram depicting the general workflow:

Git flow diagram depicting the general workflow
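To make the steps concrete, here is a minimal sketch of the Git commands behind this workflow; the feature/login and release/1.0 branch names are hypothetical:

git checkout develop
git checkout -b feature/login        # start a feature branch
# ...commit work, push, and open a pull request...
git push -u origin feature/login

# after the pull request is approved:
git checkout develop
git merge --no-ff feature/login      # the feature lands in develop

# preparing and publishing a release:
git checkout -b release/1.0 develop
# ...test and apply bug fixes on release/1.0...
git checkout master
git merge --no-ff release/1.0
git tag -a 1.0 -m "Release 1.0"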

One of the advantages of Git flow is strict control. Only authorized developers can approve changes after looking at them closely. It ensures code quality and helps eliminate bugs early.

However, you need to remember that it can also be a huge disadvantage. It creates a funnel slowing down software development. If speed is your primary concern, then it might be a serious problem. Features developed separately can create long-living branches that might be hard to combine with the main project.

What’s more, pull requests focus code review solely on new code. Instead of looking at the code as a whole and working to improve it as such, reviewers check only newly introduced changes. In some cases, this can lead to premature optimization, since it’s always possible to implement something that performs faster.

Moreover, pull requests might lead to extensive micromanagement, where the lead developer literally manages every single line of code. If you have experienced developers you can trust, they can handle it, but you might be wasting their time and skills. It can also severely de-motivate developers.

In larger organizations, office politics during pull requests are another concern. It is conceivable that people who approve pull requests might use their position to purposefully block certain developers from making any changes to the code base. They could do this due to a lack of confidence, while some may abuse their position to settle personal scores.

Git Flow Pros and Cons

As you can see, doing pull requests might not always be the best choice. They should be used only where appropriate.

When Does Git Flow Work Best?

  • When you run an open-source project.
    This style comes from the open-source world and it works best there. Since everyone can contribute, you want to have very strict access to all the changes. You want to be able to check every single line of code, because frankly you can’t trust people contributing. Usually, those are not commercial projects, so development speed is not a concern.
  • When you have a lot of junior developers.
    If you work mostly with junior developers, then you want to have a way to check their work closely. You can give them multiple hints on how to do things more efficiently and help them improve their skills faster. People who accept pull requests have strict control over incoming changes, so they can prevent code quality from deteriorating.
  • When you have an established product.
    This style also seems to play well when you already have a successful product. In such cases, the focus is usually on application performance and load capabilities. That kind of optimization requires very precise changes. Usually, time is not a constraint, so this style works well here. What’s more, large enterprises are a great fit for this style. They need to control every change closely, since they don’t want to break their multi-million dollar investment.

When Can Git Flow Cause Problems?

  • When you are just starting up.
    If you are just starting up, then Git flow is not for you. Chances are you want to create a minimum viable product quickly. Doing pull requests creates a huge bottleneck that slows the whole team down dramatically. You simply can’t afford it. The problem with Git flow is the fact that pull requests can take a lot of time. It’s just not possible to provide rapid development that way.
  • When you need to iterate quickly.
    Once you reach the first version of your product, you will most likely need to pivot it a few times to meet your customers’ needs. Again, multiple branches and pull requests reduce development speed dramatically and are not advised in such cases.
  • When you work mostly with senior developers.
    If your team consists mainly of senior developers who have worked with one another for a longer period of time, then you don’t really need the aforementioned pull request micromanagement. You trust your developers and know that they are professionals. Let them do their job and don’t slow them down with all the Git flow bureaucracy.

Trunk-based Development Workflow

In the trunk-based development model, all developers work on a single branch with open access to it. Often it’s simply the master branch. They commit code to it and run it. It’s super simple.

In some cases, they create short-lived feature branches. Once code on their branch compiles and passes all tests, they merge it straight to master. This ensures that development is truly continuous and prevents developers from creating merge conflicts that are difficult to resolve.

Let’s have a look at the trunk-based development workflow.

Trunk-based development diagram
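For comparison, here is a minimal sketch of the commands in this workflow, assuming a hypothetical short-lived branch fix/signup-typo:

# committing straight to master:
git checkout master
git pull
git commit -am "Improve signup validation"
git push

# or via a short-lived feature branch:
git checkout -b fix/signup-typo
git commit -am "Fix typo on signup form"
git checkout master
git merge fix/signup-typo
git push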

The only way to review code in this approach is to do a full source code review. Usually, lengthy discussions are limited. No one has strict control over what is being modified in the source code base—that is why it’s important to have an enforceable code style in place. Developers who work in this style should be experienced, so that you know they won’t lower source code quality.

This style of work can be great when you work with a team of seasoned software developers. It enables them to introduce new improvements quickly and without unnecessary bureaucracy. It also shows them that you trust them, since they can introduce code straight into the master branch. Developers in this workflow are very autonomous—they are delivering directly and are checked on final results in the working product. There is definitely much less micromanagement and possibility for office politics in this method.

If, on the other hand, you do not have a seasoned team or you don’t trust them for some reason, you shouldn’t go with this method—you should choose Git flow instead. It will save you unnecessary worries.

Pros and Cons of Trunk-based Development

Let’s take a closer look at both sides of the coin: the very best and the very worst scenarios.

When Does Trunk-based Development Work Best?

  • When you are just starting up.
    If you are working on your minimum viable product, then this style is perfect for you. It offers maximum development speed with minimum formality. Since there are no pull requests, developers can deliver new functionality at the speed of light. Just be sure to hire experienced programmers.
  • When you need to iterate quickly.
    Once you’ve reached the first version of your product and noticed that your customers want something different, don’t think twice: use this style to pivot in a new direction. You are still in the exploration phase and you need to be able to change your product as fast as possible.
  • When you work mostly with senior developers.
    If your team consists mainly of senior developers, then you should trust them and let them do their job. This workflow gives them the autonomy that they need and enables them to wield their mastery of their profession. Just give them purpose (tasks to accomplish) and watch how your product grows.

When Can Trunk-based Development Cause Problems?

  • When you run an open-source project.
    If you are running an open-source project, then Git flow is the better option. You need very strict control over changes and you can’t trust contributors. After all, anyone can contribute. Including online trolls.
  • When you have a lot of junior developers.
    If you hire mostly junior developers, then it’s a better idea to tightly control what they are doing. Strict pull requests will help them improve their skills and find potential bugs more quickly.
  • When you have an established product or manage large teams.
    If you already have a prosperous product or manage large teams at a huge enterprise, then Git flow might be a better idea. You want to have strict control over what is happening with a well-established product worth millions of dollars. Application performance and load capabilities are probably the most important concerns, and that kind of optimization requires very precise changes.

Use the Right Tool for the Right Job

As I said before, Git is just a tool. Like every other tool, it needs to be used appropriately.

Git flow manages all changes through pull requests. It provides strict access control to all changes. It’s great for open-source projects, large enterprises, companies with established products, or a team of inexperienced junior developers. You can safely check what is being introduced into the source code. On the other hand, it might lead to extensive micromanagement, disputes involving office politics, and significantly slower development.

Trunk-based development gives programmers full autonomy and expresses more faith in them and their judgement. Access to source code is free, so you really need to be able to trust your team. It provides excellent software development speed and reduces processes. These factors make it perfect when creating new products or pivoting an existing application in an all-new direction. It works wonders if you work mostly with experienced developers.

Still, if you work with junior programmers or people you don’t fully trust, Git flow is a much better alternative.

Equipped with this knowledge, I hope you will be able to choose the workflow that perfectly matches your project.

UNDERSTANDING THE BASICS

What is a trunk in software?

In the world of software development, “trunk” means the main development branch in a version control system. It’s the base of a project, where all improvements are merged together.

Originally written by Konrad Gadzinowski, JavaScript developer for Toptal.

Seven Common Pitfalls to Avoid When Hiring a Freelance AngularJS Specialist

Hiring a freelancer for Angular jobs can be a scary undertaking, especially when filling a hole in your team’s existing skill set. Whether you’re hiring a freelancer to take ownership of an existing AngularJS web development project, to pave the way with a new greenfield project, or to augment an existing team, you’ll need to know what to look for and what to avoid.

Pitfall No. 1: AngularJS vs. Angular

It's important to know the difference between Angular and AngularJS.

It’s just “Angular.”

Even though it sounds straightforward, not all “Angulars” are created equal.

The team that built Angular has specified in its Branding Guidelines for Angular and AngularJS that “AngularJS” should be used when referring to versions 1.x, and “Angular”—without the “JS”—should be used when referring to versions 2+. That means even Angular 4 is just referred to as “Angular.”

Why does this matter?

It’s important for you and your freelancer to be on the same page and use the right name. While AngularJS and Angular may sound similar, they are in fact distinct frameworks. And just as you wouldn’t expect a React specialist or a Vue.js specialist to hit the ground running with your Angular app, you shouldn’t expect an AngularJS specialist to be an expert in Angular, or vice versa. This isn’t to say they can’t take it on—they’ll just require more ramp-up time.

When hiring for an existing project, be sure to know if you need an AngularJS or Angular specialist. If you’re planning a new project, use Angular!

Pitfall No. 2: Hiring a Developer Who Isn’t Fluent in TypeScript

Angular was written in TypeScript, and it is by far the preferred language for Angular apps. This means that the ecosystem (e.g., libraries and documentation) around Angular is predominantly written in TypeScript.

When hiring an Angular expert, you’ll want to make sure that you’re hiring someone who knows TypeScript and can take full advantage of its amazing features. They should be familiar with tools like Atom and VSCode, which support TypeScript and will highlight errors and provide autocompletion.

Hiring an Angular specialist means hiring a TypeScript specialist, so test their chops!
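As a quick illustration of what that buys you, consider this hypothetical snippet; the TypeScript compiler rejects the last call before the code ever runs:

interface User {
  id: number;
  name: string;
}

function greet(user: User): string {
  return `Hello, ${user.name}!`;
}

greet({ id: 1, name: 'Ada' }); // OK
greet({ id: 2 });              // compile-time error: property 'name' is missing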

Pitfall No. 3: Lead or Follow?

Are you looking for someone to augment your existing Angular team? Maintain an existing application? Lead or bootstrap a new project?

An Angular lead should know how to set up a new project. This is an incredibly important part of your project lifecycle.

The answers to these questions will help you determine how much Angular experience your specialist will need to have. As with other frameworks, the skill and experience required to be productive in an established codebase is much lower than what is required to bootstrap a new project. If you don’t need an Angular lead, then hiring someone with React, AngularJS, or great JavaScript experience may suffice, although they will require some learning. If you need an Angular lead, or someone to bootstrap a new project, you’ll want to make sure that your specialist is up to the task.

A professional Angular lead should know how to set up a new project. This is an incredibly important part of your project lifecycle! Think of it like a building—you wouldn’t want to build a skyscraper on top of a shaky foundation. Likewise, your Angular lead will be setting up the foundation for themselves and all future developers working on your project, so it needs to be rock-solid.

A good setup will:

  • Follow best practices (for Angular or AngularJS).
  • Reduce bugs.
  • Make it obvious how to add new features and extend your application.

When hiring a lead, make sure to ask them about best practices, directory structure, and how to set up a single page application (since it requires special routing).

Pitfall No. 4: Your Angular Specialist Doesn’t Really Know Angular

You wouldn’t hire a chef without tasting their food, and you shouldn’t hire someone for Angular or AngularJS development without testing their Angular knowledge. (A great starting point for this is our list of AngularJS interview questions.) Both Angular and AngularJS code come with their own set of peculiarities that you’ll want to talk about.

Data Binding and Component Communication

An Angular specialist should know their way around data binding and component communication.

An AngularJS expert in particular should know the different ways to pass data to a component:

  • @ for raw text
  • & for a function
  • = for two-way data binding
  • =? for optional two-way parameters

Conversely, an Angular specialist should know when to use:

  • [property] binding
  • (event) binding
  • [(two-way)] binding

Your specialist should also be able to tell you how to do parent-child or child-parent component communication, for Angular or AngularJS.
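In Angular, for example, this communication usually runs through @Input and @Output. A minimal sketch with a hypothetical counter component:

import { Component, EventEmitter, Input, Output } from '@angular/core';

@Component({
  selector: 'app-counter',
  template: `<button (click)="increment()">{{ label }}: {{ count }}</button>`
})
export class CounterComponent {
  @Input() label = 'Count';                           // parent -> child
  @Input() count = 0;
  @Output() countChange = new EventEmitter<number>(); // child -> parent

  increment() {
    this.countChange.emit(this.count + 1);
  }
}

// In a parent template:
// <app-counter [count]="value" (countChange)="value = $event"></app-counter>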

Services, Directives, and Pipes

Your Angular specialist should be able to explain to you what services are (hint: they’re singletons!), and when to use them. Services are a great way to provide common utilities to many components, simplify components by pulling out complex logic, and share state throughout your app. Angular makes it easy to control the scope of this shared state through the use of providers (e.g., app-, module-, or component-level state).

An Angular specialist should also know when to use directives and how to set them up. Directives are an amazing way to extend HTML by attaching custom behavior to elements in the DOM. For example, you could set up a directive to add on-hover tooltips to an element, set up hotkey event handling, or register when a user clicks outside of your element (to close a dropdown, for example).

Any non-trivial application will most likely have its own custom pipes, so your specialist will need to be versed in these, too. Pipes (or filters for AngularJS) are specifically used to transform your displayed data. Angular comes with many built-in pipes, and AngularJS comes with many built-in filters. Ask your specialist about these handy tools, and make sure they won’t repeat the same transformations across the app when they could use pipes or filters!
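As a concrete (and hypothetical) example of the kind of pipe a candidate might write:

import { Pipe, PipeTransform } from '@angular/core';

// Template usage: {{ description | truncate:30 }}
@Pipe({ name: 'truncate' })
export class TruncatePipe implements PipeTransform {
  transform(value: string, max = 20): string {
    return value.length > max ? value.slice(0, max) + '…' : value;
  }
}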

Promises and Observables

While not strictly Angular-specific, promises and observables are paradigms that are common in the Angular world, and your specialist should be familiar with these as well.

Thanks to promises, we no longer have to live in fear of callback hell, and your specialist should know when and how to use them (such as wrapping REST API requests). Additionally, Angular introduces the use of ReactiveX’s Observables, which provide an awesome way to stream data.
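A short sketch of both paradigms in TypeScript; the /api/results endpoint is made up:

import { interval } from 'rxjs';
import { map, take } from 'rxjs/operators';

// Promises: async/await keeps callback hell at bay
async function loadResults(): Promise<number[]> {
  const res = await fetch('/api/results'); // hypothetical endpoint
  return res.json();
}

// Observables: streams you can transform before subscribing
interval(1000)                      // emit 0, 1, 2, ... once per second
  .pipe(take(3), map(n => n * 2))
  .subscribe(n => console.log(n));  // logs 0, 2, 4, then completes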

Pitfall No. 5: Not Doing a Code Review

You can talk the talk, but can you walk the walk?

So, your prospective specialist sounds like they know what they’re talking about, but can they actually break down a problem and write quality code?

Do a code walkthrough of some of their existing code that they can share with you. It doesn’t need to be perfect (but if it isn’t, they should be able to explain to you how they’d improve it). Additionally—or if they don’t have any open source code to share—have them code an example component within your problem domain (e.g., a checkout shopping cart, a web form for teachers to add lesson plans, or a to-do list). Alternatively, you can set up some example code and have them explain it and identify bugs and cleanups.

Checking their code can really give you an insight into not only their competency, but also their style. Good style goes a long way in keeping code maintainable and bug-free, and is just a good general indication of their seniority.

Things to look for:

  • They follow best practices (for Angular or AngularJS).
  • Consistency in their style (casing, format, etc.).
  • They use TypeScript for Angular.
  • They can explain how their code works and defend their decisions.

Read up on good code, common JavaScript mistakes, and common AngularJS mistakes. And if you’re hiring someone who has yet to be vetted, you should also test their general programming skills (there’s a reason FizzBuzz weeds out so many freelancers).

Pitfall No. 6: Proceeding without a Testing Strategy

Tests are an essential part of every code base. They’re like a warm, snuggly security blanket for your engineers, giving them confidence that they aren’t breaking anything and costing the company money. Good tests and a good testing strategy will boost your technical wealth, while bad tests, or lack of strategy, will be a constant source of frustration and major code debt.

A good freelancer will advocate for tests and understand their benefits:

  • Guarding against regressions (preventing “What do you mean users can’t sign up anymore!?”).
  • Acting as codified documentation of your codebase, making it easier for other developers to understand, maintain, and extend it.
  • Validating functionality and preventing bugs in pesky edge cases.

If you don’t understand testing, you’ll likely fall into the “We need tests!” trap. This can lead you to hire someone who doesn’t truly understand tests, but will happily write tons of less-than-useful or incredibly fragile tests.

When considering Angular consulting, you’ll want to explore your potential hire’s understanding of tests and determine how they’d go about testing your app.

Things to look for:

  • They understand the fragile nature of front-end testing and how to use constructs like page objects to DRY up test upkeep in the face of template changes and refactorings.
  • They can explain how AngularJS’s digest cycle works, or how Angular’s asynchronous change detection works, and how that impacts testing. (Hint: You need to explicitly resolve asyncs or use wrapping functions to wait for them.)
  • Mocking! They should know how to use spies and stubs/test-doubles in order to isolate tests and remove their dependence on any network calls.
  • An Angular specialist will know that services and pipes are ripe for unit testing. Components are also unit-testable, but with a bit more boilerplate. This is why it is recommended to move complex logic into a service.
  • End to end (E2E) tests will depend on your back-end framework, but an Angular specialist should know about Protractor (although other tools like Nightwatch.js will also work).

To aid in your probing of their abilities, you could provide an example component, service, or directive and ask them what they’d test—maybe even have them write up the “it should (blank)” descriptions of all of the tests they’d write for it, and also write one of them up.
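For instance, against the hypothetical TruncatePipe sketched earlier, a Jasmine-style spec might look like this:

describe('TruncatePipe', () => {
  const pipe = new TruncatePipe();

  it('should leave short strings unchanged', () => {
    expect(pipe.transform('short')).toBe('short');
  });

  it('should truncate strings longer than the limit', () => {
    expect(pipe.transform('abcdefghijkl', 10)).toBe('abcdefghij…');
  });
});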

When hiring a professional Angular specialist, don’t superficially ask about tests. Instead, explore their understanding of what to test and how to test it.

Pitfall No. 7: Having Only Non-Developers Interview Your Freelancer

When hiring a freelance developer for Angular(JS) web development, you’ll want to make sure that a developer interviews them. Just because a freelancer is confident, it doesn’t mean they are competent, and a non-developer has a higher risk of making a costly mis-hire. A good developer will be able to recognize someone who knows what they are talking about. Your developer should also validate that the freelancer can walk the walk, through interview questions and challenges.

If you don’t have a senior developer, you can ask a friend or stick with vetted developers.

This Up-Front Effort Will Save You Time and Money in the Long Run

Exploring AngularJS development services can seem like a difficult, opaque, and potentially costly process. After all, if you’re looking for a freelancer to contribute to your existing project or team, it’s incredibly important to find someone who is a good fit and whose chops are up to par. And if you’re building a new project from scratch, in many ways, your project’s future success will depend upon the early-stage decisions made by your specialist.

But don’t panic. By taking the precautions discussed above, you can ensure not only that you’ll be hiring a skilled developer, but also that your project will be on the right track to succeed and to take advantage of all the powerful features that Angular has to offer.

This article was originally published on Toptal.

A Guide to Process-oriented Programming in Elixir and OTP

People like to categorize programming languages into paradigms. There are object-oriented (OO) languages, imperative languages, functional languages, etc. This can be helpful in figuring out which languages solve similar problems, and what types of problems a language is intended to solve.

In each case a paradigm generally has one “main” focus and technique that is the driving force for that family of languages:

  • In OO languages, it is the class or object as a way to encapsulate state (data) with manipulation of that state (methods).
  • In functional languages, it can be the manipulation of functions themselves or the immutable data passed from function to function.

While Elixir (and Erlang before it) is often categorized as a functional language because it exhibits the immutable data common to functional languages, I would submit that it represents a separate paradigm from many functional languages. These languages exist and are adopted because of OTP, and so I would categorize them as process-oriented languages.

In this post, we will capture the meaning of what process-oriented programming is when using these languages, explore the differences and similarities to other paradigms, see the implications for both training and adoption, and end with a short process-oriented programming example.

What Is Process-oriented Programming?

Let’s start with a definition: Process-oriented programming is a paradigm based on Communicating Sequential Processes, originally described in a paper by Tony Hoare in 1977. This is also popularly called the actor model of concurrency. Other languages with some relation to this original work include Occam, Limbo, and Go. The formal paper deals only with synchronous communication; most actor models (including OTP) use asynchronous communication as well. It is always possible to build synchronous communication on top of asynchronous communication, and OTP supports both forms.
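To ground the terminology, here is a minimal Elixir sketch of two communicating processes. The send itself is asynchronous; blocking on the reply is what makes the exchange synchronous:

# spawn a process that adds two numbers and replies to the sender
pid = spawn(fn ->
  receive do
    {:add, a, b, from} -> send(from, {:result, a + b})
  end
end)

send(pid, {:add, 1, 2, self()}) # asynchronous: returns immediately

receive do
  {:result, sum} -> IO.puts("sum = #{sum}") # synchronous: we wait for the reply
end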

Building on this history, OTP created a system for fault-tolerant computing out of communicating sequential processes. The fault-tolerance facilities come from a “let it fail” approach with solid error recovery in the form of supervisors, plus the distributed processing enabled by the actor model. “Let it fail” can be contrasted with “prevent it from failing”: the former is far easier to accommodate and has proven in OTP to be far more reliable than the latter. The reason is that the programming effort required to prevent failures (as shown in the Java checked exception model) is much more involved and demanding.
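In Elixir, “let it fail” is embodied by supervisors. A minimal sketch, with hypothetical module names:

defmodule Example.Supervisor do
  use Supervisor

  def start_link(arg) do
    Supervisor.start_link(__MODULE__, arg, name: __MODULE__)
  end

  @impl true
  def init(_arg) do
    # if Example.Worker crashes, the supervisor simply restarts it
    children = [
      {Example.Worker, []}
    ]

    Supervisor.init(children, strategy: :one_for_one)
  end
end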

So, process-oriented programming can be defined as a paradigm in which the process structure and communication between processes of a system are the primary concerns.

Object-oriented vs. Process-oriented Programming

In object-oriented programming, the static structure of data and function is the primary concern: what methods are required to manipulate the enclosed data, and what the connections between objects or classes should be. Thus, the class diagram of UML is a prime example of this focus, as seen in Figure 1.

Process-oriented programming: Sample UML class diagram

It can be noted that a common criticism of object-oriented programming is that there is no visible control flow. Because systems are composed from a large number of classes/objects defined separately, it can be difficult for a less experienced person to visualize the control flow of a system. This is especially true for systems with a lot of inheritance, which use abstract interfaces or have no strong typing. In most cases, it becomes important for the developer to memorize a large amount of the system structure to be effective (what classes have what methods and which are used in what ways).

The strength of the object-oriented development approach is that the system can be extended to support new types of objects with limited impact on existing code, so long as the new object types conform to the expectations of the existing code.

Functional vs. Process-oriented Programming

Many functional programming languages do address concurrency in various ways, but their primary focus is immutable data passing between functions, or the creation of functions from other functions (higher order functions that generate functions). For the most part, the focus of the language is still a single address space or executable, and communications between such executables are handled in an operating system specific manner.

For example, Scala is a functional language built on the Java Virtual Machine. While it can access Java facilities for communication, they are not an inherent part of the language. And while Scala is a common language for Spark programming, Spark is again a library used in conjunction with the language.

A strength of the functional paradigm is the ability to visualize the control flow of a system given the top-level function. The control flow is explicit in that each function calls other functions, and passes all the data from one to the next. In the functional paradigm there are no side effects, which makes problem determination easier. The challenge with pure functional systems is that “side effects” are required to have persistent state. In well-architected systems, the persisting of state is handled at the top level of the control flow, allowing most of the system to be side-effect free.

Elixir/OTP and Process-oriented Programming

In Elixir/Erlang and OTP, the communication primitives are part of the virtual machine that executes the language. The ability to communicate between processes and between machines is built in and central to the language system. This emphasizes the importance of communication in this paradigm and in these language systems.

While the Elixir language is predominantly functional in terms of the logic expressed in the language, its use is process oriented.

What Does It Mean to Be Process-oriented?

To be process-oriented as defined in this post is to design a system first in terms of what processes exist and how they communicate. The main questions are: Which processes are static and which are dynamic? Which are spawned on demand in response to requests, and which serve a long-running purpose? Which hold the shared state, or part of the shared state, of the system? Which features of the system are inherently concurrent? Just as OO has types of objects, and functional has types of functions, process-oriented programming has types of processes.

As such, a process-oriented design is the identification of the set of process types required to solve a problem or address a need.

The aspect of time enters quickly into the design and requirements efforts. What is the lifecycle of the system? Which customer needs are occasional and which are constant? Where is the load in the system, and what are the expected velocity and volume? It is only after these types of considerations are understood that a process-oriented design begins to define the function of each process or the logic to be executed.

Training Implications

The implication of this categorization to training is that training should begin not with language syntax or “Hello World” examples, but with systems engineering thinking and a design focus on process allocation.

The coding concerns are secondary to the process design and allocation, which are best addressed at a higher level and involve cross-functional thinking about lifecycle, QA, DevOps, and customer business requirements. Any training course in Elixir or Erlang must (and generally does) include OTP, and should have a process orientation from the beginning, rather than a “Now you can code in Elixir, so let’s do concurrency” type of approach.

Adoption Implications

The implication for adoption is that the language and system is better applied to problems that require communication and/or distribution of computing. Problems that are single workload on a single computer are less interesting in this space, and may be better addressed with another language. Long-lived continuous processing systems are a prime target for this language because it has fault tolerance built in from the ground up.

For documentation and design work, it can be very helpful to use a graphical notation (like figure 1 for OO languages). The suggestion for Elixir and process-oriented programming from UML would be the sequence diagram (example in figure 2) to show temporal relationships between processes and identify which processes are involved in servicing a request. There is not a UML diagram type for capturing life-cycle and process structure, but it could be represented with a simple box and arrow diagram for process types and their relationships. For example, Figure 3:

Process-oriented programming sample UML sequence diagram

Process-oriented programming sample process structure diagram

An Example of Process Orientation

Finally, we will walk through a short example of applying process orientation to a problem. Suppose we are tasked with providing a system that supports global elections. This problem is chosen because many individual activities are performed in bursts, while the aggregation or summarization of the results is desired in real time and might see significant load.

Initial Process Design and Allocation

We can initially see that the casting of votes by each individual is a burst of traffic to the system from many discrete inputs, is not time ordered, and can have high load. To support this activity, we would want a large number of processes all collecting these inputs and forwarding them to a more central process for tabulation. These processes could be located near the populations in each country that would be generating votes, and thus provide low latency. They would retain local results, log their inputs immediately, and forward them for tabulation in batches to reduce bandwidth and overhead.

We can initially see that there will need to be processes that track the votes in each jurisdiction in which results must be presented. Let’s assume for this example that we need to track results for each country, and within each country by province/state. To support this activity, we would want at least one process per country performing the computation, and retaining the current totals, and another set for each state/province in each country. This assumes we need to be able to answer totals for country and state/province in real time or low latency. If the results can be obtained from a database system, we might choose a different process allocation where totals are updated by transient processes. The advantage of using dedicated processes for these computations is that the results occur at the speed of memory and can be obtained with low latency.

Finally, we can see that lots and lots of people will be viewing the results. These processes can be partitioned in many ways. We may want to distribute the load by placing processes in each country responsible for that country’s results. The processes could cache the results from the computation processes to reduce query load on the computation processes, and/or the computation processes could push their results to the proper results processes on a periodic basis, when results change by a significant amount, or upon the computation process becoming idle indicating a slowed rate of change.

In all three process types, we can scale the processes independently of each other, distribute them geographically, and ensure results are never lost through active acknowledgement of data transfers between processes.

As discussed, we have begun the example with a process design independent of the business logic in each process. Where the business logic has specific requirements for data aggregation or geography, those requirements can feed back into the process allocation iteratively. Our process design so far is shown in figure 4.

Process-oriented development example: Initial process design

The use of separate processes to receive votes allows each vote to be received independent of any other vote, logged upon receipt, and batched to the next set of processes, reducing load on those systems significantly. For a system that consumes a large amount of data, reducing the volume of data by use of layers of processes is a common and useful pattern.

By performing the computation in an isolated set of processes, we can manage the load on those processes and ensure their stability and resource requirements.

By placing the result presentation in an isolated set of processes, we both control load to the rest of the system and allow the set of processes to be scaled dynamically for load.

Additional Requirements

Now, let’s add some complicating requirements. Let’s suppose that in each jurisdiction (country or state), the tabulation of votes can result in a proportional result, a winner-takes-all result, or no result if insufficient votes are cast relative to the population of that jurisdiction. Each jurisdiction has control over these aspects. With this change, the results for countries are not a simple aggregation of the raw vote results, but an aggregation of the state/province results. This changes the original process allocation to require that results from the state/province processes feed into the country processes. If the protocol used between the vote collection and state/province processes is the same as the one between the state/province and country processes, then the aggregation logic can be reused, but distinct processes holding the results are needed and their communication paths are different, as shown in Figure 5.

Process-oriented development example: Modified process design

The Code

To complete the example, we will review an implementation of it in Elixir/OTP. To simplify things, this example assumes a web server like Phoenix is used to process actual web requests, and those web services make requests to the processes identified above. This keeps the example simple and the focus on Elixir/OTP. In a production system, having these be separate processes has advantages as well: it separates concerns, allows flexible deployment, distributes load, and reduces latency. The full source code with tests can be found at https://github.com/technomage/voting. The source is abbreviated in this post for readability. Each process below fits into an OTP supervision tree to ensure that processes are restarted on failure. See the source for more on this aspect of the example.

Vote Recorder

This process receives votes, logs them to a persistent store, and batches the results to the aggregators. The VoteRecorder module uses Task.Supervisor to manage short-lived tasks that record each vote.

defmodule Voting.VoteRecorder do
  @moduledoc """
  This module receives votes and sends them to the proper
  aggregator. This module uses supervised tasks to ensure
  that any failure is recovered from and the vote is not
  lost.
  """

  @doc """
  Start a task to track the submittal of a vote to an
  aggregator. This is a supervised task to ensure
  completion.
  """
  def cast_vote(where, who) do
    Task.Supervisor.async_nolink(Voting.VoteTaskSupervisor,
      fn ->
        Voting.Aggregator.submit_vote where, who
      end)
    |> Task.await
  end
end
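Casting a vote is then a single call; the jurisdiction and candidate values here are hypothetical:

Voting.VoteRecorder.cast_vote("CA-ON", "candidate_a")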

Vote Aggregator

This process aggregates votes within a jurisdiction, computes the result for that jurisdiction, and forwards vote summaries to the next higher process (a higher level jurisdiction, or a result presenter).

defmodule Voting.Aggregator do
  use GenStage
  ...

  @doc """
  Submit a single vote to an aggregator
  """
  def submit_vote id, candidate do
    pid = __MODULE__.via_tuple(id)
    :ok = GenStage.call pid, {:submit_vote, candidate}
  end

  @doc """
  Respond to requests
  """
  def handle_call {:submit_vote, candidate}, _from, state do
    n = state.votes[candidate] || 0
    state = %{state | votes: Map.put(state.votes, candidate, n+1)}
    {:reply, :ok, [%{state.id => state.votes}], state}
  end

  @doc """
  Handle events from subordinate aggregators
  """
  def handle_events events, _from, state do
    votes = Enum.reduce events, state.votes, fn e, votes ->
      Enum.reduce e, votes, fn {k,v}, votes ->
        Map.put(votes, k, v) # replace any entries for subordinates
      end
    end
    # Any jurisdiction specific policy would go here

    # Sum the votes by candidate for the published event
    merged = Enum.reduce votes, %{}, fn {j, jv}, votes ->
      # Each jurisdiction is summed for each candidate
      Enum.reduce jv, votes, fn {candidate, tot}, votes ->
        Logger.debug "@@@@ Votes in #{inspect j} for #{inspect candidate}: #{inspect tot}"
        n = votes[candidate] || 0
        Map.put(votes, candidate, n + tot)
      end
    end
    # Return the published event and the state which retains
    # votes by jurisdiction
    {:noreply, [%{state.id => merged}], %{state | votes: votes}}
  end
end

Result Presenter

This process receives votes from an aggregator and caches those results to service requests for presenting results.

defmodule Voting.ResultPresenter do
  use GenStage
  …

  @doc """
  Handle requests for results
  """
  def handle_call :get_votes, _from, state do
    {:reply, {:ok, state.votes}, [], state}
  end

  @doc """
  Obtain the results from this presenter
  """
  def get_votes id do
    pid = Voting.ResultPresenter.via_tuple(id)
    {:ok, votes} = GenStage.call pid, :get_votes
    votes
  end

  @doc """
  Receive votes from aggregator
  """
  def handle_events events, _from, state do
    Logger.debug "@@@@ Presenter received: #{inspect events}"
    votes = Enum.reduce events, state.votes, fn v, votes ->
      Enum.reduce v, votes, fn {k,v}, votes ->
        Map.put(votes, k, v)
      end
    end
    {:noreply, [], %{state | votes: votes}}
  end
end

Takeaway

This post explored Elixir/OTP and its potential as a process-oriented language, compared this paradigm to the object-oriented and functional paradigms, and reviewed the implications for training and adoption.

The post also includes a short example of applying this orientation to a sample problem. In case you’d like to review all the code, here is a link to our example on GitHub again, just so you don’t have to scroll back looking for it.

The key takeaway is to view systems as a collection of communicating processes. Plan the system from a process design point of view first, and a logic coding point of view second.

This article was originally posted on Toptal.