<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Sergi GP]]></title><description><![CDATA[Sergi GP]]></description><link>https://sergigp.dev</link><generator>RSS for Node</generator><lastBuildDate>Mon, 13 Apr 2026 22:12:33 GMT</lastBuildDate><atom:link href="https://sergigp.dev/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Async vs Sync communication at work]]></title><description><![CDATA[In the previous chapter we explained what async communication is and how to differentiate it from sync communication. We would like to analyse how this translates into your workspace, so you can easily understand when to use each.
In general, we can ...]]></description><link>https://sergigp.dev/async-vs-sync-communication-at-work-16e38b40d7a0</link><guid isPermaLink="true">https://sergigp.dev/async-vs-sync-communication-at-work-16e38b40d7a0</guid><dc:creator><![CDATA[Sergi Gonzalez]]></dc:creator><pubDate>Mon, 10 Jan 2022 15:18:49 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1707232238610/f592db30-20fb-4335-bd02-c84be0c0df22.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In the <a target="_blank" href="https://medium.com/@sergigp/what-does-async-communication-mean-c1212d524380">previous chapter</a> we explained what async communication is and how to differentiate it from sync communication. We would like to analyse how this translates into your workspace, so you can easily understand when to use each.</p>
<p>In general, we can group company meetings by their purpose. You should be able to map all your day-to-day communication in any of these buckets. For each kind of meeting, we will explain the pros and cons of running them asynchronously.</p>
<h3 id="heading-types-of-meetings">Types of meetings</h3>
<p>Meetings are extremely important at work. They are the main contact point with your peers. Meetings let you sync up, align, mentor, and find issues or solutions. They enable your day-to-day tasks. Because of their importance, communication easily ends up eating into your productive hours. You end up being blocked by the tool that was supposed to unblock your work. That’s why it’s really important to identify types of meetings and book synchronous time only for those that truly require it.</p>
<p>In general, we can identify six kinds of meetings:</p>
<h4 id="heading-information-sharing">Information sharing</h4>
<p>Presentations, seminars, keynotes… These kinds of meetings are meant to share information with the attendees. Depending on the format, they can be mostly unidirectional, with low engagement. The listeners just need to sit down and get information from the speaker.</p>
<p>These meetings tend to be tedious and boring, as normally one person shares information with many stakeholders. The presenter needs some visual support to keep the audience engaged. Historically, they have been shared in the shape of presentations. Unless there is some value in keeping the group together while the meeting happens, these meetings don’t need to be synchronous.</p>
<p>If you do them asynchronously, make sure you provide some visual information so that the participant is engaged with the content.</p>
<h4 id="heading-status-updates">Status updates</h4>
<p>Probably the most common meeting in your calendar. These meetings are meant to keep workmates updated with the latest progress of the project. If you follow agile methodologies, you most likely have daily stand-ups. If you work with different teams, you probably have some weekly catch-ups with others. These meetings tend to have a fixed structure, with three main topics in the agenda:</p>
<ul>
<li>Progress done</li>
<li>Next steps</li>
<li>Problems found along the day</li>
</ul>
<p>Just like information sharing meetings, status updates tend to get boring quickly. Not all information shared is valuable for everyone, so they need to be brief and to the point. Since these meetings are usually not interactive, they are really good candidates to be executed asynchronously.</p>
<p>If done asynchronously, it’s really important to make sure that the information can be consumed easily. The engagement can drop quickly if the listener finds herself drowning in irrelevant information.</p>
<h4 id="heading-problem-solving">Problem solving</h4>
<p>This kind of meeting is held whenever a team needs to find the cause, and potential solutions for a specific issue. Not all are meant to solve high pressure issues, sometimes they are more strategic and less tactical. In any case, it’s also important to stick to a fixed agenda. <a target="_blank" href="https://www.cassetteapp.com/">Cassette</a> recommends the following:</p>
<ul>
<li>What was the cause of the issue?</li>
<li>What can we do to solve or minimize the issue?</li>
<li>How can we prevent it from happening again?</li>
</ul>
<p>It’s extremely important to come up with a set of actions, deadlines and areas of responsibility as a goal, so you can track progress and make sure that short and long-term solutions are implemented.</p>
<p>If the issue needs to be addressed quickly, asynchronous communication won’t help. As we explained before, meetings that require a high bandwidth should happen synchronously to maximise the amount of information being shared.</p>
<p>For more strategic discussions, we recommend running these asynchronously. In these types of meetings, it’s key to have input from everyone. Usually stakeholders will have different visions, or pieces of information that need to be shared to “complete the puzzle”. Running these meetings asynchronously allows every participant to properly gather information about the issue, listen and plan. The output is usually a more complete solution.</p>
<h4 id="heading-decision-making">Decision-making</h4>
<p>Decision-making meetings are run whenever the team needs to agree on a solution among a set of options. Examples might include choosing the best tool for your team, agreeing on a plan, or deciding which candidate would fit the role best.</p>
<p>Similarly to problem solving meetings, decision-making is often best run asynchronously unless there is a tight deadline.</p>
<p>The success of this meeting will heavily depend on preparation. There are some important points that the host needs to take into account when running these meetings:</p>
<ul>
<li>Will this decision impact many different teams/people within the organisation? If not, can we avoid this meeting by delegating the decision to the most informed or experienced team member?</li>
<li>Is there any option that can be tried beforehand without much effort? If you can try an option and roll back if you are not satisfied, then consider whether it should remain an object of debate at all.</li>
<li>Can you provide all the content needed to make a fully informed decision? Make sure it’s the case.</li>
<li>Is everyone’s opinion equally valid or do you have some experts that should be taken more into account? Set the expectations in your team beforehand to avoid potential conflicts. It’s very important that everyone feels they’ve been listened to and considered, so they have contributed to the final decision.</li>
<li>Do you need consensus or do you allow disagreement with the final decision? Again, set the right expectations.</li>
</ul>
<h4 id="heading-innovation">Innovation</h4>
<p>The meetings where the magic happens. These meetings help refine parts of your product or service, or come up with brilliant new ideas for them. The most important thing about innovation meetings is to keep everyone engaged with the topic and focused, and to make them feel they have the freedom to propose. Do not close any door, let everyone express their ideas and encourage out of the box thinking. Just like with decision-making meetings, everyone should feel they have contributed by the time it finishes. Also, to avoid frustration, set the right expectations about how the ideas are going to be used.</p>
<p>These meetings are better run synchronously, as they require a high information bandwidth and a quick exchange of ideas. They can also be run in really fun ways, helping the team come together and align.</p>
<h4 id="heading-team-building">Team building</h4>
<p>We often see teams treating normal meetings as “team building meetings”. Especially with remote teams, where the office space is gone and those face-to-face moments are lost, we need to allocate time for personal contact. Team building does not just mean seeing each other, nor does it mean talking about product issues.</p>
<p>The best way to make use of synchronous time is precisely to create personal spaces for your team members. Spaces where they can talk about their lives, their concerns, their hobbies… A team stand-up will help them align at a professional level, but won’t automatically help you build a team.</p>
<p>At <a target="_blank" href="https://www.cassetteapp.com/">Cassette</a>, we recommend that teams move status update meetings to async as much as possible and, whenever possible, replace that sync-up time with team building meetings. Water cooler conversations, team coffees… let them happen outside of a normal meeting space and see how your team starts creating personal bonds.</p>
<h3 id="heading-cassette-structured-async-meetings-for-teams">Cassette. Structured async meetings for teams</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1707232236639/3da17629-f454-4859-b47f-ddf56843c771.png" alt /></p>
<p>Unlock your schedule with Cassette</p>
<p><a target="_blank" href="http://www.cassetteapp.com/">Cassette</a> is a free multiplatform app created to disrupt the broken meeting culture using voice notes. We believe that asynchronous work should unlock your schedule and bring back your own time. Cassette provides you an easy way to produce voice messages and consume them efficiently. It enhances meetings by adding structure such as agenda, due date and reactions.</p>
]]></content:encoded></item><item><title><![CDATA[What does async communication mean?]]></title><description><![CDATA[Definition
Asynchronous communication means non-real-time communication. Exchanging information without expecting an immediate action or response. It’s important to highlight the expectation here, as that’s what enables async communication.
Whenever ...]]></description><link>https://sergigp.dev/what-does-async-communication-mean-c1212d524380</link><guid isPermaLink="true">https://sergigp.dev/what-does-async-communication-mean-c1212d524380</guid><dc:creator><![CDATA[Sergi Gonzalez]]></dc:creator><pubDate>Tue, 04 Jan 2022 09:40:43 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1707232243664/c7dfa438-38fd-4e04-878e-430713f3e455.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h4 id="heading-definition">Definition</h4>
<p>Asynchronous communication means non-real-time communication. Exchanging information without expecting an immediate action or response. It’s important to highlight the expectation here, as that’s what enables async communication.</p>
<p>Whenever a workmate requests something and you reply immediately, you are addressing it synchronously. You are fulfilling their expectations in terms of timing and urgency — that would normally lead you into a real-time chat.</p>
<p>If you ask your workmate to send you an email with the details of the request, you are effectively generating an expectation about when your reply will come. You are trying to shift that conversation to async.</p>
<p>The fact that we associate email with async is related to how the communication channel is used. There is one thing that has to be provided by the channel to enable async communication: information persistence. That’s why you leave a voice message (async) when the phone call (sync) is missed.</p>
<h4 id="heading-pros-and-cons">Pros and cons</h4>
<p>Human communication is extremely complex. There’s no silver bullet or best way to communicate. As expected, it totally depends on the context. So the discussion should not be about how to do it right, but to understand when to communicate in what way.</p>
<p>Some benefits of async communication are:</p>
<p><strong>No scheduling needed</strong></p>
<p>Synchronous communication happens in a specific time slot, where all messages are exchanged in a timely manner. This time slot can be allocated beforehand — or not.</p>
<p>Asynchronous communication happens at any moment. Messages are shared whenever the senders decide to do so. No specific slots are allocated.</p>
<p><strong>Self documented</strong></p>
<p>As we explained previously, async communication requires information to be stored, as it will be consumed later in time. After a non-real-time conversation, you will have a record of the information generated with no extra effort. The channel will provide the way to consume that information: text messages, audio, video clips…</p>
<p><strong>Second thought</strong></p>
<p>Real time conversations often require thinking and providing an answer as soon as possible. It’s easy to provide information that is not complete or correct. With async communication we remove the expectations of an immediate response. We gain extra time to prepare the response, potentially making it better in terms of the information provided.</p>
<p><strong>Democracy</strong></p>
<p>Synchronous communication is timeboxed. Time suddenly becomes a precious resource. Without an external role regulating how the time is distributed among all parts, it’s easy to end up failing. Not everyone has the same information to give, not all information is necessarily equally valuable for everyone, and not everyone has the ability to communicate in groups. With asynchronous communication, everyone has a chance to contribute without those constraints.</p>
<p>But not every situation is suited for this kind of communication. There are some areas where non-real time fails:</p>
<p><strong>High bandwidth</strong></p>
<p>Spacing messages in time usually means that fewer messages are exchanged during the conversation. During async conversations there’s less room to iterate on concepts or ideas. This makes it a bad choice for ideation, brainstorming or any other meeting that requires a high information bandwidth.</p>
<p><strong>Personal touch</strong></p>
<p>Some things are better communicated in person. The more personal the communication gets, the easier it is to build trust between all participants. Since async communication removes presence and spaces out messages, communication becomes less personal. Some messages require a full personal touch that async won’t provide.</p>
<p><strong>Wrapping up</strong></p>
<p>We have tried to summarise the differences between sync and async communication. We hope this generates some awareness about how and when we can communicate. In the next chapter we will explain how this translates to your work — what kind of meetings we normally have at work, and how to use async channels to improve the way we communicate with our colleagues.</p>
]]></content:encoded></item><item><title><![CDATA[Event-Oriented Architecture Anti-Patterns]]></title><description><![CDATA[Over the past few years we’ve seen a rise in popularity of microservices architecture. There are a lot of resources on how to implement it correctly but, quite often, people talk about microservices like they’re a silver bullet. There are many argume...]]></description><link>https://sergigp.dev/event-oriented-architecture-anti-patterns-2dccc68ed282</link><guid isPermaLink="true">https://sergigp.dev/event-oriented-architecture-anti-patterns-2dccc68ed282</guid><dc:creator><![CDATA[Sergi Gonzalez]]></dc:creator><pubDate>Mon, 07 Oct 2019 13:55:42 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1707232263788/58d6ee85-8d2a-4027-aa3f-442819750bc9.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Over the past few years we’ve seen a rise in popularity of microservices architecture. There are a lot of resources on how to implement it correctly but, quite often, people talk about microservices like they’re a silver bullet. There are many arguments against microservices, but the most relevant one is that this kind of architecture carries a lot of accidental complexity that relies on how you manage the relationships between your services and teams. You can find a lot of literature about why (maybe) microservices isn’t a good choice depending on your circumstances.</p>
<p>At letgo we’ve migrated from a monolith to microservices to meet our scalability requirements, and only once we had verified its beneficial impact across all teams. When correctly applied, microservices give us several advantages, namely:</p>
<ul>
<li><strong>Scalability of the application</strong>: In our experience the main pain point in the scalability of an application is in its infrastructure. Microservices empowers modularization of code and infrastructure (databases, etc.). In a well-implemented microservices architecture, each service owns its infrastructure. The Users database can only be accessed (read and write) by the Users service.</li>
<li><strong>Organizational scalability</strong>: As we’ve observed, microservices helps to fix organizational issues and gives us a framework on how we manage a large codebase that several teams are changing. Splitting the codebase prevents conflicts when making changes. In our experience, working in big teams doesn’t scale efficiently, so once we decided to split our engineers into small teams it made sense to <a target="_blank" href="https://en.wikipedia.org/wiki/Conway%27s_law">split our codebase into small components too</a>. Organizing a company in small teams also fosters ownership.</li>
</ul>
<h3 id="heading-event-oriented-architectures">Event-Oriented Architectures</h3>
<p>Not all microservices architectures are event-oriented. There are several people that advocate for synchronous communication between services in this kind of architecture using HTTP (gRPC, REST, etc.). At letgo we try not to follow this pattern and we communicate our services asynchronously with <a target="_blank" href="http://codebetter.com/gregyoung/2010/04/11/what-is-a-domain-event/">domain events</a>. Our reasons for doing so are:</p>
<ul>
<li><strong>Improve scalability and resilience:</strong> Dividing a large system into smaller ones helps to contain the impact of failures. For example, a DDoS or a traffic spike in one of our services shouldn’t affect other services. If your services communicate synchronously, the chances of one service DDoSing another increase. In that case we can say that our services are too coupled. For us the key concept for increasing microservices’ scalability and resilience is how rigid the boundaries of a service are and how you enforce communication between them.</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1707232254530/930927c4-41a8-4fb6-b7d4-66fba3880971.png" alt /></p>
<p>A bulkhead is an upright wall within the <em>hull of a ship that creates watertight compartments that can contain water in the case of a hull breach or other leak.</em></p>
<ul>
<li><strong>Decoupling</strong>: A change in a service shouldn’t affect another one. We think practices like synchronizing deploys of multiple services are bad smells because they add a lot of complexity to our operations. There are some ways to mitigate this, like API versioning, but in our experience using domain events as a service’s public contract helps to model its domain in a way that doesn’t impact other services. A user entity in our Users service shouldn’t be the same as a user entity in the Chat service.</li>
</ul>
<p>Based on this, we encourage async communication between services at letgo, and we only allow synchronous communication in very special cases like feature MVPs. We do this because we want every service to build its own entities based on domain events published by other services on our Message Bus.</p>
<p>In our opinion, the success or failure of a microservices architecture depends on how you handle the inherent complexity that it carries, and on how all your services communicate with each other. Splitting code without splitting infrastructure or switching communication to asynchronous tends to produce a distributed monolith.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1707232256446/bad23fb9-9c30-4402-b834-7784430b1e0f.png" alt /></p>
<h3 id="heading-letgos-event-oriented-architecture">letgo’s Event-Oriented Architecture</h3>
<p>I want to share an example of how we use domain events and async communication at letgo: Our User entity exists in a lot of services but its creation and editing is handled originally by the Users service. We store a lot of data in the Users service database like name, email, avatar, country, etc. In our Chat service, we also have the user concept but we don’t need the same data that Users has in Users service. In our conversation list view, we’re only showing the username, avatar and ID (to link to their profile). We say that in chat we have a <strong>projection</strong> of the user entity that contains only partial data. In fact, in chat we don’t speak about users, we consider them “talkers”. This projection belongs to the Chat service and is built with events that Chat consumes from the Users service.</p>
<p>We do the same with listings. In the Products service we store n pictures of every listing but in the conversation list view we only show the main one, so our Products projection in Chat only needs one picture instead of n.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1707232258564/35f3fa43-4083-47e0-bb59-cf3e67dbe0b0.png" alt /></p>
<p>Conversation list view in our chat, showing which backend service produces each piece of information.</p>
<p>If you take a look at conversation list view again, you’ll see that almost all the data we show isn’t created by the Chat service, but all the data is owned by the Chat service because User and Product projections are property of Chat. There’s a tradeoff between availability and consistency in projections which we can’t cover in this post, but we can say that it’s obviously easier to scale a lot of small databases than one huge one.</p>
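<p>As a rough illustration of the projection idea, here is a minimal Python sketch of how Chat could build its own “talker” records from <em>user_registered</em> events. It assumes the event envelope shown later in this post; the <code>Talker</code> and <code>TalkerProjection</code> names, the <code>avatar</code> attribute and the in-memory dictionary are hypothetical and only stand in for whatever storage the Chat service really uses.</p>

```python
from dataclasses import dataclass


@dataclass
class Talker:
    """Chat's partial view of a user: only what the conversation list needs."""
    id: str
    user_name: str
    avatar: str


class TalkerProjection:
    """Hypothetical in-memory store; a real service would use its own database."""

    def __init__(self):
        self._talkers = {}

    def handle(self, event):
        data = event["data"]
        attrs = data["attributes"]
        if data["type"] == "user_registered":
            # Copy only the fields Chat needs; email, country, etc. are dropped.
            self._talkers[attrs["id"]] = Talker(
                id=attrs["id"],
                user_name=attrs["user_name"],
                avatar=attrs.get("avatar", ""),
            )
        # Event types Chat does not care about are simply ignored.

    def get(self, user_id):
        return self._talkers.get(user_id)
```

<p>The point of the sketch is that Chat never queries the Users service: it keeps its own small copy, updated only by the events it consumes.</p>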
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1707232260107/6913b7f5-cda6-46a9-9d15-b6f9294cec30.png" alt /></p>
<p>Simplified view of letgo’s backend architecture</p>
<h3 id="heading-the-anti-patterns">The Anti-Patterns</h3>
<p>Intuitive solutions often turn out to be mistakes. Here’s a list of the most important anti-patterns related to domain events that we’ve seen in our architecture.</p>
<h3 id="heading-1-fat-events">1. Fat events</h3>
<p>We should try to keep our domain events as small as possible without losing domain meaning. We should be especially careful when refactoring legacy codebases with big entities to an event-driven architecture. These kinds of entities can lead us to fat events, but since our domain events become our public contract we need to keep them as simple as possible. In this case, we think it’s better to approach these refactors from the outside in. First, we design our events using techniques like <a target="_blank" href="https://openpracticelibrary.com/practice/event-storming/">event storming</a> and then we refactor the service’s code to adapt it to our events.</p>
<p>We also need to be careful with the “user and product problem”: a lot of systems tend to have products and users and these entities tend to attract all of the logic which means that all domain events are coupled to them.</p>
<h3 id="heading-2-events-as-intentions">2. Events as intentions</h3>
<p>A domain event, by definition, is something that has already happened. If you’re publishing something in the message bus to request that something else happens in another service, it’s probably an async command instead of a domain event. As a rule of thumb, we name our domain events using the past tense: <em>user_registered</em>, <em>product_published</em>, etc. The less a service knows about the others, the better. Using events as commands couples services and increases the chances that a change in a service will affect others.</p>
<h3 id="heading-3-no-agnostic-serialization-or-compression">3. No agnostic serialization or compression</h3>
<p>The serialization and compression systems of our domain events should be agnostic of programming languages. You shouldn’t even need to know what programming language the consumers in other services are written in. That’s why we can’t use the native PHP or Java serializer, for example. Take your time, as a team, to discuss and choose your serializer because changing it in the future is complex and hard. At letgo we’re using JSON but there are a lot of good serialization formats with good performance.</p>
<h3 id="heading-4-no-structure-standardization">4. No structure standardization</h3>
<p>When we started to migrate the letgo backend to an event-oriented architecture we agreed upon a common structure for our domain events. It looks like this:</p>
<pre><code>{
  "data": {
    "id": [uuid], // event id.
    "type": "user_registered",
    "attributes": {
      "id": [uuid], // aggregate/entity id, in this case user_id
      "user_name": "John Doe",
      …
    }
  },
  "meta": {
    "created_at": timestamp, // when was the event created?
    "host": "users-service", // where was the event created?
    …
  }
}</code></pre>
<p>Having a common and shared structure for our domain events enables us to integrate our services quicker and implement some libraries with abstractions.</p>
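<p>One of the library abstractions this enables is a small helper that every producer uses to wrap its data in the shared envelope. The following Python sketch is only an illustration: the <code>build_event</code> function is hypothetical, and the ISO 8601 format chosen for <code>created_at</code> is an assumption, since the structure above only says “timestamp”.</p>

```python
import json
import uuid
from datetime import datetime, timezone


def build_event(event_type, aggregate_id, attributes, host):
    """Wrap domain data in the shared envelope (data + meta)."""
    return {
        "data": {
            "id": str(uuid.uuid4()),  # event id
            "type": event_type,
            # aggregate/entity id first, then the event-specific attributes
            "attributes": {"id": aggregate_id, **attributes},
        },
        "meta": {
            # Assumption: ISO 8601 in UTC; the original only specifies "timestamp".
            "created_at": datetime.now(timezone.utc).isoformat(),
            "host": host,
        },
    }


event = build_event(
    "user_registered",
    str(uuid.uuid4()),
    {"user_name": "John Doe"},
    "users-service",
)
payload = json.dumps(event)  # JSON keeps the wire format language-agnostic
```

<p>Because the payload is plain JSON, a consumer written in any language can parse it without knowing how the producer was implemented.</p>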
<h3 id="heading-5-no-schema-validation">5. No schema validation</h3>
<p>We’ve experienced some serialization problems at letgo related to programming languages that don’t have a strong type system. The same null value can end up serialized in several different ways:</p>
<pre><code>{
  "null_value_one": null, // thank god
  "null_value_two": "null",
  "null_value_three": "",
}</code></pre>
<p>A strong testing culture that tests how our events are serialized, together with knowing how the serialization library works, helps a lot to mitigate this. At letgo we’re migrating to Avro and Confluent Schema Registry, which will give us a single point of definition for our domain event structure and will allow us to avoid these kinds of errors as well as outdated documentation.</p>
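<p>Until a schema registry is in place, even a small hand-written check in the test suite can catch the null variants shown above. The Python sketch below is hypothetical and far simpler than what Avro provides; the <code>validate_attributes</code> function and its parameters are invented for illustration.</p>

```python
def validate_attributes(attributes, required, nullable=()):
    """Return a list of problems found in an event's attributes."""
    problems = []
    for name in required:
        if name not in attributes:
            problems.append(f"{name}: missing")
    for name, value in attributes.items():
        if value is None and name not in nullable:
            problems.append(f"{name}: null not allowed")
        elif value in ("null", ""):
            # The string "null" and the empty string are common signs that
            # a weakly typed serializer mangled a real null.
            problems.append(f"{name}: looks like a badly serialized null")
    return problems
```

<p>A producer test can then assert that <code>validate_attributes</code> returns an empty list for every event it publishes.</p>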
<h3 id="heading-6-anemic-domain-events">6. Anemic domain events</h3>
<p>As we said before, and as its name suggests, domain events should have a meaning at the domain level. Just like we try to avoid inconsistent states in our entities, we need to avoid the same in domain events. Let’s illustrate this with an example: A product in our system is geolocated with latitude and longitude that are stored in two different fields in a products table of the products service. All products can be “moved” so we will have domain events to represent this update. We used to have two events to represent this: <em>product_latitude_updated</em> and <em>product_longitude_updated</em> which doesn’t make much sense if you’re not a rook in a game of chess. In this example, it makes more sense to have an event like <em>product_location_updated</em> or <em>product_moved</em>.</p>
<p>The <strong>rook</strong> (<a target="_blank" href="https://en.wikipedia.org/wiki/Help:IPA/English">/rʊk/</a>; ♖,♜) is a piece in the game of chess. Formerly the piece was called the <em>tower.</em> The rook only moves horizontally or vertically, through any number of unoccupied squares.</p>
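<p>To make the location example concrete, here is a minimal Python sketch of a <em>product_moved</em> event that always carries both coordinates together, so consumers never see a half-updated location. The <code>ProductMoved</code> class and its range checks are hypothetical and not taken from letgo’s code.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProductMoved:
    """One event carrying the whole new location, never half of it."""
    product_id: str
    latitude: float
    longitude: float

    def __post_init__(self):
        # Reject inconsistent states at creation time, as we would in an entity.
        if abs(self.latitude) > 90:
            raise ValueError("latitude out of range")
        if abs(self.longitude) > 180:
            raise ValueError("longitude out of range")
```

<p>With two separate events, a consumer that processed only the latitude update would briefly place the product somewhere it has never been.</p>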
<h3 id="heading-7-no-tooling-for-debug">7. No tooling for debug</h3>
<p>At letgo we produce thousands of domain events per second. All these events become an extremely useful resource to know what’s happening in our system, to log user activity or even to reconstruct the state of the system at a specific point in time using event sourcing. We need to take advantage of this resource, and to do that we need tooling to inspect and debug our events. Queries like “give me all events produced by user John Doe in the last 3 hours” also become very useful to detect fraudulent behaviour. We’ve developed some tools to accomplish that on top of ElasticSearch, Kibana and S3.</p>
<h3 id="heading-8-no-event-monitoring">8. No event monitoring</h3>
<p>We can use our domain events to check the health of a system. When we deploy something (which happens several times per day depending on the service) we need tools to quickly check if everything’s working the way it should. For example, if we deploy a new version of our Products service in production and we see a 20% decrease in <em>product_published</em> events, we can be almost certain that we broke something. We’re currently using InfluxDB, Grafana and Prometheus to accomplish that using derivative functions. If you remember your math classes, the derivative of a function f(x) at a point x is the slope of the tangent line at that point. If you take the publishing rate of a specific domain event as a function of time and apply the derivative, sudden changes in the rate show up as peaks, and you can set alerts based on them. With this type of alert you avoid rules like “alert me if we publish fewer than 200 events per second for 5 minutes” and focus on significant variations in publishing rate.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1707232261764/2ea87de1-65f0-4dd2-a09f-1631032c1c2d.png" alt /></p>
<p>Something weird happened here… but maybe it’s only a marketing campaign :D</p>
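<p>The idea behind derivative-based alerts can be sketched in a few lines of plain Python. This is not how the InfluxDB, Grafana or Prometheus queries look; it is a hypothetical illustration of the discrete derivative of a publishing rate, with invented function names and sample values.</p>

```python
def rate_of_change(samples):
    """Discrete derivative of the publishing rate.

    `samples` is a list of (timestamp_seconds, events_per_second) tuples,
    ordered by timestamp. Returns (timestamp, slope) pairs.
    """
    return [
        (t1, (r1 - r0) / (t1 - t0))
        for (t0, r0), (t1, r1) in zip(samples, samples[1:])
    ]


def sudden_changes(samples, threshold):
    """Timestamps where the rate changed faster than `threshold` per second."""
    return [t for t, slope in rate_of_change(samples) if abs(slope) > threshold]


# One sample per minute; the rate collapses between t=120 and t=180.
samples = [(0, 200.0), (60, 202.0), (120, 199.0), (180, 150.0), (240, 151.0)]
alerts = sudden_changes(samples, threshold=0.5)
```

<p>Note that the alert fires on the sharp drop itself, while the small fluctuations before it and the stable lower rate after it stay below the threshold.</p>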
<h3 id="heading-9-assuming-everything-is-gonna-be-alright">9. Assuming everything is gonna be alright</h3>
<p>We try to build resilient systems and to reduce their recovery cost. In addition to infrastructure problems or human failure, one of the most common things that can happen in an event-driven architecture is a loss of events. We need to have a plan to recover the right state of the system by reprocessing all the events that were lost. Our strategy for accomplishing this is based on two points:</p>
<ul>
<li><strong>Save all events</strong>: We need to be capable of doing things like “reprocess all events that happened yesterday” so we need to have some kind of event store where we keep all events. At letgo this is a responsibility of the Data team more than the Backend one.</li>
<li><strong>Consumer idempotency:</strong> Our consumers should be capable of handling the same event more than once without corrupting the internal state or throwing a lot of errors. This can happen because we’re recovering from an error and reprocessing old events or because our <a target="_blank" href="https://medium.com/@marton.waszlavik/demystifying-cap-theorem-eventual-consistency-and-exactly-once-delivery-guarantee-ed20cf7cc877">message bus delivers an event more than once</a>. <a target="_blank" href="https://en.wikipedia.org/wiki/Idempotence">Idempotence</a> is, in our opinion, the cheapest solution to solve this problem. Imagine we’re listening in our service to the event <em>user_registered</em> from the Users service because we want to construct our projection of users and we have a MySQL table using <em>user_id</em> as our primary key. If we don’t check existence before inserting when handling <em>user_registered</em> domain events we can end up with a lot of duplicate key errors. In this case, even if we check the existence of the user before inserting it we can still have errors due to the delay between master and slaves in MySQL (about 30ms average). As these projections can be represented as key value records, we’re trying to switch them to DynamoDB. Even if you try to be idempotent there are use cases like incrementing or decrementing counters where building an idempotent consumer is very hard. Depending on the criticality of the use case at domain level, you should decide how tolerant you should be to failures and inconsistencies, and decide if the cost of a deduplication event system pays off.</li>
</ul>
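<p>As a rough illustration of consumer idempotency, here is a minimal Scala sketch. The names <em>UserRegistered</em>, <em>UsersProjection</em> and <em>UserRegisteredConsumer</em> are hypothetical, and an in-memory map stands in for the key-value store; it is not our production code. The idea is that the consumer upserts by <em>user_id</em>, so handling the same event twice leaves the projection in the same state as handling it once.</p>
<pre><code class="lang-scala">import scala.collection.mutable

// Hypothetical event shape; the real user_registered payload is not shown in this post.
final case class UserRegistered(userId: String, email: String)

// In-memory stand-in for a key-value projection keyed by user_id.
final class UsersProjection {
  private val rows = mutable.Map.empty[String, String]

  // Upsert: writing the same key twice leaves a single row, so redelivery is harmless.
  def upsert(userId: String, email: String): Unit = rows.update(userId, email)

  def size: Int = rows.size
  def emailOf(userId: String): Option[String] = rows.get(userId)
}

final class UserRegisteredConsumer(projection: UsersProjection) {
  // Handling the same event N times yields the same state as handling it once.
  def handle(event: UserRegistered): Unit =
    projection.upsert(event.userId, event.email)
}

object IdempotentConsumerExample extends App {
  val projection = new UsersProjection
  val consumer = new UserRegisteredConsumer(projection)
  val event = UserRegistered("user-1", "user-1@example.com")

  consumer.handle(event)
  consumer.handle(event) // duplicate delivery or a replay of old events

  assert(projection.size == 1)
  assert(projection.emailOf("user-1").contains("user-1@example.com"))
}
</code></pre>
<p>An upsert only works when the write itself is naturally idempotent. It does not help with the counter use cases mentioned above, where replaying an increment changes the result.</p>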
<h3 id="heading-10-lack-of-domain-event-documentation">10. Lack of domain event documentation</h3>
<p>Our domain events become our public interface to the rest of the systems in our backend. Just like we document our REST APIs, we need to document our domain events. Any member of the organization should be able to see up-to-date documentation for every domain event published by each service. If we’re using schemas for domain event validation, they can be used as documentation too.</p>
<h3 id="heading-11-resistance-of-consuming-own-events">11. Resistance to consuming your own events</h3>
<p>You’re authorized and encouraged to consume your own domain events to construct projections inside your system that are, for example, optimized for reading. We have seen some teams resist doing so because they have internalized the idea that domain events are only meant to be consumed by other services.</p>
]]></content:encoded></item><item><title><![CDATA[Testing Backend Services: Introduction]]></title><description><![CDATA[This is the first in a series of posts about testing backend services and some principles that we follow at letgo. In this post, we’ll cover some core testing concepts that we need to understand in order to improve our test suite.
Anatomy of a backen...]]></description><link>https://sergigp.dev/testing-backend-services-introduction-cc24b5709a48</link><guid isPermaLink="true">https://sergigp.dev/testing-backend-services-introduction-cc24b5709a48</guid><dc:creator><![CDATA[Sergi Gonzalez]]></dc:creator><pubDate>Thu, 04 Apr 2019 06:47:22 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1707232251097/46bc21b6-71c6-496d-bac7-f586057b5074.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>This is the first in a series of posts about testing backend services and some principles that we follow at letgo. In this post, we’ll cover some core testing concepts that we need to understand in order to improve our test suite.</p>
<h3 id="heading-anatomy-of-a-backend-service">Anatomy of a backend service:</h3>
<p>We try to follow some <a target="_blank" href="https://airbrake.io/blog/software-design/domain-driven-design">DDD</a> and <a target="_blank" href="https://blog.octo.com/en/hexagonal-architecture-three-principles-and-an-implementation-example/">Hexagonal Architecture</a> principles when designing our backend services. We also try to apply CQRS using Command and Query buses whenever a project’s complexity demands it. The following design is a simplified anatomy of one of our use cases: it leaves out the buses and a few other layers that we use in real implementations.</p>
<h3 id="heading-controller">Controller:</h3>
<p>A controller handles HTTP requests and responses, translating requests coming from clients to something understandable by our system. That’s done by validating the input and wrapping it into Domain Objects, or failing with an HTTP error. The validation itself occurs in the Command Handlers but we will omit them for simplicity’s sake.</p>
<h3 id="heading-service">Service:</h3>
<p>All of our application’s logic should go here. Every dependency that performs IO is injected here.</p>
<h3 id="heading-repository">Repository:</h3>
<p>A repository provides access to data. It’s usually a database but it can also be an external service. The repository’s interface belongs to Domain. Its implementation belongs to infrastructure. We should try not to put too much logic in <a target="_blank" href="https://martinfowler.com/eaaCatalog/repository.html">repositories</a>. Ideally, repositories should have an API similar to an array: add, search by id, update and delete.</p>
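<p>To make the split between layers more concrete, here is a minimal sketch. All names (<em>Product</em>, <em>ProductRepository</em>, <em>RenameProduct</em>) are hypothetical and chosen only for illustration. The repository interface lives in the domain and looks like a collection, an in-memory class plays the role of the infrastructure implementation, and the service receives the repository by injection.</p>
<pre><code class="lang-scala">import scala.collection.mutable

// Domain layer: hypothetical names, chosen only for this sketch.
final case class ProductId(value: String)
final case class Product(id: ProductId, title: String)

// The interface belongs to the domain and mirrors a collection:
// add, search by id, update and delete.
trait ProductRepository {
  def add(product: Product): Unit
  def search(id: ProductId): Option[Product]
  def update(product: Product): Unit
  def delete(id: ProductId): Unit
}

// Infrastructure layer: an in-memory implementation.
// A database-backed class would implement the same trait.
final class InMemoryProductRepository extends ProductRepository {
  private val products = mutable.Map.empty[ProductId, Product]

  def add(product: Product): Unit = products.update(product.id, product)
  def search(id: ProductId): Option[Product] = products.get(id)
  def update(product: Product): Unit = products.update(product.id, product)
  def delete(id: ProductId): Unit = { products.remove(id); () }
}

// Service: the logic lives here and the IO dependency is injected.
final class RenameProduct(repository: ProductRepository) {
  def apply(id: ProductId, newTitle: String): Either[String, Product] =
    repository.search(id) match {
      case None => Left(s"product ${id.value} not found")
      case Some(product) =>
        val renamed = product.copy(title = newTitle)
        repository.update(renamed)
        Right(renamed)
    }
}
</code></pre>
<p>Because the service only knows the trait, the same use case can run against the in-memory class in tests and against a real database in production.</p>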
<h3 id="heading-types-of-test">Types of test</h3>
<p>Out of all the types of tests, we’re going to focus on three.</p>
<h3 id="heading-unit">Unit</h3>
<p>This kind of test checks that our domain logic works as expected. These tests should be really fast, isolated and repeatable. That creates a fast feedback loop that maximizes value while developing. You invoke services mocking infrastructure (or using in-memory implementations) and assert the response that it returns or side effects in case of a command. You should try to test all possible errors, not just the happy path. What you’re testing here is the application’s behavior.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1707232246674/c7026b47-ec6a-427e-8810-1a6dcd28a114.png" alt /></p>
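<p>A unit test in this style could look like the following sketch. It uses plain <em>assert</em> calls instead of a test framework to stay self-contained, and the <em>RegisterUser</em> use case and its repository are hypothetical. It invokes the service with an in-memory implementation and checks the response, the side effect and an error path.</p>
<pre><code class="lang-scala">// Hypothetical domain, used only to illustrate the shape of a unit test.
final case class User(id: String, email: String)

trait UserRepository {
  def search(id: String): Option[User]
  def add(user: User): Unit
}

// In-memory implementation used instead of a mock: fast, isolated and repeatable.
final class InMemoryUserRepository extends UserRepository {
  private var users = Map.empty[String, User]
  def search(id: String): Option[User] = users.get(id)
  def add(user: User): Unit = { users = users + (user.id -> user) }
}

final class RegisterUser(repository: UserRepository) {
  def apply(id: String, email: String): Either[String, User] =
    if (repository.search(id).isDefined) Left("user already exists")
    else {
      val user = User(id, email)
      repository.add(user)
      Right(user)
    }
}

object RegisterUserTest extends App {
  val repository = new InMemoryUserRepository
  val registerUser = new RegisterUser(repository)
  val expected = User("user-1", "a@example.com")

  // Happy path: assert both the response and the side effect of the command.
  assert(registerUser("user-1", "a@example.com") == Right(expected))
  assert(repository.search("user-1").contains(expected))

  // Error path: registering the same id again fails and leaves the state untouched.
  assert(registerUser("user-1", "b@example.com") == Left("user already exists"))
  assert(repository.search("user-1").contains(expected))
}
</code></pre>
<p>The test only talks to the use case and the repository interface, so it checks behavior rather than implementation details.</p>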
<h3 id="heading-integration">Integration</h3>
<p>This type of test is used to check that the implementation against an external service works. They’re typically used to check that your repository works as expected against a database or external service. These tests are slower and add less value than unit tests, since they have some overlap with acceptance tests.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1707232248183/404a0476-6910-403c-8526-504c1b5a5911.png" alt /></p>
<h3 id="heading-acceptance">Acceptance</h3>
<p>Taking into account the <a target="_blank" href="https://martinfowler.com/bliki/TestPyramid.html">test pyramid</a>, the main purpose of this kind of test is to check that all the pieces that have already been tested in isolation work well with each other. It can also be used to check HTTP status codes or your data validation. We don’t test our controllers in isolation, because we try to keep them very small, so these tests also cover the logic placed in them. This kind of test is usually really slow, but it maximizes value from a business point of view because it’s the only one that ensures the whole system works as expected.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1707232249516/58e9375c-6485-4266-9d48-a83a6f61c94d.png" alt /></p>
<h3 id="heading-some-other-testing-concepts">Some other testing concepts…</h3>
<h3 id="heading-determinism">Determinism:</h3>
<p>Determinism is a concept that relates to repeatability. A system is deterministic if a given input always produces the same output. A system can be a whole service, a single class or a bunch of classes that collaborate with each other.</p>
<p>Let’s look at an example:</p>
<pre><code class="lang-scala">class ExampleService {
  // The result depends on the wall clock, not only on the input
  def myMethod(value: Double): Double =
    value + System.currentTimeMillis()
}
</code></pre>
<p>We can easily see that this code is non-deterministic: the result of the function depends on when you call it. You can think of a deterministic system as a mathematical function f(x) = y. In an ideal world, all non-deterministic effects should live at the infrastructure level.</p>
<p>In the following posts we’ll explore techniques to help us isolate non-deterministic effects and make code like this easily testable.</p>
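<p>As a small preview, and not necessarily the technique the later posts will use, one common option is to inject the time source. The sketch below assumes a hypothetical <em>ClockedExampleService</em> that receives a <em>java.time.Clock</em>, so a test can pass a fixed clock and get the same result on every call.</p>
<pre><code class="lang-scala">import java.time.{Clock, Instant, ZoneOffset}

// Same logic as ExampleService, but the time source is injected
// instead of being read from a global.
class ClockedExampleService(clock: Clock) {
  def myMethod(value: Double): Double =
    value + clock.millis()
}

object ClockedExampleServiceTest extends App {
  // A fixed clock makes the result depend only on the input.
  val fixedClock = Clock.fixed(Instant.ofEpochMilli(1000L), ZoneOffset.UTC)
  val service = new ClockedExampleService(fixedClock)

  assert(service.myMethod(1.5) == 1001.5)
  assert(service.myMethod(1.5) == 1001.5) // repeatable
}
</code></pre>
<p>In production the service would receive <em>Clock.systemUTC()</em>, which keeps the non-deterministic effect at the edge of the system.</p>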
<h3 id="heading-mutable-state">Mutable State:</h3>
<p>In an application, mutable state is the data that belongs to it and can change over time as events happen in the system. Almost all useful modern software applications have state. We usually store the state of an application in some kind of database, but code can also have state (through mutable variables). As an example of code with state, take a look at this piece of code:</p>
<pre><code class="lang-scala">class ExampleService {
  // Mutable state shared by every call to myMethod
  var statefulVariable: Double = 0

  def myMethod(value: Double): Double =
    value + statefulVariable
}
</code></pre>
<p>In this example a mutable variable is defined in the class scope. This variable makes the class behave non-deterministically from the caller’s point of view: the same input to myMethod can return different results depending on the variable’s current value. If we decide to have stateful code, we need to handle a whole new set of problems, like concurrency and non-deterministic behaviour. As a rule, we should try to isolate state in infrastructure or in an external system, but sometimes performance requirements don’t allow it. In following blog posts, we’ll explore some strategies to isolate state. Spoiler: there are tools like actors or channels that allow us to handle stateful code in a safer way.</p>
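<p>The effect is easy to reproduce. The snippet below repeats the class so that it is self-contained and adds a hypothetical <em>MutableStateExample</em> runner: the same input produces two different results once something else changes the variable between calls.</p>
<pre><code class="lang-scala">class ExampleService {
  var statefulVariable: Double = 0

  def myMethod(value: Double): Double =
    value + statefulVariable
}

object MutableStateExample extends App {
  val service = new ExampleService

  val first = service.myMethod(1.0)
  service.statefulVariable = 5.0 // any other caller or thread could do this
  val second = service.myMethod(1.0)

  // Same input, different output: the result depends on hidden state.
  assert(first == 1.0)
  assert(second == 6.0)
}
</code></pre>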
<h3 id="heading-coverage">Coverage</h3>
<p>If we visualize our domain layer as a graph, the more paths our domain logic has, the harder it is to test, because we need more tests to cover all possible paths. In general, adding abstractions increases complexity and therefore the number of tests needed. This is why we only abstract when we’re very sure that we’re creating a lot of unnecessary boilerplate, and why we aren’t afraid of some duplication. We prefer simplicity over easiness.</p>
<p>Testing all paths in acceptance tests is way harder than testing them with unit tests because they have more moving parts to set up. It’s also more expensive because acceptance tests are the slowest. When we develop we want fast feedback, and when we run our test pipeline we want it to be as fast as possible because we like continuous integration and deployment. We’re not obsessed with the code coverage percentage: we see it as a consequence of a good test suite rather than an objective.</p>
<h3 id="heading-some-key-points-of-a-good-test-suite">Some key points of a good test suite:</h3>
<ul>
<li><strong>Non-fragile tests:</strong> We don’t want to break a lot of tests when we refactor something, and we don’t want false positives either. So we try to test our <a target="_blank" href="https://en.wikipedia.org/wiki/System_under_test">SUT</a> from outside without knowing how it’s implemented internally. We try to test use cases instead of classes.</li>
<li><strong>Fast:</strong> Some of letgo’s services have hundreds of tests. We’ve set a maximum of 10 minutes for the test pipeline. If it takes longer than that, we should look into what’s happening and how we can speed it up.</li>
<li><strong>Confidence:</strong> Our build always needs to be green on the master branch. If something fails in master we should stop what we’re doing and fix it.</li>
<li><strong>Cheap</strong>: We try to be pragmatic. The theory is nice, but we also have deadlines, so sometimes we need to choose the option that costs the least time, which means that some services might only have some acceptance tests. We’ll talk about applying software economics to testing in further posts.</li>
<li><strong>Readability</strong>: We try not to abstract too much in our test code. We’d rather have two 50-line tests than two 5-line tests with multiple (and often rigid) abstractions or helpers. We want to be able to read a test 6 months after writing it and understand what it does.</li>
</ul>
<h3 id="heading-when-we-run-our-tests">When we run our tests</h3>
<p>At letgo we work with git branches and GitHub pull requests. We try to keep pull requests as small as possible and, once one is merged into master, we want to deploy it to production. Every developer usually runs at least the unit tests on their local machine, but that’s not guaranteed. Before a pull request is eligible to be merged, all tests should pass on the team CI server and the changes should be reviewed by other devs. These tests are run against a virtual merge of the branch with master. After merging we usually run our tests again, but this time we also run other processes like code quality and static analysis tools, code coverage, etc. Once this build has passed, it’s deployed to staging and then to production.</p>
]]></content:encoded></item></channel></rss>