Blog | SparkFabrik

How to Choose a Service Mesh in 2025: solutions and advantages

Written by SparkFabrik Team | Jun 26, 2025 9:00:50 AM

How can you improve the observability, security, and reliability of a microservices application? Service meshes are a modern and powerful answer to these challenges. In this article we clear things up: what they are, how they work, and when it's worth implementing them.

What are service meshes 

A service mesh is a dedicated infrastructure layer that adds functionality to the network between services. As you can imagine, it's particularly useful in the context of microservices applications, where the app is divided into numerous small services that communicate with each other over the network using various protocols (mainly HTTP and gRPC).

These components communicate over a network that is inherently unreliable, exposed to attacks, and hard to observe. Service meshes solve the problem by managing traffic between services and ensuring higher levels of reliability, observability, control, and security, all without requiring application code changes.

Monitoring, traffic management, resilience features, and security can be configured once and applied automatically to all services in the mesh. Without a service mesh, these and other features must be developed in each individual service, at a cost in time (and therefore money) and with a greater risk of bugs and security issues.

How service meshes work and the Sidecar model

After defining what we're talking about, let's look at some more technical details of how they work. A service mesh is based on an architecture composed of two distinct planes: the data plane and the control plane. The data plane is formed by a network of sidecar proxies, software components deployed alongside each microservice that intercept and manage all incoming and outgoing traffic. These proxies perform key functions such as request routing, security policy enforcement, load balancing, and data collection for observability, without requiring changes to the application code.

The sidecar model makes it possible to isolate so-called cross-cutting concerns (CCC), such as logging, retries, authentication, and authorization, moving them outside the microservice's business logic. This approach improves maintainability and lets developers focus exclusively on domain functionality.
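To make this concrete, here is a minimal sketch of how sidecar injection is enabled in Istio, one of the implementations discussed later in this article: labeling a namespace tells the mesh to inject an Envoy sidecar into every pod created there (the namespace name is hypothetical).

```yaml
# Hypothetical namespace; the istio-injection label is Istio's
# standard switch for automatic sidecar injection.
apiVersion: v1
kind: Namespace
metadata:
  name: shop
  labels:
    istio-injection: enabled
```

From this point on, every pod deployed in `shop` gets an Envoy proxy container alongside the application container, with no change to the application images.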

The control plane is responsible for centralized management of the mesh. It dynamically configures all the proxies distributed in the data plane, orchestrates communication policies, collects metrics and logs, and interfaces with observability tools. Through this layer, mesh rules and behaviors can be updated in real time, without restarting services.

Together, data plane and control plane constitute the service mesh infrastructure, enabling consistent and secure management of communication between microservices in complex and dynamic environments, such as Cloud Native ones.

Service Mesh vs API Gateway: differences and complementarity 

Now, what do service meshes have to do with API gateways? The two concepts are distinct, yet the tools are often confused; let's see how they differ.

An API gateway acts as an entry point for external calls, exposing APIs and managing authentication, rate limiting, payload transformation, and versioning. It's a fundamental component for all those applications that interact with external clients (web apps, mobile apps, IoT, etc.).

A service mesh, for its part, operates within the application perimeter, governing communications between microservices. It handles internal routing, retries, resilience, end-to-end security, and distributed observability.

The two tools are not alternatives but complementary. In a well-structured architecture, the API gateway manages the interface to the outside, while the service mesh handles security, visibility, and reliability of internal traffic, ensuring consistent policies and granular control in complex environments. However, in simple or poorly structured contexts, an advanced API gateway can cover some functionalities typical of a service mesh, although it doesn't offer the same level of control, observability, and automation at scale.

Spring Boot and service mesh: a winning combination 

After seeing the differences between API gateways and service meshes and the role they play in modern architectures, it's useful to understand how these technologies integrate concretely in building microservices applications. A practical example is Spring Boot, one of the most widely used open source frameworks for creating enterprise microservices. Spring Boot makes it possible to create independent, performant microservices rapidly, but it doesn't handle network traffic management, nor the security, resilience, or observability concerns that characterize a distributed architecture.

This is where the service mesh comes into play, integrating seamlessly with services built in Spring Boot. Thanks to the sidecar architecture, Spring Boot microservices can delegate all cross-cutting functionality to the mesh without modifying their own code. This keeps the business logic clean and lets teams focus on the application domain, leaving to tools like Istio, Linkerd, or Consul the task of managing networking, policies, security, and monitoring.

Spring Boot therefore represents a solid foundation for developing microservices in cloud native environments, while the service mesh adds a fundamental layer for managing the complexity that arises when those services start communicating with each other.
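As a sketch of what this looks like in practice, a Spring Boot service is deployed like any other container, with no mesh-specific dependency in its manifest (names, namespace, and image below are hypothetical; we assume a namespace where sidecar injection is enabled):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: catalog
  namespace: shop              # hypothetical namespace with sidecar injection enabled
spec:
  replicas: 2
  selector:
    matchLabels:
      app: catalog
  template:
    metadata:
      labels:
        app: catalog
        version: v1
    spec:
      containers:
        - name: catalog
          image: registry.example.com/catalog:1.0.0   # plain Spring Boot image
          ports:
            - containerPort: 8080                     # embedded server's default port
```

The mesh injects its proxy at deploy time; mTLS, retries, and metrics are then applied to this service without touching the Java code.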

Service mesh implementations

After seeing how service meshes integrate into application architectures, let's now move on to the solutions available for putting them into practice. There is no shortage of options: since 2017, when service meshes as described here first emerged, many products have been created to implement them. Most are based on the Envoy proxy.

Consequently, the need for a standard emerged. To this end, the Service Mesh Interface (SMI) project was born within the CNCF. The goal of the SMI APIs is to provide a common, portable set of service mesh APIs that Kubernetes users can rely on independently of the provider, so as not to be strictly tied to any specific implementation.
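As an illustration of the SMI approach, a TrafficSplit resource describes a weighted split between two versions of a service in a provider-neutral way. This is a sketch: the service names are hypothetical, and the exact `apiVersion` depends on the SMI release in use.

```yaml
apiVersion: split.smi-spec.io/v1alpha2
kind: TrafficSplit
metadata:
  name: checkout-rollout
spec:
  service: checkout             # root service that clients address
  backends:
    - service: checkout-v1      # current version keeps most of the traffic
      weight: 90
    - service: checkout-v2      # new version receives a small share
      weight: 10
```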

Several service mesh products adopted the SMI APIs, Linkerd and Consul natively and Istio through an adapter; note that the SMI specification has since been archived by the CNCF, with standardization efforts moving to the Kubernetes Gateway API. Let's look at the main products, such as Istio, Linkerd, and Consul, in more detail.

Istio: the best-known solution

Certainly the most popular solution is Istio, created and supported by Google and IBM, which also offer commercial managed versions of it.

In the Google context, Istio on Google Kubernetes Engine should be mentioned, a tool that provides automated installation and updates of Istio in the GKE cluster. For Google Cloud users, Google recommends adopting Anthos Service Mesh, which is Google's fully supported Istio distribution.

Unlike other projects, Istio is also usable outside Kubernetes. It therefore allows combining old traditional infrastructures with a Cloud Native infrastructure based on containers.

Regarding Cloud integrations, Istio works with Google Cloud, Alibaba Cloud, and IBM Cloud.

Linkerd: simplicity on Kubernetes

Another well-known product, partly because it came first, is Linkerd. Its distinguishing trait is that it runs only on Kubernetes. Designed to be non-invasive and performant, it has the advantage of requiring little time to adopt. Linkerd promises lightness, simplicity, and security.

The project was developed within the CNCF and is completely open source. The Cloud integration offered is with DigitalOcean.

Consul Service Mesh

Consul Service Mesh, developed by HashiCorp, is a versatile solution designed for multi-platform environments. It can operate both in Kubernetes and in bare metal and VM-based environments, and is particularly appreciated for flexible management of distributed services.

Consul supports mutual TLS (mTLS), advanced authorization policies, as well as integrations with discovery and automation systems. Additionally, it's available in open source and enterprise versions, with features dedicated to governance between teams and multi-domain management.

AWS App Mesh

AWS App Mesh is the solution proposed by Amazon Web Services. It's not open source, but is completely managed by AWS and designed to be used with Amazon ECS, Amazon EKS, Fargate, and EC2.

Also based on Envoy proxy, AWS App Mesh enables native visibility on traffic between microservices, with monitoring, control, resilience, and distributed tracing functionalities, all integrated into the AWS ecosystem.

Open Service Mesh

Open Service Mesh (OSM) is an open source service mesh created by Microsoft, built for Kubernetes and fully compatible with the SMI standard.

It's designed to be simple to install, secure, and easily extensible. OSM integrates well with Azure Kubernetes Service (AKS), but can also be used in generic Kubernetes environments.

Kuma

Kuma, developed by Kong, is a modern, multi-zone service mesh designed for both Kubernetes and VM-based environments, with support for sidecar proxies or a proxyless mode.

It supports mTLS, advanced routing, observability, and stands out for a user-friendly administrative interface. It's suitable for organizations operating on multiple clusters or in hybrid environments, thanks to its global control-plane and distributed data-plane architecture.

How to choose a service mesh implementation

In short, there's no shortage of solutions, quite the opposite. But such a rich landscape can be disorienting at first: which implementation should you adopt? Let's try to define some guidelines.

The first consideration concerns the type of cloud used. If the application is on a public cloud, the advice is certainly to adopt the product that the vendor proposes. In the case of Google, Anthos Service Mesh; for AWS, AWS App Mesh.

Things change when we talk about private and hybrid clouds. If you only have Kubernetes clusters, the most direct and simple solution is a lightweight product such as Linkerd (Kubernetes-only) or Kuma.

Another scenario is a hybrid cloud where, besides Kubernetes clusters, there are also virtual machines or even bare metal servers. To treat the entire network with the same functionality available for containers, you need to adopt a more flexible product, like Istio. This allows managing a wide spectrum of infrastructures, but bear in mind that it is considerably more complex than other products (Linkerd above all).

Finally, one piece of general advice: always prefer products that adopt or implement standards and that are backed by large vendors.

Advantages of service meshes 

Developing microservices, as we know, simplifies and accelerates numerous processes (we talked about it in this article on the advantages of microservices and Cloud Native applications). On the other hand, clearly, a distributed context presents greater complexities regarding observability, monitoring, management, and problem resolution. This is where service meshes come into play: let's see what the main benefits are.

Security of communication between services

Communications from one service to another (so-called east-west traffic) can be subject to different types of attacks. Service meshes make it possible to secure inter-service communication automatically through encryption, mTLS, authentication, and authorization.

These aspects are fundamental in a context as complex as microservices, where adopting and enforcing high security standards could otherwise be very expensive and difficult, especially given the technological heterogeneity of the services (e.g., different programming languages).

The most common needs are:

  • Defend against man-in-the-middle attacks by encrypting traffic.
  • Enforce flexible access control, using mTLS and access policies.
  • Understand who did what and when, using auditing systems.
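Taking Istio as an example (it is the implementation discussed just below), these needs map to a handful of declarative resources. A sketch, with hypothetical namespace and service-account names:

```yaml
# Require mTLS for all workloads: applied in the root namespace
# (istio-system by default), this PeerAuthentication is mesh-wide.
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT
---
# Access policy: only the frontend's service account may call orders.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: orders-allow-frontend
  namespace: shop
spec:
  selector:
    matchLabels:
      app: orders
  rules:
    - from:
        - source:
            principals: ["cluster.local/ns/shop/sa/frontend"]
```

Because the proxies enforce these policies, every service in the mesh gets encrypted, authenticated traffic regardless of the language it is written in.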

Istio today offers the most complete and sophisticated security system, capable of covering every aspect in detail. For more information, we recommend this in-depth look at Istio's security features, aimed at mitigating internal and external threats.

Resilience: the microservices system doesn't get blocked

By resilience we mean the system's ability to keep functioning even if one or more microservices stop working. Service meshes foster resilience in various ways:

  • Circuit breaker: if service A calls service B and B doesn't respond, the service mesh stops forwarding requests to B and fails fast, returning an error to A immediately. This prevents A from piling up blocked calls and protects the system from overload; traffic is restored once B is healthy again.
  • Timeout: timeouts ensure that the calling microservice is not blocked for too long waiting for a reply, which could otherwise prevent it from accepting new requests.
  • Retry: when a service doesn't respond, the service mesh retries the call automatically a configurable number of times; if the error is transient, a new attempt can make the request succeed.
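Sticking with Istio as an example, the three mechanisms above are plain configuration rather than application code. A sketch with hypothetical service names:

```yaml
# Timeout and retry policy for calls to the orders service.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: orders
spec:
  hosts:
    - orders
  http:
    - route:
        - destination:
            host: orders
      timeout: 2s                      # callers never wait longer than this
      retries:
        attempts: 3                    # automatic retries on failure
        perTryTimeout: 500ms
        retryOn: 5xx,connect-failure
---
# Circuit breaking: eject endpoints that keep failing.
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: orders
spec:
  host: orders
  trafficPolicy:
    outlierDetection:
      consecutive5xxErrors: 5          # open the circuit after 5 straight errors
      interval: 10s
      baseEjectionTime: 30s
```

Every client of `orders` gets this behavior from its sidecar proxy, with no retry or timeout logic in the services themselves.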

Observability of wire data, tracing, and logging

Service meshes acquire wire data such as origin, destination, protocol, URL, status codes, latency, duration, and similar. Once detected, metrics and logs are collected by the control plane and passed to the chosen monitoring tool.

Service meshes also support tracing. In the context of microservices, tracing is crucial for understanding dependencies between services and setting up root cause analysis. Correct tracing allows identifying sequence problems, service call tree anomalies, and request-specific problems.

Another important aspect is logging. Thanks to service meshes, network logs are uniform regardless of the technologies used in the individual microservices, because all traffic between services passes through the mesh's proxies (Envoy, in most implementations).

Traffic control with advanced functionalities

Service meshes route requests between microservices, and from the outside to the correct microservice. Additionally, they enable advanced functionality such as:

  • Canary release: this process involves deploying an instance of the new version of a service while the previous version is still active. Thanks to service meshes, it's possible to direct traffic to both services to verify the correct functioning of the new version, while still having the previous one available to potentially do a rollback.
  • Mirroring: in this case, the new and old versions of a microservice run in parallel. Live traffic sent to the old version is duplicated to the new one, whose responses are discarded, so differences in behavior can be studied in detail without affecting users.
  • A/B testing: two versions of services are tested to verify which is the most performant (for example from the point of view of generated revenue). Both versions receive traffic from user groups selected according to precise criteria or randomly.
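As an example, in Istio a canary release of this kind is a weighted routing rule plus a rule defining the two version subsets; mirroring is then a matter of adding a `mirror:` clause to the same route. A sketch with hypothetical names:

```yaml
# Send 90% of traffic to v1 and 10% to the canary v2.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: recommendations
spec:
  hosts:
    - recommendations
  http:
    - route:
        - destination:
            host: recommendations
            subset: v1
          weight: 90
        - destination:
            host: recommendations
            subset: v2
          weight: 10
---
# Subsets are defined by pod labels in a DestinationRule.
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: recommendations
spec:
  host: recommendations
  subsets:
    - name: v1
      labels:
        version: v1
    - name: v2
      labels:
        version: v2
```

Gradually shifting the weights toward v2 completes the rollout; setting them back performs the rollback.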

Service mesh on multi-cluster

Ultimately, who should use service meshes? The answer is simple: anyone developing microservices applications. On one hand, introducing a service mesh adds initial complexity in terms of cluster installation, management, and understanding of its basic logic. However, this initial effort pays off quickly, with benefits observable in the short term.

In the tech community, the importance of service meshes is now established, to the point that the discourse no longer revolves around their possible adoption, but around their use in multi-cloud and multi-cluster environments.

Today the need is to be able to instantiate clusters on multiple providers and manage them all in the same way. An application must be able to talk to service A on GCP and to service B on AWS without implementing networking mechanisms more complex than the service mesh itself. Future developments in this area are therefore expected to be of particular interest for companies that need to manage this kind of complexity.

When not to use a service mesh 

Despite the benefits we've seen so far, the service mesh is not always the best choice, especially in projects with limited requirements or in the initial phase. In particular, if the application is composed of few microservices and the DevOps team has limited resources, adopting a mesh can introduce significant operational and cognitive overhead, not justified by the advantages.

Moreover, implementation requires specific knowledge, additional compute resources, and a non-trivial initial configuration phase. In these cases, security and monitoring functionality can be obtained through simpler solutions, such as advanced API gateways, manually managed sidecar proxies, or native observability tools.

Before adopting a service mesh, it's therefore useful to carefully evaluate the stage of technological maturity, the infrastructure complexity, and the team's ability to maintain the system over time. In short, the mesh should not be seen as a prerequisite, but as a strategic lever to activate when needed.

Evolution of service meshes: towards a proxyless model? 

Precisely to overcome these limits and simplify their adoption, service meshes are rapidly evolving with new approaches and architectures.

The service mesh is in continuous evolution. After years of adoption based on the sidecar model, the main projects are exploring proxyless or hybrid approaches. These experiments aim to improve scalability and reduce resource consumption.

An example is Istio's Ambient Mesh, which removes sidecars in favor of a shared per-node proxy (ztunnel) for Layer 4 traffic, with optional waypoint proxies, external to the pod, for Layer 7 features. This architecture drastically reduces CPU and memory consumption, simplifies onboarding of legacy workloads, and decreases certificate management complexity.

The adoption of proxyless architectures allows extending the mesh to new domains, such as VMs, serverless applications, or edge workloads, without requiring a profound modification of the environment. For CTOs, it means having greater flexibility in modeling the service network without compromising on policies or security.

In conclusion, service meshes are a valuable tool, useful for improving security, control, and visibility in microservices systems. Want to understand how to adopt them correctly and grow your architecture? Consider a support path like Cloud Native Journey by SparkFabrik, which helps you step by step in the transformation to the Cloud Native paradigm.