Multi-tenant systems are invaluable for modern, fast-paced businesses. These systems allow multiple users and teams to access and use them at the same time. Machine learning operations (MLOps) teams, in particular, benefit greatly from using multi-tenant systems. MLOps teams that don’t leverage multi-tenant systems can fall victim to inefficiency, inconsistency, duplicative work, and bumpy onboarding—adding friction to already complex workstreams. Let’s take a look at the benefits of multi-tenant systems for MLOps teams, challenges for multi-tenancy, best practices to scale efficiently, and what the future may look like for multi-tenancy.

A multi-tenant system allows more than one user to work within it without their work being hampered. Google Drive and Salesforce are excellent examples of best-in-class multi-tenant systems. They allow large companies to develop a single body of work on a single system, reducing the cost of ownership by eliminating duplicate support efforts.

In the context of MLOps, the benefits of using a multi-tenant system are manifold. Machine learning engineers, data scientists, analysts, modelers, and other practitioners contributing to MLOps processes often need to perform similar activities with similar software stacks. It is hugely beneficial for a company to maintain only one instance of the stack or its capabilities: this cuts costs, saves time, and enhances collaboration. In essence, MLOps teams on multi-tenant systems can be considerably more efficient because they aren’t wasting time switching between different stacks or systems.

Growing demand for multi-tenancy

Adoption of multi-tenant systems is growing, and for good reason. These systems help unify compute environments, discouraging those scenarios where individual groups set up their own bespoke systems. Fractured compute environments like these are highly duplicative and exacerbate cost of ownership because each group likely needs a dedicated team to keep their local system operational. This also leads to inconsistency. In a large company, you might have some groups running software that is on version 7 and others running version 8. You may have groups that use certain pieces of technology but not others. The list goes on. These inconsistencies create a lack of common understanding of what’s happening across the system, which then exposes the potential for risk.
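To make the version inconsistency described above concrete, here is a minimal Python sketch of how one might surface it from an inventory of what each group runs. The group names, component names, and the `find_version_drift` helper are all hypothetical and used only for illustration; a real inventory would come from whatever deployment records a company already keeps.

```python
from collections import defaultdict


def find_version_drift(deployments):
    """Return components that are running on more than one version.

    `deployments` maps a group name to a {component: version} dict.
    The result maps each drifting component to {version: [groups]}.
    """
    by_component = defaultdict(lambda: defaultdict(list))
    for group, components in deployments.items():
        for component, version in components.items():
            by_component[component][version].append(group)
    return {
        component: {version: sorted(groups) for version, groups in versions.items()}
        for component, versions in by_component.items()
        if len(versions) > 1  # more than one version in use means drift
    }


# Hypothetical inventory: three groups, each running its own bespoke stack.
deployments = {
    "fraud-team": {"feature-store": "7", "trainer": "2"},
    "marketing-team": {"feature-store": "8", "trainer": "2"},
    "risk-team": {"feature-store": "8"},
}

drift = find_version_drift(deployments)
print(drift)
```

On a single shared platform this check would have nothing to report, because there is only one instance of each component to version.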

Ultimately, multi-tenancy is not a feature of a platform: It’s a baseline security capability. It’s not sufficient to simply plaster on security as an afterthought. It needs to be a part of a system’s fundamental architecture. One of the greatest benefits for teams that endeavor to build multi-tenant systems is the implicit architectural commitment to security, because security is inherent to multi-tenant systems.
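One way to picture security as part of the architecture rather than an afterthought is a shared store in which no read or write is possible without naming a tenant. The sketch below is only an illustration under that assumption: `ModelRegistry`, `TenantIsolationError`, and the tenant names are hypothetical, and a production system would also need authentication, auditing, and much more.

```python
class TenantIsolationError(Exception):
    """Raised when a tenant asks for something outside its own namespace."""


class ModelRegistry:
    """A shared registry in which every entry is keyed by (tenant, name)."""

    def __init__(self):
        self._models = {}  # (tenant_id, name) -> artifact

    def register(self, tenant_id, name, artifact):
        self._models[(tenant_id, name)] = artifact

    def get(self, tenant_id, name):
        # The tenant is part of the key, so an unscoped lookup cannot be expressed.
        try:
            return self._models[(tenant_id, name)]
        except KeyError:
            raise TenantIsolationError(
                f"{name!r} is not visible to tenant {tenant_id!r}"
            ) from None

    def list_models(self, tenant_id):
        return sorted(name for (tenant, name) in self._models if tenant == tenant_id)


# Two hypothetical teams sharing one registry instance.
registry = ModelRegistry()
registry.register("fraud-team", "scoring-model", b"weights-a")
registry.register("marketing-team", "churn-model", b"weights-b")

print(registry.list_models("fraud-team"))
```

Because the tenant identifier is part of every key, isolation does not depend on each caller remembering to add a filter; a request from one team for another team's model raises `TenantIsolationError`.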

Challenges and best practices

Multi-tenant systems have clear benefits, but they don’t come without challenges. One of the main hurdles for these systems, regardless of discipline, is scale. Whenever any scaling operation kicks off, patterns emerge that likely weren’t apparent before.

As you begin to scale, you garner more diverse user experiences and expectations. Suddenly, you find yourself in a world where users begin to interact with whatever is being scaled and use the tool in ways that you hadn’t anticipated. The bigger and more fundamental challenge is that you’ve got to be able to manage more complexity.

When you’re building something multi-tenant, you’re likely building a common operating platform that multiple users are going to use. This is an important consideration. Something that is multi-tenant is also likely to become a fundamental part of your business because it’s such a meaningful investment. 

To successfully execute on building multi-tenant systems, strong product management is crucial, especially if the system is built by and for machine learning experts. It’s important that the people designing and building a domain-specific system have deep fluency in the field, enabling them to work backward from their end users’ requirements and capabilities while being able to anticipate future business and technology trends. This need is only underscored in evolving domains like machine learning, as demonstrated by the proliferation and growth of MLOps systems.

Aside from these best practices, make sure to obsessively test each component of the system and the interactions and workflows they enable—we’re talking hundreds of times—and bring in users to test each element and emergent property of functionality. Sometimes, you’ll find that you need to implement things in a particular way because of the business or technology. But you really want to be true to your users and how they’re using the system to solve a problem. You never want to misinterpret a user’s needs. A user may come to you and say, “Hey, I need a faster horse.” You may then spend all your time training a faster horse, when what they actually needed was a more reliable and rapid means of conveyance that isn’t necessarily powered by hay.

Finally, focus on iterative programming—it may feel like it’s a slow burn, but it will save you time and resources in the long run because you’ve done the legwork and sorted out the kinks before they come back to haunt you. 

The future of multi-tenancy 

This is an exciting space to be in and the momentum is expected to continue. We can expect to see continuous investment in cloud technologies and other fully managed services. Particularly within AI, ML, and MLOps, things are moving rapidly—so much so that whenever someone recommends a new piece of technology or software, it’s out-of-date almost immediately. What really matters now, and will matter even more in the future, is the ability to iterate quickly. What we’re going to see happen more and more is companies, large and small, working toward mastering such agility. The more they do, the more progress we will see and the more exciting the future becomes. 

This content was produced by Capital One. It was not written by MIT Technology Review’s editorial staff.
