In part 1 of this series we stated that Nuvola is multi-tenant software, and that using a single db for every tenant can quickly become a problem as your data grows.
Let’s dig into the pros and cons of multi-tenant sharding: two advantages first, then three drawbacks.
- Scalability: Storing each tenant’s data in its own database makes it easy to distribute the storage. You can have some tenants on one machine and others on another. It’s up to you to design your database infrastructure to scale properly.
- Security: Every tenant’s data is physically separated from the others. If one tenant gets compromised, the others aren’t affected. Moreover, you can offer custom service levels to different tenants.
- Software complexity: To deal with multiple databases you need to add software logic. How do you send user X’s queries to the right tenant db? How do you execute schema migrations on every db simultaneously?
- Data aggregation: Data is split by tenant, so if you need to aggregate it you have to do it yourself. There are no relationships between different databases, so extracting reports and statistics can be very problematic.
- Complex infrastructure: Having your data in hundreds or thousands of databases surely increases the cost and time needed to keep them running properly and up to date. From daily checks to disaster recovery policies, it’s harder to be sure everything is working properly.
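To make the "software complexity" and "data aggregation" points concrete, here is a minimal sketch in Python (chosen for brevity; it is not the actual Nuvola code). It assumes three hypothetical tenants, uses in-memory SQLite connections as stand-ins for separate tenant databases, and invents an `invoices` table purely for illustration.

```python
import sqlite3

# Hypothetical tenant names; each in-memory connection stands in for a
# separate tenant database that could live on a different machine.
TENANTS = ["tenant1", "tenant2", "tenant3"]

# Illustrative schema change, not taken from the article.
MIGRATION = "CREATE TABLE invoices (id INTEGER PRIMARY KEY, amount INTEGER NOT NULL)"


def open_tenant_dbs(names):
    """Open one isolated database per tenant."""
    return {name: sqlite3.connect(":memory:") for name in names}


def migrate_all(dbs, statement):
    """Apply the same schema change to every tenant db and report failures."""
    failed = []
    for name, conn in dbs.items():
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(statement)
        except sqlite3.Error:
            failed.append(name)
    return failed


def total_invoiced(dbs):
    """Aggregate in application code: no SQL join can span tenant dbs."""
    per_tenant = {}
    for name, conn in dbs.items():
        (amount,) = conn.execute(
            "SELECT COALESCE(SUM(amount), 0) FROM invoices"
        ).fetchone()
        per_tenant[name] = amount
    return per_tenant, sum(per_tenant.values())


dbs = open_tenant_dbs(TENANTS)
failed = migrate_all(dbs, MIGRATION)
dbs["tenant1"].execute("INSERT INTO invoices (amount) VALUES (100), (50)")
dbs["tenant2"].execute("INSERT INTO invoices (amount) VALUES (25)")
per_tenant, grand_total = total_invoiced(dbs)
print(per_tenant, grand_total)
# {'tenant1': 150, 'tenant2': 25, 'tenant3': 0} 175
```

Note that the migration loop runs tenant by tenant, so with hundreds of databases some will briefly be on the new schema while others are still on the old one. The list of failed tenants is there because a partial rollout is something you have to plan for.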
As stated in part 1, there are some enterprise solutions sold by big vendors. They are ready-made, but they have at least two drawbacks: they are expensive and they lock you in to a specific provider. We wanted to avoid this, so we chose to go for a home-made solution.
If you search on the Internet for multi-tenant solutions, you often read about using subdomains to identify tenants. If you think about it for a moment, it sounds pretty straightforward. Suppose your tenant is accessible at the URL “tenant1.yourapp.com”. Your web server can set an environment variable with the value “tenant1”, so your web app can easily tie each request for that subdomain to the db for “tenant1”.
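As a rough illustration of the subdomain approach, the sketch below (Python, for brevity) derives the tenant from the request host. The base domain comes from the example URL above; the function names and the `db_<tenant>` naming scheme are assumptions made up for this example.

```python
from typing import Optional

BASE_DOMAIN = "yourapp.com"  # from the example URL above


def tenant_from_host(host: str, base_domain: str = BASE_DOMAIN) -> Optional[str]:
    """Return the tenant label for '<tenant>.<base_domain>', else None."""
    # Drop an optional port, normalise case and a trailing dot.
    hostname = host.split(":", 1)[0].strip().lower().rstrip(".")
    suffix = "." + base_domain
    if not hostname.endswith(suffix):
        return None
    label = hostname[: -len(suffix)]
    # Accept exactly one non-empty label, so 'a.b.yourapp.com' is rejected.
    if not label or "." in label:
        return None
    return label


def database_name_for(host: str) -> Optional[str]:
    """Map a request host to a tenant db name (hypothetical naming scheme)."""
    tenant = tenant_from_host(host)
    return f"db_{tenant}" if tenant else None


print(tenant_from_host("tenant1.yourapp.com"))  # tenant1
print(database_name_for("tenant1.yourapp.com"))  # db_tenant1
print(tenant_from_host("yourapp.com"))  # None
```

The appeal of this approach is that the tenant is known before any database is touched. The price is that every tenant gets its own entry point.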
For us this wasn’t acceptable because we already had a single entry point for every user. Changing it for every tenant (about 500 at the time of the migration) wasn’t an option at all.
Centralized db to let users log in
Since we wanted to keep a single domain for every tenant, we needed to find a way to let users log in and then set up the code to work with the user’s tenant.
To achieve this goal we set up a “global” database where we store every user of the system along with the tenant they belong to. User data in this global db is always kept in sync with the tenant’s db. If a user is created, modified or deleted on a tenant, the change is propagated to the global db. If a user is modified on the global db, the change is propagated to the tenant.
This way, when a user logs in, the request goes to the global db. There we check whether they can log in and which tenant they belong to. We store this information in their session, so every subsequent request is tied to the proper tenant.
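The login flow described above could look roughly like the following sketch (Python, for brevity; it is not the actual Nuvola code). The table layout, the password hashing scheme and the plain dict used as a session are all assumptions made for illustration.

```python
import hashlib
import hmac
import os
import sqlite3


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a password hash (assumed scheme: PBKDF2-HMAC-SHA256)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


def create_global_db() -> sqlite3.Connection:
    """The "global" db: every user plus the tenant they belong to."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (username TEXT PRIMARY KEY, salt BLOB NOT NULL, "
        "password_hash BLOB NOT NULL, tenant TEXT NOT NULL)"
    )
    return conn


def add_user(conn, username: str, password: str, tenant: str) -> None:
    salt = os.urandom(16)
    conn.execute(
        "INSERT INTO users VALUES (?, ?, ?, ?)",
        (username, salt, hash_password(password, salt), tenant),
    )


def login(global_db, session: dict, username: str, password: str) -> bool:
    """Check credentials in the global db and remember the tenant in the session."""
    row = global_db.execute(
        "SELECT salt, password_hash, tenant FROM users WHERE username = ?",
        (username,),
    ).fetchone()
    if row is None:
        return False
    salt, expected, tenant = row
    if not hmac.compare_digest(hash_password(password, salt), expected):
        return False
    session["username"] = username
    session["tenant"] = tenant
    return True


def tenant_db_for(session: dict, tenant_dbs: dict):
    """Later requests skip the global db: the session already names the tenant."""
    return tenant_dbs[session["tenant"]]


global_db = create_global_db()
add_user(global_db, "alice", "s3cret", "tenant1")
tenant_dbs = {
    "tenant1": sqlite3.connect(":memory:"),
    "tenant2": sqlite3.connect(":memory:"),
}
session = {}
logged_in = login(global_db, session, "alice", "s3cret")
connection = tenant_db_for(session, tenant_dbs)
```

Only the login itself hits the global db. Everything after it is routed using the tenant stored in the session.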
Let’s recap the pros and cons of this solution.
- No subdomains: Avoiding subdomains means not causing trouble or breaking changes for your users. You change your implementation but not the interface. You have more freedom to rearrange things in different ways in the future. Moreover, there is no need for a particular web server configuration because you have a simple virtual host pointing to your app: nothing less, nothing more.
- Synchronized user data: You always need to keep user data synchronized between the global db and the specific tenant. Moreover, you need to manage this extra global db. It receives every user login, so you need to take special care that it scales properly.
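To give an idea of what the synchronization requirement implies, here is a minimal sketch in Python (for brevity; it is not the actual Nuvola code). The `UserSync` class, its method names and the `email` column are hypothetical, and in-memory SQLite databases stand in for the real ones.

```python
import sqlite3


def make_global_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (username TEXT PRIMARY KEY, "
        "email TEXT NOT NULL, tenant TEXT NOT NULL)"
    )
    return conn


def make_tenant_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (username TEXT PRIMARY KEY, email TEXT NOT NULL)")
    return conn


class UserSync:
    """Mirror user changes between the global db and the owning tenant db.

    The two writes hit separate databases, so they are not atomic; a real
    system needs a retry or reconciliation strategy for partial failures.
    """

    def __init__(self, global_db, tenant_dbs):
        self.global_db = global_db
        self.tenant_dbs = tenant_dbs

    def save_on_tenant(self, tenant, username, email):
        # Create or update on the tenant, then mirror to the global db.
        self.tenant_dbs[tenant].execute(
            "INSERT OR REPLACE INTO users VALUES (?, ?)", (username, email)
        )
        self.global_db.execute(
            "INSERT OR REPLACE INTO users VALUES (?, ?, ?)", (username, email, tenant)
        )

    def delete_on_tenant(self, tenant, username):
        self.tenant_dbs[tenant].execute("DELETE FROM users WHERE username = ?", (username,))
        self.global_db.execute("DELETE FROM users WHERE username = ?", (username,))

    def update_on_global(self, username, email):
        # The global db knows which tenant must receive the change.
        row = self.global_db.execute(
            "SELECT tenant FROM users WHERE username = ?", (username,)
        ).fetchone()
        if row is None:
            raise KeyError(username)
        update = "UPDATE users SET email = ? WHERE username = ?"
        self.global_db.execute(update, (email, username))
        self.tenant_dbs[row[0]].execute(update, (email, username))


def emails(conn):
    """Return {username: email} for every user stored in a db."""
    return dict(conn.execute("SELECT username, email FROM users"))


global_db = make_global_db()
tenant_dbs = {"tenant1": make_tenant_db()}
sync = UserSync(global_db, tenant_dbs)
sync.save_on_tenant("tenant1", "alice", "alice@example.com")
sync.update_on_global("alice", "alice@new.example.com")
after_update = (emails(global_db), emails(tenant_dbs["tenant1"]))
sync.delete_on_tenant("tenant1", "alice")
after_delete = (emails(global_db), emails(tenant_dbs["tenant1"]))
```

Every write path that touches users has to go through something like this, which is exactly the extra cost this drawback describes.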
In the next article we’ll see what Symfony and Doctrine offer us to implement the chosen solution. Stay tuned!