This repo is an experiment how works sharding via Vitess. Vitess is an application on GoLang which resolves problems if your MySQL becomes very big. For example, it can help do MySQL sharding. So, imagine that we have 3 tables of MySQL and 2 of them are massive. Once we consider that to separate our 2 tables between other servers is good idea. And now you can practice here before you're going to shard your production database.
Here you can see how communicate containers between each other.
Picture from Vitess architecture
So we have 9 containers that it works.
| № | Container | Description |
|---|---|---|
| 1 | app (Application) |
PHP Application which make queries to Vitess. |
| 2 | topology | Helps to communicate whole parts of system between each other. |
| 3 | vtctl (vtctl + vtctld) | CLI and API service to admin a Vitess cluster. |
| 4 | unsharded-node (VTTablet + MySQL) | Serve unsharded data. |
| 5 | transitional-node (VTTablet + MySQL) | Assist in transferring data to the sharded space. |
| 6 | left-shard-node (VTTablet + MySQL) | First shard with data. |
| 7 | right-shard-node (VTTablet + MySQL) | Second shard with data. |
| 8 | vtgate | Serve SQL queries. |
| 9 | vtadmin | GUI. |
| 10 | vtorc | Automated fault detection and repair tool. |
Each node has three MySQL and VTTablet instances: primary, replica, replica with semi sync replication.
System requirements:
- Docker
- Minimal resource allocation
- CPU 2
- RAM 8GB
This is an example of resource utilization while the app container does query every 0.0001s single thread
Let's install our cluster!
Just run.
git clone https://github.com/revilon1991/vitess-cluster.git
cd vitess-cluster
docker-compose up -dContainer unsharded-node run MySQL and VTTablet and load sql schema.
Keyspace will be unsharded for this node.
There will be a schema of three tables - Device, User, RequestLog.
Container vtgate load sql dataset and users.
Dataset will be load to keyspace unsharded for all tables.
Container transitional-node run MySQL and VTTablet.
This node will be in new sharded keyspace.
It's node for migration two tables Device, User from unsharded keyspace.
As well, we'll switch traffic on Device and User tables to new sharded keyspace.
Traffic on table RequestLog keeps on unsharded keyspace.
That action name is MoveTables.
Actual, it needs for vertical sharding or preparing to horizontal sharding.
Container left-shard-node run MySQL and VTTablet in keyspace sharded.
After that we'll make new shard which will be contained ~50% data from Device and User tables.
The data have not been migrated from transitional-node yet. It'll happen after initialing right-shard-node.
Now the traffic for Device and User going only to transitional-node of sharded keyspace.
Container right-shard-node run MySQL and VTTablet in keyspace sharded.
After that we'll make else once shard which will be contained second ~50% part of data from Device and User tables.
Now the data have been successfully migrated and sharded from transitional-node to left-shard-node and right-shard-node.
The traffic also was switched between left-shard-node and right-shard-node.
The metadata from transitional-node for a topology was deleted but the data you should remove by yourself.
Also, we'll set vindex (sharding key) config.
Container vtadmin run NodeJS server with web interface and run API server Vitess.
You can research your own cluster here http://localhost:14201/tablets
Container vtorc run orchestrator. It help to keep your cluster healthy.
Here http://localhost:16000/debug/status you can see recent recoveries.
Let's see how it works from simple PHP application.
docker-compose exec app ash
php index.phpThere just connection to database (VTGate) and one query with join on all tables.
$dsn = 'mysql:host=vtgate;port=15306';
$user = 'admin';
$password = 'admin';
$pdo = new PDO($dsn, $user, $password);
$sql = <<<SQL
select
*
from Device d
inner join User u on u.id = d.userId
left join RequestLog rl on rl.createdAt = d.createdAt
SQL;
$stmt = $pdo->query($sql);
$resultList = $stmt->fetchAll(PDO::FETCH_ASSOC);
var_dump($resultList);
// here will be all data Device, User from all shards and data from our first keyspace from table RequestLog.Change it, and you'll be able to do some experiments.
So, you could look that application doesn't mind about shard that just make queries. Vitess is doing all the work of merging data. Of course some restrictions is. You can see those here
| № | Service | URL |
|---|---|---|
| 1 | Tablet 101 - primary unsharded DB | http://localhost:15101/debug/status |
| 2 | Tablet 102 - replica unsharded DB | http://localhost:15102/debug/status |
| 3 | Tablet 103 - replica unsharded DB | http://localhost:15103/debug/status |
| 4 | Tablet 201 - primary transition sharded DB | http://localhost:15201/debug/status |
| 5 | Tablet 202 - replica transition sharded DB | http://localhost:15202/debug/status |
| 6 | Tablet 203 - replica transition sharded DB | http://localhost:15203/debug/status |
| 7 | Tablet 301 - primary left shard DB | http://localhost:15301/debug/status |
| 8 | Tablet 302 - replica left shard DB | http://localhost:15302/debug/status |
| 9 | Tablet 303 - replica left shard DB | http://localhost:15303/debug/status |
| 10 | Tablet 401 - primary right shard DB | http://localhost:15401/debug/status |
| 11 | Tablet 402 - replica right shard DB | http://localhost:15402/debug/status |
| 12 | Tablet 403 - replica right shard DB | http://localhost:15403/debug/status |
| 13 | Orchestrator | http://localhost:16000/debug/status |
| 14 | GUI | http://localhost:14201/tablets |
More information at Vitess Documentation.
