Author: René Cannaò

Compiling ProxySQL 2.0.0

ProxySQL 2.0 is nearing GA, and a lot of folks in the MySQL world are eager to check out many of the new features, such as GTID consistent reads, frontend SSL, and Galera native support. Packages will of course be released for all the supported Linux o…

SSL Encryption at ProxySQL Part 1

As we all know, MySQL supports using SSL to secure connections by encrypting the data in transit and protecting it from snooping on the wire.
Since version v1.2.0e, ProxySQL has supported SSL connections to the backends. Frontend SSL is enabled by default from version 2.0, which is under development.
Even though frontend SSL is not available before version 2.0, there is a way to secure frontend traffic: deploy ProxySQL alongside the application hosts and use Unix sockets instead of TCP.
This document will cover how to integrate ProxySQL backend SSL with MySQL to use an encrypted connection.
Enable Encryption on MySQL
Configuring MySQL for encrypted connections is out of the scope of this article; use the information found at the following link to enable SSL on the server: https://dev.mysql.com/doc/refman/5.7/en/using-encrypted-connections.html
First, we must check whether the MySQL server supports SSL connections.
Log into MySQL:
mysql> show global variables like '%ssl%';
+---------------+-----------------+
| Variable_name | Value           |
+---------------+-----------------+
| have_openssl  | YES             |
| have_ssl      | YES             |
| ssl_ca        | ca.pem          |
| ssl_capath    |                 |
| ssl_cert      | server-cert.pem |
| ssl_cipher    |                 |
| ssl_crl       |                 |
| ssl_crlpath   |                 |
| ssl_key       | server-key.pem  |
+---------------+-----------------+
Create a user that permits only SSL-encrypted connections on the MySQL server:
GRANT ALL PRIVILEGES ON *.* TO 'sysbench'@'172.16.1.%' IDENTIFIED BY 'sysbench' REQUIRE SSL;
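To double-check that the account really rejects unencrypted logins, you can inspect the account metadata. A minimal sketch (on MySQL 5.7, mysql.user.ssl_type shows ANY for accounts created with REQUIRE SSL):

-- Run on the MySQL server; 'sysbench' is the account created above.
SELECT user, host, ssl_type FROM mysql.user WHERE user = 'sysbench';
-- ssl_type = 'ANY' means the server refuses non-SSL connections for this account.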
Test a secure connection:
root@proxysql-host:/data/benchmark/v2.0# mysql -usysbench -psysbench -h 172.16.1.102 -P3306 -e '\s' | grep SSL

SSL: Cipher in use is DHE-RSA-AES256-SHA
As ProxySQL forwards traffic to all backend servers, we need to keep the same *.pem files on all database instances.
You can copy the following files from any DB node to all backends.
Remember that you have to change their ownership from the root user to mysql.
root@mysql-host:/data/mysql# ls -lhtr *.pem
-rw-r--r-- 1 mysql mysql 1.1K Mar 22 08:07 ca.pem
-rw------- 1 mysql mysql 1.7K Mar 22 08:07 ca-key.pem
-rw------- 1 mysql mysql 1.7K Mar 22 08:07 server-key.pem
-rw-r--r-- 1 mysql mysql 1.1K Mar 22 08:07 server-cert.pem
-rw------- 1 mysql mysql 1.7K Mar 22 08:07 client-key.pem
-rw-r--r-- 1 mysql mysql 1.1K Mar 22 08:07 client-cert.pem
-rw-r--r-- 1 mysql mysql  452 Mar 22 08:07 public_key.pem
-rw------- 1 mysql mysql 1.7K Mar 22 08:07 private_key.pem

Once you are done, restart MySQL servers.
Enable Encryption on ProxySQL
At this stage, connection attempts to host 127.0.0.1 and port 6033 will not use SSL, because no key and no certificate have been configured; normal non-SSL connections will be established instead.
We must now transfer ca.pem, client-cert.pem, and client-key.pem to the ProxySQL server, under the folder /var/lib/proxysql/.
mysql> SELECT * FROM global_variables WHERE variable_name LIKE 'mysql%ssl%';
+-------------------------------------+----------------+
| variable_name                       | variable_value |
+-------------------------------------+----------------+
| mysql-have_ssl                      | false          |
| mysql-session_idle_show_processlist | true           |
| mysql-ssl_p2s_ca                    | (null)         |
| mysql-ssl_p2s_cert                  | (null)         |
| mysql-ssl_p2s_key                   | (null)         |
| mysql-ssl_p2s_cipher                | (null)         |
+-------------------------------------+----------------+
6 rows in set (0.00 sec)
Currently, as seen in the ProxySQL configuration, SSL-related variables are not defined. We will have to change this.
UPDATE mysql_servers SET use_ssl=1 WHERE port=3306;
LOAD MYSQL SERVERS TO RUNTIME;
SAVE MYSQL SERVERS TO DISK;
First of all, we have to tell ProxySQL that our backend nodes use SSL. Setting the use_ssl column in mysql_servers will do the trick. Remember that you have to load the changed configuration to runtime, and then save it to disk.
Let's see how it looks now:
mysql> select hostgroup_id,hostname,port,status,max_connections,use_ssl from mysql_servers;
+--------------+--------------+------+--------+-----------------+---------+
| hostgroup_id | hostname     | port | status | max_connections | use_ssl |
+--------------+--------------+------+--------+-----------------+---------+
| 0            | 172.16.1.103 | 3306 | ONLINE | 3000            | 1       |
| 0            | 172.16.1.104 | 3306 | ONLINE | 3000            | 1       |
| 0            | 172.16.1.105 | 3306 | ONLINE | 3000            | 1       |
| 0            | 172.16.1.102 | 3306 | ONLINE | 3000            | 1       |
+--------------+--------------+------+--------+-----------------+---------+
4 rows in set (0.00 sec)
As can be seen above, all looks good. Now it’s time to configure ProxySQL to use keys and certificates that will allow it to connect to the backend using SSL:
SET mysql-ssl_p2s_cert="/var/lib/proxysql/client-cert.pem";
SET mysql-ssl_p2s_key="/var/lib/proxysql/client-key.pem";
SET mysql-ssl_p2s_ca="/var/lib/proxysql/ca.pem";
SET mysql-ssl_p2s_cipher='ECDHE-RSA-AES256-SHA';

LOAD MYSQL VARIABLES TO RUNTIME;
SAVE MYSQL VARIABLES TO DISK;
Again, after setting those variables, remember to load the configuration to runtime and save it to persistent storage. Let's see how it looks now:
mysql> SELECT * FROM global_variables WHERE variable_name LIKE 'mysql%ssl%';
+-------------------------------------+-----------------------------------+
| variable_name                       | variable_value                    |
+-------------------------------------+-----------------------------------+
| mysql-have_ssl                      | false                             |
| mysql-session_idle_show_processlist | true                              |
| mysql-ssl_p2s_ca                    | /var/lib/proxysql/ca.pem          |
| mysql-ssl_p2s_cert                  | /var/lib/proxysql/client-cert.pem |
| mysql-ssl_p2s_key                   | /var/lib/proxysql/client-key.pem  |
| mysql-ssl_p2s_cipher                | ECDHE-RSA-AES256-SHA              |
+-------------------------------------+-----------------------------------+
6 rows in set (0.00 sec)
Everything looks as expected.
Once we are done with the above changes, we have to restart the ProxySQL service.
At this point, all new connections to host 127.0.0.1 and port 6033 will use SSL.
We can verify this by executing the commands below:
root@sysbench-host:/data/benchmark/v2.0# mysql -h127.0.0.1 -P6033 -usysbench -psysbench
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.5.30 (ProxySQL)
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> SHOW SESSION STATUS LIKE 'Ssl_cipher';
+---------------+----------------------+
| Variable_name | Value                |
+---------------+----------------------+
| Ssl_cipher    | ECDHE-RSA-AES256-SHA |
+---------------+----------------------+
1 row in set (0.00 sec)
Using SSL creates some overhead. To understand how big it is, we performed a simple read-only test using sysbench against ProxySQL, with backend SSL enabled and disabled. ProxySQL was configured to use 4 internal threads, and we are happy to report that the results are quite consistent.
For reference, we used the following sysbench command:
for conn in 1 8 128 ; do for i in $(seq 1 3) ; do echo "${conn}:${i}"; ulimit -n 1048576; ./src/sysbench ./src/lua/oltp_read_only.lua --db-debug=on --report-interval=20 --table-size=70000000 --tables=20 --mysql-db=sbtest_rw --mysql-user=sysbench --mysql-password=sysbench --db-driver=mysql --mysql-host=127.0.0.1 --max-requests=0 --mysql-port=6033 --db-ps-mode=disable --skip-trx=on --threads=${conn} --max-time=60 run ; ulimit -n 1048576; ./src/sysbench ./src/lua/oltp_read_only.lua --db-debug=on --report-interval=20 --table-size=70000000 --tables=20 --mysql-db=sbtest_rw --mysql-user=sysbench --mysql-password=sysbench --db-driver=mysql --mysql-host=127.0.0.1 --max-requests=0 --mysql-port=6033 --db-ps-mode=disable --skip-trx=on --threads=${conn} --max-time=120 run | tee /data/benchmark/v2.0/v2.0_4threads${conn}connections.${i}_line.log ; done ; done
For more details: https://github.com/sysown/proxysql/wiki/SSL-Support
Authored by: Ashwini Ahire

Releasing ProxySQL 1.4.8

Proudly announcing the release of the latest stable version of ProxySQL, 1.4.8, as of the 11th of April 2018.
ProxySQL is a high performance, high availability, protocol aware proxy for MySQL. It can be downloaded here, and freely usable and accessible acc…

ClickHouse and ProxySQL queries rewrite

ClickHouse and MySQL do not have the same SQL syntax and functions
ProxySQL is a popular open source, high performance and protocol-aware proxy server for MySQL and its forks.
Back in September 2017 we announced the introduction of ClickHouse as a backend.
ClickHouse is an open source column-oriented database management system capable of real time generation of analytical data reports using SQL queries.
To support ClickHouse as a backend, ProxySQL acts as a data bridge between MySQL protocol and ClickHouse protocol, allowing MySQL clients to execute queries in ClickHouse through it.
When we started using it in real-world scenarios, we faced an expected issue: ClickHouse's SQL syntax is different from MySQL's, and migrating an application from MySQL to ClickHouse isn't just a matter of changing the connection endpoint; it also requires modifying some queries. This needs development time, which is not always available.
One of ProxySQL's most widely used features is indeed the ability to rewrite queries, so often it is just a matter of writing the right query rules.
Let's look at an example.
We originally had this query:
SELECT COUNT(`id`), FROM_UNIXTIME(`created`, '%Y-%m') AS `date` FROM `tablename` GROUP BY FROM_UNIXTIME(`created`, '%Y-%m')
ClickHouse doesn’t support FROM_UNIXTIME, but it supports toDate and toTime.
ClickHouse also supports toYear and toMonth, useful to format the date the same way FROM_UNIXTIME does.
Therefore, it is possible to rewrite the query as:
SELECT COUNT(`id`), concat(toString(toYear(toDate(created))), '-', toString(toMonth(toDate(created)))) AS `date`
FROM `tablename`
GROUP BY toYear(toDate(created)), toMonth(toDate(created));
To perform the above rewrite we would need two rules: one for the FROM_UNIXTIME in the select list, and one for the one in the GROUP BY clause. Or we can use just one rewrite rule that replaces FROM_UNIXTIME(created, '%Y-%m') wherever it appears, whether in the retrieved fields or in the GROUP BY clause, generating the following query:
SELECT COUNT(`id`), concat(toString(toYear(toDate(created))), '-', toString(toMonth(toDate(created)))) AS `date`
FROM `tablename`
GROUP BY concat(toString(toYear(toDate(created))), '-', toString(toMonth(toDate(created))));
Does it look great? No, not yet!
For the month of March, concat(toString(toYear(toDate(created))), '-', toString(toMonth(toDate(created)))) returns 2018-3: not what the application was expecting, as MySQL would return 2018-03. The same applies to the first 9 months of each year.
Finally, we rewrote the query as follows, and the application was happy:
SELECT COUNT(`id`), substring(toString(toDate(created)),1,7) AS `date`
FROM `tablename`
GROUP BY substring(toString(toDate(created)),1,7);
Note: because of the datatype conversions that ClickHouse needs to perform in order to execute the above query, its execution time is about 50% slower than that of the following query:
SELECT COUNT(`id`), concat(toString(toYear(toDate(created))), '-', toString(toMonth(toDate(created)))) AS `date`
FROM `tablename`
GROUP BY toYear(toDate(created)), toMonth(toDate(created));
Architecture using 2 ProxySQLs
Great, we now know how to rewrite the query!
However, the ClickHouse module in ProxySQL doesn't support query rewrite: it is only responsible for translating data between the MySQL protocol and the ClickHouse protocol, and vice versa.
Therefore, the right way of achieving this is to configure two ProxySQL layers: one instance responsible for rewriting the query and sending the rewritten query to the second ProxySQL instance, which is responsible for executing the (already modified) query on ClickHouse.
Architecture using only one ProxySQL
Does the above architecture seem complex? Not really, it is reasonably straightforward.
Can it be improved?
As you can see from the previous chart, the ClickHouse module and the MySQL module listen on different ports. The first ProxySQL instance receives traffic on port 6033 and sends it to the second ProxySQL instance on port 6090.
Are two instances really required? The answer is no.
In fact, a single instance can receive MySQL traffic on port 6033, rewrite the query, and send the rewritten query to itself on port 6090, to finally execute the rewritten query on ClickHouse.
This diagram describes the architecture:
Configuration
For reference, below are the steps to configure a single ProxySQL to send traffic to ClickHouse, using itself as a backend.
Create ClickHouse user:
INSERT INTO clickhouse_users (username,password) VALUES ('clicku','clickp');
LOAD CLICKHOUSE USERS TO RUNTIME;
SAVE CLICKHOUSE USERS TO DISK;
Create MySQL user (same as ClickHouse):
INSERT INTO mysql_users(username,password) SELECT username, password FROM clickhouse_users;
LOAD MYSQL USERS TO RUNTIME;
SAVE MYSQL USERS TO DISK;
Configure ProxySQL itself as a backend for MySQL traffic:
INSERT INTO mysql_servers(hostname,port) VALUES ('127.0.0.1',6090);
SAVE MYSQL SERVERS TO DISK;
LOAD MYSQL SERVERS TO RUNTIME;
Create a query rule for rewriting queries:
INSERT INTO mysql_query_rules (active,match_pattern,replace_pattern,re_modifiers) VALUES
(1,"FROM_UNIXTIME\(`created`, '%Y-%m'\)", 'substring(toString(toDate(created)),1,7)',"CASELESS,GLOBAL");
LOAD MYSQL QUERY RULES TO RUNTIME;
SAVE MYSQL QUERY RULES TO DISK;
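To confirm the rule is actually matching traffic, ProxySQL's statistics can help. A small sketch (run on the admin interface; stats_mysql_query_rules exposes a per-rule hit counter):

SELECT rule_id, hits FROM stats_mysql_query_rules;
-- The hits counter grows each time a query matched the rewrite rule above.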
This is a very simple example to demonstrate how to perform query rewrites from MySQL to ClickHouse using just one ProxySQL instance.
In real-world scenarios you will need to create more rules based on your own queries.
Conclusion
ProxySQL not only allows sending queries to ClickHouse, it also allows rewriting queries to solve issues related to differences in SQL syntax and available functions.
To achieve this, ProxySQL uses its ability to act as its own backend: the query is rewritten in the MySQL module and executed in the ClickHouse module.

GTID consistent reads

Adaptive query routing based on GTID tracking
ProxySQL is a layer 7 database proxy that understands the MySQL Protocol. It provides high availability and high performance out of the box. ProxySQL has a rich set of features, and the upcoming version 2.0 has new exciting features, like the one described in this blog post.
One of the most commonly used features is query analysis and query routing to provide read/write split.
When a client connected to ProxySQL executes a query, ProxySQL first checks its rules (configured by a DBA) and determines on which MySQL server the query needs to be executed. A minimalistic example is to send all reads to slaves and all writes to the master. Of course, this example is not to be used in production as-is, because sending reads to slaves may return stale data; while stale data is acceptable for some queries, it is not for others. For this reason, we generally suggest sending all traffic to the master; thanks to the statistics that ProxySQL makes available, a DBA can then create more precise routing rules in which only specific sets of queries are routed to the slaves. More details and examples are available in a previous blog post.
ProxySQL routing can be customized a lot: it is even possible to use next_query_flagIN to specify that after a specific query, the next query (or the next set of queries) should be executed on a determined hostgroup. For example, you can specify that after a specific INSERT, the following SELECT should be executed on the same server, as in the sketch below. Although this solution is powerful, it is also complex to configure, because the DBA needs to know the application logic to determine which queries are read-after-write.
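For illustration, here is a hedged sketch of that next_query_flagIN approach; the hostgroups (0 = master, 1 = slaves) and the orders table are hypothetical:

-- After a matching INSERT, tag the next query on that connection with flagIN=100.
INSERT INTO mysql_query_rules (rule_id, active, match_digest, destination_hostgroup, next_query_flagIN, apply)
VALUES (10, 1, '^INSERT INTO orders', 0, 100, 1);
-- The tagged query (the read-after-write) is pinned to the master as well.
INSERT INTO mysql_query_rules (rule_id, active, flagIN, destination_hostgroup, apply)
VALUES (11, 1, 100, 0, 1);
LOAD MYSQL QUERY RULES TO RUNTIME;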
Causal consistency reads
Some applications cannot work properly with stale data, but they can operate properly with causally consistent reads: this happens when a client is able to read the data it has written. Standard MySQL asynchronous replication cannot guarantee that. In fact, if a client writes to the master and immediately tries to read the data from a slave, it is possible that the data has not yet been replicated to the slave.
How can we guarantee that a query is executed on a slave only if a certain write event has already been replicated?
GTID helps here.
Since MySQL 5.7.5, a client can know (from the OK packet returned by the MySQL server) the GTID of its last write, and can execute a read on any slave where that GTID has already been executed. Lefred described the process in a blog post with an example: the server needs to have session_track_gtids enabled, and the client can execute WAIT_FOR_EXECUTED_GTID_SET on a slave to wait for replication to apply the event there.
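A rough sketch of that manual pattern (the GTID literal, table name, and timeout are placeholders; WAIT_FOR_EXECUTED_GTID_SET returns 0 once the set is applied, or 1 on timeout):

-- On the master, per session: return the generated GTID in the OK packet.
SET SESSION session_track_gtids = OWN_GTID;
-- ... the client performs a write and reads its GTID from the OK packet ...
-- On a candidate slave: wait up to 1 second for that GTID, then read.
SELECT WAIT_FOR_EXECUTED_GTID_SET('aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee:23', 1);
SELECT balance FROM accounts WHERE id = 42; -- the hypothetical read-after-write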
Although this solution works, it has the following drawbacks:

executing WAIT_FOR_EXECUTED_GTID_SET before the real query adds latency to every read
the client doesn't know which slave will be in sync first, so it needs to check many (or all) slaves
it is possible that none of the slaves will be in sync within an acceptable amount of time (will you wait a few seconds before running the query?)

That being said, the above solution is not usable in production and is mostly a POC.
Can ProxySQL track GTID?
ProxySQL acts as a client for MySQL. Therefore, if session_track_gtids is enabled, ProxySQL can track the GTID of all the clients' requests and know exactly the last GTID for each client's connection. ProxySQL can then use this information to send a read to the right slave, where that GTID has already been executed. But how can ProxySQL track the GTIDs executed on the slaves?
There are mostly two approaches:

Pull: at regular intervals, ProxySQL queries all the MySQL servers to retrieve the GTID executed set
Push: every time a new write event is executed on a MySQL server (master or slave) and a new GTID is generated, ProxySQL is immediately notified

Needless to say, the pull method is not very efficient: it will almost surely introduce latency that depends on how frequently ProxySQL queries the MySQL servers to retrieve the GTID executed set. The less frequent the check, the less accurate it is; the more frequent the check, the more precise, but also the more load it puts on the MySQL servers, and with hundreds of ProxySQL instances it inefficiently uses a lot of bandwidth. In other words, this solution is neither efficient nor scalable.
What about the push method?
Real time retrieval of GTID executed set
Real-time retrieval of the GTID executed set is technically simple: consume and parse the binlog in real time! However, if ProxySQL becomes a slave of every MySQL server, it is easy to conclude that this will consume CPU resources on the ProxySQL instance. To make things worse, in a setup with a lot of ProxySQL instances, if each instance needs to receive replication events from every MySQL server, network bandwidth soon becomes a bottleneck. For example, think of what happens if you have 4 clusters, each cluster has 1 master generating 40GB of binlog per day plus 5 slaves, and there are 30 ProxySQL instances in total: if each ProxySQL instance becomes a slave of all 24 MySQL servers, this solution consumes nearly 30TB of network bandwidth per day (and you don't want that if you pay for bandwidth usage).
ProxySQL Binlog Reader
The naive push method described above doesn't scale and consumes too many resources. For this reason, a new tool was implemented: ProxySQL Binlog Reader.

ProxySQL Binlog Reader is a lightweight process that runs on the MySQL server; it connects to the MySQL server as a slave and tracks all the GTID events.
ProxySQL Binlog Reader is itself a server: when a client connects to it, it starts streaming GTIDs in a very efficient format to reduce bandwidth.

By now you can easily guess who will be the clients of ProxySQL Binlog Reader: all the ProxySQL instances.
Real time routing
ProxySQL Binlog Reader allows ProxySQL to know in real time which GTIDs have been executed on every MySQL server, slaves and master alike. Thanks to this, when a client executes a read that requires causal consistency, ProxySQL immediately knows on which server the query can be executed. If for whatever reason the write has not been replicated to any slave yet, ProxySQL knows that the write was executed on the master and sends the read there.
Advanced configuration, and support for many clusters
ProxySQL is extremely configurable, and this is true for this feature as well. The most important tuning is that you can configure whether a read should provide causal consistency or not, and, if causal consistency is required, you need to specify which hostgroup it should be consistent with. This last detail is very important: you don't simply enable causal consistency, you specify that a read on hostgroup B should be causally consistent with hostgroup A. This allows ProxySQL to implement the algorithm on any number of hostgroups and clusters, and also allows a single client to execute queries on multiple clusters (sharding), knowing that the causally consistent read will be executed on the right cluster. A minimal configuration sketch follows.
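To give an idea of what this looks like, here is a hedged configuration sketch using ProxySQL 2.0's gtid_port column in mysql_servers (where each backend's ProxySQL Binlog Reader listens; port 3307 is an assumption) and the gtid_from_hostgroup column in mysql_query_rules (hostgroup 0 as writer and 1 as readers are assumptions):

-- Tell ProxySQL where each backend's Binlog Reader can be reached.
UPDATE mysql_servers SET gtid_port = 3307;
LOAD MYSQL SERVERS TO RUNTIME;
-- Reads routed to hostgroup 1 must be causally consistent with writes on hostgroup 0.
INSERT INTO mysql_query_rules (rule_id, active, match_digest, destination_hostgroup, gtid_from_hostgroup, apply)
VALUES (20, 1, '^SELECT', 1, 0, 1);
LOAD MYSQL QUERY RULES TO RUNTIME;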
Requirements
Causal reads using GTID are only possible if the following requirements hold (a quick sanity check follows the list):

ProxySQL 2.0 is used (older versions do not support it)
the backend is MySQL version 5.7.5 or newer: older versions do not have the capability to use this functionality
replication binlog format is ROW
GTID is enabled (that sounds almost obvious).
backends are either Oracle’s or Percona’s MySQL Server: MariaDB Server does not support session_track_gtids, but I hope it will be available soon.
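A quick way to sanity-check the MySQL-side requirements on each backend (a sketch; the variables below exist on MySQL 5.7.5+):

SELECT @@version, @@gtid_mode, @@binlog_format, @@session_track_gtids;
-- Expect: version 5.7.5+, gtid_mode = ON, binlog_format = ROW,
-- and session_track_gtids set to OWN_GTID (or ALL_GTIDS).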

Conclusion
The upcoming release of ProxySQL 2.0 is able to track executed GTIDs in real time from all the MySQL servers in a replication topology. Adaptive query routing based on GTID tracking makes causally consistent reads possible: ProxySQL routes reads to a slave where the needed GTID event has already been executed. This solution scales very well with limited network usage, and is already being tested in production environments.
Do you want to know more?
Join us at Percona Live 2018 and attend our session Consistent Reads Using ProxySQL and GTID

ProxySQL new routing algorithm to handle hundreds of thousands of schemas/shards

ProxySQL is a service designed to scale and handle traffic in very large setups.
In the past we already showed examples of how ProxySQL can handle thousands of MySQL servers, and millions (yes, millions!) of distinct users.
However, recently a new challenge arose: can ProxySQL perform routing based on MySQL schemaname for a very large number of schemas?
We know that in production setups ProxySQL already performs routing to MySQL servers based on schemaname for a few hundred schemas, but can it handle thousands or more?
Technically there is no limit on how many shards ProxySQL can handle, as long as there are rules for them.
Tibor Korocz already pointed out in a blog post that more query rules can affect performance.
In the case of query routing based on schemaname, no regular expressions are involved (therefore each rule is not expensive), but the same principle applies: the more rules that need to be evaluated, the longer it takes to make a decision (in this case, to perform the routing).
The scope of this blog post is to understand the performance implications of routing based on the number of rules.
To simplify the benchmark setup, we used only 1 MySQL server as a backend, and we created 50000 schemas:
for i in `seq 10001 60000` ; do
echo "CREATE DATABASE IF NOT EXISTS shard_$i;"
done | mysql -u root
To run the benchmark we used a varying number of rules.
For every test, we configured ProxySQL with the matching number of rules: for 100 schemas we created 100 rules, for 1000 schemas 1000 rules, and so on.
For example, for 200 rules we used this configuration:
(
echo "DELETE FROM mysql_query_rules; INSERT INTO mysql_query_rules (active,username,cache_ttl) VALUES (1,\"sbtest\",120000);"
for i in `seq 10001 10200` ; do
echo "INSERT INTO mysql_query_rules (active,username,schemaname,destination_hostgroup,apply) VALUES (1,\"sbtest\",\"shard_$i\",1,1);"
done
echo "LOAD MYSQL QUERY RULES TO RUNTIME;"
) | mysql -u admin -padmin -h 127.0.0.1 -P6032
We then ran queries like the following:
for j in `seq 1 50000` ; do
echo "USE shard_$(($RANDOM%200+10001))" ; echo "SELECT 1;"
done | mysql -u sbtest -psbtest -h 127.0.0.1 -P6033 --ssl-mode=disabled -NB > /dev/null
It is relevant to note that we aren't interested in the total execution time (this is why we used this simple one-liner to generate traffic), but in the time spent inside the Query Processor.
Because the execution time of the queries is not relevant, we also configured the query cache.
To compute the time spent in the Query Processor we set the variable mysql-stats_time_query_processor='true' and read the value of the status variable Query_Processor_time_nsec, which measures the time spent inside the Query Processor, in nanoseconds.
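For reference, a sketch of how those counters can be read from the admin interface; the average is computed from the deltas between two samples:

UPDATE global_variables SET variable_value='true' WHERE variable_name='mysql-stats_time_query_processor';
LOAD MYSQL VARIABLES TO RUNTIME;
-- Sample before and after the test run:
SELECT * FROM stats_mysql_global WHERE Variable_Name IN ('Query_Processor_time_nsec','Questions');
-- average latency = delta(Query_Processor_time_nsec) / delta(Questions), in nanoseconds per query.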
As expected, the more query rules ProxySQL needs to evaluate, the longer it takes to compute the destination hostgroup for a schemaname. This results in latency before executing the query:
The graph above shows that for 10 rules the average latency is around 1us (microsecond), for 100 rules the average latency is 2us, and that it gradually grows to 6us for 500 rules and 256us for 20000 rules.
An average latency of 6us for 500 rules doesn't seem to be an issue, but an average latency of 256us (0.256ms) for 20000 rules does look like one.
What is also important to note is that these values are averages across all the shards. Routing decisions for some shards are faster than for others, depending on the order in which the rules are written.
In fact, with 20000 rules the average latency for shard_10001 is 0.89us, while the average latency for shard_30000 is 581.1us!
Clearly, this is not a scalable solution.
MySQL Query Rules Fast Routing
ProxySQL 1.4.7 introduces a new routing capability that enhances what is already configurable in mysql_query_rules.
The new capability is configurable through the table mysql_query_rules_fast_routing. This table is simpler than mysql_query_rules, and its purpose is simple: given username, schemaname, and flagIN, find the destination_hostgroup.
For reference, this is the table definition of mysql_query_rules_fast_routing:
CREATE TABLE mysql_query_rules_fast_routing (
username VARCHAR NOT NULL,
schemaname VARCHAR NOT NULL,
flagIN INT NOT NULL DEFAULT 0,
destination_hostgroup INT CHECK (destination_hostgroup >= 0) NOT NULL,
comment VARCHAR NOT NULL,
PRIMARY KEY (username, schemaname, flagIN) )
Table mysql_query_rules_fast_routing should be considered as an extension of mysql_query_rules. After processing the rules in mysql_query_rules, the rules in mysql_query_rules_fast_routing will be evaluated, unless the last matching rule defined in mysql_query_rules sets apply=1.
Because of the nature of the rules in mysql_query_rules_fast_routing, one and only one rule will be evaluated.
The rules defined in mysql_query_rules_fast_routing are loaded into a hash table, where the key is (username, schemaname, flagIN) and the value is the destination_hostgroup.
New routing algorithm in action
This is the command we used to populate mysql_query_rules_fast_routing with a varying number of rules, up to 50000 in this example:
(
echo "DELETE FROM mysql_query_rules; DELETE FROM mysql_query_rules_fast_routing;"
echo "INSERT INTO mysql_query_rules (active,username,cache_ttl) VALUES (1,\"sbtest\",120000);"
for i in `seq 10001 60000` ; do
echo "INSERT INTO mysql_query_rules_fast_routing (username,schemaname,flagIN,destination_hostgroup,comment) VALUES (\"sbtest\",\"shard_$i\",0,1,\"\");"
done
echo "LOAD MYSQL QUERY RULES TO RUNTIME;"
) | mysql -u admin -padmin -h 127.0.0.1 -P6032
The results are really interesting:
In other words:

between 10 rules and 10k rules, the average latency is between 1.7us and 1.9us
at 20k rules the average latency is 2.2us
at 50k rules the average latency is 2.4us

What is really interesting is that the new routing algorithm allows ProxySQL to perform schema-based routing for up to 10000 shards while introducing less latency than the old algorithm introduced for 100 shards, and it scales very well up to 50k rules with very small overhead compared to 100 shards under the old algorithm.
Disabling mysql-stats_time_query_processor by default
In ProxySQL 1.4.4 the new variable mysql-stats_time_query_processor was introduced to dynamically enable or disable measuring the time spent in the Query Processor. Although timing is very important to understand the impact of having too many rules, measuring elapsed time with good precision has some performance impact.
In fact, on the same server where we ran these benchmarks, measuring elapsed time with very good precision has a cost (latency) of 0.3us, mostly spent in kernel space (system calls).
That also means that you should subtract 0.3us from the metrics above to determine the latency when time measurement is disabled. For example, "at 50k rules the average latency is 2.4us" becomes "at 50k rules the average latency is 2.1us".
For this reason, since ProxySQL 1.4.4, mysql-stats_time_query_processor has been disabled by default.
Why is flagIN relevant?
The new routing algorithm performs routing based on username, schemaname, and flagIN.
Some may ask why flagIN is relevant. flagIN in mysql_query_rules_fast_routing allows a flagOUT computed in mysql_query_rules (based on other criteria) to drive the final routing decision.
For example, assume that you don't want to perform routing based only on username+schemaname, but also want read/write split.
You can use mysql_query_rules to:

set flagOUT = 1 for all queries to be sent to the server that is the master, no matter in which cluster
set flagOUT = 2 for all queries to be sent to the server that is the slave, no matter in which cluster

The value of flagOUT resulting from mysql_query_rules becomes the flagIN in mysql_query_rules_fast_routing. This allows combining read/write split with routing based on username+schemaname, as in the sketch below.
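Here is a hedged sketch of such a combined setup; hostgroups 10 (masters) and 11 (slaves) are assumptions, and note that neither classification rule sets apply=1, so evaluation falls through to the fast routing hash:

-- In mysql_query_rules: SELECTs leave the chain with flag 2...
INSERT INTO mysql_query_rules (rule_id, active, username, match_digest, flagOUT, apply)
VALUES (1, 1, 'sbtest', '^SELECT', 2, 0);
-- ...anything still carrying flag 0 (i.e. the writes) leaves with flag 1.
INSERT INTO mysql_query_rules (rule_id, active, username, flagOUT, apply)
VALUES (2, 1, 'sbtest', 1, 0);
-- mysql_query_rules_fast_routing then resolves (username, schemaname, flagIN) to a hostgroup.
INSERT INTO mysql_query_rules_fast_routing (username, schemaname, flagIN, destination_hostgroup, comment)
VALUES ('sbtest', 'shard_10001', 1, 10, 'writes');
INSERT INTO mysql_query_rules_fast_routing (username, schemaname, flagIN, destination_hostgroup, comment)
VALUES ('sbtest', 'shard_10001', 2, 11, 'reads');
LOAD MYSQL QUERY RULES TO RUNTIME;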
For benchmark purposes, let's create 200,000 rules:
(
echo "DELETE FROM mysql_query_rules; DELETE FROM mysql_query_rules_fast_routing; INSERT INTO mysql_query_rules (active,username,cache_ttl) VALUES (1,\"sbtest\",120000);"
for i in `seq 10001 60000` ; do
echo "INSERT INTO mysql_query_rules_fast_routing (username,schemaname,flagIN,destination_hostgroup,comment) VALUES (\"sbtest\",\"shard_$i\",0,1,\"\");"
done
echo "INSERT INTO mysql_query_rules_fast_routing SELECT username, schemaname, flagIN+1, destination_hostgroup, comment FROM mysql_query_rules_fast_routing;"
echo "INSERT INTO mysql_query_rules_fast_routing SELECT username, schemaname, flagIN+2, destination_hostgroup, comment FROM mysql_query_rules_fast_routing;"
echo "LOAD MYSQL QUERY RULES TO RUNTIME;"
) | mysql -u admin -padmin -h 127.0.0.1 -P6032
mysql -u admin -padmin -h 127.0.0.1 -P6032 -e "SELECT COUNT(*) FROM mysql_query_rules_fast_routing"
+----------+
| COUNT(*) |
+----------+
|   200000 |
+----------+
Updating the graph after testing routing based on 200k rules:
As you can see from the graph above, ProxySQL is now able to perform routing decisions based on 200k rules with almost no extra impact compared to 100 rules under the legacy algorithm.
Drawback of the new algorithm
This new routing algorithm has a minimum overhead of 1.7us (or 1.4us with mysql-stats_time_query_processor disabled).
This minimum overhead is not present in the old/legacy algorithm, so it makes sense to use the new routing algorithm only if you have more than 100 routing rules.
Furthermore, ProxySQL Cluster capability doesn’t support mysql_query_rules_fast_routing yet.
Memory usage
To reduce contention between worker threads in ProxySQL, each thread maintains its own copy of the query rules. This means that the more rules you have and the more threads you have configured, the more memory is needed. In this example, loading 200k rules and using 4 threads led to a memory usage of 280MB.
In the future we plan to make it configurable whether, for query processing, ProxySQL should use more memory and be lock free, use less memory and not be lock free, or a mix of the two options:

mysql_query_rules to be lock free

mysql_query_rules_fast_routing to be shared between threads

These are implementation details we will be looking into when planning to optimize ProxySQL even further.
Conclusion
ProxySQL 1.4.7 introduces a new routing algorithm that doesn't replace the legacy algorithm, but enhances it.
Thanks to the new routing algorithm, ProxySQL can now easily handle routing based on hundreds of thousands of schemas/shards with almost no impact (a few microseconds).

Releasing ProxySQL 1.4.6

Proudly announcing the release of the latest stable version of ProxySQL, 1.4.6, as of the 1st of February 2018.
ProxySQL is a high performance, high availability, protocol aware proxy for MySQL.
It can be downloaded here, and freely usable and accessibl…

Compiling ProxySQL on FreeBSD

Compiling ProxySQL 1.4.6 on FreeBSD 11.1
FreeBSD is the operating system of choice for many MySQL installations and a topic that often comes up in the ProxySQL Google Forums as well as the ProxySQL Github Issue Tracker is related to ProxySQL on FreeBS…

Benchmarking ProxySQL 1.4.4

Comparing ProxySQL 1.4.3 vs. 1.4.4
ProxySQL 1.4.4 has recently been released (GA on Dec 20, 2017), which naturally leads to the question: "What performance gains are to be expected when moving to the new version?". In this article we compare performance between 1.4.3 and 1.4.4 in a CPU-bound workload. The instances are configured with the default settings for the initial benchmark, and then again after tuning one specific variable, mysql-max_stmts_per_connection, which can lead to substantial performance gains.
Let's first discuss what the ProxySQL variable mysql-max_stmts_per_connection affects and how it is evaluated. ProxySQL maintains a counter for each backend connection, which is incremented each time a statement is prepared on that connection. Just before the connection is returned to the pool, this counter is evaluated against mysql-max_stmts_per_connection: if the threshold is exceeded, the connection is closed (behaviour up to version 1.4.3) or reset (behaviour starting with version 1.4.4, which improves the efficiency of connection handling, as pointed out in Percona's blog post regarding XtraDB Cluster best practices for AWS).
Note: the mysql-max_stmts_per_connection variable is set to 20 by default and can be tuned upward; however, keep in mind that when increasing mysql-max_stmts_per_connection you may also need to increase the value of the MySQL variable max_prepared_stmt_count, which has a maximum limit of 1048576. A sketch of both changes follows.
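As a sketch, the variable can be raised at runtime through the ProxySQL admin interface, with the matching ceiling raised on the MySQL side:

-- On the ProxySQL admin interface (port 6032):
UPDATE global_variables SET variable_value=100 WHERE variable_name='mysql-max_stmts_per_connection';
LOAD MYSQL VARIABLES TO RUNTIME;
SAVE MYSQL VARIABLES TO DISK;
-- On each MySQL backend (1048576 is the documented maximum):
SET GLOBAL max_prepared_stmt_count = 1048576;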
In the graphs below, performance is compared between ProxySQL versions 1.4.3 and 1.4.4, with two benchmarks for each (one with mysql-max_stmts_per_connection at its default of 20 and another with the variable set to 100). A sysbench benchmark was executed for 300 seconds with 64 threads performing a mixture of point and range selects on 10 tables of 40,000,000 rows each, against a 3-node Percona XtraDB Cluster, with each node running on 40 cores and 1 Gbps NICs.
The key averages for each benchmark are as follows (mspc = mysql-max_stmts_per_connection; latency in ms):

+---------------------------+-----------------+-----------------+------------------+------------------+
| Metric                    | 1.4.3 (default) | 1.4.4 (default) | 1.4.3 (mspc 100) | 1.4.4 (mspc 100) |
+---------------------------+-----------------+-----------------+------------------+------------------+
| QPS (average)             | 126179.53       | 141689.68       | 155537.27        | 163209.93        |
| Latency (average)         | 14.67           | 13.10           | 11.93            | 11.37            |
| Latency (95th percentile) | 18.28           | 16.41           | 15.55            | 14.73            |
+---------------------------+-----------------+-----------------+------------------+------------------+
It is interesting to note that ProxySQL 1.4.4 is 13% faster out of the box (with default settings) than ProxySQL 1.4.3.
It is also quite interesting to see that, when mysql-max_stmts_per_connection is tuned to 100 for this specific workload, ProxySQL 1.4.3 itself performs 24% faster.
With ProxySQL 1.4.4, tuning mysql-max_stmts_per_connection to 100 makes performance 15% faster than its own defaults, and still 5% faster than the tuned ProxySQL 1.4.3, as the code is more efficient in the newer release.

A similar trend can be seen in terms of latency between the two versions of ProxySQL and the tuning of the mysql-max_stmts_per_connection variable.
Once again, ProxySQL 1.4.4 exhibits the lowest latency (especially when mysql-max_stmts_per_connection is tuned higher than the default value).
Naturally, the effects of the mysql-max_stmts_per_connection variable largely depend on your workload; a synthetic read-only sysbench workload serves mainly for comparative purposes.

Based on the above benchmarks, it is fair to say that ProxySQL 1.4.4 delivers more consistent and efficient performance than 1.4.3, resulting in at least a 13% improvement in average QPS and a 15% improvement in average latency out of the box.
For reference, the command used to benchmark was:

sysbench --threads=64 /usr/share/sysbench/oltp_read_only.lua --tables=10 --table-size=40000000 --report-interval=5 --rand-type=pareto
--forced-shutdown=1 --time=300 --events=0 --percentile=95 --mysql-user=sbtest --mysql-password=$pw --mysql-db=sbtest10t40M
--mysql-storage-engine=INNODB --mysql-host=127.0.0.1 --mysql-port=6033 --point-selects=25 --range_size=5 --skip_trx=on run
Authored by: Nick Vyzas

Releasing ProxySQL 1.4.4

I am happy to announce the release of the latest stable release of ProxySQL 1.4.4 on 20 December 2017.
ProxySQL is a high performance, high availability, protocol aware proxy for MySQL.
It can be downloaded here, and freely usable and accessible accor…
