Month: June 2017

Industry Recognition of Percona: Great Products, Great Team

Percona’s focus on customer success pushes us to deliver the very best products, services and support. Three recent industry awards reflect our continued success in achieving these goals. We recently garnered a Bronze Stevie Award for “Company of the Year – Computer Software – Medium.” Stevies are some of the most coveted awards in the tech […]

One more time with sysbench, a small server & MySQL 5.6, 5.7 and 8.0

Update – the regression isn’t as bad as I have been reporting. Read this post to understand why. The good news is that I hope to begin debugging this problem next week.

After fixing a few problems to reduce variance I am repeating tests to document the performance regression from MySQL 5.6 to 8.0. The first problem was fixed by disabling turbo boost on my Intel NUC servers to avoid thermal throttling. The other problem was the impact of mutex contention from InnoDB purge threads, so I repeated tests with innodb_purge_threads set to 1 and 4. This is part of my series on low-concurrency CPU regressions for bug 86215.

tl;dr for in-memory sysbench on a small server with a fast SSD:
most of the regression is from 5.6.35 to 5.7.17, much less is from 5.7.17 to 8.0.1
innodb_purge_threads=4 costs 10% to 15% of the QPS for write-heavy tests
QPS is 30% less for 5.7.17 & 8.0.1 vs 5.6.35 on write-only tests
QPS is 30% to 40% less for 5.7.17 & 8.0.1 vs 5.6.35 on read-write tests
QPS is 40% to 50% less for 5.7.17 & 8.0.1 vs 5.6.35 on read-only tests
QPS is 40% less for 5.7.17 & 8.0.1 vs 5.6.35 for point-query
QPS is 30% less for 5.7.17 & 8.0.1 vs 5.6.35 for insert-only

Configuration

I tested MySQL with upstream 5.6.35, 5.7.17 and 8.0.1. For 8.0.1 I used the latin1 charset and latin1_swedish_ci collation. I used the i5 NUC servers described here and the my.cnf files are here. I run mysqld and the sysbench client on the same server. The binlog is enabled but sync-on-commit is disabled. Sysbench is run with 4 tables and 1M rows per table. The database fits in the InnoDB buffer pool. My usage of sysbench is described here; that explains the helper scripts that invoke sysbench and collect performance metrics. When I return home I will update this with the sysbench command lines that are generated by my helper scripts.

Results: write-only

Sorry, no graphs this time. I run sysbench for 1, 2 and 4 concurrent clients and share the QPS for each test, then the QPS for MySQL 5.7.17 and 8.0.1 relative to 5.6.35. The ratio is less than 1 when the QPS is larger for 5.6.35. All of these tests are run with innodb_purge_threads=1, which is the default for 5.6.35; the default for 5.7.17 and 8.0.1 is 4.

The first batch of results is from write-only tests. Most of the QPS regression is from MySQL 5.6.35 to 5.7.17.
Excluding the update-index test, going from 5.6 to 5.7 loses about 30% of the QPS.

update-index : QPS
1       2       4       concurrency/engine
5806    9837    12354   inno5635
5270    8798    11677   inno5717
4909    8176    10917   inno801

update-index : QPS relative to MySQL 5.6.35
1       2       4       concurrency/engine
.91     .89     .95     inno5717
.85     .83     .88     inno801

update-nonindex : QPS
1       2       4       concurrency/engine
10435   15680   18487   inno5635
 7691   11497   14989   inno5717
 7179   10845   14186   inno801

update-nonindex : QPS relative to MySQL 5.6.35
1       2       4       concurrency/engine
.74     .73     .81     inno5717
.69     .69     .77     inno801

delete : QPS
1       2       4       concurrency/engine
19461   28797   35684   inno5635
13525   19937   25466   inno5717
12551   18810   24023   inno801

delete : QPS relative to MySQL 5.6.35
1       2       4       concurrency/engine
.69     .69     .71     inno5717
.64     .65     .67     inno801

write-only : QPS
1       2       4       concurrency/engine
16892   25376   30915   inno5635
11765   17239   22061   inno5717
10729   16108   20682   inno801

write-only : QPS relative to MySQL 5.6.35
1       2       4       concurrency/engine
.70     .68     .71     inno5717
.64     .63     .67     inno801
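The relative-QPS rows are simply the 5.7.17 and 8.0.1 throughput divided by the 5.6.35 throughput at the same concurrency. A small sketch of that arithmetic, using the update-index numbers from the table above:

```python
# Compute QPS relative to MySQL 5.6.35, as shown in the tables above.
# The lists hold update-index QPS at 1, 2 and 4 concurrent clients.
qps = {
    "inno5635": [5806, 9837, 12354],
    "inno5717": [5270, 8798, 11677],
    "inno801":  [4909, 8176, 10917],
}

base = qps["inno5635"]
for engine in ("inno5717", "inno801"):
    ratios = [round(q / b, 2) for q, b in zip(qps[engine], base)]
    print(engine, ratios)
# inno5717 [0.91, 0.89, 0.95]
# inno801 [0.85, 0.83, 0.88]
# A ratio below 1 means 5.6.35 had the higher QPS at that concurrency.
```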
Results: read-write

The next batch of results is from the classic read-write OLTP sysbench test, repeated using different sizes for the range query. The regression is larger here than for the write-only tests above, perhaps because of the regression for range scans. Going from 5.6.35 to 5.7.17 loses between 30% and 40% of the QPS, and the regression is worse for longer range scans.

read-write.range100 : QPS
1       2       4       concurrency/engine
11653   18109   25325   inno5635
 7520   10871   14498   inno5717
 6965   10274   14098   inno801

read-write.range100 : QPS relative to MySQL 5.6.35
1       2       4       concurrency/engine
.65     .60     .57     inno5717
.60     .57     .56     inno801

read-write.range10000 : QPS
1       2       4       concurrency/engine
337     604     849     inno5635
202     386     443     inno5717
200     378     436     inno801

read-write.range10000 : QPS relative to MySQL 5.6.35
1       2       4       concurrency/engine
.60     .64     .52     inno5717
.59     .63     .51     inno801

Results: read-only

The next batch of results is from the classic read-only OLTP sysbench test, repeated using different sizes for the range query. Most of the regression is from 5.6.35 to 5.7.17. Going from 5.6 to 5.7 loses between 40% and 50% of the QPS, so the regression here is larger than for the read-write tests above. There isn’t a larger regression for larger range queries.

read-only.range10 : QPS
1       2       4       concurrency/engine
17372   30663   50570   inno5635
10829   19021   25874   inno5717
10171   18743   25713   inno801

read-only.range10 : QPS relative to MySQL 5.6.35
1       2       4       concurrency/engine
.62     .62     .51     inno5717
.59     .61     .51     inno801

read-only.range100 : QPS
1       2       4       concurrency/engine
11247   20922   32930   inno5635
 6815   12823   16225   inno5717
 6475   12308   15834   inno801

read-only.range100 : QPS relative to MySQL 5.6.35
1       2       4       concurrency/engine
.61     .61     .49     inno5717
.58     .59     .48     inno801

read-only.range1000 : QPS
1       2       4       concurrency/engine
2590    4840    6816    inno5635
1591    2979    3408    inno5717
1552    2918    3363    inno801

read-only.range1000 : QPS relative to MySQL 5.6.35
1       2       4       concurrency/engine
.61     .62     .50     inno5717
.60     .60     .49     inno801

read-only.range10000 : QPS
1       2       4       concurrency/engine
273     497     686     inno5635
161     304     355     inno5717
159     299     350     inno801

read-only.range10000 : QPS relative to MySQL 5.6.35
1       2       4       concurrency/engine
.59     .61     .52     inno5717
.58     .60     .51     inno801

Results: point-query and insert-only

Finally, here are results for the last two tests, point-query and insert-only. MySQL 5.7.17 loses about 40% of the QPS for point-query and 30% of the QPS for insert-only compared to 5.6.35.

point-query : QPS
1       2       4       concurrency/engine
19674   36269   55266   inno5635
11964   22941   29174   inno5717
11624   20679   29271   inno801

point-query : QPS relative to MySQL 5.6.35
1       2       4       concurrency/engine
.61     .63     .53     inno5717
.59     .57     .53     inno801

insert : QPS
1       2       4       concurrency/engine
11288   16268   19355   inno5635
 7951   12176   15660   inno5717
 7493   11277   14857   inno801

insert : QPS relative to MySQL 5.6.35
1       2       4       concurrency/engine
.70     .75     .81     inno5717
.66     .69     .77     inno801

innodb_purge_threads

Finally, I repeated tests with innodb_purge_threads=4 to show its impact. On a small server (2 cores, 4 HW threads) there is too much mutex contention with innodb_purge_threads=4. Since 4 is the default for 5.7.17 and 8.0.1, they suffer more than 5.6.35 when defaults are used. The results above are for innodb_purge_threads=1, and I repeated the tests with it set to 4. Here I show the QPS with purge_threads=4 divided by the QPS with purge_threads=1. For the tests below, QPS is reduced by 10% to 15% when innodb_purge_threads=4 on a small server. The insert-only test doesn’t suffer, but there isn’t anything to purge in an insert-only workload.

update-index
1       2       4       concurrency/engine
.85     .76     .75     inno5635
.76     .76     .77     inno5717
.89     .96     .89     inno801

update-nonindex
1       2       4       concurrency/engine
.82     .78     .88     inno5635
.77     .79     .86     inno5717
.86     .95     .91     inno801

delete
1       2       4       concurrency/engine
.84     .81     .82     inno5635
.84     .81     .87     inno5717
.87     .92     .94     inno801

write-only
1       2       4       concurrency/engine
.89     .85     .85     inno5635
.88     .86     .87     inno5717
.91     .95     .94     inno801

insert
1       2       4       concurrency/engine
.99     .99     .99     inno5635
.99     1.00    1.00    inno5717
1.01    1.01    1.00    inno801
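For reference, the purge setting tested above is a startup option. A minimal my.cnf fragment for pinning it back to the 5.6 default on a small server (a sketch; merge it into your own config):

```ini
# my.cnf sketch: use one purge thread, the 5.6.35 default.
# innodb_purge_threads is read at server startup in 5.6 and 5.7,
# so a restart is needed for the change to take effect.
[mysqld]
innodb_purge_threads = 1
```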

On Apache Ignite, Apache Spark and MySQL. Interview with Nikita Ivanov

“Spark and Ignite can complement each other very well. Ignite can provide shared storage for Spark so state can be passed from one Spark application or job to another. Ignite can also be used to provide distributed SQL with indexing that accelerates Spark SQL by up to 1,000x.” – Nikita Ivanov.
I have interviewed Nikita Ivanov, CTO of GridGain.
The main topics of the interview are Apache Ignite, Apache Spark and MySQL, and how well they perform for big data analytics.
RVZ
Q1. What are the main technical challenges of SaaS development projects?
Nikita Ivanov: SaaS requires that the applications be highly responsive, reliable and web-scale. SaaS development projects face many of the same challenges as software development projects including a need for stability, reliability, security, scalability, and speed. Speed is especially critical for modern businesses undergoing the digital transformation to deliver real-time services to their end users. These challenges are amplified for SaaS solutions which may have hundreds, thousands, or tens of thousands of concurrent users, far more than an on-premise deployment of enterprise software.
Fortunately, in-memory computing offers SaaS developers solutions to the challenges of speed, scale and reliability.
Q2. In your opinion, what are the limitations of MySQL® when it comes to big data analytics?
Nikita Ivanov: MySQL was originally designed as a single-node system and not with the modern data center concept in mind. MySQL installations cannot scale to accommodate big data using MySQL on a single node. Instead, MySQL must rely on sharding, or splitting a data set over multiple nodes or instances, to manage large data sets. However, most companies manually shard their database, making the creation and maintenance of their application much more complex. Manually creating an application that can then perform cross-node SQL queries on the sharded data multiplies the level of complexity and cost.
MySQL was also not designed to run complicated queries against massive data sets. The MySQL optimizer is quite limited, executing a single query at a time using a single thread. A MySQL query can neither scale across multiple CPU cores in a single system nor execute distributed queries across multiple nodes.
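The manual-sharding complexity described above is easy to illustrate: routing a single-key lookup is trivial, but any cross-shard query turns into scatter-gather logic in the application. A generic sketch (the shard count and the per-shard counts are made up for illustration):

```python
N_SHARDS = 4  # four separate MySQL instances, a hypothetical layout

def shard_for(user_id):
    # Route a row to a shard by its sharding key.
    return user_id % N_SHARDS

# A single-key query touches exactly one shard:
print(shard_for(42))  # shard 2

# A cross-shard aggregate must be scatter-gathered by the application:
# run COUNT(*) on every shard, then merge the partial results yourself.
per_shard_counts = [120, 98, 134, 101]
print(sum(per_shard_counts))  # merged result: 453
```

Every join, secondary-index lookup, or resharding operation adds more of this hand-written plumbing, which is the cost the answer above refers to.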
Q3. What solutions exist to enhance MySQL’s capabilities for big data analytics?
Nikita Ivanov: Companies which require real-time analytics may attempt to manually shard their database. Tools such as Vitess, a framework YouTube released for MySQL sharding, or ProxySQL are often used to help implement sharding.
To speed up queries, caching solutions such as Memcached and Redis are often deployed.
Many companies turn to data warehousing technologies. These solutions require ETL processes and a separate technology stack which must be deployed and managed. There are many external solutions, such as Hadoop and Apache Spark, which are quite popular. Vertica and ClickHouse have also emerged as analytics solutions for MySQL.
Apache Ignite offers speed, scale and reliability because it was built from the ground up as a highly performant and highly scalable distributed in-memory computing platform.
In contrast to the MySQL single-node design, Apache Ignite automatically distributes data across the nodes in a cluster, eliminating the need for manual sharding. The cluster can be deployed on-premise, in the cloud, or in a hybrid environment. Apache Ignite easily integrates with Hadoop and Spark, using in-memory technology to complement these technologies and achieve significantly better performance and scale. The Apache Ignite In-Memory SQL Grid is highly optimized and easily tuned to execute high-performance ANSI SQL-99 queries. The In-Memory SQL Grid offers access via JDBC/ODBC and the Ignite SQL API for external SQL commands or integration with analytics visualization software such as Tableau.
Q4. What exactly is Apache® Ignite™?
Nikita Ivanov: Apache Ignite is a high-performance, distributed in-memory platform for computing and transacting on large-scale data sets in real-time. It is 1,000x faster than systems built using traditional database technologies that are based on disk or flash technologies. It can also scale out to manage petabytes of data in memory.
Apache Ignite includes the following functionality:
· Data grid – An in-memory key value data cache that can be queried
· SQL grid – Provides the ability to interact with data in-memory using ANSI SQL-99 via JDBC or ODBC APIs
· Compute grid – A stateless grid that provides high-performance computation in memory using clusters of computers and massive parallel processing
· Service grid – A service grid in which grid service instances are deployed across the distributed data and compute grids
· Streaming analytics – The ability to consume an endless stream of information and process it in real-time
· Advanced clustering – The ability to automatically discover nodes, eliminating the need to restart the entire cluster when adding new nodes
Q5. How does Apache Ignite differ from other in-memory data platforms?
Nikita Ivanov: Most in-memory computing solutions fall into one of three types: in-memory data grids, in-memory databases, or a streaming analytics engine.
Apache Ignite is a full-featured in-memory computing platform which includes an in-memory data grid, in-memory database capabilities, and a streaming analytics engine. Furthermore, Apache Ignite supports distributed ACID compliant transactions and ANSI SQL-99 including support for DML and DDL via JDBC/ODBC.
Q6. Can you use Apache® Ignite™ for Real-Time Processing of IoT-Generated Streaming Data?
Nikita Ivanov: Yes, Apache Ignite can ingest and analyze streaming data using its streaming analytics engine which is built on a high-performance and scalable distributed architecture. Because Apache Ignite natively integrates with Apache Spark, it is also possible to deploy Spark for machine learning at in-memory computing speeds.
Apache Ignite supports both high volume OLTP and OLAP use cases, supporting Hybrid Transactional Analytical Processing (HTAP) use cases, while achieving performance gains of 1000x or greater over systems which are built on disk-based databases.
Q7. How do you stream data to an Apache Ignite cluster from embedded devices?
Nikita Ivanov: It is very easy to stream data to an Apache Ignite cluster from embedded devices.
The Apache Ignite streaming functionality allows for processing never-ending streams of data from embedded devices in a scalable and fault-tolerant manner. Apache Ignite can handle millions of events per second on a moderately sized cluster for embedded devices generating massive amounts of data.
Q8. Is this different than using Apache Kafka?
Nikita Ivanov: Apache Kafka is a distributed streaming platform that lets you publish and subscribe to data streams. Kafka is most commonly used to build a real-time streaming data pipeline that reliably transfers data between applications. This is very different from Apache Ignite, which is designed to ingest, process, analyze and store streaming data.
Q9. How do you conduct real-time data processing on this stream using Apache Ignite?
Nikita Ivanov: Apache Ignite includes a connector for Apache Kafka so it is easy to connect Apache Kafka and Apache Ignite. Developers can either push data from Kafka directly into Ignite’s in-memory data cache or present the streaming data to Ignite’s streaming module where it can be analyzed and processed before being stored in memory.
This versatility makes the combination of Apache Kafka and Apache Ignite very powerful for real-time processing of streaming data.
Q10. Is this different than using Spark Streaming?
Nikita Ivanov: Spark Streaming enables processing of live data streams. This is merely one of the capabilities that Apache Ignite supports. Although Apache Spark and Apache Ignite utilize the power of in-memory computing, they address different use cases. Spark processes but doesn’t store data. It loads the data, processes it, then discards it. Ignite, on the other hand, can be used to process data and it also provides a distributed in-memory key-value store with ACID compliant transactions and SQL support.
Spark is also for non-transactional, read-only data while Ignite supports non-transactional and transactional workloads. Finally, Apache Ignite also supports purely computational payloads for HPC and MPP use cases while Spark works only on data-driven payloads.
Spark and Ignite can complement each other very well. Ignite can provide shared storage for Spark so state can be passed from one Spark application or job to another. Ignite can also be used to provide distributed SQL with indexing that accelerates Spark SQL by up to 1,000x.
Qx. Is there anything else you wish to add?
Nikita Ivanov: The world is undergoing a digital transformation which is driving companies to get closer to their customers. This transformation requires that companies move from big data to fast data, the ability to gain real-time insights from massive amounts of incoming data. Whether that data is generated by the Internet of Things (IoT), web-scale applications, or other streaming data sources, companies must put architectures in place to make sense of this river of data. As companies make this transition, they will be moving to memory-first architectures which ingest and process data in-memory before offloading to disk-based datastores, and they will increasingly apply machine learning and deep learning to understand the data. Apache Ignite continues to evolve in directions that will support and extend the abilities of memory-first architectures and machine learning/deep learning systems.
——–
Nikita Ivanov, Founder & CTO, GridGain
Nikita Ivanov is founder of the Apache Ignite project and CTO of GridGain Systems, started in 2007. Nikita has led GridGain to develop advanced and distributed in-memory data processing technologies – the top Java in-memory data fabric starting every 10 seconds around the world today. Nikita has over 20 years of experience in software application development, building HPC and middleware platforms, and contributing to the efforts of other startups and notable companies including Adaptec, Visa and BEA Systems. He is an active member of the Java middleware community and a contributor to the Java specification. He is also a frequent international speaker with over two dozen talks at developer conferences globally.
Resources
– Apache Ignite Community Resources
– apache/ignite on GitHub
– Yardstick Apache Ignite Benchmarks
– Accelerate MySQL for Demanding OLAP and OLTP Use Cases with Apache Ignite
– Misys Uses GridGain to Enable High Performance, Real-Time Data Processing
– The Spark Python API (PySpark)
Related Posts
– Supporting the Fast Data Paradigm with Apache Spark, by Stephen Dillon, Data Architect, Schneider Electric
– On the new developments in Apache Spark and Hadoop. Interview with Amr Awadallah. ODBMS Industry Watch, March 13, 2017
Follow ODBMS.org on Twitter: @odbmsorg
##

Shinguz: Storing BLOBs in the database

We sometimes have discussions with our customers about whether to store LOBs (Large Objects) in the database or not. To avoid rephrasing the arguments again and again, I have summarized them in the following lines.

The following items are more or less valid for all large data types (BLOB, TEXT and theoretically also for JSON and GIS columns) stored in a MySQL or MariaDB (or any other relational) database.

The idea of a relational, table-based data store is to store structured data (numbers, dates and short character strings) to have quick write and read access to it.

And yes, you can also store other things like videos, huge texts (PDF, emails) or similar in an RDBMS, but they are principally not designed for such a job and thus not optimal for the task. Software vendors implement such features not mainly because they make sense but because users want them, and the vendors want to attract users (or their managers) with such features (USP, Unique Selling Proposition). This brings up one of my mantras: use the right tool for the right task.

The main topics to discuss related to LOBs are: Operations, performance, economical reasons and technical limitations.

Disadvantages of storing LOBs in the database

The database will grow fast. Operations will become more costly and complicated.
Backup and restore will become more costly and complicated for the admin because of the increased size caused by LOBs.
Backup and restore will take longer because of the same reason.
Database and table management functions (OPTIMIZE, ALTER, etc.) will take longer on big LOB tables.
Smaller databases need less RAM/disk space and are thus cheaper.
Smaller databases fit better into your RAM and are thus potentially faster (RAM vs disk access).
RDBMSs are a relatively slow technology (compared to others). Reading LOBs from the database is significantly slower than reading LOBs from a filer, for example.
LOBs stored in the database will spoil your database cache (InnoDB Buffer Pool) and thus possibly slow down other queries (this does not necessarily happen with more sophisticated RDBMSs).
LOB size limitation of 1 Gbyte in practice (max_allowed_packet; the theoretical limit is 4 Gbyte) for MySQL/MariaDB.
Expensive, fast database store (RAID-10, SSD) is wasted for something which can be stored better on a cheap slow file store (RAID-5, HDD).
It is programmatically often more complicated to get LOBs from a database than from a filer (depends on your libraries).
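The max_allowed_packet limit mentioned above can be worked around by fetching a LOB in pieces with SQL's SUBSTRING(). A hedged sketch: the helper below only computes the chunk offsets, and the table, column, and `conn` object in the commented usage are hypothetical:

```python
def chunk_ranges(total_bytes, chunk_size):
    """Yield (pos, length) pairs for SUBSTRING(col, pos, length).
    SQL string positions are 1-based."""
    pos = 1
    while pos <= total_bytes:
        length = min(chunk_size, total_bytes - pos + 1)
        yield (pos, length)
        pos += length

# Hypothetical usage with a DB-API connection `conn` and cursor `cur`:
# for pos, length in chunk_ranges(lob_size, 16 * 1024 * 1024):
#     cur.execute("SELECT SUBSTRING(doc, %s, %s) FROM docs WHERE id = %s",
#                 (pos, length, doc_id))
#     out.write(cur.fetchone()[0])
```

Each chunk then stays below max_allowed_packet, at the cost of one round trip per chunk.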

Advantages of storing LOBs in the database

Atomicity between data and LOB is guaranteed by transactions (is it really in MySQL/MariaDB?).
There are no dangling links (reference from data to LOB) between data and LOB.
Data and LOB are from the same point in time and can be included in the same backup.

Conclusion

So basically you have to balance the advantages against the disadvantages of storing LOBs in the database and decide which arguments are more important in your case.

If you have some more good arguments pro or contra storing LOBs in the database please let me know.

Literature

Check also various articles on Google.



A summer with the MySQL Community Team !

The MySQL Community team will be supporting the following events during the summer, and we will be present at some of them! Please come and visit us!

Northeast PHP
August 9-11, 2017, Charlottetown, PEI Canada

 

We are happy to invite you to Northeast PHP, where the MySQL Community team is having a booth. Please find David Stokes, MySQL Community Manager, at the MySQL booth in the expo area. Dave also submitted a talk on “JSON, Replication, and database programming” which we hope will be accepted. Please watch the conference agenda for further updates.

 
We are looking forward to talking to you there!
More information / registration: http://2017.northeastphp.org/

UbuCon LA 

Lima, Peru, August 18-19, 2017
 
The MySQL Community team is supporting this event as a Platinum sponsor.
More information about the event & registration: http://ubucon.org/en/events/ubucon-latin-america/

Open Source Conference Hokkaido
Hokkaido, Japan, July 14-15, 2017

We are happy to invite you to the next Open Source Conference in Japan, this time in Hokkaido. The local MySQL team, together with MyNA (MySQL Nippon Association), is going to represent MySQL at this event. Do not miss the dedicated MySQL session and the opportunity to talk with our experts at the MySQL booth. This time we have really cute MySQL presents! We are looking forward to talking to you!

More information & registration: https://www.ospn.jp/osc2017-do/

Open Source Conference Kyoto
Kyoto, Japan, August 4-5, 2017

The other Open Source Conference in Japan which the MySQL team is going to attend, as a Gold sponsor, is Open Source Conference Kyoto. As in Hokkaido, you can find our MySQL team together with MyNA (MySQL Nippon Association) representatives at the MySQL booth and listen to the dedicated MySQL session. Do not miss the opportunity to talk to our booth staff, this time with cool MySQL branded presents! We are looking forward to meeting you there!

More information & registration: https://www.ospn.jp/osc2017-kyoto/

COSCUP
August 5-6, 2017, Taipei, Taiwan

As is tradition, this year you can again find the MySQL team at the conference for Open Source Coders, Users & Promoters (COSCUP). We are a Gold sponsor again this year, and new this year, MySQL got a whole-day Open Source Database Track. As part of this track there are 7 MySQL talks, 5 of them by speakers from Oracle. Please find some of the topics below:

MySQL Server 8.0 by Shinya Sugiyama, the MySQL Master Principal Sales Consultant, Oracle
New Features in MySQL 5.7 Optimizer by Amit Bhattacharya, the Senior Software Development Manager, Oracle
A good way to use Redis with MySQL by Yuji Otani, the CTO of SKYDISC, Japan
MySQL InnoDB Cluster by Frederic Descamps (me!), the MySQL Community Manager, Oracle
MySQL InnoDB Cluster and MySQL Connector Workshop by Ivan Ma, MySQL Sales Consultant, Oracle & HK MySQL User Group Leader
Sponsored Commercial talk: Database Trend support Next Generation Web Application by Sanjay Manwani, the MySQL Development Director, Oracle
… and more… for more details check the COSCUP website…

Please find us at the MySQL booth in the expo area. We are looking forward to talking to you there!

More information & registration: http://coscup.org/2017-landingpage/

FrOSCon
August 19-20, 2017, Sankt Augustin, Germany

This year again we are very happy to invite you to the Free and Open Source Software Conference (FrOSCon), which takes place in Sankt Augustin, Germany. You can find our MySQL representative at the MySQL booth in the expo area, as well as a MySQL talk in the program: “MySQL 5.7 – InnoDB Cluster [HA built in]” by Carsten Thalheimer, Senior MySQL Sales Consultant. Do not miss the opportunity to meet and talk to us there.

More information & registration: https://www.froscon.de

How to think about performance

I’ve noticed lately that whenever there’s some sort of performance problem, people like to
immediately look at configuration. I’m guilty of this too. Here’s an example from a few months ago.
A user chatted into VividCortex support…

… we have been having a concurrency issue in the evenings. I was wondering if you might point to some of the graphs to use to try and figure out where the bottleneck is?

I didn’t respond to the user, but I added an internal note in our support system.

I would increase buffer pool size to begin with.

Ughhh… that makes me cringe.

(To be fair, I think that was an OK suggestion. I noticed that their buffer pool reads [pages
read from disk] were reaching 7,000 / sec, which roughly translates to 7K IOPS, and they had
plenty of spare memory to use.)

I don’t like my response because the question was about performance, but I was thinking more about
configuration than performance.

Besides that, I’ve also seen cases where people encounter high MySQL replication delay and resort to
configuration changes (e.g. increasing open table limits, disabling binary logging (!), changing
the instance type, and so on) without really thinking about why a replica is too slow to keep up
with a master. Experimenting with config options with semi-educated guesses can be time consuming,
frustrating, and even dangerous if you make a mistake.

I’ve learned a lot about performance over the past four years, but I often think it can get
really complicated and hard to remember. Recently, I realized (with help, of course) that performance boils
down to two simple points:

Slowness is about spending time on something.

Things spend time doing work or waiting.

Those two points are enough for a framework to ask great questions. For example:

X is slow. It’s spending time on something. Is it doing work, or waiting? How can I tell if it’s doing work or waiting?

If it is doing work too slowly, why? What does the USE Method tell me?
If X’s resource Y is saturated, why is Y slow? Go to step #1 for Y.
If it’s waiting, how can I tell what it’s waiting on?

and so on.

Following this framework may lead you to the cause of your performance problems, but it may not. I
think at the very least it’ll guide you in the right direction step by step. This is a very important
thing. I think it’s very easy to get lost in metrics and charts.

Example

Take a look at the screenshot below. These are CPU, disk, InnoDB, and MySQL charts in VividCortex around
the time a server stall (i.e. it got slow) was detected. All of the charts look interesting! Where would you start to
diagnose this stall using this page? Without a framework, I’d probably just scan every chart.

Let’s look at how we can approach this problem by thinking about how things spend time.
I know that during this stall, a bunch of threads piled up (MySQL concurrency) and the query throughput dropped to
about half. Why was this server slow? It doesn’t seem to be getting work done.

150 threads running… what are they doing? Are they doing work, or are they waiting for something?
Let’s check out the thread states.

Ahah! 101 threads are waiting for a lock! Well, that was easy. The
infamous query cache strikes again!
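The work-versus-wait question can even be mechanized against a snapshot of thread states. A toy sketch, with state strings and counts made up to loosely mirror the stall described above (real MySQL thread-state names vary):

```python
# Classify thread states into "working" vs "waiting" and report the
# dominant wait, mirroring the diagnosis described in the text.
def summarize(thread_states):
    waiting = {s: n for s, n in thread_states.items()
               if "waiting" in s.lower() or "lock" in s.lower()}
    working = sum(n for s, n in thread_states.items() if s not in waiting)
    top_wait = max(waiting, key=waiting.get) if waiting else None
    return working, sum(waiting.values()), top_wait

# Illustrative snapshot, not real data from the incident:
states = {
    "Sending data": 30,
    "updating": 19,
    "Waiting for query cache lock": 101,
}
print(summarize(states))  # -> (49, 101, 'Waiting for query cache lock')
```

With most threads attributed to one wait state, the next "why?" question has an obvious target.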

I think this process works well for performance problems at any level (system, database, application, etc).
I think it also probably gives you an idea of how to instrument your services, or find out where your
instrumentation is lacking.

This one’s from Baron (@xaprb)

How Does Percona Software Compare: the Value of Percona Software and Services

In this blog post, I’ll discuss my experience as a Solutions Engineer in explaining the value of Percona software and services to customers. The inspiration for this blog post was a recent conversation with a prospective client. They were exploring open source software solutions and professional services for the first time. This is a fairly common […]
