Month: December 2016

Fun with Bugs #46 – On Some Bugs I’ve Reported During the Year of 2016

It’s time to summarize the year of 2016. As a kind of a weird summary, in this post I’d like to share a list of MySQL bug reports I’ve created in 2016 that are still remaining “Verified” today:Bug #79831 – “Unexpected error message on crash-safe slave with max_relay_log_size set”. According to Umesh this is not repeatable with 5.7. The fact that I’ve reported the bug on January 4 probably means I was working at that time. I should not repeat this mistake again next year.Bug #80067 – “Index on BIT column is NOT used when column name only is used in WHERE clause”. People say the same problem happens with INT and, what may be even less expected, BOOLEAN columns.Bug #80424 – “EXPLAIN output depends on binlog_format setting”. Who could expect that?Bug #80619 – “Allow slave to filter replication events based on GTID”. In this feature request I’ve suggested to implement filtering by GTID pattern, so that we can skip all events originating from specific master on some slave in a complex replication chain.Bug #82127 – “Deadlock with 3 concurrent DELETEs by UNIQUE key”. It’s clear that manual is not even close to explaining how the locks are really set “by design” in this weird case. See comments in MDEV-10962 for some explanations. Nobody from Oracle event tried to really explain how things are designed to work.Bug #82212 – “mysqlbinlog can produce events larger than max_allowed_packet for mysql”. This happens for encoded row-based events. There should be some way to take this overhead into account while creating binary log, IMHO.Bug #83024 – “Internals manual does not explain COM_SLEEP in details”. One day you’ll see Sleep for some 17 seconds logged into the slow query log, and may start to wonder why…Bug #83248 – “Partition pruning is not working with LEFT JOIN”. You may find some interesting related ideas in MDEV-10946.Bug #83640 – “Locks set by DELETE statement on already deleted record”. This case shows that design of locking in InnoDB does produce really weird outcomes sometimes. This is not about “missing manual”, this is about extra lock set that is absolutely NOT needed (a gap X lock on a record in the secondary unique index is set when the same transaction transaction already has the next key lock on it). As a side note, I keep finding, explaining and reporting weird or undocumented details in InnoDB locking for years, still my talk about InnoDB locks was not accepted by Oracle once again for OOW in 2016. What do I know about the subject and who even cares about those locks… Bug #83708 – “uint expression is used for the value that is passed as my_off_t for DDL log”. I was really shocked by this finding. I assumed that all uint vs unsigned long long improper casts are already found. It seems I was mistaking.Bug #83912 – “Time spent sleeping before entering InnoDB is not measured/reported separately”. The use case that led me to reporting this bug is way more interesting than the fact that some wait is not instrumented in performance_schema. You may see more related bug reports from me next year.Bug #83950 – “LOAD DATA INFILE fails with an escape character followed by a multi-byte one”. This single bug (and related bugs and stories) were original topic for issue #46 of my “Fun With Bugs” series. I was not able to write everything I want properly over last 3 weeks, but trust me: it’s a great story, of “Let’s Make America Great Again” style. With the goal for LOAD DATA to behave exactly as INSERT when wrong utf8 data are inserted, Oracle changed the way LOAD DATA works back and forth, with the last change (back) happened in 5.7.17: “Incompatible Change: A change made in MySQL 5.7.8 for handling of multibyte character sets by LOAD DATA was reverted due to the replication incompatibility (Bug #24487120, Bug #82641)”I just can not keep up with all the related fun people have in replication environments thanks to these ongoing changes… It’s incredible.Bug #84004 – “Manual misses details on MDL locks set and released for online ALTER TABLE”. Nothing new: locks in MySQL are not properly/completely documented, metadata locks included. yes, they are documented better now, after 11+ years of my continuous efforts (of a kind), but we are “not there yet”. I am still waiting for a job offer to join MySQL Documentation Team, by the way :)Bug #84173 – “mysqld_safe –no-defaults & silently does NOT work any more”. Recent MySQL 5.7.17 release had not only given us new Group Replication plugin and introduced incompatible changes. In a hope to fix security issues it comes with pure regression – for the first time in last 11 years mysqld_safe –no-defaults stopped working for me! By the way, mysqld_safe is still NOT safe in a sense that 5.7.17 tried to enforce, and one day (really soon) you will find out why.Bug #84185 – “Not all “Statements writing to a table with an auto-increment…” are unsafe”. If you do something like DELETE FROM `table` WHERE some_col IN (SELECT some_id FROM `other_table`) where `table` has auto_increment column, why should anyone care about it? We do not generate the value, we delete rows… This bug report was actually created by Hartmut Holzgraefe and test case comes from Elena Stepanova (see MDEV-10170). I want to take this opportunity to thank them and other colleagues from MariaDB for their hard work and cooperation during the year of 2016. Thanks to Umesh (who processed most of my bug reports),  Sinisa Milivojevic and Miguel Solorzano for their verifications of my bug reports this year.In conclusion I should say that, no matter how pointless you may consider this activity, I still suggest you to report each and every problem that you have with MySQL and can not understand after reading the manual, as a public MySQL bug. Now, re-read my 4 years old post on this topic and have a Happy and Fruitful New Year 2017!

The MyRocks community in 2016

The MyRocks community grew in 2016 with significant contributions from early evaluators. I appreciate their help and have been busy documenting the problems that they reported. I look forward to more contributions in 2017.MariaDB Corporation and Percona announced plans to include MyRocks in their distributions. You will be able to get MyRocks with expert support. Equally big have been the contributions from Sergey Pertunya to MyRocks.The MySQL team at Alibaba reported poor performance from range scans with sysbench. The problem was memory system contention on a performance counter and the fix made long range scans 2X faster on concurrent workloads.I have been using sysbench more and created a script to run a sequence of tests with it. During my tests I noticed too much variance during the read-only tests and the problem is non-determinism from the state of the memtable and level 0 of the LSM tree. The overhead for a search of the LSM tree depends on the amount of data in the memtable and the number of files in level 0. This state is not deterministic when a read-only test follows a read-write test. I filed issue 427 to see if we can get RocksDB to adapt and flush data earlier than normal from the memtable and level 0 when a workload becomes read-only.Sysbench is extremely useful even though the workloads provided by it are synthetic — especially the read-only tests with a small (multi-GB) database and most tests use a uniform distribution for keys. There is a talk on sysbench at FOSDEM 2017 and I look forward to using it more next year.Justin Swanhart pointed out the lousy performance MyRocks provided with a default configuration. As a result we improved the default configuration by increasing the memtable to 64M and block cache to 512M and I will share results at FOSDEM. See issues 369, 375, and 441.Justin Swanhart and Rick Pizzi reported problems with too-large transactions leading to lousy performance and OOM. MyRocks buffers in memory all changes from a transaction until commit and at commit time the changes are briefly double buffered when copied into the memtable. Until we make this better the workaround is to use rocksdb_commit_in_the_middle for bulk loads. Even after we make this better that option will be useful for bulk loads. Today we have a poor limit on the max size of a transaction via rocksdb_max_row_locks. This is a limit on the number of locked rows rather than on the memory used by a transaction. Even worse, the default is 1B. We are changing the limit to be on the amount of memory used by a transaction. Issues for this include 341, 347, 348 and 365.The MySQL team at Alibaba reported a performance regression from MyRocks group commit when running sysbench. They also reported a correctness problem that I ran into with sysbench. Auditing the code found another bug. A related problem is that rocksdb_disable_2pc was enabled by default, meaning engine & binlog crash safety wasn’t guaranteed. We have changed the default configuration to be crash safe and renamed the option to avoid double negatives. We also used YSCB and sysbench to reproduce the performance regression and have work in progress to make this better. That requires collaboration between the MyRocks and RocksDB teams. It is fun to watch the teams solve problems. Issues for this problem include 474, 481, 482, 488.

MariaDB’s Columnar Store

I have been keeping an eye on MariaDB’s Columnar store progress for a bit longer then half a year.MariaDB chose to take the infinidb code after Calpoint closed shop about two years ago and implemented it into their product. I was a bit wary about infinidb as well as it was a columnar store without compression that had mixed reviews on news hacker.However, it seems like MariaDB have pulled it off. They have added the infinidb engine to MariaDB with all its quirks and extra commands and they have added snappy compression as well. This is truely a huge win for them and their users, specifically in the area of reporting and analytics.Here are two real life examples for getting data ready for reporting currently happening in the wild:1) MySQL -> Sqoop -> Hadoop – where you would need a) 5-6 additional servers, b) someone to set those servers up in a hadoop cluster and then c) monitor the daily data transfer.2) MySQL -> CDC -> Kafka -> Hadoop – a) more technologies to master, b) a few more servers and some c) more monitoring. But this time, its streaming.To set all of this up could take from a couple of months to a year.Now with MariaDB, you have:1) MariaDB + some SQL scripts –  such as INSERT INTO datawarehouse.facttable SELECT … FROM site.table1 WHERE date >= curdate() – interval 1 day;2) MariaDB -> Pentaho/Talend -> MariaDB – Could be a bit slower, but with a GUI and really a lot of monitoring out of the box.As you can see, there are a lot fewer technologies, a lot fewer complexities and it is a lot more straight forward to develop.It is also very important to add that no one other than MariaDB is doing this. The closest you have is Tokudb which is great and can also last you a while, but a Columnar store is definitely more suited for this type of task.So once again, congratulations to MariaDB for offering such a huge benefit to its users.


客户一个测试环境,一个主机上面运行了很多数据库,某库的程序会时不时报错ora-4030。 加大了pga,然后还检查了ulimit的 data 和 stack都是ulimit。还是报错。 进而检查/var/adm/messages,发现有报错swap不足的情况。 所以,解决方法是加大物理内存,或者加大 […]

Log Buffer #500: A Carnival of the Vanities for DBAs

As the festive season starts, bloggers across Oracle, SQL Server, and MySQL have been producing some sparkling blog posts. This Log Buffer Edition highlights a few of them this week.Oracle:Spencer McGhin blogs about Python for Analytics – Exploring Data with Pandas.Say writes on Automated Deployment to SOA Cloud Services using Developer Cloud Services.Kent Graziano gets an early Christmas as newest version of Oracle (#SQLDev) SQL Developer Data Modeler (SDDM) is ready for download!Ben discusses Oracle SINH, COSH, and TANH Functions with Examples.According to Connor, there is a 12.2 treat for the festive season.SQL Server:The easiest way of explaining how a DevOps approach to database deployment can dramatically speed up the process is to use the term ‘shift-left’.SQL Server 2016 Enterprise Edition Performance AdvantagesVirtual machines are easier to manage via PowerShell than with the the Hyper-V Manager console. PowerShell Direct takes this further by allowing you to interact easily with the virtual machines on the Hyper-V host without needing network access.Mission critical features in SQL Server 201610 Ways to Save Money and Provide Comprehensive High Availability for SQL ServerMySQL:You are a DBA working hard to keep all the databases running so your staff can enter key data.On Databases, Temporal Datatypes and TrainsMySQL on AWS: RDS vs EC2JSON and MySQL Stored ProceduresOnline schema change for MySQL & MariaDB – comparing GitHub’s gh-ost vs pt-online-schema-change

Query Language Type Overview

This blog provides a query language type overview.
The idea for this blog originated from some customers asking me questions. When working in a particular field, you often a dedicated vocabulary that makes sense to your peers. It often includes phrases and abbreviations because it’s efficient. It’s no different in the database world. Much of this language might make sense to DBA’s, but it might sound like “voodoo” to people not used to it. The overview below covers the basic types of query languages inside SQL. I hope it clarifies what they mean, how they’re used and how you should interpret them.
DDL (Data Definition Language)
A database schema is a visualization of information. It contains the data structure separated by tables structures, views and anything that contains structure for your data. It defines how you want to store and visualize the information.
It’s like a skeleton, defining how data is organized. Any action that creates/updates/changes this skeleton is DDL.
Do you remember spreadsheets? A table definition describes something like:
Account number
Account name
Account owner
Creation date
Sorted ascending
Unique, indexed

Date, indexed
Number, linked with transactions

Whenever you want to create a table like this, you must use a DDL query. For example:CREATE TABLE Accounts
Account_number Bigint(16) ,
Account_name varchar(255),
Account_name varchar(255),
Creation_date date,
Amount Bigint(16),
PRIMARY KEY (Account_number),
FOREIGN KEY (Amount) REFERENCES transactions(Balancevalue)
);CREATE, ALTER, DROP, etc.: all of these types of structure modification queries are DDL queries!
Defining the structure of the tables is important as this defines how you would potentially access the information stored in the database while also defining how you might visualize it.
Why should you care that much?
DDL queries define the structure on which you develop your application. Your structure will also define how the database server searches for information in a table, and how it is linked to other tables (using foreign keys, for example).
You must design your MySQL schema before adding information to it (unlike NoSQL solutions such as MongoDB). MySQL might be more rigid in this manner, but it often makes sense to design the pattern for how you want to store your information and query it properly.
Due to the rigidity of an RDBMS system, changing the data structure (or table schema) requires the system to rebuild the actual table in most cases. This is potentially problematic for performance or table availability (locking). Often this is a “hot” procedure (since MySQL 5.6), requiring no downtime for active operations. Additionally, tools like pt-osc or other open source solutions can be used for migrating the data structure to a new format without requiring downtime.
An example:ALTER TABLE accounts ADD COLUMN wienietwegisisgezien varchar(20)
DML (Data Manipulation Language)
Data manipulation is what it sounds like: working with information inside a structure. Inserting information and deleting information (adding rows, deleting rows) are examples of data manipulation.
An example:INSERT into resto_visitor values(5,’Julian’,’highway 5’,12);
UPDATE resto_visitor set name=’Evelyn’,age=17 where id=103;Sure, but why should I use it?
Having a database environment makes no sense unless you insert and fetch information out of it. Remember that databases are plentiful in the world: whenever you click on a link on your favorite blog website, it probably means you are fetching information out of a database (and that data was at one time inserted or modified).
Interacting with a database requires that you write DML queries.
DCL (Data Control Language)
Data control language is anything that is used for administrating access to the database content. For example, GRANT queries:GRANT ALL PRIVILEGES ON database.table to ‘jeffbridges’@’ourserver’;Well that’s all fine, but why another subset “language” in SQL?
As a user of database environments, at some point you’ll get access permission from someone performing a DCL query. Data control language is used to define authorization rules for accessing the data structures (tables, views, variables, etc.) inside MySQL.
TCL (Transaction Control Language) Queries
Transaction control language queries are used to control transactional processing in a database. What do we mean by transactional processes? Transactional processes are typically bundled DML queries. For example:BEGIN
COMMIT or ROLLBACKThis gives you the ability to perform or rollback a complete action. Only storage engines offering transaction support (like InnoDB) can work with TCL.
Yet another term, but why?
Ever wanted to combine information and perform it as one transaction? In some circumstances, for example, it makes sense to make sure you perform an insert first and then perform an update. If you don’t use transactions, the insert might fail and the associated update might be an invalid entry. Transactions make sure that either the complete transaction (a group of DML queries) takes place, or it’s completely rolled back (this is also referred to as atomicity).
Hopefully this blog post helps you understand some of the “insider” database speech. Post comments below.

pre-Fosdem’17 MySQL Day

Oracle’s  MySQL Community Team is organizing a pre-Fosdem MySQL Day on Friday 3rd February 2017.
The event is free (like in beer) for all participants but as seating is limited, registration is required.
Come to meet MySQL engineers and listen to them talking about new MySQL 8.0’s features. Some of Percona’s experienced staff will also deliver some talks on troubleshooting, architecture, …
And finally, will also share their user experience with the audience.
What else to start the Fosdem Week-end in the best way ?
Agenda and registration can be found on lefred’s blog !

Pre-Fosdem’17 MySQL Day

This year, I’ve the honor to organize just before the Fosdem MySQL & Friends Devroom an extra pre-Fosdem MySQL Day. This MySQL Day will take place the Friday just before Fosdem’s week-end.
During that day, we will highlight MySQL 8.0 new features but not only.
Oracle’s MySQL Community Team is sponsoring this event. Seating is limited, so please register.
The event is free and the location is the same as the very popular MySQL & Friends Community Dinner.
You can register now on eventbrite.
This is the agenda:

Friday 3rd February

Welcome !
MySQL 8.0: Server Defaults – An overview of what settings have changed or are under consideration
Morgan Tocker
MySQL 8.0
MySQL 8.0: Unicode – What, why and how
Bernt Marius Johnsen
MySQL 8.0
MySQL 8.0: Common Table Expressions (CTEs)
Øystein Grøvlen
MySQL 8.0
Group Replication
Kenny Gryp
Group Replication
How avoids and deals with replication lag
Jean-François Gagné
MySQL for Beginners – Getting Basics Right
Peter Zaitsev
MySQL 8.0: Window functions – finally!
Dag H. Wanvik
MySQL 8.0
Coffee Break
Using Optimizer Hints to Improve MySQL Query Performance
Øystein Grøvlen
MySQL 8.0
Monitoring without looking at MySQL
Jean-François Gagné
Fun, Sport, Not-MySQL
What you wanted to know about your MySQL Server instance, but could not find using internal instrumentation only
Sveta Smirnova
ProxySQL Use Case Scenarios
MySQL 8.0: GIS – Are you ready ?
Norvald H. Ryeng
MysQL 8.0

TEL/電話+86 13764045638
QQ 47079569