Full Text Search with MongoDB, MySQL and SQLite – Part 2

Part 2 of Full Text Search with MongoDB, MySQL and SQLite

This article covers full text search in 3 databases – MongoDB, MySQL and SQLite. In previous article of Full Text Search with MongoDB, MySQL and SQLite – Part 1, loading data into tables and creating full text indexes.

full text search

Our search requirements are:

  1. How many people live in Mapletown
  2. Of those living in Mapletown, how many are in California
  3. How many people live in Mapletown (city) or the state of California
  4. How many folks live in city starting with Maple and the state of California
  5. How many 27 year old people live in Mapletown in California
  6. Switch this a bit – How many people live 27-30 year old people live in Mapletown in California
  7. How many people live 27-30 year old people live in Mapletown in California whose first or last name is Jerry

Server is running Ubuntu 16.04 with 1GB memory, 1 core i5 2.7GHz and around 30GB disk space. MySQL and MongoDB are running with default configuration.

Let’s begin searching!

1. People in Mapletown – Matching a single full text search keyword

Let’s find how many people live in Mapletown.

SQLite version – Matching a single word is FAST! Notice the syntax where <tablename> match '<keyword>'. This syntax is a little unique.

MySQL version – Matching a single word is FAST here as well! Notice that the syntax is easier to understand with where match(<fts field>) against ('<keyword>'). MySQL has 2 modes of comparison: in natural language mode and in boolean mode.

MongoDB version – Matching a single word is pretty slow in comparison with SQLite and MySQL

2. People in Mapletown, CA – Matching two full text search keywords with AND

Let’s find how many people live in Mapletown in California!

SQLite version – Searching for both words is also FAST! Notice that there are two ways to write the same query. One with capitalized AND and one without any reserved word. By default, SQLite will use AND.

MySQL version – Notice the operator + before mapletown and california. That sign indicates an AND. We need records that contain both mapletown and california – that’s the indication MySQL gets with +

MongoDB version – When both mapletown and california are quoted, an AND search is conducted

3. People in Mapletown or CA – Matching two full text search keywords with OR

Let’s find how many people live in Mapletown city or in the state of California.

SQLite version – Searching for either mapletown or california is ridiculously FAST. Notice how quickly results are returned.

MySQL version – When no operator is provided, an OR search is performed. MySQL is also very fast at retrieving data from full text index.

MongoDB version – In comparison with SQLite and MySQL, MongoDB is really slow. As far as an application is concerned, MongoDB can perform satisfactorily.

4. People in CA with city starting with Maple – Matching a partial full text search keyword with AND

Let’s find how many people live in a city starting with Maple in California!

SQLite version – Blazing fast search with partial match continues with SQLite. Notice the wildcard * behind the first keyword maple.

MySQL version – MySQL is also blazing fast. Like SQLite, MySQL also uses * wildcard.

MongoDB version – Comparatively, much slower! Not only is the syntax usually convoluted, results are also slower. Notice the lack of wildcard. By default, Mongo is doing partial matches.

5. 27 year olds in Mapletown, CA – Matching multiple fields and keywords in full text search

Let’s find how many 27 year olds live in Mapletown in California!

SQLite version – Blazing fast results. Notice age:27 provided in the keyword. That tells SQLite to search for 27 in age field. Mapletown and California can be searched in all fields.

MySQL version – Notice how MySQL allows for full text search and regular search. If age was indexed, we could have hinted MySQL to use index on age. SQLite won’t be able do this kind of a match, but its own FTS4 match is insanely fast anyway.

MongoDB version – As queries are becoming somewhat complex, MongoDB is taking longer to extract results.

This same query written differently is much faster as we are searching for age of 27 in a different field. Age field is not indexed, yet the speed is better than the above version. The speed is still nowhere close to SQLite and MySQL.

6. 27-30 year olds in Mapletown, CA – Matching multiple fields, ranges and keywords in full text search

Let’s find how many 27-30 year olds live in Mapletown in California!

SQLite version – Speed is not impacted at all. However, query is getting a little longer and un-pretty.

MySQL version – Notice that the query is still fast and syntax is still quite familiar.

MongoDB version – Look at the query now. This is looking like GraphQL and becoming difficult to type. Also notice that MongoDB is looking for age in its own separate column, which is not indexed – just like we have in MySQL. Query performance is above average in comparison with previous queries.

7. 27-30 year old named Jerry in Mapletown, CA – Matching multiple fields, ranges, exact word and keywords in full text search

Let’s find how many 27-30 year old people named Jerry live in Mapletown in California!

SQLite version – Speed is not impacted at all. However, query is getting a little longer and un-pretty.

MySQL version – Notice that the query is still fast and syntax is still quite familiar.

MongoDB version – Note the first version where jerry is lower case and we find no match. The next query is the right case and we get a match, but it takes 1.6s to get the result.

Summary

There are some observations from this unscientific, quick study:

[table id=2 /]

What would I use? That choice depends on a lot of factors. However, the easy answer is: I’ll use SQLite for read-heavy queries where a single writer is sufficient. Areas where multiple processes are writing to the database, I’ll use MySQL or PostgreSQL. PostgreSQL also has full text search capabilities; I’ll cover that some other time.