Key/value tables and how to use them in PHP and MySQL

Today’s subject is a little bit advanced, but I think that beginners should be able to follow if they read carefully.

Key/value approach in database design could come in handy when we need to store some arbitrary data about another table. For example, we have a users table that holds our user data. Everything is working fine, but some day our client decides that he wants to collect 2 telephone numbers, sex of the user, date of birth …

If we try to predefine all the potential wishes of the customer in our table, it would be awkward and our table would grow horizontally beyond reason.

Here comes key/value table to the rescue. Let us create a users table (as simple as it can be):

[code lang=”sql”]
CREATE TABLE IF NOT EXISTS `users` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`username` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`pass` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM ;

[/code]

This table will hold nothing but a login data. Let us create a user_data table which will hold everything else:

[code lang=”sql”]
CREATE TABLE IF NOT EXISTS`user_details` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) NOT NULL,
`key` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`value` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM;
[/code]

The fields key and value will actually store the user data. Because this table is going to grow vertically, potentially we can store an unlimited and completely arbitrary information about a user. Two different parts of your app can handle storing the user details completely independent of each other. This is the power of this approach.

This approach is highly flexible, adding a new attribute at a late design phase is very simple. We can even create a mechanism that adds attributes according to real time configuration.

Enough about good sides, let me talk about drawbacks of this design. One major drawback is that there is no database data types in our user_data table. For example, consider this statement:

[code lang=”sql”]
UPDATE user_data SET value = ‘What?’ WHERE key = ‘date_of_birth’
[/code]

We are setting a string value for a key that should hold user’s date of birth. Of course, this can be handled through data validation on the application side, so it is not such a drawback.

A bigger problem is if we use user_data table for filtering. Imagine that we have only one users table with all the fields and need to retrieve all users that live in Chicago. In classic database design this would be the query:

[code lang=”sql”]
SELECT username FROM users WHERE city = ‘Chicago’
[/code]

Simple. But in our new key/value database design this query would look like this:

[code lang=”sql”]
SELECT username FROM users a
INNER JOIN user_data b
ON a.id = b.user_id
WHERE b.key = ‘city’
AND b.value = ‘Chicago’
[/code]

Pretty complex for such an easy task. Wait till you see this example. What if you need to filter data by 2 criteria. For example in classic design:

[code lang=”sql”]
SELECT username FROM users WHERE city = ‘Chicago’ AND state = ‘Illinois’
[/code]

In key/value design you can easily fall in the, so called, AND trap. At fist glance everybody will say: This is easy and type something like this:

[code lang=”sql”]
SELECT username FROM users a
INNER JOIN user_data b
ON a.id = b.user_id
WHERE
(b.key = ‘city’ AND b.value = ‘Chicago’)
AND
(b.key = ‘state’ AND b.value = ‘Illinois’)
[/code]

But this query will ALWAYS return nothing. Zero. Nada. That is because there is never ONE row in user_data that has both criteria met. So, now this complex query becomes monster:

[code lang=”sql”]
SELECT username FROM users a
INNER JOIN user_data b
ON a.id = b.user_id
INNER JOIN user_data c
ON b.user_id = c.user_id
WHERE
(b.key = ‘city’ AND b.value = ‘Chicago’)
AND
(c.key = ‘state’ AND c.value = ‘Illinois’)
[/code]

Conclusion is that this approach is great but must be used with caution. I would mainly use it for storing some arbitrary data that will never be used for filtering.

Tags:mysql php

16 thoughts on “Key/value tables and how to use them in PHP and MySQL”

Bill Karwin September 9, 2010 at 23:21

Storing “key/value” tables is also called an Entity-Attribute-Value design. I agree one should be cautious and think twice before using EAV. Then think a third time, and as many times as necessary until you decide not to use it.

Read these debunkings of the EAV design:

http://www.simple-talk.com/opinion/opinion-pieces/bad-carma/

http://www.simple-talk.com/sql/t-sql-programming/avoiding-the-eav-of-destruction/

Since you can’t query the user_details efficiently, you might as well store a TEXT blob column in the users table, and store all the user details in XML or JSON or YAML or something.
Jason September 10, 2010 at 20:06

Seems like a good place to store user settings from a preferences page.
Tri Bui September 10, 2010 at 21:59

Another important design disadvantage of EAV is that for real data stores, the “values” column can not be indexed meaningfully, can not be constrained and thus can not be a foreign key.

tri bui
1. Zvonko September 11, 2010 at 11:29
  
  Thanks for the comment.
  
  You got a point there. Anyway, we can use EAV for some things, but for real optimizing we should use some of the NOSql solutinons.
huarong September 11, 2010 at 03:42

why not use memcacheDB?
MemcacheDB is key-value ,and faster than mysql.
1. Zvonko September 11, 2010 at 11:28
  
  Yes, I know. Actually it is called Membase as MemcacheDB is not persistent. But the theme of this tutorial was using this in MySql.
  
  In future tutorials I am planning to show all NOSql “databases” and how to use them.
  
  Thanks for the comment
joe September 12, 2010 at 20:49

@Zvonko — memcached is not persistent, but memcachedb is.
1. Zvonko September 12, 2010 at 21:07
  
  Sorry, my mistake. Firm which produced memcached, published persistent key/value storage called Membase: http://www.membase.org/
  
  I see that memcachedb also exist. And is persistent.
Zoran October 13, 2010 at 22:43

I’d rather store the key – value pairs in PHP array and in the database will have only one table users: with plus one more field which will be text (config_options) and will keep the serialized (use serialize() function) array from php in the database as string.
When you take it out from the database (without JOINs) use unserialize() and you will have your PHP array.
I think it’s much faster solution than joining queries from two tables which can get big and without proper indexing it will get slower.
1. Zoran October 13, 2010 at 22:50
  
  Just to add that you can keep as many as you want config options pairs of key values and you can add unlimited fields to your key = value pair, for example language_id (to translate every key, value pair), and all of that in just one text field.
  I am using LEFT JOINS too, but always trying to avoid it when i can
2. Zvonko October 13, 2010 at 22:51
  
  @Zoran Nice, but how would you search a user using one of the keys in array as condition?
horaciod January 18, 2011 at 20:47

you can make a simpler sql statement like this:

SELECT username FROM users a
where id in (select user_id from users_data where
(b.key = ‘city’ AND b.value = ‘Chicago’)
AND
(c.key = ‘state’ AND c.value = ‘Illinois’) )

Using subselects is a good option for performance and cleaner coding.
I’m using this in postgresql in bdigital.uncu.edu.ar
cadaeib May 21, 2011 at 18:10

Add entity on EAV, Write datas on EAV and JSON, search information on EAV (thx horaciod) with SQL (sphinx or Lucene too), read datas with JSON.
Just one more table with all the datas encoded with JSON ?
rt May 23, 2014 at 01:55

or you could use prepared statements or mysql stored procedures. With some views it’s easy to do a where statement.

SET @sql = NULL;

SELECT

GROUP_CONCAT(DISTINCT

CONCAT(

‘MAX(IF(metaKey = ”’,

metaKey,

”’, metaValue, NULL)) AS ‘,

metaKey

)

) INTO @sql

FROM appDataEntryValues;

SET @sql = CONCAT(‘SELECT entryID, ‘, @sql, ‘ FROM appDataEntryValues GROUP BY entryID’);

PREPARE stmt FROM @sql;

EXECUTE stmt;
1. rt May 23, 2014 at 01:57
  
  this sql doesn’t use views it’s just a prepare statement. I haven’t yet figured out how to make it work with php.
2. rt May 23, 2014 at 01:59
  
  the sql for the table i used is
  
  CREATE TABLE `appDataEntryValues` (
  
  `id` int(20) NOT NULL AUTO_INCREMENT,
  
  `metaKey` varchar(100) DEFAULT NULL,
  
  `metaValue` varchar(150) NOT NULL,
  
  `entryID` bigint(20) NOT NULL,
  
  PRIMARY KEY (`id`)
  
  )

Comments are closed.