I'm toying around with a script to allow users to search for other users based on certain criteria contained in their profiles.

Anyhow, some of the attributes on the user profiles are things like what they are seeking like dating, someone to talk to, etc. I decided to serialize the array and store the list in the db that way which results in something like: a:6:{s:7:"friends";s:1:"1";s:8:"hang_out";s:1:"1";s:6:"dating";s:1:"1";s:4:"talk";s:1:"1";s:8:"intimate";s:1:"0";s:5:"other";s:1:"0";} The entire list is there (both checked and unchecked values) and I can tell what was checked but how do I perform a search against this data? I tried several things that really didn't work out so well.

For example if a user who is searching chooses just one or two of the options I need the query to find only the users who have also checked that option (denoted by "1").

I tried so serialize the search options and use IN in the query but that didn't work out the way that I was expecting.

Is there any easy way to do this or should I be taking an entirely different approach to saving and searching the data?

If you are really attached to the idea of serialising the data instead of having distinct columns that you could query, you will need to use the "like" comparison, e.g. select * where serialstring like '%"friends";s:1:"1"%' will return all records with friends checked.
Much better, though, to use either distinct columns for each value, or a tupple table, if you want the "columns" to be dynamic.

Member Avatar for diafol

If I'm getting you correctly, you've got serialized data for every user.

I'm no expert on this but, what about delimiting each word or phrase that can be serached by a unique symbol, e.g. | (pipe).

So

|dating|intimate|talk|sensitive|caring|born liar|

could be the entry for a user.
Then just search for

LIKE '%|talk|%'

etc.

You concatenate 'AND LIKE's to build up your WHERE clause. Perhaps this isn't very different to what you're doing at the moment.

Not sure how this thread came back to life, it was a double post and the other was marked "solved" ... I thought this one had been deleted. 8-)

Anyhow, as much sense as it DOESN'T make ... I got into the position where it would be really beneficial to store the data in a single column and be able to search that column as well. As a quick solution I ended up doing pretty much what you guys are suggesting. Although I'm still working with the data as an array I stored it as a comma separated string and used '%LIKE%' to search the string.

I was thinking about storing the array as an INT and performing the search with bitwise operators. I know I can find exact matches, are either of you guys familiar enough to know if I can find intersecting values?

Member Avatar for diafol

Not sure if I understand your implementation.

Do your users have free rein over the words that are entered into their profile (DB record) or do they check boxes? In other words, are the terms limited?

If they are limited and only numbers in the 10s of terms, using tinyint (boolean) columns could be useful. However if you can expect thousands of random words, an 'array type' solution may be best.
I've seen search tables set up that store distinct words with a join table like (user_id, term_id). I think phpBB and forums use this type of system. I can only imagine the processing that must go on with these! However, it may work better with a long list of limited terms as you don't need to create a new column every time.

The terms are limited through checkboxes, in fact there are only 7 options available.

I know tinyint would work for exact matches but that not quite what I'm after.

ie.

// Let's say the following values were assigned to the boxes. 

'1' => 'Friends',
'2' => 'Dating',
'4' => 'Talking'

If a user searches Friends and Talking or "5" I want it to include the column containing "6" (Talking, Dating) because the search criteria and the column data has 'at least' one value in common ... "Talking." I have to read up on bitwise operations because it seems to me it would only return exact matches. I have it working the way that it is now (stored as a string) but if I could get the INT thing to work it would simply a few things but mostly condense an awful lot of code.

// Edit // Now that I look at it it is possible to find an intersecting value based solely on the math, or process of elimination. I'm (kinda) sure I could write the function to do the math but how the heck would I incorporate it into a search?

Member Avatar for diafol

Bitwise would be the way to go - int [3] (up to 127 for 7 options?)

Example
=======
//Terms

$dating = 1;
$single = 2;
$liar = 4;
$rampant = 8;
$quiet = 16;
$berserker = 32;
$sensitive = 64;

The above could (perhaps should) be global constants as opposed to variables.

let's assume Dave and Joe have the following profiles:
Dave berserker and rampant = 32 + 8 = 40 (saved in his profile field)
Joe quiet, sensitive and rampant = 16 + 64 + 8 = 88 (saved in his profile field)

If you want to search for just berserkers:

"SELECT * FROM users WHERE profile & 32"

AND 'em up:

"SELECT * FROM users WHERE profile & 32 AND profile & 64"

for users that are berserk AND sensitive

@Ardav,

Bitwise worked out much better than I originally anticipated. I was able to cut my code down by almost 75% for that particular search. I wanted to return a result if the search criteria and the stored value had even one value in common between the two of them. I had some help with this one but instead of having to "AND" or "OR" through the values and searching with %LIKE% I was able to break it down a step further. After the fact the solution seemed obvious.

3 & 13 > 0

// The bit fields in big-endian notation would be:
0011 (3)
1101 (13)

Using > 0 allowed the search to hit on intersecting values without it having to be an exact match.

I'm marking this thread solved again 8-)

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.