Advanced Search development - 'homemade' fuzzy logic...

steve walker

Monday 09 August 2004 1:07:01 am

Good morning!

To get you off to a great Monday start to the week I've got a horrible little conundrum for you!

I have a mammoth site build coming up, and want to use eZ Publish to do it. It's an ecommerce site for diamonds and some associated jewellery, but it mainly just sells stones.

One of the biggest challenges is creating the page that allows you to select certain criteria relating to stones (such as 'cut' and 'carat') - and getting the search results back listing the stones that fit your criteria.

To do this I'm intending to have 4 drop-down lists which allow you to choose cut, carat, clarity and colour. Now, afaik I can use http://ez.no/community/bug_reports/search_lacks_multi_attribute_queries if I want to search within specific attributes, so that's one issue solved.

The other issue is introducing some sort of fuzzy logic to the search results...

Let's take the variable 'carat', and let's say it has values 1, 2, 3, 4 and 5. My user searches for a carat value of 3. I want to build in a mechanism that says 'if there are no stones with the value 3, search on 2 and 4 as well', so I reduce the likelihood of getting zero results returned.

One way I thought of doing this was to have 2 fields in the diamond class for each attribute - say for 'carat' I would have 'carat_actual' and 'carat_fuzzy'. Carat_actual would contain the real value of the stone (example: 2), whereas carat_fuzzy would be populated with the fuzzy values of the stone (example: 1, 3).

Then you'd use 2 fetches to create your results. The first fetch would search only the 'carat_actual' values, the second fetch would search the 'carat_fuzzy' values. The search results would be ordered as 'show contents of first fetch, then show contents of second fetch', so the most accurate results are at the top.
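
To make the two-pass idea concrete, here is a minimal sketch in plain Python rather than eZ Publish template code. The `Stone` record, the sample stones and the helper `two_pass_search` are hypothetical; only the field names `carat_actual` and `carat_fuzzy` come from the description above. It simply lists exact matches first and near matches after them.

```python
from dataclasses import dataclass, field


@dataclass
class Stone:
    # Hypothetical record standing in for a diamond content object.
    name: str
    carat_actual: int
    carat_fuzzy: list = field(default_factory=list)


def two_pass_search(stones, carat):
    """Return exact matches first, followed by near matches."""
    # First pass: stones whose real value equals the requested one.
    exact = [s for s in stones if s.carat_actual == carat]
    # Second pass: stones that list the requested value as a fuzzy value.
    fuzzy = [
        s for s in stones
        if carat in s.carat_fuzzy and s.carat_actual != carat
    ]
    return exact + fuzzy


stones = [
    Stone("A", 2, [1, 3]),
    Stone("B", 3, [2, 4]),
    Stone("C", 4, [3, 5]),
]
results = [s.name for s in two_pass_search(stones, 3)]
```

With this sample data, stone B is the exact match and is listed ahead of the near matches A and C.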

My questions are:

- Does this seem like the right way to create this search functionality? If so, can anyone help me/give some pointers with the construction of the fetches to create the search results?

- Is there a simpler or more effective way of creating this type of functionality?

Thoughts on this would be greatly appreciated.

Thanks, Steve.

http://www.oneworldmarket.co.uk

Ekkehard Dörre

Monday 09 August 2004 2:53:03 am

Hi,

In this contribution:
http://ez.no/community/contributions/applications/grapevine_lonely_hearts_ads_for_your_website

there is a custom search. You can get some ideas there.

Greetings, ekke

http://www.coolscreen.de - Over 40 years of certified eZ Publish know-how: http://www.cjw-network.com
CJW Newsletter: http://projects.ez.no/cjw_newsletter - http://cjw-network.com/en/ez-publ...w-newsletter-multi-channel-marketing

steve walker

Monday 09 August 2004 4:14:14 am

Ekke,

Thanks very much for the response.

I'll take a look at the contrib and let you know how I get on.

Regards, Steve.

http://www.oneworldmarket.co.uk

steve walker

Tuesday 10 August 2004 10:02:16 am

Hi there,

I have had a look at the search facility in the Dating Contrib - it seems to search specific attributes (such as sex etc) - but this is now solved with http://ez.no/community/bug_reports/search_lacks_multi_attribute_queries?

My key issue seems to be the way in which search data is fetched - first fetch listing exact matches, and the second fetch listing near matches...

Ekke or anyone else - can you suggest some approaches for dealing with this?

Thanks, Steve.

http://www.oneworldmarket.co.uk

steve walker

Thursday 19 August 2004 4:26:06 am

Hello!

Really hoping someone could give me input on this!

As far as I can tell I'm going to want a plain-vanilla search going on, just that the search needs to stack 2 fetches - the first exact one, the second fuzzy one - and use these to generate the results?

Am I barking up the wrong tree, is there an easier way to achieve this?

Ideas and thoughts greatly appreciated!

Regards, Steve.

http://www.oneworldmarket.co.uk

Paul Borgermans

Thursday 19 August 2004 5:04:05 am

Steve,

For fuzzy searches, you can achieve a lot with attribute filtering on content tree/list fetches instead of the search function. See also my contrib for 'like' and 'not like' matches, or the 'in' and 'not_in' filter operators ...

I may add more filter operators soon. One candidate is 'between' to select in a range.

-paul

eZ Publish, eZ Find, Solr expert consulting and training
http://twitter.com/paulborgermans

steve walker

Thursday 19 August 2004 5:35:43 am

Paul,

Thank you for the reply.

Without wanting to push for too much spoon-feeding, do you have any code examples you could copy/paste into this thread?

Regards, Steve.

http://www.oneworldmarket.co.uk

steve walker

Friday 20 August 2004 5:06:56 am

Hi there,

Unfortunately I don't think the 'like' operator will help here, as I'm dealing with numerical values. And I'd like our client to be able to specify the additional values that an item can be searched on.

Just to recap what I want to achieve - an item has an attribute, say, called 'example' - in my class there is 'example-exact' field and 'example-fuzzy' field.

example-exact = '4'
example-fuzzy = '3 5'

I want the search to check for example-exact matching values, then example-fuzzy matching values - and then order them in a list showing exact matches first followed by the near matches. The search results will then display the top 3 results only.

Can this be achieved by developing the fetch function used by the search.tpl? I've had some good input from Ekke and Paul but can't quite nail down a process for achieving what I want.

Steve.

http://www.oneworldmarket.co.uk

Paul Borgermans

Friday 20 August 2004 5:14:19 am

Hi

You can use the 'in' filter operator after applying the contributed patch (for 3.4.1):

Edited: added missing 5th ')'

{let fuzzyresults=fetch(content,tree,hash(parent_node_id, <nodeid>, attribute_filter,
 array(and(array('yourclass/attribute','in','(3,5)')))))
nonfuzzyresults=fetch(content,tree,hash(parent_node_id, <nodeid>, attribute_filter, 
array(and(array('yourclass/attribute','=','4')))))
}
{* do fuzzy and non-fuzzy things *}

{/let}

eZ Publish, eZ Find, Solr expert consulting and training
http://twitter.com/paulborgermans

steve walker

Friday 20 August 2004 10:56:56 am

Paul,

Many thanks - I'll let you know how I get on.

Regards, Steve.

http://www.oneworldmarket.co.uk

steve walker

Monday 23 August 2004 7:37:06 am

Paul,

I've applied the patch, and initial tests are looking good :-) One problem I am seeing, using a fetch of:

{let fuzzyresults=fetch(content,tree,hash(parent_node_id, 59, attribute_filter,
 array(and(array('203','in','(7,8,6)'))), array('204','in','(7)')))
}

is that the fetch isn't using class attribute '204' in an exclusive sense...

i.e. it's fetching items that match the 'in' values for either class attribute '203' or '204'. I want it to fetch only records that match *both* of these conditions.

Is it possible to do this as my fetch will eventually be searching on 4 class attributes?

Regards, Steve.

http://www.oneworldmarket.co.uk

Hans Melis

Monday 23 August 2004 7:49:23 am

Hi Steve,

You've got an error in the syntax of your attribute filter. Try this instead:

{let fuzzyresults=fetch(content,tree,hash(parent_node_id, 59, attribute_filter,
 array(and,array('203','in','(7,8,6)'), array('204','in','(7)'))))
}
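
As a rough illustration of what the 'and' form of the filter is meant to do, here is a small Python sketch, not eZ Publish code. The `matches` helper and the sample items are hypothetical, and it only models the 'in' and '=' operators used in this thread; an item is kept only when every condition holds.

```python
def matches(item, conditions):
    """Return True only if the item satisfies every condition (logical AND)."""
    for attribute, operator, value in conditions:
        if operator == "in":
            ok = item[attribute] in value
        elif operator == "=":
            ok = item[attribute] == value
        else:
            raise ValueError(f"unsupported operator: {operator}")
        if not ok:
            return False
    return True


# Same two conditions as in the fetch above, written as Python tuples.
conditions = [("203", "in", (7, 8, 6)), ("204", "in", (7,))]

items = [
    {"name": "both", "203": 8, "204": 7},
    {"name": "only_203", "203": 6, "204": 9},
    {"name": "only_204", "203": 1, "204": 7},
]
selected = [item["name"] for item in items if matches(item, conditions)]
```

Only the item that satisfies both conditions ends up in `selected`.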

hth

--
Hans

Hans
http://blog.hansmelis.be

Paul Borgermans

Monday 23 August 2004 7:54:07 am

It should work, but the template code you posted is wrong:

This is what it should be:

{let fuzzyresults=fetch(content,tree,hash(parent_node_id, 59, attribute_filter,
 array(and,array('203','in','(7,8,6)'), array('204','in','(7)'))))
}

I would also suggest using the class/attribute identifiers (alphanumeric) instead of the numeric attribute ids. Be aware also that the attributes must belong to the same class.

-paul

eZ Publish, eZ Find, Solr expert consulting and training
http://twitter.com/paulborgermans

Paul Borgermans

Monday 23 August 2004 7:54:38 am

Hans was first --- :-)

eZ Publish, eZ Find, Solr expert consulting and training
http://twitter.com/paulborgermans

steve walker

Monday 23 August 2004 8:37:19 am

Thanks to both of you, fetch is now operating properly!

Steve.

http://www.oneworldmarket.co.uk

steve walker

Monday 23 August 2004 9:09:18 am

http://www.oneworldmarket.co.uk

Paul Borgermans

Monday 23 August 2004 9:55:14 am

Steve

1)
Wildcards are not possible with 'in', you need to use 'like':


array('stone_product/fcarat','like','%6%')
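
For what a pattern such as '%6%' selects, here is a minimal Python sketch. It assumes the 'like' filter follows SQL LIKE conventions ('%' for any run of characters, '_' for a single character) and that the fuzzy field holds space-separated values; the `sql_like` helper and the sample values are hypothetical.

```python
import re


def sql_like(pattern, text):
    """Emulate a simple SQL-style LIKE match against the whole string."""
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, text, flags=re.DOTALL) is not None


# Hypothetical contents of a fuzzy attribute on three stones.
fuzzy_values = ["5 6 7", "3 5", "16 17"]
hits = [v for v in fuzzy_values if sql_like("%6%", v)]
```

Note that a plain substring pattern also matches '16 17', because '16' contains a '6'; with multi-digit values the stored format or the pattern would need delimiters to avoid that.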

2)

You can use the limit parameter in your 2 individual fetches, but not something combined for both (and you would need some way of defining what the top 3 results are in a combined case).

hth

-paul

eZ Publish, eZ Find, Solr expert consulting and training
http://twitter.com/paulborgermans

steve walker

Monday 23 August 2004 10:34:34 am

Hi Paul,

Thank you for the response.

"You can use the limit parameter in your 2 individual fetches, but not something combined for both (and you would need some way of defining what the top 3 results are in a combined case)."

This presents me with a bit of a problem... the client only wants 3 results ever shown on a page, and obviously you'd just want the exact matches shown... fuzzy results are only to show up in the absence of exact ones, or to make up the result list to three... with the current 2 independent loops you will always show x amount of exact and y amount of fuzzy, which isn't really going to work.

Is it possible to pipe the two fetches into one list and then loop through this new list? Or are there any other workarounds?

Regards, Steve.

http://www.oneworldmarket.co.uk

Hans Melis

Monday 23 August 2004 10:44:24 am

Hi Steve,

I would use something like this:

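{let page_limit=3
     exact_matches=fetch(content,tree,hash(....., limit, $page_limit,....))
     fuzzy_matches=array()}
{* only fetch fuzzy matches when there are fewer exact matches than the page limit *}
{section show=count($exact_matches)|lt($page_limit)}
  {set fuzzy_matches=fetch(content,tree,hash(....., limit, sub($page_limit,count($exact_matches)),....))}
{/section}
{* display $exact_matches, then $fuzzy_matches *}
{/let}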

This will fetch maximum 3 exact matches. The {section} part is only executed if the count of $exact_matches is less than the page limit (3 in this case). If it executes, it will only fetch fuzzy matches until you have the number of results defined in $page_limit. If it doesn't execute, $fuzzy_matches remains an empty array.
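
The same top-up logic can be sketched in plain Python. The `top_results` helper and the sample lists are hypothetical, and unlike the template version both lists are assumed to be available up front rather than fetched on demand.

```python
def top_results(exact, fuzzy, page_limit=3):
    """Show exact matches first, topping up with fuzzy ones up to page_limit."""
    shown = exact[:page_limit]
    remaining = page_limit - len(shown)
    if remaining > 0:
        # Only use as many fuzzy matches as are needed to fill the page.
        shown = shown + fuzzy[:remaining]
    return shown


one_exact = top_results(["e1"], ["f1", "f2", "f3"])
many_exact = top_results(["e1", "e2", "e3", "e4"], ["f1"])
no_exact = top_results([], ["f1"])
```

With one exact match the list is filled with two fuzzy matches; with four exact matches no fuzzy match is shown at all.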

This might not be totally what you want but it could get you started :-)

--
Hans

Hans
http://blog.hansmelis.be

steve walker

Monday 23 August 2004 11:11:01 am

Thanks Hans!

I suppose a possible way of always ensuring only 3 max results are shown is to put a line in the fuzzy section, something like:

$NEW_page_limit = 3 - $page_limit

Cheers!

http://www.oneworldmarket.co.uk
