Search logging for eZ Find
==========================

This spec describes search logging for eZ Find.

Rationale
---------

Using Solr itself as the storage medium provides a few advantages:

- search queries and results are easy to store and analyze
- Solr runs as a non-blocking daemon, so it can insert lots of data concurrently
- a simple click-through mechanism can store relevance feedback on search results, using some heuristics


Implementation
--------------

Items to log
~~~~~~~~~~~~

- timestamp
- main search results
- search phrase
- session id of current user (to correlate search results)
- username
- total number of results
- method used (dismax, simplestandard, ...)
- other query parameters (offset, limit, boost factors, ...)
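
As a rough illustration of the items above, one log entry could be assembled
as a flat document before being sent to Solr. This is only a sketch in Python:
the function name and every field name (``phrase``, ``session_id``,
``result_ids``, ...) are assumptions made for the example, not part of eZ Find
or of any existing Solr schema.

```python
import json
from datetime import datetime, timezone


def build_log_entry(phrase, session_id, username, result_ids,
                    total_results, method, params, now=None):
    """Assemble one search log entry as a flat dict (hypothetical field names)."""
    if now is None:
        now = datetime.now(timezone.utc)
    return {
        # Solr date fields expect UTC in the form 2010-01-31T12:00:00Z
        "timestamp": now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "phrase": phrase,
        # Session id lets later analysis correlate searches by the same visitor
        "session_id": session_id,
        "username": username,
        # Main search results, in ranked order (would map to a multi-valued field)
        "result_ids": list(result_ids),
        "total_results": total_results,
        # Query handler used, e.g. "dismax"
        "method": method,
        # Remaining query parameters, serialized so the document stays flat
        "params": json.dumps(params, sort_keys=True),
    }
```

Keeping the document flat is a design choice for the sketch: it maps directly
onto Solr fields, at the cost of storing the free-form parameters as an opaque
string.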


Data store
~~~~~~~~~~

- the search log data is stored in a separate, dedicated Solr shard
- data is regularly backed up in snapshots, in case the index becomes corrupted
- an XML dump can also be made regularly, simply by running range queries and storing the result output
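
The XML dump mentioned above could be produced with a range query on the
timestamp field. The sketch below only builds the request URL; it assumes a
date field named ``timestamp`` and a log core reachable at a URL such as
``http://localhost:8983/solr/searchlog``, both of which are placeholders. The
``q``, ``wt``, ``sort``, ``start`` and ``rows`` parameters are standard Solr
query parameters.

```python
from urllib.parse import urlencode


def build_dump_url(base_url, start, end, offset=0, rows=1000):
    """Build a Solr select URL returning log entries in [start, end] as XML."""
    params = {
        # Range query on the (hypothetical) timestamp field
        "q": "timestamp:[{} TO {}]".format(start, end),
        # Ask Solr for an XML response
        "wt": "xml",
        # Stable ordering so that paging with start/rows is repeatable
        "sort": "timestamp asc",
        "start": offset,
        "rows": rows,
    }
    return base_url.rstrip("/") + "/select?" + urlencode(params)
```

A periodic job could fetch this URL page by page, increasing ``offset`` by
``rows`` each time, and write each response to a file.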


Openness
~~~~~~~~

It appears that Google Analytics is able to store search statistics:

* http://code.google.com/intl/fr-FR/apis/analytics/docs/gdata/gdataReferenceDimensionsMetrics.html#d5InternalSearch
* http://code.google.com/intl/fr-FR/apis/analytics/docs/gdata/gdataReferenceDimensionsMetrics.html#m5InternalSearch

It is quite likely that other statistics tools support similar features. An
open architecture for the search logging implementation would make it easier
to support such alternatives or complements.

More information on the Google Analytics feature:
http://www.google.com/support/googleanalytics/bin/answer.py?hl=en&answer=75817
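
One possible shape for such an open architecture is a small backend interface
that the search code calls, with Solr being just one implementation among
others. The sketch below is written in Python purely for illustration (eZ Find
itself is a PHP extension), and all class and method names are hypothetical.

```python
from abc import ABC, abstractmethod


class SearchLogBackend(ABC):
    """Anything that can receive a search log entry (hypothetical interface)."""

    @abstractmethod
    def log_search(self, entry):
        """Store or forward one log entry, given as a dict."""


class InMemoryBackend(SearchLogBackend):
    """Stand-in backend for the example; a real one would post to Solr
    or forward the entry to an external statistics tool."""

    def __init__(self):
        self.entries = []

    def log_search(self, entry):
        # Copy so that later changes by the caller do not alter the log
        self.entries.append(dict(entry))


class SearchLogger:
    """Fans each entry out to every configured backend."""

    def __init__(self, backends):
        self.backends = list(backends)

    def log_search(self, entry):
        for backend in self.backends:
            backend.log_search(entry)
```

With this arrangement, adding an alternative or complementary statistics tool
means writing one more backend class and registering it, without touching the
search code itself.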
