Sunday, April 28, 2013

Generating recommendations via CouchDB - Part 1

  • I set out to build a solution in CouchDB with a simple use case in mind:
    Given a product being browsed, we want to provide suggestions based on what others have purchased in the past.
  • I've already discussed the strategies for breaking down a single sale/receipt/invoice in Generating recommendations via Elasticsearch - Part 1, so I decided to employ them for CouchDB as well. The difference is that instead of generating multiple documents for each sale:
    (total # of line items) * (total # of line items - 1)
  • I found it simpler to store the sale documents themselves and generate keys that represented product pairs, via the MAP function in CouchDB. If you're unfamiliar with the core concept of map/reduce, you can watch this five minute screencast: Understanding map reduce with CouchDB.
  • Here's what the results of the map operation look like:
  • Applying the REDUCE function yields rows that tell us which product was bought together with another and how many times. This result is similar to the facets created by Elasticsearch.
  • So what's lacking for a complete implementation?
    1. We need a query that fetches results from this map/reduce view for the product being browsed by a consumer, for example: T-shirt (demo)
    2. Next, we need to discern which products have the highest count.
    3. Next, we need to retrieve the product's ID in order to fetch more information about it ... for the users to view.
      • Where & how to store this information in the current map/reduce view ... so that we may fetch it as part of a query itself ... is an open question for now.
      • Maybe it's something that cannot even be accomplished via CouchDB? Would it require an alternate implementation with secondary indexes such as the ones Cloudant provides?
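
To make the flow above concrete, here is a rough in-memory sketch in Python of what the map step, the reduce step, and the first two query steps might look like. It does not talk to CouchDB at all, and it is not the view code from this post: the sale document layout (`line_items` as a list of product names), the key shape `(product, other_product)`, and the function names `map_sale`, `reduce_counts` and `recommend` are all assumptions made for illustration. Because CouchDB returns view rows ordered by key rather than by the reduced value, the sketch ranks by count on the client side.

```python
from collections import Counter
from itertools import permutations

# Hypothetical sale documents; the field names are assumptions for illustration.
SALES = [
    {"_id": "sale-1", "line_items": ["T-shirt", "Jeans", "Belt"]},
    {"_id": "sale-2", "line_items": ["T-shirt", "Jeans"]},
    {"_id": "sale-3", "line_items": ["T-shirt", "Socks"]},
]


def map_sale(doc):
    """Mimic a map function: emit ((product, other), 1) for every ordered pair."""
    # Duplicate line items are collapsed so a product is never paired with itself.
    products = sorted(set(doc["line_items"]))
    for product, other in permutations(products, 2):
        yield (product, other), 1


def reduce_counts(rows):
    """Mimic a grouped sum reduce: add up the emitted values per key."""
    totals = Counter()
    for key, value in rows:
        totals[key] += value
    return totals


def recommend(totals, product, limit=3):
    """Steps 1 and 2: keep the rows for one product, then rank them by count."""
    matches = [(other, count) for (p, other), count in totals.items() if p == product]
    # Highest count first; ties are broken alphabetically for a stable order.
    matches.sort(key=lambda pair: (-pair[1], pair[0]))
    return matches[:limit]


totals = reduce_counts(row for doc in SALES for row in map_sale(doc))
print(recommend(totals, "T-shirt"))
```

A sale with n distinct line items emits n * (n - 1) keys here, which matches the document count mentioned earlier, except that the pairs live in the view index instead of being stored as separate documents. Step 3 (fetching the product's ID and details) is left out, since where to keep that information is the open question above.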

