Archive for May, 2013

Using mongodb text search with node.js

May 22, 2013

In my last post I talked about enabling mongodb’s beta text search, which at least to me was a little less than intuitive to accomplish. That’s probably partly because of the beta nature of this feature.

The next challenge was figuring out how to use the text search functionality from node.js, since calling it from an application that needs to provide search is the whole point. I’m sure that at some point the node.js native driver will support syntax specifically for searching, but at the moment it’s not there yet. This post assumes that text searches are enabled and you’ve added an index.

Before I show how I am accessing the text search feature, it is helpful to know how my modules are put together in general. At the top of each module I set up the mongo connections. In this post I’m going to use “articles” as my example collection. The setup for the db object looks something like this:

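var db = {};
var mongo = require('mongodb').MongoClient;
// config is assumed to be loaded elsewhere in the module
mongo.connect(config.mongodb.articles.url, function(err, cdb){
  if (err) throw err; // cdb is not usable if the connection failed
  db.articles = cdb.collection('articles');
});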

This leaves me with a db.articles object that provides access to the collection’s methods, including find, update, save, and so on. I would add each collection needed for the module to the db object in the same way. Unfortunately, the collection object doesn’t have a method for text searches. For that, I need access to the cdb object passed to the mongo.connect callback. To get it, I add the cdb object to my db object, which puts it in scope for the rest of my module.

var db = {};
var mongo = require('mongodb').MongoClient;
mongo.connect(config.mongodb.articles.url, function(err, cdb){
  if (err) throw err; // cdb is not usable if the connection failed
  db.articles = cdb.collection('articles');
  db.cdb = cdb; // keep the database handle so commands can be run later
});

Obviously, if you already have a collection called cdb you’ll need to pick a different property name. Alternatively, you could assign the cdb object to its own variable in the module scope instead of attaching it to db.

Then I add a search method to my module. Simplified, it looks like this:

search: function(query, callback){
  // run the beta "text" command against the articles collection
  db.cdb.command({text: "articles", search: query}, function(error, results){
    // unwrap the results array, falling back to an empty array
    results = (results && results.results) ? results.results : [];
    callback(error, results);
  });
}

We want this method to return an array of results, not the full response object with the extra fields mongodb adds to it. The conditional is there to keep the method from throwing if results is undefined, for example when the command fails. The real thing has more logic, such as ACL filtering; this version is stripped down to show just the text search.
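One thing worth keeping in mind with this setup is that mongo.connect is asynchronous, so db.cdb stays undefined until its callback has fired. Below is a minimal sketch of a guard for that case. The makeSearch wrapper and its error message are hypothetical names used only for illustration; the only driver call is the same db.cdb.command call shown above.

```javascript
// Hypothetical factory (not part of the driver): builds a search function
// that reports an error instead of throwing when the connection is not ready.
function makeSearch(db, collectionName) {
  return function search(query, callback) {
    if (!db.cdb) {
      // mongo.connect has not called back yet, so there is nothing to query
      return callback(new Error('database connection not ready'), []);
    }
    db.cdb.command({ text: collectionName, search: query }, function (error, response) {
      // same unwrapping as the search method above
      var results = (response && response.results) ? response.results : [];
      callback(error, results);
    });
  };
}
```

With the db object from earlier, this would be wired up as var search = makeSearch(db, 'articles'); and called the same way as the method above.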

The results passed to the callback will be an array of objects, each of which has two properties: score and obj. obj holds the full document for that match.
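If the calling code only cares about the documents themselves, the score/obj wrappers can be flattened away. This is a rough sketch, assuming the array has the shape described above; toDocuments is a hypothetical helper name, and sorting by score is my own choice for the example rather than something the command requires.

```javascript
// Hypothetical helper: turn [{ score, obj }, ...] into plain documents,
// ordered from highest score to lowest.
function toDocuments(results) {
  return (results || [])
    .slice() // copy first so the caller's array is not reordered
    .sort(function (a, b) { return b.score - a.score; })
    .map(function (result) { return result.obj; });
}
```

Inside the search callback this could be used as callback(error, toDocuments(results)) when the scores are not needed further up the stack.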

The extra steps shown here will go away when they add text searches to the driver, but for now this is a fairly functional approach. I hope it saves someone the extra time it took me to sort this out.

Categories: mongodb, node.js

MongoDb text search

May 22, 2013

Full text search in NoSQL databases is far less common than one would think. Most apps I build can benefit from full text searches, even if they don’t need sophisticated search capabilities. There are external solutions for most databases, mostly tying in Lucene through Elasticsearch or Solr. Sometimes those external solutions are just the way you need to go, and I’ve used external Lucene integration with CouchDB before. But I was glad when I saw that text searches are included in MongoDb 2.4, at least at a beta stage.

The main catch in my testing so far is that I had a hard time figuring out how to enable this feature. Like many people (I expect), I’m using packages for Ubuntu, so I needed to figure out how to get this feature enabled in /etc/mongodb.conf. The documentation shows how to enable text searches in the command that starts mongo, and mentions that you can put this in the config file, but it doesn’t tell you what the config file syntax is.

This doesn’t work:
textSearchEnabled=true

You end up with a response that says
error command line: unknown option textSearchEnabled

This is the syntax to put in the config file instead:
setParameter=textSearchEnabled=true

Once I added that, the feature was enabled. In the mongodb console I was then able to add my initial index like this:
db.content.ensureIndex( { title:'text', body: 'text' });

and search it like this:
db.content.runCommand("text", {search:'Lorem'})

This returns an object that contains an array called results with the results in it, one result object for each document that matched the text in either the title or body field. Each result is in turn an object with a score and the matching document. It also returns a stats object that tells how many documents were found and how long it took.
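To make that shape a little more concrete, here is a small sketch in plain JavaScript. The sampleResponse object is a made-up stand-in, not real output: the titles, bodies, and scores are invented, and the contents of stats are left out rather than guessed. describeMatches is a hypothetical helper for illustration only.

```javascript
// Made-up stand-in for the object the text command returns.
var sampleResponse = {
  results: [
    { score: 1.5, obj: { title: 'Lorem ipsum', body: 'Dolor sit amet' } },
    { score: 0.75, obj: { title: 'Second post', body: 'Lorem appears here too' } }
  ],
  stats: {}, // match counts and timing live here; fields omitted in this sketch
  ok: 1
};

// Hypothetical helper: one "score  title" string per match.
function describeMatches(response) {
  return response.results.map(function (result) {
    return result.score + '  ' + result.obj.title;
  });
}

console.log(describeMatches(sampleResponse));
```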

Overall, this feature is very promising. While it doesn’t appear as strong in its search capabilities as the Lucene solutions, having it directly available in MongoDb itself is a big win for deploying solutions for customers quickly.

Next up: interacting with the text search functionality through the native node.js driver.

Categories: mongodb