Introducing Lucene2Objects

I've been playing with Lucene.NET for over two years now. It all started when I joined an NLP research group, where my first task was to look into Lucene since nobody was using it. I was baffled by how powerful Lucene was, and besides, the biggest players were using it! Now that I’ve gotten to know it a bit better I see why so many people use it. Put simply: it’s awesome! However, Lucene does have a problem: the learning curve. Wrapping your head around documents, queries and analyzers, and getting a reasonably efficient search working, are a few of the hurdles of using Lucene on a project.

Enter Lucene2Objects. My basic idea is to provide a simple interface to Lucene for developers who want to add search annotations to their domain model. Let’s take the example of a system handling messages (of the “Hi! How do you do?” kind, not the WM_PAINT kind); most probably, users will want to search for something inside their messages. A (very) basic approach gives us a simple class:

public class Message
{
 public int Id { get; set; }

 public string Text { get; set; }

 public string Title { get; set; }

 public DateTime Sent { get; set; }
}

This is neat, but if I want to implement search, I could use the services provided by my DB backend, such as Full-Text Indexing in SQL Server (which is awesome by the way, but lacks some other cool stuff). The biggest problem is that we would then be tying (or tightly coupling, for the fanboys of OOP/IoC/SOLID) the data store to the text search solution, which is almost definitely a bad thing.

Now, if we want to use Lucene, we need to do some configuration and learn about indexing, tokenizers, analyzers and a huge list of things that some folks (me included) find amusing, but others find really boring (not to mention those who find it daunting). But imagine a world where you could do something like this:

var iWriter = new IndexWriter(Environment.CurrentDirectory + @"\index");
var message = new Message { Id = 12, Sent = DateTime.Now, 
                            Text = "Some text on the message!", 
                            Title = "This is the title" 
              };
iWriter.AddEntity(message);
iWriter.Close();

Cool, huh? Just point to a folder and save. Nice! And how would I search for stuff in that folder? Piece of cake:

var iReader = new IndexReader(Environment.CurrentDirectory + @"\index");
var messages = iReader.Search<Message>("text");

foreach (var message in messages) {
 Console.WriteLine("Message: {0}", message.Title);
}

Fine! But how does my model know where to search? What to index? What not to index? Well, validation had a similar problem, so why not give it a similar solution? Just annotate away!

[SearchableEntity(DefaultSearchProperty = "Text")]
public class Message
{
 public int Id { get; set; }

 [Indexed]
 public string Text { get; set; }

 [Indexed]
 public string Title { get; set; }

 public DateTime Sent { get; set; }

 public DateTime? Read { get; set; }
}
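
For the curious: attributes like these are typically discovered through reflection. The snippet below is only a minimal sketch of that general technique, not the actual Lucene2Objects implementation. The attribute classes are re-declared here as stand-ins, and `IndexedFieldReader` and its methods are hypothetical names used for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Stand-in attribute declarations (assumption: the real Lucene2Objects
// attributes may be declared differently).
[AttributeUsage(AttributeTargets.Class)]
public sealed class SearchableEntityAttribute : Attribute
{
    public string DefaultSearchProperty { get; set; }
}

[AttributeUsage(AttributeTargets.Property)]
public sealed class IndexedAttribute : Attribute { }

[SearchableEntity(DefaultSearchProperty = "Text")]
public class Message
{
    public int Id { get; set; }

    [Indexed]
    public string Text { get; set; }

    [Indexed]
    public string Title { get; set; }

    public DateTime Sent { get; set; }

    public DateTime? Read { get; set; }
}

// Hypothetical helper: collects what an indexer would need from an entity.
public static class IndexedFieldReader
{
    // Returns name/value pairs for every public property marked [Indexed].
    public static IDictionary<string, string> GetIndexedFields(object entity)
    {
        return entity.GetType()
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(p => Attribute.IsDefined(p, typeof(IndexedAttribute)))
            .ToDictionary(
                p => p.Name,
                p => Convert.ToString(p.GetValue(entity, null)) ?? string.Empty);
    }

    // Reads DefaultSearchProperty from [SearchableEntity], or null if absent.
    public static string GetDefaultSearchProperty(Type type)
    {
        var attribute = (SearchableEntityAttribute)Attribute.GetCustomAttribute(
            type, typeof(SearchableEntityAttribute));
        return attribute == null ? null : attribute.DefaultSearchProperty;
    }
}
```

Calling `IndexedFieldReader.GetIndexedFields(message)` on a message yields only the `Text` and `Title` entries, and `GetDefaultSearchProperty(typeof(Message))` returns `"Text"`, which is the kind of information an indexer needs to build a Lucene document and pick a default field to query.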

If you like this way of handling things with Lucene, you’ll love Lucene2Objects. Keep in mind, however, that I’m the only person working on this, so if you like it and want to contribute, let me know! For now, I’ll leave Lucene2Objects as a package on NuGet, so you can play with it. I’ll put it in my Bitbucket repo this week along with my scaffolders for SharpLite.

6 comments:

  1. Can this lib handle Object relationships, like Hibernate Search? (for eg.: IndexEmbedded, Index Collections, ManyToMany relations etc.)? Nice work anyway!

    1. At this point I haven't added that functionality to the current version, but it's the first item on the to-do list, and I plan to have it soon. Thanks for letting me know!

  2. Very good work! It would definitely make the use of Lucene.net a lot easier. Any reasons you have a dependency on Ninject? Cheers

    1. I use Ninject to manage dependencies for the CompositeAnalyzer I made to handle Lucene. I'm currently writing a post with more detail about why I did it and how it works.

      Hope it helps..!

  3. Have you put the source anywhere?

  4. I've just placed the source code on my Bitbucket account; you can check it out at https://bitbucket.org/davidcondemarin/lucene2objects
