Apache SOLR Search – Find Everything

Recently, I had a chance to build a search application for one of our clients. The client had an existing Google Search Appliance integrated with multiple Plone CMS sites. Because the Google Search Appliance has been discontinued, they were looking for a replacement, and UDig recommended Apache SOLR. We were able to use a single search schema for multiple sites and consolidate their search into a single-page application that uses AJAX Solr to give end users one comprehensive search. The open source community has embraced SOLR as a search solution, so there is a Plone plugin, which we used to provide near real-time indexing of new documents as they are added to the Plone CMS. With the rich set of APIs available in SOLR, the existing content, no matter how old, was also indexed and made searchable. The client also wanted the ability to assign custom rank orders to documents that they considered highly relevant. With SOLR, we could easily change the ranking order based on the criteria the client supplied.

What is Apache SOLR?

Apache SOLR is an enterprise search platform built on Apache Lucene. Lucene is a search engine library packaged as a set of JAR files. SOLR builds features on top of the Lucene API and exposes them over HTTP as a standalone server, which makes building a search application much easier. The project describes SOLR as “highly reliable, scalable and fault tolerant, providing distributed indexing, replication and load-balanced querying, automated failover and recovery, centralized configuration and more. SOLR powers the search and navigation features of many of the world’s largest internet sites” (http://lucene.apache.org/solr/). This standalone search server provides multiple mechanisms for getting your data into SOLR and a query language for retrieving your documents from it.

So, how do you begin?

Well, first download and install SOLR using Apache’s easy-to-follow quickstart. Once you have a working SOLR demo site, you can begin to figure out how to use it in your environment. Let’s say you have a web site and you want to index the web pages for a custom search. You can use Apache Nutch, a mature web crawler that integrates directly with SOLR. How about my .NET Forms application? Well, there are APIs for that too: SolrNet is a .NET client for SOLR. Maybe you have a Java application. SolrJ to the rescue! How about that file system that has hundreds of documents? Are you constantly trying to find a Word document from two years ago? You can index that file system into SOLR and then search the index for that document. In our case, we used the Plone CMS SOLR plugin to index documents. The plugin supported both HTML documents and attachments such as Excel, Word and PDF files. This met our indexing needs, and we ended up with an index that we could use to build our search application.
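As one illustration of getting data into an index, here is a minimal sketch that builds (but does not send) an HTTP request posting two JSON documents to a SOLR update handler. The host, the core name mycore, and the document contents are assumptions made for illustration; this is not how the Plone plugin works internally.

```python
import json
from urllib.request import Request

# Assumed values for illustration: a local SOLR server and a core named "mycore".
SOLR_UPDATE_URL = "http://localhost:8983/solr/mycore/update?commit=true"


def build_index_request(docs):
    """Build a POST request that sends a list of documents to SOLR as JSON."""
    body = json.dumps(docs).encode("utf-8")
    return Request(
        SOLR_UPDATE_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical documents; the field names must exist in your own schema.
docs = [
    {"id": "doc-1", "searchableText": "fast solr search"},
    {"id": "doc-2", "searchableText": "UDig blog"},
]
request = build_index_request(docs)
print(request.get_method(), request.full_url)
# Sending it requires a running server: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect the payload before pointing it at a real server.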

Building a Search Application

We chose AJAX Solr to build out the search application. AJAX Solr is a JavaScript library that can be extended to provide custom search results. This choice gives users a single search application that covers all of the different locations where data is stored. The result is a cohesive application that the users will come to rely on. We built out the search application to include some of SOLR’s wonderful features, such as Faceted Search, Filtering, Query Suggestions, Spell Check and Auto-complete. We also ranked the results so that relevant information appears higher up in the search results. Let’s break down some of the search features of SOLR.

To send a query to the SOLR server, you construct a URL and send it to the server.

/solr/query?q=*:*
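In practice the query string should be percent-encoded before it is sent. The sketch below shows one way to build such a URL with Python's standard library. The base path is copied from the example above; the helper name build_search_url is hypothetical, and a real deployment usually adds a host and a core or collection name.

```python
from urllib.parse import urlencode

# Assumed base path, taken from the example URL above.
BASE_PATH = "/solr/query"


def build_search_url(query, **params):
    """Return a search URL with q and any extra parameters percent-encoded."""
    return BASE_PATH + "?" + urlencode({"q": query, **params})


# Match every document, then the same query limited to 5 rows from offset 10.
match_all_url = build_search_url("*:*")
paged_url = build_search_url("*:*", rows=5, start=10)
print(match_all_url)  # /solr/query?q=%2A%3A%2A
print(paged_url)      # /solr/query?q=%2A%3A%2A&rows=5&start=10
```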

Basic Search

To search for a term in a field of your index, here a field called searchableText, put the field name, a colon, and then the term in the q parameter of the search URL.

Phrases

To search for a phrase, enclose it in double quotes.
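As a sketch, a phrase clause can be assembled like this. The phrase fast search is only an illustrative value, and phrase_clause is a hypothetical helper that escapes any quotes inside the phrase so they do not end it early.

```python
from urllib.parse import urlencode


def phrase_clause(field, phrase):
    """Build a field:"phrase" clause, escaping embedded backslashes and quotes."""
    escaped = phrase.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


query = phrase_clause("searchableText", "fast search")
url = "/solr/query?" + urlencode({"q": query})
print(query)  # searchableText:"fast search"
print(url)    # /solr/query?q=searchableText%3A%22fast+search%22
```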

Sloppy Phrases Search

A proximity query matches the words of a phrase even when they are not directly next to each other. Appending a tilde (~) and a number to a quoted phrase tells SOLR how far apart the words may be. For example, “fast search”~1 will match both “fast search” and “fast solr search” in the searchableText field, because ~1 tells SOLR to search within 1 word of our search phrase.
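The proximity syntax can be sketched as a small helper. The name proximity_clause is hypothetical, and the field and phrase are the ones used in the text.

```python
from urllib.parse import urlencode


def proximity_clause(field, phrase, slop):
    """Build a field:"phrase"~slop clause; slop is the allowed word distance."""
    if slop < 0:
        raise ValueError("slop must be zero or greater")
    return f'{field}:"{phrase}"~{slop}'


query = proximity_clause("searchableText", "fast search", 1)
url = "/solr/query?" + urlencode({"q": query})
print(query)  # searchableText:"fast search"~1
print(url)
```

A slop of 0 behaves like an exact phrase search, so the same helper covers both cases.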

Boost Queries

Any query clause can be boosted with the ^ operator. The boost is multiplied into the normal score for the clause and affects its importance relative to other clauses. For example, boosting the term “UDig” in the searchableText field by 10 (written as ^10 after the term) ranks documents that contain “UDig” higher in the results than documents whose searchableText field contains only the word “blog”.
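A minimal sketch of the boost described above, combining a boosted clause with an unboosted one. The helper boost is hypothetical, and joining the clauses with an explicit OR is an assumption about how the two terms were meant to be combined.

```python
from urllib.parse import urlencode


def boost(clause, factor):
    """Append a ^factor boost to a query clause."""
    return f"{clause}^{factor}"


# Documents matching "UDig" get that clause's score multiplied by 10.
query = " OR ".join([boost("searchableText:UDig", 10), "searchableText:blog"])
url = "/solr/query?" + urlencode({"q": query})
print(query)  # searchableText:UDig^10 OR searchableText:blog
print(url)    # /solr/query?q=searchableText%3AUDig%5E10+OR+searchableText%3Ablog
```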

Range Queries

A range query selects documents with values between a specified lower and upper bound. Range queries work on numeric fields, date fields, and even string and text fields.

  • Square brackets [ ] denote an inclusive range query that matches values including the upper and lower bound.
  • Curly brackets { } denote an exclusive range query that matches values between the upper and lower bounds, but excluding the upper and lower bounds themselves.
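The two bracket styles can be sketched with a small helper. The field name year and the bounds are hypothetical values chosen for illustration, and range_clause is not part of any SOLR client library.

```python
def range_clause(field, lower, upper, inclusive=True):
    """Build field:[lower TO upper] (inclusive) or field:{lower TO upper} (exclusive)."""
    open_bracket, close_bracket = ("[", "]") if inclusive else ("{", "}")
    return f"{field}:{open_bracket}{lower} TO {upper}{close_bracket}"


inclusive_query = range_clause("year", 2015, 2017)
exclusive_query = range_clause("year", 2015, 2017, inclusive=False)
open_ended_query = range_clause("year", 2015, "*")  # * leaves a bound open
print(inclusive_query)   # year:[2015 TO 2017]
print(exclusive_query)   # year:{2015 TO 2017}
print(open_ended_query)  # year:[2015 TO *]
```

With these bounds, the inclusive query matches 2015, 2016 and 2017, while the exclusive query matches only 2016.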

There are many, many ways to slice and dice your search index. SOLR has a very rich API that you can use to give users the best search results possible. From internal sites and databases to externally facing websites, search helps users find everything. In fact, our recent project actually helped users save lives.
