
Mojo Facets - evolution of faceted browsing

My server-side faceted browser just got a bit better. In fact, it became 10 times better. But let's try to explain this story step by step...

This week I will try to introduce faceted joins. The primary motivation is the great Plants For A Future database, which consists of more than one text file.

The use case is something like the following:
I would like to know all plants which have medicinal uses, are edible and are perennial (so I don't have to re-plant them every year).
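
A query like this boils down to joining two datasets on a shared key and then applying one filter per facet. The following is only an illustrative Python sketch, not MojoFacets' Perl code: the two tables, the field names (`name`, `habit`, `medicinal`, `edible`), the placeholder rows and the helper names `facet_join` and `apply_filters` are all assumptions made up for the example and are not taken from the Plants For A Future files.

```python
# Illustrative only: table layout, field names and rows are invented.
plants = [
    {"name": "plant-a", "habit": "Perennial"},
    {"name": "plant-b", "habit": "Annual"},
    {"name": "plant-c", "habit": "Perennial"},
]
uses = [
    {"name": "plant-a", "medicinal": 3, "edible": 4},
    {"name": "plant-b", "medicinal": 2, "edible": 5},
    {"name": "plant-c", "medicinal": 0, "edible": 2},
]


def facet_join(left, right, key):
    """Merge rows of two datasets that share the same value for `key`."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


def apply_filters(items, filters):
    """Keep only the items for which every facet predicate holds."""
    return [
        item for item in items
        if all(pred(item[facet]) for facet, pred in filters.items())
    ]


joined = facet_join(plants, uses, "name")
selected = apply_filters(joined, {
    "medicinal": lambda v: v > 0,       # has some medicinal use
    "edible": lambda v: v > 0,          # is edible
    "habit": lambda v: v == "Perennial",
})
print([item["name"] for item in selected])
```

With the placeholder rows above, only `plant-a` passes all three filters.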

And you can watch the video to see how easily this can be done:

But this still doesn't make MojoFacets 10 times better than before. That is quite a small dataset (still about 10 times bigger than Exhibit could handle), but I had a new problem: a 100MB source file with a bit less than 30000 items. To make it scale further, I implemented pre-calculated filters and sorts. They serve the same purpose as indexes do in relational databases, but they are calculated on demand and stored in memory.
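
The idea of filters and sorts that are calculated on demand and then kept in memory can be sketched roughly like this. This is a Python illustration under my own assumptions, not the actual Perl implementation: the class name `FacetCache`, its methods and the sample fields `lang` and `year` are hypothetical.

```python
class FacetCache:
    """Compute filter results and sort orders on first use, then reuse them."""

    def __init__(self, items):
        self.items = items
        self._filters = {}  # (facet, value) -> set of item positions
        self._sorts = {}    # facet -> item positions in sorted order

    def filter(self, facet, value):
        key = (facet, value)
        if key not in self._filters:  # calculated on demand, only once
            self._filters[key] = {
                i for i, item in enumerate(self.items)
                if item.get(facet) == value
            }
        return self._filters[key]

    def sort(self, facet):
        if facet not in self._sorts:  # calculated on demand, only once
            self._sorts[facet] = sorted(
                range(len(self.items)),
                key=lambda i: self.items[i][facet],
            )
        return self._sorts[facet]

    def query(self, filters, order_by):
        """Intersect cached filter sets, then walk the cached sort order."""
        matching = set(range(len(self.items)))
        for facet, value in filters.items():
            matching &= self.filter(facet, value)
        return [self.items[i] for i in self.sort(order_by) if i in matching]


# Hypothetical sample data, just to exercise the cache.
items = [
    {"id": 1, "lang": "hr", "year": 2001},
    {"id": 2, "lang": "sl", "year": 1999},
    {"id": 3, "lang": "hr", "year": 1995},
]
cache = FacetCache(items)
result = cache.query({"lang": "hr"}, order_by="year")
print([r["id"] for r in result])
```

The first query pays for building the filter set and the sort order; later queries on the same facets only do set intersections and one pass over the stored order. The price is memory, since every cached filter and sort stays resident.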

Let's see how it works in action with ~30000 items:

In this video, we saw:

  • starting memory usage of ~13MB
  • 100MB dataset with 29869 items
  • filter by autor (author) with 45644 taking ~10s
  • use regex filter ic,
  • godina_izdavanja (year of publication) is a numeric facet
  • jezik (language) filter using cro slo ser regexps and toggle it
  • show popup title on filters
  • turn off filters to show under 4s load time
  • at the end, we consumed ~260MB of memory
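
The regex filter and toggle steps from the list above can be sketched as follows. This is a hedged Python illustration, not how MojoFacets does it: the helper names are hypothetical, the language codes are invented sample values, and reading "toggle" as "invert the current selection" is my assumption.

```python
import re


def values_matching(facet_values, pattern):
    """Return the facet values that match a case-insensitive regex."""
    rx = re.compile(pattern, re.IGNORECASE)
    return {v for v in facet_values if rx.search(v)}


def toggle(selected, all_values):
    """Invert a selection: selected values become unselected and vice versa."""
    return set(all_values) - set(selected)


# Hypothetical values of a language facet.
languages = {"cro", "slo", "ser", "eng", "ger"}

selected = values_matching(languages, r"cro|slo|ser")
inverted = toggle(selected, languages)
print(sorted(selected), sorted(inverted))
```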

Ok, 4s might not seem blazingly fast, but keep in mind that all this is implemented in pure Perl (so deployment is lightweight) using the Mojolicious web framework. It does have its overhead, though: besides the 260MB of RAM for the browser, it also takes 600MB of RAM on the server side. But if you can live with a server-side memory footprint of about 6 times the file size, this might be a very interesting faceted browsing tool for the web.

About

This page contains a single entry from the blog posted on May 25, 2010 12:20 AM.

The previous post in this blog was DORS/CLUC 2010 conference.

The next post in this blog is Mojo Facets actions, changes and editing.

Creative Commons License
This weblog is licensed under a Creative Commons License.