Saturday, February 28, 2009

Different kinds of Filtering

This is the third post about filtering. In this post we will look at different kinds of filtering.

Filtering is usually a good way to help people work with large data sets. Depending on the kind of data and the goal, there are at least the following kinds of filtering:

Indexing

Indexing, such as tag clouds or alphanumeric filtering, can be used as a simple way for people to narrow down the data set to a more manageable chunk and to focus on what they are interested in. Indexing here means you can filter by only one tag or letter at a time, and it is only really useful if people already know something about the data or what the tag/indexer means.

Multiple Tag Filtering

In Multiple Tag filtering you get to select more than one tag to narrow down the result set. Del.icio.us provides a great example of this kind of filtering.

There is always an implied AND between the tags, meaning that only results that have all the chosen tags will be shown. The Del.icio.us implementation, where you can remove any tag from your filter, lends itself well to exploring the data. In the above example, for instance, you could remove ‘ux’ and then explore what other topics there are articles about.
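
The implied AND can be sketched in a few lines of Python. This is only an illustration and not how Del.icio.us works internally; the bookmarks, the tags and the function names filter_by_tags and related_tags are all made up for the example.

```python
# Hypothetical bookmarks; titles and tags are invented for illustration.
bookmarks = [
    {"title": "Designing tag clouds", "tags": {"ux", "design", "tags"}},
    {"title": "Faceted search patterns", "tags": {"ux", "search"}},
    {"title": "Logo inspiration", "tags": {"design"}},
]


def filter_by_tags(items, selected):
    """Keep only items that carry every selected tag (the implied AND)."""
    selected = set(selected)
    return [item for item in items if selected <= item["tags"]]


def related_tags(items, selected):
    """Tags found on the current results that are not already selected."""
    selected = set(selected)
    found = set()
    for item in filter_by_tags(items, selected):
        found |= item["tags"]
    return sorted(found - selected)


# Two tags selected: only bookmarks that have both remain.
narrow = filter_by_tags(bookmarks, ["ux", "design"])
# Removing 'ux' widens the result set again.
wider = filter_by_tags(bookmarks, ["design"])
```

Calling related_tags(bookmarks, ["design"]) lists the other tags on the remaining results, which is what makes removing a tag a natural way to explore.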

Faceted Navigation

Faceted navigation is one of my own personal favorite kinds of filtering because it encourages exploration and does not require you to know the data very well beforehand. To get faceted search to work it is crucial that the facets only show existing data. For example, if you are searching for MP3 players, the price ranges should be derived from the actual prices and not let you select $1000 when the top price is under $500 (yes, seriously, $500).


Though it is a best practice to only show criteria that will lead to results, you could get faceted search to work with links to empty results as long as you can predict the possible links. For example, if you know that display sizes for MP3 players come in 1.0”, 1.8”, 1.9” and so on, and you don’t currently have any with 1.9” in stock, you could still include the link and let people save that query as an agent. But you should not include 1.1”, 1.2” and so on just because that is the logical numbering. In other words, you should never allow people to get to a meaningless query.
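
To make the idea of deriving facets from the actual data concrete, here is a minimal Python sketch. The catalogue, the $100 bucket size and the helper names facet_counts and price_ranges are assumptions made for the example, not taken from any real shop.

```python
from collections import Counter

# Hypothetical catalogue of MP3 players; all values are invented.
players = [
    {"name": "A", "price": 49, "display": '1.0"'},
    {"name": "B", "price": 129, "display": '1.8"'},
    {"name": "C", "price": 349, "display": '1.8"'},
    {"name": "D", "price": 479, "display": '2.4"'},
]


def facet_counts(items, attribute):
    """Count the values that actually occur, so no facet value is invented."""
    return Counter(item[attribute] for item in items)


def price_ranges(items, step=100):
    """Build price buckets up to the highest real price, skipping empty ones."""
    top = max(item["price"] for item in items)
    ranges = []
    low = 0
    while low <= top:
        high = low + step
        count = sum(1 for item in items if low <= item["price"] < high)
        if count:
            ranges.append((low, high, count))
        low = high
    return ranges


display_facet = facet_counts(players, "display")
price_facet = price_ranges(players)
```

Because the buckets stop at the top price and empty buckets are dropped, nobody can pick a $1000 range when the most expensive player costs under $500. Keeping a known but currently empty value, such as the 1.9” display, would need a separate list of possible values, which this sketch leaves out.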

The strength of faceted navigation is also its weakness. In domains where you don’t know the possible data values, which is often the case in, for instance, computer management, you need a way to allow people to search for possible rather than actual values.

Attribute Based Filtering

Attribute Based Filtering is, in its raw form, really easy to program because it just exposes the database, and is hence seen in way too many applications. It is also really hard to make usable because it normally requires people to not only know the data but also understand AND, OR, and groupings. Having seen enough Development Managers get Boolean queries wrong in Microsoft’s bug database has led me to believe that there really is no reason to expose the horrors of Boolean logic to anyone but maybe programmers.


Interestingly, attribute based filtering is the de facto standard way of making auto playlists in all the media players I have used. In many ways, I think this is a bit lazy and that those applications should really work towards faceted navigation based on your music collection. When I say ‘lazy’ it is because faceted navigation not only requires a design, but also that the application constantly monitors the data set and creates the facets. That requires a lot of programming.

From a design perspective, making faceted navigation work also requires a way to OR criteria together, which is typically absent, e.g. Artist is Dire Straits OR Mark Knopfler. Normally you would only be able to pick one facet from each category, but for playlists to be really valuable you want to be able to pick more than one genre or artist.

Rule Based Filtering

Rule Based filtering is best known from Microsoft Outlook and is a way of making attribute based filtering easier. Essentially rules are pre-made queries where people are only asked to fill out the operands.


If you, for instance, create a rule like “with specific words in the subject”, this gets translated to “with Word1 OR Word2 OR Wordn in the subject”, which means you cannot change the logic, the attributes or the operators.
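
A rule of this kind can be thought of as a query with holes in it. The Python sketch below is my own illustration and has nothing to do with how Outlook implements rules; the function name subject_contains_any, the sample messages and the case-insensitive matching are all assumptions.

```python
def subject_contains_any(words):
    """Pre-made rule: 'with specific words in the subject'.

    The attribute (subject), the operator (contains) and the OR between
    the words are fixed. The caller only supplies the operands.
    Matching is case-insensitive, which is an assumption of this sketch.
    """
    lowered = [word.lower() for word in words]

    def rule(message):
        subject = message["subject"].lower()
        return any(word in subject for word in lowered)

    return rule


# Hypothetical messages for illustration.
messages = [
    {"subject": "Build report for Friday"},
    {"subject": "Lunch?"},
    {"subject": "Weekly status"},
]

rule = subject_contains_any(["report", "status"])
matches = [m["subject"] for m in messages if rule(m)]
```

The person filling in the rule never sees the OR. They only type the words, which is exactly why this style is easier to use and also why it cannot express anything the rule author did not anticipate.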

Rule based filtering is both usable and powerful because it takes away much of the complexity of writing queries by pre-packaging the most likely queries in the right format, and it can cover a large set of different cases. There will, however, always be cases that the pre-canned rules do not cover, and for those you typically need attribute based queries.

Next we will dive into some existing implementations of attribute based filtering.

Getting deep on Filtering

This is the second post in a series on filtering. This is really getting into the gory details of filtering and may seem a little dry. Skip it if it bores you.

Terms and Definitions
  • Search: you start with nothing and get something
    • Think of web search. You start with a blank screen and get some results back
  • Filter: You start with something and get less
    • You have a full collection of music. By setting up a criterion you will see less than everything
  • Find: You start with something and move around in it
    • Try it in Word or in your browser. You find the first instance of your search string but you can still see the rest of the document.

Attributes, Operators and Operands

‘Artist equals NOFX’ follows the normal pattern for a filter criterion on songs. When we break it down into pieces we will use the following terms:

· Attribute: Artist is an attribute, just like Rating, Play count, and Release Year

· Operator: Equals is the operator. Typically filter criteria use operators such as contains, equals, above, below, and between.

· Operand: In the example above, ‘NOFX’ is the operand. Operand is just a fancy word for the value(s) we want to operate on.
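
The three terms map neatly onto a small data structure. The following Python sketch is just one way to model it; the Criterion class, the OPERATORS table and the sample song are hypothetical names and data chosen for the example.

```python
from dataclasses import dataclass
from typing import Any

# Hypothetical operator table, named after the typical operators listed above.
OPERATORS = {
    "equals": lambda value, operand: value == operand,
    "contains": lambda value, operand: operand in value,
    "above": lambda value, operand: value > operand,
    "below": lambda value, operand: value < operand,
    "between": lambda value, operand: operand[0] <= value <= operand[1],
}


@dataclass(frozen=True)
class Criterion:
    attribute: str  # e.g. "Artist"
    operator: str   # a key in OPERATORS
    operand: Any    # the value(s) to operate on

    def matches(self, item):
        """Apply the operator to the item's attribute value and the operand."""
        return OPERATORS[self.operator](item[self.attribute], self.operand)


# Invented song record for illustration.
song = {"Artist": "NOFX", "Rating": 4, "Release Year": 1994}

artist_is_nofx = Criterion("Artist", "equals", "NOFX")
from_the_nineties = Criterion("Release Year", "between", (1990, 1999))
```

Here artist_is_nofx.matches(song) evaluates ‘Artist equals NOFX’ against one song, and the between operator shows an operand that holds more than one value.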

Boolean filtering

At the end of the day, filtering is really about finding the right subset of data, and if you want to get in deep, you should read up on set theory. For this discussion that will not be necessary though. We don’t really care about mathematics and what people could possibly do, we care about making a filtering mechanism that works well for the majority of cases. ‘Works well’ means that it is easy to use and complete enough to cover 95% of the scenarios of the people who use our products.

You will nonetheless need to understand a little about AND, OR and grouping.

If you have a basket of apples, some green, some red, and those apples are of different sizes and weights, you can subdivide the apples into smaller groups by those properties. You could for instance pick up only green apples, or only red apples that are larger than a tennis ball. The latter could be expressed as

Color is ‘red’ AND Size is larger than a tennis ball.

Now say you weren’t looking at apples but bell peppers. They come in small, medium, and large and in Green, Yellow, Orange, and Red. Now if you are like me, you prefer small green ones. But if you are more colorful you may like yellow, orange, red in medium and large. To get those you would use the criteria:

Color is Yellow, Color is Orange, Color is Red, Size is Medium, Size is large.

Now, without any additional information about which criteria go together, you have effectively used an OR:

Color is Yellow OR Color is Orange OR Color is Red OR Size is Medium OR Size is large.

and may end up with small yellows or medium greens. Actually, the only kind you will not end up with is the small green ones, which is good because those would then be left for me :) but you clearly got more than you intended.

However, if you only use AND you will not get any bell peppers at all, which is clearly also not what you intended:

Color is Yellow AND Color is Orange AND Color is Red AND Size is Medium AND Size is large.

What you want is a mix of AND and OR with some parentheses to demarcate which pieces go together:

(Color is Yellow OR Color is Orange OR Color is Red) AND (Size is Medium OR Size is large).

With these criteria in place we have the foundation for all the filtering we need. If you look closer you will see that we have OR between all criteria for the same attribute, parentheses around all the criteria that relate to the same attribute, and AND between the parentheses. This is a core concept which we will dive much more into once we get to the Active Directory Administrative Center filtering: criteria on the same attribute are OR’ed together and grouped; criteria on different attributes are AND’ed.
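
If you want to convince yourself, the three bell pepper queries are easy to try out. This is a small Python sketch with made-up names (only_or, only_and, grouped) that builds one pepper for every combination of color and size and runs each query against the full basket.

```python
from itertools import product

COLORS = ["Green", "Yellow", "Orange", "Red"]
SIZES = ["Small", "Medium", "Large"]

# One pepper for every color and size combination: 12 in total.
peppers = [{"color": c, "size": s} for c, s in product(COLORS, SIZES)]

wanted_colors = {"Yellow", "Orange", "Red"}
wanted_sizes = {"Medium", "Large"}


def only_or(pepper):
    """Every criterion OR'ed together."""
    return pepper["color"] in wanted_colors or pepper["size"] in wanted_sizes


def only_and(pepper):
    """Every criterion AND'ed together; one pepper cannot be three colors."""
    return (all(pepper["color"] == c for c in wanted_colors)
            and all(pepper["size"] == s for s in wanted_sizes))


def grouped(pepper):
    """OR within an attribute, AND between attributes."""
    return pepper["color"] in wanted_colors and pepper["size"] in wanted_sizes


or_hits = [p for p in peppers if only_or(p)]
and_hits = [p for p in peppers if only_and(p)]
grouped_hits = [p for p in peppers if grouped(p)]
```

The OR-only query keeps 11 of the 12 peppers (everything except small green), the AND-only query keeps none, and the grouped query keeps the 6 combinations that were actually wanted.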

Next we will look at different kinds of filtering and after that we will look at some product examples.

On Filtering

I have worked quite a bit with filtering over the past couple of years and I just happen to be an obsessive filterer when it comes to my music collection. As in, I rarely listen to a specific album or artist but to playlists that are made up of genres, ratings and so forth.

This will be the first post in a series on filtering. My main goal is to explain a recent design of an attribute based filter I have been part of for Windows Server 2008 R2 Active Directory Administrative Center. Lovely short name, ain’t it ;-)

First, however, I think we should talk a little about filtering in general, the different kinds of filtering, and some different implementations of query based filtering. Then we will get to the AD Administrative Center design and why we designed it the way we did.

Why Filtering

If you boil it down to the core, we only really deal with a few questions in UI design: ‘find stuff’ and ‘deal with stuff’. Both get harder as there is more and more stuff and the stuff becomes more and more complex. My music collection is some 10,000 songs, which is not even considered large by today’s standards, and each song has so much meta-data that the traditional Windows property pages crumble under the weight. Now try managing a data center, or try to look at events, or figure out the meta-data on an AD object, and the music example will seem trivial in comparison.

But music makes good example data for talking about filtering because it has built-in natural groupings such as artist, album, year, and genre, and most music players allow you to add to that meta-data with, for example, ratings or moods. In that way music has many of the same characteristics as the data we deal with in the IT management space.

I split my music into Indie Rock, Alternative Rock, Y’alternative, Hard Rock etc. and make playlists with criteria like

Hard rock, Heavy Rock, Alternative, unrated or with a rating over 3 and not marked for not playing.

If we write that out it becomes

(Genre = Hard rock OR Heavy Rock OR Alternative)
AND (Rating is above 3 OR Rating is empty)
AND Comments does not contain ‘don’t play’
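
Written as code, the playlist criteria could look something like the Python sketch below. The song records and the function name in_playlist are invented for the example, and an unrated song is assumed to be stored with a rating of None.

```python
# Hypothetical song records; an unrated song has rating None.
songs = [
    {"title": "Song A", "genre": "Hard rock", "rating": 5, "comments": ""},
    {"title": "Song B", "genre": "Alternative", "rating": None, "comments": ""},
    {"title": "Song C", "genre": "Heavy Rock", "rating": 2, "comments": ""},
    {"title": "Song D", "genre": "Hard rock", "rating": 4, "comments": "don't play"},
    {"title": "Song E", "genre": "Indie Rock", "rating": 5, "comments": ""},
]


def in_playlist(song):
    """Genre group AND rating group AND not marked 'don't play'."""
    genre_ok = song["genre"] in {"Hard rock", "Heavy Rock", "Alternative"}
    rating_ok = song["rating"] is None or song["rating"] > 3
    allowed = "don't play" not in song["comments"]
    return genre_ok and rating_ok and allowed


playlist = [song["title"] for song in songs if in_playlist(song)]
```

Only Song A and Song B survive: C is rated too low, D is marked as not to be played, and E is in the wrong genre.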

This query is not unlike queries we will find in the IT management space, in ERP, or anywhere else where we are dealing with large data sets of structured data.

Next, let’s discuss the details of filtering and Boolean logic (Getting deep on filtering). If you know the stuff or really don’t care, you can skip that and go straight to the different kinds of filtering.

Thursday, February 26, 2009

A Good Time to Quit?

We just got a new person on our team. He had been at Microsoft before but left quite some years ago to go travel for a couple of years. Now travelling had become tiresome and he wanted to settle down again for a bit, and so he got back to work.

Got me thinking… is now really the perfect time to quit and go travelling or work on a new business idea, at least assuming you have the funds? Or should you wait? Think about it. Everything is cheaper right now. Airfares are dirt cheap as the carriers are desperate for business, hotels are dying for business, and stores seem to be inventing new anniversaries that can give them an excuse for a sale. Maybe now is the best time in years to quit for a while, enjoy life, and then get back to work when the economy is picking up again.

Thursday, February 12, 2009

It is you, not me… A Great TED talk

http://www.ted.com/talks/elizabeth_gilbert_on_genius.html

There is nothing scarier than the beginning of a creative process where the canvas is blank and everything is possible. Will you succeed again, will you deliver the desired outcome? Elizabeth Gilbert talks about Muses or similar concepts. Before the Renaissance, people believed that creativity came from without, not within. From Muses, genies, spirits. That takes the pressure off you and puts it onto an external force. I like that idea. I also think it has a lot to do with the environment you are in, the people you are working with and so forth. But then again, you could think about those as the spirits :)