Taxonomy NCO AKA 'Matchy Matchy'

We've been working on an interesting new system over the past few years in the field of semantics.

It all started out with a big problem at the School of Everything. People were searching for music teachers and not finding people that had tagged their teaching profiles with piano. People looking for Martial Arts weren't finding teachers tagged with Tai Chi. The freetagging subjects vocabulary was getting out of control and the community was fragmenting as it expanded.

I did a bit of research and found some interesting academic work around NCO (Normalised Co-Occurrence) analysis. This uses set theory to guess what the semantic distance is between any pair of terms in a freetagging vocabulary.
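To give a rough feel for the idea, here is a minimal sketch in Python. It is not the code from our module: the exact normalisation isn't spelled out in this post, so the sketch assumes a simple Jaccard-style measure (profiles using both tags divided by profiles using either), and the names `nco_scores` and `profiles` are made up for illustration.

```python
from collections import defaultdict
from itertools import combinations


def nco_scores(profiles):
    """Score every pair of tags that co-occur on at least one profile.

    `profiles` is an iterable of tag sets. The score is the number of
    profiles carrying both tags divided by the number carrying either
    (an assumed Jaccard-style normalisation), so it runs from just
    above 0 up to 1 for tags that always appear together.
    """
    tagged = defaultdict(set)  # tag -> indices of profiles that use it
    for idx, tags in enumerate(profiles):
        for tag in tags:
            tagged[tag].add(idx)

    scores = {}
    for a, b in combinations(sorted(tagged), 2):
        both = len(tagged[a] & tagged[b])
        if both:  # pairs that never co-occur are left out
            either = len(tagged[a] | tagged[b])
            scores[(a, b)] = both / either
    return scores


# Hypothetical teaching profiles, each a set of free tags.
profiles = [
    {"music", "piano"},
    {"music", "piano", "guitar"},
    {"music", "guitar"},
    {"martial arts", "tai chi"},
]
scores = nco_scores(profiles)
```

A higher score means the two terms are semantically closer, so in this toy data "piano" ends up near "music" and nowhere near "tai chi".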

One thing led to another and we ended up releasing a Drupal module which sorts it all out rather nicely:

This gives people nice 'you may also be interested in' recommendations, and allows us to serve up piano teachers on the music page. In addition, my colleague Peter Brownell at School of Everything has built a clustering system on top of the NCO engine which lets us tell people what the big subjects are where you are at any point in time. As the community gets into something new, it bubbles up through the NCO and eventually becomes a main cluster if enough people are interested in it. This drives the life of the community, and has transformed the fragmentation issue into a major new driver.

I did a presentation on this at the Guardian recently, and it's now up on SlideShare:

Anyone interested in this kind of stuff, especially SKOS folk, is very welcome to get in contact with us. We're keen to extend this project out through the community now.