KDE’s semantic desktop: Nepomuk vs. Baloo

One of the most disliked features of the early KDE SC 4 releases was the developers' attempt to establish the semantic desktop. The tools to further this goal are Nepomuk and Akonadi. While Nepomuk tries to interconnect metadata from different desktop applications, Akonadi is a service that stores and retrieves data from PIM applications such as mail, calendar and contacts. Together, they pave the way for users to find data across different file formats, structured and connected by tags, ratings and comments. On top of that, Strigi performs the indexing that enables users to find data with simple search terms in KDE's file manager, Dolphin.

In the background, the Soprano framework acts as a repository for the information generated by Nepomuk. This information is stored in the Resource Description Framework (RDF) format, a fundamental standard for describing and modeling information in the semantic web.

With RDF, we are closing in on the crux of the matter that made Nepomuk hard to manage for many users. RDF seemed like the right tool for the job, but it proved to be oversized for use on desktop systems. The initial indexing process hogged RAM and CPU resources to the point where a lot of PCs came to a grinding halt if the hardware involved was not of the latest make. Even on up-to-date hardware, the initial run could take days, depending on the amount of data to crawl. So a lot of users turned Nepomuk off, which in turn deprived the developers of much-needed testing and bug reports. Things got better as KDE SC 4 moved along, but never to the point where developers or users were happy. At some point late in the KDE 4 cycle, development of Nepomuk was exhausted, and the resource usage was still not appropriate.

With KDE SC 4.13, a new tool moves in and takes Nepomuk's place. Baloo was developed using a lot of Nepomuk code, but leaves RDF out of the picture. Instead of using a central database, it stores data in a decentralized fashion, in plugins. At its core there are three services: Data Stores, Search Stores and Relations. A Data Store stores data permanently. A Search Store is a plugin that offers search capabilities for a specific kind of data. So far, three search stores are implemented as plugins: File Search, Email Search, and Contact Search. The data provided by the three services is stored using a combination of SQLite and Xapian.

Relations can be defined as ties between two uniquely identifiable items. For example, relations can exist between PIM data, still provided by Akonadi, and data from files that Baloo has in storage. There can be different types of relations. A TagRelation might map a tag to a unique identifier, or an ActionRelation could map a file received from a device to the device itself.

Besides the search mask in Dolphin, there is also a new graphical interface for performing your searches. It is a Plasma widget called Milou, which is not yet officially part of KDE SC. Eventually, Milou is also supposed to take the place of KRunner, equipped with a new search library in the background by the name of Sprinter. Plasma developer Aaron Seigo has connected the bits and pieces on this for us in his blog post.

When I first upgraded to KDE 4.13, I took some precautions to ensure a smooth move from Nepomuk to Baloo with regard to the databases that already existed in the background. If you never used Nepomuk, you will not need to go through this. Let's look at System Settings for a start. There you will find an icon called Desktop Search. Next to it you might, depending on the packaging of your distribution, see a similar icon called Desktop Search Advanced. If the second one is missing, you can try to install baloo-kcmadv, which should provide the alternative interface. At the time of writing, no decision has been made on which interface will survive, but let's hope for the advanced one, as it gives the user more ways of configuring what gets indexed. Make sure indexing is deactivated for now.

When installing KDE 4.13 or updating to it, existing Nepomuk databases are supposed to be migrated to Baloo's format. That does not work reliably for everyone yet, but you can easily check whether it worked for you. If the migration went smoothly, the file ~/.kde4/share/config/nepomukserverrc starts with:

[Baloo]
migrated=true
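
If you would rather not open the file by hand, the check can be sketched as a small shell function. The name baloo_migrated is made up for illustration, and the sketch assumes the usual KDE config layout, with the [Baloo] group header on its own line and the migrated=true key somewhere inside that group:

```shell
#!/bin/sh
# Hypothetical helper, not part of KDE: succeeds (exit status 0) if the
# given config file has migrated=true inside its [Baloo] group.
# Assumption: group headers sit on their own line, as in other KDE rc files.
baloo_migrated() {
    rc_file="${1:-$HOME/.kde4/share/config/nepomukserverrc}"
    # Status 2 means the file is missing or unreadable.
    [ -r "$rc_file" ] || return 2
    awk '
        /^\[/ { in_baloo = ($0 == "[Baloo]") }   # remember which group we are in
        in_baloo && /^migrated[ \t]*=[ \t]*true[ \t]*$/ { found = 1 }
        END { exit !found }                      # 0 if the key was seen, else 1
    ' "$rc_file"
}
```

Calling `baloo_migrated && echo "already migrated"` prints the message only when the flag is set; any other exit status means the migration has not been recorded.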

If that is not the case, you can manually start the process. First check if Nepomuk is running:

$ nepomukctl status

Towards the bottom of the output you should see the line:

Nepomuk Server is running

If that's not the case, start it with:

$ nepomukctl start

followed by:

$ nepomukbaloomigrator

After this script has done its magic, you should find the migrated and converted data under either ~/.kde/share/apps/baloo or ~/.local/share/baloo, depending on your distribution.
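
The manual steps above can be strung together roughly as follows. This is only a sketch: the function names are invented, it relies on nothing beyond the commands and the status line quoted above, and it assumes that nepomukctl start returns once the server is up, which may not hold on every system.

```shell
#!/bin/sh
# Sketch only: the function names are illustrative and not part of KDE.

# Start Nepomuk if "nepomukctl status" does not report it as running,
# then run the migrator.
run_baloo_migration() {
    if ! nepomukctl status 2>&1 | grep -q 'Nepomuk Server is running'; then
        echo "Nepomuk is not running, starting it..." >&2
        nepomukctl start || return 1
    fi
    nepomukbaloomigrator
}

# Print the first of the two candidate locations that exists and is not empty.
# The optional argument replaces $HOME, which is handy for a dry run.
find_baloo_data() {
    base="${1:-$HOME}"
    for dir in "$base/.kde/share/apps/baloo" "$base/.local/share/baloo"; do
        if [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]; then
            echo "$dir"
            return 0
        fi
    done
    return 1
}
```

Running `run_baloo_migration; find_baloo_data || echo "no Baloo data found"` performs the migration and then reports where the converted data ended up, if anywhere.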

Should the script also fail to do the job, and there is no database under one of the two possible locations mentioned above, you can go to ~/.kde4/share/apps/nepomuk/repository/main/data/virtuosobackend/ and delete the files in there to make a clean cut. 
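
If deleting outright feels too drastic, a more cautious variant is to move the files aside first. This is my own suggestion rather than an official procedure; the function name and the .bak naming scheme are made up, and the sketch relies on the -mindepth and -maxdepth options of GNU find:

```shell
#!/bin/sh
# Hypothetical helper: move the old Virtuoso files into a timestamped
# sibling directory instead of deleting them, then print that directory.
stash_nepomuk_store() {
    store="${1:-$HOME/.kde4/share/apps/nepomuk/repository/main/data/virtuosobackend}"
    [ -d "$store" ] || { echo "no such directory: $store" >&2; return 1; }
    backup="${store%/}.bak.$(date +%Y%m%d%H%M%S)"
    mkdir "$backup" || return 1
    # Move every entry directly inside the store, hidden files included.
    find "$store" -mindepth 1 -maxdepth 1 -exec mv {} "$backup"/ \;
    echo "$backup"
}
```

Once Baloo has proven itself, the backup directory printed by the function can be removed by hand.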

Now go back into System Settings - Desktop Search Advanced, and enable the file indexer. You will hopefully not be bothered by any signs of the activity running in the background. Baloo is nowhere near as resource-hungry as its predecessor, and works fast and reliably for me. Over the past weeks, I have read complaints that it, too, was slow and a memory hog. I tested Baloo on a couple of machines and cannot confirm that.

One of the users having problems with Baloo ran KDE on a machine with one GB of RAM. KDE will run on that kind of hardware, but it will be no fun and you would have to turn indexing off. My recommendation is to have at least 4 GB of RAM with KDE. This will give you a snappy desktop environment, and Baloo will run smoothly in the background. As KDE SC moves on to KF5, Baloo will be ported to Qt 5. Existing databases will stay compatible between versions. Nepomuk has reached its designated end of life and will not be ported. It will vanish from your computers as you upgrade to KF5.

Ferdinand Thommes

I live as a Linux developer, technical author and city guide in Berlin, Germany, and Charleston, S.C. Other than being nerdy, I dig riding bicycles and love cooking and good literature.

13 thoughts on “KDE’s semantic desktop: Nepomuk vs. Baloo”

  1. I've had a much better experience with Baloo compared to Nepomuk, to the point where I actually find it quite useful and a killer feature of Plasma.

    • I wouldn't quite say it's a killer feature of Plasma yet. For example the widget Milou is not particularly good at finding things and needs more work and more features. If I enter a string in KRunner or in the launcher Search box, the results returned by Baloo are better than if I use Milou. Why that is, I don't know. (And yes, I do increase the size of the Milou display pane.) But I do agree that Baloo is a much better experience than Nepomuk, and, like you, I'm actually finding it quite useful.

      One thing that Baloo needs -- perhaps it does have this feature, but it's not obvious how to do it -- is the ability to show the path of a file it finds, or open a Dolphin window displaying the file in the directory it resides. The path is not listed, just the file name, and I often find it difficult to find where the file is, even if I can open it from Baloo.

  2. While I have 16 gig RAM here, any memory/cpu/disk usage for functionality I don't need nor want is too much.

    Fortunately as a gentooer I have the luxury of configuring kde4 with USE=-semantic-desktop at build-time, so I don't have the stuff (substitute an appropriate 4-letter word if you wish) even installed for my kde4 desktop. Of course that means no kdepim/kmail these days, but I found claws-mail works well for my needs... and doesn't lose messages and need a database reset to get them back, like kmail was doing once it got akonadified. I do still have to install strigi as various kde bits I do use won't build without its headers, but without a backend and without nepomuk/soprano/baloo/virtuoso/akonadi/etc, that's pretty much all it is, some required headers to build against.

    kde-frameworks/kde5 is supposed to be even more modular so hopefully it'll be even easier to kill the semantic-desktop stuff there, tho milou replacing krunner doesn't exactly sound promising. I did try kde-frameworks-5/plasma-5-rc for a bit and installed milou/baloo/etc temporarily there, but with a bit of time I'd expect to be able to remove them. Should that not be the case, there's plenty of alternative desktops around and that would very likely be just the excuse I needed to do some serious desktop shopping, after running a kde desktop since the kde2 era. Time will tell, of course.

  3. Hmm, what is this semantic desktop good for? I don't really understand how it's supposed to be used. I arrange my stuff in folders to find it, and if I don't I use locate, and move it over in the first place I looked :)

    • I use KDE both at work and home, and I've never found a use for this "semantic desktop" business. I just end up disabling it and going on my merry way. I wonder if anyone outside the KDE dev community actually uses that feature.

    • Heh, I don't even use locate, for much the same reason -- I found that its database-update took time and resources better used for other things, and I never used it anyway, so...

      ls/grep and mc's find, plus a working understanding of computers and how to use directories for organization, seems to be enough for me.

      I think the idea with semantic-desktop is the same one that makes gmail so popular. Just toss everything in the same big directory/folder heap and use tags and indexed search for sorting. But that tends to repulse a lot of computer literate folks the same way the same big-pile-in-the-center-of-the-floor clothes storage method tends to repulse moms of teenage boys, who don't need help finding a particular dress and matching accessories they have in mind because they know exactly where the dress is in their closet and exactly where and in what drawers they put the accessories as well. And if they don't, they have a good enough idea of the places to look that they can still find it faster than they could go look it up in some indexer, if they had one. And they expect the same of their kid, who happens to have other ideas! =:^)

  4. Nepomuk and baloo and all related to it yet and in the future is crap. It's so crappy and we are required to want and to have it.

    I have all my files and folders, etc, where I want it. No mess, all well structured and easy to find even with a lot of external and internal hard disk drives.

    I don't have to use any kind of 'search', never had to. No locate, no nepomuk, no baloo, no windows search, ... My hard disk drives grow and grow in numbers from year to year, still no chaos.
    That's true for almost 2 decades, and still counting.

    Users like myself get punished severely with semantic desktop bullshit; precious hours to murder, delete, deactivate, re-link to /dev/{null,true}. So damned buggy and awfully wrong.

    The KDE devs are getting on my nerves,...
    Baloo developer and maintainer Vishesh Handa: have a look at his comments at bugs.kde.org.
    Aaron Seigo for 'promoting' such bullshit and marketing it as 'soooo special and important,...'.

    You 2 are sooooo wrong.

    Please, make semantic desktop only OPTIONAL, no hard dependencies.

    • I agree, my experience with Baloo had it thrashing my disk and making my entire system unresponsive and involved a frustrating few hours trying to find out how to disable it.

      Honestly I see most people disabling Baloo just like they did with nepomuk. I see that as a good thing because then maybe the KDE developers will give up on this crap.

      Just like lebrown I keep my files in order and never have to search for anything. This makes the KDE developers' insistence on making it hard to disable these bloated features frustrating and I have been seriously considering other desktop options!!

  5. I was a really heavy Kmail user in the 3.x era (to the tune of 70GiB of e-mail). Then the semantic desktop fad arrived, and I was unable to have things like two inboxes (I had four), and Kmail turned into an unuseable mess. And the Akonadi/Nepomuk thing was impossible to walk around. My desktop became clogged by the search-engine-in-a-desktop thing to the point I saw no reason to use KDE or Linux anymore as my main environment. Years pass on, and today I see people discussing search-engine-in-a-desktop-thing v2.0. Oh my. Why for 's sake KDE is still beating this dead horse? Not that the technical achievement of writing a search engine is no feat, but force-feeding users with something not suited for the vast majority of daily use-cases (people don't tend to spend the days searching their own files or indexing and indexing) is an ill-advised thing to do with an ill-advised tech not suitable to desktop use, that will alienate its user base. OS X has a similar but better implemented system, but still of minimal usefulness. The same about Windows. Please KDE developers, let's focus on features that matter instead of bringing more pain to your userbase. I even was thinking the semantic-desktop-thing was a plot from MS to discredit Linux as a viable desktop option.

    • You mentioned MS Windows. That got me thinking back to before the turn of the century when I was a volunteer and public beta tester on IE4-5.5 and the related MSIE newsgroups (by IE6 I was switching to Linux so while I ran it before I finally switched, I didn't do the beta).

      Back then MS Office had some sort of indexer, and people used to complain about how Office slowed their system down. One of the tricks the experts knew was to turn off this indexer, which was set to start with the desktop by default, and suddenly people found their computer got most of its speed back! =:^)

      One might think we would have taken the lesson from that and not had to go thru it again on Linux, but here it is over a decade and a half later, and the first thing the kde experts are doing and recommending to others on a kde system is turning off the file indexer to try to get some speed back. Sad, but it works today just as it did back then. =:^)

      Of course the better option, where possible, is to turn the stuff off at build-time, so it doesn't even get built and installed. That's what I've been doing on kde4 on gentoo since the kde 4.7 era, when I gave up on the akonadified kmail and switched to something else (claws-mail, as I said earlier), so I /could/ turn off the stuff, since akonadi depended on at least part of it being operational.

      Now they're talking about replacing krunner with a baloo-ified thing in plasma5. I hope not, and that at minimum, krunner or some good other non-balooified option remains. Because due to various issues during kde4 I'm running a lot less kde now than I did in the kde3 era, and should I have to jettison krunner and plasma as a result of balooification, what little of kde that would still be around wouldn't be worth maintaining, so I'd probably be off of kde entirely.

  6. This is very common these days where we pick a side like a sport team and defend it. Baloo doesn't work well on some very common high end systems. It works for you. Congratulations! I have an 8 core 3+ GHz box with 16GB of RAM and 1.5TB Raid Array. I have to turn off Baloo when I want to use the machine and then turn it back on when I am not using it in the hope that it will eventually settle down. Otherwise my machine is unusable for days after a reinstall.

    Unix used to be about being simple and elegant and providing choices of tools. The modern corruption of this has led to Windows/Mac thinking of "everyone wants this!" and "it works on my machine, so ship it!" Things are developed all the time that don't take into account that I might be logged on in two places or that my X server might not be my PC system console. Why? Because that's not how Windows works.

    Shame on developers (and I am a developer) that force us to use their software, especially if they would rather defend it with a fan boy attitude than fix problems. At least now it is easier to turn off, but even that was difficult to do.

    Nepomuk, baloo, the installer, all the opaque software that is being created reminds me of why I left Windows in the first place. Give us control. Give us visibility. Give us choices. I can pick a file system, a desktop environment, and a shell. Why do I want to get married to a file indexer?

  7. "One of the users having problems with Baloo ran KDE on a machine with one GB of RAM"

    Yes, only GTK free Distros like KaOS run smooth with 1 GB RAM.
