Some time ago Google released versions 3.5 and 3.6 of its picture management software Picasa. These versions provide a local face recognition feature that detects faces in pictures and groups similar faces, which can then be assigned to persons. This does not require uploading your images to any of their servers. As I very much prefer to decide for myself where pictures of me and my family end up, and I assume the same holds true for most of my friends, this seems to be a good way to finally get some (more) structure into my photo collection. Unfortunately the official Linux version is still 3.0 and lacks this feature.
For a long time I did not bother to try any later version on Linux. But this weekend, while I wanted to sort some holiday pictures, I started Picasa while my wifi connection was down. Looking into the empty local folder where the pictures from the NAS are usually mounted, Picasa decided to drop all its knowledge about the pictures. Once the connection was up again, it started scanning all the pictures from scratch. ARGHHH. That seemed to be a good point in time to change something in the setup.
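A small guard in front of Picasa could avoid this kind of accident. The following is only a minimal sketch, assuming that an empty or missing picture folder means the NAS share is not mounted; the function names `pictures_available` and `launch_if_safe` are made up for illustration and are not part of Picasa.

```python
import os
import subprocess


def pictures_available(folder):
    """Return True only if the folder exists and has at least one entry."""
    try:
        with os.scandir(folder) as entries:
            return next(entries, None) is not None
    except OSError:
        # Missing folder, stale mount, permission problem: treat as unavailable.
        return False


def launch_if_safe(folder, command):
    """Start the given command only when the picture folder looks mounted."""
    if not pictures_available(folder):
        print(f"{folder} is empty or missing; not starting Picasa.")
        return False
    subprocess.call(command)
    return True
```

It could be called as `launch_if_safe("/mnt/nas/pictures", ["wine", "Picasa3.exe"])`, where both the mount point and the start command are hypothetical and depend on the local installation.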
Up to now I am still not aware of a good alternative to Picasa that overcomes its major issues, namely the lack of support for a multi-user/single-repository setup and being a Windows application. Suggestions are welcome. So I just gave the current Windows version, which is 3.6, a try. As the official Linux version comes with an embedded Wine, it should be possible to get the current Windows version running under Wine as well. And bingo: under both Ubuntu 9.10 and 10.04 the standard wine package runs Picasa without any hassle. That was easy, although some of the nice features, namely geo-tagging, do not work. I decided to run Picasa on the NAS itself (Atom 330 based) to circumvent the network/multi-user issues by simply having a shared account on that machine. The performance of Picasa, which still tends to go to sleep for a while now and then, is comparable to the earlier setup. It seems that the saved network latency and the smaller computing power balance each other out.
It took roughly two days to scan our complete picture collection. Once in a while we dropped in on the process and started to create persons and match face groups to them. It went quite well. I must say that our picture collection certainly contains quite a few challenges, like countless pictures of the first four years of my son's life. We ended up with a lot of distinct face groups for the same persons, but this is manageable. The number of false positives is quite low. In the end we still have more than 5000 faces which are not matched yet. That is quite some work, but most of them are faces of unknown people somewhere in a crowd. It really works nicely. We also got some quite funny detections, like a Playmobil figure being recognized as a face, or my son's face being detected in a picture which at first glance only contains the Christmas tree in the living room; on closer inspection there is also a small picture of my son in that room.
Once we have finished the job, I really want to work on extracting the information from the Picasa database/files and getting it into the pictures themselves to prevent vendor lock-in. There seems to be a tool for this purpose: AvPicFaceXmpTagger. I will give it a try once the face recognition process is finished.
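To get a feeling for what such an extraction involves, here is a rough sketch of reading face regions from Picasa's per-folder `.picasa.ini` files. Everything about the file layout is an assumption taken from community descriptions, not something verified here: one section per image file, a `faces=` key holding `rect64(<hex>),<contact id>` entries separated by `;`, and the hex value packing left, top, right and bottom as four 16-bit fractions of the image size. The function names are made up, mapping contact IDs to names is left out, and this is not how AvPicFaceXmpTagger works.

```python
import configparser


def decode_rect64(hex_value):
    """Decode an assumed rect64 value into relative (left, top, right, bottom)."""
    # Assumption: leading zeros may have been dropped, so pad to 16 hex digits.
    padded = hex_value.rjust(16, "0")
    return tuple(int(padded[i:i + 4], 16) / 65535 for i in range(0, 16, 4))


def read_faces(ini_text):
    """Return {image name: [(contact id, rectangle), ...]} from ini text."""
    parser = configparser.ConfigParser(
        delimiters=("=",), interpolation=None, strict=False
    )
    parser.optionxform = str  # keep key names exactly as written
    parser.read_string(ini_text)

    faces = {}
    for section in parser.sections():
        raw = parser.get(section, "faces", fallback="")
        entries = []
        for item in filter(None, raw.split(";")):
            rect, _, contact_id = item.partition(",")
            if rect.startswith("rect64(") and rect.endswith(")"):
                entries.append((contact_id, decode_rect64(rect[7:-1])))
        if entries:
            faces[section] = entries
    return faces
```

The rectangles come out as fractions between 0 and 1, so they would still have to be scaled to the pixel size of each image before being written into the picture's metadata.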
There is no free lunch. Picture data becoming more and more structured is great if you want to manage your stuff, but you might not want to provide all this structure once you upload a picture, e.g. to a website. So there is still the task of removing the structured information from the pictures again.
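For JPEG files, one possible way to do this is to drop the metadata segments before uploading. The sketch below is a minimal illustration, not a complete tool: it assumes a well-formed JPEG, removes only the APP1 segments (where EXIF and XMP data live) and APP13 segments (Photoshop/IPTC data), and copies everything else unchanged. The function name `strip_metadata` is made up for this example.

```python
# Marker bytes of the segments to drop: APP1 (EXIF, XMP) and APP13 (IPTC).
DROPPED_MARKERS = {0xE1, 0xED}


def strip_metadata(jpeg):
    """Return a copy of the JPEG bytes without APP1 and APP13 segments."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG file (missing SOI marker)")

    out = bytearray(b"\xff\xd8")
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError("expected a segment marker")
        marker = jpeg[pos + 1]
        if marker == 0xDA:
            # Start of scan: the rest is compressed image data, copy it as is.
            out += jpeg[pos:]
            return bytes(out)
        # The length field counts itself (2 bytes) plus the payload.
        length = int.from_bytes(jpeg[pos + 2:pos + 4], "big")
        if length < 2:
            raise ValueError("invalid segment length")
        if marker not in DROPPED_MARKERS:
            out += jpeg[pos:pos + 2 + length]
        pos += 2 + length
    raise ValueError("no image data found (missing SOS marker)")
```

A typical use would be to read the original with `open(path, "rb").read()`, pass the bytes through `strip_metadata` and write the result to a separate upload copy, so the tagged original stays untouched.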