Archive for Documentation

FAQ

Would you be interested in doing custom development in the future?

Definitely, especially during these first releases. The API is stabilizing and, as usual, a lot of user feedback on how things should work is needed.

How can I be notified of new information regarding the future of this project?

All news and improvements are posted at http://server.imgseek.net, which you can subscribe to with an RSS reader or by email. User and development issues are discussed on the official mailing lists. Also, please don’t hesitate to ask for more details if anything is unclear.

How about an approach where we color-index images of our used cars (say we have images of the cars, but not their colors)? We would like our users to be able to find cars in similar colours. Do you think isk-daemon can be used to index ~50,000 car photos into 10-20 major color categories (black, grey, brown, green …)?

For such a technology to give good results, it would need to recognize, or at least reasonably approximate, which part of a typical car photo is the car and which is not. Backgrounds and other image features often have colors that contrast with the actual car color, and they would otherwise distort the result.

The application programming interface (API) for interacting with isk-daemon doesn’t currently support this type of query, but it is on the roadmap for the coming months. Please leave an email address or subscribe to the RSS feed to stay updated on progress.
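Until such a query mode exists, a frontend could approximate colour categories on its own. The following is only a rough sketch and not part of isk-daemon: it assumes the car fills the central half of the photo, and the palette names and RGB anchor values are illustrative placeholders.

```python
# Hypothetical palette: category names and RGB anchors are illustrative assumptions.
PALETTE = {
    "black": (0, 0, 0),
    "grey": (128, 128, 128),
    "white": (255, 255, 255),
    "red": (200, 30, 30),
    "green": (30, 140, 60),
    "blue": (30, 60, 180),
    "brown": (120, 75, 40),
}

def center_crop(rows, fraction=0.5):
    """Keep the central part of a 2D pixel grid, where the car is assumed to be."""
    height, width = len(rows), len(rows[0])
    dh = int(height * (1 - fraction) / 2)
    dw = int(width * (1 - fraction) / 2)
    return [row[dw:width - dw] for row in rows[dh:height - dh]]

def average_color(pixels):
    """Mean RGB of an iterable of (r, g, b) tuples."""
    pixels = list(pixels)
    return tuple(sum(p[c] for p in pixels) / len(pixels) for c in range(3))

def nearest_category(rgb, palette=PALETTE):
    """Name of the palette entry with the smallest squared RGB distance."""
    return min(
        palette,
        key=lambda name: sum((a - b) ** 2 for a, b in zip(rgb, palette[name])),
    )

def classify(rows):
    """Classify a photo (grid of RGB tuples) by the average colour of its centre."""
    crop = center_crop(rows)
    return nearest_category(average_color(p for row in crop for p in row))
```

A fixed central crop is a crude stand-in for actually locating the car, so results on photos with off-centre cars or busy backgrounds would likely be poor.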

Will isk-daemon find ‘similar’ images and pass back a ‘list’ of image files (filenames)?

The isk-daemon image server returns a list of image IDs, not filenames. These image IDs (unsigned long integers) are the same ones you supplied to isk-daemon when you added the images to its database.

Can I get a hash or array from isk-daemon?

Yes, you can. The result of a similarity query is either an XML document describing the similar images or an array encoded with the RPC technology you choose (SOAP, XML-RPC), so you can transform it into an array or similar data structure that is native to your web frontend code.
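As an illustration of that transformation, here is a minimal Python sketch using only the standard library. The response is built locally instead of coming from a server, and its shape (a list of [id, score] pairs) is an assumption made for the example.

```python
import xmlrpc.client

# Stand-in for the XML a server would send back; the list-of-[id, score]-pairs
# shape is an assumption made for this sketch.
response_xml = xmlrpc.client.dumps(([[2, 0.93], [7, 0.81]],), methodresponse=True)

def parse_similarity_response(xml_text):
    """Decode an XML-RPC response into a list of (image_id, score) tuples."""
    params, _method = xmlrpc.client.loads(xml_text)
    return [(int(image_id), float(score)) for image_id, score in params[0]]

matches = parse_similarity_response(response_xml)  # native list (array)
scores_by_id = dict(matches)                       # native hash keyed by image id
```

In normal use an XML-RPC client library performs this decoding for you, so the frontend code only ever sees the native list.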

What are the advantages over GIFT?

We believe the main advantages over GIFT are the quality of the query results, the speed (isk-daemon has a highly optimized C++ implementation with query algorithms based on wavelets, a very different and simpler approach than the one GIFT takes) and, finally, the simplicity of integrating it into your existing web application.

How accurate is it? How can it be made more accurate and how easily?

It’s hard to measure its accuracy, but results are quite good. We should test it on a real database and system to see if it’s good enough for your needs.

Does it get more accurate with the more images you put into a database?

Yes.

Do large image databases make it slow down?

See the Scalability section.

What is the order of image matching within the multiresolution wavelet decomposition (colour, shape…)?

All these factors are taken into account at the same time, but the importance of shape versus the importance of colour could be fine-tuned. This option is to be added to the API in upcoming versions.

Is this matching information code compatible with other software, or could it be made so?

The matching information code (that is, the image fingerprint computed and stored in the image similarity database when a new image is added) is not compatible with other systems, because isk-daemon uses its own image similarity algorithm.

How can I improve content-matching for the typical images I’m working with?

There is no simple answer. These are some of the things you could do to get better matches for your specific problem:
1) tweak the internal weights used in the Haar wavelet decomposition and in index/signature building
2) pre-process your target and training images to remove background noise: use face detection and image segmentation techniques to separate what is a face from what is not, and supply isk-daemon only with pre-processed versions of the images in which only faces appear

Regarding the first approach, the weights/parameters you would most likely need to change are

const float weights[2][6][3]

which can be found at https://imgseek.svn.sourceforge.net/svnroot/imgseek/net.imgseek.imgdb/trunk/src/net/imgseek/imgdb/core/imgdb.h

You may try running experiments against a control group of photos to see which variations of these weights yield a lower rate of classification errors.

To do so you would need to do some refactoring in order to expose these weights through the imgdb.h API so they can be varied dynamically from your calling code.

Source code for the image similarity processing algorithms can be found at https://imgseek.svn.sourceforge.net/svnroot/imgseek/net.imgseek.imgdb/trunk/src/net/imgseek/imgdb/core/

When compiled, this code generates dynamic libraries with bindings for Python and Java.

Where is the user interface showing thumbnails?

isk-daemon does not come with an end-user interface for querying similar images. You will need some PHP coding skills to build a working web page that calls isk-daemon, parses the query reply (a list of image ids and similarity ratios) and, from the returned image ids, constructs an HTML page showing the image thumbnails associated with these ids.

Please take a look at the sample PHP code, which queries isk-daemon for images similar to the one with id “1” and prints the resulting image ids on the web page. You would need to adapt it to generate the HTML for an image gallery browser, with <img> tags whose “src” attributes point to the URLs where the images associated with these IDs are served to the end user.
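For readers who are not using PHP, the same kind of query could be sketched in Python roughly as follows. The endpoint URL, the queryImgID method name and its argument order are assumptions here, so check them against the API of your installation before relying on them.

```python
import xmlrpc.client

# Assumed endpoint; adjust host, port and path to your installation.
ISK_URL = "http://localhost:31128/RPC"

def query_similar(server, db_id, image_id, limit=10):
    """Ask the server for images similar to image_id.

    `server` is any object exposing the (assumed) queryImgID method, for
    example an xmlrpc.client.ServerProxy. Returns (image_id, score) pairs.
    """
    rows = server.queryImgID(db_id, image_id, limit)
    return [(int(found_id), float(score)) for found_id, score in rows]

def print_similar(server, db_id=1, image_id=1):
    """Print the ids of the images most similar to the given one."""
    for found_id, score in query_similar(server, db_id, image_id):
        print(f"image {found_id}: similarity {score:.2f}")

# Typical use (needs a running daemon):
#   print_similar(xmlrpc.client.ServerProxy(ISK_URL))
```

Passing the server object in as a parameter keeps the query logic easy to test without a running daemon.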

isk-daemon does not keep track of image filenames or of the pixel data associated with a given image id. Your frontend web system should know how to generate a page containing thumbnails, and the server path where the images associated with these IDs are stored.

isk-daemon was built this way in order to be flexible and to integrate with any image-related web site. Most of them already have a database associating their internal image ids with image metadata such as filenames, so there is no point in duplicating this data inside isk-daemon.
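A minimal sketch of that frontend step, in Python, might look like this. The thumbnails dictionary is a hypothetical stand-in for a lookup in your own database, and the (id, score) result shape is assumed.

```python
from html import escape

def build_gallery(results, url_for_id):
    """Render (image_id, score) pairs as <img> tags.

    url_for_id maps an image id to the thumbnail URL known to the frontend.
    """
    tags = []
    for image_id, score in results:
        url = url_for_id.get(image_id)
        if url is None:  # id unknown to the frontend database: skip it
            continue
        tags.append(
            f'<img src="{escape(url, quote=True)}" '
            f'alt="image {image_id}" title="similarity {score:.2f}">'
        )
    return "\n".join(tags)

# Hypothetical frontend lookup table; in practice this is a database query.
thumbnails = {2: "/thumbs/2.jpg", 7: "/thumbs/7.jpg"}
gallery_html = build_gallery([(2, 0.93), (7, 0.81), (9, 0.40)], thumbnails)
```

Ids that the frontend does not know about are skipped, which covers the case where an image was removed from the site but is still present in the similarity database.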

How exactly does the similarity work? What about the wavelet and image processing technology behind imgSeek and isk-daemon?

All metric and query ideas involving 2D wavelet transforms are based on the paper Fast Multiresolution Image Querying by Charles E. Jacobs, Adam Finkelstein and David H. Salesin, which is available at http://grail.cs.washington.edu/projects/query/

The C++ code in imgdb (part of imgSeek and isk-daemon) is almost a direct implementation of the ideas in this paper.
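To give a feel for the approach, here is a simplified single-channel sketch in Python of the signature and scoring scheme described in the paper. It is not the imgdb code: the weights are placeholders, the input is assumed to be a small square grid whose side is a power of two, and colour space conversion and the coefficient buckets used for fast lookup are left out.

```python
import math

def haar_1d(values):
    """Full 1D Haar decomposition of a list whose length is a power of two."""
    out = list(values)
    n = len(out)
    while n > 1:
        half = n // 2
        sums = [(out[2 * i] + out[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        diffs = [(out[2 * i] - out[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        out[:n] = sums + diffs
        n = half
    return out

def haar_2d(channel):
    """Standard decomposition: transform every row, then every column."""
    rows = [haar_1d(row) for row in channel]
    cols = [haar_1d(col) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]

def signature(channel, keep=4):
    """Scaled average plus the signs of the `keep` largest-magnitude coefficients."""
    coeffs = haar_2d(channel)
    size = len(coeffs)
    positions = [(i, j) for i in range(size) for j in range(size)
                 if (i, j) != (0, 0) and coeffs[i][j] != 0]
    positions.sort(key=lambda p: abs(coeffs[p[0]][p[1]]), reverse=True)
    signs = {p: (1 if coeffs[p[0]][p[1]] > 0 else -1) for p in positions[:keep]}
    return coeffs[0][0], signs

# Placeholder per-bin weights for one channel (index 0 weights the average term).
WEIGHTS = [5.0, 0.8, 1.0, 0.5, 0.5, 0.3]

def score(query, target, weights=WEIGHTS):
    """Lower is more similar: average difference minus a bonus per matching sign."""
    q_avg, q_signs = query
    t_avg, t_signs = target
    total = weights[0] * abs(q_avg - t_avg)
    for (i, j), sign in q_signs.items():
        if t_signs.get((i, j)) == sign:
            total -= weights[min(max(i, j), 5)]
    return total
```

Because only the positions and signs of a few large coefficients are kept, each image reduces to a very small signature, which is what makes comparing a query against a large database fast.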

Is it possible to access the representation through the database file or similar?

Image signatures (wavelet transform coefficients) and the data structures for coefficient buckets are serialized to disk using an internal binary format. You can see how they are serialized by looking at the savedb() C++ routine in the imgdb source module, available both in the imgSeek source code and at https://imgseek.svn.sourceforge.net/svnroot/imgseek/trunk/net.imgseek.imgdb

Checklist for integrating with your existing server-side systems

To help you integrate isk-daemon into your existing image-related web site, we propose the following list of questions. Get in contact with as many details as you can so we can try to help.

  1. Do all images on your site have a logical representation in a relational database such as MySQL or PostgreSQL, where metadata may be stored?
  2. Or are they just referenced on static HTML pages?
  3. Do you have any server-side logic for your site (PHP, ASP, CGI etc.)?
  4. What’s your level of expertise in writing web applications?
  5. Is your site hosted on Windows or Linux?
  6. Do you have shell access (Linux)?
  7. Do you plan on running isk-daemon in a shared hosting environment or on dedicated machines?
  8. Which other capabilities are available in your hosting package?
  9. Are the concepts and relations mentioned in the isk-daemon architecture clear to you?
  10. Would you agree to place a link back to the isk-daemon site on search results as a way to promote this technology?
  11. Do you have a web user interface already in place, or are you planning a new one?
  12. Where are the image files stored?
  13. Which HTTP server do you use?
  14. How many images do you plan to make available for searching?
  15. How much CPU and RAM do you have available on your servers?

Cluster architecture

[Figure: isk-daemon cluster architecture]

The diagram above depicts the interaction between an existing web application and a cluster of isk-daemon instances.

All isk-daemon instances communicate with each other, automatically balancing the load of adding images to their databases and of querying for similarity. This way, the web server(s) acting as clients of the isk-daemon cluster are isolated from the fact that there is a cluster of image similarity engines, and can make calls to any instance.

Optionally, isk-daemon can make use of memcached instances to cache query requests in order to increase querying throughput.
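The caching idea can be sketched on the client side as follows. This is an illustration only: QueryCache is an in-process stand-in for a memcached client, and the key format and the run_query callable are hypothetical names, not part of isk-daemon.

```python
import time

class QueryCache:
    """Tiny in-process stand-in for a memcached client (get/set with expiry)."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:  # entry too old: drop it
            del self._store[key]
            return None
        return value

    def set(self, key, value):
        self._store[key] = (self._clock() + self._ttl, value)

def cached_query(cache, run_query, db_id, image_id, limit):
    """Return cached results when present; otherwise run the query and store it."""
    key = f"isk:{db_id}:{image_id}:{limit}"
    results = cache.get(key)
    if results is None:
        results = run_query(db_id, image_id, limit)
        cache.set(key, results)
    return results
```

A short expiry time keeps cached answers from going stale for long after new images are added to the database.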

Development roadmap

The roadmap for the upcoming months, including the major milestones, is as follows. Bug fixes will be made available immediately as intermediate releases.

Release 0.7 - March/2008
Basic support for a cluster of daemons, improving scalability and performance. See more details on the initial cluster architecture.
Support for cluster configuration in the web admin interface.
Improved support for multiple database spaces in the web admin interface.
Option to save the image filenames associated with imported images.
Alternative querying modes: by average color, and more flexible similarity queries (parameters for the weight given to color and shape similarity).
Improvements to database management: a command to remove databases and database access statistics.
Documentation for using isk-daemon modules as a library for building image similarity applications (so in a sense isk-daemon would basically be an XML-RPC/SOAP/POX frontend to this library).
Fix: on Windows, image filenames with extended characters are not imported.
Fix: saveDbAs() does not remember the supplied filename, so further calls to saveDb() save to a file named “not yet saved” instead of the previous one.
Batch operations to minimize the number of remote calls during normal operation: addMultipleImages, removeAll, removeIdRange, queryIDs.
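To show why batch operations reduce the number of remote calls, here is a small Python sketch. The add_many callable is hypothetical and only stands in for a batch method such as the planned addMultipleImages, whose real signature is not yet defined.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]

def add_in_batches(add_many, db_id, images, batch_size=100):
    """Send (image_id, filename) pairs in batches; returns the number of remote calls.

    add_many is a hypothetical callable standing in for a batch method such as
    the planned addMultipleImages; its signature here is an assumption.
    """
    calls = 0
    for batch in chunked(images, batch_size):
        add_many(db_id, batch)
        calls += 1
    return calls
```

With a batch size of 100, importing 250 images takes 3 remote calls instead of 250.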

Release 0.8 - June/2008
Caching mechanism for improved performance and lower network usage (memcached support and bloom filters for image id exchange between server instances)
Improved network usage for cluster synchronization
Auto-discovery of cluster instances, so that instances don’t need to know at configuration time which other instances are available on the local network. Failover should then be completely transparent.
Support for querying based on signature processed at client-side, enabling web-based search by sketch. One example sketch applet shall be provided.
Improved API examples
Improved image querying operations in the web admin interface, showing image thumbnails for results.
UTF-8 image filename support
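
The Bloom filters mentioned for release 0.8 let one instance tell another which image ids it holds without sending the full id list. The sketch below shows the general data structure in Python; the sizes and the hashing scheme are arbitrary choices for illustration, not the planned implementation.

```python
import hashlib

class BloomFilter:
    """Fixed-size Bloom filter over integer image ids."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, image_id):
        # Derive num_hashes bit positions by salting a SHA-256 digest.
        for salt in range(self.num_hashes):
            digest = hashlib.sha256(f"{salt}:{image_id}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, image_id):
        for pos in self._positions(image_id):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, image_id):
        # True may be a false positive; False is always correct.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(image_id))
```

An instance would only need to transmit the bit array (128 bytes with the defaults above). A negative answer is definitive, while a positive one may occasionally be wrong and has to be confirmed with the owning instance.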

Release 0.9 - August/2008
Support for image classification based on content similarity. The API would be extended to provide a method for querying which group an image belongs to, and for training the internal classification engine by providing a group at the moment an image is added to the system.
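
One simple way such classification could work is a vote among the most similar labelled images. The Python sketch below is only a guess at the idea, not the planned engine; it assumes that a higher score means a closer match and that group labels are kept in a plain dictionary.

```python
from collections import defaultdict

def classify_by_neighbors(similar, group_of):
    """Pick the group with the highest summed similarity among the matches.

    similar  -- list of (image_id, score) pairs from a similarity query
    group_of -- dict mapping already-labelled image ids to group names
    """
    totals = defaultdict(float)
    for image_id, score in similar:
        group = group_of.get(image_id)
        if group is not None:
            totals[group] += score
    if not totals:
        return None  # no labelled neighbours: nothing to infer
    return max(totals, key=totals.get)
```

Here the labels supplied when images are added act as the training data, and classifying a new image costs one similarity query.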

Release 1.0 - September/2008
Video support
Codecs: mpeg2, xvid, divx
Formats: mpg, mov, avi
Segment video into scenes and index each scene as just another image in the database
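
As a rough illustration of scene segmentation, the Python sketch below cuts a video wherever two consecutive frames differ strongly. The per-frame feature vectors (for example an average colour) and the threshold are assumptions for the example, and decoding the video itself is left out.

```python
def split_into_scenes(frame_features, threshold):
    """Group frame indices into scenes, cutting where consecutive frames differ a lot.

    frame_features -- one numeric vector per frame (e.g. average colour)
    threshold      -- cut when the summed absolute difference exceeds this value
    """
    scenes = []
    current = []
    previous = None
    for index, features in enumerate(frame_features):
        if previous is not None:
            change = sum(abs(a - b) for a, b in zip(features, previous))
            if change > threshold:
                scenes.append(current)
                current = []
        current.append(index)
        previous = features
    if current:
        scenes.append(current)
    return scenes

def key_frames(scenes):
    """Use the middle frame of each scene as the image to index."""
    return [scene[len(scene) // 2] for scene in scenes]
```

Each key frame could then be added to the database like any other image, with the frontend keeping the mapping from image id to video and timestamp.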

About isk-daemon

isk-daemon is an open source standalone server or library capable of adding content-based (visual) image querying to any image-related website or software.

Learn more about it on the main topics at the left sidebar.
