Archive for Features

Using isk-daemon as a potential solution to image copyright tracking and filtering

We’ve been working steadily on improvements to the currently available implementation, and according to the current roadmap, release 0.9 (due April/2008) should support image classification based on visual content similarity. The API would be extended with a method for querying which group an image belongs to, and the internal classification engine would be trained by specifying a group at the moment an image is added to the system.

This would potentially allow isk-daemon to be used for image copyright tracking and filtering.

Development roadmap

The roadmap for the upcoming months, including the major milestones, is as follows. Bug fixes will be made available immediately as intermediate releases.

Release 0.7 - March/2008
  • Basic support for a cluster of daemons, improving scalability and performance. See more details on the initial cluster architecture.
  • Support for cluster configuration in the web admin interface.
  • Improved support for multiple database spaces in the web admin interface.
  • Optionally save the filenames associated with imported images.
  • Alternative querying modes: query by average color, and more flexible similarity queries (parameters for the weight given to color and shape similarity).
  • Improvements to database management:
      • Command to remove databases
      • Database access statistics
  • Documentation for using isk-daemon modules as a library for building image similarity applications (so in a sense isk-daemon would basically be an XML-RPC/SOAP/POX frontend to this library).
  • Fix: on Windows, image filenames with extended characters won’t get imported.
  • Fix: saveDbAs() does not remember the supplied filename, so further calls to saveDb() save to a file named “not yet saved” instead of the previous one.
  • Batch operations, to minimize the number of remote calls during normal operation:
      • addMultipleImages
      • removeAll
      • removeIdRange
      • queryIDs
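
As a rough illustration of why batch operations reduce remote calls, the sketch below groups images into fixed-size batches and sends each batch in a single call. Only the name addMultipleImages comes from the roadmap; its signature, the `FakeDaemon` stand-in, and the batch size are assumptions made for this example.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List, Tuple

ImageRef = Tuple[int, str]  # (image id, file path or URL)


def chunked(items: Iterable[ImageRef], size: int) -> Iterator[List[ImageRef]]:
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


class FakeDaemon:
    """In-memory stand-in for a remote daemon; counts simulated remote calls."""

    def __init__(self) -> None:
        self.calls = 0
        self.images: Dict[int, str] = {}

    def addMultipleImages(self, batch: List[ImageRef]) -> int:
        # Hypothetical signature: takes (id, path) pairs, returns how many were added.
        self.calls += 1
        self.images.update(batch)
        return len(batch)


def import_in_batches(daemon: FakeDaemon, refs: Iterable[ImageRef],
                      batch_size: int = 100) -> int:
    """Add all images using one call per batch instead of one call per image."""
    return sum(daemon.addMultipleImages(batch) for batch in chunked(refs, batch_size))
```

With 250 images and a batch size of 100, this makes 3 calls instead of 250.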

Release 0.8 - June/2008
  • Caching mechanism for improved performance and lower network usage (memcached support and bloom filters for image id exchange between server instances).
  • Improved network usage for cluster synchronization.
  • Auto-discovery of cluster instances, so that instances don’t need to know at configuration time which other instances are available on the local network. Failover should then be completely transparent.
  • Support for querying based on a signature computed on the client side, enabling web-based search by sketch. One example sketch applet will be provided.
  • Improved API examples.
  • Improved image querying operations in the web admin interface, showing image thumbnails for results.
  • UTF-8 image filename support.
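
The roadmap does not say how bloom filters would be used for image id exchange. One plausible reading is that an instance sends its peers a compact filter summarizing the ids it holds, so a peer can rule out most ids it is missing without transferring the full list. The sketch below is a generic Bloom filter under that assumption; the class name, sizes, and hashing scheme are illustrative and not taken from isk-daemon.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """Fixed-size bit array with several hash positions per item.

    Lookups can report false positives, but never false negatives.
    """

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, image_id: int) -> Iterator[int]:
        # Derive independent positions by salting the id with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(("%d:%d" % (seed, image_id)).encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, image_id: int) -> None:
        for pos in self._positions(image_id):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, image_id: int) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(image_id))
```

A filter of 1024 bits occupies 128 bytes regardless of how many ids are added, which is what makes it cheap to exchange; the trade-off is that `might_contain` can occasionally answer yes for an id that was never added.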

Release 0.9 - August/2008
Support for image classification based on content similarity. The API would be extended with a method for querying which group an image belongs to, and the internal classification engine would be trained by providing a group at the moment an image is added to the system.
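
The roadmap does not name a classification algorithm. As one possible reading of "train on add, then query the group", the sketch below uses a simple nearest-neighbour vote over labelled signatures. The class, its method names, and the use of plain numeric tuples as signatures are all assumptions for illustration.

```python
from collections import Counter
from typing import List, Sequence, Tuple


class GroupClassifier:
    """Toy k-nearest-neighbour classifier over labelled image signatures."""

    def __init__(self) -> None:
        self._samples: List[Tuple[Tuple[float, ...], str]] = []

    def add_image(self, signature: Sequence[float], group: str) -> None:
        # "Training" is just remembering the signature together with its group.
        self._samples.append((tuple(signature), group))

    def query_group(self, signature: Sequence[float], k: int = 3) -> str:
        if not self._samples:
            raise ValueError("no labelled images have been added yet")

        def squared_distance(sample: Tuple[Tuple[float, ...], str]) -> float:
            return sum((a - b) ** 2 for a, b in zip(sample[0], signature))

        nearest = sorted(self._samples, key=squared_distance)[:k]
        votes = Counter(group for _, group in nearest)
        return votes.most_common(1)[0][0]
```

For example, after adding a few signatures labelled "a" near (0, 0) and a few labelled "b" near (1, 1), `query_group((0.05, 0.05))` votes among the three closest samples and picks the majority group.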

Release 1.0 - September/2008
  • Video support
      • Codecs: mpeg2, xvid, divx
      • Formats: mpg, mov, avi
      • Segment video into scenes and index each scene as just another image in the database
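
How scenes would be detected is not specified. A common simple approach is to start a new scene whenever two consecutive frames differ by more than a threshold, and then index one representative frame per scene. The sketch below assumes frames are already decoded into flat lists of grayscale values; the function names and the threshold are hypothetical.

```python
from typing import List, Sequence

Frame = Sequence[int]  # flattened grayscale pixel values, 0-255


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average per-pixel absolute difference between two equally sized frames."""
    if len(a) != len(b) or not a:
        raise ValueError("frames must be non-empty and of equal size")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def split_into_scenes(frames: Sequence[Frame], threshold: float = 30.0) -> List[List[int]]:
    """Group frame indices into scenes, cutting where consecutive frames differ a lot."""
    scenes: List[List[int]] = []
    for index, frame in enumerate(frames):
        if index == 0 or mean_abs_diff(frames[index - 1], frame) > threshold:
            scenes.append([index])       # large change: start a new scene
        else:
            scenes[-1].append(index)     # small change: same scene continues
    return scenes


def keyframes(scenes: List[List[int]]) -> List[int]:
    """Pick the first frame of each scene as the image to index."""
    return [scene[0] for scene in scenes]
```

Each keyframe could then be added to the database like any other image.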

Basic features


These are the basic features for the server side daemon:

  • Query for images similar to one already indexed in the database, returning a similarity score for the indexed images that most resemble the query image.
  • Query for images similar to one described by its signature. A client-side widget can generate such a signature from what a user sketched and submit it to the daemon.
  • XML+HTTP interface for easy integration with other web and desktop or rich-client applications
  • Fast indexing of images one-by-one or in batch
  • Quickly remove images from database one-by-one or in batch
  • Key-based security for API users
  • Built-in web-based admin interface with statistics and ad-hoc maintenance commands/API testing
  • Optimized image processing code implemented in C++

More upcoming features can be found on the development roadmap.

Offline processing and image storage

When telling the daemon to index an image, you can give it either a local file path (plenty of image formats are currently supported) or a remote HTTP URL, in which case the daemon downloads the image into memory.

In both cases, as soon as the image has been processed it can be removed from disk or memory, which avoids the copyright concerns of storing the images themselves. The image similarity database stores only a signature of the image (which is actually just a few bytes long).

This signature can be seen merely as metadata describing the image, just like you probably already store image dimensions, source url, source gallery url, etc.

The provided API allows image processing (as part of adding an image to the database) to be done on an ad hoc basis (for example, when a user submits an image to your system) or in offline batches.
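
The idea of keeping only a signature can be sketched as follows. The average-color signature used here is a deliberately simple stand-in, not the daemon's actual signature, and the `SignatureStore` class and its methods are hypothetical. The point is that the pixel data is never retained, and that the same indexing step works one image at a time or in a batch.

```python
from typing import Dict, Iterable, List, Sequence, Tuple

Pixel = Tuple[int, int, int]            # (red, green, blue)
Signature = Tuple[float, float, float]  # toy signature: average color


def toy_signature(pixels: Sequence[Pixel]) -> Signature:
    """Average color of the image; a stand-in for a real image signature."""
    count = len(pixels)
    if count == 0:
        raise ValueError("image has no pixels")
    return tuple(sum(p[c] for p in pixels) / count for c in range(3))


class SignatureStore:
    """Keeps one small signature per image id and nothing else."""

    def __init__(self) -> None:
        self._signatures: Dict[int, Signature] = {}

    def index(self, image_id: int, pixels: Sequence[Pixel]) -> None:
        # Only the signature is stored; the caller may discard the pixels.
        self._signatures[image_id] = toy_signature(pixels)

    def index_batch(self, images: Iterable[Tuple[int, Sequence[Pixel]]]) -> None:
        for image_id, pixels in images:
            self.index(image_id, pixels)

    def most_similar(self, image_id: int, limit: int = 5) -> List[Tuple[int, float]]:
        """Return (id, distance) pairs for the closest other images."""
        target = self._signatures[image_id]
        scored = [
            (other_id, sum((a - b) ** 2 for a, b in zip(target, sig)) ** 0.5)
            for other_id, sig in self._signatures.items()
            if other_id != image_id
        ]
        return sorted(scored, key=lambda pair: pair[1])[:limit]
```

Calling `index` when a user uploads an image covers the ad hoc case, while `index_batch` over a queue of pending images covers offline processing.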
