Using images on older computers, or in special situations, often demands drastically reducing the number of colours in the image. Sometimes there are additional constraints on the colours used. The currently available tools often lack vital features, or they were designed to be fast on computers from the '90s, not 2013.
The goal is an advanced tool that excels at colour reduction, taking as much CPU time as needed to search for an optimal palette, while providing hooks for imposing limitations on the destination palette.
The general idea is to do the reduction in a custom colourspace, using an iteration-based solver where each entry in the palette searches for the best solution while competing with the other palette entries. Each colour gets a score depending on a set of rules, and the goal for each colour is to increase its score.
Dithering should really be specified from the beginning, because the palette picker will have to optimize differently depending on how much dithering will be used.
It would be nice to be able to adjust the amount of dithering, which would impose a limit on the maximum error allowed between a source and destination pixel.
For Floyd–Steinberg-style dithering, a leaky integrator could be used instead of the perfect accumulator that is normally used.
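A minimal sketch of the idea, reduced to one dimension and greyscale. The function name and the default leak factor are made up for illustration; the point is only that the carried error decays instead of being preserved exactly:

```python
def diffuse_row(row, palette, leak=0.8):
    """Quantize one greyscale row, leaking the carried error by `leak`.

    leak=1.0 gives ordinary (perfect) error diffusion, leak=0.0 gives
    plain nearest-colour quantization with no dithering at all.
    """
    out = []
    err = 0.0
    for value in row:
        target = value + err                 # source value plus diffused error
        nearest = min(palette, key=lambda p: abs(p - target))
        out.append(nearest)
        err = leak * (target - nearest)      # leaky: only part of the error survives
    return out
```

With `leak` somewhere below 1.0, a long run of slightly-off pixels no longer accumulates error without bound, which is exactly the motivation for the leaky integrator.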
The problem with neighbouring pixels is that dithering relies on error diffusion: the error is spread out to neighbouring pixels. This makes it very difficult to select colours optimally, because a good colour for one voxel is not necessarily good for another voxel in the same position in voxelspace if it has a different position in the image.
After iterating, but before scoring, a dithering step can be performed. This will try the palette out, and the actual errors from the pixels can then modify the weight of the corresponding voxel.
Maybe a voxel can have 8 connected neighbours, which could be taken into consideration when scoring. A colour could get points for the actual voxel, and then 1/8 from each of the neighbouring voxels, or rather some sort of Gaussian distribution depending on the distance.
This method may make it easier for a palette to adjust to dithered images because a set of voxels occupying the same position in voxelspace will have "attraction vectors" in unique directions.
The attraction vector is to be used when adding scores to dithered voxels. An off-voxel colour will get a higher score if it's in the general direction of the attraction vector.
This could possibly also be used when generating palettes for HAM-images.
Some sort of ordered dither is often prettier when using very few colours. With our 3D method, each pixel will have 1-4 colours to dither between.
A pixel is selected at random from the image. A colour from the palette is selected, and the error between the source pixel and the new pixel is calculated and spread out to its unmapped neighbours using some sort of Gaussian falloff.
This is repeated until no more pixels are left unmapped.
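The procedure above could be sketched like this. All names are hypothetical, and equal error shares to the 4-neighbours stand in for the Gaussian spread described in the text:

```python
import random

def random_order_dither(img, palette, seed=0):
    """img maps (x, y) -> grey value. Pixels are quantized in random
    order; each pixel's error is spread to its still-unmapped neighbours."""
    rng = random.Random(seed)
    pending = dict(img)              # unmapped pixels, with running values
    done = {}
    order = list(pending)
    rng.shuffle(order)               # random visiting order
    for (x, y) in order:
        value = pending.pop((x, y))
        nearest = min(palette, key=lambda p: abs(p - value))
        done[(x, y)] = nearest
        nbrs = [n for n in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1))
                if n in pending]
        if nbrs:
            share = (value - nearest) / len(nbrs)
            for n in nbrs:           # equal shares stand in for the
                pending[n] += share  # Gaussian spread in the text
        # pixels with no unmapped neighbours simply drop their error
    return done
```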
The idea behind the random balancing could be extended to some sort of massive dependency, so that each pixel influences every other pixel nearby: quantizing one pixel spreads the error to its neighbours, and if they are already quantized, a new quantization may be picked for them, and so on.
This could take a while to calculate but might give great results.
Perhaps it can be done with a single line of the image at a time.
When dithering, some target video hardware has additional features that blend colours in different ways. These may be beneficial and could be utilized to improve the quality.
An interlaced picture quickly alternates between two lines fairly close to each other. This could be used to create complementing lines, making sure that the colour for a pixel on one line differs from those directly above and below it, for pixels that must be dithered.
The colours in analog video will bleed and blend. This effect could be exploited, but the resolution of the target image must be known.
Some target screens are fuzzier than others.
Fitting means adjusting pixels by contrast, gamma, hue etc. to fit a fixed palette. This can use the same scoring methods as the reduction iterations, but instead of moving the colours, different operations are applied to the voxels.
The original image will have its colours changed, but if things go as planned it will still look "ok". More importantly, it will utilize the fixed palette in a much better way, and probably come out looking better as a result.
Reduction is the core part. It will find a number of colours that are optimal for the input image(s).
The idea is to fit a set of points in a 3D space so that the points end up with an equal score.
The colours in the voxelspace collect points and accumulate a score.
We have to know when to finish iterating and call it a day. It should be possible to manually abort the process with ctrl-c or similar, no matter which algorithm determines the finish point, and a delayed real-time preview can be shown at all times to monitor progress.
The algorithm can be set to run for a fixed number of seconds. This is probably a stupid idea.
Some sort of threshold could either be set explicitly or be there by default. When the target is reached, the algorithm quits. But how do we define the target quality?
Obviously, if the image can be accurately represented with the limited number of colours the user has chosen, the algorithm will quit.
When iterating, we will probably move a shorter and shorter distance the more we iterate. After a while, this won't make any change to the palette at all, and there's no point in continuing: an optimal palette has been reached, given the starting positions and the rules.
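A minimal sketch of this stopping rule, assuming the palette is a list of 3D points and that the threshold `eps` is tuned by hand:

```python
def has_converged(old_palette, new_palette, eps=0.01):
    """Stop when no colour moved further than eps in voxelspace
    during the last iteration."""
    move = max(
        sum((a - b) ** 2 for a, b in zip(o, n)) ** 0.5   # Euclidean distance
        for o, n in zip(old_palette, new_palette)
    )
    return move < eps
```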
Perhaps a local minimum has been found.
A set of iteration runs can run in parallel, but with different initial positions. If these runs converge to the same palette, the duplicates are thrown out, and this continues until there's only one palette left.
Assuming the iteration will have no state other than the palette, this can be implemented by just swapping palettes each iteration. The palette will contain very little data and will have to be changed each iteration anyway.
This assumes that the iteration is stable, which is absolutely not a given.
Stopping after a fixed number of iterations may be interesting while developing, but probably pretty useless otherwise.
The initial positions of the colours can be selected by a completely random choice, by some sort of "good initial guess", or by some other method.
Initial colours are selected from a random set of voxels already in the space. Duplicates are tried again.
The benefit of this is that there will be fewer wasted iteration steps in the beginning, although the time saved will probably be trivial. It should give a better initial start, also when repeated, since it samples the data.
A drawback is that the optimal position for a colour is not necessarily located on top of a voxel, especially so for dithered images.
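A sketch of this seeding strategy, assuming voxels are hashable tuples and that there are at least `n` distinct ones (otherwise the retry loop would never finish):

```python
import random

def initial_palette(voxels, n, seed=0):
    """Pick n distinct voxel positions as starting colours.
    Duplicates are simply retried, as described above."""
    rng = random.Random(seed)
    chosen = set()
    while len(chosen) < n:
        chosen.add(rng.choice(voxels))   # a duplicate leaves the set unchanged
    return list(chosen)
```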
When using CIEDE2000 to measure the difference, the faster CIE76 could be used to find a probably good starting point from a random set of coordinates.
In order to remove some bias here, the number of iterations using CIE76 should also be random.
Initial positions can be spread out evenly, either as full greys or in a 3D grid in the voxelspace. This is probably bad, because there's no random choice and thus it will always iterate to the same palette for the same input.
The colours are inserted completely at random in the voxelspace, within some sort of bounding volume.
The benefit of this is that there's no bias at all, and it can be repeated many times to find different sets of "optimal" palettes, which will perhaps all converge to the same palette.
If a good iteration policy is found, it will only take a few iterations for the colours to enter the good spots, so spreading colours out at random shouldn't be detrimental to the speed.
There are different ways to move the colours to possibly better positions. They can be moved individually, one at a time, or in groups. Genetic algorithms could work.
Genetic algorithms "breed" palettes with each other, keeping the ones that score higher.
Perhaps the least squares approach can be adapted to this. Each colour would then move towards, say, the center of the (weighted) points it already has. This is probably what some trivial clustering algorithm does.
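A sketch of that centroid-style update for a single colour, assuming each voxel carries a weight. This is essentially the k-means update step:

```python
def move_to_centroid(colour, owned_voxels):
    """owned_voxels: list of ((x, y, z), weight) currently mapped to
    this colour. Returns the weighted center of those voxels, which is
    the k-means-style move discussed above."""
    total = sum(w for _, w in owned_voxels)
    if total == 0:
        return colour                        # colour owns nothing; stay put
    return tuple(
        sum(p[i] * w for p, w in owned_voxels) / total for i in range(3)
    )
```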
The points could take a few different random steps, and the best of these is picked. Possibly connected to "genetic algorithms".
After or during the iteration movement phase, additional constraints on the positions in the voxelspace can be imposed. Trivial limitations are quantization to 8 or 4 bits per channel in the target colourspace, but more complex ones could be used.
A "real life" example is the EHB mode on the Amiga, which has 4-bit quantization per channel in the sRGB or RGB space, and each colour gets an additional companion colour at half the intensity, or half the magnitude in the sRGB space, also quantized to 4 bits and rounded down.
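A sketch of the EHB constraint as described. Input channels are assumed to be floats in [0, 1], and the rounding choice for the 4-bit quantization is a guess:

```python
def ehb_palette(base_colours):
    """base_colours: list of (r, g, b) floats in [0, 1].
    Quantize each channel to 4 bits, then append the half-bright copy
    (halved and rounded down), mimicking Amiga EHB mode."""
    def q4(v):
        return min(15, int(round(v * 15)))   # 4-bit channel value, 0..15
    base = [tuple(q4(c) for c in col) for col in base_colours]
    half = [tuple(c // 2 for c in col) for col in base]  # rounded down
    return base + half
```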
Some colours could be forced to static positions, for example enforcing full black and full white, or using a common, fixed palette for the first 16 colours.
All target colourspaces will also have a limited gamut, which the colours can not exceed. That must also be considered here.
Maybe there is a fixed set of colours to pick from that can't easily be described as a quantized space; that would be tested for here.
The key part of the algorithm is that each colour will get a score, measuring how good it is at representing pixels in the image. How a colour accumulates its score is central to the program, and there are a few issues to think about.
I'll call this the "greyscale problem", since it is easier to describe in a greyscale image.
Assume we want to generate two colours. Assume that the input image is a good quality greyscale photograph, having a roughly equal amount of pixels for each shade of grey, and a full range from black to white.
Most algorithms will generate two grey colours to best represent this image, as would this one, unless some more advanced rules are implemented.
The problem is that every pixel brighter than the lighter grey in the palette can not be represented at all in this new palette, not even by dithering. The same goes for every pixel darker than the darkest colour. This extends to a three-dimensional RGB space and more colours. It is harder to visualize, but the problem remains: no pixel outside of the palette "gamut" can be represented, so the quantized image will be duller and more limited than necessary.
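The greyscale version of the problem fits in two lines: with dithering, a greyscale palette can only reproduce values between its extremes (the function name is made up for illustration):

```python
def representable_range(palette):
    """With dithering, a greyscale palette can only reproduce blends of
    its entries, i.e. values in the closed interval between its extremes."""
    return (min(palette), max(palette))

# Two mid-greys picked by a naive quantizer leave both ends of the
# tonal range unreachable, even with dithering:
lo, hi = representable_range([64, 192])   # nothing below 64 or above 192
```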
To account for this, it's important to realize that a colour generated by dithering with other colours in the palette can only look like a blend of those colours. Visualized in a 3D space, the colours in the palette form a convex hull. Using dithering and disregarding image quality, any colour that is a convex combination of the palette colours can be represented. No colour outside can be represented at all.
This can be modelled by completely discarding voxels outside the convex hull in the scoring process: no colour will get any points from a voxel that can not be represented even by dithering. When this is done, a colour can increase its score a lot by moving away from highly covered clusters to the edge and capture the outliers instead of competing with other colours.
Completely discarding points may not be the optimal solution, maybe they should just get a lower score. However, they can't actually be represented accurately so they shouldn't add to any colour.
If a colour gets a score just for hogging a voxel, it doesn't matter how far it is from the center of a cluster of voxels, just that it's the only colour hogging the cluster. This is obviously bad, so we need to introduce an incentive for the colour to move to the center of clusters.
Consider a colour hogging only two voxels. If we weight the score for these voxels by linear distance, the colour can get an equal score at many positions, at least in degenerate cases.
When performing ordinary curve fitting, the score is often weighted by the squared distance. Which is better here must be evaluated.
When calculating the score for a colour, it is important to only count a voxel once.
If we give each colour a score for each voxel just depending on the distance, even if it's squared, each colour would end up on the same position.
A unique mapping between a voxel and a colour would ensure that colours spread out to where there are unmapped voxels, and that they compete with each other.
A one-to-one mapping is not necessarily the optimal strategy when the final image is to be dithered. Then a group of colours will be involved in generating the final pixel, and it is a bit arbitrary which colour will be used for which pixel because it depends on the dithering algorithm and the position in the image. Assuming an ideal dithering algorithm, constant colour and infinite dithering resolution, any colour inside a tetrahedron defined by four colours can be rendered. This means that we may want to share a voxel between four colours.
There is however one major drawback: dithering is a necessary evil, not a good-looking solution. If a voxel can be represented closely enough without dithering, that's much better; but if it must be dithered, it's probably better for the colours involved to be at an equal distance from the pixel. This means that a cluster of similar voxels would be best represented by four surrounding colours, none of which would actually be in the center of the cluster.
Coming up with a solution to this will be difficult, and the rules should probably be changed depending on the amount of dithering requested.
The four colours used can be found by doing a triangulation of the palette, and assigning the four vertices as colours for the voxels inside the formed tetrahedra.
The scoring algorithm must probably differ depending on whether the target image is supposed to be dithered, and if so, on the amount of dithering. The concept of voxel neighbours may be utilized to spread the score for a voxel over more than one colour.
A "loner" can not be dithered. This is a pixel of a single colour with no close neighbours, or a single pixel vastly different in colour from its neighbours.
It's not apparent if this should be taken into consideration or not, or if it will even affect anything. It may be important when selecting the palette for low-resolution images like sprites etc.
Perhaps a traditional clustering algorithm can be used, one that will identify outliers. The clusters it *will* find can be surrounded with colours and use dithering, but the outliers can be handled like exact colours.
Taken to the extreme, the cluster test can be performed in five dimensions: X, Y, R, G, B. This would identify pixels close in space and close in colour. It's not immediately obvious if this is beneficial or not.
Under certain conditions, some colours may not get any score at all. This can be normal, for example if there are fewer unique voxels than colours, or if heavy restrictions on the palette are in use.
We only have to take this into consideration during the iteration steps, and shouldn't treat it as some sort of abnormal case. Undesirable, but not abnormal.
Smoothness can affect the scoring, because a pixel in a noisy field can have its colour varied a lot before anyone takes notice, while a pixel in a smooth area is more important to get right. The smoothness will be calculated either with wavelets/FFT or with some other high-pass filter. It will then adjust the voxel weight, so that no extra care has to be taken with smoothness during the iteration.
The palette is the set of colours that are currently iterating. It will have a fixed number of colours during the whole run, but due to restrictions on each colour, some may occupy the exact same space; then only one will be counted and the others will have a score of zero.
A colour is a single moving point in the 3D space where all the voxels are. When everything is finished, all the colours will form the new palette.
The colour has a coordinate in voxelspace and a total score. While iterating, it will occupy a volume and the volume will contain zero or more voxels.
Depending on the algorithm, it may have concepts like "Neighbours", "enemies" etc.
During the iteration, each colour entry will get a "score", depending on an algorithm, some options, and most importantly its relation to every other colour and to the voxels.
A colour is "hogging" a voxel if it's the only one getting a score from the voxel, or if it's one of a few colours sharing the voxel.
The target colourspace is important to know while iterating, because it will impose a limited gamut on the generated palette. Nominally this is sRGB, but other colourspaces are fairly common as well, for example some variants of YPbPr.
The palette forms a convex hull in a colourspace, defining the gamut for the generated palette if dithering is assumed. No colour outside this convex hull can be represented, even by dithering.
An image is a set of pixels. The goal of the algorithm is to reduce the number of colours in the image. The algorithm can be fed multiple images, but all the pixels will be thrown into the same voxelspace.
A pixel is a colour from the original image. It has a (usually quantized) colour in a 3D colourspace, usually sRGB, and a coordinate in a 2D image.
The colour will be transformed to another 3D space, after which the pixel will be called a voxel.
A voxel is a point from the image (a pixel) transformed to the new colourspace.
A voxel has a 3D coordinate and a weight. It may also have pixel neighbours, which may be either linked explicitly or looked up backwards from the image.
Voxelspace is the name of the colourspace in which we operate. It contains voxels and colours.
The "smoothness" of a voxel is a measure of how different its pixel is from the surrounding pixels. It is related to the attraction vector, but whereas the vector can in theory be zero if the surrounding pixels cancel each other out completely, the smoothness measure drops further the more this voxel's pixel sticks out in the image. A white pixel in a field of black pixels will have minimum smoothness, and a pixel surrounded by pixels of the same colour will have maximum smoothness.
This is highly connected to the Wavelet/FFT modifier, and it is possible that the wavelet stage should actually just be a fast way to calculate the smoothness, or that this should replace the wavelet stage.
After pondering a bit, the smoothness will probably be incorporated into the voxel weight, and it doesn't need to have its own property.
...and it should probably be generated by a specialized algorithm rather than an FFT or high-pass filter, because we want a measure of how "different" this pixel is compared to its neighbours.
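A sketch of such a specialized measure, using the mean absolute difference to the 4-neighbours on a 0..255 greyscale. The scale, the neighbourhood, and the function name are all assumptions:

```python
def smoothness(img, x, y):
    """img maps (x, y) -> grey value in 0..255. Returns 1.0 when the
    pixel matches its 4-neighbours exactly, 0.0 when it differs maximally."""
    nbrs = [img[n] for n in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1))
            if n in img]
    if not nbrs:
        return 1.0                  # no neighbours: nothing to stick out from
    mean_diff = sum(abs(img[(x, y)] - v) for v in nbrs) / len(nbrs)
    return 1.0 - mean_diff / 255.0
```

A white pixel surrounded by black gives 0.0, matching the definition above.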
An attraction vector for a voxel depends on the position of the voxels that are close to the underlying pixel. A red pixel surrounded by blue pixels will have an attraction vector pointing towards blue in voxelspace, while a red pixel surrounded by green pixels will have a vector towards green, even though they occupy the same position in voxelspace.
This can be important when adjusting the score for colours and dithering is being used.
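A sketch of the attraction vector as the mean direction from a voxel towards the voxelspace positions of its pixel neighbours. The plain averaging is an assumption; the text leaves the exact weighting open:

```python
def attraction_vector(voxel_pos, neighbour_positions):
    """Mean direction from this voxel towards the voxelspace positions
    of its pixel neighbours. Opposing neighbours cancel out, as noted
    in the smoothness discussion above."""
    n = len(neighbour_positions)
    if n == 0:
        return (0.0, 0.0, 0.0)
    return tuple(
        sum(p[i] - voxel_pos[i] for p in neighbour_positions) / n
        for i in range(3)
    )
```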
The weight of a voxel determines how important it is to the image. Weights can be assigned to voxels manually, from the alpha channel, or calculated using wavelets or an FFT of some sort.
The weights of each voxel will be used when scoring.
The alpha channel in an image can be used to tell the algorithm how important a pixel is. Fully transparent pixels will not be factored in at all, while fully opaque pixels will have the highest weight, varying linearly in between.
Perhaps sharp edges and noisy areas of the picture aren't as important as large areas with gradients. A wavelet filter or FFT filter can be used to determine which pixels belong to which category, and assign weights automatically.
If there are 10 pixels of bright white, 100000 pixels of dark red and 100000 pixels of darker red, those 10 white pixels should not be worth 1/10000 of the red ones. A histogram can be used to assign higher importance to unusual colours.
This is perhaps a bad idea, but can be tested.
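A sketch of the histogram idea, weighting each distinct colour by the inverse of its share of the image. The exact scaling is an assumption:

```python
from collections import Counter

def rarity_weights(pixels):
    """Give each distinct colour a weight inversely proportional to how
    common it is, so a handful of white pixels are not drowned out by a
    sea of dark red."""
    counts = Counter(pixels)
    total = len(pixels)
    return {colour: total / count for colour, count in counts.items()}
```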
The colourspace to work in is important. Optimally, the distance in the colourspace should reflect how "different" we consider two colours in real life. If that holds, we can easily compare colours and calculate how far "off" we are.
There should really not be any limitations on the amount of memory or CPU required to generate a good palette with this method. Anything one would reasonably want to run this on will have at least 2 GiB of RAM and a dual-core CPU of some sort.
Since every operation we want to do is well suited to running on a GPU, it may be beneficial to set a limit on memory usage so everything fits in even an old GPU. The author's GPU has 512 MiB of RAM, so that seems like a good target.
32-bit floats are more than enough for these calculations, so nothing above that is needed. It may even be possible to use signed 16-bit integers, or half-floats, for the voxelspace.
Floats will not give any benefits for the coordinates in the voxelspace, because voxels near the center aren't any more important than those at the edge. This is why signed or unsigned 16-bit numbers will give more precision than half-floats. When it comes to scoring, floats or integers can be used, because it's not that important. Since a lot of small values are to be accumulated, using half-floats is probably out of the question, but ordinary 32-bit floats would be fine.
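A small demonstration of why half-floats are risky for coordinates: Python's struct module can round-trip a value through IEEE half precision, which has only 11 significant bits, whereas a 16-bit integer keeps all 65536 levels exact:

```python
import struct

def as_half(x):
    """Round-trip a value through an IEEE half-precision float."""
    return struct.unpack('<e', struct.pack('<e', x))[0]

# Integers up to 2048 survive half precision; just above that the
# spacing between representable values is already 2, so 2049 is lost.
# A 16-bit integer coordinate would represent every level 0..65535 exactly.
```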
There are a lot of algorithms for most of the geometric problems we will encounter. These have to do with path finding, bounding volumes, nearest neighbours, Voronoi volumes etc.
There are probably generic algorithms that can be used for any type of iteration.
It may be useful to know which voxels lie in the tetrahedra formed by the set of colours.
The colours in the palette form a mesh of tetrahedra, and there are ways to create this mesh in a 'nice' way so that each tetrahedron is nice and small.
For each voxel, it may be necessary to find the nearest colour.
When the colours compete with each other, they form 3D Voronoi volumes.
The whole palette forms a convex hull in the colourspace. This is not trivial to calculate.
The interface to the program is preferably both a fullscreen application and a command-line interface.
A stand-alone tool usable from the command line is of course necessary.
A GUI would be very nice while actually developing the code.
When developing, it will be important to visualize the steps along the way.
The original image(s) will be transformed to another colourspace and then moved into the voxelspace. Each channel in the new colourspace can be displayed, along with the original alpha mask.
A new alpha mask will be calculated depending on image features like "smoothness". This is also very interesting to display, and the combination of these masks is the new weight map.
The voxelspace with all the voxels can be displayed in 3D and rotated. Each voxel can be drawn in their original colour, their weight represented by either transparency or size, and an attraction vector can be attached. There will be about 100000 voxels, so some sort of filtering should be used here.
The colours are then inserted in this space, and each colour is represented by a point in the voxelspace, in the right colour. Each colour forms a voronoi cell, which can be interesting to show. The whole palette forms a convex hull, which is important for showing the new gamut. A 3D triangulation is also performed between all colours, and the result can be displayed.
The current palette can be displayed while operating.
The resulting dithered image can also be shown in realtime while iterating.
It would be nice if one could manually iterate, and see what the next step would be, and also manually change colour entries to see what happens to the data.
Writing your own visualizer may be cool, and you learn a lot along the way.
Mayavi2 seems to be "the shit" if you know Python. I don't.
Gnuplot is very slow, but easy to use (assuming you know it already; otherwise it's hell).
KLT works by moving most of the energy to one dimension, the next highest energy to the next, and so on. It's probably not very useful here, BUT it can be used to check out algorithms in 2D, which is much faster and easier.
It would be nice if the algorithm could be used in other code, like ImageMagick. To make this happen, it should probably be written in C, with C++ bindings added.
This doesn't need to happen until the actual algorithm is tried and tested, and ultimately the GUI and command line should move to use this API so that no duplicate code has to be used.
The implementation language somewhat determines how portable the application is.
The benefit of C is of course that I know C. Since the algorithm is way too slow to run on limited hardware anyway, it doesn't give many other benefits.
The ImageMagick bindings are meant to be used with C.
The major benefit of using C++ is that every OpenCL and OpenGL tutorial is written for C++, even though the OpenCL and OpenGL APIs are formally C APIs. There are ImageMagick bindings for C++ which will be efficient enough for whatever we want to do.
There are also a lot of scientific libraries for C++.
A drawback is that I don't know C++. However, I would like to learn some.
A lot of libraries can be used from Python, and it might be easier to code than C++, and equally useful to learn.
This whole undertaking can probably be called 'vector quantization'. Vector quantization is not a specific algorithm, though.
Genetic algorithms can be used to "breed" new palettes from other good palettes.
There's a lot to read about image segmentation, and k-means clustering in particular. It is very similar to what we are trying to achieve here, but our rules are not that simple.
The human eye has various properties that form the basis of our colour vision. These properties can be abused to form optical illusions, and they can be utilized to optimize the selection of colours and to determine the best colourspace to operate in.
Measuring how "bad" an image is compared to another is very difficult and subjective. There should be a couple of different algorithms available.
The "smoothness" should probably weigh in here, and the user-supplied alpha-channel. Basically the "weight" of the voxel.
A just noticeable difference can be used to construct some sort of optimal scoring method that is more flexible than strictly punishing every deviation.
Basically, the difference algorithm will come up with a difference between colours. Every difference below a certain threshold could be treated equally, if that is beneficial to the rest of the algorithm.
Lab seems to be the colourspace most designed for what we want to do. There are however a few different versions of this, and a few different algorithms that can be used to calculate the perceived difference.
What we will implement is probably the CIE76 method for determining a difference, with an option to calculate all distances using the CIEDE2000 algorithm, possibly by using CIE76 to find a good starting point and just CIEDE2000 for the final iterations.
ImageMagick can be used to convert from the source colourspace to L*a*b*, and then to the (probably sRGB) space used for the destination image. For calculating the constraints and other data, a formula may be needed; alternatively, the colours can be put in an ImageMagick image and converted there.
CIE76 is the algorithm that we would like to use, since the difference in colour is defined simply as the distance in the colourspace.
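CIE76 is then just the Euclidean distance between two L*a*b* triples:

```python
def cie76(lab1, lab2):
    """Delta E (CIE76): plain Euclidean distance in L*a*b* space."""
    return sum((a - b) ** 2 for a, b in zip(lab1, lab2)) ** 0.5
```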
CIE94 is an upgrade of CIE76. It is not as difficult to calculate as CIEDE2000, but more so than CIE76, so it may be a good mid-level step.
It has the crazy annoyance that the constants differ between "graphic arts" and "textiles", which in practice means that the different folks in the committee couldn't agree on what a difference in colour actually means.
A lot of the values in the formula can be precomputed for the voxels, but that will take a lot of extra space.
CIEDE2000 is an upgrade to CIE76. It involves a great deal more calculation than the naive formula in CIE76, and it is likely way too CPU-intensive to use in the application.
Uhm. I hope I don't have to think about this.