Machine Vision

How to make it easier
Machine vision systems need user interfaces. Users control a machine vision system through the functionality that the user interface exposes. Often users see, feel and even experience the user interface as ‘the machine’. What a challenge it would be to create a standard user interface that fits ‘the soul of machine vision’, providing the necessary controls in a compact and attractive layout!

Because my new machine vision system is an interpreter-controlled system and not an SDK, I cannot shift this challenge to the application developers. I need an embedded solution for that user interface. How should I approach this design step?

A flexible user control system in a machine vision system has to bring together three things:

  1. the compiled (C++) vision program which should be controlled
  2. a general or dedicated user interface whose layout contains many controls that are not known in advance
  3. the linkage between the controls of the user interface and the compiled program.
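The three parts listed above can be sketched in a few lines. This is only a minimal illustration in Python, not the implementation of my system: the names `Parameter`, `ParameterRegistry` and `exposure_ms` are assumptions made up for the example, and a plain list stands in for the compiled vision program.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Parameter:
    """One value of the vision program that the UI may control (hypothetical)."""
    name: str
    value: float
    minimum: float
    maximum: float


class ParameterRegistry:
    """Item 3: the linkage between UI controls and the vision program."""

    def __init__(self) -> None:
        self._params: Dict[str, Parameter] = {}
        self._callbacks: Dict[str, Callable[[float], None]] = {}

    def register(self, param: Parameter, on_change: Callable[[float], None]) -> None:
        # Item 1: the vision program announces what can be controlled.
        self._params[param.name] = param
        self._callbacks[param.name] = on_change

    def describe(self) -> List[dict]:
        # Item 2: a UI can build its controls from this list without
        # knowing the parameters in advance.
        return [
            {"name": p.name, "min": p.minimum, "max": p.maximum, "value": p.value}
            for p in self._params.values()
        ]

    def set(self, name: str, value: float) -> float:
        # Called by a UI control; clamps the value, then notifies the program.
        param = self._params[name]
        param.value = min(max(value, param.minimum), param.maximum)
        self._callbacks[name](param.value)
        return param.value


applied = []  # stands in for the vision program reacting to a change
registry = ParameterRegistry()
registry.register(Parameter("exposure_ms", 10.0, 1.0, 100.0), applied.append)
registry.set("exposure_ms", 500.0)  # a slider dragged past its maximum
```

The point of the sketch is that the user interface only needs `describe()` and `set()`; it never has to know how the vision program uses the value.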

 

This cannot be so difficult. At first glance it seems to be a standard task for most kinds of systems controlled through a user interface. There are many free and commercial GUI builders which can connect to WCF (Windows Communication Foundation). However, this approach leaves a lot of work to the interface designer.

And yet, it seems to me that there is no easier way to connect a compiled application to an independently designed user interface (please correct me if I am wrong).

The point is that the functionality of a multi-sensor machine vision application is quite complex, and the GUI has to mirror part of that functionality. The interface developer therefore has to learn a lot about the application before he can start his design process. In effect, he will be re-inventing the wheel to a large extent.

Ok. Are there other possibilities? It seems to me that the only way to avoid this is to compile the graphical user interface into the executable of the machine vision system. This has advantages:

  • embedded and tested functionality in the GUI
  • shorter development time, because pre-defined and tested dialogs and controls only need to be configured
  • creating, configuring and controlling the user interface with an embedded interpreter enhances flexibility
  • a compact and optimized user interface layout is implemented

 

but also disadvantages:

  • limited controls and limited functionality in the user interface
  • pre-defined layout with limited design possibilities 

 

My conclusion is that compiled graphical user interfaces solve a lot of problems and accelerate the development process significantly.

But the disadvantages remain: all these limitations. Could there be a way to make it ‘less limited’? It seems to me that the best approach is to make the compiled GUI as flexible as possible. How can the GUI be made more flexible?

GUI builders allow, and in fact require, the developer to design the functionality of the GUI. I should not repeat this, because it ends again in application developers re-inventing the wheel. But my new program already contains an interpreter for the configuration of the machine vision system. Maybe this interpreter could also be used to configure the buttons, icons, screens, dialogs, visual layout etc.?

I have an idea: let’s combine pre-defined GUI template pages with the interpreter:

  • pre-defined templates contain and support certain functionalities or ‘functional units’ in the resulting GUI. The developer doesn’t have to worry much about ‘how’ it works, only about the visual layout.
  • the already built-in interpreter can apply the visual properties, user access rights, icons, images, etc. to these templates.

 

While programming this, a new question arose: where is the line between making the page templates flexible and programming most of the GUI in the interpreter? The latter is absolutely not my intention. So far, I would like to provide a ‘default solution’ for every page template. Developers can use this solution as it is, but they can also modify and configure the GUI over a very wide range to suit their own wishes. This would help to limit the disadvantages of the embedded GUI template pages.
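To make the idea of a ‘default solution’ per page template a bit more concrete, here is a small sketch in Python. It is not the code of my system: the template name `camera_setup`, its properties and the function `configure_page` are hypothetical, and in the real system the overrides would come from the interpreter script rather than from a Python dictionary.

```python
from copy import deepcopy

# Each page template ships with a complete default configuration.
DEFAULT_PAGES = {
    "camera_setup": {
        "title": "Camera setup",
        "icon": "camera.png",
        "min_access_level": 2,
        "controls": ["exposure", "gain", "trigger_mode"],
    },
}


def configure_page(template_name, overrides=None):
    """Return the default page, with selected properties replaced.

    Only properties that the template already defines can be changed.
    This is where the line is drawn: the script configures a template,
    it does not program new GUI functionality.
    """
    page = deepcopy(DEFAULT_PAGES[template_name])
    for key, value in (overrides or {}).items():
        if key not in page:
            raise KeyError(f"template '{template_name}' has no property '{key}'")
        page[key] = value
    return page


# Using the default solution unchanged:
default_page = configure_page("camera_setup")

# Adapting the layout and the access rights, nothing else:
custom_page = configure_page(
    "camera_setup",
    {"title": "Line scan camera", "min_access_level": 3},
)
```

Rejecting unknown properties is a deliberate choice in this sketch: it keeps the functional behaviour inside the tested template while leaving the visual side open to the developer.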

At a later stage, I should also implement an interface to the WCF world to allow user-defined GUIs with almost no restrictions.

Do you think that this approach would be suitable for the fast design of flexible GUIs for machine vision systems?

Much effort has gone into standardizing ‘Machine Vision systems’. Standards make things exchangeable and more reliable, and they help to reduce risks in a development process:
  • Vision hardware (computers, lenses, sensors, cables, robotics) is standardized to a large extent
  • Machines are constructed with machine building standards
  • Image processing libraries and tool boxes (see e.g. http://www.machine-vision.eu/22-1-Machine+Vision+Libraries.html) help to develop the software with well tested routines.
  • The software uses compiler language standards.

So many standards, and yet where do the risks come from? First of all, we do not know at the beginning: will it work or not, can it work at all? Designing a machine vision system is a process of at least three steps:

  • layout decisions based on first tests,
  • programming and testing an application and
  • testing with real data.

Unfortunately, tests on ‘real data’ can often only be done with ‘real systems’. Some processes work excellently in a laboratory but may not be applicable in a real-time or real-life application. How much we would like to get hard results as early as possible in the development process! Which direction should we take in order to achieve this as far as possible? Here are some considerations:

  • Rapid prototyping can help. Often more than one design has to be tested. Software development is an essential part of the unknown factor. So, shortening the software development process could be a solution.
  • The application has to be programmed and tested. Again this is about software development. Imagine if the result of the ‘rapid prototyping’ were already the solution.
  • Testing and improvement: test results feed back into the design and the application. Using standards makes it easier to change and improve components in the design (e.g. sensors, computers, lenses,…). The standards used in software development, however, are so far removed from the task of solving the actual problem that the software slows down the development cycle.
  • At a certain moment, the software developer gets the feeling that he is not the first person to write this code. Moreover, he asks himself why many things that are needed in almost every machine vision solution have to be programmed again.

Conclusion: software development is probably the riskiest part of machine vision development. Maybe this is not new. But it tells us where we have to improve. How can we make software development easier, faster, more reliable? How can we avoid re-inventing the software components of machine vision systems again and again?

Toolboxes

Pre-built and tested software components help to set up an application. The remaining part of the machine vision application is left to the software developer.

Integrated development environments (IDEs)
IDEs help to combine the tools from a toolbox with a scripting language. Some of them can run the application directly from the developed scripts. Many of them can generate code in a high-level programming language, which makes it possible to integrate the imaging process into a machine vision application. This is really fast prototyping!

Unfortunately, applications in a script loop do not run as fast as they could, and the application design is limited to the tools of the IDE. When the generated code in a high-level programming language is used, the developer has a good solution for the image processing, but writing a machine vision solution still involves much, much more.

Machine vision and image processing

What is the difference between a machine vision application and a vision process? Here I want to give an overview of the components which, besides image processing, are usually needed in a machine vision system:

  • access control with different user access levels
  • a multi-language user interface which controls the machine vision system
  • an alarm system which handles software messages and incoming and outgoing information on an information network
  • one machine vision system running transparently on a network of computers
  • synchronization of asynchronously measured sensor data with moving material
  • easy access and changes of application parameters depending on the purpose of the system
  • easy access and changes in the hardware configuration
  • easy access and changes in the classification and image processing
  • switching between multiple configurations and user interfaces for different tasks running on one system
  • metric units
  • default settings, limitations and thresholds
  • multi processing
  • easy linking to actuators, robot arms, or manipulating devices
  • highest throughput
  • offline development and improvement on measured and stored sensor data
  • remote control
  • report generation and report viewers
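One item from this list, the synchronization of asynchronously measured sensor data with moving material, shows how much lies outside image processing. The following Python sketch is only an illustration under assumptions of my own: the material position is sampled by an encoder as (time, position) pairs, the sensor reading carries its own timestamp, and the position between two encoder samples is found by linear interpolation. The function name `position_at` and the sample values are made up.

```python
from bisect import bisect_right


def position_at(timestamp, encoder_samples):
    """Material position (mm) at the moment a sensor reading was taken.

    encoder_samples is a list of (time_s, position_mm) pairs with
    strictly increasing times, recorded independently of the sensor.
    """
    if not encoder_samples:
        raise ValueError("no encoder samples recorded")
    times = [t for t, _ in encoder_samples]
    if not times[0] <= timestamp <= times[-1]:
        raise ValueError("timestamp outside the recorded encoder range")
    i = bisect_right(times, timestamp)
    if i == len(times):
        # The reading coincides with the last encoder sample.
        return encoder_samples[-1][1]
    (t0, p0), (t1, p1) = encoder_samples[i - 1], encoder_samples[i]
    # Linear interpolation between the two neighbouring encoder samples.
    return p0 + (p1 - p0) * (timestamp - t0) / (t1 - t0)


# The conveyor accelerates between the second and third sample.
samples = [(0.0, 0.0), (1.0, 100.0), (2.0, 300.0)]
first = position_at(0.5, samples)   # halfway through the slow segment
second = position_at(1.5, samples)  # halfway through the fast segment
```

Every project needs something like this, and it has nothing to do with the image processing library in use.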

The picture seems to be clear: image processing is really more than just buying a sensor, and machine vision is really much more than image processing. Although many components (especially the hardware and imaging libraries) are ready for use, software development as a whole is still poorly supported.

Reducing the risk of software development will directly affect the overall risks. Because we can list the targets of such a general machine vision system, we can build it. The biggest difference from a multi-purpose IDE is that we know what the system has to be able to do. Programming turns into configuring the system.

Rapid prototyping and application development could grow together. Solving software problems turns into focusing on the purpose of the application. Software risks are strongly reduced and the overall risks go down.

Please tell me what else a general-purpose machine vision system should do!

Start

 
Today I am starting my blog. I would like to blog about some aspects of machine vision. Comments are very welcome. Somebody told me: bad English is the most common technical language. So, please excuse my grammar and spelling. My topics here are Industrial Machine Vision and Machine Intelligence.
 
Industrial machine vision
There are many advertising and/or help blogs on this topic. Some more general ideas I found in

I think there are hundreds or even thousands of programmers who are developing solutions in machine vision projects. I have a lot of experience with projects that ran out of time. Sometimes my customers’ expectations were far from the agreements we had made. I am sure that I am not alone in this experience. But I want to think further: how can we better bridge this gap between expectations and reality in machine vision?

At this moment I am programming a ‘machine vision process’ which is designed to be a framework for new machine vision projects. Within this framework everything is configured by a scripting language (Tcl). This enables you to adapt the machine vision process in many respects, e.g. sensors, classification, access rights and user interface, all without any compiler.

My intention is to help others reduce the risks, development time and costs of the ‘software part’ of machine vision projects. Besides sharing my own experiences in machine vision, I would like to hear from you about the bottlenecks and how you solved them.

It seems to me that the highest hurdles are

  • the automation of the process,
  • the user interface 
  • the processing speed
  • the development time

On the other hand, there are many good tools for image processing and good sensors available. These cannot be the reason why projects in machine vision so easily become much more laborious than expected at the beginning. Therefore I started the category ‘Industrial Machine Vision’. Please tell me about your experiences!

Machine Intelligence

Although I don’t know what intelligence is, I want to use the term ‘Machine Intelligence’ to emphasize that we really cannot know whether machines can become intelligent or not. It seems to me that, with enough computing power, we can always get around whatever is tested in so-called intelligence tests. So I like to ask from another point of view: is our brain a Turing machine or not? It seems to me to be part of my intelligence (or imperfection) that I get bored very quickly when something happens two or more times. I promptly turn my attention towards something more interesting.

A computer, however, is optimized for only one purpose: same input, same output (besides being a fast Turing machine). This is what we like about machines: they are not curious. When my computer gets intelligent, it will be time to buy a new one. I think that unpredictability of behaviour is an important part of intelligence. If machines can do what our brains do, then our brains are machines. No more and no less. Fortunately, this wouldn’t change anything. Except that machine vision could take over some more tasks which still have to be done by people. The ethics of that are maybe good for another blog.

Ok, machines can do many very astonishing things. I tried to design an algebraic model of the brain to get more insight into the difference between signal processing and decision making in a network of spiking neurons. Unfortunately, I found only an operator which describes the procedure of signal processing. It seems to me that the operator itself can be simulated with a Turing machine. The meaning of spikes, however, is hidden in their ‘rhythm’, and this rhythm is not a well-defined state. It can change and develop very fast. The operator can also adapt to a spiking signal input to a very large extent. The state (time and location) of incoming spikes is even the same type of vector as is used to describe the network operator itself. Unfortunately, I couldn’t find any obvious equation or law of conservation in this model until now.

I would like to discuss this algebraic approach and its meaning for machine vision with you in the category ‘Machine intelligence’.

The body will come soon.

First, there is a task which seems to be solvable with a camera and a computer in a minute. You could even carry out this task yourself at once without thinking about how to do it. One year later, the application is still not working as you expected at the beginning. Why?

Everybody who works with machine vision knows that human perception and a machine vision system cannot be compared. The reason may be that we are somewhat intelligent and that we have some experience in looking and doing. Obviously, the following holds:

[1] Computers do exactly what you tell them to do

For 40 years, a great deal of effort has gone into improving the repeatability of computers. If the input is the same, the output should also be the same. Good or bad? It is a fact. Sometimes we forget that we also have to accept that computers don’t do what we didn’t tell them to do.

[2] The data stream of a camera contains almost no information relevant to the task that you want to solve

Sometimes megabytes of data are measured per second, while only a few decisions per second are made. Therefore, it should be possible to compress the data stream from megabytes to bytes without losing information relevant to the task of the system. In fact, we call this process of separating the data into irrelevant data (99.999%) and relevant data (0.001%) ‘image processing’.
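A quick calculation shows the order of magnitude. All numbers in this Python sketch are assumptions chosen for illustration (an 8-bit camera with 1280 x 1024 pixels at 30 frames per second, and 10 decisions per second of one byte each); they are not measurements from a real system.

```python
# Assumed camera: 1280 x 1024 pixels, 1 byte per pixel, 30 frames per second.
width, height = 1280, 1024
bytes_per_pixel = 1
frames_per_second = 30

# Assumed output: 10 decisions per second, 1 byte each (e.g. a class code).
decisions_per_second = 10
bytes_per_decision = 1

input_rate = width * height * bytes_per_pixel * frames_per_second
output_rate = decisions_per_second * bytes_per_decision

relevant_fraction = output_rate / input_rate
reduction_factor = input_rate / output_rate

print(f"input:  {input_rate / 1e6:.1f} MB/s")
print(f"output: {output_rate} bytes/s")
print(f"relevant share of the data stream: {relevant_fraction:.7%}")
print(f"reduction factor: about {reduction_factor:,.0f} to 1")
```

With these assumed numbers the camera delivers roughly 39 MB per second, and the share of the data stream that ends up in a decision is far below one thousandth of a percent.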

[3] Sensor data should contain as much relevant data as possible.

While it is possible to reduce measured data with regard to a certain task without reducing the relevant information, it is not possible to generate information that is not part of the sensor data. Lost is lost. Furthermore, it is very important for a good result that the measured data contain enough contrast between relevant and irrelevant information. So, the choice of sensors, sensor type, lighting, lenses and digitizing (let’s call it the ‘optical path’) determines how much effort is needed to get a good result. Otherwise, image processing will follow the rule: garbage in, garbage out.

What do sensors see? Reality shows that sensors see almost nothing compared with the human eye. Let us hope that sensors, lenses and lighting are chosen in such a way that at least some relevant data are part of the measured data. Focusing on relevant details in the image data, we want to make a decision based on the processed data. So we write an algorithm for a computer. Everything we tell the computer to do, it will do (see [1]). However, insignificant changes in the measured data (changes in lighting, background or measured objects) should not affect the result. So we also tell the computer what should not affect the result. The computer, however, is blind to whatever we didn’t tell it explicitly (see [1] again). How can we ever tell a computer everything that will happen in the future?
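A tiny example may illustrate what ‘telling the computer what should not affect the result’ means. The Python sketch below uses made-up pixel values and two hypothetical functions: one counts bright pixels against a fixed grey value, the other against a threshold relative to the mean of the image row. It only covers one case, a uniform change in brightness; every other disturbance would have to be told to the computer separately.

```python
def count_bright_fixed(pixels, threshold=128):
    """Count pixels brighter than a fixed grey value."""
    return sum(1 for p in pixels if p > threshold)


def count_bright_relative(pixels, offset=40):
    """Count pixels that are clearly brighter than the mean of the row.

    This tells the computer explicitly that the overall brightness
    level should not affect the result.
    """
    mean = sum(pixels) / len(pixels)
    return sum(1 for p in pixels if p > mean + offset)


# One bright object (150) on a dark background, values are invented.
row = [50, 52, 49, 150, 51, 50]
# The same scene after the lighting became brighter by 90 grey values.
brighter_row = [p + 90 for p in row]

fixed_before = count_bright_fixed(row)
fixed_after = count_bright_fixed(brighter_row)
relative_before = count_bright_relative(row)
relative_after = count_bright_relative(brighter_row)
```

With the fixed threshold the count jumps from one object pixel to the whole row as soon as the lighting changes, while the relative threshold still finds exactly one. The computer did what it was told in both cases.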

Sensors seem to see almost nothing, and image processing is short-sighted, ignorant and limited to what we tell the system to do.

I would like to ask you this:

How should we make development of machine vision systems faster, easier, and more reliable?

How can we reduce the risks in a machine vision project?