Understand challenges and how to use 3D cameras for bin-picking, piece picking and machine tending.
Find all the resources listed at the end of the page.
So my name is Henrik Schumann-Olsen and I'm CTO of Zivid. In this webinar I'm going to discuss the role of the 3D camera in pick and place operations. So what is Zivid? Here is a short presentation of our company.
You can see our product portfolio. We are a so-called pure-play 3D camera company. Our product is the Zivid One+, which comes in three variants: small, medium, and large. And then, recently released, is our newest camera, the Zivid Two.
And then, what do I mean by pure-play camera company? I mean that we are focusing entirely on the cameras. That means our customers are system integrators or OEMs that create the full picking cells, and we provide them with the best camera. So what does that mean for us? It means that we focus on the quality of the cameras and the image quality that they produce.
So that's our entire focus, meaning we put a lot of effort into creating the best point clouds out there. That is something we want to look at in this presentation, and what it means for your applications.
So if we look at the fundamentals of a pick and place operation, we have three important aspects. The first is detection. Obviously, we want to detect all the objects that are relevant for your application. That depends on what type of application it is, but you want to detect everything. And then it has to be accurate and reliable, regardless of the type of objects, whether they differ in shape, size, material, or surface.
And when you have detected the objects, you want to be able to pick them, and you want to do that accurately as well; you don't want a crash. And when you take the objects, you want to be sure that you are not accidentally dropping them. And lastly, depending on your application, there can be different requirements for placing the object, but regardless, you want to be able to place an object without crashing or destroying the objects in general.
So let's look at the 3D camera and what role the 3D camera plays in all of this. Obviously, it plays a central role in the detection. You need a 3D camera to see the objects, and then you also need the vision algorithm that can detect certain forms. That could for instance be CAD-based, or it could be AI or similar. So you need these things.
But then there is something that's underestimated in the industry, and that is the importance of the 3D camera in regards to the pick and place operations as well. I'm going to cover this in more detail in the presentation, and I think it will give a lot of good insights for everyone. So let's look at some sample pick and place applications.
So we have piece picking, and that is in a warehouse logistics setting, where a robot is picking various items, individual SKUs that are called for in automated order fulfillment. And then in the manufacturing environment, we have bin picking, which is in general a bin randomly filled with objects, and the robot is picking each individual object, maybe at the entry of the factory, so it's placing it on feeders, conveyors, sorters, or similar. And then, very closely related to bin picking, we have machine tending, which is the same kind of picking operation, but then you pick to load or unload machines, and that for instance could be a CNC machine. You want to place the object inside that machine for processing, and obviously then you need more accuracy in the placing operation.
So if we deep dive a little more into these picking applications, what are the main challenges? In bin picking, one of the challenges is emptying the whole bin. The objects can be located differently and they're hard to image; maybe they're shiny, so that could be problematic.
And in piece picking it's all about SKU coverage. By SKU coverage we mean the ability to reliably detect and pick all types of objects that you meet in the warehouse. And if you ask, is that possible? The answer is no. Today, the state of the art unfortunately is not good enough to pick all the objects, so the SKU coverage is quite low.
An example of that is, you can think now during COVID, they have hired almost 200,000 people, and now there is around a million people picking in the warehouses out there. So in principle, humans are still doing what we are trying to solve with robots, so there is lots of work to do. So let's look into some of the main challenges in that.
So this is a typical image presented to a robot. This is a bin with lots of consumer goods, so this could be in a warehouse. And this is from a low-end stereo camera that's widely used in the industry. So this is what we present to the robots out there.
When you look at this image, and this is a 3D image, you clearly understand that it's not particularly easy. Not even for us as humans is this easy. I mean, it's very hard to see. You can see some of the objects, yes, but most of it is just a blur. So we want to improve the camera quality.
So this image here is from a high-end stereo camera that is also used in a lot of piece picking applications out there. Of course, what you see here is better than the previous one. It's slower: it's 400 milliseconds compared to 66. But still there are problems.
So let's, for instance, zoom in on this part here. If I look at that detailed image and I ask you, can you pick one of these objects? Put yourself in the robot's position. Would you be able to pick this? I think you all know the answer. So keep that in mind, and let's jump over to manufacturing.
So this is an image of some metal parts that are OK. They are not jammed together in the bin and they are actually singulated. But still, even when they are singulated, this image shows the problems. This is the same camera, the high-end stereo camera.
You can see here that there are groups of objects that blend together and become quite hard to distinguish. If this was all in a bin, you can imagine how that would be. Some are easier because they are better separated. So some are possible, but there are still challenges.
In manufacturing, many use lasers. So this is from a high-end laser structured light system and the quality improves but still the resolution is low. And when the objects are quite shiny, you get artifacts and it's almost impossible to understand what's going on here. And cylinders here, they also start to get some reflection artifacts.
The speed here is also slower, though still a decent speed. And of course, if you increase this to a much longer acquisition, then you could get better data. But at some point it will be too slow for the cycle times needed in this industry.
So this is what we work on at Zivid. We want to give the robots more human-like vision, because in order for the robots to take over human tasks, they need to see more like humans. So we address that in all our designs.
In regards to detection, which was one of the three things we looked at, we believe that you need to be able to see all the details and you need to see with confidence. So that means you need to be able to see all the tiny, fine details in order to separate all of these objects from each other. When they are clumped together in the bin, if you can't distinguish it then it just becomes a big smudge and it's hard to put your fingers in there and grab something.
And then in regards to SKU coverage, you need to be able to see all types of objects, a wide variety, regardless of color, surface quality, and things like that. And then, of course, as many of you probably have seen, when the objects are shiny, chrome, reflective, mirror-type objects, it becomes a huge mess. The point clouds get lots of noise, and it's typically very hard or impossible to pick from them, so that needs to be addressed as well.
That's what we have done at Zivid. We have lots of different features and specifications to handle all of this. I will just look at some point clouds to show you the effect.
So let's recap the point cloud we saw from the consumer bin, where we zoomed in on that part and I asked you to pick one individual object. Let's look at how it should be.
This is a point cloud from Zivid Two and look at the speed. This one took 400 milliseconds. This one takes 100 milliseconds, that's four times faster and the quality is just wildly superior. And if I'm the robot and I get this image, I can actually pick objects more reliably. And that's what's needed, that's one of the important things in order to get more robots out there doing more advanced picking.
And this is the whole bin, and as you can see, it looks like an image from a cell phone. It's as we want it to be: it's clear, and this is a full 3D point cloud. Let's take a more detailed look at shiny and reflective objects.
So this is a typical image. There are a lot of cylinders here clumped together. And because of reflections and problems in the imaging, there are lots of artifacts and noise. You see that there are points here, but detecting an individual object here is almost impossible because it's just noisy. And this is a typical thing you would see from most sensors that don't deal with these things.
So at Zivid, we have something we call artifact reduction technology. That covers a lot of things, but what it does is that the sensor tries to look at what's underneath here, to understand the noise and get the real surfaces from beneath, and you can see the results. Suddenly, it is possible to detect these shiny objects, and you can actually imagine that you could go in and pick them. And that is a huge thing in picking shiny objects in manufacturing.
And if we look at the high-end laser structured light system we looked at previously, with all the different artifacts we saw there and the low resolution, that was 500 milliseconds. And this is 300 milliseconds with Zivid Two. As you can see, the data quality is superior.
We talked about detection, now let's talk about picking and placing. In any system there are uncertainties, and the problem here is the representation of reality. The system, the camera, tells the robot: this is the reality.
This is where the objects are located, but there might be errors in that representation. That means that the picture the robot gets is not true, it's slightly off. And that error creates lots of problems, because when the robot, directed by the camera, goes in to pick objects, it might crash or it might mispick, and that happens a lot. And it's very hard; even advanced system integrators struggle with these things.
So what we have done with the Zivid Two is that we have minimized this error. This is a groundbreaking performance of below 0.2% dimension trueness error. And having that number is not enough on its own to help you: the dimension trueness error must also not grow or become worse under temperature variations and under mechanical stress. So if the camera gets shocks or vibrations, it needs to be stable, or else things will start to degrade over time.
So those two things we have taken care of in the Zivid Two camera. And here is an example of a point cloud where the trueness error is below 0.2%, and then it is as it should be. You can do detection and you can do picking in that point cloud.
If there is trueness error, that means there is some scaling, rotation, and translation error. Then you see that the red points here are what the robot actually sees, but the gray is the correct one. Of course this is exaggerated now, but it shows you the point: if I tried to pick this object, I would go to the red here, but the object is actually located here. So then you get some problems: you might crash, or you mispick, or you don't pick as accurately as you want. If I go back and forth here, you see the point. It's very, very important, and it's one of these hidden things: you think maybe it's your detection algorithm that makes you crash, but it's actually the trueness of the camera that is wrong. So it's important, and we have solved it.
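To make the effect of a trueness error concrete, here is a minimal sketch in Python of how a small scaling, rotation, and translation error moves the point the robot is told to pick. The function names, the example pick point, and the error values are all hypothetical and chosen only for illustration; they are not taken from any Zivid API or datasheet.

```python
import math

def apply_trueness_error(point, scale=1.0, yaw_deg=0.0, translation=(0.0, 0.0, 0.0)):
    """Return the point as a camera with the given errors would report it.

    point       -- true (x, y, z) position in millimetres, camera frame
    scale       -- scaling error factor (1.0 means no scaling error)
    yaw_deg     -- rotation error about the camera z-axis, in degrees
    translation -- translation error (tx, ty, tz) in millimetres
    """
    x, y, z = point
    yaw = math.radians(yaw_deg)
    # Rotate about the z-axis, then scale, then translate.
    xr = math.cos(yaw) * x - math.sin(yaw) * y
    yr = math.sin(yaw) * x + math.cos(yaw) * y
    tx, ty, tz = translation
    return (scale * xr + tx, scale * yr + ty, scale * z + tz)

def pick_offset_mm(true_point, reported_point):
    """Distance between where the object is and where the robot will go."""
    return math.dist(true_point, reported_point)

# Hypothetical pick point roughly 700 mm in front of the camera.
true_pick = (100.0, 50.0, 700.0)

# Assumed example errors: a pure 0.2% scaling error, and a larger combined error.
scale_only = apply_trueness_error(true_pick, scale=1.002)
combined = apply_trueness_error(
    true_pick, scale=1.01, yaw_deg=0.5, translation=(1.0, 0.0, 0.0)
)

print(f"scale-only offset: {pick_offset_mm(true_pick, scale_only):.2f} mm")
print(f"combined offset:   {pick_offset_mm(true_pick, combined):.2f} mm")
```

With a pure scaling error the offset grows in proportion to the distance from the camera, which is one reason the same camera can appear to pick well near the centre of the bin and poorly toward the far corners.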
So the last thing I want to talk about is the opportunities we have seen, that our customers have asked for, and the trend we see in the market now, which is that more and more can be solved with on-arm solutions. On-arm solutions give opportunities in regards to flexibility in what the system can perform. If you want to pick from several bins, or you want to use the same system to both pick and place, that can be done. And of course there are the improvements you can achieve with an on-arm solution.
If you take images from different perspectives and different positions, the results can be vastly improved, especially in regards to shiny and reflective objects, where the positioning of the camera can be the whole thing actually. So our newest camera is very small and lightweight. Mounted on the robot, it doesn't restrict the robot motion, so you can continue to do advanced applications while the camera is on-arm mounted, as you can see in the picture here.
So to summarize, universal picking is not solved, and that is due to historical limitations in the vision systems and 3D cameras. Of course there are a lot of other things as well, but that's the part we are addressing. And detection is not the same as picking: you need to be able to detect correctly, but you also need to have a true representation so that the picking does not fail, and the same goes for placing. And then at the end, the Zivid Two is a remarkable 3D camera, and it gives you great flexibility without any compromise on image quality. So that was all!
If you have more questions related to successful random bin-picking, just fill in your email, and we'll get in touch.