Vision innovations for picking in pick and place robotics.
Raman, let's kick things off with you.
All right, good afternoon.
My name is Raman Sharma.
I'm proud to represent my company Zivid at Automate this year. I'm from Austin, Texas, and I'm responsible for sales and marketing in the Americas for Zivid. We put our entire focus into creating human-like vision for robots by providing the best point clouds.
Over the next 15 minutes or so, I'll explain why having the best-quality point clouds is important and what they mean for your application.
But first, I just wanted to highlight some market numbers to put our use case in context.
So there were over 440,000 new robots shipped in 2019 in manufacturing and logistics, and close to one million are expected to ship in 2025.
That gives us a CAGR of over 16% overall, and a CAGR of over 22% for logistics unit shipments from 2021 to 2025.
And some of that can definitely be attributed to COVID.
So even though the market numbers are promising, there's one key problem, which is that the majority of robots today are blind.
And what I mean by that is that the robots have to operate in a very structured environment.
They can pick something up, but everything has to be programmed into the system.
So if the objects are dynamically situated or dynamically structured, that can cause problems, and all of that limits the types of tasks the robots can perform and limits their productivity and flexibility.
Let's look at the primary use cases in industrial robotic pick and place.
There are three fundamental aspects to picking and placing operations.
The first one is all about detection.
We obviously want to detect all the objects that are relevant for the application and then that operation has to be accurate and reliable, regardless of the types of objects, whether they be different shapes, sizes, materials or surfaces.
When the objects have been detected, the next fundamental operation is to pick them. Accuracy is, again, one of the key things we need to watch out for, because in the end, we don't want the robot to crash.
Lastly, when you have picked an object, you want to be sure not to drop it. Depending on your application, there can be different requirements for placing the object.
But regardless of the application, you want to place an object without crashing or destroying the objects or the robot, for that matter.
So let's take a look at 3D cameras and the role they play in all of these: detection, picking and placing.
Obviously, 3D cameras play a central role in detection; it goes without saying that you need a 3D camera to see the object.
You also need a vision algorithm that can detect certain forms, and that could be CAD-based, for example, or an AI algorithm or something similar.
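As a rough illustration of that CAD-free route, here is a minimal detection sketch using the open-source Open3D library: remove the dominant plane, then cluster what remains into object candidates. The file name and all parameter values are placeholder assumptions, not settings from any particular product.

```python
import numpy as np
import open3d as o3d

# Load a captured point cloud (the file name is a placeholder).
pcd = o3d.io.read_point_cloud("bin_scan.ply")

# Remove the dominant plane, e.g. the bin floor, with RANSAC.
_, floor_idx = pcd.segment_plane(distance_threshold=0.005,
                                 ransac_n=3, num_iterations=1000)
objects = pcd.select_by_index(floor_idx, invert=True)

# Cluster the remaining points into per-object candidates with DBSCAN.
labels = np.array(objects.cluster_dbscan(eps=0.01, min_points=50))
print(f"detected {labels.max() + 1} object candidates")
```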
But then there's something that's underappreciated in the industry, and that is the importance of the 3D camera to the picking and placing operations as well.
I'll cover that shortly.
But let's take a look at some example pick-and-place applications.
Example applications include piece picking in a warehouse or logistics setting, where the robot picks various items or individual SKUs as they are called for, in what I call automated order fulfillment.
I think we're all familiar with that.
Then in a manufacturing environment, there's bin picking where a bin is filled randomly with objects and the robot picks each individual object, maybe at the entry of a factory.
So it's placing them on feeders, conveyors, sorters or something like that.
Closely related to bin picking, we have machine tending. It's the same kind of picking operation, but you load or unload machines such as CNC machines; you want to place the object inside the machine for processing.
And obviously you then want more accuracy in the placing operation.
Each of these example applications has distinct challenges.
So let's dig a little deeper into the main challenges. In bin picking, one of the challenges is emptying the entire bin.
The objects may be located in different positions, and sometimes they are hard to image.
Maybe they're shiny as well, which makes it even more problematic.
In piece picking, it's all about SKU coverage, and by that I mean the ability to reliably detect and pick all the types of objects you encounter in a warehouse.
And today, in my opinion, this is just not possible 100% of the time.
The state-of-the-art technology is just not good enough to pick all of the objects all of the time.
So the SKU coverage is quite low.
Let me illustrate what I'm talking about here.
Here's an example of the type of data that's presented to robots today.
This is the bin of consumer goods that could be in a warehouse.
The image is from a low-end stereo camera that is widely used in the industry.
When you look at this image, you can see why I say that it's not easy for a robot to pick objects all the time.
You certainly can see some of the objects, but most of the bin is just a blur.
We definitely need to improve the quality of the point clouds that we present to a robot.
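One crude way to put a number on "most of the bin is just a blur" is the fill rate of an organized point cloud, since missing measurements typically come back as NaN. This is a minimal sketch with synthetic data; the array shape and the NaN convention are assumptions.

```python
import numpy as np

def fill_rate(xyz: np.ndarray) -> float:
    """Fraction of pixels in an organized HxWx3 point cloud that
    carry a valid 3D measurement (missing points stored as NaN)."""
    valid = ~np.isnan(xyz).any(axis=-1)
    return float(valid.mean())

# Synthetic example: a cloud where half of the bin failed to image.
xyz = np.random.rand(480, 640, 3)
xyz[::2] = np.nan                          # simulate missing data
print(f"fill rate: {fill_rate(xyz):.0%}")  # -> 50%
```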
This image is from a higher-end stereo camera that is also used in several piece-picking applications.
The image quality is better than the previous one; however, the image capture time is much slower.
For the point cloud I showed on the previous slide, the image capture time is 66 milliseconds.
And this one is 400 milliseconds.
There are still challenges.
For example, let's zoom in to the center of the bin, which I've circled in this red oval.
If you look at this zoomed-in detail and put yourself in the robot's position, could you pick one of these objects?
I think you all know the answer.
So let's keep that in mind while we jump over to manufacturing.
This is an image of some metal parts.
And the reason I say manufacturing is because in manufacturing applications you often find shiny metal parts as the objects to be picked or placed.
This image is just OK; the objects are not jammed together in a bin, they're actually singulated, but even then there are problems.
You can see some of the problems with this image: at the top of the screen, the objects seem to blend together.
And they're hard to distinguish.
If all these objects were crammed together in a bin, you can imagine how much of a problem that would cause for the vision system.
In manufacturing settings many developers choose to use lasers.
This image is from a high-end laser structured-light system, and the quality is definitely improved.
But the resolution is low.
When the objects are shiny, you get artifacts and it's almost impossible to understand what's going on because of the noise.
Also, note the slower speed here, 500 milliseconds for image capture in this example.
At some point the speed will affect the required cycle time for the application.
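To make the cycle-time point concrete, here is a back-of-the-envelope sketch. Only the capture times come from the examples above; the detection and robot-motion times are assumed for illustration, and real cells often overlap imaging with robot motion.

```python
# Rough picks-per-minute estimate for a serial capture -> detect -> move
# cycle. detect_s and motion_s are assumed values, not measured ones.
def picks_per_minute(capture_s, detect_s=0.3, motion_s=1.5):
    return 60.0 / (capture_s + detect_s + motion_s)

for label, capture_s in [("66 ms capture", 0.066),
                         ("400 ms capture", 0.400),
                         ("500 ms capture", 0.500)]:
    print(f"{label}: {picks_per_minute(capture_s):.1f} picks/min")
```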
So at my company, we want to give robots human-like vision.
We believe that in order for robots to take over human tasks, they need to see more like humans.
When it comes to object detection, we believe that you need to be able to see all the details and you need to be able to see with confidence.
To us, seeing with confidence means being able to see all the tiny, fine details in order to separate the objects from one another.
When objects are clumped together in a bin and cannot be distinguished, the scene becomes a big mess, and it's hard to grab the right object.
In regards to SKU coverage, you need to be able to see all different types of objects, a wide variety, regardless of color, surface and material.
Of course, then there are the shiny objects, the ones with a chrome finish that are highly reflective and make life very difficult for machine vision systems.
The resulting point clouds have a lot of noise.
They're typically very hard or impossible for a robot to use to do what it needs to do.
They say that a picture is worth a thousand words, so here's what we can do with technology that has been designed from the ground up to solve the problems and challenges of robotic picking and placing.
Just as a reminder to recap, we saw this point cloud of a bin of consumer goods just a few minutes ago and we zoomed in.
I asked you to pick one individual object.
It is undoubtedly difficult to do, so let's look at what these objects can look like if we're able to provide robots with human-like vision.
This is a point cloud from the Zivid Two 3D camera. The picture speaks for itself; also note the speed.
100 milliseconds as compared to 400 milliseconds.
If I'm the robot and I get this image, I can actually pick up the objects more reliably.
I feel this is what's needed in the market; it's one of the most important things in order to get more robots out there doing more advanced picking.
Just for completeness, this is an image of the entire bin with that same camera, Zivid Two.
It looks like an image from your smartphone.
It's clear, it's crisp, and this is how it should be.
Here's an image of typical industrial objects; I shared this earlier as well.
There's noise and there are artifacts.
To detect individual objects is almost impossible because of the noise.
With Zivid's Artifact Reduction Technology (ART), you can overcome the noise and get to the real surfaces of the objects. Now you can detect these shiny objects, and you can actually imagine picking them with a robot.
So that brings me to the second pillar.
We already talked about detection.
Now let's talk about picking and placing.
Any system has uncertainties.
The problem in our market is the representation of reality, where the camera tells the robot what reality is.
The second pillar of successful picking and placing, as I mentioned, is picking with confidence.
And that's all about more accurate picks, finer manipulation, and fewer mis-picks and crashes.
This is enabled by very high camera trueness.
Furthermore, the trueness error has to stay stable across temperature variations and mechanical stress.
If there's trueness error, that means there are scaling, rotation and translation errors; you can see that in the following slide.
The red points are what the robot actually sees, but the grey points are the correct ones.
These errors can result in mispicks and crashes.
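To see why trueness matters, you can model the error as a small similarity transform (scale, rotation, translation) between what the camera reports and reality. This numpy sketch uses assumed error magnitudes, not measured values for any particular camera; note how the displacement grows toward the bin edge.

```python
import numpy as np

# Model trueness error as: measured = scale * R @ true + t.
# The error magnitudes below are assumptions for illustration.
scale = 1.002                        # 0.2% scaling error
theta = np.deg2rad(0.3)              # 0.3 degree rotation error
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
t = np.array([0.0005, 0.0, 0.0])     # 0.5 mm translation error

true_pts = np.array([[0.0, 0.0, 0.7],    # bin centre, 0.7 m away
                     [0.3, 0.2, 0.7]])   # bin corner
measured = scale * true_pts @ R.T + t
err_mm = np.linalg.norm(measured - true_pts, axis=1) * 1000
print(err_mm)  # roughly [1.5, 2.4] -> worse toward the bin edge
```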
Let me toggle back and forth between a couple of slides so you can see the importance of trueness.
This is trueness error in the point cloud, and this is true-to-reality 3D vision.
So just go back and forth one more time.
You can see the error if you focus on the top two metal objects that are stuck to the side of the bin.
This is the error.
And this is reality.
So you can imagine the robot system crashing into the edge of the bin because of a mis-pick.
The last thing I want to talk about is the opportunities we've seen for on-arm mounting solutions.
On-arm mounting gives you flexibility in what the system can perform.
If you want to pick from several bins or you want to use the same system to both pick and place, then that can be done with on-arm mounting.
On-arm mounting also helps with getting better images by taking images from different perspectives and positions.
And we do this as humans.
If there's something really shiny and you're trying to get a good image of it with your eyes, you tilt your head one way.
You get a better image; you tilt another way.
And now you complete the image in your brain.
We want to do the same thing with our camera mounted on a robot.
This is especially relevant for shiny and reflective objects.
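Here is a minimal sketch of what on-arm multi-view capture can look like in software, again using Open3D: each capture is transformed into a common robot-base frame and the clouds are merged. The 4x4 poses are assumed to come from robot kinematics plus a hand-eye calibration, and the voxel size is a placeholder.

```python
import copy
import open3d as o3d

def merge_views(clouds, cam_to_base_poses, voxel_size=0.002):
    """Bring each on-arm capture into the robot-base frame and merge.

    clouds            -- list of open3d.geometry.PointCloud captures
    cam_to_base_poses -- matching 4x4 camera-to-base transforms
                         (robot kinematics + hand-eye calibration)
    """
    merged = o3d.geometry.PointCloud()
    for cloud, pose in zip(clouds, cam_to_base_poses):
        view = copy.deepcopy(cloud)  # keep the original capture intact
        view.transform(pose)         # move points into the base frame
        merged += view
    # Downsample so overlapping surfaces don't double-count points.
    return merged.voxel_down_sample(voxel_size)
```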
So in summary, universal picking is not yet completely solved, and that's partly due to the legacy limitations of traditional vision systems, including 3D ones.
This is something that the market, including my company, is addressing.
Also, detection is not the same as picking.
You need to be able to detect correctly, but you also need a true representation so the picking doesn't fail.
All right, that's all I had.
Thank you for attending.
I appreciate your time.