Snapshot Serengeti Talk

false consensus?

  • Ettina by Ettina

    I just thought of a possible problem with going by what people agree on here. It may be that people, in some cases, guess in a patterned way, and that'll cause everyone to guess the same thing.

    For example, let's say you take a nighttime shot. A wildebeest is readily visible, as well as four indistinct blobs behind him. In reality, those blobs are three wildebeest and a zebra, but there really isn't enough detail to see anything other than that they're vaguely large-ungulate-shaped.

    A lot of people will go 'that's a wildebeest, can't make out what the others are, but since wildebeest are herd animals they're probably other wildebeest'. And therefore consistently misidentify the zebra in the picture. And since everyone who actually spotted the four in the background has called them wildebeest, you'd get the impression that people are more certain about their identification than they really are.

  • ejbmanning by ejbmanning

    I also have been queasy about the assumption that if everyone agrees, they must be right. Uniformity might mean precision, but precision is not the same thing as accuracy.

  • rockhyrax by rockhyrax

    That's a good point - maybe increasing the required consensus number (which I understand is currently at 5) would help. For example, if you get 5 people saying all wildebeest and 1 saying there's a zebra, you'd probably go for the 5. But if you had a sample of 120 and 20 said there was a zebra, then although it's the same proportion, you'd think twice before accepting the all-wildebeest hypothesis.
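
    To put rough numbers on that intuition (purely a back-of-the-envelope sketch, nothing to do with how the site actually works): a standard confidence interval for the "zebra" proportion is far wider for 1 vote out of 6 than for 20 out of 120, which is exactly why the bigger sample should make you think twice.

        # Illustrative only: Wilson score intervals for the two hypothetical
        # tallies from the post (1 dissenter out of 6 vs. 20 out of 120).
        import math

        def wilson_interval(successes, n, z=1.96):
            """Approximate 95% confidence interval for a binomial proportion."""
            p = successes / n
            denom = 1 + z**2 / n
            centre = (p + z**2 / (2 * n)) / denom
            half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
            return centre - half, centre + half

        # Same ~17% "zebra" proportion, very different uncertainty:
        print(wilson_interval(1, 6))     # roughly (0.03, 0.56) -- could easily be noise
        print(wilson_interval(20, 120))  # roughly (0.11, 0.24) -- hard to dismiss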

  • dms246 by dms246 moderator

    I think the key thing here is that you're assuming everyone will apply the same "pattern" in their decision making when it applies to uncertain data. In fact, we all have widely varying backgrounds, knowledge, experience, etc, which mean we have a range of different default patterns we apply. You just need to look through the posts in these boards to see how much we differ in our interpretations of "fuzzy" data.

    Using the example you give, Ettina - while you and some others use the route you describe to come up with reasoned guesses for the indistinct animals, others might think "That wildebeest looks nervous, and seems to be on its own in the dark - I think those indistinct animals might be lions". There was a post here a couple of days ago where someone had made exactly that interpretation of a single animal with several pairs of eyes shining in the dark beyond it. The sheer size of the "crowd", and the variety that that ensures, is part of the power of crowd sourced projects like this.

    It would be different if we were all newly graduated biology students - in that situation we'd all be applying similar logic, having just emerged from similar courses where we were all exposed to similar subjects and theories and knowledge. If that were the case, then yes, there would be a real risk of false consensus. But the variety of backgrounds and knowledge we bring to this project (and other crowd sourced projects) is what enables researchers to make use of the data we provide, in the way described in the blog post about why we don't have an "I don't know" button.

  • tirralirra by tirralirra

    I understand that they stop at 5 when all 5 responses are identical. When differences appear (e.g. 1 zebra and 4 wildebeest) the image stays in the pool longer. There's a blog post with some details of the process.
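
    Something along these lines, I imagine (just my guess at the flavour of the rule -- the blog post is the real reference, and the actual pipeline is surely more sophisticated):

        # Toy version of the retirement rule described above: an image retires
        # once its first `quorum` classifications are unanimous; otherwise it
        # stays in the pool and keeps collecting classifications.
        def should_retire(classifications, quorum=5):
            """classifications: species labels submitted so far, oldest first."""
            if len(classifications) < quorum:
                return False                     # not enough looks yet
            return len(set(classifications[:quorum])) == 1

        print(should_retire(["wildebeest"] * 5))              # True: unanimous, retire
        print(should_retire(["wildebeest"] * 4 + ["zebra"]))  # False: stays in the pool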

    In effect, our classifications are just a first swipe at the data: they give the scientists an idea about which of the millions of images DON'T need their careful attention.

    I believe that there is good data on the reliability of crowd-sourced judgements. They stack up well, even vis-à-vis expert opinion.

    Scientists: how much of the overall coding burden does our classifying take off your hands? Do we save you 50% of the work? 30%?

  • kosmala by kosmala scientist

    @tirralirra: at least 95%. I haven't had a chance yet to go through the data you all have generated for Seasons 1-3. (I've been working on getting Season 4 ready!) But when we did a beta trial of the site, I found that 2/3 of the images were "easy" in that the first people to see the image all agreed (or all but one person agreed). For another 40% of the images, there was enough agreement after ten people looked at an image to be pretty confident of what it was -- something like the middle image I talked about on this blog post: http://blog.snapshotserengeti.org/2012/12/14/we-need-an-i-dont-know-button/
    That's about 95% of the images. The remaining 5% required more than ten views, but generally the consensus of all those views either got the image right or made it clear that it was an impossible image.
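
    Very roughly, and leaving out all the real details, the kind of rule I mean looks like the sketch below. The ten-view cutoff and the 80% agreement threshold are placeholders for illustration, not the values we actually use.

        # Sketch of a consensus rule: accept the plurality label once enough
        # volunteers have looked and agreement is high enough; otherwise flag
        # the image as needing more views (or, eventually, a closer look).
        from collections import Counter

        def consensus(labels, min_views=10, min_agreement=0.8):
            """Return (label, status) for one image given its volunteer labels."""
            if len(labels) < min_views:
                return None, "needs more views"
            label, votes = Counter(labels).most_common(1)[0]
            if votes / len(labels) >= min_agreement:
                return label, "confident"
            return label, "hard image"

        print(consensus(["zebra"] * 9 + ["wildebeest"]))      # ('zebra', 'confident')
        print(consensus(["zebra"] * 6 + ["wildebeest"] * 4))  # ('zebra', 'hard image')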

    It's going to be fun to figure out the best algorithm for those remaining 5%. My goal is to make it truly an automated task, so that we don't inspect any images by hand. That may mean that some very small percentage of images gets labeled wrong in the end, but the idea is to have that be such a small percentage that it won't affect data analysis. For example, it may not matter too much whether those eyes in the distance are wildebeest or zebra in one particular image, because there are so many images of wildebeest and zebra. So if we're looking at distributions, a few extra (or a few too few) zebra or wildebeest in any one image isn't going to affect them.

    We also are generating a subset of "expert-verified" images -- just a couple thousand or so -- so we can do a concrete analysis of how well your classifications and the algorithm match having an expert look at those same images. My guess is that it will be near 100% agreement. But we'll see...

  • Ingootje by Ingootje

    I've had, several times, photos which I'd already identified appear again. How does that add up?

  • rockhyrax by rockhyrax

    @Ingootje, the same is happening to me - just got the following sequence:
    topi close-up (with a probable topi lying down in the background)
    same topi(s)
    mother and child gnus walking (that I'd already collected either today or yesterday)
    those same two topis again!

    I didn't identify the topis for a third time, but on topis #2 and gnus #2 I could verify they were duplicates because I had collected them. Oh, and I'm sure I've looked at the same vervet monkey twice in the past day or two.

    I'm going to start recording something in the discussion for every picture set, so I can verify how many duplicates I'm getting. (I suspect it could be something to do with the modifications made to address the lack of picture problem from a couple of days ago.)

  • rockhyrax by rockhyrax

    OK, I just got the topis twice more, the two gnus once more, and a couple of zebras standing in the dark while four others were walking, which I also recognised. I think we definitely have a problem.

  • khearn by khearn

    Yeah, I'm getting the Topi legs, the vervet monkey, and the same wildebeests over and over. I've checked back a couple of times today and keep getting the same few images. I suspect someone made a mistake and it'll get fixed before too long. Maybe I'll go count star clusters in Andromeda for a while and take another look at the Serengeti tomorrow. 😃

  • paola by paola

    I'm seeing the constant repetition of the same pictures too - the topi legs, the vervet monkey etc.

  • kosmala by kosmala scientist

    I talked to the development team about this; apparently the repeats are happening because there are very few images left in the system. They said they'd load back in a ton more images from Seasons 1-3 to tide you over until Season 4 next week. When I look at the data, I'll double-check for multiple identifications by the same person on each image, so that I only use one classification per person for each image.
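
    For the curious, that de-duplication step is conceptually just the following (the column names and image IDs are made up for illustration, not our actual schema):

        # Keep only one classification per (user, image) pair before any
        # consensus calculation, so accidental repeats don't inflate agreement.
        import pandas as pd

        classifications = pd.DataFrame({
            "user":    ["rockhyrax", "rockhyrax", "khearn",   "paola"],
            "image":   ["IMG_0042",  "IMG_0042",  "IMG_0042", "IMG_0042"],
            "species": ["topi",      "topi",      "topi",     "topi"],
        })

        deduped = classifications.drop_duplicates(subset=["user", "image"])
        print(deduped)  # one row per person per image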

  • tirralirra by tirralirra in response to kosmala's comment.

    Wonderful! Thanks Kosmala for those details. I didn't realise that this citizen science project would be as helpful as THAT!
