May the name legitimately influence the fate of an application?

Yes.

Cosmic Variance promotes something that feminists consider to be science, namely this paper in PNAS:

Science faculty’s subtle gender biases favor male students

The paper reports that 127 science faculty members each rated an application for a lab manager position, that the application was randomly assigned either a male or a female name, and that the applications with the male name received better ratings.

The difference was comparable to 2 sigma in the individual cases, not too strong, and one may have doubts whether the research was done properly. Moreover, the paper makes it clear that the policymaking goals, and not the search for the truth, were the primary drivers behind it. The last sentence of the abstract says:

These results suggest that interventions addressing faculty gender bias might advance the goal of increasing the participation of women in science.

We're told about this goal. Whose goal is it? If they were doing impartial research into closely related questions, they should surely try to achieve no goals. As far as I can say, this very sentence is enough to dismiss the authors as ideologically driven hired guns without scientific integrity.

But let's assume that the results are legitimate and one could reproduce them with impartial researchers and greater samples, too. Would the result prove that identical applications are given different ratings? The answer is, of course, No. If two applications have different names, they're not identical. What do I mean?

Imagine that you get these two recommendation letters. First:

Yerevan State University

According to all of our physics faculty, the student is the best one we have ever seen.

Second:

Stanford University

According to all of our physics faculty, the student is the best one we have ever seen.

Now, in the terminology of Cosmic Variance, these are two identical recommendation letters. But when two universities are doing the same thing, it isn't the same thing. Clearly, the recommendation letter with the Stanford stamp should be interpreted a bit differently: the word "best" in that letter is somewhat stronger and "better" than the seemingly identical word "best" in the Armenian letter.

If we decide that the candidate with the Stanford version of the "identical letter" is better, are we discriminating against the Armenians? Yes, of course, we are. But it's a good thing, too. It's vastly more likely that the best student from Stanford will be better than the best student from the Yerevan State University. It's not a 100% guaranteed rule – there can be a better person in Yerevan than anyone who can appear at Stanford – but statistically speaking, we're still getting some information from the Yerevan/Stanford stamps.

One should notice that the recommendation letter – and the whole application folder – doesn't answer all questions about the applicant. There is still lots of uncertainty and lots of potentially "different calibration" used by different people (and lots of these differences are systematic and reproducible). So the extra information tells us something.

Let us do the same thing with the male/female name and look into the reasoning a bit more closely.

My name is Corinne, I am confident I may be a great lab manager for the 5 years of the contract, and I attach identical documents A, B, C, including enthusiastic letters from my instructors.

The competitor says:

My name is Sheldon, I am confident I may be a great lab manager for the 5 years of the contract, and I attach identical documents A, B, C, including enthusiastic letters from my instructors.

Now, they're "identical", except for the name. If the hiring committee were asked not to look at the name (or if the name were hidden) and mechanically evaluate the rest of the folder according to some deterministic rules, it would get the same results in both cases.

However, if the people may look at the name, they may think differently and reach more accurate results. The name is a rather important part of the context. There are lots of reasons why it matters.

First, the recommendation letters may look identical but because they're written for a male and a female applicant, they don't contain equivalent information. Everyone knows that the writers of the letters are nicer to the females on average. A reason is the struggle for a "better image". If you write a letter about a female, you want to look like a Gentleman etc. So that's why females are more likely to get nice letters and males are more likely to get some tough letters.

That's how it works and everyone has noticed this, too. When I say everyone, I also mean both men and women. In fact, the only plausible explanation why the same "anti-female" bias in the experiment is exhibited both by males and females is that there's something more "objective" behind it. So the recipients of the letters have developed a "recalibration device" that tries to eliminate this bias whenever they really want to measure the qualities of the candidates as accurately as possible. Numerical ratings are not too readable, especially if you don't have access to a large enough statistical ensemble, so people aren't afraid of not being Gentlemen – moreover, most of these decisions are anonymous so the same people who wrote the skewed letters don't have to "play games" in this stage, when they're in a hiring committee.

So they just sensibly "tone down" the overenthusiastic propositions in the recommendation letters when the applicant is female. In a sense, it is analogous to the Stanford vs Yerevan example.

There are many other analogous points that have a similar impact on the evaluation. For example, the authors of the PNAS paper explicitly talked about a lab manager position. Women have a rather high probability – one you could estimate quantitatively – to get pregnant, have kids, and abandon or neglect the job in the following years that are relevant for the hiring decision. That's a simple reason why the female name may negatively affect the fate of the application. It's not any discrimination; it's just a sensible estimate of the odds and outcome given all the available information.

But even if female scientists weren't getting pregnant, there's still the most general point that makes this situation analogous to my Yerevan vs Stanford example. The most general point is the track record. Stanford simply has had a much better track record of producing quality scientists than the Yerevan State University. This experience is the ultimate reason why the otherwise "identical" applications from Stanford will get higher ratings than those from Yerevan.

Whether you like it or not, the name contains a similar piece of information as the Yerevan or Stanford stamp. It tells us something about the gender and ethnicity, with a small risk of being completely wrong. When the applicant has a female name, her membership in the set of females is a part of her identity and it gives some information to those who read the application.

Because in recent XY years, whether XY is 5, 15, 50, or 150, it was significantly more likely for a male to become an impressive lab manager, the hiring committee may take this information into account and decide that it's somewhat less likely that the applicant is the most appropriate if her name is female. Needless to say, similar lessons may be learned from the ethnicity of the name and the experience with various ethnicities in similar positions.

Is that discrimination? Is it illegitimate? Well, it's discrimination in the sense that it's the process of maximum utilization of the available information. But using the available information is what the hiring committee is all about. When it eliminates a candidate for having failed in his math courses or for being a drug addict, you could also say it's discrimination against a group, namely those who failed math courses or who are drug addicts. But indeed, the discrimination against such people is a part of the reason why the hiring process exists at all.

Now, you could complain that with this logic, a group – e.g. women – who were once underrepresented would stay underrepresented forever, purely because of the inertia that some people could call "bias". So you could say it's wrong to take the name into account. But that isn't the case for a simple reason: the name (and related information from the title page) is just a part, and a relatively small part, of the pieces of information that decide about the fate of the application.

Generously imagine that the name and gender influence 20% of the decision and 80% is determined by the rest of the "folder". Even if this high estimate were accurate, and I am confident it is an overestimate, probably a huge one, the "memory" of the imbalance would be forgotten pretty much within the timescale in which the members of the hiring committee have been working in their field. Because the name and gender only decide about 1/5 of the asymmetry, by my assumptions, it means that approximately after one "generation" (it may be just 5 years or so: I am talking about the years of professional experience), the gender gap caused by the "track record" would decrease to 1/5 of its original value, too.

Now, this is clearly not happening. After 40+ years in which women and other hypothetically "discriminated against" groups have been promoted, often artificially, by various policies, the percentage remains pretty much the same. In some corners, it's been growing, in others, it's been dropping. The theory that the "natural composition" is 50:50, together with the assumption that 1/5 of the hiring decisions boil down to the track record, would predict that the current composition has to be 55:45 or even closer to 50:50 in the relevant fields. It's surely not.
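
The arithmetic behind that prediction can be sketched in a few lines of Python. This is only an illustration of the assumptions stated above, not a model taken from the PNAS paper: the function names, the 75:25 starting composition, and the count of 8 generations (40 years at 5 years each) are hypothetical choices made for the example, while the 1/5 weight is the generous estimate used in the text.

```python
def remaining_gap(initial_gap, name_weight, generations):
    """Gap in percentage points away from 50:50 after some generations.

    Assumption from the text: each generation, only the fraction
    `name_weight` of the previous gap is carried over by the track record.
    """
    return initial_gap * name_weight ** generations


def composition(initial_majority_share, name_weight, generations):
    """Return (majority %, minority %) after the given number of generations."""
    gap = remaining_gap(initial_majority_share - 50.0, name_weight, generations)
    return 50.0 + gap, 50.0 - gap


if __name__ == "__main__":
    # Hypothetical starting point of 75:25; the 0.2 weight is the text's 1/5.
    for generations in (0, 1, 2, 8):
        major, minor = composition(75.0, 0.2, generations)
        print(f"after {generations} generation(s): {major:.2f}:{minor:.2f}")
```

With a hypothetical 75:25 start, a single generation already brings the composition to 55:45, and every further generation shrinks the remaining gap by another factor of five, which is why the pure "track record" theory predicts something very close to 50:50 after several decades.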

So the "track record" just cannot be a purely social construct. The "track record" mostly reflects a genuine biological underlying signal. It's misguided to have "goals" to send the composition of any professional group to 50:50 when it comes to gender and to "uniform grey" when it comes to the nationality. Nature just doesn't work in this way and a forced composition inevitably leads to a significant inefficiency.

Also, it is counterproductive to deny the difference, e.g. to deny the difference of the track records of men vs women and track records of various groups divided according to different keys. If the PNAS research were done impartially (except for the individual offensive sentences which suggest otherwise) and the asymmetries were more than just noise, it only showed that the people who were hiring did pay attention to all the information and to the track record of the groups to which the applicants apparently belonged.

There's nothing wrong with it, just like there's nothing wrong with noticing whether an applicant is a drug addict or whether he or she has failed a math course.

And that's the memo.
