> So much for that common, popular notion that standardized tests do not predict anything of value.
To be fair, I don't think the debate was ever about the quality or predictive value of the tests. There is a small, but well-organized and vocal subset of the population that hates the idea of excellence and differentiation. They want, and have been quite successful in, the replacement of standards of excellence with vaguely defined (defined by them, of course) buzzwords like "equity" and "diversity".
I've pushed back against standardized testing at certain points of my life, and I don't think this comment even remotely summarizes my views.
If anything, I would say that my views are the opposite -- homogenization creates a lack of differentiation around skill and aptitude based on questionable science (and sometimes outright pseudoscience) and often leads to an oversimplification of human intelligence in general. It always feels very strange to me that people trying to compress aptitude into a single number say that they're defending differentiation or diversity of talent.
MIT's findings here don't really change my view of the value of SATs, although the findings are interesting and I think they're worth looking into further. I'm not sure "they're more predictive than GPAs" is the glowing recommendation that SAT proponents think it is. You can agree or disagree with me on that point, I'm not here to debate the entire idea of testing or IQ or whatever -- I just want to point out the above comment is a pretty big oversimplification and (in my mind) a borderline complete misrepresentation (I assume unintentionally) of what people like me believe. I can only speak for myself though, maybe there are people out there who do hate the idea of excellence.
Well, what you have written just feels like an explanation of the same thing that's more favorable to your side.
Colleges are not trying to compress aptitude into a single number. It’s even worse. They are trying to compress aptitude into a single Boolean variable, you are either admitted or not. That’s it. And it seems that subject tests and general aptitude tests are very good indicators of college fit. I don’t know what system you envision, but alternatives I have seen always seem far worse.
I'm not sure I understand what you mean. GP writes:
> There is a small, but well-organized and vocal subset of the population that hates the idea of excellence and differentiation.
I don't see how that applies to my comment above, and I don't see how saying:
> They are trying to compress aptitude into a single Boolean variable, you are either admitted or not. That’s it.
is doing anything other than backing up what I said. At the point where you are dividing a subset of the population into binary "in or out" groups, you are in fact advocating for homogenization, for less differentiation between students, and for fewer levels/categories of excellence or exceptionalism.
I'm not here to tell you that's wrong, you do whatever you want. MIT is trying to decide who gets into their specific college, fine. But if you're arguing that the point of SATs is to make a binary determination about students, then it's just strictly inaccurate to say that it's the SAT critics who are all trying to cut down tall poppies.
You conflate vertical differentiation with horizontal differentiation. Horizontal differentiation is what is usually understood as “diversity” and considered good among certain groups of people. Vertical differentiation is what is usually understood as “hierarchy” and considered bad among those groups of people.
MIT, like many American universities, does only general admission, and that would indeed be considered weird in other countries, but it seems like a whole other issue.
> You conflate vertical differentiation with horizontal differentiation.
A binary admissions model reduces both. That's not to say a binary admissions model is wrong, but it does reduce vertical differentiation. Of course compressing an integer value into a binary result reduces differentiation; a boolean represents fewer states than a number.
To go a step further, even if that wasn't the case, vertical and horizontal differentiation still can't ever be completely decoupled from each other. Horizontal differentiation allows for greater vertical differentiation by allowing people to vertically differentiate based on their strengths rather than on a questionably representative average of all of their qualities. And I don't think that's a solely Progressive or Left-wing idea, it's a big part of the reasoning behind why economic specialization leads to more advanced societies.
See https://news.ycombinator.com/item?id=30834517, but I'm here clarifying what my criticisms of the SAT are and what I think its weaknesses are -- and pointing out that my criticism of the SAT is the exact opposite of what tharne says it is. I'm not here to rework the entire admissions process.
I don't have a single-comment answer to replacing the entire SAT and reworking the entire college admissions process, and it's feasible that the SAT might still be preferable to pure GPAs in the meantime. But I don't think saying that requires us to pretend that compressing skillsets into an objectively less granular/descriptive metric is a good thing or that it's somehow increasing our understanding of student skillsets. Saying that MIT might be right to accept SAT scores doesn't mean we need to pretend that the SAT doesn't have very serious flaws. Certainly it doesn't require me to pretend that every argument against SATs is an argument against meritocracy; I think that's just objectively wrong.
Ideally we would have standardized metrics that were more granular, and ideally we would at least have an SAT that was administered differently and more regularly so that it was optimized less for formal test-taking skills. But there are a lot of barriers in front of that.
----
I also don't have a single-comment answer for what to replace Github repos with during hiring interviews, or how to make whiteboard coding tests more accurate, and I have criticisms about them too. The answer might be that there isn't an easy single number that represents meritocracy, and we might be fooling ourselves pretending that there is, and it might just be wishful thinking in the first place to pretend that there is a version of admissions processes for colleges that isn't fiendishly difficult and complicated and multifaceted.
When people criticize whiteboard interviews on here, it's reasonable to ask if there's a better system, but I rarely see people saying, "you're only criticizing whiteboard interviews because you hate meritocratic job placements." No, I have criticisms of these systems because they're not good representations of talent.
> If anything, I would say that my views are the opposite -- homogenization creates a lack of differentiation around skill and aptitude based on questionable science
If that were true, you'd expect countries like South Korea, Japan, and Germany to perform poorly in science and engineering, among other things.
Diversity may be a worthy goal for societal reasons, but it certainly is not a prerequisite for excellence, seeing as there are many highly successful countries that are very homogeneous.
> If that were true, you'd expect countries like South Korea, Japan, and Germany to perform poorly in science and engineering, among other things.
It's wild to me that someone can think the existence of other countries settles the debate over whether our school systems encourage well-rounded, successful students, given that comparison to more homogenized schooling environments like China's is still one of the more contentious high-level debates about educational quality we have today. Again, I'm not here to convince you one way or another, but that is not a debate that I think most of society considers settled.
> Diversity may be a worthy goal for societal reasons, but it certainly is not a prerequisite for excellence
If that's the argument you want to make, then fine, go for it. But then don't say that you're opposing a group that "hates the idea of excellence and differentiation." You are arguing for removing differentiation between different kinds of intelligence and skillsets and compressing that spectrum into an objectively less descriptive metric.
Make up your mind whether I'm arguing for more diversity and more differentiation between people or for less of it.
I'm not completely sure. I think MIT's conclusions might be correct, they might be preferable to GPAs. I also think there might be other alternatives that aren't easy to implement, that require either a restructuring of how we do school or a better distribution of resources than we currently have.
One conclusion that MIT hints at (although it doesn't say it outright) is that SATs might be a better indicator of success across economic levels in part because it's harder to buy a better SAT score with money. Looking at things like extracurricular activity runs into many of the same problems as looking at Github repos during hiring processes -- a lot of people don't have time to do a bunch of extracurricular activities, and access to those extracurricular activities is likely highly correlated with socioeconomic status. It might be difficult to move in that direction when access to school resources varies so much between areas.
I do think the SAT could be improved -- I think one really easy way would be to change how it's administered so that it optimizes less for formal test-taking skill. The really good thing about the SAT is that it's a less school-specific measure than GPA. So a better alternative might be a version of the SAT that kept a standardized metric but that either widened its scope significantly or was administered differently.
I also want to put forward the idea that admissions might just be really hard, period, and there might not be an easy way to assess potential, and trying to figure out the easiest way to do it might be like asking, "what's the best way to teach a child to play an instrument in a single day?"
----
One really important point that I want to get across: there is a difference between a measure being good and a measure being "the least terrible option we have at the moment" -- and confusing the two can cause real harm.
At the top of this thread I see the quote, "so much for that common, popular notion that standardized tests do not predict anything of value." And if that's somebody's attitude, then they're never going to find a better option because the whole thing is being approached through the lens of "see, we were right, this is a good metric."
I think a lot of criticism of standardized testing, IQ, coding tests for hiring, etc... is not necessarily trying to destroy everything, it's just trying to point out that many of these measures are really bad and they shouldn't be treated with the respect they're often given. I think that someone can very easily both have the position, "yeah, MIT probably should use SAT scores alongside GPAs" and the position, "people place way too much confidence in these things as an indicator of success."
I'm definitely not the person you describe, but the idea of standardised testing being equivalent across all factors just strikes me as being fundamentally untrue.
Personally I am very lucky to test well; and I definitely buy the notion that people who test well in SATs may go on to do better in University, but the reasons are probably the same: freedom from worry about financial circumstances will affect grades. 10 times in every 10.
I grew up poor and I achieved some of the highest scores state wide in my country's standardised tests as a child (we get tested at ~8,10,12,14). A lot of my peers at my school were from social housing. My assessment is that their biggest issue wasn't money but their homelife. Parents who didn't value education, or even a basic respect for rules/authority. The kids were wild because their parents were kind of wild themselves. Money wouldn't fix scores for these kids.
If you wish to take a politically correct stance, I wouldn't go the money route. I'd say that these kids are victims of intergenerational poverty cycles.
Same. My family was below the US poverty line, but my parents were college educated and most of the extended family placed tremendous emphasis on education, academic performance, and college prep. I always get very annoyed with modern discourse that reduces all successes, even staying out of prison, to family income and nothing else. Most of the people I went to school with were from poor or working class families, and I guess a “normal” proportion went to college, and a “normal” proportion were “smart kids.” Based on my observations, a large factor that I never see discussed is religion. Although I’m an atheist, I think the religiosity of the communities I grew up in was a highly effective mitigator of common social ills.
I think the benefit of religion is that a religious mother or father is less likely to be off on a 3-day meth binge compared to a non-religious one. There's a social network to help support people. The social network also encourages a reduction or removal of typical vices that are going to affect a family's children (alcohol, drugs, etc.).
>but the idea of standardised testing being equivalent across all factors just strikes me as being fundamentally untrue.
What's your take on MIT's stance?
> ...our ability to accurately predict student academic success at MIT. Our research shows this predictive validity holds even when you control for socioeconomic factors that correlate with testing.
Not GP, but you should approach this using Bayes' Theorem just like anything else. If one study from MIT causes you to completely flip on any of your beliefs, you need to rethink how you form these kinds of opinions.
MIT's conclusions should cause you to adjust your priors by a certain amount, but they should not cause you to completely flip by themselves -- particularly if you're not in the camp that thinks literally every decision MIT makes is correct by virtue of it being MIT.
If you wouldn't have taken MIT's original plan of abandoning SAT scores as proof that they didn't matter, you probably also shouldn't take them picking SAT scores back up as proof that they do matter. MIT's conclusions should lead you to update your priors by an amount that depends on how much trust you currently have in the accuracy of college admissions processes when they assess student qualifications and outcomes.
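As a toy illustration of that updating (every number below is invented, not taken from the MIT study):

```python
def update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return P(hypothesis | evidence) via Bayes' theorem."""
    numerator = prior * p_evidence_if_true
    return numerator / (numerator + (1 - prior) * p_evidence_if_false)

# Hypothesis: "SAT scores are a genuinely strong predictor of success at MIT."
prior = 0.5                          # assumed starting belief, 50/50
posterior = update(prior, 0.8, 0.4)  # assumed likelihoods of seeing one such study
print(round(posterior, 3))           # -> 0.667: belief shifts, it doesn't flip to ~1
```

One supportive study moves a 50% prior to about 67%, not to certainty, and how far it moves you depends entirely on those likelihoods, i.e. on how much you trust the study.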
----
My personal take on this is that I do absolutely buy that SAT scores could be a leveling factor between kids from different socioeconomic backgrounds and that they could be a better metric than GPA for determining admission. But of course, that's a pretty low bar to clear; GPA scores are probably close to meaningless when compared across schools. It seems to me that there's a lot of room here for SAT scores to be simultaneously mostly meaningless and at the same time also a reliably better predictor of school success than GPAs.
It's also important to ask what exactly MIT is measuring -- what does it mean by academic success and how much does that definition overlap with "fits in when placed in an environment optimized for people who are good at standardized testing?" And again, even if they are kind of circular or if they're measuring the wrong things, it's still plausible that they're more reliable than GPAs; it's a low bar to clear.
The same factors that lead to success for SATs can lead to further academic success.
I believe that MIT is probably right; in fact, I'm quite certain of it. Many people will drop out of university or perform more poorly than their peers for socio-economic reasons; the person working while studying will probably do worse than the person who just studies.
MIT wants the most graduates and especially the most successful graduates, so the institution is right to do this, but I do still think it's more inhumane than I'm personally comfortable with -- but this is part of why I live in Europe where university students in general are seen as an investment by the state and not so much a business to be optimised.
> this is part of why I live in Europe where university students in general are seen as an investment by the state and not so much a business to be optimised
In this specific case, though, I don't think these two things are in conflict at all. By selecting the best candidates on the basis of merit, MIT is doing what's best for both MIT as well as the broader society.
We all benefit from living in a country that produces top-tier scientists and engineers, and MIT benefits from being a place that is known for producing top-tier scientists and engineers.
Funny that you bring up Europe. As far as I know European countries don’t rely on extracurriculars and other nebulous measures as much as US colleges do.
Doesn't most of Europe also rely on standardized testing for university admissions? My country definitely does, and has for decades, in both pre- and post-communist times. I also know France has the famous Baccalauréat at the end of high school.
> Doesn't most of Europe also rely on standardized testing for university admissions?
They sure do. So does India. In fact, a lot of other countries rely on testing a whole lot more than the U.S. which has interviews, essays, sports, teacher recommendations, etc.
We have to read the sentence very carefully. It's saying that regardless of socioeconomic factors, the number correlates with graduate success rate. This seems like a very easy "duh". The way I read that is "if a student gets in the 99th percentile regardless of whether they grow up rich or poor, they are likely to do well at MIT". This doesn't talk about acceptance rates based on socioeconomic factors.
The point in question is whether students in a lower socioeconomic situation even have a chance to get into MIT.
It's not saying regardless; it says controlled for. A subtle distinction, but the former is a raw comparison and the latter is a statistical adjustment, which implies the predictive validity holds even after any socioeconomic bias is accounted for.
I was hoping someone had more insight into process or metrics on it.
Your own post has an implicit accusation that lower socioeconomic situations preclude high testing, which has all kinds of implications about the quality of education and living standards as a prior for acumen. I suppose my own bias shows in drawing that conclusion, though. As Henry said,
"If I had seven peasants, I could make seven lords. But if I had seven lords, I could not make ONE Holbein."
>To be fair, I don't think the debate was ever about the quality or predictive value of the tests.
It is. The common argument is that GPAs are as predictive as SATs. MIT says they are not. I think the problem is that you only need average ability to get a good GPA, but a top 1-5% SAT score confers a higher ceiling of ability. MIT wants to admit exceptional students, not just average or above-average ones.
The debate is about the quality and predictive value of the tests. Opponents claimed that the tests had a cultural bias so students from some backgrounds would do better than others, that students who had a good education before university would be better prepared, and that studying for tests or taking tests repeatedly has been shown to improve scores but is only accessible to people who can afford it. These are all claims that the tests are not good at predicting aptitude.
The arguments against these tests are, of course, awful. Objective tests are the best way we know of to remove human bias. Aptitude tests (basically IQ tests) are the best way we know of to measure someone's natural ability (determined in early childhood) with little influence from their experience. Since their arguments make so little sense, it is reasonable to wonder about the psychology of opponents of standardized testing. But their arguments are, at least on the surface, about predictive value.
> it is reasonable to wonder about the psychology of opponents of standardized testing
It is, at its core, a fear that testing largely reproduces the status quo. If one accepts the idea that there is an intellectual elite who constitute the highest strata of society, and that their gifts are innate and heritable rather than trained, it follows that social mobility is pretty much dead. It is a bleak vision.
Personally I think there are different problems that are much bigger and woollier which keep people from non-elite backgrounds down, regardless of test outcomes. The structure of the education sector and employment more widely. Expectations about life and the distribution of rewards etc. We rarely have good quality, nonpartisan discussions about these things which I think pushes people to take views which are instrumental rather than informed.
>it follows that social mobility is pretty much dead. It is a bleak vision.
I have always found the idea of social mobility depressing. It assumes that we will always have a hierarchy, with some people who are powerful and prestigious and others who are poor and always feel inadequate. It assumes that we will always have an underclass but at least people can leave it.
The kind of social mobility that SAT has some influence on is not really about "power and prestige", which I also think of as generally pathological dynamics. It's literally about how competent and professional you want to be, and how well you can perform your work duties. It's social mobility within the 'working' class, not really away from it.
Yes. The old saying among Labour party socialists in the UK was "rise with your class, not above it". They were in favour of a high floor on living standards and a low ceiling on wealth. It isn't a stretch to think that a more even playing field would be a better substitute for mobility.
It is actually possible that some people go to MIT because it has more diversity [1] than its very close competitor, Caltech. [2] But it's true that the first filter for these students is undoubtedly world-leading-technical-program.
We're talking about the school's goals in forming a class, not the applicants' goals. Most schools that are among the "best in the world" find they can weigh multiple factors to decide who to admit, and there's no single magic number that does that job for them.
It very much depends on the style of coding test - personally, I'm more than happy to do take-home style tests where I prepare something in a matter of a few hours, but I can't stand "leetcode" interviews or anything where I'm pressured to produce in 30 minutes or less; perhaps that's because that's typically not how I work in the real world and in my experience, they do a really poor job of demonstrating my skill set and experience.
I have terminated interviews before they even got started because of poor interview loop design from employers.
I recently failed a CS coding test. I was asked to solve a problem in 10 minutes. I solved it in 20 minutes and was rejected. I came up with a solution and communicated it right from the start. My solution was totally clear and readable. I just needed time to warm up and to be attentive to my code. I love CS. I love solving problems and reading books about algorithms and data structures. I've implemented them from scratch as a hobby. But the interviewer didn't care about that and said the process is the process. I felt disappointed at first, but lucky afterward, since I wouldn't want to work with those people in the future.
10 minutes per problem sounds extreme except for something that can be answered in no more than 5 lines of Python (no code golf, of course). Even then, its signal-to-noise ratio (from an interviewer's perspective) can't possibly be very high. Most places would ask you to solve a moderately nontrivial problem in 30-50 minutes.
From the other side of the table: we'd lose more candidates if we did take-homes. People in general prefer to study once and use that knowledge for multiple companies at once; you can't optimize take-homes like that.
Sure, but all in all, you just need 100 frequent HN users hating on these interviews to fill all related threads with such complaints, and there are dozens of millions of SWE worldwide. Big tech companies still hire mostly based on this type of interviews and they've been growing a lot lately, meaning that enough people apply to them. Leetcode and the likes have also democratized the process for many candidates.
If anything, I'd say that the proportion of the workforce willing to submit themselves to these tests is increasing, but that's just a guess.
I refuse now and didn't two years ago. Anecdotal, I realise. But I think lots of people have decided that the message heavily test oriented recruitment processes send out indicates a bad work culture and sense of entitlement from employers. I subscribe to this view and vote with my feet.
The SAT has been demonstrated to be effective at predicting success in university. We have almost no evidence about the computer industry's hiring practices. It is completely unscientific. Interviews operate on folklore, not statistics.
This is something your HR department should be very concerned about. If the questions you ask during your interview are not useful in finding a good candidate, why are you asking them? This isn't just about wasted time either; there are strong laws around interviews, so asking the wrong question could land you in court.
I know when we wanted to do a coding test, they told us we would need to spend 6 months giving everyone a coding test, having it independently graded by someone not involved in the hiring process. Then, after people had worked here for 6 months, we would examine our actual results from those we hired and see if the tests predicted anything useful. (Or something like that -- there is room in the scientific process for some variation.)
The bar below which HR has to be worried is not "we've scientifically determined that our interview questions lead to good on-the-job performance". There has to be some reasonable sense in which you could argue the interview filters for good candidates, but no one is requiring you run studies.
Google once did a retrospective study and found that interview scores for people we ended up hiring were not correlated at all with people's on-the-job performance. I'm pretty sure nothing really changed as a result of this. I think it's a combination of the industry, especially FAANG, being kind of "stuck" on these kinds of interviews, and a lack of clearly better alternatives (I think there are better alternatives but it's not like I can point to studies backing me up).
> I know when we wanted to do a coding test they told us we need to spend 6 months of giving everyone a coding test, have it independently graded by someone not involved in the hiring process. Then after people have worked here for 6 months we examine our actual results from those we hired and see if the tests at all predicted something useful.
This is interesting but also way heavier weight than anything I've ever heard of. OOC where do you work? (Like vague description of kind of company, if you're not comfortable sharing the specific name).
> Google once did a retrospective study and found that interview scores for people we ended up hiring were not correlated at all with people's on-the-job performance.
This sounds like an unsound result. If you select based on a criterion, the correlation with that criterion is usually diminished, and sometimes even reversed, in the selected sub-population.
Like if you select only very strong people to move furniture then measure their performance. Because they're all strong, you won't observe that weak people are bad at it-- plus you'll still have some people who were otherwise inferior candidates who were only selected because they were very strong, resulting in a reverse result. But if you dropped the strength test you'd get many unsuitable hires (and suddenly find strength was strongly correlated to performance in the people you hired).
This is actually confirmed with real-world data: professional football with player weight, and professional basketball with player height.
For offensive linemen in the NFL, there is no correlation between weight (which ranges from 300-360 pounds) and overall performance. A "heavy" 350-pound player is not more likely to do better than a "light" 310-pound player. But nobody who weighs a mere 250 pounds could realistically make the cut or perform well at the highest level.
For basketball players there is no correlation between height and performance, and there are several standout examples of players below six feet, so there's no hard cutoff. But if you compare the distribution of the subpopulation against the general population, you'll see an extremely strong height bias.
> This sounds like an unsound result. If you select based on a criterion the correlation with the criterion is usually diminished and sometimes even reversed in the selected sub-population.
Yeah, that's very true, and I think that was part of why they maybe didn't react to it too much. What you really want is to find the people you rejected and see how well they're doing, but we don't have that data.
Still though, naively I think I would have thought that someone who gets great marks across the board should be able to be more successful at Google than someone who barely squeezes by, and I do think it's kinda telling that that's not the case. But I'm maybe just injecting my own biases around the interview process.
edit: This reminds me a lot of an informal study that found that verbal and math scores on SATs were inversely correlated, which seemed surprising, until people realized they were only ever looking at samples from a single school. Since people at any given school generally had similar SAT scores (if they were lower they wouldn't have gotten in; if they were higher they would have gone to a more selective school), the variation you see within a given school will be inversely correlated: the higher you did on math, the lower you must have done on verbal to land at the "target" score for that school.
At Google's scale, if they had an alternative basis for hiring people, they could evaluate candidates by both, randomly use one method or the other to make some of their hires, then compare their performance over time and at least say whether there is a significant difference.
But as you note, the lack of obvious good alternatives is an issue... and we can't pretend that there isn't an enormous difference among candidates. If we thought that unfiltered candidates were broadly similar, then "hire at random, dismiss after N months based on performance" would be a great criterion, but I don't think anyone who has done much interviewing thinks that would be remotely viable.
(Though perhaps the differences between candidates are smaller than we might assume based on interviewing, since interviewees should be worse than the employment pool in general: bad candidates interview more because they leave jobs more often and take longer to get hired.)
>If we thought that unfiltered candidates were broadly similar then "hire at random, dismiss after N months based on performance" would be a great criterion, but I don't think anyone who has done much interviewing thinks that would be remotely viable.
I know a fair number of companies that do essentially that. They hire contractors for 6 months, at the end of 6 months the good ones are offered a full time position. The contractor company probably does some form of interview, but they are more interested in their 6 months of overhead from the contractor than quality candidates.
> since bad candidates interview more due to leaving jobs more often and taking longer to get hired
But there are also great people who interview badly.
It's similar, but it also brings in a challenging problem: coding tests cost candidates far more than they cost employers in terms of time. I am currently interviewing, and two of the companies I am otherwise excited about sent take-home tests that just exhaust me, especially after a long day of otherwise productive work. I've got 12 years of experience under my belt, but somehow great references and a killer resume aren't enough to convince them I can find a security vulnerability.
I do coding tests for the first interview. Nothing hard, just enough to do basic data modeling and writing a unit test. I also time cap to under an hour, and the internet is available as a resource.
It's more about labor market dynamics and supply versus demand. If there are plenty of developers available to hire then employers will insert extra hurdles in the process to filter out weak candidates (with the understanding that there will be some "false negatives"). But when the labor market is tight then employers will take a chance on any candidate who seems minimally competent because they need to fill the req.
I think that is more about the disconnect between coding tests and the actual day to day work and skills required to do the job. Example: I could have a high level of competency in software engineering and also not care how a mouse gets out of a bucket.
Those "mouse getting out of a blender" brain-teasers or whatever are pretty unheard of at this point I think. Most people complain about coding questions, generally leetcode-style questions I think.