
Have you worked with biologists? If they could figure out how to use git, let alone GitHub, I would be astounded. Don't get me wrong, these are incredibly smart people, they just are bench-top scientists, not laptop ones. If you want their code, it will be, at best, a copy-paste of badly hardcoded Matlab, years in the making, with for-loops for the sake of for-loops alone, and without ANY comments or documentation whatsoever. Honestly, it would take less time for you to write it yourself and compare, rather than to try to peek under the hood of a spaghetti mess.

Honest to god and on the grave of my mother this happened to a friend. She was going over some old Fortran 88 code, filled with GOTO statements. The code, at best, was a rat's nest. Only a deep and long fight could get it into your brain. At about 3 am after a long day in the lab, she finally gets to a line in the code that says GOTO LINE 12345 with the comment 'HAHA MADE YOU LOOK'. Her boss bought her a new computer after she threw that one off the roof.

That experience is universal with Biologist code.



> Honestly, it would take less time for you to write it yourself and compare, rather than to try to peek under the hood of a spaghetti mess.

That's not actually the point. The point is reproducibility. If you write your own code and get a different result, why? If the reason is the code then having the other code, no matter how terrible it is, allows you to figure out what the difference is. And then somebody is wrong and you know why and can publish that result.


Exactly. IMO, this is a big problem in some areas of computational biology right now. Code to a famous project gets released, and instead of independently validating the results, other researchers (rationally) choose to just build on top of the famous project. It's faster, easier, and the "collaboration" leads to social prestige for everyone involved.

The end result is that empires are built, reproducibility is (paradoxically) harmed, and we're someday going to end up finding out that some big, high-profile projects were built on pillars of sand.


My experience is that it takes a certain kind of twisted intelligence to write code that is below a certain level of quality, but still works. Someone like me who is a mediocre programmer? If my code gets too terrible, it simply stops working. I'm forced to use some base level of software engineering if I want to write a large program that runs, just because there's a limit on how many random side effects I can keep track of.

From what I've seen when supporting EEs? People who are that much smarter than I am don't have this limitation. They can write thousands of lines of spaghetti perl and bash and as long as nobody touches the damn thing, it works fine. God help you, though, if you make a change.

But... there is an established job role for this; I mean, you don't want to make your EEs sysadmin their own LSF cluster, either.


Nope, in my opinion it's always the below-average programmers, the ones who don't seem especially smart and don't want to learn, who produce the worst code.


There's a difference between 'being smart' and 'being able to program well', and that difference is precisely why academic code has its reputation.


My observation is that to make a program run, a less intelligent person needs to organize that program better. But, intelligence is a controversial subject in and of itself, so I don't expect to see agreement.


> God help you, though, if you make a change.

Or need to add functionality. Or, just wait a year.


Sounds to me like there's an opportunity to propel biology forward by teaching some biologists how to manage their code in a way that won't cost large amounts of productivity.


This is exactly the goal of http://software-carpentry.org/workshops/


Ha, it could be worse. I have a friend who's getting his PhD in microbiology and he and his team were running into some trouble with these massive multi-gigabyte CSV files from NIH (gene sequencing data, if I recall). Turns out they were trying to open the little buggers in Excel and wanted to stay in Excel since it was familiar. At some point, I wound up just suggesting they use split on the stupid (but very important) things and people acted like I just blew their minds.

I think sometimes you just get so caught up in your field of expertise that you miss some of the tools that could drastically help you entirely. You have to wonder how much of an efficiency drain that adds up to over the course of even just individual research projects. And that's before you get to bad code.
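For anyone in the same spot without a Unix shell handy, here is a rough Python sketch of the same idea as split. The function name split_csv, the chunk file naming, and the default chunk size are all made up for illustration, not anything the team actually used. Unlike plain split, it repeats the header row in every chunk so each piece opens cleanly in Excel on its own.

```python
import csv
from pathlib import Path


def split_csv(src, out_dir, rows_per_chunk=100_000):
    """Stream a large CSV and write numbered chunk files, each with the header.

    Hypothetical helper: names and defaults are illustrative assumptions.
    Reads one row at a time, so the whole file never has to fit in memory.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    chunk_paths = []

    with open(src, newline="") as f:
        reader = csv.reader(f)
        header = next(reader, None)
        if header is None:
            return chunk_paths  # empty input: nothing to write

        out = None
        writer = None
        try:
            for i, row in enumerate(reader):
                # Start a new chunk file every rows_per_chunk data rows.
                if i % rows_per_chunk == 0:
                    if out is not None:
                        out.close()
                    path = out_dir / f"chunk_{len(chunk_paths):04d}.csv"
                    out = open(path, "w", newline="")
                    writer = csv.writer(out)
                    writer.writerow(header)
                    chunk_paths.append(path)
                writer.writerow(row)
        finally:
            if out is not None:
                out.close()

    return chunk_paths
```

Calling split_csv("big.csv", "chunks") would leave a directory of chunk_0000.csv, chunk_0001.csv, and so on (big.csv and chunks are placeholder names). Pick rows_per_chunk small enough for whatever tool has to open the pieces.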


What is Fortran 88? I was under the impression it jumped from Fortran 77 to Fortran 90?



