Wednesday, October 28, 2020

A Mashey Gem

http://initforthegold.blogspot.com/2010/10/mashey-gem.html

http://www.realclimate.org/index.php/archives/2009/02/on-replication/langswitch_lang/in/

People are making an error common to those comparing science to commercial software engineering.

Research: *insight* is the primary product.
Commercial software development: the *software* is the product.

Of course, sometimes a piece of research software becomes so useful that it gets turned into a commercial product, and then the rules change.

===
It is fairly likely that any “advanced version control system” people use has an early ancestor or at least inspiration in PWB/UNIX Source Code Control System (1974-), which was developed by Marc Rochkind (next office) and Alan Glasser (my office-mate) with a lot of kibitzing from me and a few others.

Likewise, much of modern software engineering’s practice of using high-level scripting languages for software process automation has a 1975 root in PWB/UNIX.

It was worth a lot of money in Bell labs to pay good computer scientists to build tools like this, because we had to:

- build mission-critical systems
- support multiple versions in the field at multiple sites
- regenerate specific configurations, sometimes with site-specific patches
- run huge sets of automated tests, often with elaborate test harnesses, database loads, etc.

This is more akin to doing missile-control or avionics software, although those are somewhat worse, given that “system crash” means “crash”. However, having the US telephone system “down”, in whole or in part, was not viewed with favor either.

We (in our case, a tools department of about 30 people within a software organization of about 1000) were supporting software product engineers, not researchers. The resulting *software* was the product, and errors could of course damage databases in ways that weren’t immediately obvious, but could cause $Ms worth of direct costs.

It is easier these days, because many useful tools are widely available, whereas we had to invent many of them as we went along.

By late 1970s, most Bell Labs software product developers used such tools.

But, Bell Labs researchers? Certainly no the physicists/ chemists, etc, an usually not computing research (home of Ritchie & Thompson). That’s because people knew the difference between R & D and had decent perspective on where money should be spent and where not.

The original UNIX research guys did a terrific job making their code available [but "use at your own risk"], but they’d never add the overhead of running a large software engineering development shop. If they got a bunch of extra budget, they would *not* have spent it on people to do a lot of configuration management, they would have hired a few more PhDs to do research, and they’d have been right.

The original UNIX guys had their own priorities, and would respond far less politely than Gavin does to outsiders crashing in telling them how to do things, and their track record was good enough to let them do that, just as GISS’s is. They did listen to moderate numbers of people who convinced them that we understood what they were doing, and could actually contribute to progress.

Had some Executive Director in another division proposed to them that he send a horde of new hires over to check through every line of code in UNIX and ask them questions … that ED would have faced some hard questions from the BTL President shortly thereafter for having lost his mind.

As I’ve said before, if people want GISS to do more, help get them more budget … but I suspect they’d make the same decisions our researchers did, and spend the money the same way, and they’d likely be right. Having rummaged a bit on GISS’s website, and looked at some code, I’d say they do pretty well for an R group.

Finally, for all of those who think random “auditing” is doing useful science, one really, really should read Chris Mooney’s “The Republican War on Science”, especially Chapter 8 ‘Wine, Jazz, and “Data Quality”‘, i.e., Jim Tozzi, the Data Quality Act, and “paralysis-by-analysis.”

When you don’t like what science says, this shows how you can slow scientists down by demanding utter perfection. Likewise, you *could* insist there never be another release of UNIX, Linux, MacOS, or Windows until *every* bug is fixed, and the code thoroughly reviewed by hordes of people with one programming course.

Note the distinction between normal scientific processes (with builtin skepticism), and the deliberate efforts to waste scientists’ time as much as possible if one fears the likely results. Cigarette companies were early leaders at this, but others learned to do it as well.

No comments: