My dad was a programmer in the early days. The machines he started on in the 1960s had 8 KB of RAM. Saving a byte then is the equivalent today of saving 1 MB on an 8 GB machine.
Multiply that by, say, the thousands of customer orders you're trying to process, and the goofy thing would be burning a lot of additional RAM because it might help somebody 35 years later. Who among us is writing code today worried about how it will be used in 2052?
>Who among us is writing code today worried about how it will be used in 2052?
This decade I knowingly wrote code that will break in 2036 [1]. My supervisor was against investing the time to make it future-proof (he will be retired by 2036), and I have good reason to believe the code will still be around by that time. I don't think I'm the only programmer in that position.
- This decade I knowingly wrote code that will break in 2038
Sure, but how bad was it really? Something you could fix relatively quickly with a little time and money, or an instance of Lovecraftian horror unleashed upon the world like so much COBOL code?
Even then, mix in people switching jobs, losing the knowledge of where or what all these landmines are, and add similar but unrelated issues across your entire codebase. This stuff adds up. I like to at least add stern log messages when we are at 10%-50% of the limitation. It's saved my ass before, especially when your base assumption can be faulty.
In one of those scenarios, where we expected the growth of an integer to last at least 100 years, due to certain unaccounted for pathological behaviors, a user burned through 20% of that in a single day. But we had heavy warnings around this, so we were able to address the problem before it escalated.
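A minimal sketch of that kind of guard-rail logging (the function name and thresholds here are hypothetical, just matching the 10%-50% idea above):

```python
import logging

logging.basicConfig(level=logging.WARNING)
logger = logging.getLogger("capacity")

# Hypothetical guard: complain loudly long before a counter hits its ceiling.
def check_headroom(value: int, limit: int, name: str) -> float:
    used = value / limit
    if used >= 0.50:
        logger.error("%s at %.0f%% of limit (%d of %d)", name, used * 100, value, limit)
    elif used >= 0.10:
        logger.warning("%s at %.0f%% of limit (%d of %d)", name, used * 100, value, limit)
    return used

# e.g. an auto-increment id measured against a signed 32-bit ceiling:
check_headroom(860_000_000, 2**31 - 1, "order_id")  # ~40% used -> warning
```

The cheap part is the check; the valuable part is that it fires years before the faulty base assumption does.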
Last time this came up, I ran the numbers and the cost of the RAM saved per date stored was hundreds of dollars. Not per computer, or per program, but per date. Comparing total memory sizes doesn't tell the whole story, because RAM for a whole machine is so much cheaper now.
Spending that much money on storing "19" just so your code keeps working in the unlikely event that it's still in use 3+ decades into the future isn't a good tradeoff. Obviously things are different now.
Excellent point. Yeah, the machines my dad started on had magnetic core memory [1]. Each bit was a little metal donut with hand-wound wiring.
And in some ways, even "hundreds of dollars per date" doesn't quite convey it. These machines were rare and fiendishly expensive. In 2017 dollars, they started at $1M and went up rapidly from there. Getting more memory wasn't a quick trip to Fry's; even if you could afford it and it was technically possible, it was a long negotiation and industrial installation.
Another constraint that we forget about is the physicality of storage. Every 80 columns was a whole new punch card. That's a really big incentive to keep your records under 80 characters. Each one of those characters took time to punch. Each new card required physical motion to write and read, and space to store.
It really was a different world. I think a lot of programmers don't understand just how different it was (I barely do), and don't realize that modern principles like programmer time being more expensive than computer time are not universal truths about computing, but are just observations of how things are in recent decades.
The interesting thing about this from an engineering point of view is, you quietly pass a threshold where the clever hack which was worthwhile becomes literally more trouble than it is worth. When that happens is a multivariate problem that we couldn't truly predict at the time of the code's creation. (and when it happens, there might not even be anyone on the payroll thinking about it)
You're calculating what it would cost to store a string representation of a date. Which is silly. You should always convert to a timestamp for storage. You can cram way more info into a single integer than you can with a base 10 string. And the bonus is you verify the date's correctness before storing.
Even a 32-bit int could hold 11 million years worth of dates. And if your software is used for longer than that, you can just change it to a 64-bit long and have software that will outlast the sun.
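The back-of-envelope arithmetic checks out, for what it's worth (this assumes you store a count of days, not seconds, in the 32-bit case):

```python
# A 32-bit unsigned day count: about 11.8 million years of distinct dates.
days_32 = 2**32
years_32 = days_32 / 365.25
print(f"32-bit day count: {years_32:,.0f} years")

# A signed 64-bit count of *seconds* (modern time_t style) still covers
# roughly 292 billion years, far longer than the sun's remaining lifetime.
years_64 = 2**63 / (365.25 * 24 * 3600)
print(f"64-bit second count: {years_64:.2e} years")
```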
Silly or not, that's the reality of punch card based technology (BCDIC, later extended to EBCDIC). Punch cards pre-date electronic computers, and making a relay tabulator set-up work with binary formats is impractical.
As computer hardware grew out of that, it maintained much of the legacy, down to hardware data paths and specialized processor instructions. It was more than a programming convention.
That was the right choice for the era. As mikeash points out, your approach takes more bits and more CPU cycles. But it also takes a computer to decode. Any programmer can look at a punch card, a hex dump, or even blinkenlights and read BCD. Decoding a 32-bit int for the date takes special code. Which you have to make sure to manually include in your program, the size of which you are already struggling to keep under machine limits.
Systems from this era were probably using BCD rather than base-10 strings. A BCD date would take up 24 bits.
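For reference, BCD packs one decimal digit per nibble, so a six-digit yymmdd date is exactly 24 bits. A sketch of the packing (not any particular machine's format):

```python
def to_bcd(digits: str) -> int:
    """Pack a string of decimal digits into BCD, four bits per digit."""
    value = 0
    for d in digits:
        value = (value << 4) | int(d)
    return value

date = to_bcd("991231")  # 31 Dec (19)99, year stored as two digits
print(hex(date), date.bit_length())  # 0x991231 24
```

Note how the hex dump reads directly as the date, which is exactly the debugging advantage mentioned elsewhere in this thread.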
Running a complicated date routine to convert to/from 32-bit timestamps would also have cost a huge amount. These machines had speeds measured in thousands of operations per second, and the division operations needed to do that sort of date work would take ages, relatively speaking. All on a machine that cost dozens of times the average yearly wage at the time, and accordingly needed to get as much work done as possible in order to earn its keep.
Sometimes this worry is thrust upon you by the problem domain. I do remember tackling the Y2K38 problem in 2008 - the business logic dictated that the expiration date should be tracked, and some of them were set to 30 years.
But a 2-digit date should take less than 7 bits. Were they using systems that didn't use 8-bit bytes? Why wouldn't the dates work from, say, 1900 to 2155?
Back in the day it was probably 7 bits, but the word size is not that important. The problem still exists today with a modern 64-bit computer:
Even if a system internally can store a timestamp with nanosecond precision since the beginning of the universe, all that precision is lost when communicating with another system if it must send the timestamp as a six character string formatted as "yymmdd" in ASCII.
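A quick illustration (the six-character wire format is the assumption here; once it's fixed, the internal representation doesn't matter):

```python
from datetime import datetime, timezone

# Internally: full four-digit year, sub-second precision.
t = datetime(2038, 1, 19, 3, 14, 7, 999999, tzinfo=timezone.utc)

# On the wire: a six-character "yymmdd" string. The century, the time of
# day, and the sub-second precision are all gone.
wire = t.strftime("%y%m%d")
print(wire)  # "380119" -- indistinguishable from 1938-01-19
```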
My understanding is that the actual number of bits used would generally have been 4 bits per digit, as they were using Binary Coded Decimal [1]. So dropping the 19 would save you a byte per date.
Sure, they could have used a custom encoding. But that increases maintenance cost and extra development work. All to solve a problem that nobody cared about at the time.
You are assuming 8 bits per byte, but a byte can be any number of bits.
With two bytes of 7 bits each, the range is only about 40 years.
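If those 14 bits hold a binary day count (one reading of the setup above), the arithmetic comes out to roughly four decades:

```python
bits = 2 * 7               # two 7-bit bytes
days = 2**bits             # 16,384 representable day numbers
print(days / 365.25)       # ~44.9 years of dates
```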
It is also impractical when the storage medium is punch cards, and the system's adder unit only counts in binary coded decimal.
But then you need special code to decode that. Code that you have to write yourself or borrow and include in your program. Remember, no shared libraries. And it means extra CPU time every time you have to display a date. Whereas BCD has special hardware support.
It means that data interchange is now much more complicated too. How do you get everybody to agree on the same 2-byte representation for dates? This is the 1960s, so you can't just email them. You have to have somebody type up a letter and mail it. Or if you want to get on the phone, a 3-minute international call will cost $12, which is about $100 in 2017 dollars.
Plus then you can't look at a hex dump or a punch card or front panel lights and see the date, so now you've made debugging much harder.
For example, some systems stored the year in a byte, and when printing out a report it printed "19" and that byte - so year 1999 would be followed by year 19100.
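That failure mode reproduces in a couple of lines (the function and variable names are made up, but the bug is exactly this shape):

```python
def format_year(years_since_1900: int) -> str:
    # The classic mistake: "19" is hard-coded, the counter keeps climbing past 99.
    return "19" + str(years_since_1900)

print(format_year(99))   # "1999"
print(format_year(100))  # "19100" -- the year after 1999
```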
Some systems, where storing numbers in columns of characters was common practice (COBOL idiomatic style?), stored the date as two digits (possibly BCD), so the possible range is 00-99 no matter how many bits are used.
But it's worse than that. In the 90s a lot of code used 16-bit values: two-character strings. That is, it stored a char(2), parsed it as a 2-digit number, and then converted it to a date by adding 1900.
So it was only really "saving space" when compared with storing a char(4).
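Roughly, the decode step looked like this (a sketch, not any particular system's code):

```python
def decode_year(field: bytes) -> int:
    # field is a char(2) column holding ASCII digits, e.g. b"85" -> 1985.
    # Hard-coding the 1900 is where the rollover lives: b"00" -> 1900, not 2000.
    return 1900 + int(field.decode("ascii"))

print(decode_year(b"85"))  # 1985
print(decode_year(b"00"))  # 1900 -- a century off, come January 2000
```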
But if they wanted to save space why not store an 8-bit number? I imagine it must have something to do with punch card compat or some binary coded decimal nonsense. Still seems inefficient.
If a system gives you two options for storing a date (using 2-digit or 4-digit years), how many dates do you need to store and use in calculations before you end up saving space by creating a new data type and all of the supporting operations to make the storage of the date itself more efficient? In recent years, it's more common to make this type of decision because something else is causing an issue, otherwise we rarely consider the space required for a date (and many languages no longer have a separate type for dates).
I doubt that's true, unless you mean it tautologically.
There are plenty of good programmers working on software that matters that should absolutely not be trading off hazy possible benefits in 2052 for significant costs now.
It's occasionally necessary. When I wrote the code for Long Bets [1], I took a number of prudent steps to make sure things would have a good shot at surviving for decades. But I only took the cheap ones; the important thing was to ship on time.
And I think that's the right choice for most people. Technological change has slowed down some, but 35 years is still an incredible amount of volatility. Betting a lot of money on your theories of what will be beneficial then is very risky.
> There are plenty of good programmers working on software that matters that should absolutely not be trading off hazy possible benefits in 2052 for significant costs now.
I guess it's not obvious, but I think there's really a continuum here. You don't necessarily need to write software that will run perfectly in 2052, but it'd be good if you wrote software that can be comprehended, adapted and altered later on. Maintainability is never a "hazy benefit." (If the problem isn't a total throwaway.)
Sure. Maintainability pays off relatively soon, and often makes systems simpler and cheaper to operate. But the topic in this sub-thread is the Y2K bug, where the proposed solution would have been expensive and provided no benefit for 35 years. And at the time, those benefits would have been very hazy.
I don't know, I think the attitudes that make you a good programmer mean you won't be satisfied leaving broken code in your product, no matter how far out the consequences are.
It's definitely true. Technical change in the 60s was enormous. During that period there was a lot of fundamental architectural change and experimentation; that's when they settled on 8-bit bytes as the standard, as well as many other things. Moore's Law became a thing. In the 70s is when we started seeing operating systems that look familiar to us, and even into the 80s it was plausible to introduce a new OS from scratch (see, e.g., NeXT, or Be).
The iPhone is a decade old; every phone now looks like it, and it's highly plausible that they'll look basically the same a decade from now, possibly much longer. Laptops are 30 years old; they've gotten cheaper, faster, and better, but are recognizably the same. HTML is coming up on 30, and it will be in use until long after I'm dead. TCP is nearly 40; Ethernet is over 40; even Wifi is 20.
So it's just easier now to guess what programmers will be doing in 35 years compared to 1965.
As someone who just wrote a quick hack for a temporary problem, I agree.
It's not just the shitty programmers who do this. Sometimes we have shitty product managers who won't push back against this kind of thing. And you're forced into creating something evil because most of the job is very good, but this one time, you have to suck it up.
My response to that, though I agree with you, is that when a supervisor or PM or whoever gets on you about something you know is bad, you negotiate.
"Yes, I'll do this for now because the company needs it now. But only if you guarantee me the time (and possibly people) to do it right later."
You get agreement in an email, create the ticket and assign it to yourself as mustfix two months from now. And you shove it down their throats.
That's not an ideal place to work if you have to do that, but I have worked at those places and this is how you deal with that situation.
"Yeah, I'll give you a shit solution in 1 day right now. But only if you give me a a couple of weeks for a good solution later."
In reality, I've mostly only had to deal with this situation in startups. Mid-level and mature companies are usually open to pushing back and getting things right. But there are exceptions. Today was an exception. But that's also one of the reasons I don't really want to work at startups anymore.
Shitty solutions are usually the right answer. At least in the areas I work in (mostly startups). I would estimate 99% of the code I write gets thrown away. Most of it is trying something out. Even for code that was intended to hit production, the company/project often gets cancelled before it ever hits production.
I'm not saying this is true in your case. But there are so many different classes and types of programmers and projects that it's hard to generalize.
99% of your shit code isn't getting thrown away. It's sticking around making life hell for people like me.
Stop writing shit code because it's going to get thrown away. If you work for startups, you are always operating in protoduction mode. Everything you write ends up in prod.
Write code that doesn't suck. It doesn't have to be perfect or optimal, but make it not suck before you push.
Hmm no. That's what's happening in your world, but you're imposing that world view on me.
Probably about 80% of the code I write doesn't even get looked at or used by another developer. If the technique/analysis proves useful, it gets rewritten/refactored. That has the added advantage that I then understand the model better.
For me there's a giant difference between code that lasts, which needs to be sustainable, and disposable code, which doesn't. I'm also very big on YAGNI; my code gets so much cleaner and more maintainable when I'm only solving problems that are at hand or reasonably close. Speculative building for the future can get insanely expensive: there are many possible futures, but we only end up living in one.
Indeed, I think a "do it right" tendency can prevent people from really doing it right. If we invest in the wrong sorts of rightness up front, we can create code bases that are too heavy or rigid to meet the inevitable changes. So then people are forced into different sorts of wrongness, working around the old architecture rather than cleaning it up.
Good for you. That's my approach, too. And to rig the system such that technical debt gets cleaned up continuously and gradually without the product managers knowing the details.
When there are real business reasons to rush something, I'm glad to support that by splitting the work like you suggest. But the flipside is them recognizing that not every thing is an emergency, and that most of the time we have to do it right if they don't want to get bogged down.
Well, yeah, I absolutely agree. Replaceability and maintainability go hand in hand in a system. It's a cruel irony that the code that sticks around, often sticks around because it's crap.
(that doesn't stop me from sometimes having a weird admiration for incomprehensible software kept going forever with weird hacks. It's like with movies, sometimes they're so uniquely awful that you have to admire the art of them)
In the 80s I questioned using two bytes for the date. I was laughed at by the experienced programmers. They said the software would be rewritten by then. It should have been, but it wasn't...
But there is a trade off between how much time you spend today vs future compatibility.