Museum criticizes Microsoft for 'mutilated' MS-DOS 4 open source release — posting on 'stupid' git blamed for the buggy blunder


On April 25, Microsoft released the source code for its landmark 1988 MS-DOS 4 operating system on GitHub, alongside its other DOS source code releases. However, posting MS-DOS 4 on GitHub in its current altered form seems to have broken some critical files entirely.

As Michal Necasek, noted developer and operator of the OS/2 Museum blog, points out in "How Not To Release Historic Source Code," git's failure to preserve timestamps and the conversion of files to UTF-8 pretty much break everything. Necasek praised the release of the code but criticized the bugs introduced in the process, saying, "But please please don’t mutilate historic source code by shoving it into (stupid) git."

Of the two issues, the conversion of source files to UTF-8 may be the more severe. The old tools of that era can't parse UTF-8 and likely can't be updated to do so, either. The version of MASM used to build MS-DOS 4 had a line length limit of 512 bytes, and the UTF-8 conversion pushes lines in some files above that limit, making those files unreadable to the assembler.
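
The size growth is easy to reproduce. The sketch below is illustrative only: the sample line is hypothetical, the original encoding is assumed to be CP437, and the 512-byte figure is simply the limit cited above. It shows how a line that fits comfortably in a single-byte code page can exceed the limit once it is re-encoded as UTF-8.

```python
# Illustrative only: the sample line is hypothetical and CP437 is an assumed
# original encoding; neither is taken from the actual repository.

LIMIT = 512  # line length limit in bytes, as cited in the article

# A comment line padded with a CP437 box-drawing character (U+2550).
line = ";" + "═" * 300

cp437_bytes = line.encode("cp437")  # one byte per character in CP437
utf8_bytes = line.encode("utf-8")   # U+2550 takes three bytes in UTF-8

print(len(cp437_bytes), len(cp437_bytes) <= LIMIT)  # 301 True
print(len(utf8_bytes), len(utf8_bytes) <= LIMIT)    # 901 False
```

The text is identical in both cases; only the byte count changes, and the byte count is what a tool with a fixed-size line buffer cares about.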

The severity of the file errors varies, but the OS/2 Museum post notes that core system files are a perfect match for the original disk image files. However, the source code also seems to correspond to 4.01, a "quiet" update to 4.00 that fixed a few bugs. It wasn't directly available but it did ship to PC OEMs.

The original post recommended simply releasing the raw files as an archive, with no UTF-8 conversion or anything of the sort. However, the story immediately continues in the comments, where co-developer of the MS-DOS 4 release Connor Hyde, aka Starfrost, acknowledges the issue and discloses legal reasons for not including timestamps.

The discussion between Michal and Starfrost continues briefly in the comments before they take it to emails we won't be pressing them for. Despite Michal's critical tone, it's noted that idiotic corporate policies "obviously" aren't the fault of an indie developer like Starfrost.

Hopefully, these issues can be resolved soon so that MS-DOS 4 can be enjoyed in its proper glory. That said, even when it worked properly, MS-DOS 4 wasn't well-liked because of its hefty 92 KB RAM usage.

Amusingly, this resulted in the competing DR-DOS skipping 4.0-4.99 versioning entirely in favor of going from version 3.41 to version 5.0 — though MS-DOS 4's multitasking focus still lives on today.


Christopher Harper has been a successful freelance tech writer specializing in PC hardware and gaming since 2015, and ghostwrote for various B2B clients in High School before that. Outside of work, Christopher is best known to friends and rivals as an active competitive player in various eSports (particularly fighting games and arena shooters) and a purveyor of music ranging from Jimi Hendrix to Killer Mike to the Sonic Adventure 2 soundtrack.

Comments from the forums

CmdrShepard

By saying the Git is idiotic this guy tells me all I need to know about him.

In other words, he is dumb as a rock and so is everyone who repeated what he said verbatim.

Neither of the two issues he blamed on Git is Git's fault -- they are the fault of the committer.

This article parroting his ignorance in the title is why we still have so many developers not using Git but instead putting source code in RAR archives, leaving commented out code in source files, and keeping change history in comments (or if we are lucky in a file called history.txt).
Reply
RichardtST

Personally, I can't stand Git either. Not for the reasons mentioned in the article, but because they change syntax so often that the documentation simply never matches and one spends hours trying to find something up to date that actually works and doesn't muck things up even worse than it was. Honestly I've never wasted so much time on a source control system before... And the new personal access tokens... OMG. Really? MS succeeding in making something even more useless than it already was. I will be happy when Git fades away into oblivion where it belongs...

As for the archive. It simply cannot be that big. Just zip it up, post it, and call it a day. Why do people overkill these things?
Reply
FoxtrotMichael-1

RichardtST said:
Personally, I can't stand Git either. Not for the reasons mentioned in the article, but because they change syntax so often that the documentation simply never matches and one spends hours trying to find something up to date that actually works and doesn't muck things up even worse than it was. Honestly I've never wasted so much time on a source control system before...
I've used git professionally for over a decade. I make multiple commits, PRs, and even sometimes rebases per day. Never once, in that entire time, have I come across a cli command that changed from the last time I used it or a cli command that didn’t match the documentation. It just doesn’t happen, so I have no idea what you’re talking about here. The git hate really boggles my mind. It’s an incredibly powerful SCM that’s actually fairly easy to understand and use if you take the time to learn it.
Reply
toffty

FoxtrotMichael-1 said:
I've used git professionally for over a decade. I make multiple commits, PRs, and even sometimes rebases per day. Never once, in that entire time, have I come across a cli command that changed from the last time I used it or a cli command that didn’t match the documentation. It just doesn’t happen, so I have no idea what you’re talking about here. The git hate really boggles my mind. It’s an incredibly powerful SCM that’s actually fairly easy to understand and use if you take the time to learn it.
Qft. I've used other tools in my career and Git, hands down, is the best. Is it straightforward? Outside of normal use, no, but Google and SO have the answer.

As for formatting, i'd say zip it up and have the zip exist side-by-side with the formatted version.

Not sure why people lose their heads over this stuff too...
Reply
TJ Hooker

RichardtST said:
Personally, I can't stand Git either. Not for the reasons mentioned in the article, but because they change syntax so often that the documentation simply never matches and one spends hours trying to find something up to date that actually works and doesn't muck things up even worse than it was. Honestly I've never wasted so much time on a source control system before... And the new personal access tokens... OMG. Really? MS succeeding in making something even more useless than it already was. I will be happy when Git fades away into oblivion where it belongs...

As for the archive. It simply cannot be that big. Just zip it up, post it, and call it a day. Why do people overkill these things?
It sounds like you're maybe confusing GitHub (an MS product) with git (the open source software)? GitHub has its own, additional CLI that's different from the native git CLI. MS isn't the maintainer of git, and I'm not sure if they're even major contributors.
Reply
NinoPino

CmdrShepard said:
By saying the Git is idiotic this guy tells me all I need to know about him.
He said that it is nonsense to use git for historical software, not that git is nonsense in general.
This is obviously because there is no need to modify anything so no need for a versioning system that, if used improperly, can change original files in sneaky ways.

CmdrShepard said:

In other words, he is dumb as a rock and so is everyone who repeated what he said verbatim.

Neither of the two issues he blamed on Git is Git's fault -- they are the fault of the committer.

This article parroting his ignorance in the title is why we still have so many developers not using Git but instead putting source code in RAR archives, leaving commented out code in source files, and keeping change history in comments (or if we are lucky in a file called history.txt).
Personally, in case of historical software, I totally agree with Michal Necasek.
There is no need to use complex tools like git because there isn't a single advantage in using it versus standard archives (like zip, for example). Archives are simpler, faster, more secure and more compatible.
Reply
CmdrShepard

NinoPino said:
He said that it is nonsense to use git for historical software, not that git is nonsense in general.
Here's a quote directly from his blog post:

But please please don’t mutilate historic source code by shoving it into (stupid) git.
Emphasis mine.

He rambles on about how bad Git is:

First of all, git does not preserve timestamps, which causes irreversible damage.
While it is true that git doesn't restore them, you can use git-restore-mtime command to work around that with some limitations.

As for why it doesn't restore file times (even though it preserves them in the log) here's a simple explanation:

It's because it would break every build system like make, maven, gradle, etc. that depends on file modification times to know what needs to be rebuilt. If a git checkout or a git pull pulls in commits that are older than the last executable you built, it would give those files an older timestamp. make therefore won't detect them as an updated dependency, and won't include those in a new build without doing a make clean first. This is super annoying.

There is git log for finding the last time a file was modified in version control and ls for finding the last time it was modified on your local disk, and it turns out there's good reasons for keeping those separate.
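
As a rough sketch of the make behavior described in that explanation (this is not make's actual implementation, and the file names and timestamps are hypothetical), a timestamp-based staleness check boils down to comparing modification times:

```python
import os
import tempfile


def needs_rebuild(target: str, sources: list[str]) -> bool:
    """Return True when the target is missing or any source is newer than it."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(src) > target_mtime for src in sources)


BASE = 1_700_000_000  # arbitrary reference time in seconds since the epoch

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "main.c")  # hypothetical source file
    obj = os.path.join(tmp, "main.o")  # hypothetical build output
    for path in (src, obj):
        with open(path, "w") as f:
            f.write("placeholder\n")

    os.utime(obj, (BASE + 2000, BASE + 2000))  # output built at BASE+2000

    # A checkout that stamps the changed file with the current time.
    os.utime(src, (BASE + 3000, BASE + 3000))
    stale_with_checkout_time = needs_rebuild(obj, [src])

    # The same change, stamped instead with an older commit time.
    os.utime(src, (BASE + 1000, BASE + 1000))
    stale_with_commit_time = needs_rebuild(obj, [src])

print(stale_with_checkout_time)  # True: the change is picked up
print(stale_with_commit_time)    # False: the change is silently skipped
```

The second case is the failure mode the explanation warns about: the source changed, but its restored timestamp makes it look older than the existing build output.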
As for the rest:

For practical purposes, old source files are not text files. They are binary files, and must be preserved without modification. It is not OK to take an old source file and convert it to UTF-8.
No, they aren't binary files. They are text files with specific encoding (CP437 most likely) and DOS line endings (CR/LF).

Finally, it wasn't Git who converted those files, at least not automatically.

NinoPino said:
This is obviously because there is no need to modify anything so no need for a versioning system that, if used improperly, can change original files in sneaky ways.
Git is perfectly capable of not modifying anything -- you can safely store EXE, DLL and all other forms of files in it without having a single issue.
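
A minimal sketch of why this holds, assuming git's default SHA-1 object format and no line-ending or encoding filters (such as `core.autocrlf`) in effect: a blob's object id is a hash over a short header plus the exact bytes of the file, so two files that differ only in line endings are stored as different objects. The sample assembly line is hypothetical.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Compute the SHA-1 object id git assigns to a blob with exactly these bytes."""
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()


crlf = b"MOV AX,BX\r\n"  # hypothetical line with DOS line endings
lf = b"MOV AX,BX\n"      # the same line with Unix line endings

print(git_blob_id(crlf) == git_blob_id(lf))  # False: different bytes, different object
print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 (the empty blob)
```

Any conversion therefore has to happen before the bytes reach the object store, either through configuration or by whoever prepared the files.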

NinoPino said:
There is no need to use complex tools like git because there isn't a single advantage in using it versus standard archives (like zip, for example). Archives are simpler, faster, more secure and more compatible.
Again, the problem is not what was used, but how it was used. He is outright wrong.
Reply
NinoPino

CmdrShepard said:
Here's a quote directly from his blog post:

Emphasis mine.

He rambles on about how bad Git is:

While it is true that git doesn't restore them, you can use git-restore-mtime command to work around that with some limitations.

As for why it doesn't restore file times (even though it preserves them in the log) here's a simple explanation:

As for the rest:

No, they aren't binary files. They are text files with specific encoding (CP437 most likely) and DOS line endings (CR/LF).

Finally, it wasn't Git who converted those files, at least not automatically.

Git is perfectly capable of not modifying anything -- you can safely store EXE, DLL and all other forms of files in it without having a single issue.

Again, the problem is not what was used, but how it was used. He is outright wrong.
You continue to decontextualize his words, and in this situation I do not agree with your conclusion.
Git is certainly capable of managing this type of source file, but I remain of the opinion that it is not the best tool to use when we have to preserve historical files.
Can you give me a single advantage of using Git versus normal archives when you have this type of file?
Reply
snemarch

I've used Git for more than a decade, and while it's not perfect, it's the best versioning tool I've used for day-to-day development work so far.

But I agree with Michal Necasek – it's a stupid choice **for releasing historical code**. Yes, you could do hacks for timestamp preservation, and you could commit all files as binary... but for a historical release, a compressed archive is simply a better solution.

If you want to do development on a fork of historical code, Git could be a fine choice - you probably wouldn't care about timestamps at that point, and part of the initial work would be ensuring you're able to build and release from the new repo, get tooling fixed, et cetera.
Reply
CmdrShepard

NinoPino said:
You continue to decontextualize his words
No, I am just quoting what he said and what you seem to be ignoring -- he said "stupid Git".

There's nothing to decontextualize -- calling the most commonly used distributed version control system, written by none other than Linus Torvalds to manage the Linux kernel codebase and used even by Microsoft to store the Windows source code, "stupid" is both rude and unprofessional regardless of the context, simply because it isn't true.

NinoPino said:
Git is certainly capable of managing this type of source file, but I remain of the opinion that it is not the best tool to use when we have to preserve historical files.
We can debate whether it's the right choice until we are blue in the face, but it's Microsoft's choice. Perhaps they plan to add newer versions of MS-DOS into the same repo later on in an attempt to reconstruct the code change history? They wouldn't be able to do that with archives.

NinoPino said:
Can you give me a single advantage of using Git versus normal archives when you have this type of file?
What is important is that there are no disadvantages as long as it is used correctly which wasn't the case here.
Reply
