Reposting my HAD comment:
This is fraudulent, at best. While at some level this project is an interesting commentary on what can be done with an LLM, to say it resulted solely from binary analysis and clean-room re-implementation is actively disingenuous (aka a lie). There are literal references to the Apple source files that originated the C sources in this project. These are not referenced in the whitepaper at all!
For example, see these files (there are many more...). As of the current revision these files are still present and presumably used as-is.
Contribute to Kelsidavis/System7 development by creating an account on GitHub.
github.com
I don't know how widely it's known outside the Mac community, but the entire System 7.1 / "SuperMario" ROM source was leaked a few years ago and can be browsed in this repository:
https://github.com/elliotnunn/supermario . As someone who is quite familiar with that source, it was obvious from skimming the project sources that there was more "understanding" than could be explained by any amount of time spent solely in debuggers and binary reverse engineering tools. There are many common strings, method names, concepts, etc. that you will /never/ recapture exactly through any amount of binary analysis.
Clearly, the entire Apple source must have been fed into an LLM, along with indeterminate middle steps (we can't trust the methodology from the paper), the result being an "act-alike" that shares some concepts but has no practical relation after manual cleanup and additional implementation to get something that can boot. The sad part is that this alone would have been interesting without the fake premise, but to couch this project as any kind of "reverse engineering by LLM" is fake until proven otherwise.
Oh, and this is straight up bunk: