So, you want to be a ROMhacker? (2006)

derefr · on Feb 25, 2021

In the modern era, I feel like ROMhacking is almost "obsolete" as a pursuit. By bit-twiddling a ROM image file, you're effectively forced to work within the constraints of an arbitrary static-linked memory-map generated by a linker decades ago to suit the precise needs of the particular assembler source fed in at the time (and the needs of the ROM chip that said source would be burned onto.)

IMHO, with the tools we have available today, it makes much more sense to not directly hack on ROM images themselves, but rather to first disassemble the ROM; to factor the produced assembly into modules; to reverse the assembler data constant sections into separate named resource files, with a Makefile that converts them back; and so forth. At each step, with each change to increase legibility, you ensure that typing `make` continues to reproduce the original ROM image byte for byte.
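A minimal sketch of that byte-for-byte check, assuming you already have the original and rebuilt images in memory as bytes. The function names here are made up for illustration and don't come from any particular disassembly project.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 hex digest of a ROM image."""
    return hashlib.sha1(data).hexdigest()


def first_difference(original: bytes, rebuilt: bytes):
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (a, b) in enumerate(zip(original, rebuilt)):
        if a != b:
            return offset
    # Same common prefix but different lengths: they diverge where one ends.
    if len(original) != len(rebuilt):
        return min(len(original), len(rebuilt))
    return None


def check_rebuild(original: bytes, rebuilt: bytes) -> str:
    """Summarize whether the rebuilt image reproduces the original."""
    offset = first_difference(original, rebuilt)
    if offset is None:
        return "OK " + sha1_hex(rebuilt)
    return "MISMATCH at 0x%06X" % offset
```

In practice you'd read both files with `pathlib.Path(...).read_bytes()` and run this after every refactoring step, so a change that accidentally shifts a byte gets caught right away.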

We've done this enough times now (e.g. https://github.com/pret/pokered et al) that the process is now as well-documented and tooling-supported as the process of hacking directly on ROMs is. With these tools, this reversing process is effectively no additional work over that required to develop a complete understanding of what the original game's ROM "says", which you'd need to do either way. (Just, in the case of ROMhacking, the output of that understanding process is messy rambling docs, rather than an annotated ASM codebase under version control.)

Once you have a source repo containing a clean, original-ROM-reproducing disassembly, you can then fork that repo, and develop any "hacks" you like on top of it, with a regular IDE and cross-compile toolchain. At this level, you don't have to worry about overrunning allocated strings-sections, rewriting pointers, patching subroutines with jumps because there's no room left at the call-site, etc. The assembler and linker handle all that, and sort+fixup all the symbols and entries after knowing how big they are. (There's also nothing† stopping you from just bumping up your target ROM image size if you run out of headroom.)
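To illustrate the pointer bookkeeping that the assembler and linker take off your hands, here's a toy sketch. It assumes NUL-terminated ASCII strings packed back to back at a hypothetical base address; real games use their own encodings and layouts.

```python
def layout_strings(strings, base_address):
    """Pack NUL-terminated strings back to back.

    Returns (pointers, blob): the address of each string, and the packed bytes.
    """
    pointers = []
    blob = bytearray()
    for text in strings:
        # Each string starts wherever the previous one ended.
        pointers.append(base_address + len(blob))
        blob += text.encode("ascii") + b"\x00"
    return pointers, bytes(blob)


# Original text: the second string lives at base + 3.
before, _ = layout_strings(["HI", "BYE"], 0x4000)

# Lengthen the first string: every later pointer moves, and is recomputed
# automatically instead of being hand-patched in a hex editor.
after, _ = layout_strings(["HELLO", "BYE"], 0x4000)
```

When hacking the binary directly, that second pointer is a hardcoded number somewhere in the ROM that you'd have to find and rewrite yourself.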

If you do go through this reversing process on your way to producing a hack, then as a side-benefit, you'll have also produced an artifact (the clean disassembly) that documents the original game for anyone who wants it, e.g. people who want to know how the game's algorithms work to build speedrunning tools; people who want to learn old programming techniques lost to time; people who want to conserve the game by porting it to new platforms; etc.

And, of course, you make it far easier for anyone who comes after you to also produce hacks. They don't have to go through the reversing process; they can just start with your clean disassembly code. (That "anyone else" also includes "future you who might want to come back and finish a years-old hack, and has forgotten how this all worked!")

If you're a programmer, and you feel drawn to make ROMhacks, you should reverse-engineer the source game first, if it hasn't been already. It benefits you now; it benefits you later; it benefits the community. When you reverse an old game, everybody wins.

-----

† Well, okay, maybe you'll be limited by the fact that the system architecture only has an address-space so large, and the original game as written was small enough that it didn't use banking, but upon expanding it you now need to use banking. That's a hurdle, but not an insurmountable one... unless the system didn't even support banking. (Are there any systems that didn't?)
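To make "banking" concrete, here's a sketch assuming the usual Game Boy arrangement: 0x0000-0x3FFF always shows bank 0, and 0x4000-0x7FFF is a window onto whichever 16 KiB bank the cartridge's mapper has selected. Other systems (and some mapper modes) differ, so treat the constants as an assumption.

```python
BANK_SIZE = 0x4000  # 16 KiB per ROM bank in this assumed layout


def rom_offset(bank: int, address: int) -> int:
    """Translate a (bank, CPU address) pair into an offset in the ROM file."""
    if 0x0000 <= address < 0x4000:
        # Fixed region: always bank 0, so the bank argument is ignored.
        return address
    if 0x4000 <= address < 0x8000:
        # Switchable window: offset within the window plus the bank's start.
        return bank * BANK_SIZE + (address - 0x4000)
    raise ValueError("address is not in cartridge ROM space")
```

Growing the ROM just means more banks to select from; the CPU still only ever sees the same 32 KiB of address space.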

grawprog · on Feb 25, 2021

I'm not sure rom hackers and people who can entirely disassemble a rom always overlap.

You don't need full knowledge of assembly or your target system's architecture to open a rom in a hex editor and play with some values.

The skills and tools needed to disassemble a rom, reprogram it and recompile it are a bit more involved.

Saying that nobody should bother hacking roms unless they have the skills to disassemble and recompile the game discounts a lot of the people who have brought the state of rom hacking and modification to what it is today.

milesvp · on Feb 25, 2021

Yeah, while I agree with the parent about taking the time to disassemble a rom for larger projects, I feel he may be ignoring how low a barrier to entry it is to just modify some hex values and see them affect the rom you’re playing. It used to be that a common first thing budding programmers did was write trainers for games, which often amounted to little more than changing a hex value from 0x03 to 0xff. Now you have 255 lives instead of the stock 3! Very heady stuff for someone in grade school, especially if those 3 lives cost a quarter in the arcade. There are a lot of people out there with little computer chops who would get a real thrill just replacing some of the sprites in their favorite game too.
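That kind of trainer-style edit fits in a few lines. This is only a sketch: the offset and the surrounding bytes below are invented for the example, and finding the real offset in an actual game is the hard part.

```python
def patch_byte(rom: bytes, offset: int, expected: int, new: int) -> bytes:
    """Return a copy of rom with one byte changed.

    Refuses to patch if the byte at offset isn't what we expected, which
    guards against patching the wrong ROM revision.
    """
    if rom[offset] != expected:
        raise ValueError(
            "expected 0x%02X at 0x%X, found 0x%02X" % (expected, offset, rom[offset])
        )
    patched = bytearray(rom)
    patched[offset] = new
    return bytes(patched)


# Hypothetical three-byte snippet where offset 1 holds the starting lives.
toy_rom = bytes([0xA9, 0x03, 0x85])
more_lives = patch_byte(toy_rom, 1, 0x03, 0xFF)
```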

kmeisthax · on Feb 25, 2021

I spent about a year doing this to the Telefang translation patch itself. No, not Telefang, the patch to Telefang. It was made with a version control system of "hex edit the last patched ROM to do what you want and post an IPS on the forums"... even though we were also writing tools to compress graphics and reinsert scripts (from MediaWiki, no less); as well as hand-assembled patches to get VWF text going. Needless to say this made fixing certain bugs or revising old work extremely difficult. For example, there were multiple versions of useful custom functions sitting in bank 0 (which had around 200 bytes free); developers had recreated certain utilities independently and assembled them separately into the HOME.
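For anyone who hasn't looked inside one, an IPS file is just a list of "write these bytes at this offset" records, which is why it carries no history or intent. A minimal applier sketch, assuming the commonly documented layout (a `PATCH` header, records with a 3-byte offset and 2-byte size, a size of 0 meaning a run-length record, and an `EOF` trailer), and ignoring edge cases such as a record whose offset happens to spell "EOF":

```python
def apply_ips(rom: bytes, patch: bytes) -> bytes:
    """Apply an IPS patch to a ROM image and return the patched copy."""
    if patch[:5] != b"PATCH":
        raise ValueError("not an IPS patch")
    out = bytearray(rom)
    pos = 5
    while patch[pos:pos + 3] != b"EOF":
        if pos + 5 > len(patch):
            raise ValueError("truncated patch")
        offset = int.from_bytes(patch[pos:pos + 3], "big")
        size = int.from_bytes(patch[pos + 3:pos + 5], "big")
        pos += 5
        if size == 0:
            # RLE record: 2-byte run length, then the byte to repeat.
            run = int.from_bytes(patch[pos:pos + 2], "big")
            data = bytes([patch[pos + 2]]) * run
            pos += 3
        else:
            data = patch[pos:pos + size]
            pos += size
        end = offset + len(data)
        if end > len(out):
            # Patches may write past the end of the ROM, growing it.
            out.extend(b"\x00" * (end - len(out)))
        out[offset:end] = data
    return bytes(out)
```

Nothing in there says which bytes are code, graphics, or a shared helper routine, so two patches written independently have no way to notice they duplicate each other.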

Also, this was a multi-version game, because late-90s mongames demanded it. We had only worked on Power Version, so anyone who wanted to play Speed Version in English was SOL. Fortunately, both versions were similar enough that once I had disassembled our own patch, and modified the title screen a bit, Speed Version just "worked". Such is the power of having source code.

ThomasWinwood · on Feb 27, 2021

You make it sound like there's a machine you can put a Game Boy ROM into and get out a disassembly, which is kinda true (https://github.com/mattcurrie/mgbdis) but it doesn't automatically split out data blocks or anything like that - it just tries to crawl the ROM and disassemble any code it can find. It's certainly not "effectively no additional work" compared to making targeted alterations to the binary and documenting your work.

And that's before you get to platforms where most if not all games are written in C - I question whether a mere disassembly of a game like Pokemon Emerald would even be useful to anyone, whereas the pokeemerald decompilation (https://github.com/pret/pokeemerald) is clearly useful but was a heck of a lot more work to produce.

> That's a hurdle, but not an insurmountable one... unless the system didn't even support banking. (Are there any systems that didn't?)

Depends what you mean by "support". I don't think any system has a built-in mapper - they just assign a chunk of memory space to the cartridge bus, and if your game is larger than that chunk of memory space you include a mapper on the cartridge. Nintendo provided standard mappers for machines like the NES and Game Boy because it's very hard to include a substantial game in the wedge of memory space you get on the processors in those machines, whereas only one game on the Genesis/Megadrive needed one.

LocalH · on Feb 26, 2021

>With these tools, this reversing process is effectively no additional work over that required to develop a complete understanding of what the original game's ROM "says" — which you'd need to do either way.

Respectfully disagree. Figuring out what binary blobs do, and editing/injecting those blobs, is much easier than reversing the entire logic to usable source, even with the advanced state of the scene making the latter easier than ever.

jsmith45 · on Feb 25, 2021

Clean disassembly can be extremely hard for some games, and does not always do what you want anyway.

I mean sure, it is always possible to dump out "something" that can reassemble to the original if you don't change anything. It is less easy to get everything correct such that if you change the length of a code block or some data array the rom still fully works. This potentially requires having properly identified all the code vs data, and all data that contains static pointers to other data (including immediate instruction operands) and replacing them with labels. But on some systems that is still not too hard.

On others it is terrible. For example, on the SNES, instruction sizes vary depending on the accumulator or index register mode sizes. This, along with occasional use of non-standard flow control (like subroutine calls that never return, but instead pop the return address off the stack and use it to find a jump table that occurs right afterwards), makes most attempts at automated disassembly fail. Semi-automated disassembly guided by information about instruction widths from an emulated playthrough helps, but getting such traces for all the code in a game can be really difficult.
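A small sketch of the width problem, covering only a handful of 65816 opcodes (REP, SEP, immediate LDA/LDX/LDY, NOP, CLC, RTS) and assuming straight-line code with no branches. The function name and the starting-mode parameters are mine, not from any real disassembler.

```python
M_FLAG = 0x20  # status bit: accumulator is 8-bit when set
X_FLAG = 0x10  # status bit: index registers are 8-bit when set


def instruction_lengths(code: bytes, m_8bit: bool = True, x_8bit: bool = True):
    """Return the byte length of each instruction in a straight-line run."""
    lengths = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op == 0xC2:  # REP #imm: clears status bits (selects 16-bit modes)
            mask = code[pc + 1]
            m_8bit = m_8bit and not (mask & M_FLAG)
            x_8bit = x_8bit and not (mask & X_FLAG)
            size = 2
        elif op == 0xE2:  # SEP #imm: sets status bits (selects 8-bit modes)
            mask = code[pc + 1]
            m_8bit = m_8bit or bool(mask & M_FLAG)
            x_8bit = x_8bit or bool(mask & X_FLAG)
            size = 2
        elif op == 0xA9:  # LDA #imm: operand width follows the M flag
            size = 2 if m_8bit else 3
        elif op in (0xA2, 0xA0):  # LDX #imm / LDY #imm: follows the X flag
            size = 2 if x_8bit else 3
        elif op in (0xEA, 0x18, 0x60):  # NOP, CLC, RTS
            size = 1
        else:
            raise ValueError("opcode 0x%02X not covered by this sketch" % op)
        lengths.append(size)
        pc += size
    return lengths


same_bytes = bytes([0xA9, 0xEA, 0xEA, 0x60])
as_8bit = instruction_lengths(same_bytes, m_8bit=True)    # LDA #$EA; NOP; RTS
as_16bit = instruction_lengths(same_bytes, m_8bit=False)  # LDA #$EAEA; RTS
```

The same four bytes decode as three instructions or two depending on a flag set somewhere earlier, possibly in a different subroutine, which is why a static pass can't always know which reading is right.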

Even if you do end up with a completely accurate disassembly, significant changes are not always easy. SNES ROM code banks may have little room remaining. Subroutines that can be called cross-bank are different from those that can be called in the same bank, and there are very visible differences in the assembly for cross-bank data access vs "current bank" access (which is not always the same as the current code bank). Assemblers that do the right thing here don't really seem to exist, and would be based on fallible heuristics that will silently misassemble in some cases, leaving you with a crash.

While obviously nothing is insurmountable here, it does mean that using ROM Hacking style techniques to avoid disturbing the bank layout can sometimes be easier than making the change you want, and then trying to shuffle things so that your new code will still fit.

Things get even worse if you want to maintain glitch level compatibility with the original game (as is common for things like practice hacks for games that have glitch speedrun categories). Some glitches very much depend on out of bounds data accesses returning a specific value which happens to be some code, or a specific value in some later table. It can be difficult to preserve those while allowing code and data to be relocated. (Other glitches don't always require these sorts of things, and Nintendo has accidentally preserved many glitches in some games while porting from assembly to C.)
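A toy illustration of why relocation can break such a glitch, assuming a flat ROM image and an unchecked indexed load. The table contents and layouts are invented.

```python
def read_table(rom: bytes, table_base: int, index: int) -> int:
    """Mimic an unchecked indexed load: nothing stops index from running
    past the end of the table into whatever bytes follow it."""
    return rom[table_base + index]


# Original layout: a 4-entry table immediately followed by some code bytes.
original = bytes([1, 2, 3, 4, 0x60, 0xEA])

# Rebuilt layout: one padding byte ended up between the table and the code.
rebuilt = bytes([1, 2, 3, 4, 0x00, 0x60, 0xEA])

glitch_before = read_table(original, 0, 4)  # index 4 is out of bounds
glitch_after = read_table(rebuilt, 0, 4)
```

In-bounds reads are identical in both layouts, so ordinary play is unaffected; only the out-of-bounds read that the glitch relies on changes.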

I will admit that the SNES is MUCH, MUCH worse than most other consoles in the sort of headaches I describe above. Most other consoles tend to have an unambiguous disassembly that can be far more automated, newer consoles tend to avoid having near/far pointers and near/far subroutines, etc. The glitch compatibility considerations still apply though.

I'm not arguing against full reverse-engineered disassemblies of games; those can be wonderful. And starting from those can be a great way to develop a hack on some systems. But it hardly obsoletes traditional rom hacking, which can require much, much less effort to get started with, and for some scenarios may be the better fit even with a complete disassembly available.