Friday, October 2, 2020

Sentimental Shooting Graphics Files

So at the end of my previous post, I left a juicy little nugget of information about me poking around the game's graphics files.  I've "completed" a major milestone in doing so, and while there's still stuff left to do, I have enough to actually post about.

The game's graphics are in the PCG and UME files.  Judging by file extension alone, you wouldn't expect them to be any kind of standard file format that's documented outside of a dark corner of the internet, but... you'd be wrong.  Both are standard Windows BMP files, changed ever so slightly so that they don't appear to be proper files if you change their extensions and open them.

The PCG files are the omake pictures, and simply have 0x70 added to the first 0x03E8 bytes of the file.  That's it.
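For illustration, undoing that in code is only a few lines. This is a minimal Python sketch, not the actual PowerShell script; the function name is made up, and it assumes the addition wraps around at 0xFF, so decoding is a subtraction modulo 256.

```python
def decode_pcg(data: bytes, count: int = 0x03E8, delta: int = 0x70) -> bytes:
    """Subtract `delta` from the first `count` bytes, wrapping modulo 256.

    Assumption: the game's "add 0x70" wraps on overflow, so the inverse
    is a subtraction masked to one byte. Bytes past `count` are untouched.
    """
    n = min(count, len(data))
    head = bytes((b - delta) & 0xFF for b in data[:n])
    return head + data[n:]
```

A decoded file should start with the two-byte BMP signature "BM", which makes for a quick sanity check.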

The UME files are the stage backgrounds, clothing fragments, enemy and boss sprites, menus, and everything else.  System.ume is actually a BMP that's just renamed, but the rest are very close to being proper BMP files.  Some have extra bytes at the end that any viewer or editor can easily ignore, but the real kicker is that they have improper data for some of the header fields.  You can make extremely minimal changes to allow an image editor to open them, but... for the stage backgrounds at least, we have the ability to make the game output the file directly.  Making these minimal changes does not result in a file that matches the one generated by the game.
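To see which header fields look off in a given file, it helps to dump them. Below is a minimal Python sketch that reads the standard 14-byte file header and 40-byte info header of a Windows BMP. The function and key names are made up for illustration, and it assumes the UME files use that common 40-byte info header. Which fields are actually wrong in the UME files is not something this sketch knows; it only prints what is there.

```python
import struct

# Standard BMP layout, little-endian: 14-byte file header, 40-byte info header.
FILE_HEADER = struct.Struct("<2sIHHI")
INFO_HEADER = struct.Struct("<IiiHHIIiiII")


def read_bmp_header(data: bytes) -> dict:
    """Return the BMP header fields as a dict, without validating them."""
    if len(data) < FILE_HEADER.size + INFO_HEADER.size:
        raise ValueError("data is too short to hold a BMP header")
    signature, file_size, _res1, _res2, pixel_offset = FILE_HEADER.unpack_from(data, 0)
    (info_size, width, height, planes, bits_per_pixel, compression,
     image_size, _xppm, _yppm, colors_used, colors_important) = \
        INFO_HEADER.unpack_from(data, FILE_HEADER.size)
    return {
        "signature": signature,
        "file_size": file_size,
        "actual_size": len(data),  # handy to compare against file_size
        "pixel_offset": pixel_offset,
        "info_size": info_size,
        "width": width,
        "height": height,
        "planes": planes,
        "bits_per_pixel": bits_per_pixel,
        "compression": compression,
        "image_size": image_size,
        "colors_used": colors_used,
        "colors_important": colors_important,
    }
```

Running this over System.ume and one of the other UME files side by side would be one way to spot the fields that differ from what a proper BMP should contain.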

Thus began my journey for matching output.  I discovered that the stage backgrounds saved by the game in the SNAPSHOT directory have various values pulled in from System.ume, including the color palette.  Most of the UME files for stage backgrounds actually have a zeroed out section of their palette where the palette from System.ume should go, which made it easy to spot.  Something about how the game handles the data zeroes out the padding in each row by the time it ends up in the SNAPSHOT directory, and any extra bytes at the end of the file get removed.  Towards the end I identified two groups of files that needed different behavior, but couldn't pinpoint how to detect this from the file contents alone, so I gave in and implemented dirty hacks that check hardcoded filename lists.  It's finally over: I have matching output for all 24 stage backgrounds, and no visible corruption in any of the other files, but... it doesn't feel right with the dirty hacks still in there.  This is why I put the word "completed" in quotes in the first paragraph: it works, but I don't like it.
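The row padding and trailing-byte cleanup can be sketched roughly like this in Python. It is only an illustration of those two steps, not the actual script: the function name and parameters are made up, the dimensions are passed in explicitly rather than read from the (possibly improper) header, and it assumes uncompressed pixel data with the usual BMP rule that each row is padded to a multiple of 4 bytes. The palette import and the per-file special cases are not covered.

```python
def normalize_pixel_rows(data: bytes, pixel_offset: int, width: int,
                         height: int, bits_per_pixel: int) -> bytes:
    """Zero the padding bytes of every row and drop anything after the last row."""
    rows = abs(height)  # BMP height may be negative (top-down)
    stride = (width * bits_per_pixel + 31) // 32 * 4  # row size incl. padding
    used = (width * bits_per_pixel + 7) // 8          # bytes holding real pixels
    end = pixel_offset + stride * rows
    if len(data) < end:
        raise ValueError("file is shorter than its dimensions imply")
    out = bytearray(data[:end])  # trailing extra bytes are cut here
    for row in range(rows):
        start = pixel_offset + row * stride
        out[start + used:start + stride] = bytes(stride - used)
    return bytes(out)
```

For an 8-bit image that is 3 pixels wide, for example, each row occupies 4 bytes and the fourth byte of every row is the padding that gets zeroed.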

One issue is that some files exhibit... interesting changes if the System.ume palette is imported, even though they have the space for it.  Some of these changes are easily identified as incorrect, but others I'm not so sure about.  I have an idea how to proceed here, but it's low priority in the long run.

A major sticking point is that I have no idea if my code that generates matching output for the stage backgrounds also does so for any of the other files.  I have no way to make the game output these files, and thus no way to see what needs to be done.  Or, do I?

I have reached a point where all of my remaining logical steps involve disassembling the game to see how it loads the UME files, handles the color palettes, and outputs snapshots.  This will be difficult for me for three reasons:
  1. I don't know x86 assembly
  2. x86 assembly is far more complex than the assembly languages I've learned so far (Z80 and 65816)
  3. I don't trust the NSA so I'm not installing Ghidra (so there goes its much-touted decompilation feature)
I can still think logically enough to know how to proceed after getting a non-Ghidra disassembler installed, though.  I've already looked at SGSTG.EXE in HxD, and I can clearly see the strings the game uses.  Windows executables include the names of the various API functions called by the executable, and I can see the various file names as well, so labelling all the strings I'm interested in and then searching the code for references to them seems like a cromulent first step.  From looking around at different disassemblers, "identification of Windows API calls" is a feature that pops up a lot, which would be handy.  Figuring out how it builds the palette from reading the files shouldn't be that difficult once I have my foot in the door; it's just a matter of copying data around, and that's generally pretty easy to follow in any assembly language.
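The string-hunting step can be approximated without a disassembler. Here is a minimal Python sketch that lists runs of printable ASCII together with their file offsets, much like eyeballing the file in a hex editor. The function name and the minimum length of 4 are arbitrary choices for illustration, and it only finds plain ASCII; strings stored in Shift-JIS or UTF-16 would be missed.

```python
import re


def find_ascii_strings(data: bytes, min_len: int = 4):
    """Yield (file_offset, text) for each run of printable ASCII characters."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")


def filter_strings(data: bytes, needle: str):
    """Return only the strings containing `needle`, case-insensitively."""
    needle = needle.lower()
    return [(off, s) for off, s in find_ascii_strings(data) if needle in s.lower()]
```

Something like `filter_strings(exe_bytes, ".ume")` would list the UME file names. Note that these are offsets into the file, not the addresses the code uses; to search for references, each offset has to be translated to a virtual address using the executable's section table, which a disassembler normally does for you.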

Still though, I now have PowerShell scripts that work for both the PCG and UME files, both of which produce output that matches what the game outputs on its own for every file I can possibly verify it with.

Fixing the palette corruption issues that the Windows compatibility modes mitigate by tweaking the corresponding UME files looks to be possible.  Logo.ume defines a full 256-color grayscale palette but only uses a small portion of it.  Enemy2.ume, which contains the sprites in which I'd noticed some palette corruption, has plenty of unused palette entries.  I just need to figure out which palette entries get corrupted, and move them to areas of the palette that were previously unused.  The disassembly can potentially help with that, if I can catch where the palette corruption happens and discover its true extent.
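Finding the free palette slots is a simple counting exercise. The Python sketch below is a rough illustration with a made-up function name. It assumes 8-bit indexed pixel data, one byte per pixel, with the row padding already stripped (otherwise zeroed padding bytes would make index 0 look used).

```python
def unused_palette_indices(pixels: bytes, palette_size: int = 256) -> list:
    """Return the palette indices that never appear in the pixel data."""
    used = set(pixels)  # iterating bytes yields ints, i.e. palette indices
    return [i for i in range(palette_size) if i not in used]
```

This only looks at what the image itself references. Whether the game touches those entries in some other way is a separate question, and one the disassembly would have to answer.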

The pipe dream would be to fix whatever code is causing the corruption in the first place and just patch SGSTG.EXE.

I think I'll leave it there for now.  I'm at the point where I can open a BMP file in HxD and identify all the fields without having to consult an external reference.  If I don't post anything on this subject for a while, I've probably drowned myself in x86 assembly just to tweak a silly hentai game, or given up.
