Jump to content
Search In
  • More options...
Find results that contain...
Find results in...
wesleyjohnson

P_CheckMissileSpawn bug

Recommended Posts

Posted (edited)

You would think that every bug would be found by now.

 

The grenade launcher bug with DoomLegacy was tracked down to a bug in P_CheckMissileSpawn.

I found the same code in Boom, PrBoom, Eternity, and Chocolate Doom.  Every port derived from these probably has it.

 

The function P_CheckMissileSpawn in p_mobj.c has this


  // move a little forward so an angle can
  // be computed if it immediately explodes

  th->x += (th->momx>>1);
  th->y += (th->momy>>1);
  th->z += (th->momz>>1);

The problem is that the function P_UnsetThingPosition depends upon the position not being changed while the thing is linked into the block list.

These three lines modify the thing position, without doing an UnsetThingPosition first.

The UnsetThingPosition of the TryMove will try to unlink the thing from the blockmap based upon a block indexing dependent upon those x and y positions.

That small change is enough to put the x,y into a different block of the block map.

 

When the x,y of the thing is changed without unlinking from the blockmap list first, it tries to unlink in the wrong blockmap list.

 

The UnsetThingPosition in p_maputl.c has some Killough code for unlinking from the blockmap,  It does not check if the thing is in the blockmap or if anything is in the blockmap.

It only checks if the thing->bprev is NULL.  If it is NULL then that thing MUST have been at the head of the blockmap list, and so the blockmap list is set to thing->bnext.

This corrupts that list, and leaves the thing still at the head of the other blockmap list.

 

Later, when the thing gets linked into a blockmap list again, it can get linked to itself in the list.

Doing a blockmap scan on such a corrupted list will result in a tight loop, and the program will be hung.

 

 

This is the fix that I have used.


// Cannot change thing position while it is set in blockmap.
// Errors occur during unset because get different block.
#if 1
    // A slight difference may make demos lose sync.
    // move a little forward so an angle can
    // be computed if it immediately explodes
    fixed_t t_x = th->x + (th->momx >> 1);
    fixed_t t_y = th->y + (th->momy >> 1);
    th->z += (th->momz >> 1);  // does not affect blockmap
#else
    // move a little forward so an angle can
    // be computed if it immediately explodes
    th->x += (th->momx >> 1);
    th->y += (th->momy >> 1);
    th->z += (th->momz >> 1);
#endif

    if (!P_TryMove(th, t_x, t_y, false))
    {

 

Without such a fix the grenade launcher (dt-gren.zip) will cause corruption by halfway through the first level of doom2.  I have verified that with several attempts, and it is consistent.

I have verified that with this fix the faults stop and I was able to play through three levels using the grenade launcher (dt-gren.zip) without triggering the blockmap corruption detection code.

This was on DoomLegacy instrumented for the bug and with blockmap detection code in place.

 

 

-----

Another idea.

The unlinking can be made more idiot proof my keeping the link to the head of the blockmap in the bprev ptr when it is at the head of a blockmap list.

Then if the thing position is changed, no problem, because the unlinking from the blockmap is now independent of the current x,y position.

The first thing in the list has its bprev ptr set to point at the head of the blockmap list (instead of setting bprev to NULL).

It is detected by testing the bprev ptr as being between blocklinks and blocklinks_end, which is actually cheaper than the complicated test that it replaces.

If it points to the head then that target is updated as the head.  Otherwise it is updated as another thing.

When the head thing gets unlinked, this ptr can be copied to the next thing, same as before.

 

 

Edited by wesleyjohnson

Share this post


Link to post

AIUI blockmap corruption is only a vanilla bug. In Boom there is no dependence on the actor's current position in P_UnsetThingPosition. It is unlinked from whichever blocklinks list the previous P_SetThingPosition linked it into, regardless of changes to thing->{x,y} in the interim. It cannot be unlinked from the wrong list.

 

The pointer-to-pointer scheme (thing->bprev being a pointer to either the list head, if it is the first item on the list, or otherwise the previous item's bnext pointer, not simply a link to the previous item) is quite clever & elegant but took me considerable effort to understand.

Share this post


Link to post

My C# port had a hang-up once due to an infinite loop, but I couldn't identify the cause, which has been bothering me. I think this might be the reason 🤔

Share this post


Link to post
Posted (edited)

In the copy of Boom that I have (with uppercase file names), this is what appears in P_MAPUTL.C.

(Of course I looked at my copy of Boom, before I followed your link to a GIT repository).

 

//
// P_UnsetThingPosition
// Unlinks a thing from block map and sectors.
// On each position change, BLOCKMAP and other
// lookups maintaining lists ot things inside
// these structures need to be updated.
//

void P_UnsetThingPosition (mobj_t *thing)
{
  if (!(thing->flags & MF_NOSECTOR))
    {
      // inert things don't need to be in blockmap?
      // unlink from subsector
      if (thing->snext)
        thing->snext->sprev = thing->sprev;

      if (thing->sprev)
        thing->sprev->snext = thing->snext;
      else
        thing->subsector->sector->thinglist = thing->snext;

        // phares 3/14/98
        //
        // Save the sector list pointed to by touching_sectorlist.
        // In P_SetThingPosition, we'll keep any nodes that represent
        // sectors the Thing still touches. We'll add new ones then, and
        // delete any nodes for sectors the Thing has vacated. Then we'll
        // put it back into touching_sectorlist. It's done this way to
        // avoid a lot of deleting/creating for nodes, when most of the
        // time you just get back what you deleted anyway.
        //
        // If this Thing is being removed entirely, then the calling
        // routine will clear out the nodes in sector_list.

      sector_list = thing->touching_sectorlist;
      thing->touching_sectorlist = NULL; //to be restored by P_SetThingPosition
    }

  if (!(thing->flags & MF_NOBLOCKMAP))
    {
      // inert things don't need to be in blockmap

      if (thing->bnext) // unlink from block map
        thing->bnext->bprev = thing->bprev;

      if (thing->bprev)
        thing->bprev->bnext = thing->bnext;
      else
        {
          int blockx = (thing->x - bmaporgx)>>MAPBLOCKSHIFT;
          int blocky = (thing->y - bmaporgy)>>MAPBLOCKSHIFT;
          if (blockx>=0 && blockx < bmapwidth &&
              blocky>=0 && blocky <bmapheight)
            blocklinks[blocky*bmapwidth+blockx] = thing->bnext;
        }
    }
}

 

Right at the bottom is the indexing by thing->x and thing->y that selects the wrong blocklinks[].

 

PrBoom has different code for UnsetThingPosition, which I think is a version of what follows, but the PrBoom is done in that Killough style that is hard to read.

I assume that the RjY link to the GIT repository is for PrBoom.

Eternity has the PrBoom method, and Chocolate has the Boom buggy unlink.

I am not sure if this is the kind of bug that Chocolate would want to keep as the main effect of it is to crash or hang the program.

 

I actually had looked at that PrBoom code before posting, but as it was the tail end of an all nighter, not all the brain cells were still awake, and I needed them all to unwind that Killough code to recognize what it was.  His comments actually make sense in hindsight, they usually don't help at all.

 

---

Unlink that is not dependent upon thing x,y.

I actually came up with this before RjY pointed out the PrBoom Killough code that seems to do the same thing.  I have not yet entirely convinced myself as that would require another painful session of trying to figure out exactly what that Killough code does.

 

// In mobj_t
  mobj_t *   bnext;
  mobj_t * * bprev;  // only used for unlinking

// To link
  headp = &blocklinks[...];

  thing->bnext = *headp;
  thing->bprev = headp;  // ptr to head
  if( *headp ) {
    *headp->bprev = & thing->bnext;  // ptr to thing link
  }
  *headp = thing;

// To unlink
  // Do not need x,y to unlink.
  if( thing->bprev ) { // IF NULL then is not linked, could be error.
      *(thing->bprev) = thing->bnext;  // unlink from prev head or thing.
      if( thing->bnext )
          thing->bnext->bprev = thing->bprev;  // propogate back ptr
  }

  // Can set bnext and bprev to NULL, or not.
  thing->bnext = NULL;
  thing->bprev = NULL;

 

---

The CheckMissileSpawn still violates a rule that the rest of the program follows.  It still should be fixed.  Putting in a sneaky fix somewhere else and not even commenting about it in the offending code is how you make fragile, hard to maintain code.

 

Edited by wesleyjohnson

Share this post


Link to post
Posted (edited)

 

2 hours ago, wesleyjohnson said:

PrBoom has different code for UnsetThingPosition, which I think is a version of what follows, but the PrBoom is done in that Killough style that is hard to read.

I assume that the RjY link to the GIT repository is for PrBoom. 

Eternity has the PrBoom method, and Chocolate has the Boom buggy unlink.

I am not sure if this is the kind of bug that Chocolate would want to keep as the main effect of it is to crash or hang the program. 

  

 

 

Why should it be changed?

This sounds to me that all relevant ports that care about such things already perform stable blockmap management.

And this is precisely the kind of thing Chocolate Doom will never - ever - fix, as its stated mission goal is perfect recreation of any and all vanilla quirks, i.e. if vanilla crashes, so will Chocolate, and if Chocolate crashes, so will vanilla. Why should this one be different than, say, a visplane overflow crashing the engine?

 

Yes, CheckMissileSpawn is somewhat bugged, but if that gets changed it will also affect demo compatibility, which for many of the affected ports is important - they simply can't without breaking one of their most important selling features. It could only be done as an option, but that's only useful if observable odd behavior is caused by it.

It's something I wouldn't even want to touch in GZDoom because it's impossible to foresee what side effects it may cause to change game logic like that.

Share this post


Link to post
5 hours ago, wesleyjohnson said:

The CheckMissileSpawn still violates a rule that the rest of the program follows.  It still should be fixed.  Putting in a sneaky fix somewhere else and not even commenting about it in the offending code is how you make fragile, hard to maintain code. 

 

 

To me, the comment reads like it was changed so that it didn't have to do a blockmap lookup if it's the head for a performance / code simplicity standpoint. Killough also notes that: "Also more robust, since it doesn't depend on current position for unlinking." So the fact that it fixed P_CheckMissileSpawn was simply an accident and wasn't the reason any the unlink code was changed in the first place. I'm not sure what the point of changing P_CheckMissileSpawn would be, you even said this yourself earlier:

 

14 hours ago, wesleyjohnson said:

Another idea.

The unlinking can be made more idiot proof my keeping the link to the head of the blockmap in the bprev ptr when it is at the head of a blockmap list. 

Then if the thing position is changed, no problem, because the unlinking from the blockmap is now independent of the current x,y position. 

 


But this was interesting anyway. Didn't know this was an issue in the original code.

Share this post


Link to post
Posted (edited)
19 hours ago, wesleyjohnson said:

The CheckMissileSpawn still violates a rule that the rest of the program follows.  It still should be fixed.  Putting in a sneaky fix somewhere else and not even commenting about it in the offending code is how you make fragile, hard to maintain code. 

 

FWIW A_VileAttack does not bother to call P_{Uns,S}etThingPosition while moving the fire, although A_Fire does. So there is somewhere else to fix if you are fixing this class of bugs. Also in P_Move, monsters on icy floors will be P_TryMove'd, then immediately returned and given an equivalent momentum push instead, without an unset/set. Edit: In looking up the previous link I discover that MBF proper actually does use unset/set here. Only PrBoom does not. Fascinating. I am not able to find any explanatory comment or commit message. I wonder what the thinking was. Edit: maybe performance, since P_SetThingPosition calls R_PointInSubsector, necessitating a BSP tree walk.
 

FWIW I also agree with others that the switch to the pointer-to-pointer scheme was for general robustness, not made specifically to fix P_CheckMissileSpawn. I think we are all agreed that it is an improvement since the first list item can find its head without having to guess from the current x,y. I am probably just making noise here, sorry.

Share this post


Link to post
Posted (edited)

Thank you RjY , you always have some of the best in depth knowledge.  That is why I waited on committing code until I read these comments today.
I am adopting the ptr scheme as I described, and as PrBoom also implemented, just because it prevents unlink problems.  It is also faster and smaller.

 

Changing CheckMissileSpawn makes me nervous too, which is why I checked how far the effects of using temps would carry.

It looks like TryMove will save of the old position, and changes to the new position values without even looking at the old position.

The old position is only used to check for crossing a special line.

The way CheckMissileSpawn updates x,y before calling TryMove looks to like it would prevent the missile from crossing special lines at spawn.

What does a missile do if it crosses a special line anyway ?  I have been working on too many projects to remember a missile being able to do anything except collide with one.

I also cannot see how in TryMove it avoids triggering special lines for Missiles.  I must be missing something, or I am just too tired now.


I think CheckMissileSpawn was buggy code.  If found it in Heretic and Boom, with the old position sensitive blockmap unlink.

Later MBF (Aug 98) made this change to the blockmap unlink, and I agree with doubts that Killough did it to fix anything.

The bug would not get triggered very often.  Only when a bouncy grenade was used was it even noticed in DoomLegacy, after 22 years.

 

For DoomLegacy it would not matter, I could go either way in CheckMissileSpawn actually.  Either requires a whole lot of explaining in comments.

 

The Vile attack is probably an exception, as that fire is not a physical object, and I don't think it appears in the blockmap at all.  MT_FIRE is MF_NOBLOCKMAP in info.

I doubt it has any effect in the sector link list either.

 

On ICE, without that blockmap update, shooting at them will sometimes fail, as the blockmap they are linked into is not the one you will be aiming at.

Looks to me to be a bug to not have it as in MBF .

 

I really want to document what effects are produced by these unusual x,y updates, so they can be maintained later.

It is like a latent bug waiting for someone.

With the change to unlinking, it is certainly possible to now relax the requirements on bracketing thing x,y updates with unset and set position.

The MBF unlinking is probably required by now, just to prevent corruption due to these other suspicious x,y updates.

 

 

Edited by wesleyjohnson

Share this post


Link to post
Posted (edited)

I just noticed in info.c  that every MISSILE is also MF_NOBLOCKMAP.

This was never noticed because missiles are not in the blockmap, normally.

It takes a DEH hack to create a missile object that is in the blockmap to trigger this bug.

 

 

Got to sleep now, bye.

Share this post


Link to post
11 hours ago, wesleyjohnson said:

I just noticed in info.c  that every MISSILE is also MF_NOBLOCKMAP.

This was never noticed because missiles are not in the blockmap, normally.

It takes a DEH hack to create a missile object that is in the blockmap to trigger this bug.

 

 

Got to sleep now, bye.

Probably such a missile is created in ROOM.WAD when punching self in the face, a BFG projectile is created with DEHACKED.

 

Could you tell me how to fix that in DEH file so that the projectile is MF_NOBLOCKMAP?

Share this post


Link to post

The MF_NOBLOCKMAP flag has value 16, so you'll have to add[*] 16 to the value of the "Bits" field of the thing in question.

 

[*] Technically, you'll have to binary-OR the current value and 16.

Share this post


Link to post

You will have to look into DT-GREN, file  MBFONLY.DEH how they did it.

The bouncing grenade started all this by triggering the bug.

And if one can do it, then there will likely be others.  Still got to fix it enough so that it does not crash the program.

For the rest, I would suggest at least adding comments that it violates an assumption but is allowed due to safer unlink in UnsetThingPosition.

 

 

Share this post


Link to post
Posted (edited)
# Turn a rocket launcher into a grenade launcher
# By Lee Killough
# And Don Tello (gwhittl@ibm.net)
# Works only with MBF

Thing 34
Bits = BOUNCES | DROPOFF | TOUCHY
Hit points = 1
Death sound = 82
Initial frame = 968
Death frame = 969
Missile damage = 200
Mass = 90

And look who is responsible for the thing.   Of course, he only tested it on MBF.

Share this post


Link to post
2 hours ago, wesleyjohnson said:

And look who is responsible for the thing.   Of course, he only tested it on MBF.

 

Yes, of course, because MBF was the first port to introduce the concepts of BOUNCY and TOUCHY and the only port to support them at that time. 

Share this post


Link to post

Not that it has any effect here, but I got curious.

They made two dehacked files for the grenade launcher.

The DT-GREN deh file for normal Doom (DT-GREN.DEH) has the following:

Patch File for DeHackEd v3.0

# Note: Use the pound sign ('#') to start comment lines.

Doom version = 17
Patch format = 6


Thing 34 (Rocket (in air))
Speed = 1638400
Mass = 20
Bits = 66576

The Bits translate to  0x00010410 = MF_MISSILE | MF_DROPOFF | MF_NOBLOCKMAP.

They tried to do the same grenade, same graphics, except that it does not bounce.   Makes me wonder why the NOBLOCKMAP made it into the one and not the other.  Did they find out something when they tested it?

Note that the names are not on this version, so I wonder if Killough only made the MBF version.  The whole header and layout is different.

 

Kindly note what would happen to the program, if that NOBLOCKMAP has been left out of the grenade launcher for the normal Doom.  Anyone can make such a DEH file.

Depends if you want your port to survive malformed DEH files, or want to authentically crash like Vanilla Doom.

(YESS, malformed does depend upon if your port can survive loading the DEH or not).

Share this post


Link to post
1 hour ago, wesleyjohnson said:

Makes me wonder why the NOBLOCKMAP made it into the one and not the other.  Did they find out something when they tested it?

Actually, the fact that projectiles without NOBLOCKMAP will randomly crash vanilla Doom has been common knowledge since as early as 1995 (the phenomenon is mentioned in Fun with DeHackEd!, for example), though until this thread I don't think anyone has known why.

Share this post


Link to post

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
×