Making NPCs hear sounds in Unity

Background

A critical part of any stealth game (and most FPS games, realistically) is detection. The player obviously detects enemies through the use of sound effects and visual cues from the screen, but the way NPCs detect the player is substantially different: most NPCs do not "see", nor do they "hear". They don't use a rendered output from a camera position at their eyes to identify the player, nor do they use any clever system for picking out sound effects. 
For visual detection, they typically use raycasts - from their eyes to points on the player, which is already well documented for Unity and other engines. Sound detection is not so well known, though, as it does not apply to as wide a set of games. There is very limited literature available for sound detection, so I decided to come up with my own system.

Objectives

When setting out to conceptualise a system of sound detection, I stated a few requirements:
  1. The volume of a sound should affect the distance at which it can be detected
  2. The volume and possibly type of a sound should affect the stimulus sent to the NPC (Gunshots and footsteps > debris and physics noises)
  3. Sounds of a lower importance order should not override sounds of a higher order (e.g. if an NPC is chasing the sound of a gunshot, it should not be immediately distracted by a small sound behind it)
I also wanted the system to be cheap in terms of performance, and to avoid excessive instantiation of game objects and memory use.

Implementation - Detectable Sounds

I already had in place an implementation for collision-based sound generation. Simply put, I have a script attached to every object that I want to generate sounds upon impact with other objects, including moving props like barrels and crates as well as static objects like fences or terrain.
The script discriminates between "hard" and "soft" impacts using a relative-velocity threshold for each. Using the OnCollisionEnter function and checking whether the relative velocity of the impact is above either of these thresholds allows for quick and cheap physics sounds. (To make this more realistic, use OnCollisionStay, keep track of the contact points the object has, and play a sound whenever a new one is added.)
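As a rough illustration of the threshold check described above, here is a minimal engine-agnostic sketch. The names ImpactKind, ImpactClassifier and Classify are hypothetical and not part of my actual script, and treating a speed exactly equal to a threshold as passing it is an assumption.

```csharp
using System;

// Hypothetical names for illustration only.
public enum ImpactKind { None, Soft, Hard }

public static class ImpactClassifier
{
    // Returns Hard if the speed reaches hardThreshold, Soft if it only
    // reaches softThreshold, and None (play no sound) otherwise.
    public static ImpactKind Classify(float relativeSpeed, float softThreshold, float hardThreshold)
    {
        if (hardThreshold < softThreshold)
            throw new ArgumentException("hardThreshold must be >= softThreshold");

        if (relativeSpeed >= hardThreshold) return ImpactKind.Hard;
        if (relativeSpeed >= softThreshold) return ImpactKind.Soft;
        return ImpactKind.None;
    }
}
```

In Unity, the speed would come from collision.relativeVelocity.magnitude inside OnCollisionEnter(Collision collision), and the result would decide which sound to play, if any.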


With this collision sound effect system in place, I set out to devise a way to make the sound effects themselves detectable.

In the interest of performance, I opted for a typical object-pooling method: I keep a fixed-size queue of these detectable sounds, which are dequeued and activated when needed, then disabled and enqueued again once they expire. This avoids the overhead of instantiating and destroying objects in Unity at runtime.
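To show the pooling idea on its own, here is a small generic sketch that does not depend on Unity. FixedPool, TryAcquire and Release are hypothetical names chosen for this example, and returning false when the pool is exhausted (rather than growing it) is an assumption about how you might want it to behave.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical fixed-size pool: every item is created once, up front.
public sealed class FixedPool<T> where T : class
{
    private readonly Queue<T> _available;
    private readonly HashSet<T> _inUse = new HashSet<T>();

    public FixedPool(int capacity, Func<T> factory)
    {
        if (factory == null) throw new ArgumentNullException(nameof(factory));
        _available = new Queue<T>(capacity);
        for (int i = 0; i < capacity; i++) _available.Enqueue(factory());
    }

    public int AvailableCount => _available.Count;
    public int ActiveCount => _inUse.Count;

    // Hands out the oldest free item, or returns false if none are left.
    public bool TryAcquire(out T item)
    {
        if (_available.Count == 0) { item = null; return false; }
        item = _available.Dequeue();
        _inUse.Add(item);
        return true;
    }

    // Puts an item back on the queue; returns false if it was not checked out.
    public bool Release(T item)
    {
        if (!_inUse.Remove(item)) return false;
        _available.Enqueue(item);
        return true;
    }
}
```

Tracking the checked-out items makes a double release harmless, which is easy to trigger by accident when releases happen on a timer.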
I also created a class to more easily represent and access the important information of these detectable sounds. The C# code for this is as follows:

public class SoundSphere {
 private SphereCollider _collider;

 private GameObject _obj;

 public SphereCollider collider {
  set {
   _collider = value;
  }
  get {
   return _collider;
  }
 }

 public GameObject obj {
  set {
   _obj = value;
  }
  get {
   return _obj;
  }
 }

 public Transform transform {
  get {
   return _obj.transform;
  }
 }

 private float _radius;
 public float radius {
  set {
   _radius = value;
   collider.radius = value;
  }
  get {
   return _radius;
  }
 }

 private Vector3 _position;
 public Vector3 position {
  set {
   transform.position = value;
   _position = value;
  }
  get {
   return _position;
  }
 }
 
 public SoundSphere() {
  obj = new GameObject("Sound Sphere");
  obj.SetActive(false);
  obj.transform.parent = null;
  obj.tag = "DetectableSound";
  obj.transform.position = Vector3.zero;
 
  collider = obj.AddComponent<SphereCollider>();
  collider.isTrigger = true;
 }

 public void Create(Vector3 position, float radius/*, float lifetime*/) {
  this.radius = radius;
  this.position = position;
  obj.SetActive(true);
 }

 public void Destroy() {
  obj.SetActive(false);
 }
}

All this is simply to encapsulate the components (such as the sphere collider) in a way that can be accessed like a field, using C#'s property structures. This makes accessing the relevant information and calling the relevant functions much less verbose.

Then, in a new component (DetectableSoundScript), I create a queue filled with these SoundSphere instances. I also create a single stock Unity sphere collider that I use for moving sounds (e.g. a radio producing a constant sound that moves as it does).

private static Queue<SoundSphere> soundSpheres = new Queue<SoundSphere>(0);
private static List<SoundSphere> currentlyActiveSpheres = new List<SoundSphere>(0);

private SphereCollider movingSoundCollider;

private const int MaxSpheres = 25;


Then in the Awake function, I do a singleton-style initialisation wherein I create the static queue of SoundSphere instances, of the length MaxSpheres, as defined above. I also initialise the single moving detectable sound accordingly.


void Awake() {
 
 // Static pool: build it once, when no spheres exist in either collection
 // (checking the queue alone would rebuild the pool whenever every sphere is in use)
 if (soundSpheres.Count == 0 && currentlyActiveSpheres.Count == 0) {
  soundSpheres = new Queue<SoundSphere>(MaxSpheres);
  currentlyActiveSpheres = new List<SoundSphere>(MaxSpheres);
 
  for (int i = 0; i < MaxSpheres; i++) {
   soundSpheres.Enqueue(new SoundSphere());
  }
 }
 
 // Per-instance moving sound, parented so that it follows this object
 GameObject movingSoundObj = new GameObject("Moving Sound Sphere");
 movingSoundObj.SetActive(false);
 movingSoundObj.transform.parent = this.gameObject.transform;
 // Centre on the parent; setting the world position afterwards would undo this
 movingSoundObj.transform.localPosition = Vector3.zero;
 movingSoundObj.tag = "DetectableSound";
 movingSoundCollider = movingSoundObj.AddComponent<SphereCollider>();
 movingSoundCollider.isTrigger = true;
}

You'll notice the tag "DetectableSound" is used for both the moving and static sounds - this is to allow them to be quickly checked against in functions like OnTriggerEnter, which I use later for the hearing of sounds.



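public void CreateDetectableSound(Vector3 position, float radius, float lifetime) {
 // Dequeue throws on an empty queue, so skip the sound if every sphere is in use
 if (soundSpheres.Count == 0) {
  return;
 }
 SoundSphere active = soundSpheres.Dequeue();
 active.Create(position, radius);
 currentlyActiveSpheres.Add(active);
 StartCoroutine(RemoveDetectableSound(active, lifetime));
 // lastRadius and lastSoundPosition are fields of this component (declarations not shown)
 lastRadius = radius;
 lastSoundPosition = position;
}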

This is the function for instancing the static sounds. Simply, it grabs a sound sphere off of the queue, calls its Create function, which sets the position and radius of the sphere and activates it, before adding it to the list of active spheres.
Then, I do something that might look unusual to a lot of Unity developers - I start a coroutine. Simply put, this allows code to execute on a delay. For this purpose it behaves much like Unity's own Invoke function, but allows for arguments to be passed in. The coroutine in question, RemoveDetectableSound, looks like this:


IEnumerator RemoveDetectableSound(SoundSphere sphere, float lifetime) {
 yield return new WaitForSeconds(lifetime);
 currentlyActiveSpheres.Remove(sphere);
 sphere.Destroy();
 soundSpheres.Enqueue(sphere);
}

The function waits for the sphere's lifetime to be up before removing it from the list of active spheres, and calling the SoundSphere.Destroy method, disabling it, and adding it back onto the end of the queue of available spheres.


As for the moving spheres, it's pretty much as you'd expect:

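public void CreateDetectableMovingSound(float radius, float lifetime) {
 // radiusMultiplier is a field of this component (declaration not shown)
 movingSoundCollider.radius = radius * radiusMultiplier;
 movingSoundCollider.gameObject.SetActive(true);
 movingSoundCollider.transform.localPosition = Vector3.zero;
 // Cancel any pending removal so an earlier call cannot cut this sound short
 CancelInvoke("RemoveMovingSound");
 Invoke("RemoveMovingSound", lifetime);
}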
 
private void RemoveMovingSound() {
 movingSoundCollider.gameObject.SetActive(false);
}

Implementation - Hearing

I'm going to exclude my NPC script itself, since there are many better guides out there to tell you how to do different alertness levels, behaviour and so on, but this is an outline of my chosen implementation:
  • The NPC has a function SendStimulus, that allows other scripts to send events that alter the NPC's "alertness" state
  • The NPC will update its behaviour if a stimulus of the same alertness/importance or higher is sent
  • Over time, the NPC will decay back through the alertness levels - Alert to Suspicious, Suspicious to Idle etc.
  • The NPC will wander if Idle, and chase the target/target's last known position if suspicious or alert
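The alertness rules in the list above can be sketched without any engine code. This is not my NPC script; Alertness, AlertnessState and Tick are hypothetical names, and a single decay period shared by every level is an assumption made to keep the example short.

```csharp
using System;

// Hypothetical alertness levels, ordered by importance.
public enum Alertness { Idle = 0, Suspicious = 1, Alert = 2 }

public sealed class AlertnessState
{
    private readonly float _decaySeconds;
    private float _timeAtLevel;

    public Alertness Level { get; private set; } = Alertness.Idle;

    public AlertnessState(float decaySeconds)
    {
        if (decaySeconds <= 0f) throw new ArgumentOutOfRangeException(nameof(decaySeconds));
        _decaySeconds = decaySeconds;
    }

    // Accepts the stimulus only if it is at least as important as the
    // current level, so a small noise cannot interrupt a gunshot chase.
    public bool SendStimulus(Alertness stimulus)
    {
        if (stimulus < Level) return false;
        Level = stimulus;
        _timeAtLevel = 0f;
        return true;
    }

    // Call with the elapsed time; steps down one level per decay period.
    public void Tick(float deltaTime)
    {
        if (Level == Alertness.Idle) return;
        _timeAtLevel += deltaTime;
        while (_timeAtLevel >= _decaySeconds && Level != Alertness.Idle)
        {
            _timeAtLevel -= _decaySeconds;
            Level = Level - 1;
        }
        if (Level == Alertness.Idle) _timeAtLevel = 0f;
    }
}
```

In a Unity component, Tick would typically be called from Update with Time.deltaTime, and the wander or chase behaviour would branch on Level.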
With that SendStimulus function, we can create some "detectors" that will send stimuli based on any overlapping SoundSphere objects.

I created a script, HearingScript, that checks its own trigger for overlaps with objects tagged as "DetectableSound".

public OrcaScript guard;
private SphereCollider sphereDetector;
 
 
private float _pollRate = 1.0f;
private float _pollCurrent = 0.0f;

The OrcaScript here is just my NPC script, which has public functions for sending stimuli etc.
The SphereCollider is the trigger we'll use for detecting sounds, and the polling variables are for switching the trigger off after a sound is detected and back on again later, to avoid flooding the NPC with multiple sounds at a time. (This is optional; you may find you have better ways to address this problem.) Should you choose to use a polling system for the trigger, it should look like this:
private void Update() {
 if ((_pollCurrent += Time.deltaTime) > _pollRate) {
  sphereDetector.enabled = true;
  _pollCurrent = 0f;
 }
}

I also specify some fields that determine how alerting sounds detected by this sphere are, and how "loud"/big they have to be to be detected at all:
[SerializeField]
private float radiusThreshold = 0.0f;
 
// The higher the value, the more stimulating/resolving the detection will be
[SerializeField]
[Range(1, 3)]
private uint detectionDegree = 1;
Any sound whose radius is below radiusThreshold is ignored by this detector entirely, so a higher threshold makes the detector respond only to louder sounds.

The real work is done in the OnTriggerEnter function:

private void OnTriggerEnter(Collider other) {
 // CompareTag avoids the string allocation of other.tag == "..."
 if (!other.CompareTag("DetectableSound")) {
  return;
 }
 // Detectable sounds are always sphere colliders; bail out if this one is not
 SphereCollider sc = other as SphereCollider;
 if (sc == null) {
  return;
 }
 if (sc.radius >= radiusThreshold) {
  // Loud enough for this listener: report it at this detector's degree
  guard.SendStimulus(LookerScript.detectionTypes.OBJECT, detectionDegree, other.transform);
  // Stop listening until the polling in Update re-enables the trigger
  sphereDetector.enabled = false;
 }
}

Here we check the tag to ensure the object is a detectable sound, then we check the radius of the detectable sound's sphere collider to determine if it's loud enough. Remember, the radius of the collider is determined upon creating the sound and is intended to represent the volume and range of the sound. Note that Unity only sends trigger events when at least one of the two objects involved has a Rigidbody, so the NPC needs one (a kinematic Rigidbody is enough).
If the sound is indeed over the threshold of this particular hearing sphere, we send a stimulus to the NPC script carrying this hearing sphere's detection degree and the transform of the sound we detected. We then disable our own sphere collider, to be re-enabled by the polling in the Update function.


On my NPC, I have 4 sphere triggers of increasing radius and threshold, and decreasing alertness level, to give the effect of range-limited hearing on this character. This will mean the slightest noise close to him will be detected sharply, but at a far enough distance you'd have to make a very loud noise to be heard.
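The layered effect can be checked with plain arithmetic before building it in the editor. The sketch below is engine-agnostic and hypothetical: HearingRing and LayeredHearing are names invented for this example, three rings are used instead of my four for brevity, and it returns only the strongest degree, whereas in the real setup each trigger sends its own stimulus.

```csharp
using System;

// Hypothetical description of one hearing trigger on the NPC.
public struct HearingRing
{
    public float Radius;           // reach of this trigger sphere
    public float RadiusThreshold;  // smallest sound radius it reacts to
    public uint DetectionDegree;   // stimulus strength (higher = more alerting)

    public HearingRing(float radius, float radiusThreshold, uint detectionDegree)
    {
        Radius = radius;
        RadiusThreshold = radiusThreshold;
        DetectionDegree = detectionDegree;
    }
}

public static class LayeredHearing
{
    // Returns the strongest degree among rings that both overlap the sound
    // sphere and consider it loud enough, or 0 if no ring hears it.
    public static uint StrongestDegree(HearingRing[] rings, float distanceToSound, float soundRadius)
    {
        uint best = 0;
        foreach (HearingRing ring in rings)
        {
            // Two spheres overlap when the centre distance is at most the sum of radii.
            bool overlaps = distanceToSound <= ring.Radius + soundRadius;
            bool loudEnough = soundRadius >= ring.RadiusThreshold;
            if (overlaps && loudEnough && ring.DetectionDegree > best)
                best = ring.DetectionDegree;
        }
        return best;
    }
}
```

With rings of (radius 2, threshold 0, degree 3), (6, 1.5, 2) and (15, 4, 1), a quiet sound of radius 0.5 is heard at degree 3 from one unit away but not at all from ten units away, while a sound of radius 5 still registers at degree 1 from eighteen units away.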

Lastly, we need to have something actually make the sounds, so going back to the collision sound effect script I have on noise-making objects, I can just add a DetectableSoundScript and a few lines to produce the detectable sounds: 

public void PlaySoft(Vector3 point, float vol) {
 if (producesDetectableSounds) {
  // Soft impacts get half the detectable radius of hard ones
  dss.CreateDetectableSound(point, detectableSoundVolMultiplier * (vol / 2f), 0.1f);
 }
 // Random.Range(int, int) excludes the upper bound, so every clip is equally likely
 AudioSource.PlayClipAtPoint(_softImpacts[Random.Range(0, _softImpacts.Length)], point, vol);
}
public void PlayHard(Vector3 point, float vol) {
 if (producesDetectableSounds) {
  dss.CreateDetectableSound(point, vol * detectableSoundVolMultiplier, 0.1f);
 }
 AudioSource.PlayClipAtPoint(_hardImpacts[Random.Range(0, _hardImpacts.Length)], point, vol);
}

The code is pretty self-explanatory; either function produces a detectable sound sphere if the script instance has the producesDetectableSounds flag checked - this is to allow control over which sound-effect generating objects should be detectable.

I have DetectableSoundScript attached alongside other scripts that should make detectable sounds, such as my weapons and footstep scripts. I attach it programmatically in their Start functions, since the script doesn't need any fields to be set in the editor and thus doesn't necessarily need to be attached before starting the game.

Possible Extensions

Here are some ways this can be extended:
  • Only produce a sound sphere if it is either loud/big enough, or if it is far enough away from the last one, or if there are no spheres currently active. This will prevent many small events in the same area from each using up a sphere, while still allowing loud or spaced-out sound effects to generate spheres.
  • Increase the detection level sent to the NPC based on the volume - i.e. the farthest reaching, lowest level hearing sphere would typically send a "Suspicious" level stimulus, but could send an "Alert" level if the sound is many times over the threshold of the hearing sphere
  • Increase the sensitivity of the NPC or lower the hearing spheres' thresholds when the NPC is at a high alert level, and decrease them when the NPC is idle, to emulate heightened attention and focus when in danger
  • Have a memory system to allow an intelligent character to memorise an object making noises, so it isn't alerted by it multiple times
  • Tie the hearing into vision, to allow a humanoid character to confirm a detected sound upon seeing its source (For example, a soldier could hear a tin fall off of a shelf, enter the room, and deduce that it was nothing to investigate further)
  • Contextually prevent sounds from causing any behavioural changes - such as the player's walking footsteps in a crowded military compound, which soldiers would conflate with the steps of an ally in a real scenario and thus ignore.
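The first two extensions in the list above come down to a couple of small decisions, sketched here without engine code. Everything in this block is hypothetical: the names SoundSphereRules, ShouldCreateSphere and EscalateDegree, the parameters minRadius, minSpacing, escalationFactor and maxDegree, and the choice to raise the degree by exactly one step.

```csharp
using System;

public static class SoundSphereRules
{
    // Extension 1: spend a pooled sphere only if nothing is active, or the
    // sound is big enough, or it is far enough from the previous one.
    public static bool ShouldCreateSphere(float radius, float distanceFromLast, int activeCount,
                                          float minRadius, float minSpacing)
    {
        return activeCount == 0 || radius >= minRadius || distanceFromLast >= minSpacing;
    }

    // Extension 2: bump the degree by one when the sound radius is at least
    // escalationFactor times the listener's threshold, capped at maxDegree.
    public static uint EscalateDegree(uint baseDegree, float soundRadius, float radiusThreshold,
                                      float escalationFactor, uint maxDegree)
    {
        if (radiusThreshold <= 0f) return baseDegree; // no meaningful ratio to compare
        bool farOverThreshold = soundRadius >= radiusThreshold * escalationFactor;
        uint degree = farOverThreshold ? baseDegree + 1 : baseDegree;
        return Math.Min(degree, maxDegree);
    }
}
```

The first check would sit at the top of CreateDetectableSound, and the second would replace the plain detectionDegree argument passed to SendStimulus.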
